diff --git a/docs/rootless.md b/docs/rootless.md
index 7efdd8654a..698f581222 100644
--- a/docs/rootless.md
+++ b/docs/rootless.md
@@ -20,43 +20,107 @@ $ grep ^$(whoami): /etc/subgid
 penguin:231072:65536
 ```
 
-
 ### Distribution-specific hint
 
-#### Debian (excluding Ubuntu)
-* `sudo sh -c "echo 1 > /proc/sys/kernel/unprivileged_userns_clone"` is required
+Using the Ubuntu kernel is recommended.
+
+#### Ubuntu
+* No preparation is needed.
+* `overlay2` is enabled by default ([Ubuntu-specific kernel patch](https://kernel.ubuntu.com/git/ubuntu/ubuntu-bionic.git/commit/fs/overlayfs?id=3b7da90f28fe1ed4b79ef2d994c81efbc58f1144)).
+* Known to work on Ubuntu 16.04 and 18.04.
+
+#### Debian GNU/Linux
+* Add `kernel.unprivileged_userns_clone=1` to `/etc/sysctl.conf` (or a file under `/etc/sysctl.d`) and run `sudo sysctl --system`
+* To use the `overlay2` storage driver (recommended), run `sudo modprobe overlay permit_mounts_in_userns=1` ([Debian-specific kernel patch, introduced in Debian 10](https://salsa.debian.org/kernel-team/linux/blob/283390e7feb21b47779b48e0c8eb0cc409d2c815/debian/patches/debian/overlayfs-permit-mounts-in-userns.patch)). Put the configuration in `/etc/modprobe.d` to make it persistent.
+* Known to work on Debian 9 and 10. `overlay2` is supported only on Debian 10 and later, and needs the `modprobe` configuration described above.
 
 #### Arch Linux
-* `sudo sh -c "echo 1 > /proc/sys/kernel/unprivileged_userns_clone"` is required
+* Add `kernel.unprivileged_userns_clone=1` to `/etc/sysctl.conf` (or a file under `/etc/sysctl.d`) and run `sudo sysctl --system`
 
 #### openSUSE
 * `sudo modprobe ip_tables iptable_mangle iptable_nat iptable_filter` is required. (This is likely to be required on other distros as well)
+* Known to work on openSUSE 15.
+
+#### Fedora 31 and later
+* Run `sudo grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=0"` and reboot.
+
+#### Fedora 30
+* No preparation is needed
+
+#### RHEL/CentOS 8
+* No preparation is needed
 
 #### RHEL/CentOS 7
-* `sudo sh -c "echo 28633 > /proc/sys/user/max_user_namespaces"` is required
-* [COPR package `vbatts/shadow-utils-newxidmap`](https://copr.fedorainfracloud.org/coprs/vbatts/shadow-utils-newxidmap/) needs to be installed
+* Add `user.max_user_namespaces=28633` to `/etc/sysctl.conf` (or a file under `/etc/sysctl.d`) and run `sudo sysctl --system`
+* `systemctl --user` does not work by default. Run the daemon directly without systemd: `dockerd-rootless.sh --experimental --storage-driver vfs`
+* Known to work on RHEL/CentOS 7.7. Older releases require extra configuration steps.
+* RHEL/CentOS 7.6 and older releases require [COPR package `vbatts/shadow-utils-newxidmap`](https://copr.fedorainfracloud.org/coprs/vbatts/shadow-utils-newxidmap/) to be installed.
+* RHEL/CentOS 7.5 and older releases require running `sudo grubby --update-kernel=ALL --args="user_namespace.enable=1"` and rebooting.
 
-## Restrictions
+## Known limitations
 
-* Only `vfs` graphdriver is supported. However, on [Ubuntu](http://kernel.ubuntu.com/git/ubuntu/ubuntu-artful.git/commit/fs/overlayfs?h=Ubuntu-4.13.0-25.29&id=0a414bdc3d01f3b61ed86cfe3ce8b63a9240eba7) and a few distros, `overlay2` and `overlay` are also supported.
+* Only the `vfs` graphdriver is supported, except on Ubuntu and Debian 10, where `overlay2` and `overlay` are also supported.
 * Following features are not supported:
   * Cgroups (including `docker top`, which depends on the cgroups device controller)
   * Apparmor
   * Checkpoint
   * Overlay network
   * Exposing SCTP ports
-* To expose a TCP/UDP port, the host port number needs to be set to >= 1024.
+* To use the `ping` command, see [Routing ping packets](#routing-ping-packets)
+* To expose privileged TCP/UDP ports (< 1024), see [Exposing privileged ports](#exposing-privileged-ports)
+
+## Install
+
+The installation script is available at https://get.docker.com/rootless .
+
+```console
+$ curl -fsSL https://get.docker.com/rootless | sh
+```
+
+Make sure to run the script as a non-root user.
+
+The script prints the environment variables that need to be set:
+
+```console
+$ curl -fsSL https://get.docker.com/rootless | sh
+...
+# Docker binaries are installed in /home/penguin/bin
+# WARN: dockerd is not in your current PATH or pointing to /home/penguin/bin/dockerd
+# Make sure the following environment variables are set (or add them to ~/.bashrc):
+
+export PATH=/home/penguin/bin:$PATH
+export PATH=$PATH:/sbin
+export DOCKER_HOST=unix:///run/user/1001/docker.sock
+
+#
+# To control docker service run:
+# systemctl --user (start|stop|restart) docker
+#
+```
+
+To install the binaries manually without using the installer, extract `docker-rootless-extras-<version>.tar.gz` along with `docker-<version>.tar.gz`: https://download.docker.com/linux/static/stable/x86_64/
 
 ## Usage
 
 ### Daemon
 
-You need to run `dockerd-rootless.sh` instead of `dockerd`.
-
+Use `systemctl --user` to manage the lifecycle of the daemon:
 ```console
-$ dockerd-rootless.sh --experimental
+$ systemctl --user start docker
 ```
-As Rootless mode is experimental per se, currently you always need to run `dockerd-rootless.sh` with `--experimental`.
+
+To launch the daemon on system startup, enable systemd lingering:
+```console
+$ sudo loginctl enable-linger $(whoami)
+```
+
+To run the daemon directly without systemd, you need to run `dockerd-rootless.sh` instead of `dockerd`:
+```console
+$ dockerd-rootless.sh --experimental --storage-driver vfs
+```
+
+Because rootless mode is experimental, you currently always need to run `dockerd-rootless.sh` with `--experimental`.
+You also need `--storage-driver vfs` unless you are using the Ubuntu or Debian 10 kernel.
 
 Remarks:
 * The socket path is set to `$XDG_RUNTIME_DIR/docker.sock` by default. `$XDG_RUNTIME_DIR` is typically set to `/run/user/$UID`.
@@ -69,12 +133,24 @@ Remarks:
 
 ### Client
 
-You can just use the upstream Docker client but you need to set the socket path explicitly.
+You need to set the socket path explicitly.
 
 ```console
-$ docker -H unix://$XDG_RUNTIME_DIR/docker.sock run -d nginx
+$ export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/docker.sock
+$ docker run -d nginx
 ```
 
+### Rootless Docker in Docker
+
+To run Rootless Docker inside "rootful" Docker, use the `docker:<version>-dind-rootless` image instead of the `docker:<version>-dind` image.
+
+```console
+$ docker run -d --name dind-rootless --privileged docker:19.03-dind-rootless --experimental
+```
+
+The `docker:<version>-dind-rootless` image runs as a non-root user (UID 1000).
+However, `--privileged` is required for disabling seccomp, AppArmor, and mount masks.
+
 ### Expose Docker API socket via TCP
 
 To expose the Docker API socket via TCP, you need to launch `dockerd-rootless.sh` with `DOCKERD_ROOTLESS_ROOTLESSKIT_FLAGS="-p 0.0.0.0:2376:2376/tcp"`.
@@ -88,12 +164,23 @@ $ DOCKERD_ROOTLESS_ROOTLESSKIT_FLAGS="-p 0.0.0.0:2376:2376/tcp" \
 
 ### Routing ping packets
 
-To route ping packets, you need to set up `net.ipv4.ping_group_range` properly as the root.
+Add `net.ipv4.ping_group_range = 0 2147483647` to `/etc/sysctl.conf` (or a file under `/etc/sysctl.d`) and run `sudo sysctl --system`.
+
+### Exposing privileged ports
+
+To expose privileged ports (< 1024), set `CAP_NET_BIND_SERVICE` on the `rootlesskit` binary.
 
 ```console
-$ sudo sh -c "echo 0 2147483647 > /proc/sys/net/ipv4/ping_group_range"
+$ sudo setcap cap_net_bind_service=ep $HOME/bin/rootlesskit
 ```
 
+Or add `net.ipv4.ip_unprivileged_port_start=0` to `/etc/sysctl.conf` (or a file under `/etc/sysctl.d`) and run `sudo sysctl --system`.
+
+### Limiting resources
+
+Currently rootless mode ignores cgroup-related `docker run` flags such as `--cpus` and `--memory`.
+However, traditional `ulimit` and [`cpulimit`](https://github.com/opsengine/cpulimit) can still be used, though they work at process granularity rather than container granularity.
+
 ### Changing network stack
 
 `dockerd-rootless.sh` uses [slirp4netns](https://github.com/rootless-containers/slirp4netns) (if installed) or [VPNKit](https://github.com/moby/vpnkit) as the network stack by default.