moby--moby

mirror of https://github.com/moby/moby.git synced 2022-11-09 12:21:53 -05:00

Author	SHA1	Message	Date
Akihiro Suda	3deac5dc85	btrfs: annotate error with human-readable hint string Add hints for "Failed to destroy btrfs snapshot <DIR> for <ID>: operation not permitted" on rootless Related to issue 41762 Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2021-07-27 15:45:02 +09:00
Michal Rostecki	1ec689c4c2	btrfs: Do not disable quota on cleanup Before this change, cleanup of the btrfs driver (occurring on each daemon shutdown) resulted in disabling quotas. It was done with an assumption that quotas can be enabled or disabled on a subvolume level, which is not true - enabling or disabling quota is always done on a filesystem level. That was leading to disabling quota on btrfs filesystems on each daemon shutdown. This change fixes that behavior and removes misleading `subvol` prefix from functions and methods which set up quota (on a filesystem level). Fixes: #34593 Fixes: `401c8d1767` ("Add disk quota support for btrfs") Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>	2021-04-13 16:23:39 +01:00
Akihiro Suda	62b5194f62	btrfs: Allow unprivileged user to delete subvolumes (kernel >= 4.18) Fix issue 41762 Cherry-pick "drivers: btrfs: Allow unprivileged user to delete subvolumes" from containers/storage `831e32b6bd` > In btrfs, subvolume can be deleted by IOC_SNAP_DESTROY ioctl but there > is one catch: unprivileged IOC_SNAP_DESTROY call is restricted by default. > > This is because IOC_SNAP_DESTROY only performs permission checks on > the top directory(subvolume) and unprivileged user might delete dirs/files > which cannot be deleted otherwise. This restriction can be relaxed if > user_subvol_rm_allowed mount option is used. > > Although the above ioctl had been the only way to delete a subvolume, > btrfs now allows deletion of subvolume just like regular directory > (i.e. rmdir syscall) since kernel 4.18. > > So if we fail to cleanup subvolume in subvolDelete(), just fallback to > system.EnsureRemoveAll() to try to cleanup subvolumes again. > (Note: quota needs privilege, so if quota is enabled we do not fallback) > > This fix will allow non-privileged containers to work with btrfs backend. Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>	2021-03-26 14:30:40 +09:00
Brian Goff	7f5e39bd4f	Use real root with 0701 perms Various dirs in /var/lib/docker contain data that needs to be mounted into a container. For this reason, these dirs are set to be owned by the remapped root user, otherwise there can be permissions issues. However, this unnecessarily exposes these dirs to an unprivileged user on the host. Instead, set the ownership of these dirs to the real root (or rather the UID/GID of dockerd) with 0701 permissions, which allows the remapped root to enter the directories but not read/write to them. The remapped root needs to enter these dirs so the container's rootfs can be configured... e.g. to mount /etc/resolv.conf. This prevents an unprivileged user from having read/write access to these dirs on the host. The flip side of this is now any user can enter these directories. Signed-off-by: Brian Goff <cpuguy83@gmail.com> (cherry picked from commit `e908cc3901`) Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2021-02-02 13:01:25 +01:00
Kir Kolyshkin	39048cf656	Really switch to moby/sys/mount* Switch to moby/sys/mount and mountinfo. Keep the pkg/mount for potential outside users. This commit was generated by the following bash script: ``` set -e -u -o pipefail for file in $(git grep -l 'docker/docker/pkg/mount"' | grep -v ^pkg/mount); do sed -i -e 's#/docker/docker/pkg/mount"#/moby/sys/mount"#' \ -e 's#mount\.\(GetMounts\|Mounted\|Info\|[A-Za-z]*Filter\)#mountinfo.\1#g' \ $file goimports -w $file done ``` Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2020-03-20 09:46:25 -07:00
Sebastiaan van Stijn	ec4bc83258	daemon/graphdriver: normalize comment formatting Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2019-11-27 15:43:23 +01:00
Sebastiaan van Stijn	cba180cac9	graphdriver/btrfs: SA4003: no value of type uint64 is less than 0 (staticcheck) ``` daemon/graphdriver/btrfs/btrfs.go:609:5: SA4003: no value of type uint64 is less than 0 (staticcheck) if driver.options.size <= 0 { ^ ``` Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2019-10-18 00:45:39 +02:00
Sebastiaan van Stijn	07ff4f1de8	goimports: fix imports Format the source according to latest goimports. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com> Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2019-09-18 12:56:54 +02:00
Eli Uriegas	e665263b10	daemon: Remove btrfs_noversion build flag btrfs_noversion was added in `d7c37b5a28` for distributions that did not have the `btrfs/version.h` header file. Seeing how all of the distributions we currently support do have the `btrfs/version.h` file we should probably just remove this build flag altogether. Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>	2019-08-06 22:55:29 +00:00
Kir Kolyshkin	6533136961	pkg/mount: wrap mount/umount errors The errors returned from Mount and Unmount functions are raw syscall.Errno errors (like EPERM or EINVAL), which provides no context about what has happened and why. Similar to os.PathError type, introduce mount.Error type with some context. The error messages will now look like this: > mount /tmp/mount-tests/source:/tmp/mount-tests/target, flags: 0x1001: operation not permitted or > mount tmpfs:/tmp/mount-test-source-516297835: operation not permitted Before this patch, it was just > operation not permitted [v2: add Cause()] [v3: rename MountError to Error, document Cause()] [v4: fixes; audited all users] [v5: make Error type private; changes after @cpuguy83 reviews] Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2018-12-10 20:07:02 -08:00
Kir Kolyshkin	16d822bba8	btrfs: ensure graphdriver home is bind mount For some reason, shared mount propagation between the host and a container does not work for btrfs, unless container root directory (i.e. graphdriver home) is a bind mount. The above issue was reproduced on SLES 12sp3 + btrfs using the following script: #!/bin/bash set -eux -o pipefail # DIR should not be under a subvolume DIR=${DIR:-/lib} MNT=$DIR/my-mnt FILE=$MNT/file ID=$(docker run -d --privileged -v $DIR:$DIR:rshared ubuntu sleep 24h) docker exec $ID mkdir -p $MNT docker exec $ID mount -t tmpfs tmpfs $MNT docker exec $ID touch $FILE ls -l $FILE umount $MNT docker rm -f $ID which fails this way: + ls -l /lib/my-mnt/file ls: cannot access '/lib/my-mnt/file': No such file or directory meaning the mount performed inside a privileged container is not propagated back to the host (even if all the mounts have "shared" propagation mode). The remedy to the above is to make graphdriver home a bind mount. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>	2018-10-11 23:45:00 -07:00
Salahuddin Khan	763d839261	Add ADD/COPY --chown flag support to Windows This implements chown support on Windows. Built-in accounts as well as accounts included in the SAM database of the container are supported. NOTE: IDPair is now named Identity and IDMappings is now named IdentityMapping. The following are valid examples: ADD --chown=Guest . <some directory> COPY --chown=Administrator . <some directory> COPY --chown=Guests . <some directory> COPY --chown=ContainerUser . <some directory> On Windows an owner is only granted the permission to read the security descriptor and read/write the discretionary access control list. This fix also grants read/write and execute permissions to the owner. Signed-off-by: Salahuddin Khan <salah@docker.com>	2018-08-13 21:59:11 -07:00
Alejandro González Hevia	9392838150	Standardized log messages across the different storage drivers. Now all of the storage drivers use the field "storage-driver" in their log messages, which is set to name of the respective driver. Storage drivers changed: - Aufs - Btrfs - Devicemapper - Overlay - Overlay 2 - Zfs Signed-off-by: Alejandro González Hevia <alejandrgh11@gmail.com>	2018-03-27 14:37:30 +02:00
Yong Tang	742d4506bd	Golint fix up This fix fixes a golint issue. Signed-off-by: Yong Tang <yong.tang.github@outlook.com>	2018-02-23 16:40:37 +00:00
Brian Goff	68c3201626	Merge pull request #36237 from cpuguy83/zfs_do_not_unmount Do not recursive unmount on cleanup of zfs/btrfs	2018-02-14 09:49:17 -05:00
Brian Goff	2fe4f888be	Do not recursive unmount on cleanup of zfs/btrfs This was added in #36047 just as a way to make sure the tree is fully unmounted on shutdown. For ZFS this could be a breaking change since there was no unmount before. Someone could have set up the zfs tree themselves. It would be better, if we really do want the cleanup to actually the unpacked layers checking for mounts rather than a blind recursive unmount of the root. BTRFS does not use mounts and does not need to unmount anyway. There was only an unmount to begin with because for some reason the btrfs tree was being mounted with `private` propagation. For the other graphdrivers that still have a recursive unmount here... these were already being unmounted and performing the recursive unmount shouldn't break anything. If anyone had anything mounted at the graphdriver location it would have been unmounted on shutdown anyway. Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2018-02-07 15:08:17 -05:00
Daniel Nephin	4f0d95fa6e	Add canonical import comment Signed-off-by: Daniel Nephin <dnephin@docker.com>	2018-02-05 16:51:57 -05:00
Brian Goff	9803272f2d	Do not make graphdriver homes private mounts. The idea behind making the graphdrivers private is to prevent leaking mounts into other namespaces. Unfortunately this is not really what happens. There is one case where this does work, and that is when the namespace was created before the daemon's namespace. However with systemd each system service winds up with its own mount namespace. This causes a race between daemon startup and other system services as to if the mount is actually private. This also means there is a negative impact when other system services are started while the daemon is running. Basically there are too many things that the daemon does not have control over (nor should it) to be able to protect against these kinds of leakages. One thing is certain, setting the graphdriver roots to private disconnects the mount ns hierarchy preventing propagation of unmounts... new mounts are of course not propagated either, but the behavior is racy (or just bad in the case of restarting services)... so it's better to just be able to keep mount propagation intact. It also does not protect situations like `-v /var/lib/docker:/var/lib/docker` where all mounts are recursively bound into the container anyway. Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2018-01-18 09:34:00 -05:00
Sebastiaan van Stijn	b4a6313969	Golint: remove redundant ifs Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2018-01-15 00:42:25 +01:00
Sebastiaan van Stijn	f9c8fa305e	Perform fsmagic detection on driver's home-dir if it exists The fsmagic check was always performed on "data-root" (`/var/lib/docker`), not on the storage-driver's home directory (e.g. `/var/lib/docker/<somedriver>`). This caused detection to be done on the wrong filesystem in situations where `/var/lib/docker/<somedriver>` was a mount, and a different filesystem than `/var/lib/docker` itself. This patch checks if the storage-driver's home directory exists, and only falls back to `/var/lib/docker` if it doesn't exist. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2017-12-04 17:10:07 -08:00
Sebastiaan van Stijn	38b3af567f	Remove deprecated MkdirAllAs(), MkdirAs() Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2017-11-21 13:53:54 +01:00
Akash Gupta	7a7357dae1	LCOW: Implemented support for docker cp + build This enables docker cp and ADD/COPY docker build support for LCOW. Originally, the graphdriver.Get() interface returned a local path to the container root filesystem. This does not work for LCOW, so the Get() method now returns an interface that LCOW implements to support copying to and from the container. Signed-off-by: Akash Gupta <akagup@microsoft.com>	2017-09-14 12:07:52 -07:00
Derek McGowan	1009e6a40b	Update logrus to v1.0.1 Fixes case sensitivity issue Signed-off-by: Derek McGowan <derek@mcgstyle.net>	2017-07-31 13:16:46 -07:00
Christopher Jones	069fdc8a08	[project] change syscall to /x/sys/unix|windows Changes most references of syscall to golang.org/x/sys/ Ones that aren't changed include Errno, Signal and SysProcAttr, as they haven't been implemented in /x/sys/. Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com> [s390x] switch utsname from unsigned to signed per `33267e036f` char in s390x in the /x/sys/unix package is now signed, so change the buildtags Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>	2017-07-11 08:00:32 -04:00
Yong Tang	16328cc207	Persist the quota size for btrfs so that daemon restart keeps quota This commit is an extension of fix for 29325 based on the review comment. In this commit, the quota size for btrfs is kept in `/var/lib/docker/btrfs/quotas` so that a daemon restart keeps quota. Signed-off-by: Yong Tang <yong.tang.github@outlook.com>	2017-06-01 21:15:51 -07:00
Yong Tang	e907c6418a	Remove btrfs quota groups after containers destroyed This fix tries to address the issue raised in 29325 where btrfs quota groups are not cleaned up even after containers have been destroyed. The reason for the issue is that btrfs quota groups have to be explicitly destroyed. This fix fixes this issue. This fix is tested manually in Ubuntu 16.04, with steps specified in 29325. This fix fixes 29325. Signed-off-by: Yong Tang <yong.tang.github@outlook.com>	2017-06-01 20:24:26 -07:00
Brian Goff	54dcbab25e	Do not remove containers from memory on error Before this, if `forceRemove` is set the container data will be removed no matter what, including if there are issues with removing container on-disk state (rw layer, container root). In practice this causes a lot of issues with leaked data sitting on disk that users are not able to clean up themselves. This is particularly a problem while the `EBUSY` errors on remove are so prevalent. So for now let's not keep this behavior. Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2017-05-05 17:02:04 -04:00
Antonio Murdaca	abbbf91498	Switch to using opencontainers/selinux for selinux bindings Signed-off-by: Antonio Murdaca <runcom@redhat.com>	2017-04-24 21:29:47 +02:00
Yong Tang	b36e613d9f	Run btrfs rescan only if userDiskQuota is enabled This fix tries to address the issue raised in 29810 where btrfs subvolume removal failed when docker is in an unprivileged lxc container. The failure was caused by `Failed to rescan btrfs quota` with `operation not permitted`. However, if disk quota is not enabled, there is no need to run a btrfs rescan in the first place. This fix checks for `quotaEnabled` and only runs btrfs rescan if `quotaEnabled` is true. This fix fixes 29810. Signed-off-by: Yong Tang <yong.tang.github@outlook.com>	2017-01-05 05:18:11 -08:00
wefine	f78f7de96a	fix t.Errorf to t.Error in several _test.go Signed-off-by: wefine <wang.xiaoren@zte.com.cn>	2016-11-14 17:54:43 +08:00
Vivek Goyal	b937aa8e69	Pass all graphdriver create() parameters in a struct This allows for easy extension of adding more parameters to existing parameters list. Otherwise adding a single parameter changes code at so many places. Signed-off-by: Vivek Goyal <vgoyal@redhat.com>	2016-11-09 15:59:58 -05:00
Zhu Guihua	401c8d1767	Add disk quota support for btrfs Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>	2016-05-05 14:35:13 +08:00
John Howard	fec6cd2eb9	Merge pull request #20525 from Microsoft/sjw/update-graphdriver-create Adding readOnly parameter to graphdriver Create method	2016-04-08 20:44:03 -07:00
Stefan J. Wernli	ef5bfad321	Adding readOnly parameter to graphdriver Create method Since the layer store was introduced, the level above the graphdriver now differentiates between read/write and read-only layers. This distinction is useful for graphdrivers that need to take special steps when creating a layer based on whether it is read-only or not. Adding this parameter allows the graphdrivers to differentiate, which in the case of the Windows graphdriver, removes our dependence on parsing the id of the parent for "-init" in order to infer this information. This will also set the stage for unblocking some of the layer store unit tests in the next preview build of Windows. Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>	2016-04-06 13:52:53 -07:00
Julio Montes	a038cccf88	Fix compilation errors with btrfs-progs-4.5 btrfs-progs-4.5 introduces device delete by devid for this reason btrfs_ioctl_vol_args_v2's name was encapsulated in a union this patch is for setting btrfs_ioctl_vol_args_v2's name using a C function in order to preserve compatibility with all btrfs-progs versions Signed-off-by: Julio Montes <imc.coder@gmail.com>	2016-04-01 08:58:29 -06:00
Shishir Mahajan	b16decfccf	CLI flag for docker create(run) to change block device size. Signed-off-by: Shishir Mahajan <shishir.mahajan@redhat.com>	2016-03-28 10:05:18 -04:00
Kai Qiang Wu(Kennan)	c33cdf9ee3	Fix the typo Signed-off-by: Kai Qiang Wu(Kennan) <wkqwu@cn.ibm.com>	2016-02-16 07:00:01 +00:00
Liu Bo	b2e27fee53	Graphdriver/btrfs: Avoid using single d.Get() For btrfs driver, in d.Create(), Get() of parentDir is called but not followed by Put(). If we apply SELinux mount label, we need to mount btrfs subvolumes in d.Get(), without a Put() would end up with a later Remove() failure on "Device resource is busy". This calls the subvolume helper function directly in d.Create(). Signed-off-by: Liu Bo <bo.li.liu@oracle.com>	2016-02-04 10:25:24 -08:00
Kai Qiang Wu(Kennan)	feda5d7684	Make btrfs call same interface as others Most storage drivers call graphdriver.GetFSMagic(home), which is cleaner and easier to maintain. So btrfs needs to adopt this change. Signed-off-by: Kai Qiang Wu(Kennan) <wkqwu@cn.ibm.com>	2016-02-01 07:50:21 +00:00
Phil Estes	72e65e8793	Fix btrfs subvolume snapshot dir perms for user namespaces Make sure btrfs mounted subvolumes are owned properly when a remapped root exists (user namespaces are enabled, for example) Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)	2016-01-07 23:05:28 -05:00
Shijiang Wei	de7f6cf16b	ignore the NotExist error when removing nonexistent files Signed-off-by: Shijiang Wei <mountkin@gmail.com>	2015-12-25 15:19:48 +08:00
Vincent Batts	f57d56350e	Merge pull request #18686 from cpuguy83/fix_btrfs_subvol_delete_panic Fix btrfs recursive btrfs subvol delete	2015-12-16 14:26:40 -05:00
Brian Goff	f9befce2d3	Fix btrfs recursive btrfs subvol delete Really fixing 2 things: 1. Panic when any error is detected while walking the btrfs graph dir on removal due to no error check. 2. Nested subvolumes weren't actually being removed due to passing in the wrong path On point 2, for a path detected as a nested subvolume, we were calling `subvolDelete("/path/to/subvol", "subvol")`, where the last part of the path was duplicated due to a logic error, and as such actually causing point #1 since `subvolDelete` joins the two arguments, and `/path/to/subvol/subvol` (the joined version) doesn't exist. Also adds a test for nested subvol delete. Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2015-12-15 18:12:40 -05:00
Justas Brazauskas	927b334ebf	Fix typos found across repository Signed-off-by: Justas Brazauskas <brazauskasjustas@gmail.com>	2015-12-13 18:04:12 +02:00
Dan Walsh	1716d497a4	Relabel BTRFS Content on container Creation This change will allow us to run SELinux in a container with BTRFS back end. We continue to work on fixing the kernel/BTRFS but this change will allow SELinux Security separation on BTRFS. It basically relabels the content on container creation. Just relabeling -init directory in BTRFS use case. Everything looks like it works. I don't believe tar/archive stores the SELinux labels, so we are good as far as docker commit. Tested Speed on startup with BTRFS on top of loopback directory. BTRFS not on loopback should get even better performance on startup time. The more inodes inside of the container image will increase the relabel time. This patch will give people who care more about security the option of running BTRFS with SELinux. Those who don't want to take the slow down can disable SELinux either in individual containers or for all containers by continuing to disable SELinux in the daemon. Without relabel: > time docker run --security-opt label:disable fedora echo test test real 0m0.918s user 0m0.009s sys 0m0.026s With Relabel test real 0m1.942s user 0m0.007s sys 0m0.030s Signed-off-by: Dan Walsh <dwalsh@redhat.com> Signed-off-by: Dan Walsh <dwalsh@redhat.com>	2015-11-11 14:49:27 -05:00
Phil Estes	442b45628e	Add user namespace (mapping) support to the Docker engine Adds support for the daemon to handle user namespace maps as a per-daemon setting. Support for handling uid/gid mapping is added to the builder, archive/unarchive packages and functions, all graphdrivers (except Windows), and the test suite is updated to handle user namespace daemon rootgraph changes. Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)	2015-10-09 17:47:37 -04:00
Chun Chen	2458452a3b	Try to resize data and metadata loopback file when initiating devicemapper Signed-off-by: Chun Chen <ramichen@tencent.com>	2015-09-24 09:31:00 +08:00
Jessica Frazelle	bd06432ba3	cleanup and fix btrfs subvolume recursion deletion Signed-off-by: Jessica Frazelle <acidburn@docker.com>	2015-08-25 13:00:41 -07:00
Ma Shimiao	dea78fc2ce	fix 9939: docker does not remove btrfs subvolumes when destroying container Signed-off-by: Ma Shimiao <mashimiao.fnst@cn.fujitsu.com>	2015-08-24 14:52:07 -07:00
Srini Brahmaroutu	22873eae31	fix unit test breakage due to lint changes Addresses #14756 Signed-off-by: Srini Brahmaroutu <srbrahma@us.ibm.com>	2015-07-31 00:22:28 +00:00

74 commits