mirror of https://github.com/moby/moby.git synced 2022-11-09 12:21:53 -05:00
moby--moby/daemon/graphdriver/devmapper
Kir Kolyshkin 516010e92d Simplify/fix MkdirAll usage
This subtle bug keeps lurking in because error checking for `Mkdir()`
and `MkdirAll()` is slightly different wrt `EEXIST`/`IsExist`:

 - for `Mkdir()`, `IsExist` error should (usually) be ignored
   (unless you want to make sure directory was not there before)
   as it means "the destination directory was already there"

 - for `MkdirAll()`, `IsExist` error should NEVER be ignored.

Mostly, this commit just removes ignoring the IsExist error, as it
should not be ignored.

Also, there are a couple of cases when IsExist is handled as
"directory already exists" which is wrong. As a result, some code
that never worked as intended is now removed.

NOTE that `idtools.MkdirAndChown()` behaves like `os.MkdirAll()`
rather than `os.Mkdir()` -- so its description is amended accordingly,
and its usage is handled as such (i.e. IsExist error is not ignored).

For more details, a quote from my runc commit 6f82d4b (July 2015):

    TL;DR: check for IsExist(err) after a failed MkdirAll() is both
    redundant and wrong -- so two reasons to remove it.

    Quoting MkdirAll documentation:

    > MkdirAll creates a directory named path, along with any necessary
    > parents, and returns nil, or else returns an error. If path
    > is already a directory, MkdirAll does nothing and returns nil.

    This means two things:

    1. If a directory to be created already exists, no error is
    returned.

    2. If the error returned is IsExist (EEXIST), it means there exists
    a non-directory with the same name as MkdirAll needs to use for
    directory. Example: we want to MkdirAll("a/b"), but file "a"
    (or "a/b") already exists, so MkdirAll fails.

    The above is a theory, based on quoted documentation and my UNIX
    knowledge.

    3. In practice, though, current MkdirAll implementation [1] returns
    ENOTDIR in most of the cases described in #2, with the exception when
    there is a race between MkdirAll and someone else creating the
    last component of MkdirAll argument as a file. In this very case
    MkdirAll() will indeed return EEXIST.

    Because of #1, IsExist check after MkdirAll is not needed.

    Because of #2 and #3, ignoring IsExist error is just plain wrong,
    as directory we require is not created. It's cleaner to report
    the error now.

    Note this error is all over the tree, I guess due to copy-paste,
    or trying to follow the same usage pattern as for Mkdir(),
    or some not quite correct examples on the Internet.

    [1] https://github.com/golang/go/blob/f9ed2f75/src/os/path.go

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-11-27 17:32:12 -08:00
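
The distinction the commit message draws between `Mkdir()` and `MkdirAll()` can be sketched as below. This is a minimal illustration, not code from the moby tree: the helper names `ensureDir`, `ensureDirAll`, and `demo` are hypothetical, and only `os.Mkdir`, `os.MkdirAll`, and `os.IsExist` are real standard-library calls.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ensureDir (hypothetical helper) creates one directory with os.Mkdir.
// IsExist is ignored because, for Mkdir, it usually means the destination
// was already there. Note that it does not prove the path is a directory.
func ensureDir(path string) error {
	if err := os.Mkdir(path, 0o755); err != nil && !os.IsExist(err) {
		return err
	}
	return nil
}

// ensureDirAll (hypothetical helper) creates a directory and its parents.
// MkdirAll already returns nil when the directory exists, so every error
// it returns means the directory is missing and must be reported.
func ensureDirAll(path string) error {
	return os.MkdirAll(path, 0o755)
}

// demo runs both helpers in a fresh temporary directory and returns the
// error from each interesting call.
func demo() (mkdirAgain, mkdirAllExisting, mkdirAllBlocked error) {
	tmp, err := os.MkdirTemp("", "mkdir-demo")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(tmp)

	dir := filepath.Join(tmp, "a")
	if err := ensureDir(dir); err != nil {
		panic(err)
	}
	mkdirAgain = ensureDir(dir)          // second Mkdir: IsExist is swallowed
	mkdirAllExisting = ensureDirAll(dir) // path is already a directory

	// A regular file used as a parent component makes MkdirAll fail.
	file := filepath.Join(tmp, "f")
	if err := os.WriteFile(file, nil, 0o644); err != nil {
		panic(err)
	}
	mkdirAllBlocked = ensureDirAll(filepath.Join(file, "b"))
	return
}

func main() {
	again, existing, blocked := demo()
	fmt.Println("Mkdir on existing dir:", again)
	fmt.Println("MkdirAll on existing dir:", existing)
	fmt.Println("MkdirAll through a file failed:", blocked != nil)
}
```

The sketch requires Go 1.16 or later for `os.MkdirTemp` and `os.WriteFile`.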
device_setup.go devmapper autosetup: add check for thin_check 2017-08-14 13:25:28 +03:00
deviceset.go Simplify/fix MkdirAll usage 2017-11-27 17:32:12 -08:00
devmapper_doc.go Fix typos found across repository 2015-12-13 18:04:12 +02:00
devmapper_test.go devmapper: add a test for mount leak workaround 2017-11-08 11:02:11 +11:00
driver.go Simplify/fix MkdirAll usage 2017-11-27 17:32:12 -08:00
mount.go Switch Stat syscalls to x/sys/unix 2017-07-27 10:09:02 +02:00
README.md Add option to auto-configure blkdev for devmapper 2017-05-03 13:49:15 -04:00

devicemapper - a storage backend based on Device Mapper

Theory of operation

The device mapper graphdriver uses the device mapper thin provisioning module (dm-thinp) to implement CoW snapshots. The preferred model is to have a thin pool reserved outside of Docker and passed to the daemon via the --storage-opt dm.thinpooldev option. Alternatively, the device mapper graphdriver can set up a block device to handle this for you via the --storage-opt dm.directlvm_device option.

As a fallback, if no thin pool is provided, loopback files will be created. Loopback is very slow, but can be used without any pre-configuration of storage. It is strongly recommended that you do not use loopback in production. Instead, ensure your Docker daemon is started with a --storage-opt dm.thinpooldev argument.

In loopback, a thin pool is created at /var/lib/docker/devicemapper (devicemapper graph location) based on two block devices, one for data and one for metadata. By default these block devices are created automatically by using loopback mounts of automatically created sparse files.

The default loopback files used are /var/lib/docker/devicemapper/devicemapper/data and /var/lib/docker/devicemapper/devicemapper/metadata. Additional metadata required to map from docker entities to the corresponding devicemapper volumes is stored in the /var/lib/docker/devicemapper/devicemapper/json file (encoded as JSON).

In order to support multiple devicemapper graphs on a system, the thin pool will be named something like: docker-0:33-19478248-pool, where the 0:33 part is the major:minor device number and 19478248 is the inode number of the /var/lib/docker/devicemapper directory.

On the thin pool, docker automatically creates a base thin device of a fixed size, called something like docker-0:33-19478248-base. This is automatically formatted with an empty filesystem on creation. This device is the base of all docker images and containers. All base images are snapshots of this device, and those images are in turn snapshotted to create other images and eventually containers.
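
The naming scheme for the pool and the base device can be illustrated with a short sketch. The function names `devicePrefix`, `poolName`, and `baseName` are hypothetical, and the sketch assumes the two colon-separated numbers are the device's major and minor numbers in that order, followed by the inode number. Obtaining those values from a stat call is platform-specific and is left out.

```go
package main

import "fmt"

// devicePrefix (hypothetical helper) builds the shared
// "docker-<major>:<minor>-<inode>" prefix. The major-then-minor order is
// an assumption of this sketch.
func devicePrefix(major, minor uint32, inode uint64) string {
	return fmt.Sprintf("docker-%d:%d-%d", major, minor, inode)
}

// poolName returns the thin pool name for the given graph directory identity.
func poolName(major, minor uint32, inode uint64) string {
	return devicePrefix(major, minor, inode) + "-pool"
}

// baseName returns the name of the base thin device in that pool.
func baseName(major, minor uint32, inode uint64) string {
	return devicePrefix(major, minor, inode) + "-base"
}

func main() {
	// Values taken from the example names in the text.
	fmt.Println(poolName(0, 33, 19478248))
	fmt.Println(baseName(0, 33, 19478248))
}
```

Because the inode of the graph directory is part of the name, two graph directories on the same filesystem still get distinct pool names.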

Information on docker info

As of docker-1.4.1, docker info when using the devicemapper storage driver will display something like:

$ sudo docker info
[...]
Storage Driver: devicemapper
 Pool Name: docker-253:1-17538953-pool
 Pool Blocksize: 65.54 kB
 Base Device Size: 107.4 GB
 Data file: /dev/loop4
 Metadata file: /dev/loop4
 Data Space Used: 2.536 GB
 Data Space Total: 107.4 GB
 Data Space Available: 104.8 GB
 Metadata Space Used: 7.93 MB
 Metadata Space Total: 2.147 GB
 Metadata Space Available: 2.14 GB
 Udev Sync Supported: true
 Data loop file: /home/docker/devicemapper/devicemapper/data
 Metadata loop file: /home/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.82-git (2013-10-04)
[...]

status items

Each item in the indented section under Storage Driver: devicemapper is a piece of status information about the driver.

  • Pool Name is the name of the devicemapper pool for this driver.
  • Pool Blocksize tells the block size the thin pool was initialized with. This is set only when the pool is created.
  • Base Device Size tells the maximum size of a container and image.
  • Data file is the block device file used for the devicemapper data.
  • Metadata file is the block device file used for the devicemapper metadata.
  • Data Space Used tells how much of the Data file is currently used.
  • Data Space Total tells the maximum size of the Data file.
  • Data Space Available tells how much free space there is in the Data file. If you are using a loop device, this will report the actual space available to the loop device on the underlying filesystem.
  • Metadata Space Used tells how much of the Metadata file is currently used.
  • Metadata Space Total tells the maximum size of the Metadata file.
  • Metadata Space Available tells how much free space there is in the Metadata file. If you are using a loop device, this will report the actual space available to the loop device on the underlying filesystem.
  • Udev Sync Supported tells whether devicemapper is able to sync with Udev. Should be true.
  • Data loop file is the file attached to the Data file, if a loopback device is used.
  • Metadata loop file is the file attached to the Metadata file, if a loopback device is used.
  • Library Version is the version of the libdevmapper library in use.
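
As a rough illustration of how these status items could be read programmatically, the sketch below collects the indented "Key: value" lines under the Storage Driver heading into a map. The helper `parseStatus` is hypothetical and is not part of Docker; the sample text is a subset of the output shown earlier, and the parsing rules (one leading space, a ": " separator) are assumptions based on that sample only.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// sample is a shortened copy of the docker info output shown in the text.
const sample = `Storage Driver: devicemapper
 Pool Name: docker-253:1-17538953-pool
 Pool Blocksize: 65.54 kB
 Data Space Used: 2.536 GB
 Data Space Total: 107.4 GB
 Udev Sync Supported: true
 Library Version: 1.02.82-git (2013-10-04)
[...]`

// parseStatus (hypothetical helper) gathers the indented status items that
// follow "Storage Driver: devicemapper" and stops at the first line that
// is not indented.
func parseStatus(out string) map[string]string {
	status := make(map[string]string)
	inSection := false
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		line := sc.Text()
		if !inSection {
			inSection = strings.TrimSpace(line) == "Storage Driver: devicemapper"
			continue
		}
		if !strings.HasPrefix(line, " ") {
			break // left the indented section
		}
		// Split on the first ": " only, so values containing ':' stay intact.
		if key, value, ok := strings.Cut(strings.TrimSpace(line), ": "); ok {
			status[key] = value
		}
	}
	return status
}

func main() {
	status := parseStatus(sample)
	fmt.Println(status["Pool Name"])
	fmt.Println(status["Data Space Used"], "of", status["Data Space Total"])
}
```

The sketch uses `strings.Cut`, which requires Go 1.18 or later. Values are kept as strings; converting sizes such as "2.536 GB" to numbers is out of scope here.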

About the devicemapper options

The devicemapper backend supports some options that you can specify when starting the docker daemon using the --storage-opt flag. These options use the dm prefix, for example: dockerd --storage-opt dm.foo=bar.

These options are currently documented both in the man page and in the online documentation. If you add an option, update both the man page and the documentation.
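
The dm-prefixed key=value form passed to --storage-opt can be illustrated with a small sketch. The function `parseDMOpt` is hypothetical and is not the daemon's actual option parser; it only shows how a string such as dm.foo=bar splits into a key and a value, and how an option without the dm prefix might be rejected.

```go
package main

import (
	"fmt"
	"strings"
)

// parseDMOpt (hypothetical helper) splits a "dm.key=value" storage option
// into its key and value. It rejects strings without '=' and keys that do
// not carry the "dm." prefix.
func parseDMOpt(opt string) (string, string, error) {
	key, value, ok := strings.Cut(opt, "=")
	if !ok {
		return "", "", fmt.Errorf("storage option %q is not in key=value form", opt)
	}
	key = strings.TrimSpace(key)
	if !strings.HasPrefix(key, "dm.") {
		return "", "", fmt.Errorf("storage option %q does not use the dm prefix", opt)
	}
	return key, strings.TrimSpace(value), nil
}

func main() {
	key, value, err := parseDMOpt("dm.foo=bar")
	fmt.Println(key, value, err)

	_, _, err = parseDMOpt("foo=bar")
	fmt.Println(err)
}
```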