Older kernel can't handle O_PATH in open() so this will
fail on dirs and symlinks. For dirs wa can fallback to
the normal Utimes, but for symlinks there is not much to do
but ignore their timestamps.
This creates a container by copying the corresponding files
from the layers into the containers. This is not gonna be very useful
on a developer setup, as there is no copy-on-write or general diskspace
sharing. It also makes container instantiation slower.
However, it may be useful in deployment where we don't always have a lot
of containers running (long-running daemons) and where we don't
do a lot of docker commits.
Change the comparison to better handle files that are copied during
container creation but not actually changed:
* Inode - this will change during a copy
* ctime - this will change during a copy (as we can't set it back)
* blocksize - this will change for sparse files during copy
* size for directories - this can change anytime but doesn't
necessarily reflect an actual contents change
* Compare mtimes at microsecond precision (as this is what utimes has)
There are some changes here that make the file metadata better match
the layer files:
* Set the mode of the file after the chown, as otherwise the per-group/uid
specific flags and e.g. sticky bit is lost
* Use lchown instead of chown
* Delay mtime updates to after all other changes so that later file
creation doesn't change the mtime for the parent directory
* Use Futimes in combination with O_PATH|O_NOFOLLOW to set mtime on symlinks
Rather than scan the files in the old directory twice to detect the
deletions we now scan both directories twice and then do all the
diffing on the in-memory structure.
This is more efficient, but it also lets us diff more complex things
later that are not exact on-disk trees.
The init layer needs to be topmost to make sure certain files
are always there (for instance, the ubuntu:12.10 image wrongly
has /dev/shm being a symlink to /run/shm, and we need to override
that). However, previously the devmapper code implemented the
init layer by putting it in the base devmapper device, which meant
layers above it could override these files (so that ubuntu:12.10
broke).
So, instead we put the base layer in *each* images devmapper device.
This is "safe" because we still have the pristine layer data
in the layer directory. Also, it means we diff the container
against the image with the init layer applied, so it won't show
up in diffs/commits.
lxc-start requires / to be mounted private, otherwise the changes
it does inside the container (both mounts and unmounts) will propagate
out to the host.
We work around this by starting up lxc-start in its own namespace where
we set / to rprivate.
Unfortunately go can't really execute any code between clone and exec,
so we can't do this in a nice way. Instead we have a horrible hack that
use the unshare command, the shell and the mount command...
Right now this does nothing but add a new layer, but it means
that all DeviceMounts are paired with DeviceUnmounts so that we
can track (and cleanup) active mounts.
I currently need this to get the tests running, otherwise it will
mount the docker.test binary inside the containers, which doesn't
work due to the libdevmapper.so dependency.
We wrap the "real" DeviceSet for each test so that we get only
a single device-mapper pool and loopback mounts, but still
separate out the IDs in the tests. This makes the test run
much faster.
This wraps an existing DeviceSet and just adds a prefix to all ids in
it. This will be useful for reusing a single DeviceSet for all the tests
(but with separate ids)
There are a few changes:
* Callers can specify if they want recursive behaviour or not
* All file listings to tar are sent on stdin, to handle long lists better
* We can pass in a list of filenames which will be created as empty
files in the tarball
This is exactly what we want for the creation of layer tarballs given
a container fs, a set of files to add and a set of whiteout files to create.
To do diffing we just compare file metadata, so this relies
on things like size and mtime/ctime to catch any changes.
Its *possible* to trick this by updating a file without
changing the size and setting back the mtime/ctime, but
that seems pretty unlikely to happen in reality, and lets
us avoid comparing the actual file data.
Without this there is really no way to map back from the device-mapper
devices to the actual docker image/container ids in case the json file
somehow got lost
This supports creating images from layers and mounting them
for running a container.
Not supported yet are:
* Creating diffs between images/containers
* Creating layers for new images from a device-mapper container
This interface matches the device-mapper implementation (DeviceSetDM)
but is free from any dependencies. This allows core docker code
to refer to a DeviceSet without having an explicit dependency on
the devmapper package.
This is important, because the devmapper package has external
dependencies which are not wanted in the docker client app, as it
needs to run with minimal dependencies in the docker image.
In some builds the main docker binary is not statically linked,
and as such not usable in as the .dockerinit binary, for those
cases we look for a separately shipped docker-init binary and
use that instead.
This is a module that uses the device-mapper create CoW snapshots
You instantiate a DeviceSetDM object on a specified root (/var/lib/docker),
and it will create a subdirectory there called "loopback". It will
contain two sparse files which are loopback mounted into
a thin-pool device-mapper device called "docker-pool".
We then create a base snapshot in the pool with an empty filesystem
which can be used as a base for docker snapshots. It also keeps track
of the mapping between docker image ids and the snapshots in the pool.
Typical use of is something like (without error checking):
devices = NewDeviceSetDM("/var/lib/docker")
devices.AddDevice(imageId, "") // "" is the base image id
devices.MountDevice(imageId, "/mnt/image")
... extract base image to /mnt/image
devices.AddDevice(containerId, imageId)
devices.MountDevice(containerId, "/mnt/container")
... start container at /mnt/container