Commit Graph

8 Commits

Author SHA1 Message Date
John Howard 63f9c7784b LCOW: Log stderr on failures
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-09-26 13:23:04 -07:00
John Howard dffc966165 LCOW: Capture stderr on external process. Log actual error rather than throwaway
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-08-16 19:20:14 -07:00
John Howard f586fe5637 LCOW: Mount to short container paths to avoid command-line length limit
Signed-off-by: John Howard <jhoward@microsoft.com>

Fixes #36764

@johnstep PTAL. @jterry75 FYI.

There are two commits in this PR. The first ensure that errors are actually returned to the caller - it was being thrown away.

The second commit changes the LCOW driver to map, on a per service VM basis, "long" container paths such as `/tmp/c8fa0ae1b348f505df2707060f6a49e63280d71b83b7936935c827e2e9bde16d` to much shorter paths, based on a per-service VM counter, so something more like /tmp/d3. This means that the root cause of the failure where the mount call to create the overlay was failing due to command line length becomes something much shorter such as below.

`mount -t overlay overlay -olowerdir=/tmp/d3:/tmp/d4:/tmp/d5:/tmp/d6:/tmp/d7:/tmp/d8:/tmp/d9:/tmp/d10:/tmp/d11:/tmp/d12:/tmp/d13:/tmp/d14:/tmp/d15:/tmp/d16:/tmp/d17:/tmp/d18:/tmp/d19:/tmp/d20:/tmp/d21:/tmp/d22:/tmp/d23:/tmp/d24:/tmp/d25:/tmp/d26:/tmp/d27:/tmp/d28:/tmp/d29:/tmp/d30:/tmp/d31:/tmp/d32:/tmp/d33:/tmp/d34:/tmp/d35:/tmp/d36:/tmp/d37:/tmp/d38:/tmp/d39:/tmp/d40:/tmp/d41:/tmp/d42:/tmp/d43:/tmp/d44:/tmp/d45:/tmp/d46:/tmp/d47:/tmp/d48:/tmp/d49:/tmp/d50:/tmp/d51:/tmp/d52:/tmp/d53:/tmp/d54:/tmp/d55:/tmp/d56:/tmp/d57:/tmp/d58:/tmp/d59:/tmp/d60:/tmp/d61:/tmp/d62,upperdir=/tmp/d2/upper,workdir=/tmp/d2/work /tmp/c8fa0ae1b348f505df2707060f6a49e63280d71b83b7936935c827e2e9bde16d-mount`

For those worrying about overflow (which I'm sure @thaJeztah will mention...): It's safe to use a counter here as SVMs are disposable in the default configuration. The exception is when running the daemon in unsafe LCOW "global" mode (ie `--storage-opt lcow.globalmode=1`) where the SVMs aren't disposed of, but a single one is reused. However, to overflow the command line length, it would require several hundred-thousand trillion (conservative, I should sit down and work it out accurately if I get -really- bored) of SCSI hot-add operations, and even to hit that would be hard as just running containers normally uses the VPMEM path for the containers UVM, not to the global SVM on SCSI. It gets incremented by one per build step (commit more accurately) as a general rule. Hence it would be necessary to have to be doing automated builds without restarting the daemon for literally years on end in unsafe mode. 😇

Note that in reality, the previous limit of ~47 layers before hitting the command line length limit is close to what is possible in the platform, at least as of RS5/Windows Server 2019 where, in the HCS v1 schema, a single SCSI controller is used, and that can only support 64 disks per controller per the Hyper-V VDEV. And remember we have one slot taken up for the SVMs scratch, and another for the containers scratch when committing a layer. So the best you can architecturally get on the platform is around the following (it's also different by 1 depending on whether in unsafe or default mode)

```
PS E:\docker\build\36764\short> docker build --no-cache .
Sending build context to Docker daemon  2.048kB
Step 1/4 : FROM alpine as first
 ---> 11cd0b38bc3c
Step 2/4 : RUN echo test > /test
 ---> Running in 8ddfe20e5bfb
Removing intermediate container 8ddfe20e5bfb
 ---> b0103a00b1c9
Step 3/4 : FROM alpine
 ---> 11cd0b38bc3c
Step 4/4 : COPY --from=first /test /test
 ---> 54bfae391eba
Successfully built 54bfae391eba
PS E:\docker\build\36764\short> cd ..
PS E:\docker\build\36764> docker build --no-cache .
Sending build context to Docker daemon  4.689MB
Step 1/61 : FROM alpine as first
 ---> 11cd0b38bc3c
Step 2/61 : RUN echo test > /test
 ---> Running in 02597ff870db
Removing intermediate container 02597ff870db
 ---> 3096de6fc454
Step 3/61 : RUN echo test > /test
 ---> Running in 9a8110f4ff19
Removing intermediate container 9a8110f4ff19
 ---> 7691808cf28e
Step 4/61 : RUN echo test > /test
 ---> Running in 9afb8f51510b
Removing intermediate container 9afb8f51510b
 ---> e42a0df2bb1c
Step 5/61 : RUN echo test > /test
 ---> Running in fe977ed6804e
Removing intermediate container fe977ed6804e
 ---> 55850c9b0479
Step 6/61 : RUN echo test > /test
 ---> Running in be65cbfad172
Removing intermediate container be65cbfad172
 ---> 0cf8acba70f0
Step 7/61 : RUN echo test > /test
 ---> Running in fd5b0907b6a9
Removing intermediate container fd5b0907b6a9
 ---> 257a4493d85d
Step 8/61 : RUN echo test > /test
 ---> Running in f7ca0ffd9076
Removing intermediate container f7ca0ffd9076
 ---> 3baa6f4fa2d5
Step 9/61 : RUN echo test > /test
 ---> Running in 5146814d4727
Removing intermediate container 5146814d4727
 ---> 485b9d5cf228
Step 10/61 : RUN echo test > /test
 ---> Running in a090eec1b743
Removing intermediate container a090eec1b743
 ---> a7eb10155b51
Step 11/61 : RUN echo test > /test
 ---> Running in 942660b288df
Removing intermediate container 942660b288df
 ---> 9d286a1e2133
Step 12/61 : RUN echo test > /test
 ---> Running in c3d369aa91df
Removing intermediate container c3d369aa91df
 ---> f78be4788992
Step 13/61 : RUN echo test > /test
 ---> Running in a03c3ac6888f
Removing intermediate container a03c3ac6888f
 ---> 6504363f61ab
Step 14/61 : RUN echo test > /test
 ---> Running in 0c3c2fca3f90
Removing intermediate container 0c3c2fca3f90
 ---> fe3448b8bb29
Step 15/61 : RUN echo test > /test
 ---> Running in 828d51c76d3b
Removing intermediate container 828d51c76d3b
 ---> 870684e3aea0
Step 16/61 : RUN echo test > /test
 ---> Running in 59a2f7c5f3ad
Removing intermediate container 59a2f7c5f3ad
 ---> cf84556ca5c0
Step 17/61 : RUN echo test > /test
 ---> Running in bfb4e088eeb3
Removing intermediate container bfb4e088eeb3
 ---> 9c8f9f652cef
Step 18/61 : RUN echo test > /test
 ---> Running in f1b88bb5a2d7
Removing intermediate container f1b88bb5a2d7
 ---> a6233ad21648
Step 19/61 : RUN echo test > /test
 ---> Running in 45f70577d709
Removing intermediate container 45f70577d709
 ---> 1b5cc52d370d
Step 20/61 : RUN echo test > /test
 ---> Running in 2ce231d5043d
Removing intermediate container 2ce231d5043d
 ---> 4a0e17cbebaa
Step 21/61 : RUN echo test > /test
 ---> Running in 52e4b0928f1f
Removing intermediate container 52e4b0928f1f
 ---> 99b50e989bcb
Step 22/61 : RUN echo test > /test
 ---> Running in f7ba3da7460d
Removing intermediate container f7ba3da7460d
 ---> bfa3cad88285
Step 23/61 : RUN echo test > /test
 ---> Running in 60180bf60f88
Removing intermediate container 60180bf60f88
 ---> fe7271988bcb
Step 24/61 : RUN echo test > /test
 ---> Running in 20324d396531
Removing intermediate container 20324d396531
 ---> e930bc039128
Step 25/61 : RUN echo test > /test
 ---> Running in b3ac70fd4404
Removing intermediate container b3ac70fd4404
 ---> 39d0a11ea6d8
Step 26/61 : RUN echo test > /test
 ---> Running in 0193267d3787
Removing intermediate container 0193267d3787
 ---> 8062d7aab0a5
Step 27/61 : RUN echo test > /test
 ---> Running in f41f45fb7985
Removing intermediate container f41f45fb7985
 ---> 1f5f18f2315b
Step 28/61 : RUN echo test > /test
 ---> Running in 90dd09c63d6e
Removing intermediate container 90dd09c63d6e
 ---> 02f0a1141f11
Step 29/61 : RUN echo test > /test
 ---> Running in c557e5386e0a
Removing intermediate container c557e5386e0a
 ---> dbcd6fb1f6f4
Step 30/61 : RUN echo test > /test
 ---> Running in 65369385d855
Removing intermediate container 65369385d855
 ---> e6e9058a0650
Step 31/61 : RUN echo test > /test
 ---> Running in d861fcc388fd
Removing intermediate container d861fcc388fd
 ---> 6e4c2c0f741f
Step 32/61 : RUN echo test > /test
 ---> Running in 1483962b7e1c
Removing intermediate container 1483962b7e1c
 ---> cf8f142aa055
Step 33/61 : RUN echo test > /test
 ---> Running in 5868934816c1
Removing intermediate container 5868934816c1
 ---> d5ff87cdc204
Step 34/61 : RUN echo test > /test
 ---> Running in e057f3201f3a
Removing intermediate container e057f3201f3a
 ---> b4031b7ab4ac
Step 35/61 : RUN echo test > /test
 ---> Running in 22b769b9079c
Removing intermediate container 22b769b9079c
 ---> 019d898510b6
Step 36/61 : RUN echo test > /test
 ---> Running in f1d364ef4ff8
Removing intermediate container f1d364ef4ff8
 ---> 9525cafdf04d
Step 37/61 : RUN echo test > /test
 ---> Running in 5bf505b8bdcc
Removing intermediate container 5bf505b8bdcc
 ---> cd5002b33bfd
Step 38/61 : RUN echo test > /test
 ---> Running in be24a921945c
Removing intermediate container be24a921945c
 ---> 8675db44d1b7
Step 39/61 : RUN echo test > /test
 ---> Running in 352dc6beef3d
Removing intermediate container 352dc6beef3d
 ---> 0ab0ece43c71
Step 40/61 : RUN echo test > /test
 ---> Running in eebde33e5d9b
Removing intermediate container eebde33e5d9b
 ---> 46ca4b0dfc03
Step 41/61 : RUN echo test > /test
 ---> Running in f920313a1e85
Removing intermediate container f920313a1e85
 ---> 7f3888414d58
Step 42/61 : RUN echo test > /test
 ---> Running in 10e2f4dc1ac7
Removing intermediate container 10e2f4dc1ac7
 ---> 14db9e15f2dc
Step 43/61 : RUN echo test > /test
 ---> Running in c849d6e89aa5
Removing intermediate container c849d6e89aa5
 ---> fdb770494dd6
Step 44/61 : RUN echo test > /test
 ---> Running in 419d1a8353db
Removing intermediate container 419d1a8353db
 ---> d12e9cf078be
Step 45/61 : RUN echo test > /test
 ---> Running in 0f1805263e4c
Removing intermediate container 0f1805263e4c
 ---> cd005e7b08a4
Step 46/61 : RUN echo test > /test
 ---> Running in 5bde05b46441
Removing intermediate container 5bde05b46441
 ---> 05aa426a3d4a
Step 47/61 : RUN echo test > /test
 ---> Running in 01ebc84bd1bc
Removing intermediate container 01ebc84bd1bc
 ---> 35d371fa4342
Step 48/61 : RUN echo test > /test
 ---> Running in 49f6c2f51dd4
Removing intermediate container 49f6c2f51dd4
 ---> 1090b5dfa130
Step 49/61 : RUN echo test > /test
 ---> Running in f8a9089cd725
Removing intermediate container f8a9089cd725
 ---> b2d0eec0716d
Step 50/61 : RUN echo test > /test
 ---> Running in a1697a0b2db0
Removing intermediate container a1697a0b2db0
 ---> 10d96ac8f497
Step 51/61 : RUN echo test > /test
 ---> Running in 33a2332c06eb
Removing intermediate container 33a2332c06eb
 ---> ba5bf5609c1c
Step 52/61 : RUN echo test > /test
 ---> Running in e8920392be0d
Removing intermediate container e8920392be0d
 ---> 5b3a95685c7e
Step 53/61 : RUN echo test > /test
 ---> Running in 4b9298587c65
Removing intermediate container 4b9298587c65
 ---> d4961a349141
Step 54/61 : RUN echo test > /test
 ---> Running in 8a0c960c2ba1
Removing intermediate container 8a0c960c2ba1
 ---> b413197fcfa2
Step 55/61 : RUN echo test > /test
 ---> Running in 536ee3b9596b
Removing intermediate container 536ee3b9596b
 ---> fc16b69b224a
Step 56/61 : RUN echo test > /test
 ---> Running in 8b817b8d7b59
Removing intermediate container 8b817b8d7b59
 ---> 2f0896400ff9
Step 57/61 : RUN echo test > /test
 ---> Running in ab0ed79ec3d4
Removing intermediate container ab0ed79ec3d4
 ---> b4fb420e736c
Step 58/61 : RUN echo test > /test
 ---> Running in 8548d7eead1f
Removing intermediate container 8548d7eead1f
 ---> 745103fd5a38
Step 59/61 : RUN echo test > /test
 ---> Running in 1980559ad5d6
Removing intermediate container 1980559ad5d6
 ---> 08c1c74a5618
Step 60/61 : FROM alpine
 ---> 11cd0b38bc3c
Step 61/61 : COPY --from=first /test /test
 ---> 67f053c66c27
Successfully built 67f053c66c27
PS E:\docker\build\36764>
```

Note also that subsequent error messages once you go beyond current platform limitations kind of suck (such as insufficient resources with a bunch of spew which is incomprehensible to most) and we could do better to detect this earlier in the daemon. That'll be for a (reasonably low-priority) follow-up though as and when I have time. Theoretically we *may*, if the platform doesn't require additional changes for RS5, be able to have bigger platform limits using the v2 schema with up to 127 VPMem devices, and the possibility to have multiple SCSI controllers per SVM/UVM. However, currently LCOW is using HCS v1 schema calls, and there's no plans to rewrite the graphdriver/libcontainerd components outside of the moving LCOW fully over to the containerd runtime/snapshotter using HCS v2 schema, which is still some time off fruition.

PS OK, while waiting for a full run to complete, I did get bored. Turns out it won't overflow line length as max(uint64) is 18446744073709551616 which would still be short enough at 127 layers, double the current platform limit. And I could always change it to hex or base36 to make it even shorter, or remove the 'd' from /tmp/dN. IOW, pretty sure no-one is going to hit the limit even if we could get the platform to 256 which is the current Hyper-V SCSI limit per VM (4x64), although PMEM at 127 would be the next immediate limit.
2018-08-16 19:18:55 -07:00
Daniel Nephin 4f0d95fa6e Add canonical import comment
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-02-05 16:51:57 -05:00
Yong Tang 03a1df9536
Merge pull request #36114 from Microsoft/jjh/fixdeadlock-lcowdriver-hotremove
LCOW: Graphdriver fix deadlock in hotRemoveVHDs
2018-01-29 09:57:43 -08:00
John Howard a44fcd3d27 LCOW: Graphdriver fix deadlock
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-01-26 08:57:52 -08:00
John Howard 420dc4eeb4 LCOW: Regular mount if only one layer
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-01-18 12:01:58 -08:00
Akash Gupta 7a7357dae1 LCOW: Implemented support for docker cp + build
This enables docker cp and ADD/COPY docker build support for LCOW.
Originally, the graphdriver.Get() interface returned a local path
to the container root filesystem. This does not work for LCOW, so
the Get() method now returns an interface that LCOW implements to
support copying to and from the container.

Signed-off-by: Akash Gupta <akagup@microsoft.com>
2017-09-14 12:07:52 -07:00