Commit Graph

18 Commits

Author SHA1 Message Date
Sebastiaan van Stijn 07ff4f1de8
goimports: fix imports
Format the source according to latest goimports.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-09-18 12:56:54 +02:00
Olli Janatuinen 80d7bfd54d Capabilities refactor
- Add support for exact list of capabilities, support only OCI model
- Support OCI model on CapAdd and CapDrop but remain backward compatibility
- Create variable locally instead of declaring it at the top
- Use const for magic "ALL" value
- Rename `cap` variable as it overlaps with `cap()` built-in
- Normalize and validate capabilities before use
- Move validation for conflicting options to validateHostConfig()
- TweakCapabilities: simplify logic to calculate capabilities

Signed-off-by: Olli Janatuinen <olli.janatuinen@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2019-01-22 21:50:41 +02:00
Jonathan A. Schweder 64e52ff3db Masked /proc/asound
@sw-pschmied originally post this in #38285

While looking through the Moby source code was found /proc/asound to be
shared with containers as read-only (as defined in
https://github.com/moby/moby/blob/master/oci/defaults.go#L128).

This can lead to two information leaks.

---

**Leak of media playback status of the host**

Steps to reproduce the issue:

 - Listen to music/Play a YouTube video/Do anything else that involves
sound output
 - Execute docker run --rm ubuntu:latest bash -c "sleep 7; cat
/proc/asound/card*/pcm*p/sub*/status | grep state | cut -d ' ' -f2 |
grep RUNNING || echo 'not running'"
 - See that the containerized process is able to check whether someone
on the host is playing music as it prints RUNNING
 - Stop the music output
 - Execute the command again (The sleep is delaying the output because
information regarding playback status isn't propagated instantly)
 - See that it outputs not running

**Describe the results you received:**

A containerized process is able to gather information on the playback
status of an audio device governed by the host. Therefore a process of a
container is able to check whether and what kind of user activity is
present on the host system. Also, this may indicate whether a container
runs on a desktop system or a server as media playback rarely happens on
server systems.

The description above is in regard to media playback - when examining
`/proc/asound/card*/pcm*c/sub*/status` (`pcm*c` instead of `pcm*p`) this
can also leak information regarding capturing sound, as in recording
audio or making calls on the host system.

Signed-off-by: Jonathan A. Schweder <jonathanschweder@gmail.com>
2018-11-30 10:03:10 -02:00
Antonio Murdaca 569b9702a5
Add /proc/acpi to masked paths
The deafult OCI linux spec in oci/defaults{_linux}.go in Docker/Moby
from 1.11 to current upstream master does not block /proc/acpi pathnames
allowing attackers to modify host's hardware like enabling/disabling
bluetooth or turning up/down keyboard brightness. SELinux prevents all
of this if enabled.

Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2018-07-05 17:39:52 +02:00
Justin Cormack de23cb9398 Add /proc/keys to masked paths
This leaks information about keyrings on the host. Keyrings are
not namespaced.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2018-02-21 16:23:34 +00:00
Daniel Nephin 4f0d95fa6e Add canonical import comment
Signed-off-by: Daniel Nephin <dnephin@docker.com>
2018-02-05 16:51:57 -05:00
John Howard b023a46a07 Don't special case /sys/firmware in masked paths
Signed-off-by: John Howard <jhoward@microsoft.com>
2017-11-08 12:10:42 -08:00
Justin Cormack a21ecdf3c8 Add /proc/scsi to masked paths
This is writeable, and can be used to remove devices. Containers do
not need to know about scsi devices.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2017-11-03 15:12:22 +00:00
John Howard 71651e0b80 Fixes LCOW after containerd 1.0 introduced regressions
Signed-off-by: John Howard <jhoward@microsoft.com>
2017-10-27 09:55:43 -07:00
Michael Crosby 5a9b5f10cf Remove solaris files
For obvious reasons that it is not really supported now.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
2017-10-24 15:39:34 -04:00
Kenfe-Mickael Laventure ddae20c032
Update libcontainerd to use containerd 1.0
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-10-20 07:11:37 -07:00
John Howard 285bc99731 Merge pull request #34356 from mlaventure/update-containerd
Update containerd to 06b9cb35161009dcb7123345749fef02f7cea8e0
2017-08-24 14:25:44 -07:00
Darren Stahl 7c29103ad9
Update Windows and LCOW to use v1.0.0 runtime-spec
Signed-off-by: Darren Stahl <darst@microsoft.com>
Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-08-21 15:19:31 -07:00
Kenfe-Mickael Laventure 45d85c9913
Update containerd to 06b9cb35161009dcb7123345749fef02f7cea8e0
This also update:
 - runc to 3f2f8b84a77f73d38244dd690525642a72156c64
 - runtime-specs to v1.0.0

Signed-off-by: Kenfe-Mickael Laventure <mickael.laventure@gmail.com>
2017-08-21 12:04:07 -07:00
Christophe Vidal dffa5d6df2 Dropped hyphen in bind mount where appropriate
Signed-off-by: Christophe Vidal <kriss@krizalys.com>
2017-08-19 21:25:07 +07:00
Kir Kolyshkin 7120976d74 Implement none, private, and shareable ipc modes
Since the commit d88fe447df ("Add support for sharing /dev/shm/ and
/dev/mqueue between containers") container's /dev/shm is mounted on the
host first, then bind-mounted inside the container. This is done that
way in order to be able to share this container's IPC namespace
(and the /dev/shm mount point) with another container.

Unfortunately, this functionality breaks container checkpoint/restore
(even if IPC is not shared). Since /dev/shm is an external mount, its
contents is not saved by `criu checkpoint`, and so upon restore any
application that tries to access data under /dev/shm is severily
disappointed (which usually results in a fatal crash).

This commit solves the issue by introducing new IPC modes for containers
(in addition to 'host' and 'container:ID'). The new modes are:

 - 'shareable':	enables sharing this container's IPC with others
		(this used to be the implicit default);

 - 'private':	disables sharing this container's IPC.

In 'private' mode, container's /dev/shm is truly mounted inside the
container, without any bind-mounting from the host, which solves the
issue.

While at it, let's also implement 'none' mode. The motivation, as
eloquently put by Justin Cormack, is:

> I wondered a while back about having a none shm mode, as currently it is
> not possible to have a totally unwriteable container as there is always
> a /dev/shm writeable mount. It is a bit of a niche case (and clearly
> should never be allowed to be daemon default) but it would be trivial to
> add now so maybe we should...

...so here's yet yet another mode:

 - 'none':	no /dev/shm mount inside the container (though it still
		has its own private IPC namespace).

Now, to ultimately solve the abovementioned checkpoint/restore issue, we'd
need to make 'private' the default mode, but unfortunately it breaks the
backward compatibility. So, let's make the default container IPC mode
per-daemon configurable (with the built-in default set to 'shareable'
for now). The default can be changed either via a daemon CLI option
(--default-shm-mode) or a daemon.json configuration file parameter
of the same name.

Note one can only set either 'shareable' or 'private' IPC modes as a
daemon default (i.e. in this context 'host', 'container', or 'none'
do not make much sense).

Some other changes this patch introduces are:

1. A mount for /dev/shm is added to default OCI Linux spec.

2. IpcMode.Valid() is simplified to remove duplicated code that parsed
   'container:ID' form. Note the old version used to check that ID does
   not contain a semicolon -- this is no longer the case (tests are
   modified accordingly). The motivation is we should either do a
   proper check for container ID validity, or don't check it at all
   (since it is checked in other places anyway). I chose the latter.

3. IpcMode.Container() is modified to not return container ID if the
   mode value does not start with "container:", unifying the check to
   be the same as in IpcMode.IsContainer().

3. IPC mode unit tests (runconfig/hostconfig_test.go) are modified
   to add checks for newly added values.

[v2: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-51345997]
[v3: addressed review at https://github.com/moby/moby/pull/34087#pullrequestreview-53902833]
[v4: addressed the case of upgrading from older daemon, in this case
     container.HostConfig.IpcMode is unset and this is valid]
[v5: document old and new IpcMode values in api/swagger.yaml]
[v6: add the 'none' mode, changelog entry to docs/api/version-history.md]

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2017-08-14 10:50:39 +03:00
Daniel J Walsh bfdb0f3cb8 /dev should be constrained in size
There really is no reason why anyone should create content in /dev
other then device nodes.  Limiting it size to the 64 k size limit.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2017-07-20 08:59:56 -04:00
John Howard f154588226 LCOW: OCI Spec and Environment for container start
Signed-off-by: John Howard <jhoward@microsoft.com>
2017-06-20 19:50:11 -07:00