Commit Graph

23 Commits

Author SHA1 Message Date
NobodyOnSE b2a907c8ca Whitelist statx syscall for libseccomp-2.3.3 onward
Older seccomp versions will ignore this.

Signed-off-by: NobodyOnSE <ich@sektor.selfip.com>
2018-03-06 08:42:12 +01:00
Simon Vikstrom d7bf5e3b4d Remove double defined alarm
Signed-off-by: Simon Vikstrom <pullreq@devsn.se>
2017-08-19 09:55:03 +02:00
Panagiotis Moustafellos cf6e1c5dfd seccomp: whitelist quotactl with CAP_SYS_ADMIN
The quotactl syscall is being whitelisted in default seccomp profile,
gated by CAP_SYS_ADMIN.

Signed-off-by: Panagiotis Moustafellos <pmoust@elastic.co>
2017-08-09 18:52:15 +03:00
Miklos Szegedi 2db05316d0 Whitelist adjtimex get operation. Adjustment operations are gated by CAP_SYS_TIME
Signed-off-by: Miklos Szegedi <miklos.szegedi@cloudera.com>
2017-06-02 18:48:16 +00:00
Justin Cormack dcf2632945 Revert "Block obsolete socket families in the default seccomp profile"
This reverts commit 7e3a596a63.

Unfortunately, it was pointed out in https://github.com/moby/moby/pull/29076#commitcomment-21831387
that the `socketcall` syscall takes a pointer to a struct so it is not possible to
use seccomp profiles to filter it. This means these cannot be blocked as you can
use `socketcall` to call them regardless, as we currently allow 32 bit syscalls.

Users who wish to block these should use a seccomp profile that blocks all
32 bit syscalls and then just block the non socketcall versions.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2017-05-09 14:26:00 +01:00
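The limitation described in the revert above can be sketched in a few lines. This is a toy model, not seccomp itself: the function name `visible_family` and the example pointer value are hypothetical, and the only assumption about the kernel is the one the commit message states, namely that a filter compares raw argument values and cannot follow a pointer.

```python
# Sub-call number for socket() inside socketcall, from <linux/net.h>.
SYS_SOCKET = 1
AF_INET = 2
AF_APPLETALK = 5


def visible_family(syscall, regs):
    """Return the address family a register-only filter can see, or None.

    `regs` stands in for the raw syscall argument registers, which are
    the only inputs a seccomp filter can compare against.
    """
    if syscall == "socket":
        # socket(family, type, protocol): the family is in the first register.
        return regs[0]
    if syscall == "socketcall":
        # socketcall(call, args): regs[1] is only an address. The family
        # lives in user memory behind it, which the filter cannot read.
        return None
    raise ValueError(f"unexpected syscall: {syscall}")


# Direct call: the filter can see, and therefore block, AF_APPLETALK.
print(visible_family("socket", (AF_APPLETALK, 1, 0)))
# Multiplexed call: the same request, but the family is hidden.
print(visible_family("socketcall", (SYS_SOCKET, 0x7FFD1000)))
```

Because the second case gives the filter nothing to match on, blocking a family on `socket` alone is only effective if `socketcall` (and so 32 bit syscalls) is blocked as well, which is what the commit message recommends.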
Ian Campbell cd456433ea seccomp: Allow personality with UNAME26 bit set.
From personality(2):

    Have uname(2) report a 2.6.40+ version number rather than a 3.x version
    number.  Added as a stopgap measure to support broken applications that
    could not handle the kernel version-numbering switch from 2.6.x to 3.x.

This allows both "UNAME26|PER_LINUX" and "UNAME26|PER_LINUX32".

Fixes: #32839

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
2017-05-02 15:05:01 +01:00
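As a rough illustration of the commit above, the set of personality(2) argument values it describes can be computed from the flag constants. The constants are taken from `<linux/personality.h>`; the helper names are hypothetical, and the base set (PER_LINUX, PER_LINUX32, and the query value) is assumed from the earlier "Add some uses of personality syscall" commit rather than read from the profile itself.

```python
# Constants from <linux/personality.h>.
PER_LINUX = 0x0000
PER_LINUX32 = 0x0008
UNAME26 = 0x0020000
ADDR_NO_RANDOMIZE = 0x0040000  # disables ASLR; deliberately not allowed
QUERY_ONLY = 0xFFFFFFFF  # reads the current personality without changing it


def allowed_personality_values():
    """Build the exact argument values the filter would accept."""
    base = {PER_LINUX, PER_LINUX32}
    # The UNAME26 bit may be combined with either base personality.
    with_uname26 = {UNAME26 | persona for persona in base}
    return base | with_uname26 | {QUERY_ONLY}


def is_allowed(arg0):
    """Exact-match check on the first syscall argument."""
    return arg0 in allowed_personality_values()


for value in sorted(allowed_personality_values()):
    print(hex(value))
```

The check is an exact match on the whole argument rather than a bit test, so a value such as `ADDR_NO_RANDOMIZE | PER_LINUX` is still rejected.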
Antonio Murdaca 3ab4961032 profiles: seccomp: allow clock_settime when CAP_SYS_TIME is added
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-03-20 11:05:23 +01:00
Justin Cormack 9067ef0e32 Seccomp Update
- Update libseccomp-golang to 0.9.0 release
- Update libseccomp to 2.3.2 release
- add preadv2 and pwritev2 syscalls to whitelist

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2017-03-07 22:19:46 +00:00
Gabriel Linder 52d8f582c3 Allow sync_file_range2 on supported architectures.
Signed-off-by: Gabriel Linder <linder.gabriel@gmail.com>
2017-02-14 21:29:33 +01:00
Justin Cormack d6adcd6a82 Add two arm specific syscalls to seccomp profile
These are arm variants with different argument ordering because of
register alignment requirements.

fix #30516

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2017-01-29 14:59:45 +00:00
Justin Cormack 7e3a596a63 Block obsolete socket families in the default seccomp profile
Linux supports many obsolete address families, which are usually available in
common distro kernels, but they are less likely to be properly audited and
may have security issues.

This blocks all socket families in the socket (and socketcall where applicable) syscall
except
- AF_UNIX - Unix domain sockets
- AF_INET - IPv4
- AF_INET6 - IPv6
- AF_NETLINK - Netlink sockets for communicating with the kernel
- AF_PACKET - raw sockets, which are only allowed with CAP_NET_RAW

All other socket families are blocked, including AppleTalk (native, not
over IP), IPX (remember that!), and VSOCK and HVSOCK, which should not
generally be used in containers.

Note that users can of course provide a profile per container or in the daemon
config if they have unusual use cases that require these.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2017-01-17 17:50:44 +00:00
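The allowlist in the commit above amounts to a small decision function on the first argument of socket(2). A minimal sketch follows, assuming the Linux numeric values for the address families; the function name and the way capabilities are passed in are hypothetical, and this commit was later reverted (see dcf2632945).

```python
# Linux address family numbers from <linux/socket.h>.
AF_UNIX, AF_INET, AF_IPX, AF_APPLETALK = 1, 2, 4, 5
AF_INET6, AF_NETLINK, AF_PACKET, AF_VSOCK = 10, 16, 17, 40

# Families allowed regardless of capabilities.
ALWAYS_ALLOWED = {AF_UNIX, AF_INET, AF_INET6, AF_NETLINK}


def socket_family_allowed(family, caps=frozenset()):
    """Decide whether socket(family, ...) passes the sketched allowlist."""
    if family in ALWAYS_ALLOWED:
        return True
    if family == AF_PACKET:
        # Raw packet sockets are only allowed together with CAP_NET_RAW.
        return "CAP_NET_RAW" in caps
    # Everything else (IPX, AppleTalk, VSOCK, ...) is blocked.
    return False


print(socket_family_allowed(AF_INET6))
print(socket_family_allowed(AF_PACKET))
print(socket_family_allowed(AF_PACKET, caps={"CAP_NET_RAW"}))
print(socket_family_allowed(AF_IPX))
```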
Antonio Murdaca 5ff21add06 New seccomp format
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2016-09-01 11:53:07 +02:00
Justin Cormack bdf01cf5de Move mlock back into the default ungated seccomp profile
Do not gate with CAP_IPC_LOCK as unprivileged use is now
allowed in Linux. This returns it to how it was in 1.11.

Fixes #23587

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2016-06-15 16:25:27 -04:00
Justin Cormack 9ed6e39cdd Do not restrict chown via seccomp, just let capabilities control access
In #22554 I aligned seccomp and capabilities. However, the case of
the chown calls and CAP_CHOWN was less clear-cut, as these are
simple calls that the capabilities will block if they are not
allowed. They are needed when no-new-privileges is not set, in
order to allow docker to call chown before the container is
started, so there was a workaround. That workaround did not include
all the chown syscalls, and Arm was failing on some seccomp
tests because it was using a different syscall from the
fchown that was allowed in this case. It is simpler to just
allow all the chown calls in the default seccomp profile and
let the capabilities subsystem block them.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2016-05-25 12:49:30 -07:00
Justin Cormack a83cedddc6 Enable seccomp on ppc64le
In order to do this, allow the socketcall syscall in the default
seccomp profile. This is a multiplexing syscall for the socket
operations, which is becoming obsolete gradually, but it is used
in some architectures. libseccomp has special handling for it on
x86, where it is common, so we did not need it in the profile,
but it has no such handling for ppc64le. It turns out that the
Debian images we use for tests do use the socketcall, while the
newer images such as Ubuntu 16.04 do not. Enabling this does no
harm as we allow all the socket operations anyway, and we allow
the similar ipc call for similar reasons already.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2016-05-23 22:35:55 -07:00
Justin Cormack a01c4dc8f8 Align default seccomp profile with selected capabilities
Currently the default seccomp profile is fixed. This changes it
so that it varies depending on the Linux capabilities selected with
the --cap-add and --cap-drop options. Without this, if a user adds
privileges, e.g. to allow ptrace with --cap-add sys_ptrace, they still
cannot actually use ptrace as it is blocked by seccomp, so
they will probably disable seccomp or use --privileged. With this
change the syscalls that are needed for the capability are also
allowed by the seccomp profile based on the selected capabilities.

While this patch makes it easier to do things with for example
cap_sys_admin enabled, as it will now allow creating new namespaces
and use of mount, it still allows less than --cap-add cap_sys_admin
--security-opt seccomp:unconfined would have previously. It is not
recommended that users run containers with cap_sys_admin as this does
give full access to the host machine.

It also cleans up some architecture specific system calls to be
only selected when needed.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2016-05-11 09:30:23 +01:00
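The idea in the commit above, deriving the allowed syscall set from the selected capabilities, can be sketched as follows. This is an illustration only: the base list, the default capability set, and the function names are hypothetical stand-ins, and the capability-to-syscall pairings are a small subset taken from the commit messages in this log, not the full profile.

```python
# Tiny stand-in for the ungated part of the profile (assumption).
BASE_SYSCALLS = {"read", "write", "openat"}

# Syscalls that are only allowed when the named capability is selected.
# Pairings taken from commit messages in this log; not exhaustive.
CAP_GATED = {
    "CAP_SYS_PTRACE": {"ptrace"},
    "CAP_SYS_ADMIN": {"mount", "quotactl"},
    "CAP_SYS_TIME": {"clock_settime"},
    "CAP_SYS_TTY_CONFIG": {"vhangup"},
}

# Illustrative subset of a default capability set (assumption).
DEFAULT_CAPS = {"CAP_CHOWN", "CAP_NET_RAW"}


def effective_caps(cap_add=(), cap_drop=()):
    """Apply --cap-add / --cap-drop style changes to the default set."""
    return (DEFAULT_CAPS | set(cap_add)) - set(cap_drop)


def allowed_syscalls(caps):
    """Union the base list with every list gated by a selected capability."""
    allowed = set(BASE_SYSCALLS)
    for cap, syscalls in CAP_GATED.items():
        if cap in caps:
            allowed |= syscalls
    return allowed


default_profile = allowed_syscalls(effective_caps())
ptrace_profile = allowed_syscalls(effective_caps(cap_add=["CAP_SYS_PTRACE"]))
print("ptrace" in default_profile)
print("ptrace" in ptrace_profile)
```

With this shape, adding a capability widens the seccomp profile by exactly the syscalls that capability needs, so the user has no reason to fall back to an unconfined profile.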
Justin Cormack e7a99ae5e1 Remove mlock and vhangup from the default seccomp profile
These syscalls are already blocked by the default capabilities:
mlock mlock2 mlockall require CAP_IPC_LOCK
vhangup requires CAP_SYS_TTY_CONFIG

There is therefore no reason to allow them in the default profile
as they cannot be used anyway.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2016-04-21 18:23:59 +01:00
Justin Cormack 96896f2d0b Add new syscalls in libseccomp 2.3.0 to seccomp default profile
This adds the following new syscalls that are supported in libseccomp 2.3.0,
including calls added up to kernel 4.5-rc4:
- mlock2 - same as mlock but with a flag
- copy_file_range - copy file contents, like splice but with reflink support

The following is not added, and is mentioned in the docs:
- userfaultfd - userspace page fault handling, mainly designed for process migration

The following are not added, as they only apply to less common architectures:
- switch_endian
- membarrier
- breakpoint
- set_tls

I plan to review the other architectures, some of which can now have seccomp
enabled in the build as they are now supported.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2016-03-16 21:17:32 +00:00
Justin Cormack 5abd881883 Allow restart_syscall in default seccomp profile
Fixes #20818

This syscall was blocked as there was some concern that it could be
used to bypass filtering of other syscall arguments. However none of the
potential syscalls where this could be an issue (poll, nanosleep,
clock_nanosleep, futex) are blocked in the default profile anyway.

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2016-03-11 16:44:11 +00:00
Justin Cormack 31410a6d79 Add ipc syscall to default seccomp profile
On 32 bit x86 this is a multiplexing syscall for the System V
ipc syscalls such as shmget, and so needs to be allowed for
shared memory access for 32 bit binaries.

Fixes #20733

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2016-03-05 22:12:23 +00:00
Justin Cormack 39b799ac53 Add some uses of personality syscall to default seccomp filter
We generally want to filter the personality(2) syscall, as it
allows disabling ASLR, and turning on some poorly supported
emulations that have been the target of CVEs. However the use
cases for reading the current value, setting the default
PER_LINUX personality, and setting PER_LINUX32 for 32 bit
emulation are fine.

See issue #20634

Signed-off-by: Justin Cormack <justin.cormack@docker.com>
2016-02-26 18:43:08 +01:00
Jessica Frazelle ad600239bc generate seccomp profile convert type
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2016-02-19 13:32:54 -08:00
Jessica Frazelle d57816de02 add default seccomp profile as json
profile is created by go generate

Signed-off-by: Jessica Frazelle <acidburn@docker.com>
2016-02-08 08:19:21 -08:00