runc v1.2.0-rc.1 -- "There's a frood who really knows where his towel is."

This is the first release candidate for the 1.2.0 branch of runc. It includes
all patches and bugfixes included in runc 1.1 patch releases (up to and
including 1.1.12). A fair few new features have been added, and some changes
have been made which may affect users. Please help us thoroughly test this
release before we release 1.2.0.

runc now requires a minimum of Go 1.20 to compile.

> NOTE: runc currently will not work properly when compiled with Go 1.22 or
> newer. This is due to some unfortunate glibc behaviour that Go 1.22
> exacerbates in a way that results in containers not being able to start on
> some systems. [See this issue for more information.][runc-4233]

Breaking:

 * Several aspects of how mount options work have been adjusted in a way that
   could theoretically break users that have very strange mount option strings.
   This was necessary to fix glaring issues in how mount options were being
   treated. The key changes are:

   - Mount options on bind-mounts that clear a mount flag are now always
     applied. Previously, if a user requested a bind-mount with only clearing
     options (such as `rw,exec,dev`) the options would be ignored and the
     original bind-mount options would be set. Unfortunately this also means
     that container configurations which specified only clearing mount options
     will now actually get what they asked for, which could break existing
     containers (though it seems unlikely that a user who requested a specific
     mount option would consider it "broken" to get the mount options they
     asked for). This also allows us to
     silently add locked mount flags the user *did not explicitly request to be
     cleared* in rootless mode, allowing for easier use of bind-mounts for
     rootless containers. (#3967)

   - Container configurations using bind-mounts with superblock mount flags
     (i.e. filesystem-specific mount flags, referred to as "data" in
     `mount(2)`, as opposed to VFS generic mount flags like `MS_NODEV`) will
     now return an error. This is because superblock mount flags will also
     affect the host mount (as the superblock is shared when bind-mounting),
     which is obviously not acceptable. Previously, these flags were silently
     ignored so this change simply tells users that runc cannot fulfil their
     request rather than just ignoring it. (#3990)

   If any of these changes cause problems in real-world workloads, please [open
   an issue](https://github.com/opencontainers/runc/issues/new/choose) so we
   can adjust the behaviour to avoid compatibility issues.
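
The clearing behaviour described for bind-mounts can be illustrated with a
minimal sketch. It is not runc's actual implementation: the names
`flagOption`, `knownOptions`, `parseMountOptions` and `applyToBindMount` are
hypothetical, and the flag constants are defined locally (mirroring
`MS_RDONLY`, `MS_NODEV` and `MS_NOEXEC` from `mount(2)`) so the example builds
on any platform.

```go
package main

import (
	"fmt"
	"strings"
)

// Values mirror MS_RDONLY, MS_NODEV and MS_NOEXEC from mount(2). They are
// defined locally for illustration only.
const (
	msRdonly uint32 = 0x1
	msNodev  uint32 = 0x4
	msNoexec uint32 = 0x8
)

// flagOption is a hypothetical table entry: the flag an option refers to and
// whether the option clears it (for example, "rw" clears MS_RDONLY).
type flagOption struct {
	flag   uint32
	clears bool
}

var knownOptions = map[string]flagOption{
	"ro":     {msRdonly, false},
	"rw":     {msRdonly, true},
	"nodev":  {msNodev, false},
	"dev":    {msNodev, true},
	"noexec": {msNoexec, false},
	"exec":   {msNoexec, true},
}

// parseMountOptions splits a comma-separated option string into the flags to
// set and the flags to clear. Unknown options are skipped in this sketch.
func parseMountOptions(opts string) (set, clr uint32) {
	for _, name := range strings.Split(opts, ",") {
		opt, ok := knownOptions[name]
		if !ok {
			continue
		}
		if opt.clears {
			clr |= opt.flag
			set &^= opt.flag
		} else {
			set |= opt.flag
			clr &^= opt.flag
		}
	}
	return set, clr
}

// applyToBindMount computes the flags a bind-mount should end up with, given
// the flags of the source mount. Clearing options are honoured even when no
// "set" option was requested.
func applyToBindMount(srcFlags, set, clr uint32) uint32 {
	return (srcFlags &^ clr) | set
}

func main() {
	set, clr := parseMountOptions("rw,exec,dev")
	src := msRdonly | msNodev | msNoexec
	fmt.Printf("set=%#x clear=%#x final=%#x\n", set, clr, applyToBindMount(src, set, clr))
}
```

With an option string made up only of clearing options, `set` is empty, so
the final flags depend entirely on whether the clear mask is applied.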

Added:

 * runc has been updated to OCI runtime-spec 1.2.0, and supports all Linux
   features with a few minor exceptions. See
   [`docs/spec-conformance.md`](https://github.com/opencontainers/runc/blob/v1.2.0-rc.1/docs/spec-conformance.md)
   for more details.
 * runc now supports id-mapped mounts for bind-mounts (with no restrictions on
   the mapping used for each mount). Other mount types are not currently
   supported. This feature requires `MOUNT_ATTR_IDMAP` kernel support (Linux
   5.12 or newer) as well as kernel support for the underlying filesystem used
   for the bind-mount. See [`mount_setattr(2)`][mount_setattr.2] for a list of
   supported filesystems and other restrictions. (#3717, #3985, #3993)
 * Two new mechanisms for reducing the memory usage of our protections against
   [CVE-2019-5736][cve-2019-5736] have been introduced:
   - `runc-dmz` is a minimal binary (~8K) which acts as an additional execve
     stage, allowing us to only need to protect the smaller binary. It should
     be noted that there have been several compatibility issues reported with
     the usage of `runc-dmz` (namely related to capabilities and SELinux). As
     such, this mechanism is **opt-in** and can be enabled by running `runc`
     with the environment variable `RUNC_DMZ=true` (setting this environment
     variable in `config.json` will have no effect). This feature can be
     disabled at build time using the `runc_nodmz` build tag. (#3983, #3987)
   - `contrib/memfd-bind` is a helper daemon which will bind-mount a memfd copy
     of `/usr/bin/runc` on top of `/usr/bin/runc`. This entirely eliminates
     per-container copies of the binary, but requires care to ensure that
     upgrades to runc are handled properly, and requires a long-running daemon
     (unfortunately memfds cannot be bind-mounted directly and thus require a
     daemon to keep them alive). (#3987)
 * runc will now use `cgroup.kill` if available to kill all processes in a
   container (such as when doing `runc kill`). (#3135, #3825)
 * Add support for setting the umask for `runc exec`. (#3661)
 * libct/cg: support `SCHED_IDLE` for runc cgroupfs. (#3377)
 * checkpoint/restore: implement `--manage-cgroups-mode=ignore`. (#3546)
 * seccomp: refactor flags support; add flags to features, set `SPEC_ALLOW` by
   default. (#3588)
 * libct/cg/sd: use systemd v240+ new `MAJOR:*` syntax. (#3843)
 * Support CFS bandwidth burst for CPU. (#3749, #3145)
 * Support time namespaces. (#3876)
 * Reduce the `runc` binary size by ~11% by updating
   `github.com/checkpoint-restore/go-criu`. (#3652)
 * Add `--pidfd-socket` to `runc run` and `runc exec` to allow for management
   processes to receive a pidfd for the new process, allowing them to avoid pid
   reuse attacks. (#4045)

Deprecated:

 * `runc` option `--criu` is now ignored (with a warning), and the option will
   be removed entirely in a future release. Users who need a non-standard
   `criu` binary should rely on the standard way of looking up binaries in
   `$PATH`. (#3316)
 * `runc kill` option `-a` is now deprecated. Previously, it had to be specified
   to kill a container (with SIGKILL) which does not have its own private PID
   namespace (so that runc would send SIGKILL to all processes). Now, this is
   done automatically. (#3864, #3825)
 * `github.com/opencontainers/runc/libcontainer/user` is now deprecated, please
   use `github.com/moby/sys/user` instead. It will be removed in a future
   release. (#4017)

Changed:

 * When the Intel RDT feature is not available, its initialization is skipped,
   resulting in slightly faster `runc exec` and `runc run`. (#3306)
 * `runc features` is no longer experimental. (#3861)
 * libcontainer users that create and kill containers from a daemon process
   (so that the container init is a child of that process) must now implement
   a proper child reaper in case a container does not have its own private PID
   namespace, as documented in `container.Signal`. (#3825)
 * Sum `anon` and `file` from `memory.stat` for cgroupv2 root usage,
   as the root does not have `memory.current` for cgroupv2.
   This aligns cgroupv2 root usage more closely with cgroupv1 reporting.
   Additionally, report root swap usage as sum of swap and memory usage,
   aligned with v1 and existing non-root v2 reporting. (#3933)
 * Add `swapOnlyUsage` in `MemoryStats`. This field reports swap-only usage.
   For cgroupv1, `Usage` and `Failcnt` are set by subtracting memory usage
   from memory+swap usage. For cgroupv2, `Usage`, `Limit`, and `MaxUsage`
   are set. (#4010)
 * libcontainer: `container.Signal` no longer takes an `all` argument. Whether
   or not it is necessary to kill all processes in the container individually
   is now determined automatically. (#3825, #3885)
 * seccomp: enable seccomp binary tree optimization. (#3405)
 * `runc run`/`runc exec`: ignore SIGURG. (#3368)
 * Remove tun/tap from the default device allowlist. (#3468)
 * `runc --root non-existent-dir list` now reports an error for non-existent
   root directory. (#3374)
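
The cgroupv2 root memory accounting described above (summing `anon` and
`file` from `memory.stat`) can be sketched as follows. This is a simplified
illustration rather than the libcontainer implementation; the function name
`rootMemoryUsage` and the sample input are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// rootMemoryUsage takes the contents of a cgroupv2 memory.stat file and
// returns the sum of its "anon" and "file" entries, as an approximation of
// the usage that memory.current would report for a non-root cgroup.
func rootMemoryUsage(memoryStat string) (uint64, error) {
	var total uint64
	found := 0

	sc := bufio.NewScanner(strings.NewReader(memoryStat))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 2 {
			continue
		}
		// Match the keys exactly so that entries such as "file_mapped"
		// or "anon_thp" are not counted.
		if fields[0] != "anon" && fields[0] != "file" {
			continue
		}
		v, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			return 0, fmt.Errorf("memory.stat: bad value for %q: %w", fields[0], err)
		}
		total += v
		found++
	}
	if err := sc.Err(); err != nil {
		return 0, err
	}
	if found != 2 {
		return 0, fmt.Errorf("memory.stat: expected anon and file entries, found %d", found)
	}
	return total, nil
}

func main() {
	// Hypothetical memory.stat excerpt.
	sample := "anon 4096\nfile 8192\nkernel_stack 16384\n"
	usage, err := rootMemoryUsage(sample)
	fmt.Println(usage, err)
}
```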

Fixed:

 * In case the runc binary resides on tmpfs, `runc init` no longer re-execs
   itself twice. (#3342)
 * Our seccomp `-ENOSYS` stub now correctly handles multiplexed syscalls on
   s390 and s390x. This solves the issue where syscalls the host kernel did not
   support would return `-EPERM` despite the existence of the `-ENOSYS` stub
   code (this was due to how s390x does syscall multiplexing). (#3474)
 * Remove tun/tap from the default device rules. (#3468)
 * specconv: avoid mapping "acl" to `MS_POSIXACL`. (#3739)
 * libcontainer: fix private PID namespace detection when killing the
   container. (#3866, #3825)
 * systemd socket notification: fix race where runc exited before systemd
   properly handled the `READY` notification. (#3291, #3293)
 * The `-ENOSYS` seccomp stub is now always generated for the native
   architecture that `runc` is running on. This is needed to work around some
   arguably specification-incompliant behaviour from Docker on architectures
   such as ppc64le, where the allowed architecture list is set to `null`. This
   ensures that we always generate at least one `-ENOSYS` stub for the native
   architecture even with these weird configs. (#4219)

Removed:

 * In order to fix performance issues in the "lightweight" bindfd protection
   against [CVE-2019-5736][cve-2019-5736], the temporary `ro` bind-mount of
   `/proc/self/exe` has been removed. runc now creates a binary copy in all
   cases. See the above notes about `memfd-bind` and `runc-dmz` as well as
   `contrib/cmd/memfd-bind/README.md` for more information about how this
   (minor) change in memory usage can be further reduced. (#3987, #3599, #2532,
   #3931)
 * libct/cg: Remove `EnterPid` (a function with no users). (#3797)
 * libcontainer: Remove `{Pre,Post}MountCmds` which were never used and are
   obsoleted by more generic container hooks. (#3350)

[runc-4233]: https://github.com/opencontainers/runc/issues/4233
[mount_setattr.2]: https://man7.org/linux/man-pages/man2/mount_setattr.2.html
[cve-2019-5736]: https://github.com/advisories/GHSA-gxmr-w5mj-v8hh

Thanks to the following contributors who made this release possible:

 * Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
 * Alban Crequy <albancrequy@microsoft.com>
 * Aleksa Sarai <cyphar@cyphar.com>
 * Alex Jia <ajia@redhat.com>
 * Alexander Eldeib <alexeldeib@gmail.com>
 * Andrey Tsygunka <dreamsider@mail.ru>
 * Austin Vazquez <macedonv@amazon.com>
 * Bjorn Neergaard <bjorn.neergaard@docker.com>
 * Brian Goff <cpuguy83@gmail.com>
 * Chengen, Du <chengen.du@canonical.com>
 * Chethan Suresh <chethan.suresh@sony.com>
 * Christian Happ <Christian.Happ@jumo.net>
 * Cory Snider <csnider@mirantis.com>
 * CrazyMax <crazy-max@users.noreply.github.com>
 * Daniel, Dao Quang Minh <dqminh89@gmail.com>
 * Danish Prakash <grafitykoncept@gmail.com>
 * Davanum Srinivas <davanum@gmail.com>
 * Eng Zer Jun <engzerjun@gmail.com>
 * Eric Ernst <eric_ernst@apple.com>
 * Erik Sjölund <erik.sjolund@gmail.com>
 * Evan Phoenix <evan@phx.io>
 * Francis Laniel <flaniel@linux.microsoft.com>
 * Heran Yang <heran55@126.com>
 * Irwin D'Souza <dsouzai.gh@gmail.com>
 * Jaroslav Jindrak <dzejrou@gmail.com>
 * Jonas Eschenburg <jonas.eschenburg@kuka.com>
 * Jordan Rife <jrife0@gmail.com>
 * Kailun Qin <kailun.qin@intel.com>
 * Kang Chen <kongchen28@gmail.com>
 * Kazuki Hasegawa <nanasi880@gmail.com>
 * Kir Kolyshkin <kolyshkin@gmail.com>
 * Markus Lehtonen <markus.lehtonen@intel.com>
 * Masahiro Yamada <masahiroy@kernel.org>
 * Mikko Ylinen <mikko.ylinen@intel.com>
 * Mrunal Patel <mrunalp@gmail.com>
 * Peter Hunt <pehunt@redhat.com>
 * Prajwal S N <prajwalnadig21@gmail.com>
 * Qiang Huang <h.huangqiang@huawei.com>
 * Radostin Stoyanov <rstoyanov@fedoraproject.org>
 * Rodrigo Campos <rodrigoca@microsoft.com>
 * Ruediger Pluem <ruediger.pluem@vodafone.com>
 * Sebastiaan van Stijn <github@gone.nl>
 * Shengjing Zhu <zhsj@debian.org>
 * Sjoerd van Leent <sjoerd.van.leent@alliander.com>
 * SuperQ <superq@gmail.com>
 * TTFISH <jiongchiyu@gmail.com>
 * Tianon Gravi <admwiggin@gmail.com>
 * Vipul Newaskar <vipulnewaskar7@gmail.com>
 * Walt Chen <godsarmycy@gmail.com>
 * Wang-squirrel <117961776+Wang-squirrel@users.noreply.github.com>
 * Wei Fu <fuweid89@gmail.com>
 * Zheao Li <me@manjusaka.me>
 * Zoe <hi@zoe.im>
 * cdoern <cdoern@redhat.com>
 * dharmicksai <dharmicksaik@gmail.com>
 * guodong <guodong9211@gmail.com>
 * hang.jiang <hang.jiang@daocloud.io>
 * lengrongfu <lengrongfu@lengrongfudeMacBook-Pro.local>
 * lifubang <lifubang@acmcoder.com>
 * utam0k <k0ma@utam0k.jp>
 * wineway <wangyuweihx@gmail.com>
 * yanggang <gang.yang@daocloud.io>
 * yaozhenxiu <946666800@qq.com>

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>