This commit adds extra setup when cgroupsv2 is enabled. In particular,
we make sure that the root namespace has setup cgroup.subtree_control
with the controllers we need.
If the necessary controller are not listed, we have to move all
processes out of the root namespace before we can change this
(the 'no internal processes' rule:
https://unix.stackexchange.com/a/713343). Currently we only
handle the case where the nsjail process is the only process in
the cgroup. It seems like this would be relatively rare, but since
nsjail is frequently the root process in a Docker container (e.g.
for hosting CTF challenges), I think this case is common enough to
make it worth implementing.
This also adds `--detect_cgroupv2`, which will attempt to detect
whether `--cgroupv2_mount` is a valid cgroupv2 mount, and if so
it will set `use_cgroupv2`. This is useful in containerized
environments where you may not know the kernel version ahead of time.
References:
https://github.com/redpwn/jail/blob/master/internal/cgroup/cgroup2.go
Currently, we always kill children by sending them a SIGKILL signal in
case we've got a fatal signal. This is rather inflexible and forbids
some usecases where e.g. child process listen for specific signals to
shut down gracefully.
Add a new command configuration `--forward_signals` that allows the user
to opt-in to forwarding fatal signals to the child process.
`subproc::killAndReapAll()` is always killing the child process with the
SIGKILL signal. We're about to make this configurable though so that we
may optionally forward signals received by nsjail to the child process.
Add a new parameter to `killAndReapAll()` to prepare for this change.