### WHAT IS IT?
NsJail is a process isolation tool for Linux. It uses the namespacing, resource control, and seccomp-bpf syscall filter subsystems of the Linux kernel.

It can help with, among other things:
* Securing networking services (e.g. web, time, DNS), by isolating them from the rest of the OS
* Hosting computer security challenges (so-called CTFs)
* Containing invasive syscall-level OS fuzzers

This is NOT an official Google product.

### WHAT KIND OF ISOLATION DOES THIS TOOL PROVIDE?
1. Linux namespaces: UTS (hostname), MOUNT (chroot), PID (separate PID tree), IPC, NET (separate networking context), USER
2. FS constraints: chroot(), pivot_root(), RO-remounting
3. Resource limits (wall-time/CPU time limits, VM/mem address space limits, etc.)
4. Programmable seccomp-bpf syscall filters
5. Cloned and separated Ethernet interfaces
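
The resource limits in item 3 are ordinary kernel rlimits, so their effect can be tried out without NsJail. The following is a minimal sketch using the shell's `ulimit` builtin, not a description of how NsJail applies limits internally. The function name `limited_run` is hypothetical, and the values 32 and 600 are taken from the `--rlimit_nofile` and `--rlimit_cpu` defaults in the help output. It assumes a shell such as dash or bash whose `ulimit` accepts `-n` and `-t`.

```shell
#!/bin/sh
# Run a command in a subshell with lowered resource limits.
# The parentheses confine the limits to the subshell, so the
# calling shell keeps its own limits.
limited_run() {
    (
        ulimit -n 32     # RLIMIT_NOFILE: at most 32 open file descriptors
        ulimit -t 600    # RLIMIT_CPU: at most 600 seconds of CPU time
        "$@"
    )
}

# The child shell inherits the limits and reports them.
limited_run sh -c 'echo "nofile=$(ulimit -n) cpu=$(ulimit -t)"'
```

Limits set this way are inherited by every descendant of the subshell and cannot be raised again by an unprivileged process, which is what makes them useful as a containment layer.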
### WHAT KIND OF USE-CASES ARE SUPPORTED?
#### Isolation of network servers (inetd-style)
+ Server:
```
$ ./nsjail -Ml --port 9000 --chroot /chroot/ --user 99999 --group 99999 -- /bin/sh -i
```

+ Client:
```
$ nc 127.0.0.1 9000
/ $ ifconfig
/ $ ifconfig -a
lo        Link encap:Local Loopback
          LOOPBACK MTU:65536 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
/ $ ps wuax
PID   USER     COMMAND
    1 99999    /bin/sh -i
    3 99999    {busybox} ps wuax
/ $
```
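
A listening jail exposed to untrusted clients will usually want tighter per-client bounds than the defaults. The sketch below wraps the server command above in a hypothetical helper, `serve_jailed`, and adds `--max_conns_per_ip` and `--time_limit`, both of which appear in the `--help` output. The values 4 and 120 are illustrative assumptions, as is the `DRY_RUN` switch, which prints the command instead of running it.

```shell
#!/bin/sh
# Path to the nsjail binary; override with NSJAIL=/path/to/nsjail.
NSJAIL=${NSJAIL:-./nsjail}

# serve_jailed PORT COMMAND [ARGS...]
# Starts a TCP-listening jail, or prints the command when DRY_RUN is set.
serve_jailed() {
    port=$1
    shift
    ${DRY_RUN:+echo} "$NSJAIL" -Ml --port "$port" \
        --max_conns_per_ip 4 \
        --time_limit 120 \
        --chroot /chroot/ --user 99999 --group 99999 \
        -- "$@"
}

# Print the command that would be run for a shell served on port 9000.
DRY_RUN=1 serve_jailed 9000 /bin/sh -i
```

Dropping `DRY_RUN=1` runs the jail for real, at which point the `nc` session shown above applies unchanged.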
#### Isolation of local processes
```
$ ./nsjail -Mo --chroot /chroot/ --user 99999 --group 99999 -- /bin/sh -i
/ $ ifconfig -a
lo        Link encap:Local Loopback
          LOOPBACK MTU:65536 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
/ $ id
uid=99999 gid=99999
/ $ ps wuax
PID   USER     COMMAND
    1 99999    /bin/sh -i
    4 99999    {busybox} ps wuax
/ $ exit
$
```
#### Isolation of local processes (and re-running them)
```
$ ./nsjail -Mr --chroot /chroot/ --user 99999 --group 99999 -- /bin/sh -i
BusyBox v1.21.1 (Ubuntu 1:1.21.0-1ubuntu1) built-in shell (ash)
Enter 'help' for a list of built-in commands.
/ $ ps wuax
PID   USER     COMMAND
    1 99999    /bin/sh -i
    2 99999    {busybox} ps wuax
/ $ exit
BusyBox v1.21.1 (Ubuntu 1:1.21.0-1ubuntu1) built-in shell (ash)
Enter 'help' for a list of built-in commands.
/ $ ps wuax
PID   USER     COMMAND
    1 99999    /bin/sh -i
    2 99999    {busybox} ps wuax
/ $
```
### MORE INFO?
Type:
```
./nsjail --help
```
The command-line options are reasonably well documented:
```
Usage: ./nsjail [options] -- path_to_command [args]
Options:
  --help|-h
      Help plz..
  --mode|-M [val]
      Execution mode (default: l [MODE_LISTEN_TCP]):
      l: Wait for connections on a TCP port (specified with --port) [MODE_LISTEN_TCP]
      o: Immediately launch a single process on a console using clone/execve [MODE_STANDALONE_ONCE]
      e: Immediately launch a single process on a console using execve [MODE_STANDALONE_EXECVE]
      r: Immediately launch a single process on a console, keep doing it forever [MODE_STANDALONE_RERUN]
  --cmd
      Equivalent of -Mo (MODE_STANDALONE_ONCE), run command on a local console, once
  --chroot|-c [val]
      Directory containing / of the jail (default: "/"). Skip mounting it if ""
  --rw
      Mount / as RW (default: RO)
  --user|-u [val]
      Username/uid of processess inside the jail (default: your current uid). You can also use inside_ns_uid:outside_ns_uid convention here
  --group|-g [val]
      Groupname/gid of processess inside the jail (default: your current gid). You can also use inside_ns_gid:global_ns_gid convention here
  --hostname|-H [val]
      UTS name (hostname) of the jail (default: 'NSJAIL')
  --cwd|-D [val]
      Directory in the namespace the process will run (default: '/')
  --port|-p [val]
      TCP port to bind to (only in [MODE_LISTEN_TCP]) (default: 31337)
  --bindhost [val]
      IP address port to bind to (only in [MODE_LISTEN_TCP]) (default: '::')
  --max_conns_per_ip|-i [val]
      Maximum number of connections per one IP (default: 0 (unlimited))
  --log|-l [val]
      Log file (default: /proc/self/fd/2)
  --time_limit|-t [val]
      Maximum time that a jail can exist, in seconds (default: 600)
  --daemon|-d
      Daemonize after start
  --verbose|-v
      Verbose output
  --keep_env|-e
      Should all environment variables be passed to the child?
  --env|-E [val]
      Environment variable (can be used multiple times)
  --keep_caps
      Don't drop capabilities (DANGEROUS)
  --silent
      Redirect child's fd:0/1/2 to /dev/null
  --disable_sandbox
      Don't enable the seccomp-bpf sandboxing
  --skip_setsid
      Don't call setsid(), allows for terminal signal handling in the sandboxed process
  --rlimit_as [val]
      RLIMIT_AS in MB, 'max' for RLIM_INFINITY, 'def' for the current value (default: 512)
  --rlimit_core [val]
      RLIMIT_CORE in MB, 'max' for RLIM_INFINITY, 'def' for the current value (default: 0)
  --rlimit_cpu [val]
      RLIMIT_CPU, 'max' for RLIM_INFINITY, 'def' for the current value (default: 600)
  --rlimit_fsize [val]
      RLIMIT_FSIZE in MB, 'max' for RLIM_INFINITY, 'def' for the current value (default: 1)
  --rlimit_nofile [val]
      RLIMIT_NOFILE, 'max' for RLIM_INFINITY, 'def' for the current value (default: 32)
  --rlimit_nproc [val]
      RLIMIT_NPROC, 'max' for RLIM_INFINITY, 'def' for the current value (default: 'def')
  --rlimit_stack [val]
      RLIMIT_STACK in MB, 'max' for RLIM_INFINITY, 'def' for the current value (default: 'def')
  --persona_addr_compat_layout
      personality(ADDR_COMPAT_LAYOUT)
  --persona_mmap_page_zero
      personality(MMAP_PAGE_ZERO)
  --persona_read_implies_exec
      personality(READ_IMPLIES_EXEC)
  --persona_addr_limit_3gb
      personality(ADDR_LIMIT_3GB)
  --persona_addr_no_randomize
      personality(ADDR_NO_RANDOMIZE)
  --disable_clone_newnet|-N
      Don't use CLONE_NEWNET. Enable networking inside the jail
  --disable_clone_newuser
      Don't use CLONE_NEWUSER. Requires euid==0
  --disable_clone_newns
      Don't use CLONE_NEWNS
  --disable_clone_newpid
      Don't use CLONE_NEWPID
  --disable_clone_newipc
      Don't use CLONE_NEWIPC
  --disable_clone_newuts
      Don't use CLONE_NEWUTS
  --bindmount_ro|-R [val]
      List of mountpoints to be mounted --bind (ro) inside the container. Can be specified multiple times. Supports 'source' syntax, or 'source:dest'
  --bindmount|-B [val]
      List of mountpoints to be mounted --bind (rw) inside the container. Can be specified multiple times. Supports 'source' syntax, or 'source:dest'
  --tmpfsmount|-T [val]
      List of mountpoints to be mounted as RW/tmpfs inside the container. Can be specified multiple times. Supports 'dest' syntax
  --tmpfs_size [val]
      Number of bytes to allocate for tmpfsmounts (default: 4194304)
  --disable_proc
      Disable mounting /proc in the jail
  --iface_no_lo
      Don't bring up the 'lo' interface
  --iface|-I [val]
      Interface which will be cloned (MACVTAP) and put inside the subprocess' namespace as 'vs'
  --iface_vs_ip [val]
      IP of the 'vs' interface
  --iface_vs_nm [val]
      Netmask of the 'vs' interface
  --iface_vs_gw [val]
      Default GW for the 'vs' interface
```
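
As a rough illustration of how the mount and limit options combine, the sketch below runs a command once (`-Mo`) with a read-only bind mount in `source:dest` form, a tmpfs at `/tmp`, a smaller address-space limit, and a shorter lifetime. Every flag comes from the help output above. The helper name `run_jailed`, the `/opt/app:/app` paths, the values 256 and 60, and the `DRY_RUN` switch are assumptions made for the example, and the combination has not been checked against a particular NsJail build.

```shell
#!/bin/sh
# Path to the nsjail binary; override with NSJAIL=/path/to/nsjail.
NSJAIL=${NSJAIL:-./nsjail}

# run_jailed COMMAND [ARGS...]
# Runs COMMAND once inside the jail, or prints the command when DRY_RUN is set.
run_jailed() {
    ${DRY_RUN:+echo} "$NSJAIL" -Mo \
        --chroot /chroot/ \
        --user 99999 --group 99999 \
        --bindmount_ro /opt/app:/app \
        --tmpfsmount /tmp \
        --rlimit_as 256 \
        --time_limit 60 \
        -- "$@"
}

# Print the command that would be run for an interactive shell.
DRY_RUN=1 run_jailed /bin/sh -i
```

Keeping the flags in one function makes it easy to review the jail's whole policy in a single place before removing `DRY_RUN=1`.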