sandbox

With Ubuntu 14.04 (Trusty) now a year away from end-of-life, we've been planning and performing upgrades for the soon-to-be legacy OS. The biggest change is the move from Upstart to Systemd for managing services. It's trivial to convert a service configuration from one to the other, but we're taking the opportunity to explore some of the extra bells and whistles included with Systemd.

I've been exploring some of the sandboxing capabilities recently and have been pleasantly surprised by both how easy they are to set up and how powerful they are. With Upstart, we would run each service as its own unprivileged user and count on the Principle of Least Privilege to protect the system in the event of a security hole or bug in the service.
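For reference, that unprivileged-user baseline carries over to a Systemd unit file directly. This is only a minimal sketch: the unit name, user, and binary path are hypothetical placeholders, not anything from our actual setup.

```ini
# /etc/systemd/system/example.service (hypothetical name and paths)
[Unit]
Description=Example service running as a dedicated unprivileged user

[Service]
# Assumes the "example" user and group already exist on the host
User=example
Group=example
ExecStart=/usr/local/bin/example
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

The sandboxing options discussed in the rest of this post all go in the [Service] section of a unit like this one.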

Systemd lets you restrict services much further. Its documentation is pretty good and even provides recommendations on which options to set. Here are a few I've found useful:

ProtectSystem

This allows you to mark large portions of (or even the entire) filesystem as read-only. In most cases, our services don't need to write files to disk, so this is a big win right off the bat. On the upcoming Ubuntu release (18.04 Bionic), you can use this combination:

ProtectSystem=strict
PrivateTmp=true

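This mounts a private /tmp (and /var/tmp) directory the service can write to, but otherwise makes the entire file system read-only. The exceptions are the API file systems /dev, /proc and /sys, which are covered by separate options such as PrivateDevices, ProtectKernelTunables and ProtectControlGroups.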
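If a service does need to write somewhere besides /tmp, ReadWritePaths can re-open specific directories on top of the read-only view. A rough sketch, assuming a hypothetical service that keeps its data under /var/lib/example:

```ini
[Service]
ProtectSystem=strict
PrivateTmp=true
# Hypothetical data directory; it must already exist when the service starts.
# Everything outside it (and the private /tmp) stays read-only.
ReadWritePaths=/var/lib/example
```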

ProtectSystem docs

DynamicUser

In the past it was common to run services which required no privileges as nobody or as a special-purpose user per service. This option combines the best of both worlds, creating an ephemeral user account on the fly for the service. It can be combined with the different *Directory options to ensure the user has access to any files it might need. Here's a snippet for a service we use to tail log files/journal entries:

DynamicUser=true
SupplementaryGroups=adm
ConfigurationDirectory=margie

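This makes sure /etc/margie exists when the service starts and runs the process with adm as a supplementary group, which has permission to read the log files and the journal. One caveat: according to the current systemd.exec documentation, ConfigurationDirectory is the exception among the *Directory options in that its ownership is not changed to the service user, so the files in /etc/margie need to be readable by the dynamic user some other way (for example, by being world-readable or readable by adm).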
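The other *Directory options follow the same pattern and, unlike ConfigurationDirectory, are handed over to the dynamic user. A sketch of how they might be combined, using a hypothetical directory name "example" (not one of our services):

```ini
[Service]
DynamicUser=true
# Persistent data, reachable as /var/lib/example
StateDirectory=example
# Disposable cache, reachable as /var/cache/example
CacheDirectory=example
# Log files, reachable as /var/log/example
LogsDirectory=example
# Runtime files under /run/example, removed when the service stops
RuntimeDirectory=example
```

Each name is relative to a fixed base directory, so the unit never has to hard-code the ephemeral user's UID or chown anything itself.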

DynamicUser docs

BindReadOnlyPaths

Even if the whole filesystem is read-only, you may not want your service (or a bad actor that has taken it over) to read the files. This option can be used to replace entire directories in the filesystem:

BindReadOnlyPaths=/home/user/etc:/etc

The recently released Systemd 238 adds another option, TemporaryFileSystem, which lets you expose just a few files or directories within a directory and hide the rest:

TemporaryFileSystem=/etc/:ro
BindReadOnlyPaths=/etc/resolv.conf
BindReadOnlyPaths=/etc/mime.types
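A few variations on that syntax, as a sketch rather than a recommendation. The paths under /home/user are hypothetical and only reuse the directory from the earlier example:

```ini
[Service]
# Start from an empty, read-only /etc
TemporaryFileSystem=/etc:ro
# Several paths can share one line, separated by spaces.
# A leading "-" skips the entry if the source path does not exist.
BindReadOnlyPaths=/etc/resolv.conf -/etc/mime.types
# A source:destination pair exposes a file at a different location
BindReadOnlyPaths=/home/user/etc/hosts:/etc/hosts
```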

TemporaryFileSystem docs

BindPaths docs

Additional Settings

I'll avoid rewriting all the docs here, but the Systemd docs recommend setting the following options for security as well:

ProtectHome=true
ProtectKernelTunables=true
ProtectControlGroups=true
ProtectKernelModules=true
PrivateDevices=true
SystemCallArchitectures=native
CapabilityBoundingSet=~CAP_SYS_ADMIN

The docs do a good job of explaining what each of these individual options does.
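One way to apply these to an existing service without editing its unit file is a drop-in. A minimal sketch, assuming a hypothetical unit called example.service:

```ini
# /etc/systemd/system/example.service.d/hardening.conf (hypothetical unit name)
[Service]
ProtectHome=true
ProtectKernelTunables=true
ProtectControlGroups=true
ProtectKernelModules=true
PrivateDevices=true
SystemCallArchitectures=native
# The leading "~" inverts the list: every capability except CAP_SYS_ADMIN is kept
CapabilityBoundingSet=~CAP_SYS_ADMIN
```

After adding the file, run systemctl daemon-reload and restart the service for the settings to take effect. It's worth enabling them a few at a time, since a service that really does need one of these resources will fail once it is taken away.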

This just scratches the surface of the limits you can impose on a service with Systemd. There is also a whole set of Limit* options which allow you to limit CPU time, memory usage, etc. If you want to take it one step further, RootDirectory runs a service in a chroot jail, and systemd-nspawn runs it in a namespace container (like Docker).
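As an illustration of the Limit* options, here is a small sketch. The numbers are arbitrary placeholders and would need tuning for any real service:

```ini
[Service]
# At most 1024 open file descriptors
LimitNOFILE=1024
# At most 64 processes for the user the service runs as
LimitNPROC=64
# Cap the address space at 512 MiB
LimitAS=512M
# CPU time in seconds before the kernel starts signalling the process
LimitCPU=60
```

These map onto the classic per-process resource limits (the same ones ulimit sets), so they apply to each process of the service rather than to the service as a whole.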

I'm by no means a Systemd expert. If you have any other tips or tricks, please pass them along.