Tiago Scolari

bits for fun

Containers outside Containers

2017-03-18

What is a container? Most people will probably answer with something related to Docker. That’s a good guess, but Docker is only one of many wrappers that enable the illusion of what we call a container.

A container is not a single thing: a set of different Linux kernel features enables the “container behavior” on a process. They consist of isolation (namespaces) and control (cgroups), which together ensure that the running process can’t harm the host system. Recently I presented a lightning talk at Pivotal (slides at the end) about how to use some of these features (actually only one) outside the container context.

cgroups

A control group, or cgroup, is how you limit the resources of a process. Cgroups can also report usage of different resources (CPU, memory, disk, network…). All of this is built into the Linux kernel, and the point of my presentation was that knowing this could save you some development time.

Cgroups are exposed as a filesystem, usually mounted at /sys/fs/cgroup. It consists of several subsystems: cpuset, cpu, cpuacct, blkio, memory, devices, freezer, net_cls, perf_event, net_prio, hugetlb and pids. That’s what you’ll see if you list the contents of /sys/fs/cgroup:

dr-xr-xr-x 5 root root  0 Mar 18 21:06 blkio
lrwxrwxrwx 1 root root 11 Mar 18 21:06 cpu -> cpu,cpuacct
lrwxrwxrwx 1 root root 11 Mar 18 21:06 cpuacct -> cpu,cpuacct
dr-xr-xr-x 5 root root  0 Mar 18 21:06 cpu,cpuacct
dr-xr-xr-x 2 root root  0 Mar 18 21:06 cpuset
dr-xr-xr-x 5 root root  0 Mar 18 21:06 devices
dr-xr-xr-x 2 root root  0 Mar 18 21:06 freezer
dr-xr-xr-x 2 root root  0 Mar 18 21:06 hugetlb
dr-xr-xr-x 5 root root  0 Mar 18 21:06 memory
lrwxrwxrwx 1 root root 16 Mar 18 21:06 net_cls -> net_cls,net_prio
dr-xr-xr-x 2 root root  0 Mar 18 21:06 net_cls,net_prio
lrwxrwxrwx 1 root root 16 Mar 18 21:06 net_prio -> net_cls,net_prio
dr-xr-xr-x 2 root root  0 Mar 18 21:06 perf_event
dr-xr-xr-x 5 root root  0 Mar 18 21:06 pids
dr-xr-xr-x 5 root root  0 Mar 18 21:06 systemd
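
To see which of these groups a process currently sits in, the kernel also exposes /proc/<pid>/cgroup, where each line has the form hierarchy-ID:controller-list:cgroup-path. Here is a minimal sketch that prints it in a friendlier way; the function name show_membership is made up for this post, and the file argument is only there so you can try it on a saved copy.

```shell
# Print "controllers<TAB>path" for every line of a /proc/<pid>/cgroup style file.
# show_membership is a hypothetical helper name, not a standard tool.
show_membership() {
    file="${1:-/proc/self/cgroup}"
    # Each line looks like: hierarchy-ID:controller-list:cgroup-path
    while IFS=: read -r id controllers path; do
        printf '%s\t%s\n' "${controllers:-(none)}" "$path"
    done < "$file"
}
```

Called with no argument, it reads /proc/self/cgroup, so it reports the groups of the shell that runs it.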

adding a process to a group

Each of these folders contains different files, which either restrict usage of the resource in some way or display usage statistics for it. All of them have in common a file named cgroup.procs, which contains the PIDs of all the processes that belong to the group. To add a process to a group, simply write its PID to this file. A process belongs to one group per subsystem hierarchy, so it can be in a cpu group and a memory group at the same time.

echo $$ > /sys/fs/cgroup/cpu/cgroup.procs
echo $$ > /sys/fs/cgroup/memory/cgroup.procs
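
If you want to confirm that the move actually happened, you can read the file back. This is a small sketch, assuming you run it as root against a cgroup directory like the ones listed earlier; join_cgroup is a hypothetical name I picked for the example.

```shell
# Move a process into a cgroup and check that it is now listed there.
# join_cgroup is a hypothetical helper name.
# $1: cgroup directory, $2: PID (defaults to the current shell)
join_cgroup() {
    group="$1"
    pid="${2:-$$}"
    # On a real cgroup this write adds the PID; it does not replace the others.
    echo "$pid" > "$group/cgroup.procs" || return 1
    # Succeeds only if some line of the file is exactly this PID.
    grep -qx "$pid" "$group/cgroup.procs"
}
```

For example, join_cgroup /sys/fs/cgroup/memory returns success once the current shell shows up in that group’s cgroup.procs.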

creating subgroups

You can also create a “group” inside a subsystem to limit your processes with more granularity. That’s as simple as creating a folder inside it.

$ cd /sys/fs/cgroup/cpu
$ mkdir myapp

If you list the contents of myapp, you’ll find the same files as in the parent folder. That is because the kernel creates them automatically for every new group, and the new group stays subject to the limits of its parent. You can, theoretically, create unlimited groups inside groups inside groups…
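
The whole lifecycle of a group can be sketched in a couple of lines. The helper names create_group and remove_group are mine, and the sketch assumes root access to a subsystem folder such as /sys/fs/cgroup/cpu. Note that a group is removed with rmdir rather than rm -r: the kernel owns the files inside, and it only lets the folder go once the group has no processes and no child groups.

```shell
# Create a group under a subsystem folder and list the files the kernel put there.
# create_group and remove_group are hypothetical helper names.
# $1: parent folder (e.g. /sys/fs/cgroup/cpu), $2: group name
create_group() {
    mkdir -p "$1/$2" && ls "$1/$2"
}

# Remove a group. rmdir fails while the group still has processes or child groups.
remove_group() {
    rmdir "$1/$2"
}
```

So create_group /sys/fs/cgroup/cpu myapp is the mkdir from above plus a listing, and remove_group /sys/fs/cgroup/cpu myapp cleans up afterwards.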

limiting a resource

Inside the cpu folder you’ll find, for example, a file called cpu.shares, which contains a number, usually 1024. If all your cpu groups have this set to 1024, they all get an equal share of the CPU. If you change its value to 512 in one group, that group gets only half as much CPU as the others when they compete for it. If in another group you write 2048 to that file, that group gets double the share of each 1024 group, and 4 times the share of the 512 one.

$ echo 512 > /sys/fs/cgroup/cpu/group1/cpu.shares
$ echo 2048 > /sys/fs/cgroup/cpu/group2/cpu.shares
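
To make the arithmetic concrete: shares are relative weights, so each group’s slice is its own number divided by the sum of all of them. The sketch below assumes three sibling groups (512, 1024 and 2048) that are all busy at the same time; share_percentages is a made-up helper, and it uses integer division, so the results are rounded down.

```shell
# Print the CPU percentage each group gets, given the cpu.shares of all
# competing sibling groups. share_percentages is a hypothetical helper name.
share_percentages() {
    total=0
    for s in "$@"; do total=$((total + s)); done
    # Integer division: percentages are rounded down.
    for s in "$@"; do echo "$s -> $((100 * s / total))%"; done
}

share_percentages 512 1024 2048
# 512 -> 14%
# 1024 -> 28%
# 2048 -> 57%
```

If one of the groups is idle, the others split its slice between them, which is why shares only matter while the CPU is contended.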

demo

That was an easy way to limit your app’s CPU usage. There are similar ways to limit memory, network, disk, etc., using only cgroups. Also note that cgroups can report the resource usage of your app, just by reading the correct files. If you’re thinking about implementing these features as part of your app code, using cgroups can save you some precious time, allowing you to focus on something else.
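
As a rough illustration of the reporting side, here is a sketch that reads two of those files: cpuacct.usage (total CPU time in nanoseconds) and memory.usage_in_bytes. It assumes the layout listed earlier, with a group of the same name created under both the cpuacct and memory folders; report_usage and its second argument are hypothetical, the latter only there so the root folder can be swapped.

```shell
# Report CPU time and memory currently used by a group.
# report_usage is a hypothetical helper name.
# $1: group name, $2: cgroup root (defaults to /sys/fs/cgroup)
report_usage() {
    group="$1"
    root="${2:-/sys/fs/cgroup}"
    # cpuacct.usage is cumulative CPU time in nanoseconds.
    cpu_ns=$(cat "$root/cpuacct/$group/cpuacct.usage") || return 1
    mem_bytes=$(cat "$root/memory/$group/memory.usage_in_bytes") || return 1
    echo "cpu_ms=$((cpu_ns / 1000000)) mem_kib=$((mem_bytes / 1024))"
}
```

Running report_usage myapp from a cron job or a monitoring script is often enough to replace hand-written accounting code inside the app.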

The man pages for cgroups are an extensive and good source of information about them. Red Hat also has good documentation on them.