LPIC 305 Container Virtualization Concepts
Container virtualization is here to stay, and you should get familiar with it. Nowadays there are several solutions for working with containerized environments, for instance Docker, LXC, buildah, podman and many others. We should not forget container orchestration either, with tools like Mesos, Kubernetes, Docker Swarm, Rancher and OpenShift, or managed cloud offerings like EKS, GKE and AKS. In this post we’re going to focus on understanding the base concepts behind containers, using the LPIC 305 - Container Virtualization Concepts topics as a guide.
So, let’s roll… :)
For the LPIC 305 - Container Virtualization Concepts topic we have:
Weight: 7
Description: Candidates should understand the concept of container virtualization. This includes understanding the Linux components used to implement container virtualization as well as using standard Linux tools to troubleshoot these components.
Key Knowledge Areas:
- Understand the concepts of system and application container
- Understand and analyze kernel namespaces
- Understand and analyze control groups
- Understand and analyze capabilities
- Understand the role of seccomp, SELinux and AppArmor for container virtualization
- Understand how LXC and Docker leverage namespaces, cgroups, capabilities, seccomp and MAC
- Understand the principle of runc
- Understand the principle of CRI-O and containerd
- Awareness of the OCI runtime and image specifications
- Awareness of the Kubernetes Container Runtime Interface (CRI)
- Awareness of podman, buildah and skopeo
- Awareness of other container virtualization approaches in Linux and other free operating systems, such as rkt, OpenVZ, systemd-nspawn or BSD Jails
The following is a partial list of the used files, terms and utilities:
- nsenter
- unshare
- ip (including relevant subcommands)
- capsh
- /sys/fs/cgroups
- /proc/[0-9]+/ns
- /proc/[0-9]+/status
Understanding Containers
Understanding the basics of containers is not so complicated. Of course, you could go ahead and read any basic tutorial or Getting Started with Docker, and in 1 or 2 hours you will be fine with containers. I highly recommend the Getting Started guide from the Docker documentation for this. But if you stop there, you’ll only understand how to use a frontend tool and not the base technologies that allow you to run a container. By that I mean understanding what a cgroup is, how namespaces allow you to isolate your network from the containers’ networks, and what a runtime is and why you need one.
So, let’s start with the following:
- Control Groups
- Kernel Namespaces
- Container Capabilities
Linux Control Groups, cgroups
The cgroups(7) man page describes cgroups as a:
Linux kernel feature which allow processes to be organized into hierarchical groups whose usage of various types of resources can then be limited and monitored
Which means that with cgroups we can control, monitor and limit the resources of processes, such as CPU time, memory and bandwidth. For example, if you’re running a task that is going to take some time to finish, but you also want to guarantee that it will not consume all your memory or CPU, you could easily run it within a cgroup with limits for memory and CPU time usage.
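As a rough sketch of that idea, and assuming a cgroup v1 memory hierarchy mounted at /sys/fs/cgroup/memory plus root privileges, limiting a process comes down to creating a directory and writing to two files. The function name limit_memory, the group name demo and the root parameter are illustrative choices for this post, not part of any library.

```python
import os


def limit_memory(root, group, limit_bytes, pid):
    """Create a cgroup under `root` and place `pid` in it with a memory cap.

    Assumes a cgroup v1 memory hierarchy, where each group exposes
    `memory.limit_in_bytes` and `tasks`. `root` is a parameter so the
    function can be pointed at a scratch directory while experimenting.
    """
    group_dir = os.path.join(root, group)
    # On a real cgroup filesystem, creating the directory creates the group.
    os.makedirs(group_dir, exist_ok=True)
    # Cap the memory that members of the group may use.
    with open(os.path.join(group_dir, "memory.limit_in_bytes"), "w") as f:
        f.write(str(limit_bytes))
    # Move the process into the group.
    with open(os.path.join(group_dir, "tasks"), "w") as f:
        f.write(str(pid))
    return group_dir
```

On a real system this would be called as root with something like limit_memory("/sys/fs/cgroup/memory", "demo", 512 * 1024 * 1024, os.getpid()), which caps the current process at 512 MiB.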
In your GNU/Linux environment you can see which cgroup controllers you already have by listing the contents of the /sys/fs/cgroup/ directory, for example with ls /sys/fs/cgroup/.
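The same listing can be sketched in Python. This assumes the cgroup v1 layout, where each controller appears as a directory under /sys/fs/cgroup/. The helper name list_cgroup_entries is made up for this example.

```python
import os


def list_cgroup_entries(root="/sys/fs/cgroup"):
    """Return the sorted names of the directories directly under `root`.

    With a cgroup v1 layout these are the controller hierarchies
    (memory, cpu, ...). Plain files are skipped.
    """
    return sorted(
        entry.name for entry in os.scandir(root) if entry.is_dir()
    )
```

Calling list_cgroup_entries() with no arguments reads the default mount point. Passing another path is handy for trying the function against a scratch directory.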
If we take the memory controller as an example and list /sys/fs/cgroup/memory/, we will find a set of limits that can be applied to a process (the file names here are from the cgroup v1 hierarchy). For example, memory.limit_in_bytes limits the amount of memory that a process can use, and memory.swappiness tunes how aggressively the kernel swaps out the memory of that group. The directories inside this one are just systemd unit types for resource control. In this case we have:
- Services: For systemd services
- Scope: A group of externally created processes. For example, user sessions, containers, and virtual machines.
- Slice: Organize a hierarchy in which scopes and services are placed.
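The three unit types above can be told apart by the suffix of each path component. The following is a small sketch of that convention. The function name unit_types is hypothetical, and it only looks at names, not at the live system.

```python
def unit_types(cgroup_path):
    """Map each component of a systemd cgroup path to its unit type.

    Components that do not end in .service, .scope or .slice are skipped.
    """
    suffixes = {".service": "service", ".scope": "scope", ".slice": "slice"}
    result = []
    for part in cgroup_path.strip("/").split("/"):
        for suffix, kind in suffixes.items():
            if part.endswith(suffix):
                result.append((part, kind))
                break
    return result
```

For a path such as /user.slice/user-1000.slice/session-1.scope, this shows two nested slices with a scope placed inside them, which matches the hierarchy described in the list.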
This way, if we take a look at the contents of the user.slice directory, we will see a group of files and also user-1000.slice, which contains session-1.scope.
If you run cat on the tasks file inside this directory, you will see all the processes that your session is holding. In my case I can see PID 1764, which runs ssh-agent.
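Reading the tasks file and turning each PID into a command name can be sketched as follows. This assumes a cgroup v1 tasks file (one PID per line) and uses /proc/<pid>/comm for the name. Both function names and the proc_root parameter are illustrative.

```python
import os


def read_pids(tasks_file):
    """Return the PIDs listed in a cgroup v1 `tasks` file, one per line."""
    with open(tasks_file) as f:
        return [int(line) for line in f if line.strip()]


def command_name(pid, proc_root="/proc"):
    """Return the command name of `pid`, or None if it cannot be read."""
    try:
        with open(os.path.join(proc_root, str(pid), "comm")) as f:
            return f.read().strip()
    except OSError:
        # The process may have exited since the tasks file was read.
        return None
```

Combining the two, a loop over read_pids(...) that prints command_name(pid) for each entry gives a quick view of what a session scope is holding.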
Another way to check this hierarchy is by running the systemd-cgls command.
And to monitor the resource consumption of cgroups you can use the systemd-cgtop command, which returns output like the one below:
Control Group                          Tasks  %CPU  Memory  Input/s  Output/s
user.slice                               759   8.7    4.1G        -         -
/                                       1184   4.0    6.5G        -         -
system.slice                             163   2.1    1.6G        -         -
system.slice/acpid.service                 1   1.3  684.0K        -         -
system.slice/systemd-logind.service        1   0.7    7.3M        -         -
system.slice/systemd-journald.service      1   0.1   82.6M        -         -
system.slice/containerd.service           42   0.0  103.9M        -         -
system.slice/system76-power.service       17   0.0    6.0M        -         -
Namespaces
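Namespaces define the boundaries of a process. The relevant tools and files are:
- lsns: lists the namespaces on the system.
- /proc/*/ns: contains one symbolic link per namespace that the process belongs to.
- unshare: runs a program in a namespace unshared from its parent process.
- nsenter: enters the namespaces of one or more processes and then executes the specified program.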
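To make the /proc/<pid>/ns entries more concrete, here is a minimal sketch that reads them from Python. Each entry is a symbolic link whose target looks like net:[4026531992], where the number identifies the namespace. Two processes share a namespace when that number is the same for both. The names parse_ns_link and namespaces_of, and the proc_root parameter, are assumptions made for this example.

```python
import os
import re

# Link targets under /proc/<pid>/ns look like "net:[4026531992]".
NS_LINK = re.compile(r"^(?P<type>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (type, inode number)."""
    match = NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group("type"), int(match.group("inode"))


def namespaces_of(pid="self", proc_root="/proc"):
    """Return {entry name: inode number} for the namespaces of `pid`."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    result = {}
    for name in sorted(os.listdir(ns_dir)):
        target = os.readlink(os.path.join(ns_dir, name))
        result[name] = parse_ns_link(target)[1]
    return result
```

Comparing namespaces_of(1) with namespaces_of("self") is one way to see which namespaces a shell shares with PID 1. Reading another user's process usually requires root.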