When you run your software in containers, you save money and effort, and you can use orchestration systems like Kubernetes to increase your operational efficiency. However, switching to containers also means adopting a different mindset toward issues like security. Many people are fooled into thinking that containers are just lighter forms of virtual machines. While containers can give you the illusion that they run on their own, isolated from the host's operating system and from other containers, the truth is, they aren't.
At the end of the day, a container is just a process running on your operating system; it simply uses powerful features of the Linux kernel (namespaces and cgroups) to provide a level of isolation.
The above introduction is necessary so that you can understand the recommendations that we'll lay out in this article. When you think of a container as a process rather than a full-blown machine, the following security recommendations will make a lot of sense.
If you examine the well-known daemons that run on Linux, you'll find that most of them don't use the root account as the process owner. Take Apache, for example: although you do need the root account to start httpd (or apache2 if you're running on Ubuntu), the daemon itself spawns child processes owned by a non-privileged user that is typically created for this purpose. So, on Ubuntu, if you are running Apache, you'll find multiple child processes owned by a user called www-data. It is those processes that receive and respond to requests, and they are the ones listening on port 80. The following is a real-world example from a production machine:
ubuntu:~/ $ ps -ef | grep apache
root 1526 1 0 Sep21 ? 00:02:50 /usr/sbin/apache2 -k start
www-data 89643 1526 0 17:11 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 89656 1526 0 17:11 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 90837 1526 0 17:33 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 93493 1526 0 18:18 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 95117 1526 0 18:47 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 95960 1526 0 19:02 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 95961 1526 0 19:02 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 96675 1526 0 19:14 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 96902 1526 0 19:18 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 97487 1526 0 19:28 ? 00:00:00 /usr/sbin/apache2 -k start
So, we agreed that Linux containers are nothing more than processes running in their own isolated namespaces. Hence, they should also run as a non-privileged user unless root privileges are strictly required.
Now, let's look at a practical example: a container that runs a simple shell command:
FROM alpine:latest
RUN apk update && addgroup -S mygroup && adduser -S myuser -G mygroup
USER myuser
ENTRYPOINT ["sh","-c","sleep 100000"]
Note that in this Dockerfile, we are creating a normal, non-privileged user myuser and using it to start the container application. Let’s build and run this image:
$ docker build --no-cache -t secured .
Sending build context to Docker daemon 2.048kB
Step 1/4 : FROM alpine:latest
---> 965ea09ff2eb
Step 2/4 : RUN apk update && addgroup -S mygroup && adduser -S myuser -G mygroup
---> Running in 85305e2feaec
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/community/x86_64/APKINDEX.tar.gz
v3.10.3-36-g7cacb7930a [http://dl-cdn.alpinelinux.org/alpine/v3.10/main]
v3.10.3-37-gc81e885c62 [http://dl-cdn.alpinelinux.org/alpine/v3.10/community]
OK: 10339 distinct packages available
Removing intermediate container 85305e2feaec
---> 5c18a66c1445
Step 3/4 : USER myuser
---> Running in cf1525e4e653
Removing intermediate container cf1525e4e653
---> 7d5095f0fd6b
Step 4/4 : ENTRYPOINT ["sh","-c","sleep 100000"]
---> Running in 0331864a7ead
Removing intermediate container 0331864a7ead
---> 794b14aa26fb
Successfully built 794b14aa26fb
Successfully tagged secured:latest
$ docker run -d secured
337e45006106fc823e69ba9bc1459ca81e4ac51cbbe9616cdcec7421b5198fbb
$ docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
337e45006106 secured "sh -c 'sleep 100000'" 9 seconds ago Up 7 seconds lucid_williams
$ docker exec -it 337e45006106 sh
/ $ id
uid=100(myuser) gid=101(mygroup) groups=101(mygroup)
/ $
Note that when we logged into the container by running the docker exec command, we were dropped into a shell that does not use the root user. Any command that we (or an attacker) execute is constrained by the privileges of that user.
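The same constraint can be enforced at the Kubernetes level, so that the kubelet refuses to start a container that would run as root. The following is a minimal sketch, assuming the secured image we just built is available to the cluster:

apiVersion: v1
kind: Pod
metadata:
  name: secured-pod
spec:
  containers:
  - name: app
    image: secured:latest
    securityContext:
      runAsNonRoot: true               # reject the container if its user resolves to root
      allowPrivilegeEscalation: false  # prevent the process from gaining extra privileges (e.g., via setuid binaries)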
When running a container through a Kubernetes Pod, the image - by default - gets pulled only if it is not already present on the node (the default imagePullPolicy is IfNotPresent for most tags). Subsequent requests for the same image are served from the node's cache. While this may seem like a convenient way to save time (and bandwidth) whenever you recreate the Pod, it carries a serious security risk if the node also hosts other Pods.
Assume that you are using an image called middleware_client:1.5. The image contains the authentication logic necessary for accessing the middleware part of your application. To protect the image, you place it in a private registry that requires credentials to pull images from it. The first time your middleware pods run, they pull the image from the registry using the credentials that you specified. However, because the image is now cached on the node, any other pod that gets scheduled on the same node can use the middleware_client:1.5 image without having to authenticate itself to the registry.
To address this risk, you should set imagePullPolicy: Always in the Pod definition or in the pod template of higher-level controllers like Deployments, StatefulSets, and so on. This forces the registry credentials to be validated every time the container starts.
An example of a pod that addresses this risk may look as follows:
apiVersion: v1
kind: Pod
metadata:
  name: middlewareclient
spec:
  containers:
  - name: uses-middleware-image
    image: middleware_client:1.5
    imagePullPolicy: Always
    command: [ "echo", "SUCCESS" ]
Once you are running your containers inside Kubernetes Pods, you should start thinking about how your pods will communicate with other entities on the network. Kubernetes offers Network Policies to control and restrict how pods handle incoming and outgoing communication. However, before using this feature, you must ensure that your network add-on supports it: Kubernetes does not ship with a network controller of its own. Instead, you choose among a number of providers that offer network add-ons, and most of the well-known ones - namely Calico, Weave Net, and Cilium - support network policies. So, what is a network policy good for, and how is it implemented? Let's have a quick example.
Consider the definition:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: jailed
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
As the name suggests, jailed is a NetworkPolicy resource that locks down pod connections: any pod matched by this policy is denied both incoming and outgoing traffic. It's worth noting that the empty braces {} in the podSelector do not exempt pods from the policy; an empty selector matches all pods. In other words, every pod created in the default namespace will have no incoming or outgoing network access.

Different network policies can work together at the same time, and their effects are additive: a connection is allowed if at least one policy that selects the pod explicitly allows it. Hence, we can combine the lockdown policy defined above with policies that are more specific about which pods they control to fine-grain our network access control mechanisms.
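Assuming the jailed policy has been applied to the default namespace and your network add-on enforces policies, a quick smoke test might look like this (the probe pod name is arbitrary):

$ kubectl apply -f jailed.yaml
networkpolicy.networking.k8s.io/jailed created
$ kubectl run probe --rm -it --image=alpine -- sh
/ # wget -qO- -T 2 http://example.com
wget: bad address 'example.com'

The lookup fails because the policy blocks all egress from the pod, including DNS. Now, consider the following definition: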
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-app-mysql
spec:
  podSelector:
    matchLabels:
      role: db
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: app
    ports:
    - protocol: TCP
      port: 3306
Let’s have a closer look at this definition:
Lines 6-8: we specify that this policy will be applied to pods that have the label role=db.
The rest of the definition specifies which type of traffic the policy controls (ingress). We allow connections coming from pods that have the label role=app, and those application pods can access the database pods only on TCP port 3306 (the default port for MySQL).
So, this explicitly allows traffic from the application pods to the database pods on port 3306. But what about the rest of the communication? Since no other policy allows additional traffic to the role=db pods, the jailed lockdown policy we created earlier still applies to them. In other words, all other incoming connections are blocked.
However, the above also means that our clients will not be able to access our application, since no policy allows ingress to the role=app pods. Let's create a third policy that matches our application pods:
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: external-app
spec:
  podSelector:
    matchLabels:
      role: app
  ingress:
  - from: []
Now, we have our hypothetical application secured. If attackers gained access to our application pod, they could only reach the database pods, and only on port 3306. They cannot reach any other pods, which limits the severity of the damage they can do.
Network policies can also be used to restrict access to the metadata API of the cloud provider. If you don't know what the metadata API is, it is simply the way cloud providers expose information about the resources you are using. For example, on an AWS EC2 instance, issuing a GET request to http://169.254.169.254 returns a lot of information about the instance you are running on, like the DNS name, the external IP address, the IAM role the instance is using (if any), and so on. Obviously, you don't want data of such criticality to be available to a pod unless it is strictly required. Hence, you can make use of NetworkPolicies to block access to this IP address unless the pod needs it.
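A minimal sketch of such a policy, assuming your network add-on enforces egress rules, could deny every pod in the namespace access to the metadata endpoint while leaving the rest of its egress traffic untouched:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-metadata-access
spec:
  podSelector: {}          # applies to every pod in the namespace
  policyTypes:
  - Egress
  egress:
  - to:
    - ipBlock:
        cidr: 0.0.0.0/0           # allow egress anywhere...
        except:
        - 169.254.169.254/32      # ...except the cloud metadata endpoint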