Sometimes, you may need to run a process on all the nodes of the cluster. Think of log-collecting services like Prometheus Log Exporter or storage daemons like glusterd. Such services need to be started on each node as soon as the node joins the cluster. You may think: we can use a cron job that runs on machine boot/reboot, or perhaps use the /etc/rc.local file to ensure that a specific process or command gets executed as soon as the server starts. While certainly a valid solution, using the node itself to control the daemons that run on it (especially within a Kubernetes cluster) suffers from some drawbacks.
Because of those drawbacks, Kubernetes offers DaemonSets. A DaemonSet is another controller that manages pods, like Deployments, ReplicaSets, and StatefulSets. It was created for one particular purpose: ensuring that the pods it manages run on all the cluster nodes. As soon as a node joins the cluster, the DaemonSet ensures that it has the necessary pods running on it. When the node leaves the cluster, those pods are garbage collected.
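The join/leave behavior described above can be pictured as a small reconciliation step. The following is a simplified Python model for illustration only, not how the Kubernetes controller is actually implemented; the function name `reconcile` and the node and pod names are hypothetical.

```python
def reconcile(nodes, pods_by_node):
    """Return (to_create, to_delete) so each node ends up with one daemon pod.

    nodes: set of node names currently in the cluster.
    pods_by_node: dict mapping a node name to the daemon pod running on it.
    """
    # Nodes that are in the cluster but have no daemon pod yet.
    to_create = sorted(n for n in nodes if n not in pods_by_node)
    # Pods whose node is no longer part of the cluster.
    to_delete = sorted(p for n, p in pods_by_node.items() if n not in nodes)
    return to_create, to_delete


if __name__ == "__main__":
    # Hypothetical state: node-c has just joined, node-x has left.
    nodes = {"node-a", "node-b", "node-c"}
    pods = {"node-a": "fluentd-1", "node-b": "fluentd-2", "node-x": "fluentd-9"}
    create_on, delete_pods = reconcile(nodes, pods)
    print("create a pod on:", create_on)   # ['node-c']
    print("delete pods:", delete_pods)     # ['fluentd-9']
```

The point of the model is that the desired state is derived from the node list rather than from a replica count, which is what distinguishes a DaemonSet from a Deployment.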
Let’s deploy fluentd to collect node data using a DaemonSet. fluentd is an open-source application used for collecting and normalizing data. Such data is often logs, like web server logs or the system log. fluentd is deployed as a background service. If we need it to run on every node of a Kubernetes cluster, we can create a YAML file as follows:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
  labels:
    tier: management
    app: fluentd
spec:
  selector:
    matchLabels:
      name: fluentd
  template:
    metadata:
      labels:
        name: fluentd
    spec:
      containers:
        - resources:
            limits:
              memory: 200Mi
            requests:
              cpu: 100m
              memory: 200Mi
          securityContext:
            privileged: true
          image: fluent/fluentd
          name: fluentd-elasticsearch
          volumeMounts:
            - name: varlog
              mountPath: /var
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
Let’s see what this definition does for us:
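One detail the definition depends on is that the labels in spec.selector.matchLabels also appear on the pod template; with apps/v1 the API server rejects a DaemonSet whose selector does not match its template labels. Below is a minimal Python sketch of that check, assuming the manifest has already been loaded into a dict. The helper name `selector_matches_template` is hypothetical, and the dict is a trimmed copy of the fluentd definition.

```python
def selector_matches_template(manifest):
    """True if every matchLabels pair is present in the pod template labels."""
    spec = manifest["spec"]
    match_labels = spec["selector"].get("matchLabels", {})
    template_labels = spec["template"]["metadata"].get("labels", {})
    return all(template_labels.get(k) == v for k, v in match_labels.items())


# Trimmed copy of the fluentd DaemonSet (containers and volumes omitted).
fluentd = {
    "apiVersion": "apps/v1",
    "kind": "DaemonSet",
    "metadata": {
        "name": "fluentd",
        "namespace": "kube-system",
        # Labels on the DaemonSet object itself; the selector does not use them.
        "labels": {"tier": "management", "app": "fluentd"},
    },
    "spec": {
        "selector": {"matchLabels": {"name": "fluentd"}},
        "template": {"metadata": {"labels": {"name": "fluentd"}}},
    },
}

if __name__ == "__main__":
    print(selector_matches_template(fluentd))  # True
```

Note that the tier and app labels under the top-level metadata describe the DaemonSet object only; the pods are found through the name: fluentd label.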
Apply the above configuration using kubectl:
kubectl apply -f daemonset.yaml
Let’s see what the DaemonSet created:
kubectl -n kube-system get pods
NAME                                    READY   STATUS    RESTARTS   AGE
coredns-696c56cf6f-gjcs9                1/1     Running   0          6h46m
coredns-696c56cf6f-twj4l                1/1     Running   0          6h50m
coredns-autoscaler-bc55cb685-xpzk6      1/1     Running   0          6h50m
fluentd-f657w                           1/1     Running   0          5h26m
fluentd-wsvdf                           1/1     Running   0          5h26m
heapster-5678f88989-fvdj7               2/2     Running   0          6h46m
kube-proxy-xdggl                        1/1     Running   0          6h47m
kube-proxy-z899d                        1/1     Running   0          6h47m
kubernetes-dashboard-7b749f655b-lbgnp   1/1     Running   1          6h50m
metrics-server-5b7d5c6f8d-kss58         1/1     Running   1          6h50m
omsagent-fk4qg                          1/1     Running   1          6h47m
omsagent-lgqkz                          1/1     Running   0          6h47m
omsagent-rs-7b459857cd-g9gsx            1/1     Running   1          6h50m
tunnelfront-5d4d658788-28tdc            1/1     Running   0          6h50
Since we opted to deploy this DaemonSet in the kube-system namespace (rather than the default one), we must pass the namespace name with the -n option whenever we issue a kubectl command against it.
Additionally, there are other pods running in the kube-system namespace for a variety of tasks, including coredns for name resolution, kube-proxy, and others. The pods that we are interested in are fluentd-f657w and fluentd-wsvdf, one for each node.
By default, a DaemonSet schedules its pods on all the cluster nodes. But sometimes you may need to run specific processes on specific nodes. For example, nodes that host database pods may need different monitoring or logging rules. DaemonSets allow you to select which nodes you want to run the pods on. You can do this by using nodeSelector. With nodeSelector, you can select nodes by their labels the same way you do with pods. However, Kubernetes also allows you to select nodes based on some already-defined node properties. For example, kubernetes.io/hostname matches the node name. Our example cluster has two nodes, so we can modify the DaemonSet definition to run only on the first node. Let’s first get the node names:
kubectl get nodes
NAME                       STATUS   ROLES   AGE    VERSION
aks-agentpool-30423418-0   Ready    agent   7h2m   v1.12.8
aks-agentpool-30423418-1   Ready    agent   7h2m   v1.12.8
Our DaemonSet definition now should look like this:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
  labels:
    tier: management
    app: fluentd
spec:
  selector:
    matchLabels:
      name: fluentd
  template:
    metadata:
      labels:
        name: fluentd
    spec:
      containers:
        - resources:
            limits:
              memory: 200Mi
            requests:
              cpu: 100m
              memory: 200Mi
          securityContext:
            privileged: true
          image: fluent/fluentd
          name: fluentd-elasticsearch
          volumeMounts:
            - name: varlog
              mountPath: /var
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
      nodeSelector:
        kubernetes.io/hostname: aks-agentpool-30423418-0
      volumes:
        - name: varlog
          hostPath:
            path: /var
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
The added part here is:
nodeSelector:
  kubernetes.io/hostname: aks-agentpool-30423418-0
Run the new definition using kubectl:
kubectl apply -f daemonset.yaml
If we check the pods now:
kubectl -n kube-system get pods
NAME                                    READY   STATUS    RESTARTS   AGE
coredns-696c56cf6f-gjcs9                1/1     Running   0          6h55m
coredns-696c56cf6f-twj4l                1/1     Running   0          6h59m
coredns-autoscaler-bc55cb685-xpzk6      1/1     Running   0          6h59m
fluentd-wg7rf                           1/1     Running   0          3s
heapster-5678f88989-fvdj7               2/2     Running   0          6h55m
kube-proxy-xdggl                        1/1     Running   0          6h56m
kube-proxy-z899d                        1/1     Running   0          6h56m
kubernetes-dashboard-7b749f655b-lbgnp   1/1     Running   1          6h59m
metrics-server-5b7d5c6f8d-kss58         1/1     Running   1          6h59m
omsagent-fk4qg                          1/1     Running   1          6h56m
omsagent-lgqkz                          1/1     Running   0          6h56m
omsagent-rs-7b459857cd-g9gsx            1/1     Running   1          6h59m
tunnelfront-5d4d658788-28tdc            1/1     Running   0          6h59m
Notice that we now have only one fluentd pod running, because only one node matches the nodeSelector.
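To make the matching rule concrete, here is a small Python sketch of how a nodeSelector narrows the set of eligible nodes: a node qualifies only if it carries every key/value pair in the selector. This is an illustrative model rather than the scheduler's code; the function names are hypothetical, and the node label values are assumed to equal the node names shown by kubectl get nodes.

```python
def node_matches(node_labels, node_selector):
    """A node matches when it has every label listed in the selector."""
    return all(node_labels.get(k) == v for k, v in node_selector.items())


def eligible_nodes(nodes, node_selector):
    """Return the names of the nodes that satisfy the selector, sorted."""
    return sorted(
        name for name, labels in nodes.items()
        if node_matches(labels, node_selector)
    )


# The two example nodes, with the hostname label assumed to equal the node name.
nodes = {
    "aks-agentpool-30423418-0": {"kubernetes.io/hostname": "aks-agentpool-30423418-0"},
    "aks-agentpool-30423418-1": {"kubernetes.io/hostname": "aks-agentpool-30423418-1"},
}

if __name__ == "__main__":
    selector = {"kubernetes.io/hostname": "aks-agentpool-30423418-0"}
    print(eligible_nodes(nodes, selector))  # ['aks-agentpool-30423418-0']
    # An empty selector places no restriction, so every node is eligible.
    print(eligible_nodes(nodes, {}))
```

Because all selector entries must match, adding more keys to nodeSelector can only shrink the set of nodes, never grow it.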
There are several design patterns for communicating with DaemonSet pods in the cluster: