The Singleton pattern is one of the creational patterns used in software development. In object-oriented programming, a Singleton class is one that allows only a single instance of itself to be created. The same concept applies outside application development. For example, assume that you are working on a Red Hat Linux box. Red Hat and its variants (CentOS, Fedora, etc.) use yum for package and system updates. Periodically, the operating system contacts its package repositories to download and update the installed packages. While this operation is in progress, no other yum process should start. So, if a user tries to use yum to install a package, the system responds that another installation is already in progress. The yum process here can be thought of as a “singleton” process: only one instance of yum should be running at any given time. The Singleton pattern is used to maintain consistency and integrity. In the yum example, running multiple instances of the package manager concurrently may break the operating system or make it unstable.
The end goal is to have only one instance of a process running. In our yum example, the singleton pattern is applied through a lock file, /var/run/yum.pid. Each time the yum command runs, it checks for the presence of this file and tries to “lock” it. If the file cannot be locked, another yum process is using it and, therefore, the new instance does not proceed. What happens next depends on how the application logic is implemented. A yum command sleeps until the already-running process exits. Another application may just exit with a non-zero exit code to signal that concurrent execution is not allowed.
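The lock-file idea can be sketched in a few lines of Python. This is only an illustration of the technique on a POSIX system, not yum's actual implementation; the function name try_acquire and the lock path used with it are hypothetical.

```python
import fcntl
import os


def try_acquire(lock_path):
    """Try to become the single running instance.

    Returns an open file object that holds an exclusive lock on lock_path,
    or None if another holder already has the lock.
    """
    handle = open(lock_path, "a+")
    try:
        # LOCK_NB makes the call fail immediately instead of sleeping
        # until the current holder releases the lock.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    # Record our PID so that an operator can see who holds the lock.
    handle.seek(0)
    handle.truncate()
    handle.write(str(os.getpid()))
    handle.flush()
    return handle
```

The lock is released when the returned file object is closed or the process exits, so a crashed instance does not block future runs. Dropping LOCK_NB would give the "sleep until the other process exits" behavior instead of the "exit immediately" one.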
In the above scenario, the yum process is aware that it is (and should be) the only instance running. Another way of implementing the Singleton pattern is through a wrapper script. We could write a bash script that starts the yum process but first checks whether another yum process is already running (perhaps through a command like ps -fe | grep yum). The difference here is that the application itself (yum) is not aware of the constraints imposed on it. It doesn’t know whether or not another instance is running. Take note of this important implementation difference because we’ll apply the same logic to Kubernetes and discuss the use cases for each method.
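A wrapper of this kind might look like the following sketch, written in Python rather than bash. The function names are hypothetical, and the check relies on ps -e -o comm= to list the command names of running processes.

```python
import os
import subprocess
import sys


def running_commands():
    """Return the command names of all running processes, as reported by ps."""
    result = subprocess.run(
        ["ps", "-e", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def run_as_singleton(command, commands=None):
    """Run command unless a process with the same name is already running.

    commands can be supplied for testing; by default it is read from ps.
    Returns 1 when the run is refused, otherwise the command's exit code.
    """
    name = os.path.basename(command[0])
    if commands is None:
        commands = running_commands()
    if any(os.path.basename(c) == name for c in commands):
        print(f"{name} is already running", file=sys.stderr)
        return 1
    return subprocess.run(command).returncode
```

Note that this check-then-start sequence is not atomic: two wrappers started at the same moment can both pass the check. The wrapped application has no say in the matter, which is exactly the property the external approach trades away.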
In a Kubernetes cluster, the usual practice is to run and maintain several instances (replicas) of the application for high availability. A web application that runs on only one Nginx instance is vulnerable to downtime if that instance goes down or gets restarted. However, sometimes this may not best serve the needs of your environment. In a microservices architecture, an application may consist of more than one component. If the application is hosted on Kubernetes, some of those components may need to follow the Singleton pattern when they run.
For example, a web application that needs to consume messages from a message queue sequentially should not have more than one instance running at a time. Let’s see how we can implement the Singleton pattern in Kubernetes using the two methods that we described earlier: from within the application and from outside the application.
The simplest method that comes to mind when we need to run a Singleton Pod in Kubernetes is a ReplicaSet. The ReplicaSet controller ensures that a specific number of Pods are running. Setting the replicas count to 1 seems to do the trick. Let’s look at an example:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: frontend
spec:
  replicas: 1
  selector:
    matchLabels:
      tier: frontend
  template:
    metadata:
      labels:
        tier: frontend
    spec:
      containers:
      - name: php-redis
        image: gcr.io/google_samples/gb-frontend:v3
The above definition uses a ReplicaSet to spawn one Pod. The container running inside the Pod does not know that it is the only instance running. It is not aware of the mechanism that prevents other instances from running. However, a ReplicaSet may not be your best option for implementing the Singleton pattern.
Because of the way ReplicaSets are designed, they ensure that at least a specific number of Pods are running; they do not strictly enforce a maximum at any given time. To better understand this limitation, consider what happens when the node hosting the Pod becomes unhealthy and loses contact with the rest of the cluster. The ReplicaSet controller spawns a replacement Pod on a healthy node without confirming that the original Pod has actually stopped. For as long as the node stays disconnected, the original Pod may keep running alongside its replacement. Once the surplus Pod is detected, one of the two is terminated. However, even if only for a brief period, more than one Pod was running, and the Singleton pattern was violated.
ReplicaSets are easier to implement, yet they favor keeping the application highly available over keeping it consistent. So, if your application is very strict about following the Singleton pattern, you should use a StatefulSet instead. A StatefulSet is stricter about the order in which its Pods start and terminate. A given Pod cannot start unless its predecessor is already running and ready. The termination process works in a similar way. Thus, a StatefulSet controller offers stronger Singleton guarantees than a ReplicaSet. However, StatefulSets come with limitations of their own.
Again, the hosted application is not aware that only one instance of it is running and that additional instances are not allowed.
You can also design the application itself so that only one instance (commonly known as the leader) is active at any given time. Other instances of the application are also running, but they’re in passive mode. When in passive mode, an instance does not respond to client requests. Passive instances constantly wait for the leader to go down so that one of them can be elected as the new leader. Many clustered applications already implement this pattern, such as ZooKeeper, Redis, Consul, and etcd.
etcd is a key-value data store that Kubernetes uses to maintain its state. The data store must be replicated for high availability, but only one instance serves as the single source of truth for the cluster. Applications can use third-party data stores like Consul for this purpose, but they can also use etcd. A prominent example of this type of usage is Apache Camel, which offers a connector to etcd that lets the application make use of etcd’s locking mechanisms. No matter which leader-election application you’re using, the concept remains the same: one instance of the application is active while the rest are inactive. When the active instance goes down, the remaining instances elect a new one to take over.
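The election logic can be illustrated with a lease that expires unless its holder keeps renewing it. The sketch below uses a hypothetical in-memory LeaseStore as a stand-in for a coordination service; it does not use the etcd or Consul APIs, and time is passed in explicitly to keep the example deterministic.

```python
class LeaseStore:
    """In-memory stand-in for a coordination service that holds one lease."""

    def __init__(self):
        self.holder = None
        self.expires_at = 0.0

    def try_acquire(self, candidate, now, ttl):
        """Grant or renew the lease if it is free, expired, or held by candidate."""
        if self.holder is None or now >= self.expires_at or self.holder == candidate:
            self.holder = candidate
            self.expires_at = now + ttl
            return True
        return False


def elect(store, live_candidates, now, ttl=10.0):
    """Let every live candidate try for the lease; return the current holder."""
    for candidate in live_candidates:
        store.try_acquire(candidate, now, ttl)
    return store.holder
```

While the leader keeps renewing, the passive instances fail to acquire the lease and stay passive. Once the leader stops renewing, the lease expires after the TTL and the first instance to ask for it takes over. A real service would have to make try_acquire atomic across the network, which is what the locking primitives of such data stores provide.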
The Singleton pattern describes a scenario where you need only one instance of your application to be running at any given time. But sometimes, you may also want to ensure that a specific number (or percentage) of your Pods remain available. Controllers like ReplicaSets and Deployments work to keep your Pods healthy and running, but in some scenarios you intentionally bring down a portion of them. The most well-known example is draining a node. Draining refers to gradually evicting Pods from a running node until none remain. Then, the node can be brought down for maintenance. A Pod Disruption Budget (PDB) lets you limit how many Pods such an operation may take down at once.
Beware, though, that a Pod Disruption Budget only protects against voluntary Pod eviction. If Pods are lost because a node becomes unhealthy, the PDB is not honored.
Assume that you have a Deployment that spawns five Pod replicas. You know that your application can tolerate running with four Pods but no fewer; if one of those four fails, the whole application breaks. This behavior is not uncommon in clustered applications that need to maintain a quorum of members. Now, because of a kernel bug, you installed a new kernel on one of the cluster nodes and need to restart the node for the new kernel to take effect. Restarting the node requires moving every Pod running on it to other nodes. Without a Pod Disruption Budget (PDB), Kubernetes evicts all the Pods on the node at once. Because those Pods are managed by a controller like a Deployment or a ReplicaSet, replacements are automatically spawned on other healthy nodes. However, if the node hosts more than one of your five Pods, the number of running Pods drops below four before the replacements are ready, and the quorum is violated. With a PDB, an eviction is refused whenever it would leave fewer than the minimum number of Pods available, so the drain has to wait until replacement Pods are ready.
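The arithmetic behind this scenario can be sketched as a small simulation. The function names are hypothetical, and the model assumes that a replacement Pod always becomes ready on another node while the drain waits.

```python
def allowed_disruptions(healthy_pods, min_available):
    """Number of Pods that may be evicted right now without breaking the budget."""
    return max(0, healthy_pods - min_available)


def drain_node(pods_on_node, healthy_pods, min_available):
    """Simulate draining a node that hosts pods_on_node of the healthy Pods.

    Returns the sequence of steps the drain goes through.
    """
    events = []
    remaining = pods_on_node
    while remaining > 0:
        if allowed_disruptions(healthy_pods, min_available) > 0:
            # The budget permits one more eviction.
            healthy_pods -= 1
            remaining -= 1
            events.append("evict")
        else:
            # Eviction refused: wait for a replacement to become ready elsewhere.
            healthy_pods += 1
            events.append("wait-for-replacement")
    return events
```

With five healthy Pods, a minimum of four, and two Pods on the node being drained, the first eviction is allowed, the second is held back until a replacement is ready, and the available count never falls below four.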
The following is an example of a Pod Disruption Budget that ensures that at least four of the Pods labeled app=mycluster are available at any given time:
# policy/v1 is available from Kubernetes 1.21; older clusters use policy/v1beta1
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mycluster-pdb
spec:
  minAvailable: 4
  selector:
    matchLabels:
      app: mycluster
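As you can see, the definition starts with the usual fields: apiVersion, kind, and metadata.name. The spec part has two important parameters: minAvailable, the number of matching Pods that must remain available during a voluntary disruption, and selector, the label selector that determines which Pods the budget covers.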