Kubernetes Patterns: The Init Container Pattern

 

The Initialization Pattern

Initialization logic is common across programming languages. In object-oriented languages, it lives in the constructor: a function (or method) that is called whenever an object is instantiated. The purpose of the constructor is to “prepare” the object for the work it is about to do: it sets default values for variables, creates database connection objects, and ensures that the prerequisites the object needs are in place. For example, a user object needs at least the username, the first name, and the last name of the user before it can function correctly. Constructor syntax differs among languages, yet in all of them the constructor is invoked only once, at object instantiation.

The purpose of the initialization pattern is to decouple an object from its initialization logic. So, if an object needs some seed data to be fed into a database, this falls under the constructor logic rather than the application logic. This allows you to make changes to how the object “starts” without affecting how it “works”.

Kubernetes uses the same pattern. While the object is the atomic unit of object-oriented languages, Kubernetes has Pods. So, if you have an application running in a container that needs some initialization logic, it’s good practice to hand this work to another container. Kubernetes has a type of container for that specific job: init containers.

What are Init Containers?

In Kubernetes, an init container is one that starts and runs to completion before the other containers in the same Pod. It’s meant to perform initialization logic for the main application hosted on the Pod, such as creating the necessary user accounts, performing database migrations, or creating database schemas.

Init Containers Design Considerations

There are some considerations that you should take into account when you create init containers:

  • They always get executed before other containers in the Pod. So, they shouldn’t contain complex logic that takes a long time to complete. Startup scripts are typically small and concise. If you find that you’re adding too much logic to init containers, you should consider moving part of it to the application container itself.
  • Init containers are started and executed in sequence. An init container is not invoked unless its predecessor is completed successfully. Hence, if the startup task is very long, you may consider breaking it into a number of steps, each handled by an init container so that you know which steps fail.
  • If an init container fails, the kubelet restarts it repeatedly until it succeeds. If the Pod’s restartPolicy is set to Never, the whole Pod is marked as failed instead. When the Pod itself is restarted, all of its init containers are executed again. So, you need to ensure that the startup logic tolerates being executed multiple times without causing duplication. For example, if a DB migration is already done, executing the migration command again should have no effect.
  • An init container is a good candidate for delaying the application initialization until one or more dependencies are available. For example, if your application depends on an API that imposes a request-rate limit, you may need to wait for a certain period before that API responds again. Implementing this logic in the application container can be complex because it needs to be combined with health and readiness probes. A much simpler way is to create an init container that waits until the API is ready before it exits successfully. The application container starts only after the init container has done its job successfully.
  • Init containers cannot use health and readiness probes as application containers do. The reason is that they are meant to start and exit successfully, much like how Jobs and CronJobs behave.
  • All containers on the same Pod share the same Volumes and network. You can make use of this feature to share data between the application and its init containers.
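
The sequencing, idempotency, and shared-volume points above can be sketched in a single manifest. This is a minimal sketch rather than a definitive implementation: the Pod name, the container names, the /work mount path, and the marker file are all hypothetical, and the echo commands stand in for real startup steps.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: init-sequence-demo        # hypothetical name
spec:
  initContainers:
    # Step 1 runs first. It skips its work if a previous run already
    # completed it, so re-running the step causes no duplication.
    - name: step-1-prepare
      image: busybox:1.28
      command: ['sh', '-c', 'if [ -f /work/prepared ]; then echo already prepared; else echo preparing && touch /work/prepared; fi']
      volumeMounts:
        - mountPath: /work
          name: workdir
    # Step 2 starts only after step 1 has exited successfully.
    - name: step-2-verify
      image: busybox:1.28
      command: ['sh', '-c', 'test -f /work/prepared && echo step 1 output is visible']
      volumeMounts:
        - mountPath: /work
          name: workdir
  containers:
    # The application container sees whatever the init containers left behind.
    - name: app
      image: busybox:1.28
      command: ['sh', '-c', 'ls /work && sleep 3600']
      volumeMounts:
        - mountPath: /work
          name: workdir
  volumes:
    - name: workdir
      emptyDir: {}
```

An emptyDir volume survives container restarts within the same Pod, so the marker file turns the first step into a no-op on a retry. If a step fails, kubectl describe pod init-sequence-demo shows which init container did not complete.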

Init Containers “Requests” And “Limits” Behavior

As we’ve just discussed, init containers always run before the application containers in the same Pod. As a result, the scheduler takes their requests and limits into account: for each resource, the Pod’s effective request is the higher of the largest request of any single init container and the sum of the requests of all application containers. This behavior must be considered carefully, as it may lead to undesired results. For example, if you have one init container and one application container, and you set the requests of the init container higher than those of the application container, then the entire Pod is scheduled only if there’s an available node that satisfies the init container’s requirements. In other words, even if there’s an unused node where the application container could run, the Pod will not be placed on it if the init container requests more resources than that node can provide. Hence, you should be as strict as possible when defining the requests and limits of an init container. As a best practice, do not set those parameters to higher values than the application containers’ unless absolutely required.
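
To make the arithmetic concrete, here is a sketch with hypothetical names and values. For each resource, Kubernetes computes the Pod's effective request as the larger of the highest init container request and the sum of the application containers' requests.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: requests-demo             # hypothetical name
spec:
  initContainers:
    - name: init-heavy
      image: busybox:1.28
      command: ['sh', '-c', 'echo initializing']
      resources:
        requests:
          cpu: "1"                # highest init container CPU request
          memory: 512Mi
  containers:
    - name: app
      image: busybox:1.28
      command: ['sh', '-c', 'sleep 3600']
      resources:
        requests:
          cpu: 250m
          memory: 128Mi
    - name: sidecar
      image: busybox:1.28
      command: ['sh', '-c', 'sleep 3600']
      resources:
        requests:
          cpu: 250m
          memory: 128Mi
# Effective CPU request:    max(1, 250m + 250m)     = 1 CPU
# Effective memory request: max(512Mi, 2 * 128Mi)   = 512Mi
```

A node with 500m of free CPU could run both application containers, yet this Pod would not be scheduled there because of the init container's 1 CPU request.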

Scenario 01: Seeding a Database

In this scenario, we are serving a MySQL database. This database is used for testing an application. It doesn’t have to contain real data, but it must be seeded with enough data so that we can test the application's query speed. We use an init container to download the SQL dump file into a directory from which the database, hosted in another container, restores it at startup.

The definition file may look like this:

apiVersion: v1
kind: Pod
metadata:
  name: mydb
  labels:
    app: db
spec:
  initContainers:
    # Downloads the SQL dump into the shared volume, then exits.
    - name: fetch
      image: mwendler/wget
      command: ["wget","--no-check-certificate","https://sample-videos.com/sql/Sample-SQL-File-1000rows.sql","-O","/docker-entrypoint-initdb.d/dump.sql"]
      volumeMounts:
        - mountPath: /docker-entrypoint-initdb.d
          name: dump
  containers:
    # Starts only after the init container has completed successfully.
    - name: mysql
      image: mysql
      env:
        - name: MYSQL_ROOT_PASSWORD
          value: "example"
      volumeMounts:
        - mountPath: /docker-entrypoint-initdb.d
          name: dump
  volumes:
    # Scratch volume shared by both containers for the lifetime of the Pod.
    - emptyDir: {}
      name: dump

The above definition creates a Pod that hosts two containers: the init container and the application one. Let’s have a look at the interesting aspects of this definition:

  • The init container is responsible for downloading the SQL file that contains the database dump. We use the mwendler/wget image because we only need the wget command.
  • The destination directory for the downloaded SQL is the directory used by the MySQL image to execute SQL files (/docker-entrypoint-initdb.d). This behavior is built into the MySQL image that we use in the application container.
  • The init container mounts an emptyDir volume at /docker-entrypoint-initdb.d. Because both containers are hosted on the same Pod and mount the same volume, the database container has access to the SQL file that the init container placed there.

What Would Have Happened If We Hadn’t Used Init Containers?

In this example, we use the initialization pattern to apply the separation-of-concerns best practice. If we implemented the same logic without an init container, we’d have to create a new image based on the mysql base image, install wget, and use it to download the SQL file. The drawbacks of this approach are:

  • If we need to make any changes to the download logic, we need to create a new image, push it and change its reference in the definition file. This adds the burden of having to maintain your custom image.
  • It creates a tightly-coupled relation between the DB container and its startup logic, which makes the application harder to manage and increases the possibility of introducing errors and bugs.

Scenario 02: Delaying The Application Launch Until The Dependencies Are Ready

Another common use case for init containers is when you need your application to wait until another service is fully running (responding to requests). The following definition demonstrates this scenario:

apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels:
    app: myapp
spec:
  initContainers:
  - name: init-myservice
    image: busybox:1.28
    command: ['sh', '-c', 'until nslookup myservice; do echo waiting for myservice; sleep 2; done;']
  containers:
  - name: myapp-container
    image: busybox:1.28
    command: ['sh', '-c', 'echo The app is running! && sleep 3600']

Assume that our application, running in myapp-container, does not function correctly unless the myservice application is running. We need to delay the launch of myapp until myservice is ready. We do this with a simple nslookup command in the init container that repeatedly checks whether the name “myservice” resolves. If nslookup resolves “myservice”, the Service exists in the cluster DNS, which we treat as the signal that it has started. The loop then ends with a success exit code, and the init container terminates, giving way for the application container to start. Otherwise, the container sleeps for two seconds before trying again, delaying the application container start.
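
Successful name resolution only shows that the Service object exists, not that anything behind it is answering. If a stronger check is needed, the init container can poll the service itself. The sketch below assumes, hypothetically, that myservice answers HTTP GET requests on port 80 at the path /, and the Pod and container names are made up for illustration; adjust the URL to whatever your dependency exposes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod-http-wait       # hypothetical name
  labels:
    app: myapp
spec:
  initContainers:
  - name: init-myservice-http
    image: busybox:1.28
    # Retry every 2 seconds until an HTTP request to the service succeeds.
    # -T 2 sets a 2-second network timeout; the response body is discarded.
    command: ['sh', '-c', 'until wget -q -T 2 -O /dev/null http://myservice:80/; do echo waiting for myservice; sleep 2; done']
  containers:
  - name: myapp-container
    image: busybox:1.28
    command: ['sh', '-c', 'echo The app is running! && sleep 3600']
```

The trade-off is that this check ties the init container to the protocol and path of the dependency, whereas the DNS check works for any kind of Service.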

For completeness, this is the definition file for myservice:

apiVersion: v1
kind: Service
metadata:
  name: myservice
spec:
  ports:
  - protocol: TCP
    port: 80
    targetPort: 9376
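
Note that the Service above has no selector, so Kubernetes does not attach any Pods to it automatically; the name still resolves, which is all the nslookup check needs. A variant that routes traffic to real backends might look like the following sketch, where app: myservice is a hypothetical label that the backing Pods would have to carry.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myservice
spec:
  selector:
    app: myservice      # hypothetical label on the backend Pods
  ports:
  - protocol: TCP
    port: 80            # port exposed by the Service
    targetPort: 9376    # port the backend containers listen on
```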

TL;DR

  • The initialization pattern is an important practice to follow when designing applications that need startup logic.
  • Kubernetes offers init containers as a means of decoupling application logic from its startup procedure.
  • Placing the application initialization logic in an init container offers a number of advantages:
    • You enforce the separation of concerns principle. An application can have its own team of engineers, while its initialization logic is authored by another team.
    • Having a separate team working on the initialization steps of an application gives the company more flexibility when it comes to authorization and access control. For example, if launching an application requires working with resources that need a security clearance (for example, modifying firewall rules), this can be done by those with suitable credentials. The application team is not involved in the operation.
    • If there are too many initialization steps involved, you can break them into a number of init containers that execute in turn. If one step fails, the init container reports an error, which gives you a better insight into which part of the logic was not successful.
  • There are some considerations to take into account when working with init containers:
    • Init containers are restarted when they fail. Hence, their code must be idempotent.
    • Init containers’ requests and limits are taken into account by the scheduler. An incorrect value can negatively affect the scheduler’s decision on where to place the whole Pod (including the application containers).

*The outline of this article is inspired by the book Kubernetes Patterns by Roland Huss and Bilgin Ibryam.

Mohamed Ahmed

Oct 28, 2019