Pods Common Pattern
A Pod hosts one or more containers, all of which are treated as a single unit. There are several ways to create Pods. For example:
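The most common way is through a controller such as a Deployment, which keeps the Pod running at all times and replaces it if it fails. A minimal sketch (the `nginx` image and the names here are illustrative, not from the original example):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 1              # keep exactly one Pod running at all times
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx       # illustrative long-running service
```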
The typical pattern is to have the Pod running at all times: you usually want your service to keep responding to requests. In some cases, however, you need a container to run once and then terminate.
Pods for running tasks
As mentioned, Kubernetes uses Pods as its building block. Any task that you need Kubernetes to run is done through a Pod. The difference between one Pod type and another lies in the controller that manages it. Kubernetes provides the Job controller, which runs one or more Pods and ensures that all of them terminate successfully. Let's look at an example.
Assume that you have a web application that needs a random number that it reads from a file. To create this number, we create a Job controller that spawns a Pod. The Pod writes the number to the file and exits. A Job definition may look like this:
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: seed-creator
spec:
  completions: 1
  parallelism: 1
  template:
    metadata:
      name: seed-creator
    spec:
      restartPolicy: OnFailure
      containers:
      - image: bash
        name: seed-creator
        command: ["bash", "-c", "echo $RANDOM > /random.txt"]
```
A Job definition contains two fields not found in a bare Pod spec: completions, the number of Pods that must terminate successfully for the Job to be considered done, and parallelism, the maximum number of Pods the Job may run concurrently.
Using a Job vs. using a bare Pod
You may be asking why bother with a dedicated Job definition when a naked (bare) Pod could achieve the same goal. For example, the following Pod definition does the same work as the Job in the previous example:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: seed-creator
spec:
  restartPolicy: OnFailure
  containers:
  - image: bash
    name: seed-creator
    command: ["bash", "-c", "echo $RANDOM > /random.txt"]
```
Applying this definition creates a Pod that executes the same command with the same options. So, why (and when) should you use a Job?
You control the number of times the Pod should run through the completions parameter. With a bare Pod, you would have to do this manually.
Using the parallelism parameter, you can scale up the number of running Pods.
If the node fails while the Job's Pod is running, the Job controller reschedules the Pod on a healthy node. A bare Pod stays failed until you manually delete it and recreate it on another node.
So, as you can see, a Job lifts a lot of the administrative burden by automatically managing the Pods.
The Job patterns
The completions and parallelism parameters allow you to utilize different Job patterns depending on your environment and requirements. Let’s have a look:
Single Job Pattern: Use this pattern when you want to execute a single task. Set both completions and parallelism to 1, or omit them from the definition file and they default to 1. The Job is considered done when the Pod exits with status 0. The first example in this article uses this pattern.
Fixed-count Job Pattern: Use this pattern when the task must be executed a specific number of times. For example, you need to read exactly five files and insert their contents into a database; after each file is read, it is deleted so that the next iteration reads the following file. For this pattern, set the completions parameter higher than 1 (5 in our example). The parallelism parameter is optional here.
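The fixed-count pattern from the file-reading example can be sketched as follows (the image name and command are hypothetical placeholders for the file-reading logic):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: file-reader
spec:
  completions: 5          # run the Pod to successful completion exactly 5 times
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: file-reader
        image: file-reader:latest   # hypothetical image that reads and deletes one file
```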
Work-queue Job Pattern: Use this pattern when you have an undefined number of tasks that need to be done. The typical use case is a message queue. To consume messages from a queue until it is empty, create a Job, leave the completions parameter unset, and set the parallelism value greater than 1 for high throughput.
A Job is considered successful when at least one Pod terminates successfully, and all other Pods terminate as well. Since more than one Pod is running in parallel, it is the responsibility of each Pod to coordinate with other Pods regarding which items every Pod is working on. The first Pod that detects an empty queue will terminate with an exit status of 0. Other Pods also end as soon as they finish processing the messages they’re consuming.
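A work-queue Job along these lines might look like the sketch below (the consumer image is a hypothetical placeholder for code that pulls messages and exits 0 when the queue is empty):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: queue-consumer
spec:
  parallelism: 5          # five worker Pods consume from the queue concurrently
  # completions is deliberately left unset: the Job succeeds once any Pod exits 0
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: consumer
        image: queue-consumer:latest   # hypothetical message-queue consumer
```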
If you have an indefinite stream of work items that need to be processed (think of Twitter messages, for example), you should consider other controllers like ReplicaSets instead. The reason is that such workloads need Pods that are always running and are restarted when they fail.