Kubernetes 1.18 is about to be released. It’s a major release with a lot of new and exciting features. In this article, we look at some of those features and at whether or not you should consider upgrading. So, shall we get started?
The Ability To Use Service Account Tokens As A General Authentication Method
Kubernetes uses service accounts to authenticate services within the cluster. For example, if you want a Pod to manage other Kubernetes resources like a Deployment or a Service, you can associate it with a Service Account and create the necessary roles and role bindings. Workloads running under a Kubernetes Service Account (KSA) send JSON Web Tokens (JWT) to the API server to authenticate themselves. This makes the API server the single source of authentication for service accounts.
So, what if the entity needs to authenticate against some other service outside the cluster? To accept the KSA token, the external authenticator must reach out to the API server to validate the request. However, the API server shouldn’t be publicly accessible, so you have to resort to a different authentication system, which adds more complexity. Even if the third-party service lies within the cluster, where the API server is accessible, the validation requests add more load on the API server, which is generally undesirable.
Kubernetes 1.18 offers feature #1393, which makes the API server provide an OpenID Connect (OIDC) discovery document that exposes the tokens’ public keys in addition to other metadata. OIDC authenticators can use this data to authenticate a token without having to refer to the API server first.
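To make the flow more concrete, here is a minimal Python sketch of the first steps an external verifier might take: read the issuer from the token, derive the location of the discovery document, and find where the public keys are published. The issuer URL, the service account name, and the sample discovery document are made up for illustration. The token payload is decoded without any signature check, so a real verifier would still have to fetch the keys and verify the signature with a JWT library.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode the payload of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    # JWTs use unpadded base64url, so restore the padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def discovery_url(issuer: str) -> str:
    """OIDC discovery documents live at a well-known path under the issuer URL."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def jwks_uri(discovery_document: dict) -> str:
    """The discovery document says where the public keys (JWKS) are served."""
    return discovery_document["jwks_uri"]


def _b64url(obj: dict) -> str:
    """Helper used only to build the hypothetical sample token below."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Hypothetical token and discovery document, assumed for this example.
sample_token = ".".join([
    _b64url({"alg": "RS256", "typ": "JWT"}),
    _b64url({
        "iss": "https://issuer.example.com",
        "sub": "system:serviceaccount:default:my-app",
    }),
    "signature-placeholder",
])
sample_discovery = {
    "issuer": "https://issuer.example.com",
    "jwks_uri": "https://issuer.example.com/openid/v1/jwks",
}

claims = jwt_claims(sample_token)
print(discovery_url(claims["iss"]))
print(jwks_uri(sample_discovery))
```

The point of the design is that both documents are plain HTTP resources, so the verifier can cache the keys and validate many tokens without calling the API server each time.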
The Ability To Configure HPA Velocity For Specific Pods
The Horizontal Pod Autoscaler (HPA) enables your Kubernetes cluster to react automatically to high or low traffic. Through HPA, you can instruct the controller to create more pods in response to CPU spikes, other metrics, or application-provided metrics. For cost optimization, HPA will terminate excess pods when they are no longer needed, for example when the load drops again. HPA adds and removes pods at a configurable speed to avoid rapid swings in pod creation and destruction during unstable periods. However, this speed is currently configurable only at the cluster level, so every workload scales at the same velocity.
In a typical microservices application, you often have services that are more important than others. Assume that you host a web application on Kubernetes that does the following tasks:
- Respond to end client requests (frontend).
- Process data supplied by the clients (that includes performing CPU-heavy operations like map-reduce).
- Process less important data (for example, archiving, cleanup, etc.)
From the above, it’s clear that task #1 requires pods to scale up faster so that the application can handle increased client traffic swiftly and efficiently. Additionally, they should scale down very slowly in anticipation of another traffic spike in the near future.
Task #2 also needs its pods to scale up very fast in response to an increased data volume. Data processing should not be delayed in mission-critical applications. However, these pods should also scale down very fast, since they consume a lot of resources that must be made available to other services as soon as they are no longer needed.
Due to their importance, we can tolerate pods belonging to tasks #1 and #2 responding to false positives. After all, wasting some resources is better than losing clients.
Pods serving task #3 do not need special arrangements. They can be scaled up and down in the usual manner.
Kubernetes 1.18 offers feature #853, which allows scaling behavior to be configured through the HPA behavior field. Behaviors are specified separately for scaling up and scaling down, in the scaleUp and scaleDown sections under the behavior field. Because the behavior field is part of each HPA object, every workload can get its own scaling velocity.
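As an illustration of the frontend case (task #1), the following Python sketch builds an HPA manifest that scales up aggressively and scales down slowly, then prints it as JSON. The field names follow the autoscaling/v2beta2 API. The object names, replica bounds, CPU target, and the concrete windows and rates are assumptions picked for the example, not recommendations.

```python
import json


def scaling_rules(window_seconds, policy_type, value, period_seconds):
    """Build one scaleUp or scaleDown section of the HPA behavior field."""
    return {
        "stabilizationWindowSeconds": window_seconds,
        "policies": [
            {"type": policy_type, "value": value, "periodSeconds": period_seconds}
        ],
    }


def hpa_manifest(name, deployment, scale_up, scale_down):
    """Build an HPA that targets a Deployment and carries its own behavior."""
    return {
        "apiVersion": "autoscaling/v2beta2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            # Replica bounds and the CPU target are assumed values.
            "minReplicas": 2,
            "maxReplicas": 20,
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {"type": "Utilization", "averageUtilization": 70},
                },
            }],
            "behavior": {"scaleUp": scale_up, "scaleDown": scale_down},
        },
    }


frontend_hpa = hpa_manifest(
    name="frontend",
    deployment="frontend",
    # Scale up at once: no stabilization window, may double every 15 seconds.
    scale_up=scaling_rules(0, "Percent", 100, 15),
    # Scale down slowly: wait 10 minutes, then remove one pod per minute.
    scale_down=scaling_rules(600, "Pods", 1, 60),
)

print(json.dumps(frontend_hpa, indent=2))
```

For task #2, the same helper could be called with a short scaleDown window and a larger rate, while task #3 can simply omit the behavior field and keep the defaults.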
Introducing Profiles To Run Multiple Scheduler Configurations
For a quick refresher, the Kubernetes scheduler (kube-scheduler) is the component that controls which pods get deployed (scheduled) to which nodes. The scheduler’s decision depends on several conditions, including node affinity/anti-affinity, the requests and limits configured on the pods, and resource and node availability.
Generally, there are two types of workloads in Kubernetes: long-running services (for example, web servers and APIs) and tasks that run to completion (better known as Jobs). Because these workload types need different scheduling decisions, some users resort to creating separate clusters for different needs, for example one cluster for data mining and another for serving the application’s APIs. The default scheduler configuration favors high availability, which does not suit every workload. Other users would rather not have their workloads dispersed among multiple clusters. Instead, they opt to install multiple schedulers on the same cluster, each with its own decision-making rules. However, having multiple schedulers in the same cluster may introduce race conditions, because each scheduler has its own view of the cluster and of the appropriate scheduling decision.
Feature #1451 allows you to use one scheduler for the cluster, but with different profiles. Each profile is identified by its schedulerName, and pods set schedulerName in their spec to choose which profile to use. In the end, it’s the same scheduler doing all the work, which avoids race conditions.
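A minimal Python sketch of what this could look like: one scheduler configuration with two profiles, and a pod that picks the second profile through schedulerName. The API version is the one I believe Kubernetes 1.18 uses for scheduler configuration, and the structure follows the profiles layout. The profile name batch-scheduler, the pod, and its image are hypothetical, and disabling all scoring plugins is just one way to make a profile behave differently.

```python
import json

# One scheduler, two profiles. The second profile is an assumed example
# that turns off all scoring plugins.
scheduler_config = {
    "apiVersion": "kubescheduler.config.k8s.io/v1alpha2",
    "kind": "KubeSchedulerConfiguration",
    "profiles": [
        {"schedulerName": "default-scheduler"},
        {
            "schedulerName": "batch-scheduler",  # hypothetical profile name
            "plugins": {
                "preScore": {"disabled": [{"name": "*"}]},
                "score": {"disabled": [{"name": "*"}]},
            },
        },
    ],
}

# A pod selects a profile by name; everything else about it is unchanged.
batch_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "nightly-report"},  # hypothetical pod
    "spec": {
        "schedulerName": "batch-scheduler",
        "containers": [{"name": "report", "image": "example.com/report:1.0"}],
    },
}

profile_names = [p["schedulerName"] for p in scheduler_config["profiles"]]

print(json.dumps(scheduler_config, indent=2))
print(json.dumps(batch_pod, indent=2))
```

Pods that do not set schedulerName keep using the default-scheduler profile, so existing workloads need no changes.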
The Ability To Define Even Pod Spreading Rule At The Cluster Level
First introduced in Kubernetes 1.16, Even Pod Spreading lets you ensure that pods are scheduled across availability zones (provided that you are using a multi-zone cluster) in a way that maximizes availability and resource utilization. The feature works through the topologySpreadConstraints field, whose topologyKey names a node label. Nodes that carry the same value for that label belong to the same zone. The setting aims at distributing pods evenly among the zones. The drawback, however, is that this setting must be applied at the Pod level. Any pods that do not have the configuration parameter will not be distributed evenly across failure domains.
Feature #895 allows you to define default spreading constraints for pods that don't provide any topologySpreadConstraints. Pods that already have this setting defined will override the one set on the global level.
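The sketch below shows, in Python, how such cluster-level defaults might be expressed as arguments to the scheduler’s PodTopologySpread plugin, along with a tiny helper that mirrors the override rule described above. The zone label key and the maxSkew and whenUnsatisfiable values are assumptions for the example, and effective_constraints is a hypothetical illustration of the rule, not scheduler code.

```python
import json

# Default constraints, assumed values. The label key must match the zone
# label that your nodes actually carry.
default_constraints = [
    {
        "maxSkew": 1,
        "topologyKey": "topology.kubernetes.io/zone",
        "whenUnsatisfiable": "ScheduleAnyway",
    }
]

scheduler_config = {
    "apiVersion": "kubescheduler.config.k8s.io/v1alpha2",
    "kind": "KubeSchedulerConfiguration",
    "profiles": [
        {
            "pluginConfig": [
                {
                    "name": "PodTopologySpread",
                    "args": {"defaultConstraints": default_constraints},
                }
            ]
        }
    ],
}


def effective_constraints(pod_spec, defaults):
    """Pods with their own topologySpreadConstraints override the defaults."""
    own = pod_spec.get("topologySpreadConstraints")
    return own if own else defaults


plain_pod_spec = {"containers": [{"name": "app", "image": "example.com/app:1.0"}]}
custom_pod_spec = {
    "containers": [{"name": "app", "image": "example.com/app:1.0"}],
    "topologySpreadConstraints": [
        {
            "maxSkew": 2,
            "topologyKey": "kubernetes.io/hostname",
            "whenUnsatisfiable": "DoNotSchedule",
            "labelSelector": {"matchLabels": {"app": "demo"}},
        }
    ],
}

print(json.dumps(scheduler_config, indent=2))
print(effective_constraints(plain_pod_spec, default_constraints)[0]["topologyKey"])
print(effective_constraints(custom_pod_spec, default_constraints)[0]["topologyKey"])
```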
Support For ContainerD On Windows
When we say “Kubernetes”, we almost always think of Linux. Tutorials, most books, and the literature in general treat Linux as the de facto OS on which Kubernetes runs. However, Microsoft has taken serious steps to support running Kubernetes on its Windows Server line of products. Those steps included adding support for the ContainerD runtime version 1.3. Windows Server 2019 includes an updated host container service (HCS v2) that offers more control over container management, which may improve Kubernetes API compatibility. Yet the current Docker release (EE 18.09) is not ready to work with HCS v2; only ContainerD is. Using the ContainerD runtime allows for more compatibility between Windows and Kubernetes, which means more features will be available. Feature #1001 introduces support for ContainerD version 1.3 on Windows as a container runtime used through the Container Runtime Interface (CRI).
Support For RuntimeClass And Labels For Multiple Windows Versions In The Same Cluster
Since Microsoft is actively supporting various Kubernetes features (see the previous section), it’s not uncommon today to see mixed clusters that operate on both Linux and Windows nodes. RuntimeClass was introduced as early as Kubernetes 1.12, with major enhancements in Kubernetes 1.14. It lets you select the container runtime on which specific pods should run. Now, with Kubernetes 1.18, RuntimeClass supports Windows nodes. So, you can select nodes that run a specific Windows build to schedule pods that should run only on that Windows version.
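Here is a rough Python sketch of a RuntimeClass that steers pods to nodes running one particular Windows build, plus a pod that opts in through runtimeClassName. The handler name, the build number, the taint that the toleration matches, and the pod itself are assumptions for illustration. The node labels are the well-known kubernetes.io/os and node.kubernetes.io/windows-build labels.

```python
import json

windows_runtime_class = {
    "apiVersion": "node.k8s.io/v1beta1",
    "kind": "RuntimeClass",
    "metadata": {"name": "windows-1809"},
    # The handler must match a runtime configured on the nodes (assumed here).
    "handler": "docker",
    "scheduling": {
        # Pods using this class only land on nodes with all of these labels.
        "nodeSelector": {
            "kubernetes.io/os": "windows",
            "kubernetes.io/arch": "amd64",
            "node.kubernetes.io/windows-build": "10.0.17763",
        },
        # Assumes the Windows nodes are tainted with os=windows:NoSchedule.
        "tolerations": [
            {
                "key": "os",
                "operator": "Equal",
                "value": "windows",
                "effect": "NoSchedule",
            }
        ],
    },
}

windows_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "iis-demo"},  # hypothetical pod
    "spec": {
        "runtimeClassName": windows_runtime_class["metadata"]["name"],
        "containers": [
            {"name": "iis", "image": "example.com/iis-demo:1.0"}  # hypothetical image
        ],
    },
}

print(json.dumps(windows_runtime_class, indent=2))
print(json.dumps(windows_pod, indent=2))
```

The pod does not repeat the node selector or toleration; they are merged in from the RuntimeClass, so one class per Windows build keeps the pod specs short.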
The Ability To Skip Volume Ownership Change
By default, when a volume is mounted to a container in a Kubernetes cluster, all the files and directories inside that volume have their ownership changed to the value provided through the fsGroup. The reason for doing this is to enable the volume to be readable and writeable by the fsGroup. However, this behavior proved to be undesirable in some cases. For example:
- Some applications (like databases) are sensitive to file permission and ownership modifications. Those applications may fail to start after the volume is mounted.
- When the volume is very large (>1TB) and/or the number of files and directories it contains is huge, the chown and chmod operations may take too long. In some cases, they may cause a timeout when starting the pod.
Feature #695 provides the fsGroupChangePolicy parameter, which can be set to Always to maintain the default behavior, or to OnRootMismatch, which triggers the modification process only if the ownership and permissions of the volume’s top-level directory do not match the fsGroup value.
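A short Python sketch of a pod that opts into the lighter-weight policy follows. The pod name, image, group ID, and PersistentVolumeClaim name are hypothetical. As far as I know the field is alpha in 1.18 and sits behind the ConfigurableFSGroupPolicy feature gate, so treat that detail as an assumption to check against the release notes.

```python
import json

# The two values described in the text above.
ALLOWED_POLICIES = ("Always", "OnRootMismatch")


def pod_security_context(fs_group, change_policy="Always"):
    """Build a pod-level securityContext with an fsGroup change policy."""
    if change_policy not in ALLOWED_POLICIES:
        raise ValueError(f"unsupported fsGroupChangePolicy: {change_policy}")
    return {"fsGroup": fs_group, "fsGroupChangePolicy": change_policy}


database_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "database"},  # hypothetical pod
    "spec": {
        # Skip the recursive chown/chmod when the volume root already matches.
        "securityContext": pod_security_context(2000, "OnRootMismatch"),
        "containers": [
            {
                "name": "db",
                "image": "example.com/db:1.0",  # hypothetical image
                "volumeMounts": [{"name": "data", "mountPath": "/var/lib/db"}],
            }
        ],
        "volumes": [
            {"name": "data", "persistentVolumeClaim": {"claimName": "db-data"}}
        ],
    },
}

print(json.dumps(database_pod, indent=2))
```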
Allowing Secrets And ConfigMaps To Be Immutable
Since the early days of Kubernetes, we’ve been using ConfigMaps to inject configuration data into our containers. When the data is sensitive, a Secret is used. The most common way of presenting the data to containers is by mounting a file that contains the data. However, when a change is made to a ConfigMap or Secret, this change is propagated to all the pods that have the configuration file mounted. This may not be the best way of applying changes to a running cluster. If the new configuration is faulty, we risk breaking the running application.
When modifying a Deployment, the change is applied through a rolling-update strategy where new pods get created while the old ones still function before they get deleted. This strategy ensures that if the new pods fail to start, the application will still be working on the old pods. A similar approach is now possible for ConfigMaps and Secrets by making them immutable through the immutable field. When an object is immutable, the API will reject any changes made to it. In order to modify the object, you will have to delete it and recreate it while also recreating all the pods that use it. Using Deployment rolling updates, you can avoid application outages due to a bad configuration change by ensuring that new pods are working properly with the new configuration before destroying the old ones.
Additionally, making ConfigMaps and Secrets immutable means that they no longer have to be watched or polled for changes at regular intervals, which reduces the load on the API server. Feature #1412 can be enabled in Kubernetes 1.18 by enabling the ImmutableEphemeralVolumes feature gate, then setting the immutable value to true in the ConfigMap or Secret resource file.
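Because an immutable object cannot be edited in place, one possible convention (an assumption of this sketch, not something the feature requires) is to put a hash of the contents in the object name. Every configuration change then yields a new ConfigMap, and pointing the Deployment at the new name triggers a normal rolling update. The names and data below are made up.

```python
import hashlib
import json


def immutable_configmap(base_name, data):
    """Build an immutable ConfigMap whose name includes a hash of its data."""
    digest = hashlib.sha256(json.dumps(data, sort_keys=True).encode()).hexdigest()
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": f"{base_name}-{digest[:8]}"},
        "data": data,
        # Once created, the API server rejects changes to the data.
        "immutable": True,
    }


old_config = immutable_configmap("app-config", {"LOG_LEVEL": "info"})
new_config = immutable_configmap("app-config", {"LOG_LEVEL": "debug"})

# Different contents give different names, so both objects can coexist
# while old pods drain and new pods start.
print(old_config["metadata"]["name"])
print(new_config["metadata"]["name"])
print(json.dumps(new_config, indent=2))
```

If the new pods fail with the new ConfigMap, the old pods keep running against the old object, which is the safety property described above.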
Giving The User More Troubleshooting Power Using Kubectl Debug
As a Kubernetes user, when you need visibility into the running pods, you are limited to kubectl exec and kubectl port-forward. With the Kubernetes 1.18 release, you also have the kubectl debug command at your disposal (in 1.18 it ships as an alpha command, kubectl alpha debug). The command allows you to:
- Deploy an ephemeral container to a running pod. Ephemeral containers are short-lived. They usually contain the necessary debugging tools. Since they are being launched within the same pod, they have access to the same network and filesystem that the other containers have. This can greatly help you troubleshoot a problem or trace an issue.
- Restart a pod in-place with a modified PodSpec. This allows you to do things like change the container’s source image or permissions.
- Start a privileged container in the host namespaces. This allows you to troubleshoot node problems.
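For the first use case in the list above, the following Python sketch assembles the command line that would attach an ephemeral debugging container to a running pod. It only builds and prints the command; nothing is executed. The pod, container, and namespace names are hypothetical, busybox is just a convenient debugging image, and the sketch assumes the 1.18 form of the command (kubectl alpha debug) on a cluster where ephemeral containers are enabled.

```python
import shlex


def debug_command(pod, image, target=None, namespace=None):
    """Build the argv for attaching an ephemeral debug container to a pod."""
    argv = ["kubectl", "alpha", "debug", "-it", pod, f"--image={image}"]
    if target:
        # Share the process namespace of an existing container in the pod.
        argv.append(f"--target={target}")
    if namespace:
        argv.append(f"--namespace={namespace}")
    return argv


# Hypothetical pod and container names.
argv = debug_command("web-7d4b9", "busybox", target="web", namespace="shop")
command_line = " ".join(shlex.quote(arg) for arg in argv)
print(command_line)
```

Returning a list rather than a single string keeps the result safe to pass to subprocess.run without going through a shell.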
Kubernetes is an ever-changing technology with more and more features added in each release. In this article, we briefly discussed some of the most interesting new features in the latest Kubernetes release, 1.18. However, it goes without saying that upgrading your Kubernetes cluster is not a decision to be taken lightly. Have a look at the above features and also consult the official documentation before making up your mind.