Background
Cloud-native application development offers substantial benefits, but it also carries significant risks. Modern development practices and technologies, like CI/CD, containers, microservices, and self-provisioning, require greater visibility and control to account for the faster and more dynamic software delivery cycle.
The 24/7 nature of modern software systems, which are expected to be highly available, seamlessly responsive, and infinitely scalable, presents unique challenges to traditional development and operations teams.
Operations and security teams can’t possibly administer all cloud instances and workloads manually to ensure adherence to operational best practices, security rules, and organizational standards. Manual administration slows your innovation cadence, and it doesn’t scale.
This can be a lot to consider. If you are rethinking your organizational culture and adopting DevSecOps, you will also want a Policy-as-Code strategy to maintain your pace of innovation.
Codifying and automating your operational best practices, security rules, and organizational standards is a must to ensure a healthy balance between an organization’s rate of innovation and its risk posture.
What is Policy-as-Code?
Similar to the concept of Infrastructure-as-Code (IaC) and the benefits you get from codifying your infrastructure setup, Policy-as-Code (PaC) is the codification of your policies. You can think of your policies as the linting rules for your IaC.
Overall, Policy-as-Code refers to the ability of your operations team to verify and enforce certain rules and standards across the entire organization or within specific clusters. By reducing variations in your infrastructure, you reduce your maintenance cost and attack surface. Codified policies also enable the automation of common tasks, which improves efficiency at an organizational level because you prevent bad setup and configuration from leaking into your production environment.
The policies you want to enforce come from your organization’s established guidelines, agreed-upon conventions, and best practices within the industry. They can also be derived from tribal knowledge that has accumulated over the years within your operations and development teams.
Policy Examples for Kubernetes
Policies could be established for multiple areas of your operational environments. You want your Kubernetes clusters to be reliable and secure and you want to control who has access to what. You also want to limit the usage of available infrastructure resources and enforce some quotas.
Additionally, you want to enforce rules for your network ingress and egress traffic. In this section, we will cover some examples of policies you might want to enforce on top of Kubernetes. This should be regarded as a starting point and is not intended to be a complete or comprehensive list.
Reliability
You want to ensure and improve the continuity of your business applications. This is done by making your system highly available and fault-tolerant. Here are some examples of policies you can enforce in your Kubernetes clusters:
- Verify that the spec’s replicas count is 2 or greater, to ensure redundancy in your ReplicaSets for fault tolerance.
- Ensure that affinity.podAntiAffinity is set in your deployment spec to avoid having multiple pods from the same deployment running on the same node.
- Check that readinessProbe and livenessProbe are defined in your container’s spec to guarantee that only healthy pods get traffic.
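The three reliability rules above can be sketched as a small check over a Deployment manifest that has already been parsed into a dictionary. This is only an illustration, not how any particular policy engine implements the rules; the function name check_reliability and the violation messages are assumptions made for this example.

```python
from typing import Any, Dict, List


def check_reliability(deployment: Dict[str, Any]) -> List[str]:
    """Return reliability violations for a Deployment manifest given as a dict.

    Hypothetical helper for illustration; not part of any library.
    """
    violations: List[str] = []
    spec = deployment.get("spec") or {}

    # Rule 1: at least two replicas for redundancy.
    # Kubernetes defaults replicas to 1 when the field is omitted.
    if spec.get("replicas", 1) < 2:
        violations.append("spec.replicas must be 2 or greater")

    pod_spec = (spec.get("template") or {}).get("spec") or {}

    # Rule 2: pod anti-affinity must be configured.
    affinity = pod_spec.get("affinity") or {}
    if not affinity.get("podAntiAffinity"):
        violations.append("affinity.podAntiAffinity must be set")

    # Rule 3: every container needs both probes.
    for container in pod_spec.get("containers") or []:
        name = container.get("name", "?")
        for probe in ("readinessProbe", "livenessProbe"):
            if probe not in container:
                violations.append(f"container '{name}' is missing {probe}")

    return violations
```

Returning a list of messages rather than a boolean makes it easy to report every violation in one pass.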
Security
Define rules and conditions related to access and privilege that pods must meet to be allowed to run in your cluster. For example:
- Control the user IDs and group IDs allowed for your containers to run by checking runAsUser and runAsGroup.
- Enforce the settings of allowPrivilegeEscalation=false and runAsNonRoot=true so the container and its child processes cannot escalate their privileges or change their capabilities.
- Require that containers have read-only access to the filesystem by ensuring that readOnlyRootFilesystem is set.
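As a rough sketch, the security rules above could be checked against a single container entry from a pod spec. The allowed ID range and the function name are hypothetical, and the sketch only inspects the container-level securityContext; a fuller check would also consider values inherited from the pod-level securityContext.

```python
from typing import Any, Dict, List

# Hypothetical allowed range for user and group IDs; use your own values.
ALLOWED_IDS = range(1000, 65536)


def check_container_security(container: Dict[str, Any]) -> List[str]:
    """Return security violations for one container dict (illustrative only)."""
    violations: List[str] = []
    name = container.get("name", "?")
    ctx = container.get("securityContext") or {}

    # runAsUser and runAsGroup must be set to approved IDs.
    for field in ("runAsUser", "runAsGroup"):
        value = ctx.get(field)
        if not isinstance(value, int) or value not in ALLOWED_IDS:
            violations.append(f"{name}: {field} must be in the approved ID range")

    # Privilege escalation must be explicitly disabled.
    if ctx.get("allowPrivilegeEscalation") is not False:
        violations.append(f"{name}: allowPrivilegeEscalation must be false")

    # The container must be required to run as a non-root user.
    if ctx.get("runAsNonRoot") is not True:
        violations.append(f"{name}: runAsNonRoot must be true")

    # The root filesystem must be mounted read-only.
    if ctx.get("readOnlyRootFilesystem") is not True:
        violations.append(f"{name}: readOnlyRootFilesystem must be true")

    return violations
```

The checks require each field to be set explicitly, so a missing securityContext is reported as a violation rather than silently accepted.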
Network
Check that best practices are applied and your network rules are followed for ingress and service objects defined in your cluster:
- Avoid using hostPort for any pod, since this could limit the number of places the pod can run: each hostIP, hostPort, and protocol combination must be unique. Avoid using hostNetwork for the same reason.
- Ensure your publicly exposed load balancer’s selector.k8s-app is limited to certain (necessary) controllers.
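The network rules above might look like the following when expressed as checks on parsed manifests. The set of approved controllers and both function names are assumptions for the example.

```python
from typing import Any, Dict, List

# Hypothetical list of controllers that may sit behind a public load balancer.
APPROVED_LB_APPS = {"ingress-nginx"}


def check_pod_network(pod_spec: Dict[str, Any]) -> List[str]:
    """Flag hostNetwork and hostPort usage in a pod spec (illustrative only)."""
    violations: List[str] = []
    if pod_spec.get("hostNetwork"):
        violations.append("hostNetwork must not be enabled")
    for container in pod_spec.get("containers") or []:
        for port in container.get("ports") or []:
            if "hostPort" in port:
                name = container.get("name", "?")
                violations.append(f"container '{name}' must not set hostPort")
    return violations


def check_load_balancer(service: Dict[str, Any]) -> List[str]:
    """Ensure a LoadBalancer Service only selects an approved controller."""
    spec = service.get("spec") or {}
    if spec.get("type") != "LoadBalancer":
        return []  # The rule only applies to publicly exposed load balancers.
    app = (spec.get("selector") or {}).get("k8s-app")
    if app not in APPROVED_LB_APPS:
        return [f"LoadBalancer selector k8s-app={app!r} is not an approved controller"]
    return []
```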
Access Control
Implement role-based access control and enforce your policies:
- Disallow use of the default namespace, to force every object in your cluster to be assigned a proper namespace that you have set.
- Check that no RoleBinding objects give patch access to users that you haven’t approved.
- Check for rules.apiGroups, rules.resources, and rules.verbs combinations that might violate any of your access control policies.
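A minimal sketch of the access control checks above follows. It assumes namespaced objects only, that each RoleBinding refers to a Role available in a lookup table keyed by name, and that the approved users and forbidden combinations are supplied by you; all names here are hypothetical.

```python
from typing import Any, Dict, List, Sequence, Tuple

# Hypothetical policy data; replace with your organization's values.
APPROVED_PATCH_USERS = {"alice@example.com"}
FORBIDDEN_COMBINATIONS = [("", "secrets", "delete")]  # (apiGroup, resource, verb)


def uses_default_namespace(obj: Dict[str, Any]) -> bool:
    """True when a namespaced object would land in the default namespace."""
    return (obj.get("metadata") or {}).get("namespace", "default") == "default"


def _covers(values: Sequence[str], wanted: str) -> bool:
    # A wildcard entry covers every value.
    return "*" in values or wanted in values


def role_grants_patch(role: Dict[str, Any]) -> bool:
    return any(_covers(rule.get("verbs") or [], "patch") for rule in role.get("rules") or [])


def unapproved_patch_subjects(binding: Dict[str, Any], roles_by_name: Dict[str, Any]) -> List[str]:
    """List users who gain patch access through this binding without approval."""
    role = roles_by_name.get((binding.get("roleRef") or {}).get("name"))
    if role is None or not role_grants_patch(role):
        return []
    return [
        subject.get("name", "")
        for subject in binding.get("subjects") or []
        if subject.get("kind") == "User" and subject.get("name") not in APPROVED_PATCH_USERS
    ]


def forbidden_combinations(role: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Return every forbidden (apiGroup, resource, verb) that the role allows."""
    hits = []
    for rule in role.get("rules") or []:
        for group, resource, verb in FORBIDDEN_COMBINATIONS:
            if (_covers(rule.get("apiGroups") or [], group)
                    and _covers(rule.get("resources") or [], resource)
                    and _covers(rule.get("verbs") or [], verb)):
                hits.append((group, resource, verb))
    return hits
```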
Operational Excellence
General best practices that you’ll want to maintain across your cluster, or for certain types of workloads:
- Enforce that certain keys exist under metadata.labels for all your StatefulSets (like an owner name or email).
- Check that container.image in all your specs uses a trusted container registry.
- Do not allow any container.image with the :latest tag. Force the use of specific versions.
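The operational rules above can be sketched as follows. The required label key, the trusted registry prefix, and the function names are placeholders chosen for the example. Note that an image reference with no tag and no digest resolves to latest, so the sketch treats it as a violation too.

```python
from typing import Any, Dict, List

# Hypothetical policy data; replace with your own values.
REQUIRED_LABELS = {"owner"}
TRUSTED_REGISTRIES = ("registry.example.com/",)


def uses_latest_tag(image: str) -> bool:
    """True when the image reference resolves to the latest tag."""
    if "@" in image:
        return False  # Pinned by digest.
    # Look only at the last path segment so a registry port is not
    # mistaken for a tag (for example localhost:5000/app).
    last_segment = image.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return True  # No tag means latest.
    return last_segment.rsplit(":", 1)[1] == "latest"


def check_workload(obj: Dict[str, Any]) -> List[str]:
    """Return operational violations for a workload manifest (illustrative only)."""
    violations: List[str] = []

    # Required labels are enforced on StatefulSets, as in the rule above.
    if obj.get("kind") == "StatefulSet":
        labels = (obj.get("metadata") or {}).get("labels") or {}
        for key in sorted(REQUIRED_LABELS - set(labels)):
            violations.append(f"metadata.labels is missing '{key}'")

    pod_spec = ((obj.get("spec") or {}).get("template") or {}).get("spec") or {}
    for container in pod_spec.get("containers") or []:
        image = container.get("image", "")
        if not image.startswith(TRUSTED_REGISTRIES):
            violations.append(f"image '{image}' is not from a trusted registry")
        if uses_latest_tag(image):
            violations.append(f"image '{image}' must use a specific version tag")

    return violations
```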
Again, these were simply a few examples of areas where you can implement some policies for your Kubernetes clusters. At Magalix, we’ve already implemented most of these policies so you can have a basic PaC framework out of the box for your organization, which you can also expand and customize as you see fit.
Framework for Implementing Policy-as-Code
There are three key dimensions that need to be defined in order to establish a Policy-as-Code framework within your organization:
- Targets: the clusters, workloads, or entities where you want to apply the policies
- Policies: the rules or standards you want to validate against your specified targets
- Triggers: the catalyst - when the policy should be checked (e.g., after git push, before Kubernetes deployment, every 24 hours, every time after an object spec changes in the cluster, etc.)
Once you have the targets, policies, and triggers defined, you need a way to enforce them to ensure compliance. Doing so manually is a sure way to keep your organization in constant firefighting mode. The best way to implement this framework is to use tools that automate the compliance checks based on the triggers you defined and that provide a user-friendly way to manage the policies and their targets.
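To make the three dimensions concrete, here is one possible way to model them in code. It is a sketch under stated assumptions: the Policy class, its field names, the trigger strings, and the evaluate function are all invented for this example and do not describe any specific tool.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet, Iterable, List

Obj = Dict[str, Any]


@dataclass(frozen=True)
class Policy:
    name: str
    target: Callable[[Obj], bool]      # Targets: which objects the policy applies to
    check: Callable[[Obj], List[str]]  # Policies: returns violation messages
    triggers: FrozenSet[str]           # Triggers: events on which to evaluate


def evaluate(policies: Iterable[Policy], objects: Iterable[Obj], trigger: str) -> Dict[str, List[str]]:
    """Run every policy registered for the trigger against its matching targets."""
    objects = list(objects)
    report: Dict[str, List[str]] = {}
    for policy in policies:
        if trigger not in policy.triggers:
            continue
        for obj in objects:
            if policy.target(obj):
                messages = policy.check(obj)
                if messages:
                    report.setdefault(policy.name, []).extend(messages)
    return report


# Example: a minimum-replicas policy evaluated before deployment.
min_replicas = Policy(
    name="min-replicas",
    target=lambda o: o.get("kind") == "Deployment",
    check=lambda o: [] if o.get("spec", {}).get("replicas", 1) >= 2 else ["replicas < 2"],
    triggers=frozenset({"pre-deploy"}),
)
```

Keeping the target, the rule, and the trigger as separate fields lets you reuse the same rule across different clusters or schedules without rewriting it.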
Implementing Policy-as-Code with the Open Policy Agent
The Open Policy Agent (OPA), a Cloud Native Computing Foundation (CNCF) project, is a tool that allows organizations to define custom policies for their Kubernetes environments. OPA policies are written in a declarative policy language called Rego.
With Rego, you can filter the input (workloads, users, other entities) to match the “targets” you want and you can add assertions that would define your “policy.” It’s important to have a common framework and tools to enforce your policies so that you can easily evolve your policies and enforce them consistently within your different operations environments.
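When OPA runs as a server, a client can submit an object as the input document to a Rego policy through OPA's REST Data API, which accepts a POST to /v1/data/ followed by the policy path, with a JSON body of the form {"input": ...}. The sketch below builds such a request using only the Python standard library. The server address and the policy path in the usage note are assumptions; they depend on how your OPA instance and Rego packages are set up.

```python
import json
import urllib.request
from typing import Any, Dict


def build_opa_query(base_url: str, policy_path: str, obj: Dict[str, Any]) -> urllib.request.Request:
    """Build a POST request that asks OPA to evaluate a policy against obj."""
    url = f"{base_url.rstrip('/')}/v1/data/{policy_path.strip('/')}"
    body = json.dumps({"input": obj}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_opa(request: urllib.request.Request) -> Any:
    """Send the request and return the value under the 'result' key, if any."""
    with urllib.request.urlopen(request) as response:
        return json.load(response).get("result")
```

For example, build_opa_query("http://localhost:8181", "kubernetes/admission/deny", pod) would target a hypothetical deny rule in a package named kubernetes.admission on a locally running OPA server.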
While it is possible to install and configure your own OPA setup, doing so requires considerable know-how, effort, and ongoing maintenance.
How Can Magalix Help?
Magalix allows you to define and manage these policies, and their lifecycle, in an easy and user-friendly way. Internally, Magalix uses OPA as the policy execution engine. Magalix, by default, will run global policies based on Kubernetes’ best practices for the relevant Kubernetes objects in your cluster.
It also allows you to define custom policies (KubeAdvisor), run these policies periodically against your cluster, and track violations and compliance over time to report regular updates (via a dashboard or through integrations with Slack, etc.).
Additionally, Magalix offers webhooks that you can use to pass your object specs, and we’ll run all relevant policies and respond with any violations. This is a convenient way to integrate with your CI/CD pipeline and prevent policy violations before they reach your cluster. To read more about the Magalix Policy-as-Code platform for Kubernetes, go to Kubernetes Governance with Magalix.
Book a commitment-free consultation with a Magalix expert now!