Is Our Spending Getting Worse?
I woke up one day to see this email from our CEO in my inbox. I knew it would happen at some point, as we had not been paying attention to our infrastructure spending. The time had finally come to deal with our problem.
We need to ensure that our system is performant and meets our internally mandated SLAs. However, we also need to ensure that we are using our infrastructure efficiently and eliminate any unwarranted waste. The bottom line is just as important as the top line in any business, and in a SaaS business like ours, the bottom line is mainly driven by our infrastructure spending.
Running Kubernetes on the cloud isn’t expensive in itself: we can roll out a Kubernetes cluster for an average cost of about $70 per month. What drives the cost up is the worker nodes that host and run our workloads. Many considerations contribute to that cost, but the most important is how well your cluster utilizes the resources you’re actually paying for. We discussed various techniques to improve cluster utilization in “Kubernetes Cost Optimization 101”.
In this article, we will discuss how Magalix can help you better utilize your resources and pay less to run the same cluster.
Applying Workload Right-Sizing using KubeOptimizer
Kubernetes manages and schedules pods based on container resource specs:
- Resource Requests: the Kubernetes scheduler places a container on a node that has enough capacity to satisfy its requests
- Resource Limits: a container is not allowed to use more than its resource limits
Resource requests and limits are container-scoped specs; a multi-container pod defines separate resource specs for each of its containers.
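As a minimal sketch, a two-container pod might declare per-container requests and limits like this (the names, images, and values are illustrative, not from the article):

```yaml
# Illustrative two-container pod spec; every name and value here is
# an assumption for the sketch.
apiVersion: v1
kind: Pod
metadata:
  name: web-app
spec:
  containers:
  - name: app
    image: nginx:1.21
    resources:
      requests:          # the scheduler uses these for placement
        cpu: 250m
        memory: 256Mi
      limits:            # hard ceiling enforced at runtime
        cpu: 500m
        memory: 512Mi
  - name: log-sidecar    # each container declares its own specs
    image: busybox:1.36
    resources:
      requests:
        cpu: 50m
        memory: 64Mi
      limits:
        cpu: 100m
        memory: 128Mi
```

Note that the pod as a whole reserves the sum of its containers' requests (here 300m CPU and 320Mi memory), which is what counts against a node's allocatable capacity.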
Kubernetes schedules pods based on their resource requests and other restrictions without impairing availability. The scheduler uses CPU and memory requests to place workloads on suitable nodes, controlling which pods can land on which node and whether multiple pods can be scheduled together on a single node.
Every node type has its own allocatable CPU and memory capacity. Requesting more CPU or memory than a container actually needs leaves you with underutilized pods on each node, which in turn leads to underutilized nodes. For example, a container that requests a full core but uses only 100m reserves ten times the CPU it needs, and none of that reserved capacity can be used by other pods.
In this section, we will use the Container Resource Advisor (a Magalix built-in advisor) to right-size the containers in our cluster.
This is the Cluster Dashboard in the Magalix console. In the “CPU Usage Distribution” and “Memory Usage Distribution” panels, I can see which workloads are utilizing my cluster the most. Right-sizing these workloads saves money by eliminating wasted capacity.
Let’s click on Optimization in the left nav. This will show us all the recommended optimizations. I want to see the workloads that are wasting resources, so I will choose “Container wasting CPU resource” and “Container wasting memory resource”. Now I can see all the workloads with these particular issues.
I’m interested in the mongo-replicaset StatefulSet that we recently installed, so let’s see how close our estimates were to actual usage.
Clicking on the workload, I see the detailed recommendation, including the metrics used to produce it. I can save 450m cores, which is a good saving. Scrolling down, I see the evidence behind the recommendation: usage is quite low. I can compare the current request, limit, and usage with the recommended request and limit for this container, based on its usage trend.
Once everything looks great, I will click “Apply”.
It redirects me to the “Execution Log” page with the status “In Progress”. During this time, the Magalix agent sends a patch command to the Kubernetes cluster to update the workload spec in the background, after which the status turns to “Succeed”.
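Such an update can be sketched as a Kubernetes strategic-merge patch; everything below (workload name, container name, resource values) is illustrative, not the agent's actual payload:

```yaml
# patch.yaml - hypothetical right-sizing patch for a StatefulSet.
# The container name must match the container being resized.
spec:
  template:
    spec:
      containers:
      - name: mongo
        resources:
          requests:
            cpu: 150m      # lowered to match observed usage
            memory: 512Mi
          limits:
            cpu: 300m
            memory: 1Gi
```

A patch like this could also be applied manually with `kubectl patch statefulset <name> --patch-file patch.yaml`; the agent automates the equivalent step.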
I repeated the same process for the other CPU waste recommendations. The memory recommendations look much the same, though this one particular container seems to be wasting a lot of memory. You can also select multiple recommendations at once from the “Automation” tab. There are other advisors focused on security, such as pod security settings, and on performance improvements, such as CPU throttling and memory starvation. For now, we are focused only on the cost-saving recommendations.
After this step, we haven’t actually saved any money yet. We’ve only configured our workloads to request the right resources, which means smaller worker nodes can now run the same workloads. But how do we manage the sizing and scaling of the nodes themselves? Let’s keep reading.
Choosing the Right Worker Nodes with Node Advisor
Every Kubernetes cluster has its own workload utilization profile: some clusters use memory more than CPU (e.g. database and caching workloads), while others use CPU more than memory (e.g. user-interactive and batch-processing workloads). Cloud providers such as GCP and AWS offer various node types to choose from.
Choosing the wrong node size for your cluster can end up costing a lot. For instance, choosing nodes with a high CPU-to-memory ratio for workloads that use memory extensively can easily starve those workloads of memory. That in turn triggers automatic node scale-up, adding yet more CPUs we don’t need.
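As a concrete sketch of that mismatch (the node size, workload, and numbers are all illustrative): suppose a memory-heavy deployment runs on compute-optimized nodes with 8 vCPU and 8 GiB of memory each.

```yaml
# Hypothetical cache workload on compute-optimized nodes (8 vCPU / 8 GiB).
# Each replica requests little CPU but 6Gi of memory, so only one replica
# fits per node and roughly 7 vCPUs per node sit idle; adding replicas
# still scales up more of the same CPU-rich nodes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cache
spec:
  replicas: 3
  selector:
    matchLabels:
      app: cache
  template:
    metadata:
      labels:
        app: cache
    spec:
      containers:
      - name: redis
        image: redis:7
        resources:
          requests:
            cpu: 500m      # low CPU demand
            memory: 6Gi    # memory-bound: this drives placement and scale-up
```

Here the cluster would pay for 24 vCPUs while the workload requests only 1.5; memory, not CPU, is what fills the nodes.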
GCP, for example, offers general-purpose, compute-optimized, and memory-optimized nodes with various CPU and memory counts and ratios. Calculating the right CPU-to-memory ratio isn’t easy, and this is where Magalix can help. Let’s click on Optimization again in the left nav.
I see one recommendation under “Cost Saving”, so let’s try it out. Here I’m running this cluster on 3 x DS2 v2 Azure nodes. The cluster utilization isn’t high, and the Node Advisor recommends using 1 x B1S and 2 x B1S instead, based on an analysis of the workloads’ CPU and memory usage that takes into account the difference between DaemonSets and other workloads. The advisor finds the set of nodes that can run the workloads with the lowest waste.

Here is the utilization before and after; notice how utilization increases on the recommended nodes.

The advisor also shows multiple billing options for running the nodes. It detects your cloud provider and checks the prices of nodes and saving plans to give you the best options to choose from. Next, let’s discuss those cloud saving options.
Purchasing Commitment/Saving Plans
Running the Kubernetes service itself is relatively cheap. What costs customers the most is the compute resources for the worker nodes.
GCP offers “Committed Use Discounts” on a certain amount of vCPUs, memory, GPUs, and local SSDs for 1 or 3 years, which can save up to 57% of the compute cost.
AWS offers “Compute Savings Plans”, where you commit to a certain amount of compute spend, and “Reserved Instances”, where you commit to using a certain type of machine, both for 1 or 3 years, with possible cost savings of up to 60%.
Azure offers “Azure Reserved VM Instances”, similar to AWS Reserved Instances, with possible cost savings of up to 60%.
As we saw in this article, multiple factors come into play when trying to reduce your cluster cost, and going through the whole process can yield huge savings. We managed to reduce our cluster’s daily cost by 56% by applying the “Container Resource Advisor” and “Node Advisor” recommendations.