Resource Management: The Struggle is Real
In the days when companies built their own data centers, software development had to abide by a few hard constraints. One of those constraints was the finite number of available compute resources. When VMs arrived, systems administrators factored in overhead for additional components of the hypervisor and operating system but developers were still bound to the same physical limitations of available resources.
One of the selling points of the cloud is that you can scale your infrastructure according to your needs and only pay for what you use. The elasticity of the cloud allows for an essentially limitless server farm, but failing to manage your cloud footprint can result in a shockingly high invoice at the end of each billing cycle. With Kubernetes, resource management becomes a little more complex since there is another layer of resources that needs to be configured. Thankfully, placing workloads onto nodes with available resources is handled by the Kubernetes scheduler.
AI-Powered Insights with Magalix
The Kubernetes scheduler does a decent job of managing resource allocation and availability, but it’s not perfect. Over time, you may find that some nodes are hot, that is, running more workloads than others instead of sharing an even distribution across nodes. The more workloads you add, the more nodes you’ll need in your cluster, and the harder it becomes to manage resources manually.
Magalix takes the manual analysis of resource allocation and utilization out of the equation. Behind the scenes, our AI-powered backend analyzes your set resource configuration, workload usage, node type, and cost to perform the necessary calculations to recommend the right types of nodes for your cluster and the right-sizing for your workloads.
Juggling the placement of workloads to maximize your provisioned resources takes a bit of effort. Manually checking each container’s CPU and memory configuration, utilization percentages during peak and off-hours, application performance, node instance types, and affinity/anti-affinity rules, and then calculating cost, leaves a lot of room for error. One of the problems Kubernetes solves is having to statically map workloads to resources, but when the number of workloads and their resource requirements start to grow, the Kubernetes scheduler can only place your workloads as well as the nodes you have allow. It won’t tell you anything about your node sizing, so you will have to scale your nodes using the manual techniques mentioned above. In turn, you might be left with workloads that cannot be scheduled until more resources become available to the cluster.
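To make the "unable to be scheduled" case concrete, here is a minimal sketch of a request-based fit check. It is not the Kubernetes scheduler's actual algorithm (the real scheduler filters and scores nodes using many more signals); it is a simplified first-fit placement, and the `Node`, `Pod`, and `place_pods` names and all of the numbers are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_millicores: int   # allocatable CPU, in millicores
    memory_mib: int       # allocatable memory, in MiB


@dataclass
class Pod:
    name: str
    cpu_request: int      # requested CPU, in millicores
    memory_request: int   # requested memory, in MiB


def place_pods(nodes, pods):
    """Greedy first-fit placement based on resource requests only.

    Simplified illustration, not the real scheduler. Returns a dict of
    pod name -> node name, plus a list of pod names that fit on no node.
    """
    free = {n.name: [n.cpu_millicores, n.memory_mib] for n in nodes}
    placements, pending = {}, []
    for pod in pods:
        for node_name, (cpu, mem) in free.items():
            if pod.cpu_request <= cpu and pod.memory_request <= mem:
                # Reserve the requested resources on this node.
                free[node_name][0] -= pod.cpu_request
                free[node_name][1] -= pod.memory_request
                placements[pod.name] = node_name
                break
        else:
            # No node had enough free CPU and memory for this pod.
            pending.append(pod.name)
    return placements, pending


# Hypothetical cluster: two identical nodes and three workloads.
nodes = [Node("node-a", 2000, 4096), Node("node-b", 2000, 4096)]
pods = [Pod("api", 1500, 2048), Pod("worker", 1500, 2048), Pod("batch", 1000, 1024)]
placements, pending = place_pods(nodes, pods)
print(placements)
print(pending)
```

In this made-up cluster, 1000 millicores of CPU are free in total, but they are split evenly across two nodes, so the `batch` pod stays pending. Fewer, larger nodes (or smaller requests) would let it run, which is the kind of sizing question the scheduler itself does not answer.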
Magalix AI recommending a change in your node pool.
Manually calculating resources in a dynamic infrastructure doesn't make a whole lot of sense. In today’s data-driven world, why not leverage historical data to make informed decisions? Magalix provides insights out of the box to help you make those decisions confidently. Using the image above as an example, changing the type and quantity of your nodes can help you achieve a lower cloud spend while maximizing your usage.
In many organizations, container resources are typically set too high, or not set at all. When this happens, unexpected behaviors and problems arise. If you aren’t sure how many resources a container will need, it’s best to start high and work your way down. Unfortunately, fine-tuning a container in this way might require a few iterations since it’s essentially a trial-and-error process.
To help overcome these pitfalls, Magalix provides insights about deployed containers. As with the right-sizing of nodes, Magalix recommends container right-sizing based on factors such as configured resources and resource utilization. Get cluster-wide views of how resources are allocated relative to other workloads and to the overall sizing of your cluster. With our built-in optimization policies, know when resources are underprovisioned, oversubscribed, or not configured at all.
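The idea of comparing configured resources against utilization can be sketched in a few lines. This is not how Magalix's policies are implemented; it is a hypothetical check in which the function name, the labels, and the 50% and 90% thresholds are all assumptions chosen for illustration.

```python
def classify_container(request_mib, peak_usage_mib, low=0.5, high=0.9):
    """Label a container's memory request relative to its observed peak usage.

    `low` and `high` are illustrative thresholds, not recommended values.
    """
    if request_mib is None or request_mib <= 0:
        # No request was set, so there is nothing to compare against.
        return "not configured"
    ratio = peak_usage_mib / request_mib
    if ratio > high:
        # Usage is close to or above the request: too little headroom.
        return "underprovisioned"
    if ratio < low:
        # Less than half of the request is ever used: likely waste.
        return "overprovisioned"
    return "ok"


# Hypothetical containers: (name, memory request in MiB, peak usage in MiB)
containers = [
    ("cache", None, 100),
    ("frontend", 1024, 100),
    ("worker", 256, 250),
    ("api", 512, 350),
]
for name, request, usage in containers:
    print(name, classify_container(request, usage))
```

A real policy would likely look at usage over a longer window and at CPU as well as memory, but the comparison at its core has the same shape.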
A view of the top containers wasting memory resources
The image below is an example of one of our built-in optimization policies. This policy detected a container wasting memory based on a comparison of configured resources to the amount actually used. The usage stats are highlighted as evidence, along with a recommendation of how the container should be sized.
Our example shows that we can save over 900Mi on our memory limits and 350Mi on our requests. With this kind of reduction, we can add more workloads to the node and, again, achieve maximum utilization for the money spent.
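The arithmetic behind a saving like this is a simple difference between the configured and recommended quantities. The sketch below parses a small subset of Kubernetes memory quantities (only the `Ki`, `Mi`, and `Gi` suffixes) and reports the difference in Mi. The helper names and the configured and recommended values are made up for illustration and are not the figures from the image.

```python
# Binary suffixes only; Kubernetes also accepts other forms not handled here.
_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def to_bytes(quantity):
    """Convert a quantity such as '512Mi' or '1Gi' to bytes."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    # No recognized suffix: treat the value as plain bytes.
    return int(quantity)


def savings_mib(current, recommended):
    """Memory freed, in Mi, by moving from `current` to `recommended`."""
    return (to_bytes(current) - to_bytes(recommended)) // _UNITS["Mi"]


# Hypothetical before/after values for one container.
limit_saving = savings_mib("1Gi", "100Mi")
request_saving = savings_mib("512Mi", "162Mi")
print(f"limits: {limit_saving}Mi, requests: {request_saving}Mi")
```

Because scheduling is based on requests, it is the reduction in requests that frees room on the node for additional workloads.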
When it comes to right-sizing your cloud footprint and lowering costs, take the guesswork and manual intervention out of your dynamic infrastructure and let data and evidence drive your decisions. With Kubernetes, managing an additional layer of resources can be a bit tricky, so getting the right insights is crucial. Magalix analyzes data out of the box, capturing usage metrics and resource configurations to help you right-size your containers and nodes.