Does the following conversation sound familiar?
CEO: Our AWS bill has gone through the roof. Why?
VP of engineering: We’re adding new customers! We need enough capacity to keep up with the demand.
CTO: But our average CPU and memory utilization are quite low. Why do we need more capacity if we’re not using all the infrastructure we already have?
VP of engineering: We get traffic spikes throughout the day. We need to be ready for them.
CTO: Why aren’t we using our cloud provider’s auto-scaling features?
VP of engineering: We are! The problem is figuring out the best auto-scaling rules. Each of our services needs to scale differently, based on different metrics. We don’t have performance engineering skill sets. We need enough buffer to keep up with spikes, and they’re hard to predict.
CTO: But even during our spikes, we’re still barely utilizing most of our machines. Our frontend instances hardly run above 50-percent CPU and memory load.
VP of engineering: Well, as you know, our software is built for enterprises, and it’s single-tenant. We’ll need to address that in the next release. When we support multi-tenancy, we’ll be able to utilize our VMs a lot more efficiently.
CTO: Okay, and how far off is that next release?
VP of engineering: Six months.
CEO: We have to keep spending at this rate for six more months?! There’s got to be a faster way.
If this exchange feels painfully familiar, then you’ve probably sat in meetings with this same recurring theme — as we have.
Within the first few months of launching a cloud application, many business owners get blindsided by huge bills. It seemed so quick and affordable to provision resources, but the costs add up far more quickly than anyone expects. As the requirements of moving fast and saving money come into sharper conflict, stress and contention build up in the organization — sometimes with disastrous results.
Technical leaders usually start trying to cut their cloud bills by turning off or downsizing some of their VMs, or by adopting different billing models, such as Reserved Instances on AWS. But these kinds of optimization attempts don’t address the root cause. The remaining VMs are still under-utilized, and there are still too many of them.
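To make that point concrete, here is a rough back-of-the-envelope sketch. The prices and utilization figures are invented for illustration and are not real AWS rates; the function name is hypothetical too. It compares what you pay per hour of capacity you actually use.

```python
def effective_hourly_cost(hourly_price: float, utilization: float) -> float:
    """Price paid per hour of capacity that is actually used.

    hourly_price: what the VM costs per hour (hypothetical figure).
    utilization:  fraction of the VM that does useful work, in (0, 1].
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return hourly_price / utilization


# All numbers below are illustrative assumptions, not real pricing.
on_demand_idle = effective_hourly_cost(hourly_price=0.10, utilization=0.25)
discounted_idle = effective_hourly_cost(hourly_price=0.07, utilization=0.25)
on_demand_busy = effective_hourly_cost(hourly_price=0.10, utilization=0.75)

print(f"on-demand, 25% utilized:  {on_demand_idle:.3f} per used hour")
print(f"discounted, 25% utilized: {discounted_idle:.3f} per used hour")
print(f"on-demand, 75% utilized:  {on_demand_busy:.3f} per used hour")
```

Under these assumed numbers, a 30-percent discount on a mostly idle machine still costs roughly twice as much per useful hour as a full-price machine that is kept busy.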
In response, many managers shift focus to one or both of the next two options:
- Redesign applications for greater cloud optimization.
This requires deep expertise in distributed systems, and many companies struggle to find the right talent. It also takes time to redesign, test, and stabilize the software.
- Implement less disruptive changes to achieve multi-tenancy or collocation of services on the same machine.
The obvious approach here is to containerize applications and services, since containers provide a clean and relatively easy way to isolate applications running on the same VM. The problem, of course, is that many organizations face serious limitations on this front. It’s not easy to assemble a team that can hit the ground running, which means it can take months to provision the necessary container orchestration tools. Plus, you still need to manage the application’s overall capacity to ensure that the right services are collocated at the right time. Otherwise, you’ll need a significant number of idle machines to establish a buffer. Containerization attempts often start off relatively smoothly, only to slow down and suffer delays as complexity builds up and new risks emerge (for example, one group of hackers used Tesla’s Kubernetes clusters for cryptocurrency mining). In the end, containers are just another layer of technology to manage, in addition to your existing infrastructure.
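As a rough illustration of the capacity problem behind collocation, the sketch below packs services onto VMs with a first-fit-decreasing heuristic. This is a simplification we chose for the example, not how any particular orchestrator schedules workloads. The service names, peak CPU figures, VM size, and headroom value are all hypothetical.

```python
def pack_services(peak_cpu: dict[str, float],
                  vm_capacity: float,
                  headroom: float = 0.2) -> list[list[str]]:
    """Group services onto VMs using first-fit decreasing.

    peak_cpu:    peak vCPU demand per service (hypothetical inputs).
    vm_capacity: vCPUs per VM.
    headroom:    fraction of each VM kept free as a spike buffer.
    """
    usable = vm_capacity * (1 - headroom)
    vms: list[dict] = []  # each entry: {"free": float, "services": [names]}

    # Place the largest services first; they are the hardest to fit.
    for name, cpu in sorted(peak_cpu.items(), key=lambda kv: kv[1], reverse=True):
        if cpu > usable:
            raise ValueError(f"{name} does not fit on a single VM")
        for vm in vms:
            if vm["free"] >= cpu:  # first VM with enough room wins
                vm["free"] -= cpu
                vm["services"].append(name)
                break
        else:
            # No existing VM had room, so start a new one.
            vms.append({"free": usable - cpu, "services": [name]})

    return [vm["services"] for vm in vms]


# Illustrative numbers only.
services = {"frontend": 2.0, "api": 1.5, "worker": 1.0, "reports": 1.0, "cache": 0.5}
placement = pack_services(services, vm_capacity=4.0)
print(f"{len(services)} single-tenant VMs -> {len(placement)} shared VMs")
for i, group in enumerate(placement, start=1):
    print(f"VM {i}: {', '.join(group)}")
```

Packing by peak demand is conservative because it assumes every service spikes at the same moment. Real workloads shift over time, which is why the placement has to be revisited continuously rather than computed once.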
If you decide to explore the second option, we recommend using services that are smart enough to learn the requirements of your changing application workloads and automatically adjust your containers and infrastructure in ways that reduce your costs and keep your application highly available for all your users.
If you’re currently facing these problems, or you’re getting ready to face them in the near future, we’ve got a solution! We challenge the status quo by teaching you how not to overprovision and overspend on cloud infrastructure, so you can spend less while working more efficiently.
Sign up now to get early access to our unique services. https://www.magalix.com