Any system that runs without exposing information about its health, the status of the applications running inside it, and whether or not something has gone wrong is of little use. But this has been said many times before, right? Since the early days of mainframes and punch-card programming, there have been gauges for monitoring a system's temperature and power, among other things.
The important thing about those old systems is that they rarely changed. When a modification did occur, monitoring was included in the plan, so it was not a problem. Additionally, the system operated as a whole, which made it easier to observe because there weren't many moving parts. But in the cloud-native world we live in today, change is anything but rare; it's the norm. The microservices pattern that cloud-native apps follow makes the system highly distributed. As a result, metric collection, monitoring, and logging need to be approached differently. Let's have a brief look at some of the monitoring challenges in a cloud-native environment.
As a rule of thumb, logs should be collected and stored outside the node that hosts the application. The reason is that you need a single place where you can view, analyze, and correlate different logs from different sources. In highly distributed environments, this need becomes even more important as the number of services (and logs) increases. There are many tools that handle this task for you, for example Filebeat, Logstash, and Log4j, among others. The application should log important events as they happen, but it should not decide where the logs go; that's the deployment's decision. For example, a Python application may have the following line to record that a new user was added to the database:
log.info("{} was added successfully".format(user))
Of course, the application needs to know where this line of text should go. Since we're leaving that decision to the environment, we instruct the application to send it to STDOUT (or to STDERR if the log describes an error event). There are several reasons why you should follow this practice.
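To make this concrete, here is a minimal sketch of how a Python application might be configured to write its logs to STDOUT and leave routing to the environment (the logger setup, format, and add_user helper are assumptions for the example, not part of the original application):

import logging
import sys

# Send all log records to STDOUT; the environment (Docker, Kubernetes, etc.)
# decides where they ultimately end up.
logging.basicConfig(
    stream=sys.stdout,
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)

log = logging.getLogger(__name__)

def add_user(user):
    # ... persist the user to the database ...
    log.info("{} was added successfully".format(user))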
Assume that your application is hosted on three Pods in a Kubernetes cluster. The application is configured to log a line whenever it receives and responds to a web request. However, the cluster is designed so that a Service routes each incoming request to a random backend Pod. Now, the question is: when we want to view the application logs, which Pod should we query? The following diagram depicts this situation:
Because this scenario is common, Kubernetes provides a simple command that lets you view the combined logs from all the Pods that match a specific label:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
frontend-5989794999-6wxhm 1/1 Running 0 2d4h
frontend-5989794999-fkq56 1/1 Running 0 124m
frontend-5989794999-nzn2f 1/1 Running 0 124m
$ kubectl logs -l app=frontend
--- output truncated for brevity ---
10.128.0.3 - - [06/Oct/2019:15:16:08 +0000] "GET / HTTP/1.1" 200 22 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36" "-"
10.128.0.3 - - [06/Oct/2019:15:16:09 +0000] "GET /favicon.ico HTTP/1.1" 404 232 "http://34.70.214.219:31121/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36" "-"
[pid: 14|app: 0|req: 2/2] 10.128.0.3 () {40 vars in 646 bytes} [Sun Oct 6 15:16:09 2019] GET /favicon.ico => generated 232 bytes in 12 msecs (HTTP/1.1 404) 2 headers in 72 bytes (1 switches on core 0)
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0x555ed3308170 pid: 11 (default app)
--- output truncated for brevity ---
As expected, the command returns verbose output because it combines the logs of all the Pods labeled app=frontend. The log entry we are interested in is the access-log line for the GET / request. Notice that the log entry contains the IP address of the container followed by the date and time of the request, the HTTP verb, and other data. The other two Pods produce similar output but with different IP addresses. The application used in this example is a Python Flask app; the format of the log message is not dictated by Kubernetes, it is simply how Flask logs its events. In a real-world application, you should configure the application to include the container's hostname when logging events. The container's hostname is set when the container starts.
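Building on the earlier snippet, a minimal sketch of how the hostname could be included in every log line (using Python's standard library; the format string is an assumption for the example):

import logging
import socket
import sys

# socket.gethostname() returns the container's hostname, which on Kubernetes
# is the Pod name, so each log line can be traced back to the Pod that wrote it.
logging.basicConfig(
    stream=sys.stdout,
    level=logging.INFO,
    format="%(asctime)s " + socket.gethostname() + " %(levelname)s %(message)s",
)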
There is an important difference between logs and metrics. Logs describe events as they happen. Some events are critically important, while others matter less and less until we reach the debug level, which is the most verbose. Metrics, on the other hand, describe the current state of the application and its infrastructure. You can set alarms when certain metrics reach a predefined threshold, for example when CPU utilization crosses 80% or when the number of concurrent web requests reaches one thousand.
You should implement the necessary logic for exposing application metrics in your code. Most frameworks already have that implemented, but you can (and should) extend the core functionality with your own custom metrics wherever they're necessary.
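As an illustration, here is a minimal sketch of a custom metric using the prometheus_client Python library (the metric name, port, and add_user helper are assumptions for the example):

import time
from prometheus_client import Counter, start_http_server

# Custom application metric: counts how many users have been added.
users_added = Counter("users_added_total", "Number of users added to the database")

def add_user(user):
    # ... persist the user to the database ...
    users_added.inc()

if __name__ == "__main__":
    # Expose all registered metrics at http://localhost:8000/metrics
    start_http_server(8000)
    add_user("test-user")
    while True:
        time.sleep(1)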
When it comes to obtaining metrics, there are basically two methods: pull-based and push-based. Let's have a look at each.
In the pull-based approach, the application exposes one or more health endpoints. Hitting those URLs with an HTTP request returns health and metrics data for that application. For example, our Flask API may have a /health endpoint that exposes important information about the current status of the service.
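A minimal sketch of what such an endpoint might look like in Flask (the route name, the fields returned, and the port are assumptions for the example):

import time
from flask import Flask, jsonify

app = Flask(__name__)
START_TIME = time.time()

@app.route("/health")
def health():
    # Report a status flag plus a few basic metrics about this instance.
    return jsonify(status="ok", uptime_seconds=int(time.time() - START_TIME))

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)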
However, implementing this approach comes with a challenge. In a microservices architecture, more than one service is responsible for hosting a part of the application, and each service runs more than one Pod for high availability. For example, a blogging app may have an authentication service, a posts service, and a comments service. Each service typically sits behind a load balancer that distributes the load among its replicas. The metrics-collection controller hits the health endpoints of the application and stores the results in a time-series database (for example, InfluxDB or Prometheus). So, when the controller needs to poll a service for its metrics, it will probably hit the URL of the load balancer rather than the individual Pods. The load balancer routes the request to the first available backend Pod, so the response does not cover the health of all the Pods. The following illustration demonstrates this situation:
In addition, since a different Pod may reply to each health-check request, we need a mechanism by which each Pod can identify itself in the response (a hostname, an IP address, etc.).
One possible solution to this problem is to make the monitoring system responsible for discovering the URLs of the different services in the application instead of relying on the load balancer; that is, moving service discovery to the client side. The controller then performs two tasks periodically:
Discovering which services are available. One way of doing this in Kubernetes is by placing the Pods behind a headless Service, which resolves to the addresses of the individual Pods.
Pulling the health metrics of the Pods by hitting the respective URL of each discovered Pod (as sketched below).
The following illustration depicts this approach:
With this approach, the monitoring system must have access to the internal IP addresses of the Pods, so it must be deployed as part of the cluster. Popular solutions like Prometheus offer operators that provide deep integration, not only with the Pods running inside the cluster but also with system-level metrics from the cluster itself.
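To make the two tasks above concrete, here is a minimal sketch of such a polling controller in Python, assuming a headless Service named frontend-headless in the default namespace and a /health endpoint served on port 5000 by each Pod (all of these names are assumptions for the example):

import socket
import time
import requests

SERVICE_DNS = "frontend-headless.default.svc.cluster.local"  # assumed headless Service

def discover_pod_ips():
    # A headless Service resolves to the IPs of the individual Pods
    # rather than to a single cluster IP.
    return sorted({info[4][0] for info in socket.getaddrinfo(SERVICE_DNS, 80)})

def poll_health(ip):
    # Pull the health metrics of a single Pod by hitting its /health endpoint.
    return requests.get("http://{}:5000/health".format(ip), timeout=2).json()

while True:
    for ip in discover_pod_ips():
        print(ip, poll_health(ip))  # in practice, write to a time-series database
    time.sleep(30)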
The push-based approach works the other way around: the application is responsible for finding the address of the metrics server and pushing its data to it. At first, this method may seem to add an extra level of complexity to the application, because developers would have to build the modules for collecting and pushing metrics. However, this can be avoided entirely by using the sidecar pattern.
The sidecar pattern makes use of the fact that a Pod can host more than one container, all of which share the same network namespace (and therefore the same IP address) and volumes. They can communicate with each other through the loopback address, localhost. Accordingly, we can add a container that is solely responsible for collecting logs and/or metrics and pushing them to the metrics server or a log aggregator.
Using this pattern, you don't need to make changes to the application, since the sidecar container already collects the necessary metrics through the health endpoint and sends them to the server. If the server's IP address or type changes, we only need to change the implementation of the sidecar container; the application remains intact. The following figure demonstrates using a sidecar container (a role sometimes filled by an Ambassador container or a proxy such as Envoy) to collect and push metrics:
As a bonus, this pattern also makes it possible to expose more metrics than the application itself does, because the sidecar container can expose additional data about the application's performance, for example response latency, the number of failed requests, and more.
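For illustration, here is a minimal sketch of what such a sidecar process might look like, assuming the application container serves /health on localhost port 5000 and the metrics server accepts HTTP POSTs at an /ingest endpoint (both are assumptions for the example):

import time
import requests

APP_HEALTH_URL = "http://localhost:5000/health"           # application container in the same Pod
METRICS_SERVER_URL = "http://metrics-server:8080/ingest"  # assumed push endpoint

while True:
    # Collect metrics from the application over the Pod's loopback interface...
    metrics = requests.get(APP_HEALTH_URL, timeout=2).json()
    # ...enrich them with extra data only the sidecar knows about...
    metrics["scraped_at"] = time.time()
    # ...and push them to the metrics server. The application itself is untouched.
    requests.post(METRICS_SERVER_URL, json=metrics, timeout=2)
    time.sleep(15)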