Understanding what is happening inside your Kubernetes cluster is vital to system and application performance. Logs provide that insight: by reviewing and understanding your logs, you can fine-tune your applications and ensure the stability of your system. Logs are particularly handy when debugging problems and monitoring cluster performance. However, Kubernetes maintains its logs quite differently from traditional applications and services.
Kubernetes abstracts away much of the traditional maintenance that comes with running applications on individual servers. The goal of this post is to provide you with a good high-level overview of the essential concepts for logging with Kubernetes. By the end, you should feel comfortable monitoring and reviewing the logs for your Kubernetes environment.
Basics of Kubernetes Logging
In traditional server environments, application logs are written to a file such as /var/log/app.log. These files are then viewed on each individual server or shipped to a central repository for analysis and/or retention.
This method of log collection is discouraged in Kubernetes because pods can be numerous and short-lived. Instead, Kubernetes recommends letting applications write their logs to stdout and stderr. The kubelet running on each node collects that output from the container runtime and manages the resulting log files; by default, each container's log file is rotated once it reaches 10MB. The basic topology of how Kubernetes natively handles logs can be seen in the illustration below.
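To make this concrete, the sketch below shows how a container's stdout typically ends up wrapped in a JSON record on the node's disk. The sample line and file path are illustrative assumptions based on the Docker "json-file" log driver; other container runtimes, such as containerd, use a different on-disk format.

```shell
# Hypothetical sample of the JSON-wrapped record the Docker runtime writes
# under /var/log/containers/ on the node (format is an assumption; it
# depends on your container runtime and its log driver).
line='{"log":"hello from my app\n","stream":"stdout","time":"2021-01-01T00:00:00Z"}'

# Extract just the application's original message from the wrapped record.
msg=$(printf '%s' "$line" | sed -n 's/.*"log":"\([^"]*\)\\n".*/\1/p')
echo "$msg"    # hello from my app
```

In practice you rarely read these files by hand; the kubelet and kubectl do it for you, as shown in the next sections.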
Types of Logs
It should be noted that Kubernetes has two main types of logs: node logs and component logs. Node logs are those generated by the nodes and the services running on those nodes; a good example is the kubelet agent mentioned above. Component logs, on the other hand, are generated by Pods, containers, Kubernetes components, DaemonSets, and other Kubernetes services.
Each node in a Kubernetes cluster runs services that allow it to host Pods, receive commands, and communicate with other nodes. How these logs are formatted and where they reside depends on the host operating system. On a Linux server running systemd, for example, you can retrieve the kubelet logs by running journalctl -u kubelet; on other systems, Kubernetes writes the logs to files in the /var/log directory.
Component logs, on the other hand, are captured by Kubernetes itself and are typically accessed through the Kubernetes API. Pods are the best example: applications write all of their messages to stdout or stderr, and Kubernetes makes those streams available through the API. More on reviewing these logs below.
To access your Pod logs, use kubectl logs. Using kubectl avoids the need to access individual nodes and lets you view logs for all of your pods from a single place, in real time.
The snippet below provides a few examples of using kubectl to view logs for a pod:
kubectl logs my-pod                              # dump pod logs (stdout)
kubectl logs -l name=myLabel                     # dump pod logs, with label name=myLabel (stdout)
kubectl logs my-pod --previous                   # dump pod logs (stdout) for a previous instantiation of a container
kubectl logs my-pod -c my-container              # dump pod container logs (stdout, multi-container case)
kubectl logs -l name=myLabel -c my-container     # dump pod logs, with label name=myLabel (stdout)
kubectl logs --since=1h my-pod                   # view logs for the last hour
kubectl logs -f my-pod                           # stream pod logs (stdout)
kubectl logs -f my-pod -c my-container           # stream pod container logs (stdout, multi-container case)
kubectl logs -f -l name=myLabel --all-containers # stream all pod logs with label name=myLabel (stdout)
One limitation of kubectl logs is the inability to view logs from multiple pods simultaneously. Reviewing one pod's logs at a time is fine if you are quickly debugging your application or working on a smaller system; eventually, however, you will need a way to quickly tail the logs of multiple pods at once, and this is where kubetail comes in handy.
Kubetail is a small binary that runs kubectl logs -f on multiple pods and combines the results into a single data stream. Many of the options found in kubectl logs are also available in kubetail.
kubetail my-testapp                                       # view logs for all pods whose name contains my-testapp
kubetail my-testapp -s 30m                                # view the last 30 minutes of logs for my-testapp
kubetail <my_testapp_1>,<my_testapp_2>                    # tail multiple applications at the same time, separated by a comma
kubetail <my_app> -c <my_container_1> -c <my_container_2> # tail multiple containers from multiple pods
Reviewing logs with kubectl logs or kubetail is convenient for live log streams, but it has its limitations: historical logs, logs from terminated pods, and logs from crashed instances are not available. Best practice recommends implementing a centralized log management system, and Kubernetes is no exception to this rule.
Numerous tools and solutions are available for centrally collecting pod logs, one of the most notable being fluentd. Fluentd collects and parses logs from numerous sources, then ships them to one or more destinations. Adding to fluentd's flexibility is its huge catalog of customizable plugins.
Because of the resources an Elasticsearch deployment consumes, best practice dictates not storing your logs on the same Kubernetes cluster they come from. In addition, managing the storage needed to retain your logs, as well as configuring the archival and retention process, can be time-consuming and tedious, distracting you from product and application deployment. There are numerous trusted providers to which you can offload these responsibilities; some examples are below:
Whether you use a third-party provider to store your logs or run your own Elasticsearch setup, you will likely need to configure your fluentd pods to collect and parse logs that are specific to your application. The fluentd Quickstart Guide is a great resource to understand how fluentd works and to find ways to configure sources, filters, and outputs.
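As a sketch of what such a configuration might look like, the fragment below tails the container log files on each node, parses each line as JSON, and ships the records to an Elasticsearch host via fluentd's elasticsearch output plugin. The paths, tag, and host name here are illustrative assumptions; consult the fluentd documentation for the options your setup actually needs.

```
<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  <parse>
    @type json
  </parse>
</source>

<match kubernetes.**>
  @type elasticsearch
  host elasticsearch.logging.svc
  port 9200
  logstash_format true
</match>
```

In a real deployment, fluentd typically runs as a DaemonSet so that one collector pod runs on every node.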
Logging Best Practices
Kubernetes has many moving parts, which helps performance and efficiency but also increases the complexity of log retention and monitoring. As previously discussed, centralized logging and monitoring are essential to any production or high-performance environment. In the event of an incident, logs provide the insight and evidence needed to investigate and ultimately resolve the issue. As described above, getting logs out of containers and Pods in Kubernetes is fairly straightforward, but deciding where to send them and maintaining retention can become a challenge. Which logging system you elect to use is up to you; however, ensuring that the logs end up in a central location, in a separate and isolated environment, is paramount.
Secondly, Kubernetes logs can quickly take up a lot of space. Whether you elect to use a service or perform logging in-house, it is essential that you establish a retention policy and estimate your required disk space. How much disk space you need is ultimately determined by your Kubernetes environment. Lastly, you should review your logs daily. Even if no incident has occurred, the logs can reveal important information that can prevent issues from occurring.
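To make the disk-space estimate concrete, a rough back-of-the-envelope calculation might look like the following. All of the numbers here are hypothetical placeholders; substitute figures measured from your own cluster.

```shell
# Hypothetical inputs: measure these in your own environment.
PODS=50               # average number of pods producing logs
MB_PER_POD_DAY=20     # average log volume per pod per day, in MB
RETENTION_DAYS=30     # how long the retention policy keeps logs
REPLICAS=2            # copies kept by the storage backend

TOTAL_MB=$((PODS * MB_PER_POD_DAY * RETENTION_DAYS * REPLICAS))
echo "Estimated log storage: ${TOTAL_MB} MB (~$((TOTAL_MB / 1024)) GB)"
# Estimated log storage: 60000 MB (~58 GB)
```

Even a modest cluster can accumulate tens of gigabytes over a month, which is why the retention policy deserves attention up front.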
For additional resources and best practices, please review the links below.
We’ve gone over the basics of log management in Kubernetes vs. traditional servers, how to view pod logs in real-time using kubectl and kubetail, and how fluentd is used to ship logs from your Kubernetes cluster to a centralized log management service. You should now have a basic understanding of how logging works in Kubernetes, with tons of resources to check out for configuring log management in your production Kubernetes cluster.