Monitoring is vital, whether for web applications, databases, or Kubernetes clusters. It tells you how well an application or cluster is performing and what is happening inside it.
This article is specifically about monitoring Kubernetes clusters. When thousands or even millions of services run inside a cluster, monitoring it manually by running one command after another is not viable. In this article, we will deploy Prometheus and Grafana to a Kubernetes cluster and use them to monitor it.
Prometheus is a free, open-source monitoring tool for containers and microservices. It collects numerical data as time series. The Prometheus server works on the principle of scraping: it calls the metrics endpoint of each target it has been configured to monitor. The metrics are collected at regular intervals, timestamped, and stored locally. The endpoint that is scraped is exposed by the node itself.
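To make the scraping model concrete: a metrics endpoint returns plain text, one sample per line, with `# HELP` and `# TYPE` comment lines in between. The sketch below filters the samples of a single metric out of such a payload. The helper name `samples_for` and the sample payload are illustrative assumptions, not part of Prometheus.

```shell
# Print the samples of one metric from Prometheus exposition text read on stdin.
# samples_for is a hypothetical helper name; the payload below is made-up sample data.
samples_for() {
  awk -v name="$1" '
    /^#/ { next }                       # skip HELP and TYPE comment lines
    {
      n = length(name)
      if (substr($0, 1, n) == name) {
        c = substr($0, n + 1, 1)        # next character must end the metric name
        if (c == "{" || c == " ") print
      }
    }'
}

sample_payload() {
  cat <<'EOF'
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
# HELP node_load15 15m load average.
# TYPE node_load15 gauge
node_load15 0.30
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
node_cpu_seconds_total{cpu="0",mode="user"} 67.8
EOF
}

# Prints the two node_cpu_seconds_total sample lines
sample_payload | samples_for node_cpu_seconds_total
```

Against a live target, the output of `curl -s` on its metrics endpoint can be piped into the same function.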
Prometheus retains data for 15 days by default. The lowest retention period is 2 hours. The longer you retain data, the more disk space is used, simply because there is more data to keep. A short retention period makes sense when you configure remote storage for Prometheus.
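A rough way to see how retention drives disk usage is the sizing rule of thumb from the Prometheus storage documentation: retention time in seconds, times ingested samples per second, times bytes per sample. The sketch below applies it. The helper name `estimate_disk_bytes`, the 2 bytes per sample default, and the ingestion rate of 10,000 samples per second are assumptions chosen for illustration.

```shell
# Rough disk estimate: retention_seconds * samples_per_second * bytes_per_sample.
# estimate_disk_bytes is a hypothetical helper; 2 bytes per sample is an assumed average.
estimate_disk_bytes() {
  retention_days=$1
  samples_per_second=$2
  bytes_per_sample=${3:-2}
  echo $(( retention_days * 86400 * samples_per_second * bytes_per_sample ))
}

# 15 days of retention at an assumed 10,000 samples per second
estimate_disk_bytes 15 10000   # prints 25920000000 (about 26 GB)
```

Halving the retention period halves the estimate, which is why a short retention combined with remote storage keeps local disks small.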
Grafana is a multi-platform visualization tool, available since 2014, that renders graphs and charts in the browser from a connected data source. Prometheus has its own built-in expression browser, but Grafana is one of the industry's most powerful visualization tools, and it integrates with Prometheus out of the box. It can query and visualize your data no matter where it is stored.
Grafana offers fast, extensible client-side graphs with many options, and there are plugins for many different ways to visualize metrics and logs. Later in this article we will plot a Kubernetes metric in a custom panel.
We will also look at the kube-state-metrics list that can be visualized in Grafana. Grafana lets you split the view and compare different time ranges, queries, and data sources side by side.
You can switch from metrics to logs with label filters preserved, and quickly search through all your logs or stream them live.
You can set up Minikube (a local Kubernetes cluster) or use a managed Kubernetes service such as Google Kubernetes Engine or Amazon Elastic Kubernetes Service to deploy Prometheus and Grafana. Connect to the cluster and follow along.
First, following good practice, we create a dedicated namespace.
$ kubectl create namespace prometheus
This command creates the namespace in which we will deploy Prometheus in the next step.
We use the Prometheus Operator Helm chart to deploy Prometheus, Grafana, and several other services used to monitor Kubernetes clusters.
$ helm install prometheus stable/prometheus-operator --namespace prometheus
To validate the Prometheus installation, list the pods in the namespace:
$ kubectl get pods -n prometheus
Result will look like:
NAME                                                   READY   STATUS    RESTARTS   AGE
alertmanager-prometheus-prometheus-oper-alertmanager   2/2     Running   0          16m
prometheus-grafana-5c5885d488-b9mlj                    2/2     Running   0          19m
prometheus-kube-state-metrics-6967c9fd67-zsw6c         1/1     Running   0          19m
prometheus-prometheus-node-exporter-jj6wq              1/1     Running   0          19m
prometheus-prometheus-oper-operator-77cbbc55f5-6btf2   2/2     Running   0          19m
prometheus-prometheus-prometheus-oper-prometheus-0     3/3     Running   1          15m
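With more than a handful of pods, scanning the READY and STATUS columns by eye gets tedious. As a small sketch, the listing can be filtered for pods that are not fully ready. The helper name `not_ready_pods` is hypothetical, and the failing row in the sample data is made up for illustration.

```shell
# List pods that are not fully ready, reading `kubectl get pods` output on stdin.
# not_ready_pods is a hypothetical helper name.
not_ready_pods() {
  awk 'NR > 1 {
    split($2, r, "/")                   # READY column looks like "2/2"
    if (r[1] != r[2] || $3 != "Running") print $1
  }'
}

# Made-up sample listing: the second pod is failing on purpose.
sample_pods() {
  cat <<'EOF'
NAME                                             READY   STATUS             RESTARTS   AGE
prometheus-grafana-5c5885d488-b9mlj              2/2     Running            0          19m
prometheus-kube-state-metrics-6967c9fd67-zsw6c   0/1     CrashLoopBackOff   4          19m
EOF
}

sample_pods | not_ready_pods   # prints prometheus-kube-state-metrics-6967c9fd67-zsw6c

# Usage against the cluster:
# kubectl get pods -n prometheus | not_ready_pods
```

An empty result means every pod is running with all of its containers ready. Note that this simple filter would also flag pods in the Completed state.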
Now that Prometheus is installed on the cluster, we can reach the Prometheus dashboard by forwarding a local port to the Prometheus pod:
$ kubectl port-forward -n prometheus prometheus-prometheus-prometheus-oper-prometheus-0 9090
Visit localhost:9090 and you will see the Prometheus dashboard.
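If the page does not load right away, Prometheus may still be starting. Prometheus exposes a readiness endpoint at `/-/ready`, and the sketch below polls it through the forwarded port. The helper name `wait_for_prometheus` and the default of 30 attempts are assumptions for illustration.

```shell
# Poll the Prometheus readiness endpoint until it answers or the attempts run out.
# wait_for_prometheus is a hypothetical helper; the URL assumes the port-forward is running.
wait_for_prometheus() {
  url=${1:-http://localhost:9090}
  attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors such as 503 during startup
    if curl -sf -o /dev/null "$url/-/ready"; then
      echo "ready"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "not ready after $attempts attempts" >&2
  return 1
}

# Usage (with the port-forward running in another terminal):
# wait_for_prometheus http://localhost:9090 30
```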
As we saw, the Prometheus Operator chart installs Grafana alongside Prometheus and some other deployments. We can reach the Grafana dashboard the same way, by port forwarding to the Grafana pod.
$ kubectl port-forward -n prometheus prometheus-grafana-5c5885d488-b9mlj 3000
Now you can access the Grafana dashboard at localhost:3000.
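Pod names such as prometheus-grafana-5c5885d488-b9mlj carry generated suffixes, so the names in your cluster will differ and will change whenever a pod is recreated. One hedged alternative is to forward to the Deployment instead, since `kubectl port-forward` also accepts `deployment/<name>`. The sketch below derives the Deployment name from a pod name. The helper `deployment_of` is hypothetical and assumes the usual `<deployment>-<hash>-<suffix>` naming.

```shell
# Strip the ReplicaSet hash and the pod suffix from a Deployment-managed pod name.
# deployment_of is a hypothetical helper; it assumes <deployment>-<hash>-<suffix> naming.
deployment_of() {
  echo "${1%-*-*}"
}

deployment_of prometheus-grafana-5c5885d488-b9mlj   # prints prometheus-grafana

# Usage: forward to whichever pod currently backs the Deployment
# kubectl port-forward -n prometheus deployment/prometheus-grafana 3000
```

This only applies to pods created by a Deployment. Pods with other owners, such as the Prometheus pod ending in -0, follow a different naming scheme.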
The Grafana username and password are stored in the prometheus-grafana secret:
$ kubectl get secret --namespace prometheus prometheus-grafana -o yaml
Result will look like:
apiVersion: v1
data:
  admin-password: cHJvbS1vcGVyYXRvcg==
  admin-user: YWRtaW4=
  ldap-toml: ""
kind: Secret
type: Opaque
# with some other metadata
Secret data is base64-encoded. You can decode it with the following commands:
$ echo "YWRtaW4=" | openssl base64 -d
admin
$ echo "cHJvbS1vcGVyYXRvcg==" | openssl base64 -d
prom-operator
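Rather than copying values out of the YAML by hand, a single key can be read with a JSONPath expression and decoded in one step. The following is a sketch: the helper name `grafana_credential` is hypothetical, and the secret name and namespace are the ones used in this article.

```shell
# Read one key from the Grafana secret and decode it.
# grafana_credential is a hypothetical helper; secret and namespace match this article.
grafana_credential() {
  kubectl get secret --namespace prometheus prometheus-grafana \
    -o "jsonpath={.data.$1}" | base64 --decode
  echo   # the decoded value has no trailing newline, so add one
}

# Usage:
# grafana_credential admin-user
# grafana_credential admin-password
```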
Use these credentials to log in to the Grafana dashboard.
The Prometheus Operator chart ships with some pre-configured Grafana dashboards. To see them, click Dashboards > Manage.
One of them is the Kubernetes compute resources dashboard for the pods in a namespace, and several other useful dashboards are available as well.
When we deployed the Prometheus Operator chart using Helm, it installed not just Prometheus but also the other components seen in the pod list earlier: Alertmanager, Grafana, kube-state-metrics, and node-exporter.
Included tools such as kube-state-metrics expose metrics about the state of the cluster's objects, and we can list them by port forwarding to the kube-state-metrics pod.
$ kubectl port-forward -n prometheus prometheus-kube-state-metrics-6967c9fd67-zsw6c 8080
Now visit localhost:8080/metrics to see all the metrics it exposes for monitoring the Kubernetes cluster.
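The raw page is long, so a summary of metric names and types is often easier to read than the full output. As a sketch, the `# TYPE` lines of the payload can be condensed into one line per metric family. The helper name `metric_families` is hypothetical, and the demo input is a made-up excerpt.

```shell
# Summarize metric families from exposition text on stdin as "<name> <type>".
# metric_families is a hypothetical helper name.
metric_families() {
  awk '$1 == "#" && $2 == "TYPE" { print $3, $4 }' | sort -u
}

# Made-up excerpt in the same text format; prints "kube_pod_info gauge"
printf '%s\n' \
  '# HELP kube_pod_info Information about pod.' \
  '# TYPE kube_pod_info gauge' \
  'kube_pod_info{namespace="prometheus",pod="example"} 1' | metric_families

# Usage (with the port-forward running in another terminal):
# curl -s localhost:8080/metrics | metric_families
```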
Now all the integrations are in place and we are ready to monitor the cluster. Since some default dashboards are available, start by going to Dashboards > Manage to see the dashboard list.
Unless you are on Minikube or a similar single-node setup, you can view the resource consumption graph of each node separately, which gives you an idea of each node's performance.
Here we see only one node, but in production there will be more nodes and therefore more instances. You get network, disk, memory, and load average graphs, with fine-grained data for each metric.
We will also create a new dashboard for a specific metric that is not available as a default dashboard.
We use the prometheus_sd_kubernetes_http_request_total metric for our new panel.
Click Apply and the new panel for the custom metric appears. It is as easy as that.
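The same metric can also be queried outside Grafana through the Prometheus HTTP API, which is handy for checking that a panel query returns data. Because the metric name ends in _total it is presumably a counter, so the sketch below asks for its per-second rate rather than the raw value. The helper name `prom_query` and the `PROM_URL` variable are assumptions, and the URL relies on the Prometheus port-forward from earlier.

```shell
# Run an instant PromQL query against the Prometheus HTTP API.
# prom_query and PROM_URL are hypothetical names; the URL assumes the port-forward is running.
PROM_URL=${PROM_URL:-http://localhost:9090}

prom_query() {
  # -G sends the URL-encoded data as a GET query string
  curl -sG "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}

# Usage: per-second request rate over the last 5 minutes
# prom_query 'rate(prometheus_sd_kubernetes_http_request_total[5m])'
```

The response is JSON. The same expression can be pasted into the query field of a Grafana panel.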
There are other important metrics that we watch in production. They relate to nodes, pods, the Kubernetes API, CPU, memory, storage, and resource usage.