What Is A Service Mesh?


In order to understand how, or why, a tool or technology came into existence, it helps to look at the origins of the problem it tries to solve, and at what the world would look like without it. Microservices have been around for some time: it’s an architecture that breaks large monolithic applications into small units that communicate with each other over lightweight protocols such as HTTP. It’s also a model created to address scalability and availability. A microservices application is more scalable than a monolithic one because you can select just the services that are under heavy load and create more replicas of them, instead of replicating the whole giant system. By the same mechanism, a microservices application is also more available, since multiple replicas of the same service can serve requests. Containerization and container orchestration systems like Kubernetes made microservices even more robust: containers are very lightweight, taking only milliseconds to launch, and with a system like Kubernetes they can easily be moved from one node (machine) to another. That’s great on its own. However, as more and more environments embraced microservices, new challenges emerged. Let’s take a look.

Microservice Challenges

The need for load balancing: since there’ll be more than one replica responsible for the same service, there must be a load balancer to receive the initial request from the client and route it to a healthy backend service.


The need for intelligent load balancing: in the modern world of microservices, where clients expect services to respond in a fraction of a second, round-robin load balancing is no longer enough. The load balancer should be smart enough to send less traffic to a service with high latency and, in extreme cases, stop routing to an overloaded service altogether so that it doesn’t collapse under the increased load. Intelligent load balancing also enables advanced routing scenarios like canary testing, where traffic is routed to particular backends depending on specific conditions such as custom HTTP headers.
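To make this concrete, here is a minimal sketch of how a layer-7 proxy such as Envoy (which we’ll use later in this article) can express header-based canary routing. The x-canary header and the cluster names are illustrative, not part of any real deployment:

# Route configuration fragment (Envoy v2 API): requests carrying the custom
# header are sent to the canary cluster, everything else to the stable one.
route_config:
  name: canary_route
  virtual_hosts:
  - name: service
    domains: ["*"]
    routes:
    # requests with x-canary: true go to the canary backend
    - match:
        prefix: "/"
        headers:
        - name: "x-canary"
          exact_match: "true"
      route:
        cluster: service_canary
    # everything else goes to the stable backend
    - match:
        prefix: "/"
      route:
        cluster: service_stable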


Circuit-breaking pattern: assume we have an application that requires users to log in. The frontend page is served by the webserver service. As soon as users enter their credentials, the webserver contacts the authentication API (hosted in another service) to validate those credentials and either route users to the dashboard or display an “access denied” message. Now, what if the authentication service is overburdened and has difficulty handling requests at that moment? The request travels over the network and waits for the authentication API to respond. Since the latter is not functioning properly, the request waits until the I/O timeout is reached before an error is thrown that the service is unavailable. This happens with each and every authentication request until the service recovers. Sitting around waiting for requests to time out makes for a bad user experience, to say the least. The circuit-breaking pattern aims at eliminating this failure-reporting delay by classifying connections into three states (a configuration sketch follows the list below):

  • Closed: the connection is healthy and requests flow normally. In our example, the authentication API returned a valid response.
  • Open: the request failed a number of times. In our example, if that set number is 3, then after 3 failed requests, the service will no longer try to communicate with the authentication service. Instead, it will immediately report that the backend service is not responding, or is having issues.
  • Half-open: after a specified time duration, the service rechecks the target. If it’s still unreachable (or not working as expected), the connection state remains open. Otherwise, the failure count is reset and the connection returns to the closed state.
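Envoy, the proxy used later in this article, implements a closely related idea through its cluster configuration: outlier detection ejects failing hosts, and circuit-breaker thresholds cap the amount of in-flight work. Here is a minimal sketch (v2 API); the cluster name auth_service, the address, the thresholds, and the timings are all illustrative:

# Cluster fragment (Envoy v2 API) showing circuit breaking and outlier detection.
clusters:
- name: auth_service
  connect_timeout: 0.25s
  type: strict_dns
  lb_policy: round_robin
  # cap in-flight connections, pending requests, and retries so a struggling
  # backend is not buried under even more load
  circuit_breakers:
    thresholds:
    - max_connections: 100
      max_pending_requests: 50
      max_retries: 3
  # eject a host after 3 consecutive failures, then re-check it after 30s;
  # this mirrors the open / half-open cycle described above
  outlier_detection:
    consecutive_5xx: 3
    interval: 10s
    base_ejection_time: 30s
  load_assignment:
    cluster_name: auth_service
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: auth-svc
              port_value: 8080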

HTTPS communication requirement: even though they operate inside an internal, isolated network, microservices should use HTTPS for their internal communication. However, HTTPS brings its own operational burden: the application in the container needs to manage certificates, terminate TLS traffic, and so on.
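Offloading that burden is precisely what a service mesh sidecar is good at. As a rough sketch (Envoy v2 API; the certificate paths are illustrative), a listener’s filter chain can terminate TLS in front of the application, so the application itself keeps speaking plain HTTP on localhost:

# Listener fragment (v2 API): the sidecar terminates TLS, so the application
# behind it never has to handle certificates itself. Paths are illustrative.
filter_chains:
- tls_context:
    common_tls_context:
      tls_certificates:
      - certificate_chain:
          filename: "/etc/certs/service.crt"
        private_key:
          filename: "/etc/certs/service.key"
  filters:
  - name: envoy.http_connection_manager
    # ...the HTTP routing configuration goes here, exactly as in the full
    # example later in this article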

Security policies should be in place: just because the microservices are sealed off from the outside Internet does not mean they can’t be compromised. If intruders manage to break into a weak part of the application, say a service that has no business communicating with the database, they shouldn’t be able to reach the database by compromising that service.
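Service meshes typically enforce such rules with identity-aware policies and mutual TLS between sidecars. Even in plain Kubernetes, the idea can be sketched with a NetworkPolicy; the labels and the port below are illustrative:

# Only pods labelled app: api may open connections to the database pods.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-allow-api-only
spec:
  # the policy applies to the database pods
  podSelector:
    matchLabels:
      app: database
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: api
    ports:
    - protocol: TCP
      port: 5432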

These are just a subset of the many challenges that modern microservice architectures face today, and a service mesh is a technology that aims at addressing these issues.

But I Thought The Kubernetes Service Already Does That?

One of the most common questions that pops into mind when studying service meshes for the first time is: I already know Kubernetes; it has components for dealing with those requirements, like Services, liveness probes, and readiness probes. That’s partially correct. Kubernetes offers a basic service mesh of its own through its Service component: a Service provides round-robin load balancing and service discovery. However, it will only get you so far. A Service cannot implement intelligent load balancing, back-off logic (stopping traffic to a backend Pod when specific conditions are met), or other advanced features provided by service meshes. As a matter of fact, a service mesh is best thought of as an extension of container orchestration systems. Think of it as Kubernetes on steroids.
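For reference, here is roughly what that built-in machinery looks like: a readiness probe tells Kubernetes which Pods are healthy, and the Service round-robins only across the ready endpoints. The probe path and settings are illustrative and not part of the demo app used later in this article:

# Pod spec fragment: the kubelet probes the container, and the Service
# only routes traffic to Pods whose readiness probe succeeds.
containers:
- name: app
  image: magalixcorp/birthdaygreeter
  readinessProbe:
    httpGet:
      path: /healthz   # illustrative; the demo app may not expose this path
      port: 8080
    initialDelaySeconds: 3
    periodSeconds: 5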

Service Mesh Deployment Patterns

When deploying a service mesh framework, you typically have two options:

  • Per host: the proxy is deployed once per host. On Kubernetes, this can be done through a DaemonSet. The advantage of this approach is that you save resources by deploying fewer proxies; the drawback is that a failure in one proxy affects all the containers running on that node (see the sketch after this list).
  • Per container: the proxy is deployed as a sidecar to each container. The Pod definition contains the application container as well as the proxy (sidecar) container, so the sidecar is tightly coupled with the application, and the two can communicate with each other over the loopback interface.
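As a rough sketch of the per-host pattern, the proxy can be rolled out with a DaemonSet so that exactly one copy runs on every node. The image and ConfigMap name below simply reuse the ones from the lab later in this article:

# One proxy pod per node via a DaemonSet (per-host pattern).
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: mesh-proxy
spec:
  selector:
    matchLabels:
      app: mesh-proxy
  template:
    metadata:
      labels:
        app: mesh-proxy
    spec:
      containers:
      - name: envoy
        image: envoyproxy/envoy-alpine
        args: ["-c", "/envoy-config/envoy.yaml"]
        volumeMounts:
        - name: envoy-config
          mountPath: /envoy-config
      volumes:
      - name: envoy-config
        configMap:
          name: envoyconfig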

Hands-On Example: Implementing A Sidecar Proxy Using Envoy

In this lab, we’re going to demonstrate one of the features that service mesh technology provides: traffic control. We’ll use a very simple API written in Go. The API expects a POST request with the user’s birthday in the body and responds with “Hello, username! Happy birthday” if the current day matches the user’s birthday, or just “Hello, username” otherwise. Let’s give it a test drive:

docker run -d -p 8080:8080 magalixcorp/birthdaygreeter

Now, let’s send a POST request:

$ curl -XPOST --data '{"dateOfBirth":"2020-01-25"}' localhost:8080/hello/Magalixcorp
{"message":"Hello, Magalixcorp! Happy birthday"}

Since it seems to be working, let’s deploy it to a running Kubernetes cluster:

apiVersion: apps/v1
kind: Deployment
metadata:
 name: bdgreeter
 labels:
   app: bdgreeter
spec:
 replicas: 1
 selector:
   matchLabels:
     app: bdgreeter
 template:
   metadata:
     labels:
       app: bdgreeter
   spec:
     containers:
     - name: app
       image: magalixcorp/birthdaygreeter
---
apiVersion: v1
kind: Service
metadata:
 name: bdgreeter-svc
spec:
 selector:
   app: bdgreeter
 ports:
 - name: http
   port: 8080
   targetPort: 8080
   nodePort: 32000
   protocol: TCP
 type: NodePort

The definition contains a Deployment and a Service of type NodePort to route external requests to the Pod. Let’s rerun the request against the cluster:

$ curl -XPOST --data '{"dateOfBirth":"2020-01-25"}' 35.188.81.116:32000/hello/Magalixcorp
{"message":"Hello, Magalixcorp! Happy birthday"}

Here, 35.188.81.116 is the external IP address of one of our Kubernetes nodes.

Now, let’s modify the deployment to include a sidecar container running Envoy.

What’s Envoy Anyway?

According to the docs, Envoy is “an L7 proxy and communication bus designed for large modern service-oriented architectures.” The main goal of the project is to make a clear separation between the application and the network it runs on, so that when issues occur, it’s straightforward to tell where the root cause lies. Written in C++, Envoy benefits from the speed and reliability of a natively compiled language. It works at layer 7 (HTTP) to perform tasks normally delegated to webservers like Nginx, such as routing and rate limiting, and it also operates at the lower L3/L4 levels, where decisions can be made based on IP addresses and ports. Envoy also offers rich statistics and observability for the traffic flowing through the application. It’s so popular that it has become the go-to solution whenever an application sidecar needs to be implemented. Let’s have a look at how Envoy works. At its core, it has the following components:

  • TCP listener: this is where Envoy accepts traffic. A single Envoy instance may have multiple listeners. It only accepts TCP traffic.
  • Filters: this is the meat and potatoes of the system. A message received by Envoy may pass through one or more filters, undergoing multiple operations until it’s finally routed to the application. Every listener can have its own set of filters forming a filter chain. Envoy offers a lot of native filters that can be used in its configuration. For example:
    • Listener Filters: which are invoked at the early stages of establishing communication. The HTTP inspector filter, for example, is used to detect whether or not the protocol used in the connection is HTTP. Another example is the TLS inspector filter, which detects whether the connection is using TLS or just plaintext. If it’s using TLS, it extracts important information like the Server Name Indication.
    • Network Filters: those are applied to the TCP message once the connection is established. They can do several application-related tasks like rate limiting, authentication, and authorization. There are also application-specific filters like Mongo Proxy, MySQL proxy, and Kafka proxy, which provide features specific to those systems.
    • HTTP filters: they have a wide range of facilities for doing lots of things with HTTP messages. For example, CSRF protection, gRPC to JSON translation, health check, gzip compression among many others.
  • Clusters: a cluster represents one or more upstream services that Envoy connects to. Those services can be added directly in the configuration file. They can also be detected automatically by Envoy through service discovery mechanisms.

Now enough of the theory, let’s see Envoy in action.

Deploying Envoy Locally

Envoy comes in different flavors: it can be compiled and run on a local machine as a binary, or it can run through a Docker container directly. Since our end goal is to deploy it on Kubernetes as a sidecar, we’ll go with the second approach. To demonstrate how it works, we’ll deploy the Envoy container on a local machine. Create a new file called envoy.yaml (the name doesn’t matter), and paste the following content:

static_resources:
 listeners:
 - address:
     socket_address:
       address: 0.0.0.0
       port_value: 80
   filter_chains:
   - filters:
     - name: envoy.http_connection_manager
       typed_config:
         "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
         codec_type: auto
         stat_prefix: ingress_http
         route_config:
           name: local_route
           virtual_hosts:
           - name: service
             domains:
             - "*"
             routes:
             - match:
                 prefix: "/hello"
               route:
                 cluster: local_service
         http_filters:
         - name: envoy.router
           typed_config: {}
 clusters:
 - name: local_service
   connect_timeout: 0.25s
   type: strict_dns
   lb_policy: round_robin
   load_assignment:
     cluster_name: local_service
     endpoints:
     - lb_endpoints:
       - endpoint:
           address:
             socket_address:
               address: 172.17.0.1
               port_value: 3000
admin:
 access_log_path: "/dev/null"
 address:
   socket_address:
     address: 0.0.0.0
     port_value: 8081

With the Envoy configuration in place, we can launch a container using the following command:

$ docker run -d -v $(pwd)/envoy.yaml:/envoy-config/envoy.yaml -p 80:80 -p 8081:8081 envoyproxy/envoy-alpine -c /envoy-config/envoy.yaml

Notice that we’re exposing port 80 for traffic and port 8081 for the admin dashboard. Also note that the cluster in the configuration points at 172.17.0.1:3000, so the application container needs to be reachable on host port 3000 (you can see it published on that port in the docker container ls output further below). Let’s send an HTTP request through the proxy to confirm that everything works as expected:

$ curl -XPOST --data '{"dateOfBirth":"2020-01-25"}' localhost/hello/Magalixcorp
{"message":"Hello, Magalixcorp! Happy birthday"}

Okay, this proves that the system is working properly. Now, let’s see what happens when the application goes down:

$ docker container ls
CONTAINER ID        IMAGE                         COMMAND                  CREATED             STATUS              PORTS                                                   NAMES
dc5f22befa84        magalixcorp/birthdaygreeter   "./app"                  25 minutes ago      Up 25 minutes       0.0.0.0:3000->3000/tcp                                  recursing_sinoussi
35559394d244        envoyproxy/envoy-alpine       "/docker-entrypoint...."   46 minutes ago      Up 46 minutes       0.0.0.0:80->80/tcp, 0.0.0.0:8081->8081/tcp, 10000/tcp   laughing_wescoff
$ docker rm -f dc5f22befa84
dc5f22befa84
$ curl -XPOST --data '{"dateOfBirth":"2020-01-25"}' localhost/hello/Magalixcorp
upstream connect error or disconnect/reset before headers. reset reason: connection failure

As you can see, once the application is down, Envoy reports a meaningful error message stating that the upstream service is having trouble responding to the request. The client didn’t have to wait for the connection to time out; the message is displayed immediately.

Before deploying the sidecar to our Kubernetes cluster, let’s spend a few moments with the configuration file.

  • The listeners section defines the address and port on which Envoy accepts incoming connections (0.0.0.0 on port 80 in our case).
  • The filter chain uses the HTTP connection manager filter, which operates at layer 7. Because it understands HTTP, it can inspect the request URL: any request whose path starts with "/hello" is routed to a cluster named “local_service”.
  • The clusters section defines “local_service”. A cluster can hold more than one upstream service, and you choose the load balancing policy for it; in our case, round_robin.
  • The cluster needs to know where the upstream service is located on the network. Since we’re running this lab on a local machine through Docker, we can’t use 127.0.0.1, because inside the Envoy container that address points to the container’s own loopback interface. Instead, we use the Docker bridge gateway, 172.17.0.1, which lets the container reach ports published on the host.
  • The admin section defines the admin interface settings. The admin interface exposes statistics and many other useful tools (the /clusters and /stats endpoints, for example). It listens on port 8081, so in our lab that’s http://127.0.0.1:8081.

The final part of our lab is to apply the above setup to a real Kubernetes cluster. We’re going to make some changes first:

  • First, create a ConfigMap to hold the Envoy configuration. This can be done in several ways; we’ve chosen the declarative way so that we can keep the resource under version control. Create a YAML file and add the following contents:
apiVersion: v1
kind: ConfigMap
metadata:
 name: envoyconfig
 namespace: default
data:
 envoy.yaml: |-
   static_resources:
     listeners:
     - address:
         socket_address:
           address: 0.0.0.0
           port_value: 80
       filter_chains:
       - filters:
         - name: envoy.http_connection_manager
           typed_config:
             "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
             codec_type: auto
             stat_prefix: ingress_http
             route_config:
               name: local_route
               virtual_hosts:
               - name: service
                 domains:
                 - "*"
                 routes:
                 - match:
                     prefix: "/hello"
                   route:
                     cluster: local_service
             http_filters:
             - name: envoy.router
               typed_config: {}
     clusters:
     - name: local_service
       connect_timeout: 0.25s
       type: strict_dns
       lb_policy: round_robin
       load_assignment:
         cluster_name: local_service
         endpoints:
         - lb_endpoints:
           - endpoint:
               address:
                 socket_address:
                   address: 127.0.0.1
                   port_value: 3000
   admin:
     access_log_path: "/dev/null"
     address:
       socket_address:
         address: 0.0.0.0
         port_value: 8081
  • The second change is in the Envoy configuration itself: the socket_address of the upstream service is now 127.0.0.1 instead of the Docker bridge address. Since Envoy now runs as a sidecar container in the same Pod as the application, it can reach the application container over localhost.
  • The third change is in the deployment file: we add the sidecar container to the Pod, point the Service at the sidecar container (port 80) instead of the application container, and expose a second Service port for the admin interface on 8081. Our deployment file should look as follows:
apiVersion: apps/v1
kind: Deployment
metadata:
 name: bdgreeter
 labels:
   app: bdgreeter
spec:
 replicas: 1
 selector:
   matchLabels:
     app: bdgreeter
 template:
   metadata:
     labels:
       app: bdgreeter
   spec:
     containers:
     - name: app
       image: magalixcorp/birthdaygreeter
     - name: envoy
       image: envoyproxy/envoy-alpine
       args:
         - "-c"
         - "/envoy-config/envoy.yaml"
       volumeMounts:
         - name: envoy-config
           mountPath: /envoy-config
     volumes:
     - name: envoy-config
       configMap:
         name: envoyconfig
---
apiVersion: v1
kind: Service
metadata:
 name: bdgreeter-svc
spec:
 selector:
   app: bdgreeter
 ports:
 - name: http
   port: 80
   targetPort: 80
   protocol: TCP
 - name: admin
   port: 8081
   targetPort: 8081
 type: LoadBalancer

Now, we need to apply both files to our cluster:

$ kubectl apply -f configmap.yml
$ kubectl apply -f bdaygreeter.yaml

We’re using LoadBalancer as the Service type this time. Let’s get the external IP:

$ kubectl get svc
NAME            TYPE           CLUSTER-IP   EXTERNAL-IP     PORT(S)                       AGE
bdgreeter-svc   LoadBalancer   10.0.26.74   20.185.12.166   80:30652/TCP,8081:30957/TCP   24m
kubernetes      ClusterIP      10.0.0.1     <none>          443/TCP                       86m

Now, let’s test our work:

$ curl -XPOST --data '{"dateOfBirth":"2020-01-26"}' 20.185.12.166/hello/Magalixcorp
{"message":"Hello, Magalixcorp! Happy birthday"}

You can also have a look at the admin dashboard at http://20.185.12.166:8081/

TL;DR

Service meshes are one of the hottest topics in the cloud-native world today. In this article, we gave an overview of what service meshes are and how they’re typically used. We also went through a practical hands-on lab using the very popular sidecar proxy, Envoy. In the lab, we deployed Envoy as a standalone container on our local environment, and then as a sidecar container in a Kubernetes Pod. Envoy is written in C++, so it’s very fast, and it offers a myriad of features. At its core, Envoy uses filters to apply various operations on messages as they arrive, and it can distribute traffic to multiple upstream services using several load balancing policies.
