To understand how, or why, a tool or technology came into existence, you can start by looking at the origins of the problem it tries to solve, and what the world would look like without it. Microservices have been around for some time -- it’s an architecture that breaks large monolithic applications into small units that communicate with each other over HTTP. It’s also a model created to solve the problems of scalability and availability. A microservices application is more scalable than a monolithic one because you can select just the services that are under heavy load and create more replicas of them, instead of replicating the whole giant system. By the same mechanism, a microservices application is also more highly available, since we can run multiple replicas of the same service.

Containerization and container orchestration systems like Kubernetes made microservices even more robust. Containers are very lightweight, taking only milliseconds to launch, and with systems like Kubernetes they can easily be moved from one node (machine) to another. That’s great on its own; however, as more and more environments embraced microservices, new challenges emerged. Let’s take a look.
Microservice Challenges
The need for load balancing: since there’ll be more than one replica responsible for the same service, there must be a load balancer to receive the initial request from the client and route it to a healthy backend service.

The need for intelligent load balancing: in the modern world of microservices, where clients expect services to respond in a fraction of a second, round-robin load balancing is no longer enough. The load balancer should be smart enough to send less traffic to a service that is responding with high latency. In extreme cases, it should stop routing to an overloaded service altogether so that it doesn’t collapse under the increased load. Intelligent load balancing also enables advanced routing scenarios like canary testing, where traffic is routed to particular backends depending on specific conditions, such as custom HTTP headers (a sketch of this follows below).
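As a concrete illustration of header-based routing, here is a minimal sketch using the route configuration of Envoy, the proxy we’ll work with later in this article. The x-canary header and the canary/stable cluster names are hypothetical, chosen just for this sketch; routes are evaluated in order, so flagged requests hit the canary cluster and everything else falls through to the stable one:

# Sketch only: the header name and cluster names are hypothetical.
route_config:
  name: canary_route
  virtual_hosts:
  - name: service
    domains: ["*"]
    routes:
    - match:
        prefix: "/"
        headers:
        - name: "x-canary"
          exact_match: "true"
      route:
        cluster: canary    # requests carrying the canary header
    - match:
        prefix: "/"
      route:
        cluster: stable    # everyone else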

Circuit-breaking pattern: assume we have an application that requires users to log in. The frontend page is served by the webserver service. As soon as users enter their credentials, the webserver contacts the authentication API (hosted on another service) to validate those credentials and either route users to the dashboard or display an “access denied” message. Now, what if the authentication service is overburdened and struggling to handle requests at that moment? Each request travels over the network and waits for the authentication API to respond; since the latter is not functioning properly, the request waits until the I/O timeout is reached before an error is thrown that the service is unavailable. This happens with each and every authentication request until the service recovers. Sitting around waiting for requests to time out provides a bad user experience, to say the least. The circuit-breaking pattern aims to eliminate this failure-reporting delay by classifying connections into three states (a configuration sketch follows the list):
- Closed: the connection is healthy. In our example, the authentication API returned a valid response.
- Open: the request failed a set number of times. In our example, if that number is 3, then after 3 failed requests the webserver no longer tries to contact the authentication service. Instead, it immediately reports that the backend service is not responding, or is having issues.
- Half-open: after a specified time duration, the service rechecks the target. If it’s still unreachable (or not working as expected), the connection state remains open. Otherwise, the failure count is reset and the connection returns to the closed state.
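Service mesh proxies typically implement this pattern for you. As a minimal sketch, Envoy lets a cluster define outlier detection, which ejects an endpoint from the load-balancing pool after repeated failures and retries it after a cooling-off period, closely mirroring the open and half-open states above. The cluster name and thresholds below are arbitrary, illustrative values:

# Sketch only: illustrative values; the cluster's endpoints are omitted for brevity.
clusters:
- name: auth_service
  connect_timeout: 0.25s
  type: strict_dns
  lb_policy: round_robin
  outlier_detection:
    consecutive_5xx: 3          # "open" after 3 consecutive failures
    interval: 10s               # how often endpoints are evaluated
    base_ejection_time: 30s     # ejected hosts are retried after this ("half-open")
    max_ejection_percent: 100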

HTTPS communication requirement: despite operating inside an internal, isolated network, microservices should always use HTTPS for their internal communication. However, HTTPS brings its own operational burden: the application in the container needs to manage certificates, terminate TLS traffic, and so on.
Security policies should be in place: just because the microservices are sealed off from the outside Internet inside the internal network does not mean they can’t be compromised. If intruders manage to break into a weak part of the application, say a service that has no business communicating with the database, they shouldn’t be able to reach the database through that compromised service (a sketch of such a policy follows).
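Service meshes ship their own, richer policy resources for this, but to make the idea concrete, here is a minimal sketch of the same intent expressed as a plain Kubernetes NetworkPolicy. The labels (app: database, app: billing) and the port are hypothetical; the policy allows only the billing service to open connections to the database:

# Sketch only: label names and port are hypothetical.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-ingress-policy
spec:
  podSelector:
    matchLabels:
      app: database        # the policy protects database Pods
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: billing     # only billing Pods may connect
    ports:
    - protocol: TCP
      port: 5432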
These are just a subset of the many challenges that modern microservice architectures face today, and a service mesh is a technology that aims to address them.
But I Thought The Kubernetes Service Already Does That?
One of the most common questions that comes to mind when studying service meshes for the first time is: I already know Kubernetes, and it has components for dealing with those requirements, like the Service, liveness probes, and readiness probes. That’s partially correct. Kubernetes offers a basic service mesh of its own through its Service component: a Service provides round-robin load balancing and service discovery. However, it will only get you so far. A Service cannot implement intelligent load balancing, backoff logic (stopping traffic to a backend Pod when specific conditions are met), or the other advanced features provided by service meshes. As a matter of fact, a service mesh is considered an extension of container orchestration systems. Think of it as Kubernetes on steroids.
Service Mesh Deployment Patterns
When deploying a service mesh framework, you typically have two options:
- Per host: where the proxy is deployed on each host. If we’re using Kubernetes, it can be deployed through a DaemonSet. The advantage of this approach is that you save resources by deploying fewer proxies. The drawback, however, is that a failure in one of the proxies affects all the containers running on that node (a minimal sketch of this pattern follows the list).
- Per container: where the proxy is deployed as a sidecar to each container. The Pod definition contains the application container as well as the proxy (sidecar) container, so the sidecar is tightly coupled with the application, and the two can communicate with each other through the loopback interface.
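For the per-host pattern, a minimal sketch of the Kubernetes side is shown below. The names, image, and port are placeholders; real per-host meshes ship their own, more elaborate manifests:

# Sketch only: names, image, and port are placeholders for a per-host proxy.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: mesh-proxy
spec:
  selector:
    matchLabels:
      app: mesh-proxy
  template:
    metadata:
      labels:
        app: mesh-proxy
    spec:
      containers:
      - name: proxy
        image: example/mesh-proxy:latest   # placeholder proxy image
        ports:
        - containerPort: 15001             # placeholder proxy port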
Hands-On Example: Implementing A Sidecar Proxy Using Envoy
In this lab, we’re going to demonstrate one of the features that service mesh technology provides: traffic control. We’ll use a very simple API written in Go. The API expects a POST request with the user’s birthday in the body. It responds with “Hello, username! Happy birthday” if the current day matches the user’s birthday, or just “Hello, username” otherwise. Let’s give it a test drive:
$ docker run -d -p 8080:8080 magalixcorp/birthdaygreeter
Now, let’s send a POST request:
$ curl -XPOST --data '{"dateOfBirth":"2020-01-25"}' localhost:8080/hello/Magalixcorp
{"message":"Hello, Magalixcorp! Happy birthday"}
Since it seems to be working, let’s deploy it to a running Kubernetes cluster:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bdgreeter
  labels:
    app: bdgreeter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bdgreeter
  template:
    metadata:
      labels:
        app: bdgreeter
    spec:
      containers:
      - name: app
        image: magalixcorp/birthdaygreeter
---
apiVersion: v1
kind: Service
metadata:
  name: bdgreeter-svc
spec:
  selector:
    app: bdgreeter
  ports:
  - name: http
    port: 8080
    targetPort: 8080
    nodePort: 32000
    protocol: TCP
  type: NodePort
The definition contains a Deployment and a Service of type NodePort to route external requests to the Pod. Let’s rerun the request against the cluster:
$ curl -XPOST --data '{"dateOfBirth":"2020-01-25"}' 35.188.81.116:32000/hello/Magalixcorp
{"message":"Hello, Magalixcorp! Happy birthday"}
Here, 35.188.81.116 is the external IP address of one of our Kubernetes nodes.
Now, let’s modify the deployment to include a sidecar container running Envoy.
What’s Envoy Anyway?
According to the docs, Envoy can be defined as “an L7 proxy and communication bus designed for large modern service-oriented architectures.” The main goal of the project is to make a clear separation between the application and the network it operates on, so that when issues occur it’s straightforward to tell where the root cause lies. Written in C++, Envoy benefits from the speed and reliability of a natively compiled language. It works at HTTP layer 7 to perform tasks normally delegated to webservers like Nginx, such as routing and rate limiting, and it also operates at the lower layers 3 and 4, where decisions can be made based on IP addresses and ports. Envoy also offers statistical visibility and reporting for the application. It’s so popular that it has become the go-to solution whenever an application sidecar needs to be implemented.

Let’s have a look at how Envoy works. At its core, it has the following components:
- TCP listeners: this is where Envoy accepts traffic. A single Envoy instance may have multiple listeners, each accepting TCP connections on a configured address and port.
- Filters: this is the meat and potatoes of the system. A message received by Envoy may pass through one or more filters, undergoing multiple operations until it’s finally routed to the application. Every listener can have its own set of filters, forming a filter chain. Envoy ships with many native filters that can be used in its configuration. For example:
  - Listener filters: invoked at the early stages of establishing a connection. The HTTP inspector filter, for example, detects whether or not the protocol used in the connection is HTTP. Another example is the TLS inspector filter, which detects whether the connection uses TLS or plaintext; if it’s TLS, it extracts important information like the Server Name Indication.
  - Network filters: applied to the TCP traffic once the connection is established. They handle tasks like rate limiting, authentication, and authorization. There are also application-specific filters like the Mongo proxy, MySQL proxy, and Kafka proxy, which provide features specific to those systems.
  - HTTP filters: these offer a wide range of facilities for working with HTTP messages, for example CSRF protection, gRPC-JSON transcoding, health checking, and gzip compression, among many others.
- Clusters: a cluster represents one or more upstream endpoints that Envoy connects to. Those endpoints can be listed directly in the configuration file, or they can be discovered automatically by Envoy through service discovery mechanisms.

Now enough of the theory, let’s see Envoy in action.
Deploying Envoy Locally
Envoy comes in different flavors: it can be compiled and run on a local machine as a binary, or it can run through a Docker container directly. Since our end goal is to deploy it on Kubernetes as a sidecar, we’ll go with the second approach. To demonstrate how it works, we’ll deploy the Envoy container on a local machine. Create a new file called envoy.yaml (the name doesn’t matter), and paste the following content:
static_resources:
  listeners:
  - address:
      socket_address:
        address: 0.0.0.0
        port_value: 80
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
          codec_type: auto
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: service
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/hello"
                route:
                  cluster: local_service
          http_filters:
          - name: envoy.router
            typed_config: {}
  clusters:
  - name: local_service
    connect_timeout: 0.25s
    type: strict_dns
    lb_policy: round_robin
    load_assignment:
      cluster_name: local_service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 172.17.0.1
                port_value: 3000
admin:
  access_log_path: "/dev/null"
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 8081
Now that the Envoy configuration is in place, we can launch a container using the following command:
$ docker run -d -v $(pwd)/envoy.yaml:/envoy-config/envoy.yaml -p 80:80 -p 8081:8081 envoyproxy/envoy-alpine -c /envoy-config/envoy.yaml
Notice that we are exposing port 80 for the traffic and port 8081 for the admin dashboard. Also note that the application container must be reachable on host port 3000 (as shown in the container listing further below), since that’s the upstream address configured in envoy.yaml. Let’s send the HTTP request again to confirm that everything works as expected:
$ curl -XPOST --data '{"dateOfBirth":"2020-01-25"}' localhost/hello/Magalixcorp
{"message":"Hello, Magalixcorp! Happy birthday"}
Okay, this proves that the system is working properly. Now, let’s see what happens when the application goes down:
$ docker container ls
CONTAINER ID   IMAGE                         COMMAND                    CREATED          STATUS          PORTS                                                   NAMES
dc5f22befa84   magalixcorp/birthdaygreeter   "./app"                    25 minutes ago   Up 25 minutes   0.0.0.0:3000->3000/tcp                                  recursing_sinoussi
35559394d244   envoyproxy/envoy-alpine       "/docker-entrypoint...."   46 minutes ago   Up 46 minutes   0.0.0.0:80->80/tcp, 0.0.0.0:8081->8081/tcp, 10000/tcp   laughing_wescoff
$ docker rm -f dc5f22befa84
dc5f22befa84
$ curl -XPOST --data '{"dateOfBirth":"2020-01-25"}' localhost/hello/Magalixcorp
upstream connect error or disconnect/reset before headers. reset reason: connection failure
As you can see, once the application is down, Envoy immediately reports an informative error message: the upstream service failed to respond to the request. The client doesn’t have to wait for the connection to time out; the error is returned right away (the connect_timeout of 0.25s in the configuration controls how long Envoy waits for the upstream connection before giving up).
Before deploying the sidecar to our Kubernetes cluster, let’s spend a few moments with the configuration file.
- The listeners part contains the address and port on which Envoy listens for incoming connections; here, 0.0.0.0 on port 80.
- The filter chain uses the HTTP connection manager filter, which operates at layer 7. Because it understands HTTP, it can inspect the URL of the request: in the route_config, any request whose path starts with "/hello" is directed to a cluster called “local_service”.
- The clusters section defines “local_service”. A cluster can contain more than one upstream endpoint, and we select the load balancing mechanism; in our case, round_robin (see the sketch after this list).
- The cluster needs to know where the upstream service is located on the network. Since we’re running this lab on our local machine through Docker, we can’t use 127.0.0.1, because that would resolve to the loopback interface inside the Envoy container. Instead, we use 172.17.0.1, the host’s address on the default Docker bridge, through which the container can reach the port published by the application container on the host.
- The admin section defines the settings of Envoy’s admin interface, which exposes statistics and several other useful endpoints (such as /stats and /clusters). It listens on port 8081; in our lab, that’s http://127.0.0.1:8081.
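To illustrate that last point about multiple upstreams, here is a sketch of the same clusters section with two endpoints. The second port (3001) is hypothetical and assumes a second copy of the application is published there; with round_robin, Envoy would alternate requests between the two:

# Sketch only: port 3001 assumes a second copy of the application.
clusters:
- name: local_service
  connect_timeout: 0.25s
  type: strict_dns
  lb_policy: round_robin
  load_assignment:
    cluster_name: local_service
    endpoints:
    - lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: 172.17.0.1
              port_value: 3000
      - endpoint:
          address:
            socket_address:
              address: 172.17.0.1
              port_value: 3001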
The final part of our lab is to apply the above setup to a real Kubernetes cluster. We’re going to make some changes first:
- Create a ConfigMap to hold the Envoy configuration. This can be done in several ways; we’ve chosen the declarative way so that we can put the resource under version control. Create a YAML file and add the following contents:
apiVersion: v1
kind: ConfigMap
metadata:
  name: envoyconfig
  namespace: default
data:
  envoy.yaml: |-
    static_resources:
      listeners:
      - address:
          socket_address:
            address: 0.0.0.0
            port_value: 80
        filter_chains:
        - filters:
          - name: envoy.http_connection_manager
            typed_config:
              "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
              codec_type: auto
              stat_prefix: ingress_http
              route_config:
                name: local_route
                virtual_hosts:
                - name: service
                  domains:
                  - "*"
                  routes:
                  - match:
                      prefix: "/hello"
                    route:
                      cluster: local_service
              http_filters:
              - name: envoy.router
                typed_config: {}
      clusters:
      - name: local_service
        connect_timeout: 0.25s
        type: strict_dns
        lb_policy: round_robin
        load_assignment:
          cluster_name: local_service
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: 127.0.0.1
                    port_value: 3000
    admin:
      access_log_path: "/dev/null"
      address:
        socket_address:
          address: 0.0.0.0
          port_value: 8081
- The only change we made to the Envoy configuration is switching the socket_address of the upstream service to 127.0.0.1 instead of the Docker bridge address. The reason is that Envoy now operates as a sidecar container: it shares the Pod’s network namespace with the application container, so it can reach it on localhost.
- The third change we need to make is in the deployment file: we add the sidecar container to the Pod, configure our Service to use the sidecar container as its backend instead of the application container, and expose a second Service port for the admin interface on 8081. Our deployment file should look as follows:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bdgreeter
  labels:
    app: bdgreeter
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bdgreeter
  template:
    metadata:
      labels:
        app: bdgreeter
    spec:
      containers:
      - name: app
        image: magalixcorp/birthdaygreeter
      - name: envoy
        image: envoyproxy/envoy-alpine
        args:
        - "-c"
        - "/envoy-config/envoy.yaml"
        volumeMounts:
        - name: envoy-config
          mountPath: /envoy-config
      volumes:
      - name: envoy-config
        configMap:
          name: envoyconfig
---
apiVersion: v1
kind: Service
metadata:
  name: bdgreeter-svc
spec:
  selector:
    app: bdgreeter
  ports:
  - name: http
    port: 80
    targetPort: 80
    protocol: TCP
  - name: admin
    port: 8081
    targetPort: 8081
  type: LoadBalancer
Now, we need to apply both files to our cluster:
$ kubectl apply -f configmap.yml
$ kubectl apply -f bdaygreeter.yaml
We’re using LoadBalancer as the Service type this time; let’s get the external IP:
$ kubectl get svc
NAME            TYPE           CLUSTER-IP   EXTERNAL-IP     PORT(S)                       AGE
bdgreeter-svc   LoadBalancer   10.0.26.74   20.185.12.166   80:30652/TCP,8081:30957/TCP   24m
kubernetes      ClusterIP      10.0.0.1     <none>          443/TCP                       86m
Now, let’s test our work:
$ curl -XPOST --data '{"dateOfBirth":"2020-01-26"}' 20.185.12.166/hello/Magalixcorp
{"message":"Hello, Magalixcorp! Happy birthday"}
You can also have a look at the admin dashboard at http://20.185.12.166:8081/, where endpoints such as /stats and /clusters expose live statistics about the proxy and its upstream clusters.
TL;DR
Service meshes are among the trending tech topics today. In this article, we gave an overview of what service meshes are and how they’re typically used. We also went through a practical hands-on lab using the very popular sidecar proxy, Envoy. In the lab, we deployed Envoy as a standalone container in our local environment, as well as a sidecar container in a Kubernetes Deployment. Envoy is written in C++, so it’s very fast, and it offers a myriad of features. At its core, Envoy uses filters to apply various operations to traffic as it arrives, and it’s capable of distributing requests to multiple upstream endpoints using several load balancing policies.