Envoy is the new kid on the block when it comes to inter-service communication. It has everything you'd want in a smart network, but it'll cost you in terms of complexity - and it's likely solving a problem you don't really have.

As you do your investigation and prototyping to help decide if the cost is worth it, I suggest starting with the following question:

Am I happy with my service discovery system?

In many ways, service discovery is at the heart of Envoy. If a product is going to create a mesh between containers and act as a smart communication bus between them, it needs to be able to find all those containers and address them directly.

Directly, here, is critical. Envoy needs something - an API or a DNS server - which can return the IPs of all containers[1] of a particular service. In the simplest case, this could even be a configuration file in each Envoy container, so long as you have a way to keep it up-to-date. The IP of a load-balancer isn't sufficient here, as the whole point of Envoy is that it does its own load-balancing across long-running HTTP/2 connections.
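A quick way to test whether your discovery system meets this bar is to ask it for a service and look at what comes back. The sketch below assumes DNS-based discovery; the function name `resolve_endpoints`, the service name `users.internal`, and the stub resolver are all hypothetical and exist only so the example runs offline. In practice you would drop the `resolver` argument and let it use `socket.getaddrinfo` against your real DNS.

```python
import socket


def resolve_endpoints(service_name, port, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPs a resolver reports for a service."""
    records = resolver(service_name, port, type=socket.SOCK_STREAM)
    # Each record is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({record[4][0] for record in records})


def stub_resolver(host, port, type=0):
    """Stand-in for DNS so the example runs offline; the IPs are made up."""
    ips = ["10.0.0.2", "10.0.0.1", "10.0.0.1"]
    return [(socket.AF_INET, type, 6, "", (ip, port)) for ip in ips]


endpoints = resolve_endpoints("users.internal", 8080, resolver=stub_resolver)
print(endpoints)  # ['10.0.0.1', '10.0.0.2']
```

If a lookup like this returns a single address for a service that you know runs several containers, the name is probably pointing at a load-balancer rather than at the containers themselves.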

If you don't have this, there's a good chance that Envoy is more than you need. For many use-cases, putting containers behind load-balancers and then distributing those load-balancer addresses is a great solution.


As an example of why service discovery is so core, let's consider how to deploy a new container. In a traditional architecture, we[2] bring a new container online by making an API call to register it with the load-balancer, waiting for health-checks to pass, and then making a second API call to drain connections from the old container and remove it.
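To make those steps concrete, here is a minimal sketch of that traditional flow. `InMemoryLoadBalancer` is a hypothetical stand-in for whatever load-balancer API you use, not a real client library, and its health check is faked; a real script would also sleep between polls.

```python
class InMemoryLoadBalancer:
    """Hypothetical stand-in for a load-balancer API; it only tracks state."""

    def __init__(self, targets):
        self.targets = set(targets)
        self.calls = []

    def register(self, ip):
        self.calls.append(("register", ip))
        self.targets.add(ip)

    def is_healthy(self, ip):
        # A real load-balancer would probe the container; this one just checks membership.
        return ip in self.targets

    def drain_and_deregister(self, ip):
        self.calls.append(("drain_and_deregister", ip))
        self.targets.discard(ip)


def replace_container(lb, old_ip, new_ip, max_checks=10):
    """Swap old_ip for new_ip using explicit load-balancer calls."""
    lb.register(new_ip)  # first API call: bring the new container online
    for _ in range(max_checks):  # wait for health checks to pass
        if lb.is_healthy(new_ip):
            break
    else:
        raise TimeoutError(f"{new_ip} never became healthy")
    lb.drain_and_deregister(old_ip)  # second API call: retire the old container


lb = InMemoryLoadBalancer(["10.0.0.1"])
replace_container(lb, old_ip="10.0.0.1", new_ip="10.0.0.2")
print(sorted(lb.targets))  # ['10.0.0.2']
```

The point to notice is that the deploy tooling has to know about the load-balancer and drive every step itself.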

In the world of Envoy - where routing is powered by service discovery - sending traffic to a new container is a side-effect of launching it. Once the new container is running it'll show up in service discovery, which will cause Envoy to start health-checking it, which will result in it getting traffic. Similarly, terminating the old container will cause it to fall out of service discovery and fail health checks, either of which will cause it to no longer get traffic.
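The toy model below illustrates that side-effect behaviour. It is not Envoy's API or implementation: `ToyMesh`, its round-robin policy, and the in-memory `alive` set are assumptions made purely for illustration. The only inputs are a discovery snapshot and a health check, and the routable set falls out of those two.

```python
import itertools


class ToyMesh:
    """Toy model of discovery-driven routing; not how Envoy is implemented."""

    def __init__(self, health_check):
        self.health_check = health_check  # callable: ip -> bool
        self.routable = []
        self._cycle = itertools.cycle(self.routable)

    def refresh(self, discovered_ips):
        """Rebuild the routable set from the latest discovery snapshot."""
        self.routable = sorted(
            ip for ip in set(discovered_ips) if self.health_check(ip)
        )
        self._cycle = itertools.cycle(self.routable)

    def pick(self):
        """Choose the next host, round-robin over the routable hosts."""
        if not self.routable:
            raise RuntimeError("no healthy hosts")
        return next(self._cycle)


alive = {"10.0.0.1"}  # containers that currently answer health checks
mesh = ToyMesh(health_check=lambda ip: ip in alive)
mesh.refresh(["10.0.0.1"])

alive.add("10.0.0.2")  # launch the new container
mesh.refresh(["10.0.0.1", "10.0.0.2"])
print(mesh.routable)  # ['10.0.0.1', '10.0.0.2']

alive.discard("10.0.0.1")  # terminate the old container
mesh.refresh(["10.0.0.2"])
print(mesh.routable)  # ['10.0.0.2']
```

Nothing in the deploy step talks to the mesh directly; launching and terminating containers is enough to change where traffic goes.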

I'm convinced

Even if you have a great service discovery system in place, Envoy may not be for you. It's a very new piece of technology that was written with some specific problems in mind, and there's a good chance that you don't have those problems. If you think you do, however, it's got some great backers and shows huge promise.

If you don't have a service discovery system in place that you're happy with, you're probably best served by continuing with what you have, or starting with something simpler. When your system gets complex enough that you've had some other reason to get good at service discovery, then it might be time to take another look at Envoy.

  1. Envoy is often used with Docker - since it's designed to support hosts appearing and disappearing at will - but this equally applies to VMs or instances. All Envoy requires is that each host be independently addressable.

  2. Through scripts, via automated services like CloudFormation's rolling update, by platforms like ECS, etc.