Service Discovery

Modern dynamic IT environments create new challenges for monitoring systems:

  • On-demand VMs on cloud providers are scaled up and down as required.
  • Service instances are dynamically scheduled onto hosts by container orchestrators such as Kubernetes, Docker Swarm, or Mesos.
  • The trend towards microservices leads to an ever-growing number of individual services to operate and monitor.

The question arises: How can a monitoring system still make sense of this dynamic world? How does it know which machines or service instances should currently exist, what their identity is, and how to fetch metrics from them? An operator can no longer configure this information statically, because it is too complex and changes too rapidly.

To solve this problem, Prometheus integrates with common service discovery providers in your infrastructure and continuously discovers and updates its monitoring targets from an external source of truth:

Figure: Service discovery in the Prometheus architecture

Prometheus supports multiple built-in service discovery mechanisms, for example:

  • Discovering VMs on cloud providers (AWS, Azure, Google, …)
  • Discovering service instances on cluster orchestrators (Kubernetes, Marathon, …)
  • Discovering targets using generic lookup methods such as DNS, Consul, ZooKeeper, or custom discovery mechanisms
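As an illustration of a custom discovery mechanism, the following minimal sketch renders a list of instances into the JSON target-group format that Prometheus's file-based service discovery reads (a list of objects with "targets" and "labels"). The instance inventory, the helper names, and the output path are assumptions made for this example; in practice the inventory would come from your own source of truth.

```python
import json
import os
import tempfile


def render_target_groups(instances):
    """Group "host:port" addresses by label set, in file-based SD format."""
    groups = {}
    for inst in instances:
        # Instances that share the same labels end up in the same target group.
        key = tuple(sorted(inst["labels"].items()))
        groups.setdefault(key, []).append(f'{inst["host"]}:{inst["port"]}')
    return [
        {"targets": sorted(targets), "labels": dict(key)}
        for key, targets in sorted(groups.items())
    ]


def write_targets_file(path, instances):
    """Write the target groups via a temp file and rename, so that a reader
    never observes a partially written file."""
    directory = os.path.dirname(os.path.abspath(path))
    with tempfile.NamedTemporaryFile(
        "w", dir=directory, suffix=".tmp", delete=False
    ) as tmp:
        json.dump(render_target_groups(instances), tmp, indent=2)
        tmp_path = tmp.name
    os.replace(tmp_path, path)
```

A scrape configuration would then reference the written file through a `file_sd_configs` entry. Re-running `write_targets_file` whenever the inventory changes updates the target list without restarting Prometheus.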

Prometheus uses service discovery for three distinct yet related purposes:

  • To build a view of which targets should exist, so that it can record and alert when one is missing
  • To learn how to pull metrics from each target via HTTP (for example, its address and metrics path)
  • To enrich the series collected from a target with metadata about that target, attached as labels
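The three purposes can be made concrete with a small model. The sketch below is a simplification and not Prometheus's actual implementation: it treats each discovered target as a plain label dictionary and relies on the special labels `__address__`, `__scheme__`, and `__metrics_path__`, with `http` and `/metrics` as defaults. The function names and sample data are hypothetical.

```python
def scrape_url(labels):
    """Purpose 2: derive the HTTP endpoint to pull metrics from."""
    scheme = labels.get("__scheme__", "http")
    path = labels.get("__metrics_path__", "/metrics")
    return f'{scheme}://{labels["__address__"]}{path}'


def series_labels(labels):
    """Purpose 3: labels attached to every series scraped from the target.

    Labels prefixed with "__" are internal and are dropped; "instance"
    defaults to the target's address if it is not set explicitly.
    """
    kept = {k: v for k, v in labels.items() if not k.startswith("__")}
    kept.setdefault("instance", labels["__address__"])
    return kept


def up_samples(discovered, reachable):
    """Purpose 1: one `up`-style value per target that should exist.

    A target that was discovered but could not be scraped still gets a
    value (0), which is what makes alerting on missing targets possible.
    """
    return {
        series_labels(t)["instance"]: 1 if t["__address__"] in reachable else 0
        for t in discovered
    }
```

For instance, a discovered target `{"__address__": "10.0.0.2:9100", "env": "prod"}` would be scraped at `http://10.0.0.2:9100/metrics`, and its series would carry the labels `env="prod"` and `instance="10.0.0.2:9100"`.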

In this way, Prometheus uses service discovery as a source of truth to reliably monitor dynamic environments while minimizing administrative overhead.