Summary

In this training, you learned:

  • How to debug many aspects of Prometheus' behavior directly from its web-based status pages,
  • How to configure and interpret Prometheus' various logging options to understand its run-time behavior and to obtain statistics about executed queries,
  • How to build a robust meta-monitoring setup to monitor Prometheus and Alertmanager servers based on their own metrics,
  • How to profile Prometheus and Alertmanager servers to diagnose memory and CPU usage issues and to inspect goroutine states.

You should now be able to monitor and debug your Prometheus alerting setups effectively.