Standard Error Logs

Prometheus logs a number of lifecycle events (such as startup and shutdown), as well as certain errors and other unusual activity, to the standard error (stderr) file descriptor. For example, when you start Prometheus in a terminal, you will see output like the following:

ts=2022-08-20T14:59:30.969Z caller=main.go:495 level=info msg="No time or size retention was set so using the default time retention" duration=15d
ts=2022-08-20T14:59:30.969Z caller=main.go:539 level=info msg="Starting Prometheus Server" mode=server version="(version=2.38.0, branch=HEAD, revision=818d6e60888b2a3ea363aee8a9828c7bafd73699)"
ts=2022-08-20T14:59:30.969Z caller=main.go:544 level=info build_context="(go=go1.18.5, user=root@e6b781f65453, date=20220816-13:23:14)"
ts=2022-08-20T14:59:30.969Z caller=main.go:545 level=info host_details="(Linux 4.15.0-142-generic #146-Ubuntu SMP Tue Apr 13 01:11:19 UTC 2021 x86_64 cef9611c32ee (none))"
ts=2022-08-20T14:59:30.969Z caller=main.go:546 level=info fd_limits="(soft=1048576, hard=1048576)"
ts=2022-08-20T14:59:30.969Z caller=main.go:547 level=info vm_limits="(soft=unlimited, hard=unlimited)"
ts=2022-08-20T14:59:30.971Z caller=web.go:553 level=info component=web msg="Start listening for connections" address=0.0.0.0:9090
ts=2022-08-20T14:59:30.972Z caller=main.go:976 level=info msg="Starting TSDB ..."
ts=2022-08-20T14:59:30.972Z caller=tls_config.go:195 level=info component=web msg="TLS is disabled." http2=false
ts=2022-08-20T14:59:30.974Z caller=head.go:495 level=info component=tsdb msg="Replaying on-disk memory mappable chunks if any"
ts=2022-08-20T14:59:30.974Z caller=head.go:538 level=info component=tsdb msg="On-disk memory mappable chunks replay completed" duration=1.499µs
ts=2022-08-20T14:59:30.974Z caller=head.go:544 level=info component=tsdb msg="Replaying WAL, this may take a while"
ts=2022-08-20T14:59:30.974Z caller=head.go:615 level=info component=tsdb msg="WAL segment loaded" segment=0 maxSegment=0
ts=2022-08-20T14:59:30.974Z caller=head.go:621 level=info component=tsdb msg="WAL replay completed" checkpoint_replay_duration=38.18µs wal_replay_duration=106.071µs total_replay_duration=185.691µs
ts=2022-08-20T14:59:30.975Z caller=main.go:997 level=info fs_type=EXT4_SUPER_MAGIC
ts=2022-08-20T14:59:30.975Z caller=main.go:1000 level=info msg="TSDB started"
ts=2022-08-20T14:59:30.975Z caller=main.go:1181 level=info msg="Loading configuration file" filename=/etc/prometheus/prometheus.yml
ts=2022-08-20T14:59:30.975Z caller=main.go:1218 level=info msg="Completed loading of configuration file" filename=/etc/prometheus/prometheus.yml totalDuration=427.463µs db_storage=565ns remote_storage=1.298µs web_handler=273ns query_engine=592ns scrape=160.516µs scrape_sd=15.985µs notify=17.577µs notify_sd=6.66µs rules=982ns tracing=4.314µs
ts=2022-08-20T14:59:30.975Z caller=main.go:961 level=info msg="Server is ready to receive web requests."
ts=2022-08-20T14:59:30.975Z caller=manager.go:941 level=info component="rule manager" msg="Starting rule manager..."

Interpreting log lines

Prometheus logs information in a structured key/value output format.

Some keys are always present in a log line:

  • ts: The timestamp at which the line was logged.
  • caller: The source code file and line number of the logging statement (for example, main.go:495).
  • level: The severity level of the log message (debug, info, warn, or error).

Other keys depend on the specific component and logging statement. Often you will find:

  • component: The component inside Prometheus that is emitting the log line.
  • msg: A human-readable message of what happened.
  • err: An error message associated with something that went wrong.

You will also find other custom fields on some messages that provide extra statistics about the logged event.
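As a rough illustration of the key/value structure described above, the following Python sketch splits one of the sample lines into a dictionary. The function name parse_logfmt is hypothetical, and the sketch assumes that shell-style quoting rules (as implemented by Python's shlex module) are close enough to the log format for simple lines; it is not a complete parser for every possible log line.

```python
import shlex


def parse_logfmt(line: str) -> dict[str, str]:
    """Split one key=value log line into a dict of strings (hypothetical helper)."""
    fields = {}
    # shlex.split honors double quotes, so msg="TSDB started" stays one token.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # ignore stray tokens that contain no "="
            fields[key] = value
    return fields


# Sample line taken from the startup output shown earlier.
line = 'ts=2022-08-20T14:59:30.975Z caller=main.go:1000 level=info msg="TSDB started"'
record = parse_logfmt(line)
print(record["level"], record["msg"])
```

All values come back as strings; converting durations or timestamps into richer types is left out of this sketch.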

Lines with a level=error field are especially useful for finding things that have gone wrong, such as:

  • Errors while loading a new configuration file
  • Errors writing to the TSDB
  • Errors writing samples to a remote storage endpoint
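One possible way to pull such lines out of captured log output is sketched below in Python. The helper lines_with_level is a hypothetical name, and the error line in the sample data is made up for this example; it is not real Prometheus output. The sketch assumes the level key appears before any quoted message text that might itself contain "level=".

```python
import re

# Matches level=<name> as a standalone key, not as the tail of a longer key.
LEVEL_RE = re.compile(r"(?:^|\s)level=(\w+)")


def lines_with_level(lines, wanted="error"):
    """Yield only the lines whose level field equals `wanted` (hypothetical helper)."""
    for line in lines:
        match = LEVEL_RE.search(line)
        if match and match.group(1) == wanted:
            yield line.rstrip("\n")


sample = [
    # Taken from the startup output shown earlier.
    'ts=2022-08-20T14:59:30.975Z caller=main.go:1000 level=info msg="TSDB started"',
    # Hypothetical line, invented for illustration only.
    'ts=2022-08-20T15:00:00.000Z caller=example.go:1 level=error msg="example failure" err="example error"',
]

for line in lines_with_level(sample):
    print(line)
```

Because the function accepts any iterable of strings, an open file or sys.stdin could be passed in place of the sample list.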

Note that at the default log level, Prometheus does not log individual target scrape failures. Scrape failures usually point to a problem with the network or the target rather than with Prometheus itself, and logging each one would be overly verbose.

Configuring logging

By default, Prometheus only outputs log lines that have a severity level of info or above. This means that debug-level log statements are not visible. To change this behavior, use the --log.level flag to set the minimum log level shown to something other than info (allowed values are debug, info, warn, and error).

For example, to also show debug log lines, you could start Prometheus like this:

./prometheus --log.level=debug

NOTE: This will produce excessively noisy debug output, so you will likely not want to run Prometheus with a debug log level in normal operation. However, it can sometimes be useful to temporarily turn this on to diagnose specific problems.

Prometheus also allows you to configure the log output format by supplying a --log.format flag. The default value of logfmt produces the structured log output we saw above, while a value of json produces JSON-formatted logs:

{"caller":"main.go:495","duration":"15d","level":"info","msg":"No time or size retention was set so using the default time retention","ts":"2022-08-20T15:14:00.004Z"}
{"caller":"main.go:539","level":"info","mode":"server","msg":"Starting Prometheus Server","ts":"2022-08-20T15:14:00.004Z","version":"(version=2.38.0, branch=HEAD, revision=818d6e60888b2a3ea363aee8a9828c7bafd73699)"}
{"build_context":"(go=go1.18.5, user=root@e6b781f65453, date=20220816-13:23:14)","caller":"main.go:544","level":"info","ts":"2022-08-20T15:14:00.004Z"}
{"caller":"main.go:545","host_details":"(Linux 4.15.0-142-generic #146-Ubuntu SMP Tue Apr 13 01:11:19 UTC 2021 x86_64 77cdb1e158c9 (none))","level":"info","ts":"2022-08-20T15:14:00.004Z"}
{"caller":"main.go:546","fd_limits":"(soft=1048576, hard=1048576)","level":"info","ts":"2022-08-20T15:14:00.004Z"}
{"caller":"main.go:547","level":"info","ts":"2022-08-20T15:14:00.004Z","vm_limits":"(soft=unlimited, hard=unlimited)"}
{"address":"0.0.0.0:9090","caller":"web.go:553","component":"web","level":"info","msg":"Start listening for connections","ts":"2022-08-20T15:14:00.007Z"}
{"caller":"main.go:976","level":"info","msg":"Starting TSDB ...","ts":"2022-08-20T15:14:00.008Z"}
{"caller":"tls_config.go:195","component":"web","http2":false,"level":"info","msg":"TLS is disabled.","ts":"2022-08-20T15:14:00.009Z"}
{"caller":"head.go:495","component":"tsdb","level":"info","msg":"Replaying on-disk memory mappable chunks if any","ts":"2022-08-20T15:14:00.010Z"}
{"caller":"head.go:538","component":"tsdb","duration":"1.551µs","level":"info","msg":"On-disk memory mappable chunks replay completed","ts":"2022-08-20T15:14:00.010Z"}
{"caller":"head.go:544","component":"tsdb","level":"info","msg":"Replaying WAL, this may take a while","ts":"2022-08-20T15:14:00.010Z"}
{"caller":"head.go:615","component":"tsdb","level":"info","maxSegment":0,"msg":"WAL segment loaded","segment":0,"ts":"2022-08-20T15:14:00.010Z"}
{"caller":"head.go:621","checkpoint_replay_duration":"20.259µs","component":"tsdb","level":"info","msg":"WAL replay completed","total_replay_duration":"187.808µs","ts":"2022-08-20T15:14:00.010Z","wal_replay_duration":"145.857µs"}
{"caller":"main.go:997","fs_type":"EXT4_SUPER_MAGIC","level":"info","ts":"2022-08-20T15:14:00.011Z"}
{"caller":"main.go:1000","level":"info","msg":"TSDB started","ts":"2022-08-20T15:14:00.011Z"}
{"caller":"main.go:1181","filename":"/etc/prometheus/prometheus.yml","level":"info","msg":"Loading configuration file","ts":"2022-08-20T15:14:00.011Z"}
{"caller":"main.go:1218","db_storage":"598ns","filename":"/etc/prometheus/prometheus.yml","level":"info","msg":"Completed loading of configuration file","notify":"26.076µs","notify_sd":"5.446µs","query_engine":"694ns","remote_storage":"1.149µs","rules":"887ns","scrape":"183.083µs","scrape_sd":"24.791µs","totalDuration":"432.148µs","tracing":"4.27µs","ts":"2022-08-20T15:14:00.012Z","web_handler":"306ns"}
{"caller":"main.go:961","level":"info","msg":"Server is ready to receive web requests.","ts":"2022-08-20T15:14:00.012Z"}
{"caller":"manager.go:941","component":"rule manager","level":"info","msg":"Starting rule manager...","ts":"2022-08-20T15:14:00.012Z"}

This can be helpful if you are ingesting your Prometheus logs into a log-processing system that prefers JSON-based input.
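To give a sense of why JSON output is convenient for downstream tooling, here is a minimal Python sketch that decodes one of the JSON lines above using only the standard library. The function name parse_json_log is hypothetical, and the timestamp format string is an assumption based on the sample output shown here (UTC with a trailing Z and fractional seconds).

```python
import json
from datetime import datetime, timezone


def parse_json_log(line: str) -> dict:
    """Decode one JSON log line and convert ts to a datetime (hypothetical helper)."""
    record = json.loads(line)
    # Assumed format, based on the sample lines: UTC, fractional seconds, "Z" suffix.
    parsed = datetime.strptime(record["ts"], "%Y-%m-%dT%H:%M:%S.%fZ")
    record["ts"] = parsed.replace(tzinfo=timezone.utc)
    return record


# Sample line taken from the JSON output shown above.
line = '{"caller":"main.go:1000","level":"info","msg":"TSDB started","ts":"2022-08-20T15:14:00.011Z"}'
record = parse_json_log(line)
print(record["level"], record["msg"], record["ts"].isoformat())
```

Unlike the key/value format, no custom quoting logic is needed here, and values such as http2 or segment keep their JSON types (boolean, number) after decoding.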