Data Model
Prometheus stores time series: streams of numeric values, each sampled at a successive timestamp.
At a high level, each time series consists of an identifier and a sequence of timestamped samples.
Since Prometheus 3.0, the data model allows arbitrary UTF-8 characters in metric names and label names. However, using this extended character set comes with some caveats described further below.
Identifying series
Prometheus uniquely identifies every series by its metric name and a set of key/value pairs called labels. For example, one of the series identifiers in the diagram above is:
http_requests_total{job="api-server",instance="10.0.0.1:443",method="GET"}
Prometheus automatically creates and indexes series identifiers in its TSDB the first time it sees them, so there is no explicit schema you need to predefine.
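To make the identity rule concrete, here is a minimal Python sketch of a toy in-memory store. It is not how the Prometheus TSDB is implemented; `ToyStore`, `series_id`, and the sample timestamps and values are hypothetical names and data chosen for illustration. It shows two things: the label set is order-independent, and a series comes into existence the first time a sample for it is appended.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical identifier type: metric name plus an order-independent label set.
SeriesID = Tuple[str, FrozenSet[Tuple[str, str]]]


def series_id(metric_name: str, labels: Dict[str, str]) -> SeriesID:
    """Build a hashable identifier from a metric name and its labels."""
    return (metric_name, frozenset(labels.items()))


class ToyStore:
    """Illustrative in-memory store, not the real TSDB."""

    def __init__(self) -> None:
        self.series: Dict[SeriesID, List[Tuple[int, float]]] = {}

    def append(self, metric_name: str, labels: Dict[str, str],
               timestamp_ms: int, value: float) -> None:
        # setdefault creates the series on first sight; no schema step is needed.
        key = series_id(metric_name, labels)
        self.series.setdefault(key, []).append((timestamp_ms, value))


store = ToyStore()
get_labels = {"job": "api-server", "instance": "10.0.0.1:443", "method": "GET"}
store.append("http_requests_total", get_labels, 1_700_000_000_000, 1.0)

# Same labels in a different order: this lands in the same series.
reordered = {"method": "GET", "instance": "10.0.0.1:443", "job": "api-server"}
store.append("http_requests_total", reordered, 1_700_000_015_000, 2.0)

# A different label value identifies a different series.
post_labels = {"job": "api-server", "instance": "10.0.0.1:443", "method": "POST"}
store.append("http_requests_total", post_labels, 1_700_000_015_000, 1.0)
```

After these three appends the toy store holds two series: the `GET` series with two samples and the `POST` series with one.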
Metric names
The metric name identifies the overall aspect of a system that we are measuring. For example, the metric name http_requests_total indicates the total number of HTTP requests handled by a given server process, while process_resident_memory_bytes would indicate the amount of resident memory (in bytes) that a process is currently using.
Labels
Labels allow you to split up, or partition, a metric into subdimensions. For example, the instance label in the example above tells you which particular instance (process) the metric came from, while the job label indicates the job, or group of processes, that the instance belongs to. The method label further subdivides the metric by the HTTP method used within the process.
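The partitioning idea can be sketched in a few lines of Python. The values below are made up, the second instance address is hypothetical, and `sum_by` is an illustrative helper rather than a Prometheus API. It is loosely similar in spirit to aggregating over a label in PromQL, but it only operates on a plain dictionary.

```python
from collections import defaultdict
from typing import Dict, Tuple

Labels = Tuple[Tuple[str, str], ...]

# Hypothetical latest values for four series of http_requests_total.
latest: Dict[Labels, float] = {
    (("instance", "10.0.0.1:443"), ("job", "api-server"), ("method", "GET")): 120.0,
    (("instance", "10.0.0.1:443"), ("job", "api-server"), ("method", "POST")): 8.0,
    (("instance", "10.0.0.2:443"), ("job", "api-server"), ("method", "GET")): 30.0,
    (("instance", "10.0.0.2:443"), ("job", "api-server"), ("method", "POST")): 4.0,
}


def sum_by(values: Dict[Labels, float], label: str) -> Dict[str, float]:
    """Add up values, grouped by the value of a single label."""
    totals: Dict[str, float] = defaultdict(float)
    for labels, value in values.items():
        totals[dict(labels)[label]] += value
    return dict(totals)


by_method = sum_by(latest, "method")
by_instance = sum_by(latest, "instance")
```

Grouping by `method` collapses the instances into one total per HTTP method, while grouping by `instance` collapses the methods into one total per process.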
Some labels may be attached by the Prometheus server according to its service discovery configuration, while others may come directly from the instrumented target itself. In the wider Prometheus ecosystem, there are also other sources of labels, and the details of how your metrics are labeled will depend on your exact setup.
Series samples
Samples form the bulk of the data of a series and are appended to an indexed series over time:
- Timestamps are 64-bit integer Unix timestamps in millisecond precision.
- Sample values can be either:
  - A 64-bit floating point number
  - An entire high-resolution native histogram
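As a rough illustration of the float sample shape described above, the following Python sketch pairs a 64-bit integer millisecond timestamp with a 64-bit float value. `FloatSample`, `encode`, and `decode` are hypothetical names, and the fixed 16-byte layout is only a way to make the field widths visible; it is not the encoding Prometheus uses in storage. Native histogram samples are left out.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class FloatSample:
    timestamp_ms: int  # Unix time in milliseconds, fits a signed 64-bit integer
    value: float       # 64-bit IEEE 754 floating point number


def encode(sample: FloatSample) -> bytes:
    # "<" = little-endian without padding, "q" = int64, "d" = float64.
    return struct.pack("<qd", sample.timestamp_ms, sample.value)


def decode(raw: bytes) -> FloatSample:
    timestamp_ms, value = struct.unpack("<qd", raw)
    return FloatSample(timestamp_ms, value)


# Hypothetical sample: a timestamp given in seconds is converted to milliseconds.
sample = FloatSample(timestamp_ms=1_700_000_000 * 1000, value=1027.0)
raw = encode(sample)
```

Each encoded sample occupies 8 bytes for the timestamp plus 8 bytes for the value, and decoding returns an equal `FloatSample`.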
New in Prometheus 3.0: Full UTF-8 support
To improve Prometheus' compatibility with OpenTelemetry metrics sources, Prometheus 3.0 introduced support for arbitrary UTF-8 characters in metric names and label names, so you are technically no longer limited to the original character set shown in the diagram above for these identifiers. However, we still recommend using the original character set for metric and label names to ensure compatibility with other systems and tools that may not support UTF-8 characters in these identifiers yet.
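If you want to check whether your identifiers stay within the original character set, a small validator is enough. The sketch below assumes the original rules are the commonly documented ones: metric names match `[a-zA-Z_:][a-zA-Z0-9_:]*` and label names match `[a-zA-Z_][a-zA-Z0-9_]*`. The function names are hypothetical helpers, not part of any Prometheus library.

```python
import re

# Assumed original (pre-UTF-8) character sets for identifiers.
LEGACY_METRIC_NAME = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")
LEGACY_LABEL_NAME = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def is_legacy_metric_name(name: str) -> bool:
    """True if the metric name uses only the original character set."""
    return LEGACY_METRIC_NAME.fullmatch(name) is not None


def is_legacy_label_name(name: str) -> bool:
    """True if the label name uses only the original character set."""
    return LEGACY_LABEL_NAME.fullmatch(name) is not None
```

For example, `http_requests_total` passes the metric name check, while a dotted name such as `my.metric` does not and would rely on the UTF-8 support.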
You will also encounter some downsides in PromQL when using the extended character set, since data selectors require more quoting and a slightly more involved syntax.
For example, if you have a selector like this for the original character set:
my_metric{my_label="value"}
...you will have to change it to the following syntax when introducing previously unsupported characters (dots instead of underscores in this example):
{"my.metric", "my.label"="value"}
As you can see, the metric name now has to be defined inside of the label matcher list, and you have to quote both the metric name and the my.label label name. This syntax is more cumbersome to write and read, so keep this in mind before deciding to go beyond the original character set for identifiers.
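If you generate selectors programmatically, the choice between the two syntaxes can be automated. The following Python sketch is one possible approach, not an official helper: `format_selector` is a hypothetical function, it handles only equality matchers, it assumes the original character set rules are `[a-zA-Z_:][a-zA-Z0-9_:]*` for metric names and `[a-zA-Z_][a-zA-Z0-9_]*` for label names, and its string escaping is simplified.

```python
import re
from typing import Dict

# Assumed original character sets for metric and label names.
_LEGACY_METRIC = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")
_LEGACY_LABEL = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def _quote(text: str) -> str:
    # Simplified escaping (assumption): only backslashes and double quotes.
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def format_selector(metric_name: str, matchers: Dict[str, str]) -> str:
    """Render an equality-only selector, quoting identifiers only when needed."""
    all_legacy = _LEGACY_METRIC.fullmatch(metric_name) is not None and all(
        _LEGACY_LABEL.fullmatch(name) is not None for name in matchers
    )
    if all_legacy:
        if not matchers:
            return metric_name
        body = ", ".join(f"{name}={_quote(value)}" for name, value in matchers.items())
        return metric_name + "{" + body + "}"

    # Extended syntax: the quoted metric name moves inside the braces.
    parts = [_quote(metric_name)]
    for name, value in matchers.items():
        label = name if _LEGACY_LABEL.fullmatch(name) else _quote(name)
        parts.append(f"{label}={_quote(value)}")
    return "{" + ", ".join(parts) + "}"


classic = format_selector("my_metric", {"my_label": "value"})
extended = format_selector("my.metric", {"my.label": "value"})
```

With these inputs, `classic` is rendered in the original form and `extended` in the quoted form shown above, which keeps the more cumbersome syntax confined to the selectors that actually need it.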