Inspecting Goroutine States
Sometimes it can be useful to get a view of all running goroutines: how many there are, which function they are currently executing, and what their current status is (running, waiting for a channel receive / send operation, waiting for a mutex, waiting for a syscall, etc.). For example, this information can help you figure out why certain parts of Prometheus (like a scrape loop) are stuck, or whether there is a goroutine leak.
The /debug/pprof/goroutine
endpoint gives us exactly this information, and it is useful directly from the web browser.
By default, this endpoint will return a text format like this:
goroutine profile: total 46
7 @ 0x43af96 0x433c17 0x465889 0x4dbdb2 0x4dd11a 0x4dd108 0x51ebe9 0x531185 0x6ff62e 0x5a9023 0x5a917d 0x70044c 0x46b2a1
# 0x465888 internal/poll.runtime_pollWait+0x88 /usr/local/go/src/runtime/netpoll.go:302
# 0x4dbdb1 internal/poll.(*pollDesc).wait+0x31 /usr/local/go/src/internal/poll/fd_poll_runtime.go:83
# 0x4dd119 internal/poll.(*pollDesc).waitRead+0x259 /usr/local/go/src/internal/poll/fd_poll_runtime.go:88
# 0x4dd107 internal/poll.(*FD).Read+0x247 /usr/local/go/src/internal/poll/fd_unix.go:167
# 0x51ebe8 net.(*netFD).Read+0x28 /usr/local/go/src/net/fd_posix.go:55
# 0x531184 net.(*conn).Read+0x44 /usr/local/go/src/net/net.go:183
# 0x6ff62d net/http.(*persistConn).Read+0x4d /usr/local/go/src/net/http/transport.go:1929
# 0x5a9022 bufio.(*Reader).fill+0x102 /usr/local/go/src/bufio/bufio.go:106
# 0x5a917c bufio.(*Reader).Peek+0x5c /usr/local/go/src/bufio/bufio.go:144
# 0x70044b net/http.(*persistConn).readLoop+0x1ab /usr/local/go/src/net/http/transport.go:2093
7 @ 0x43af96 0x44aa12 0x22c33ad 0x46b2a1
# 0x22c33ac github.com/prometheus/prometheus/scrape.(*scrapeLoop).run+0x40c /app/scrape/scrape.go:1207
7 @ 0x43af96 0x44aa12 0x702135 0x46b2a1
# 0x702134 net/http.(*persistConn).writeLoop+0xf4 /usr/local/go/src/net/http/transport.go:2392
2 @ 0x43af96 0x44aa12 0xb027f7 0x46b2a1
# 0xb027f6 github.com/prometheus/prometheus/discovery/legacymanager.(*Manager).sender+0xd6 /app/discovery/legacymanager/manager.go:235
1 @ 0x40c0f4 0x4677ef 0x75d759 0x46b2a1
# 0x4677ee os/signal.signal_recv+0x2e /usr/local/go/src/runtime/sigqueue.go:151
# 0x75d758 os/signal.loop+0x18 /usr/local/go/src/os/signal/signal_unix.go:23
[...]
However, by appending a ?debug=2
parameter to the URL (yielding /debug/pprof/goroutine?debug=2
), the format becomes much more human-readable and useful:
goroutine 7092180 [running]:
runtime/pprof.writeGoroutineStacks({0x7f43546a5f08, 0xc00133d180})
/usr/local/go/src/runtime/pprof/pprof.go:694 +0x70
runtime/pprof.writeGoroutine({0x7f43546a5f08?, 0xc00133d180?}, 0x10?)
/usr/local/go/src/runtime/pprof/pprof.go:683 +0x2b
runtime/pprof.(*Profile).WriteTo(0x2af4de0?, {0x7f43546a5f08?, 0xc00133d180?}, 0xc?)
/usr/local/go/src/runtime/pprof/pprof.go:332 +0x14b
net/http/pprof.handler.ServeHTTP({0xc004aa24d5, 0x9}, {0x7f43546a5e38, 0xc00133d180}, 0xc001085100?)
/usr/local/go/src/net/http/pprof/pprof.go:253 +0x4a5
net/http/pprof.Index({0x7f43546a5e38?, 0xc00133d180}, 0xc0023f7300)
/usr/local/go/src/net/http/pprof/pprof.go:371 +0x13e
github.com/prometheus/prometheus/web.serveDebug({0x7f43546a5e38, 0xc00133d180}, 0xc0023f7300)
/app/web/web.go:503 +0x2c5
net/http.HandlerFunc.ServeHTTP(0x7f43546a5e38?, {0x7f43546a5e38?, 0xc00133d180?}, 0x21afc1e?)
/usr/local/go/src/net/http/server.go:2084 +0x2f
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerResponseSize.func1({0x7f43546a5e38?, 0xc00133d130?}, 0xc0023f7300)
/go/pkg/mod/github.com/prometheus/client_golang@v1.12.1/prometheus/promhttp/instrument_server.go:239 +0xc5
net/http.HandlerFunc.ServeHTTP(0x14b6cb0?, {0x7f43546a5e38?, 0xc00133d130?}, 0xc002c50bd0?)
/usr/local/go/src/net/http/server.go:2084 +0x2f
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerDuration.func2({0x7f43546a5e38, 0xc00133d130}, 0xc0023f7300)
/go/pkg/mod/github.com/prometheus/client_golang@v1.12.1/prometheus/promhttp/instrument_server.go:84 +0xbf
net/http.HandlerFunc.ServeHTTP(0x3775d40?, {0x7f43546a5e38?, 0xc00133d130?}, 0x7f492db6a5b8?)
/usr/local/go/src/net/http/server.go:2084 +0x2f
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1({0x3775d40?, 0xc002d63980?}, 0xc0023f7300)
/go/pkg/mod/github.com/prometheus/client_golang@v1.12.1/prometheus/promhttp/instrument_server.go:117 +0xaa
github.com/prometheus/prometheus/web.setPathWithPrefix.func1.1({0x3775d40, 0xc002d63980}, 0xc0023f6f00)
/app/web/web.go:783 +0x290
github.com/prometheus/common/route.(*Router).handle.func1({0x3775d40, 0xc002d63980}, 0xc0023f6e00, {0xc00385dca0, 0x1, 0xa?})
/go/pkg/mod/github.com/prometheus/common@v0.34.0/route/route.go:83 +0x2ae
github.com/julienschmidt/httprouter.(*Router).ServeHTTP(0xc0000dde00, {0x3775d40, 0xc002d63980}, 0xc0023f6e00)
/go/pkg/mod/github.com/julienschmidt/httprouter@v1.3.0/router.go:387 +0x82b
github.com/prometheus/common/route.(*Router).ServeHTTP(0xc001ed5560?, {0x3775d40?, 0xc002d63980?}, 0xc0010856d0?)
/go/pkg/mod/github.com/prometheus/common@v0.34.0/route/route.go:126 +0x26
net/http.(*ServeMux).ServeHTTP(0x37760f8?, {0x3775d40, 0xc002d63980}, 0xc0023f6e00)
/usr/local/go/src/net/http/server.go:2462 +0x149
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*Handler).ServeHTTP(0xc00021c5a0, {0x3774750?, 0xc0012110a0}, 0xc0023f6200)
/go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp@v0.31.0/handler.go:191 +0xcf1
github.com/prometheus/prometheus/web.withStackTracer.func1({0x3774750?, 0xc0012110a0?}, 0xc000859f10?)
/app/web/web.go:105 +0x97
net/http.HandlerFunc.ServeHTTP(0xc003e5d27b?, {0x3774750?, 0xc0012110a0?}, 0xc000421400?)
/usr/local/go/src/net/http/server.go:2084 +0x2f
net/http.serverHandler.ServeHTTP({0xc001366600?}, {0x3774750, 0xc0012110a0}, 0xc0023f6200)
/usr/local/go/src/net/http/server.go:2916 +0x43b
net/http.(*conn).serve(0xc0027320a0, {0x37760f8, 0xc000b1ddd0})
/usr/local/go/src/net/http/server.go:1966 +0x5d7
created by net/http.(*Server).Serve
/usr/local/go/src/net/http/server.go:3071 +0x4db
goroutine 1 [chan receive, 158485 minutes]:
github.com/oklog/run.(*Group).Run(0xc000283ab8)
/go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:43 +0x7c
main.main()
/app/cmd/prometheus/main.go:1096 +0x9165
goroutine 35 [select]:
go.opencensus.io/stats/view.(*worker).start(0xc0000ace00)
/go/pkg/mod/go.opencensus.io@v0.23.0/stats/view/worker.go:276 +0xad
created by go.opencensus.io/stats/view.init.0
/go/pkg/mod/go.opencensus.io@v0.23.0/stats/view/worker.go:34 +0x8d
goroutine 37 [select, 158485 minutes]:
main.main.func7()
/app/cmd/prometheus/main.go:794 +0xa8
github.com/oklog/run.(*Group).Run.func1({0xc000114040?, 0xc000696010?})
/go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:38 +0x2f
created by github.com/oklog/run.(*Group).Run
/go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:37 +0x22a
[...]
This output lists the call stack for each goroutine, preceded by a goroutine <goroutine-number> [<goroutine status>]
line. The status of a goroutine indicates whether it is currently running or waiting for some external action or condition (e.g. [chan receive, 158485 minutes]
, [select]
, etc.). If you see a goroutine here that is waiting for an unexpectedly long time, this might point to a problem.