Inspecting Goroutine States

Sometimes it can be useful to get a view of all running goroutines: how many there are, which function they are currently executing, and what their current status is (running, waiting for a channel receive / send operation, waiting for a mutex, waiting for a syscall, etc.). For example, this information can help you figure out why certain parts of Prometheus (like a scrape loop) are stuck, or whether there is a goroutine leak.

The /debug/pprof/goroutine endpoint gives us exactly this information, and it is useful directly from the web browser.

By default, this endpoint will return a text format like this:

goroutine profile: total 46
7 @ 0x43af96 0x433c17 0x465889 0x4dbdb2 0x4dd11a 0x4dd108 0x51ebe9 0x531185 0x6ff62e 0x5a9023 0x5a917d 0x70044c 0x46b2a1
#	0x465888	internal/poll.runtime_pollWait+0x88		/usr/local/go/src/runtime/netpoll.go:302
#	0x4dbdb1	internal/poll.(*pollDesc).wait+0x31		/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4dd119	internal/poll.(*pollDesc).waitRead+0x259	/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4dd107	internal/poll.(*FD).Read+0x247			/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x51ebe8	net.(*netFD).Read+0x28				/usr/local/go/src/net/fd_posix.go:55
#	0x531184	net.(*conn).Read+0x44				/usr/local/go/src/net/net.go:183
#	0x6ff62d	net/http.(*persistConn).Read+0x4d		/usr/local/go/src/net/http/transport.go:1929
#	0x5a9022	bufio.(*Reader).fill+0x102			/usr/local/go/src/bufio/bufio.go:106
#	0x5a917c	bufio.(*Reader).Peek+0x5c			/usr/local/go/src/bufio/bufio.go:144
#	0x70044b	net/http.(*persistConn).readLoop+0x1ab		/usr/local/go/src/net/http/transport.go:2093

7 @ 0x43af96 0x44aa12 0x22c33ad 0x46b2a1
#	0x22c33ac	github.com/prometheus/prometheus/scrape.(*scrapeLoop).run+0x40c	/app/scrape/scrape.go:1207

7 @ 0x43af96 0x44aa12 0x702135 0x46b2a1
#	0x702134	net/http.(*persistConn).writeLoop+0xf4	/usr/local/go/src/net/http/transport.go:2392

2 @ 0x43af96 0x44aa12 0xb027f7 0x46b2a1
#	0xb027f6	github.com/prometheus/prometheus/discovery/legacymanager.(*Manager).sender+0xd6	/app/discovery/legacymanager/manager.go:235

1 @ 0x40c0f4 0x4677ef 0x75d759 0x46b2a1
#	0x4677ee	os/signal.signal_recv+0x2e	/usr/local/go/src/runtime/sigqueue.go:151
#	0x75d758	os/signal.loop+0x18		/usr/local/go/src/os/signal/signal_unix.go:23

[...]

However, by appending a ?debug=2 parameter to the URL (yielding /debug/pprof/goroutine?debug=2), the format becomes much more human-readable and useful:

goroutine 7092180 [running]:
runtime/pprof.writeGoroutineStacks({0x7f43546a5f08, 0xc00133d180})
	/usr/local/go/src/runtime/pprof/pprof.go:694 +0x70
runtime/pprof.writeGoroutine({0x7f43546a5f08?, 0xc00133d180?}, 0x10?)
	/usr/local/go/src/runtime/pprof/pprof.go:683 +0x2b
runtime/pprof.(*Profile).WriteTo(0x2af4de0?, {0x7f43546a5f08?, 0xc00133d180?}, 0xc?)
	/usr/local/go/src/runtime/pprof/pprof.go:332 +0x14b
net/http/pprof.handler.ServeHTTP({0xc004aa24d5, 0x9}, {0x7f43546a5e38, 0xc00133d180}, 0xc001085100?)
	/usr/local/go/src/net/http/pprof/pprof.go:253 +0x4a5
net/http/pprof.Index({0x7f43546a5e38?, 0xc00133d180}, 0xc0023f7300)
	/usr/local/go/src/net/http/pprof/pprof.go:371 +0x13e
github.com/prometheus/prometheus/web.serveDebug({0x7f43546a5e38, 0xc00133d180}, 0xc0023f7300)
	/app/web/web.go:503 +0x2c5
net/http.HandlerFunc.ServeHTTP(0x7f43546a5e38?, {0x7f43546a5e38?, 0xc00133d180?}, 0x21afc1e?)
	/usr/local/go/src/net/http/server.go:2084 +0x2f
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerResponseSize.func1({0x7f43546a5e38?, 0xc00133d130?}, 0xc0023f7300)
	/go/pkg/mod/github.com/prometheus/client_golang@v1.12.1/prometheus/promhttp/instrument_server.go:239 +0xc5
net/http.HandlerFunc.ServeHTTP(0x14b6cb0?, {0x7f43546a5e38?, 0xc00133d130?}, 0xc002c50bd0?)
	/usr/local/go/src/net/http/server.go:2084 +0x2f
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerDuration.func2({0x7f43546a5e38, 0xc00133d130}, 0xc0023f7300)
	/go/pkg/mod/github.com/prometheus/client_golang@v1.12.1/prometheus/promhttp/instrument_server.go:84 +0xbf
net/http.HandlerFunc.ServeHTTP(0x3775d40?, {0x7f43546a5e38?, 0xc00133d130?}, 0x7f492db6a5b8?)
	/usr/local/go/src/net/http/server.go:2084 +0x2f
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1({0x3775d40?, 0xc002d63980?}, 0xc0023f7300)
	/go/pkg/mod/github.com/prometheus/client_golang@v1.12.1/prometheus/promhttp/instrument_server.go:117 +0xaa
github.com/prometheus/prometheus/web.setPathWithPrefix.func1.1({0x3775d40, 0xc002d63980}, 0xc0023f6f00)
	/app/web/web.go:783 +0x290
github.com/prometheus/common/route.(*Router).handle.func1({0x3775d40, 0xc002d63980}, 0xc0023f6e00, {0xc00385dca0, 0x1, 0xa?})
	/go/pkg/mod/github.com/prometheus/common@v0.34.0/route/route.go:83 +0x2ae
github.com/julienschmidt/httprouter.(*Router).ServeHTTP(0xc0000dde00, {0x3775d40, 0xc002d63980}, 0xc0023f6e00)
	/go/pkg/mod/github.com/julienschmidt/httprouter@v1.3.0/router.go:387 +0x82b
github.com/prometheus/common/route.(*Router).ServeHTTP(0xc001ed5560?, {0x3775d40?, 0xc002d63980?}, 0xc0010856d0?)
	/go/pkg/mod/github.com/prometheus/common@v0.34.0/route/route.go:126 +0x26
net/http.(*ServeMux).ServeHTTP(0x37760f8?, {0x3775d40, 0xc002d63980}, 0xc0023f6e00)
	/usr/local/go/src/net/http/server.go:2462 +0x149
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp.(*Handler).ServeHTTP(0xc00021c5a0, {0x3774750?, 0xc0012110a0}, 0xc0023f6200)
	/go/pkg/mod/go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp@v0.31.0/handler.go:191 +0xcf1
github.com/prometheus/prometheus/web.withStackTracer.func1({0x3774750?, 0xc0012110a0?}, 0xc000859f10?)
	/app/web/web.go:105 +0x97
net/http.HandlerFunc.ServeHTTP(0xc003e5d27b?, {0x3774750?, 0xc0012110a0?}, 0xc000421400?)
	/usr/local/go/src/net/http/server.go:2084 +0x2f
net/http.serverHandler.ServeHTTP({0xc001366600?}, {0x3774750, 0xc0012110a0}, 0xc0023f6200)
	/usr/local/go/src/net/http/server.go:2916 +0x43b
net/http.(*conn).serve(0xc0027320a0, {0x37760f8, 0xc000b1ddd0})
	/usr/local/go/src/net/http/server.go:1966 +0x5d7
created by net/http.(*Server).Serve
	/usr/local/go/src/net/http/server.go:3071 +0x4db

goroutine 1 [chan receive, 158485 minutes]:
github.com/oklog/run.(*Group).Run(0xc000283ab8)
	/go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:43 +0x7c
main.main()
	/app/cmd/prometheus/main.go:1096 +0x9165

goroutine 35 [select]:
go.opencensus.io/stats/view.(*worker).start(0xc0000ace00)
	/go/pkg/mod/go.opencensus.io@v0.23.0/stats/view/worker.go:276 +0xad
created by go.opencensus.io/stats/view.init.0
	/go/pkg/mod/go.opencensus.io@v0.23.0/stats/view/worker.go:34 +0x8d

goroutine 37 [select, 158485 minutes]:
main.main.func7()
	/app/cmd/prometheus/main.go:794 +0xa8
github.com/oklog/run.(*Group).Run.func1({0xc000114040?, 0xc000696010?})
	/go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:38 +0x2f
created by github.com/oklog/run.(*Group).Run
	/go/pkg/mod/github.com/oklog/run@v1.1.0/group.go:37 +0x22a

[...]

This output lists the call stack for each goroutine, preceded by a goroutine <goroutine-number> [<goroutine status>] line. The status of a goroutine indicates whether it is currently running or waiting for some external action or condition (e.g. [chan receive, 158485 minutes], [select], etc.). If you see a goroutine here that is waiting for an unexpectedly long time, this might point to a problem.