Fix: Go Goroutine Leak — Goroutines That Never Exit
Quick Answer
How to find and fix goroutine leaks in Go — detecting leaks with pprof and goleak, blocked channel patterns, context cancellation, and goroutine lifecycle management.
The Problem
A Go service’s memory and goroutine count grow indefinitely:
# pprof output — goroutine count keeps climbing
goroutine profile: total 14382
# After 1 hour of traffic, this number should stabilize — instead it grows
Or in application logs, memory keeps increasing:
runtime.NumGoroutine(): 100 # On startup
runtime.NumGoroutine(): 1500 # After 10 minutes
runtime.NumGoroutine(): 8200 # After 1 hour
Or a test catches a leak:
--- FAIL: TestHandleRequest (0.12s)
goroutine_leak_test.go:45: found unexpected goroutines:
[Goroutine 18 in state chan receive, with main.processItems on top of the stack]
Or the service eventually OOM-crashes or becomes unresponsive after running for hours.
Why This Happens
A goroutine leak occurs when a goroutine is started but never exits. Unlike memory allocated with make or new, goroutines aren’t garbage collected when unreachable — they only exit when their function returns.
The most common causes:
- Blocked channel receive with no sender — a goroutine waits on <-ch but no one ever sends to ch or closes it. The goroutine blocks forever.
- Blocked channel send with no receiver — a goroutine sends to an unbuffered channel but the receiver has already exited. Deadlock with no escape.
- Goroutine started in a loop — each request or event starts a goroutine that blocks on a channel or mutex. Over time, blocked goroutines accumulate.
- Missing context cancellation — a goroutine running a long loop checks ctx.Done() but the context is never cancelled when the caller is done. The goroutine runs indefinitely.
- Goroutine started in an HTTP handler — the handler returns but a goroutine it started continues running, holding references to request resources.
- time.After in a loop — each iteration creates a new pending timer via time.After. In a tight loop, thousands of timers accumulate until they fire.
Fix 1: Detect Leaks with pprof
The net/http/pprof package exposes goroutine stack traces over HTTP:
// main.go — add pprof endpoints
import (
_ "net/http/pprof" // Side-effect import registers handlers
"net/http"
)
func main() {
// pprof endpoints on a separate port (don't expose to public)
go func() {
http.ListenAndServe("localhost:6060", nil)
}()
// ... rest of your app
}
# View all running goroutines
go tool pprof http://localhost:6060/debug/pprof/goroutine
# Interactive mode
(pprof) top10 # Top 10 goroutine creators
(pprof) list main. # Show goroutines with 'main.' in the stack
# Save and compare snapshots (detect growth)
curl http://localhost:6060/debug/pprof/goroutine > goroutines_before.pb
# ... run some requests ...
curl http://localhost:6060/debug/pprof/goroutine > goroutines_after.pb
go tool pprof -diff_base goroutines_before.pb goroutines_after.pb
# Quick text dump of all goroutines
curl http://localhost:6060/debug/pprof/goroutine?debug=2
Monitor goroutine count in production:
import (
"runtime"
"time"
"log/slog"
)
func monitorGoroutines(interval time.Duration) {
ticker := time.NewTicker(interval)
defer ticker.Stop()
for range ticker.C {
count := runtime.NumGoroutine()
slog.Info("goroutine count", "count", count)
if count > 10000 {
slog.Warn("goroutine count exceeds threshold — possible leak", "count", count)
}
}
}
Fix 2: Use goleak in Tests
The goleak package detects goroutine leaks in unit tests automatically:
go get go.uber.org/goleak
package mypackage_test
import (
"context"
"testing"
"go.uber.org/goleak"
)
func TestMain(m *testing.M) {
// Verify no goroutines are leaked across all tests in the package
goleak.VerifyTestMain(m)
}
func TestHandleRequest(t *testing.T) {
defer goleak.VerifyNone(t) // Verify no leaks after this specific test
handler := NewRequestHandler()
handler.Handle(context.Background(), testRequest())
// goleak will fail the test if any goroutines spawned here are still running
}
goleak checks goroutine state at the end of each test. If goroutines started during the test are still running, the test fails with a stack trace showing where the leaked goroutine was created.
Fix 3: Fix Blocked Channel Patterns
The most common leak — goroutines waiting on channels that will never receive a value:
// LEAKY — worker goroutines block on send forever if the receiver exits early
func processItems(items []Item) {
results := make(chan Result) // Unbuffered channel
for _, item := range items {
go func(item Item) {
result := process(item)
results <- result // ← If the receiver exits early, this goroutine blocks forever
}(item)
}
// If this returns early (error, timeout), goroutines above are stuck trying to send
for range items {
result := <-results
if err := handleResult(result); err != nil {
return // ← Returns here, but goroutines are still trying to send
}
}
}
// FIXED — use a done channel or context to signal goroutines to exit
func processItems(ctx context.Context, items []Item) ([]Result, error) {
results := make(chan Result, len(items)) // Buffered — goroutines never block on send
for _, item := range items {
go func(item Item) {
select {
case <-ctx.Done():
return // Context cancelled — exit without sending
case results <- process(item):
// Sent successfully
}
}(item)
}
var collected []Result
for range items {
select {
case <-ctx.Done():
return nil, ctx.Err()
case result := <-results:
collected = append(collected, result)
}
}
return collected, nil
}
Always close channels when done writing:
func producer(ch chan<- int) {
defer close(ch) // ← Closing unblocks all receivers waiting on <-ch
for i := 0; i < 10; i++ {
ch <- i
}
}
func consumer(ch <-chan int) {
for v := range ch { // range exits when ch is closed
fmt.Println(v)
}
// Goroutine exits cleanly after channel is closed
}
Fix 4: Use Context for Goroutine Lifecycle
Pass context to all goroutines that do I/O or long-running work. Cancel the context when the caller is done:
// LEAKY — goroutine runs forever because it has no exit signal
func startWorker() {
go func() {
for {
msg := fetchMessage() // Blocks until a message arrives
process(msg)
// No way to stop this goroutine
}
}()
}
// FIXED — goroutine exits when context is cancelled
func startWorker(ctx context.Context) {
go func() {
for {
select {
case <-ctx.Done():
log.Println("Worker stopping:", ctx.Err())
return // Clean exit
default:
msg, err := fetchMessageWithContext(ctx)
if err != nil {
if ctx.Err() != nil {
return // Context cancelled during fetch — exit
}
log.Println("Fetch error:", err)
continue
}
process(msg)
}
}
}()
}
// Caller controls the goroutine's lifetime
func main() {
ctx, cancel := context.WithCancel(context.Background())
defer cancel() // Cancels the context (and stops the worker) when main exits
startWorker(ctx)
// ... rest of main
}
For HTTP handlers — the request context is automatically cancelled when the client disconnects or the request times out:
func handleRequest(w http.ResponseWriter, r *http.Request) {
ctx := r.Context() // Cancelled when handler returns or client disconnects
// Pass ctx to goroutines — they'll stop when the request is done
go func() {
select {
case <-ctx.Done():
return // Client disconnected — stop background work
case result := <-doBackgroundWork(ctx):
log.Println("Background work done:", result)
}
}()
}()
}
Fix 5: Fix time.After Leaks in Loops
time.After creates a timer whose channel is garbage collected only after the timer fires — not when the surrounding function returns. (Go 1.23 improved this so unreferenced timers are collected promptly, but on older runtimes — and as a matter of hygiene — the pattern is still a problem.) In a loop, every iteration leaves another pending timer behind:
// LEAKY — creates a new timer (and goroutine) on every iteration
func processWithTimeout(items []Item) {
for _, item := range items {
select {
case result := <-process(item):
handle(result)
case <-time.After(5 * time.Second): // ← New pending timer each iteration
log.Println("Timeout")
}
}
}
// FIXED — reuse a single timer
func processWithTimeout(items []Item) {
timer := time.NewTimer(5 * time.Second)
defer timer.Stop() // Cancel the timer when done
for _, item := range items {
timer.Reset(5 * time.Second) // Reset for each iteration
select {
case result := <-process(item):
if !timer.Stop() {
<-timer.C // Drain the channel if Stop() returns false
}
handle(result)
case <-timer.C:
log.Println("Timeout processing item")
}
}
}
Common Mistake: Forgetting to drain timer.C after timer.Stop(). If Stop() returns false, the timer already fired and its channel holds a value. The next Reset() won't work correctly until the channel is drained.
Fix 6: Use sync.WaitGroup to Track and Wait for Goroutines
sync.WaitGroup ensures all goroutines finish before the parent function returns:
// LEAKY — goroutines continue after function returns
func processAll(items []Item) {
for _, item := range items {
go processItem(item) // Fire and forget — goroutines outlive the function
}
// Function returns immediately — goroutines are orphaned
}
// FIXED — wait for all goroutines to finish
func processAll(ctx context.Context, items []Item) error {
var wg sync.WaitGroup
errCh := make(chan error, len(items)) // Buffered — goroutines don't block on send
for _, item := range items {
wg.Add(1)
go func(item Item) {
defer wg.Done()
if err := processItem(ctx, item); err != nil {
errCh <- err
}
}(item)
}
// Wait for all goroutines to finish
wg.Wait()
close(errCh)
// Collect errors
var errs []error
for err := range errCh {
errs = append(errs, err)
}
if len(errs) > 0 {
return errors.Join(errs...)
}
return nil
}
With errgroup for cleaner error handling:
import "golang.org/x/sync/errgroup"
func processAll(ctx context.Context, items []Item) error {
g, ctx := errgroup.WithContext(ctx)
for _, item := range items {
item := item // Capture loop variable (Go < 1.22)
g.Go(func() error {
return processItem(ctx, item)
})
}
return g.Wait() // Waits for all goroutines; returns first non-nil error
}
errgroup.WithContext cancels the context when any goroutine returns an error, signalling all other goroutines to stop — preventing the leak when one goroutine fails.
Fix 7: Worker Pool Pattern to Bound Goroutine Count
Instead of spawning one goroutine per task (unbounded growth), use a fixed-size worker pool:
func processWithPool(ctx context.Context, items []Item, workerCount int) error {
jobs := make(chan Item, len(items))
results := make(chan error, len(items))
// Start fixed number of workers
var wg sync.WaitGroup
for i := 0; i < workerCount; i++ {
wg.Add(1)
go func() {
defer wg.Done()
for item := range jobs { // Workers exit when jobs channel is closed
select {
case <-ctx.Done():
return
default:
results <- processItem(ctx, item)
}
}
}()
}
// Send all jobs
for _, item := range items {
jobs <- item
}
close(jobs) // Signal workers there are no more jobs
// Wait for workers to finish, then close results
go func() {
wg.Wait()
close(results)
}()
// Collect results
var errs []error
for err := range results {
if err != nil {
errs = append(errs, err)
}
}
if len(errs) > 0 {
return errors.Join(errs...)
}
return nil
}
// Usage
err := processWithPool(ctx, items, runtime.NumCPU())
Still Not Working?
Check for goroutines blocked on mutex — a goroutine waiting on a locked sync.Mutex is harder to spot than a blocked channel (in the goroutine dump it shows up in state semacquire). pprof's mutex profile shows which locks are contended:
curl http://localhost:6060/debug/pprof/mutex?debug=1
Check for goroutines in syscall state — goroutines making blocking system calls (DNS resolution, file I/O without context) can block indefinitely:
curl http://localhost:6060/debug/pprof/goroutine?debug=2 | grep -A 5 "syscall"
Use context-aware versions of blocking operations: net.DefaultResolver.LookupHost(ctx, ...) instead of net.LookupHost(...).
Long-lived HTTP connections — http.Client connections stay open in the pool. If the pool grows unboundedly, set transport limits:
transport := &http.Transport{
MaxIdleConns: 100,
MaxIdleConnsPerHost: 10,
IdleConnTimeout: 90 * time.Second,
}
client := &http.Client{Transport: transport}
For related Go issues, see Fix: Go Context Deadline Exceeded and Fix: Go Nil Pointer Dereference.
Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.