Finding and Fixing a 50k Goroutine Leak That Nearly Killed Production

Summary

Key Takeaways

- Goroutine leaks are silent killers - they grow slowly until critical
- Always use context.Context for goroutine lifecycle management
- Monitor runtime.NumGoroutine() in production
- Unbuffered channels without readers are the #1 cause of leaks
- Use pprof and runtime/trace for diagnosis

Table of Contents

- The Symptoms That Everyone Ignored
- The Code That Looked Perfectly Fine
- Debugging Process
- The Root Cause
- The Fix
- Prevention Strategies
- Monitoring Setup
- Security Considerations
- Testing Strategy
- Lessons Learned

The Symptoms That Everyone Ignored

It started innocently. A developer mentioned the API felt "sluggish" during sprint review. QA reported timeouts were "slightly higher." DevOps noted memory was "trending up but within limits."

Everyone had a piece of the puzzle. Nobody saw the picture.

Here's what we were looking at:

    Week 1:  1,200 goroutines, 2.1GB RAM, 250ms p99 latency
    Week 2:  3,400 goroutines, 3.8GB RAM, 380ms p99 latency
    Week 3:  8,900 goroutines, 7.2GB RAM, 610ms p99 latency
    Week 4: 19,000 goroutines, 14GB RAM, 1.4s p99 latency
    Week 5: 34,000 goroutines, 28GB RAM, 8.3s p99 latency
    Week 6: 50,847 goroutines, 47GB RAM, 32s p99 latency ← You are here

Classic exponential growth. Classic "someone else's problem."

The Code That Looked Perfectly Fine

The leak was in our WebSocket notification system. Here's the simplified version:

    func (s *NotificationService) Subscribe(userID string, ws *websocket.Conn) {
        ctx, cancel := context.WithCancel(context.Background())

        sub := &subscription{
            userID: userID,
            ws:     ws,
            cancel: cancel,
        }
        s.subscribers[userID] = sub

        // Start the message pump
        go s.pumpMessages(ctx, sub)

        // Start the heartbeat
        go s.heartbeat(ctx, sub)
    }

    func (s *NotificationService) pumpMessages(ctx context.Context, sub *subscription) {
        for {
            select {
            case <-ctx.Done():
                return
            case msg := <-sub.messages:
                sub.ws.WriteJSON(msg) // What could go wrong?
            }
        }
    }

    func (s *NotificationService) heartbeat(ctx context.Context, sub *subscription) {
        ticker := time.NewTick...

First seen: 2026-01-17 13:24

Last seen: 2026-01-17 13:24