Overview
A system can have low average latency but high p99 latency when a small percentage of requests experience extreme delays. These outliers usually come from sporadic bottlenecks that an average simply smooths over and hides.
Example Scenario (Go Backend for Food Delivery App)
- p50 (median) latency: 200ms (half of all requests finish in 200ms or less).
- p90 latency: 400ms (most users have a good experience).
- p99 latency: 4 seconds (the slowest 1% of users get stuck waiting).
- Mean (average) latency: 250ms, which looks fine but hides the bad experiences.
✅ Takeaway: If your food order takes 4 seconds when everyone else gets it in 200ms, it feels slow, even if the system looks “fast” on average.
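Percentiles like these are easy to compute from raw request durations. Below is a minimal, illustrative sketch (not tied to any framework) that reports p50/p90/p99 using the nearest-rank method; the sample values are made up to mirror the scenario above, and a production system would normally use histograms (e.g., Prometheus) rather than sorting raw samples.

package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// percentile returns the nearest-rank percentile of the samples.
// Assumes samples is non-empty and p is in (0, 100].
func percentile(samples []time.Duration, p float64) time.Duration {
	sorted := append([]time.Duration(nil), samples...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	rank := int(math.Ceil(p / 100 * float64(len(sorted))))
	return sorted[rank-1]
}

func main() {
	// Hypothetical latency distribution: mostly fast, with a small slow tail.
	var samples []time.Duration
	for i := 0; i < 88; i++ {
		samples = append(samples, 200*time.Millisecond)
	}
	for i := 0; i < 10; i++ {
		samples = append(samples, 400*time.Millisecond)
	}
	samples = append(samples, 4*time.Second, 4*time.Second)

	fmt.Println("p50:", percentile(samples, 50)) // 200ms
	fmt.Println("p90:", percentile(samples, 90)) // 400ms
	fmt.Println("p99:", percentile(samples, 99)) // 4s, even though the mean is only ~300ms
}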
4 Most Common Causes of High p99 Latency (Go-Specific)
1. Garbage Collection (GC) Pauses in Go
Issue:
Go uses garbage collection (GC) to manage memory. Go's GC is designed for very short pauses, but excessive heap allocation can still cause p99 spikes: collections run more often, and request goroutines get pulled into GC assist work.
Example (Go Web Server With High GC Delays)
package main

import (
	"fmt"
	"net/http"
)

func handler(w http.ResponseWriter, r *http.Request) {
	// Creates a massive slice on every request (bad for GC)
	largeData := make([]byte, 100000000) // 100MB allocation
	fmt.Fprintf(w, "Processed request with %d bytes", len(largeData))
}

func main() {
	http.HandleFunc("/", handler)
	http.ListenAndServe(":8080", nil)
}
Solution: Reduce Heap Allocations & Use Object Pools
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// bufferPool hands out reusable 100KB buffers instead of allocating
// a fresh slice on every request.
var bufferPool = sync.Pool{
	New: func() interface{} { return make([]byte, 100000) },
}

func handler(w http.ResponseWriter, r *http.Request) {
	buf := bufferPool.Get().([]byte)
	defer bufferPool.Put(buf) // Return the buffer to the pool for reuse

	// Do the per-request work in buf instead of allocating new memory.
	fmt.Fprintf(w, "Processed request with a reusable %d-byte buffer", len(buf))
}

func main() {
	http.HandleFunc("/", handler)
	http.ListenAndServe(":8080", nil)
}
✅ Fix:
- Uses sync.Pool to reuse memory, reducing GC overhead.
- Prevents p99 spikes by minimizing new heap allocations.
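Before reaching for pools, it is worth confirming that GC is actually behind the spikes. The sketch below polls runtime GC statistics; the tuning calls are illustrative assumptions (sensible values depend on the workload), and running the service with GODEBUG=gctrace=1 gives similar information with no code changes.

package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
	"time"
)

func main() {
	// Optional GC tuning (values here are illustrative, not recommendations):
	// a higher GOGC trades memory for fewer collections; a soft memory limit
	// (Go 1.19+) keeps the heap from growing unbounded.
	debug.SetGCPercent(200)
	debug.SetMemoryLimit(512 << 20) // 512 MiB

	var m runtime.MemStats
	for {
		runtime.ReadMemStats(&m)
		// Recent stop-the-world pause durations live in the PauseNs ring buffer.
		last := m.PauseNs[(m.NumGC+255)%256]
		fmt.Printf("GC cycles: %d, last pause: %v, heap in use: %d MiB\n",
			m.NumGC, time.Duration(last), m.HeapInuse>>20)
		time.Sleep(5 * time.Second)
	}
}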
2. Slow Database Queries (Go + PostgreSQL)
Issue:
A poorly optimized query causes slow p99 responses while most queries are fast.
Example (Inefficient Query in a Go Service)
package main

import (
	"database/sql"
	"fmt"
	"log"
	"net/http"

	_ "github.com/lib/pq"
)

var db, _ = sql.Open("postgres", "postgres://user:password@localhost/dbname")

func handler(w http.ResponseWriter, r *http.Request) {
	// Slow query (scanning millions of rows)
	rows, err := db.Query("SELECT * FROM orders WHERE status = 'pending'")
	if err != nil {
		http.Error(w, "DB error", http.StatusInternalServerError)
		return
	}
	defer rows.Close()
	fmt.Fprint(w, "Fetched orders")
}

func main() {
	http.HandleFunc("/", handler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
Solution: Add an Index + Caching
CREATE INDEX idx_orders_status ON orders(status);
rows, err := db.Query("SELECT * FROM orders WHERE status = 'pending' LIMIT 100")
✅ Fix:
- Indexing + LIMIT speeds up queries.
- Caching hot results in Redis (sketched below) reduces DB load and p99 spikes.
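The caching half of this fix is not shown above. A minimal sketch of read-through caching, assuming the github.com/redis/go-redis/v9 client and a hypothetical pendingOrdersFromDB helper that wraps the indexed query:

import (
	"context"
	"time"

	"github.com/redis/go-redis/v9"
)

func getPendingOrders(ctx context.Context, rdb *redis.Client) (string, error) {
	const key = "orders:pending"

	// Serve from cache when possible.
	cached, err := rdb.Get(ctx, key).Result()
	if err == nil {
		return cached, nil
	}
	if err != redis.Nil {
		return "", err // real Redis error, not just a cache miss
	}

	// Cache miss: hit PostgreSQL (hypothetical helper wrapping the indexed query).
	orders, err := pendingOrdersFromDB(ctx)
	if err != nil {
		return "", err
	}

	// A short TTL keeps the cache fresh while absorbing most of the read load.
	_ = rdb.Set(ctx, key, orders, 30*time.Second).Err()
	return orders, nil
}

A short TTL (30 seconds here) is the simple way to bound staleness; invalidating the cache on writes is more precise but more involved.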
3. Thread Contention & Queueing Delays (Goroutines)
Issue:
Spawning an unbounded goroutine per request overwhelms the scheduler and the shared resources behind it (CPU, memory, database connections), so work piles up in queues and tail latency spikes.
Example (Too Many Goroutines Causing p99 Spikes)
func handler(w http.ResponseWriter, r *http.Request) {
	// One new goroutine per request, with no upper bound and no backpressure
	go expensiveOperation()
	w.Write([]byte("Processing..."))
}
Solution: Use a Worker Pool
package main

import (
	"log"
	"net/http"
	"time"
)

// Bounded queue: at most 100 jobs wait at any time.
var jobs = make(chan string, 100)

func process(job string) {
	time.Sleep(100 * time.Millisecond) // placeholder for the real work
}

func worker() {
	for job := range jobs {
		process(job)
	}
}

func handler(w http.ResponseWriter, r *http.Request) {
	select {
	case jobs <- "task":
		w.Write([]byte("Queued for processing"))
	default:
		// Queue full: shed load instead of letting requests pile up.
		http.Error(w, "Server busy", http.StatusTooManyRequests)
	}
}

func main() {
	// Fixed pool of 10 workers caps concurrency.
	for i := 0; i < 10; i++ {
		go worker()
	}
	http.HandleFunc("/", handler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
✅ Fix:
- Uses bounded goroutines (worker pool).
- Rejects excess requests instead of queueing them forever (load shedding); an explicit rate limiter is sketched below.
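A worker pool bounds concurrency; an explicit rate limiter bounds the request rate itself. A minimal sketch, assuming the golang.org/x/time/rate package (the 100 req/s rate and burst of 20 are illustrative numbers):

import (
	"net/http"

	"golang.org/x/time/rate"
)

// Allow bursts of 20 on top of a steady 100 requests per second.
var limiter = rate.NewLimiter(rate.Limit(100), 20)

func handler(w http.ResponseWriter, r *http.Request) {
	if !limiter.Allow() {
		// Shed excess load immediately instead of letting it queue up.
		http.Error(w, "Too many requests", http.StatusTooManyRequests)
		return
	}
	w.Write([]byte("Processing..."))
}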
4. Network Spikes & Slow Dependencies
Issue:
A slow external API (for example, a payment gateway) pushes up p99 latency even when your own service is fast.
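Example (Slow External Call With No Timeout)
For parity with the earlier sections, here is a minimal sketch of the problem, reusing the placeholder gateway URL from the solution below: http.Get uses http.DefaultClient, which has no timeout, so one hung upstream call stalls the handler for as long as the gateway takes.

func handler(w http.ResponseWriter, r *http.Request) {
	// No timeout and no circuit breaker: if the gateway hangs,
	// this handler (and the user) hangs with it.
	resp, err := http.Get("https://slow-payment-gateway.com/pay")
	if err != nil {
		http.Error(w, "Payment failed", http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()
	w.Write([]byte("Payment processed"))
}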
Solution: Use Circuit Breaker (Resilience in Go)
import (
	"net/http"
	"time"

	"github.com/sony/gobreaker"
)

var cb = gobreaker.NewCircuitBreaker(gobreaker.Settings{
	// After tripping, the breaker stays open for 5s before allowing a probe.
	Timeout: 5 * time.Second,
})

func handler(w http.ResponseWriter, r *http.Request) {
	result, err := cb.Execute(func() (interface{}, error) {
		return http.Get("https://slow-payment-gateway.com/pay")
	})
	if err != nil {
		http.Error(w, "Payment system is busy", http.StatusServiceUnavailable)
		return
	}
	result.(*http.Response).Body.Close() // release the connection
	w.Write([]byte("Payment processed"))
}
✅ Fix:
- Circuit breaker prevents cascading failures.
- Timeouts ensure requests don't wait indefinitely (a context-based timeout is sketched below).
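The per-request timeout itself can be expressed with the standard library by propagating the incoming request's context; a minimal sketch (the 2-second budget is an illustrative value):

import (
	"context"
	"net/http"
	"time"
)

func callGateway(ctx context.Context) (*http.Response, error) {
	// Cap the outbound call and cancel it early if the caller's context is cancelled.
	ctx, cancel := context.WithTimeout(ctx, 2*time.Second)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet,
		"https://slow-payment-gateway.com/pay", nil)
	if err != nil {
		return nil, err
	}
	return http.DefaultClient.Do(req)
}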
Key Takeaways
- Go GC tuning, object pools, and reducing heap allocations help prevent p99 spikes.
- Slow queries can be fixed with indexing, caching, and efficient database queries.
- Goroutine explosion must be controlled with worker pools and rate limiting.
- External dependencies should use circuit breakers and timeouts to prevent long delays.