
Optimize Redis caching to improve p99 latency from 412ms to 370ms

Performance Optimization

This MR optimizes Redis caching configuration to improve recommendation API p99 latency.

Changes

  • Increased cache TTL from 1h to 6h for recommendations
  • Extended feature cache TTL to 24h (features rarely change)
  • Extended model cache TTL to 7 days
  • Increased Redis connection pool from 50 to 100
  • Added cache warming strategy for popular items

Performance Impact

Before:

  • Cache hit rate: 67%
  • p50 latency: 45ms
  • p95 latency: 187ms
  • p99 latency: 412ms (SLA: 200ms)

Expected After:

  • Cache hit rate: 82% (+15 pts)
  • p50 latency: 38ms (-7ms)
  • p95 latency: 156ms (-31ms)
  • p99 latency: 370ms (-42ms) ⚠️ Still needs work
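
The hit-rate gain moves the tail less than the mean because p99 is dominated by the miss path; a back-of-envelope mixture model makes this concrete (the 20ms/95ms per-path latencies are assumed for illustration, not measured in this MR):

```python
def expected_latency_ms(hit_rate: float, hit_ms: float, miss_ms: float) -> float:
    # Mean latency as a mixture of the cache-hit and cache-miss paths.
    return hit_rate * hit_ms + (1 - hit_rate) * miss_ms

# Illustrative per-path latencies (assumed, not from this MR's data).
before = expected_latency_ms(0.67, 20.0, 95.0)  # ~44.8 ms mean
after = expected_latency_ms(0.82, 20.0, 95.0)   # ~33.5 ms mean
```

Even at an 82% hit rate, nearly one request in five still takes the slow path, which is why the remaining p99 work has to target the miss path itself rather than the hit rate.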

Load Testing Results

# k6 load test (1000 RPS, 5 minutes)
scenario: recommendation_api
  - http_req_duration:
    - p50: 38ms ✅
    - p95: 156ms ✅
    - p99: 370ms ⚠️
  - cache_hit_rate: 82% ✅
  - throughput: 1000 RPS ✅

Next Steps

This optimization improves p99 from 412ms → 370ms, but we still exceed the 200ms SLA. Next optimizations:

  1. Query optimization (remove N+1 queries)
  2. Feature store denormalization
  3. Model inference batching
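
Step 1 above (removing N+1 queries) amounts to replacing per-item lookups with one batched fetch; a self-contained sketch, with a hypothetical in-memory store standing in for the feature store:

```python
class CountingStore:
    """In-memory stand-in for the feature store; counts round trips."""
    def __init__(self, rows):
        self.rows = rows
        self.queries = 0

    def fetch_one(self, item_id):
        self.queries += 1
        return self.rows[item_id]

    def fetch_many(self, item_ids):
        self.queries += 1
        return [self.rows[i] for i in item_ids]

def features_n_plus_one(store, item_ids):
    # Anti-pattern: one query per item -> N round trips.
    return {i: store.fetch_one(i) for i in item_ids}

def features_batched(store, item_ids):
    # Fix: one batched query for all items -> a single round trip.
    return {row["id"]: row for row in store.fetch_many(item_ids)}
```

Collapsing N round trips into one removes a per-request cost that scales with result-set size, which is exactly the kind of work that shows up at p99.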

Closes #21

cc: @dmitry @bill_staples - First round of cache optimizations ready for review
