[TECH DEBT] Refactor collaborative filtering to support real-time updates
Problem
The current collaborative filtering model requires a full retrain (8 hours) to incorporate new user interactions. Because the retrain runs as a daily batch job, recommendations go stale:
- New users: cold-start problem for 24h
- Trending items: not reflected until next retrain
- User preference changes: 24h lag
Current Architecture
User interactions → Batch job (daily) → Full model retrain (8h) → Deploy
Issues:
- Cannot adapt to viral content
- Poor experience for new users
- Wastes compute (retraining the entire model daily)
Proposed Architecture
User interactions → Stream processing → Incremental model update → Live serving
Benefits:
- Real-time personalization
- Trending content surfaces faster
- Reduced compute cost (incremental updates)
Technical Approach
Option 1: Online Matrix Factorization
- Incremental SGD updates
- Update user/item factors in real-time
- Requires streaming infrastructure (Kafka/Flink)
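A minimal sketch of what the incremental SGD update in Option 1 could look like, assuming one update per incoming interaction. The class name `OnlineMF`, the hyperparameters (`n_factors`, `lr`, `reg`), and the use of an implicit rating of 1.0 are all illustrative assumptions, not part of the current implementation. The Kafka/Flink consumer that would feed `update` is omitted.

```python
import random


class OnlineMF:
    """Matrix factorization updated one interaction at a time (illustrative)."""

    def __init__(self, n_factors=8, lr=0.05, reg=0.02, seed=0):
        self.n_factors = n_factors
        self.lr = lr          # SGD step size (assumed value)
        self.reg = reg        # L2 regularization strength (assumed value)
        self.rng = random.Random(seed)
        self.user_factors = {}  # user_id -> list of floats
        self.item_factors = {}  # item_id -> list of floats

    def _init_vector(self):
        # Small random values so that new users/items start near a zero score.
        return [self.rng.gauss(0.0, 0.1) for _ in range(self.n_factors)]

    def predict(self, user_id, item_id):
        p = self.user_factors.get(user_id)
        q = self.item_factors.get(item_id)
        if p is None or q is None:
            return 0.0  # unseen user or item: no learned signal yet
        return sum(a * b for a, b in zip(p, q))

    def update(self, user_id, item_id, rating):
        """Apply one SGD step for a single interaction; return the pre-update error."""
        if user_id not in self.user_factors:
            self.user_factors[user_id] = self._init_vector()
        if item_id not in self.item_factors:
            self.item_factors[item_id] = self._init_vector()
        p = self.user_factors[user_id]
        q = self.item_factors[item_id]
        err = rating - sum(a * b for a, b in zip(p, q))
        for k in range(self.n_factors):
            pk, qk = p[k], q[k]  # keep old values so both updates use them
            p[k] += self.lr * (err * qk - self.reg * pk)
            q[k] += self.lr * (err * pk - self.reg * qk)
        return err


model = OnlineMF()
first_err = model.update("u1", "i1", 1.0)   # new user and item are created on the fly
for _ in range(500):
    last_err = model.update("u1", "i1", 1.0)
```

Only the factors of the user and item involved in an interaction are touched, which is why this avoids the full retrain. A production version would also need to handle concurrent writes and periodic checkpointing, neither of which is shown here.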
Option 2: Hybrid Model
- Keep batch CF for long-term patterns
- Add real-time popularity boost
- Blend: 70% CF + 30% trending
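The blend in Option 2 could be sketched as follows, assuming both score sets are min-max normalized to [0, 1] before they are combined so that the 70/30 weights are comparable. The function names, the normalization choice, and the treatment of missing items (scored as 0.0) are assumptions for illustration.

```python
def min_max_normalize(scores):
    """Scale a dict of item_id -> score into [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so there is no ranking signal to preserve.
        return {item: 0.0 for item in scores}
    return {item: (s - lo) / (hi - lo) for item, s in scores.items()}


def blend_scores(cf_scores, trending_scores, cf_weight=0.7):
    """Return (item_id, score) pairs ranked by cf_weight * CF + (1 - cf_weight) * trending."""
    cf = min_max_normalize(cf_scores)
    trending = min_max_normalize(trending_scores)
    blended = {
        item: cf_weight * cf.get(item, 0.0)
        + (1.0 - cf_weight) * trending.get(item, 0.0)
        for item in set(cf) | set(trending)
    }
    # Sort by score descending, then by item id for a stable order on ties.
    return sorted(blended.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical inputs: batch CF scores for one user, and recent interaction counts.
cf_scores = {"a": 5.0, "b": 3.0, "c": 1.0}
trending_scores = {"c": 100, "d": 50, "a": 0}
ranked = blend_scores(cf_scores, trending_scores)
```

Because trending scores are not user-specific, the same normalized popularity table can be shared across all users and refreshed on a short interval, while the CF scores continue to come from the daily batch model.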
Option 3: Neural Collaborative Filtering
- Replace SVD with deep learning model
- Train on mini-batches (hourly)
- More flexible but higher complexity
Effort Estimate
- Option 1: 6 weeks (requires streaming infra)
- Option 2: 2 weeks (easier, good enough)
- Option 3: 8 weeks (highest quality, most complex)
Recommendation
Start with Option 2 (hybrid model) as a quick win, then evaluate Option 3 for Q2.
Related Work
- Blocked by: #12 (need A/B testing to validate)
- Relates to: #1 (closed; original CF implementation)
- Enables: Better cold-start handling