# Integration: Block toxic content from recommendations feed
## Problem
Currently the recommendation engine (ai-recommendation-engine) and content moderation (ai-content-moderation) operate independently:
- Recommendations suggest items based solely on collaborative filtering
- Moderation checks content only when reported
- Gap: Users may see toxic content in their recommendation feed
## Customer Impact
Real incident last week:
- User reported seeing hate speech in recommended posts
- Content HAD been flagged as toxic (score: 0.91)
- But recommendation engine was unaware → still served it
- Result: User complained and is considering churning
## Proposed Solution
Integrate moderation pipeline with recommendation serving:
### Architecture
```
Recommendation Engine (Project 1)
        ↓
Generate top-100 items
        ↓
┌─────────────────┐
│   Moderation    │ ← ai-content-moderation API
│     Filter      │
└─────────────────┘
        ↓
Filter out items with:
  - toxicity > 0.6
  - content_blocked = true
        ↓
Return clean recommendations
```
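The filter step above boils down to a simple predicate. A minimal sketch (field names `toxicity` and `content_blocked` are taken from the diagram; the dict shape is an assumption about the candidate format):

```python
TOXICITY_THRESHOLD = 0.6  # items scoring above this are dropped, per the diagram

def is_servable(item: dict) -> bool:
    """Return True if an item may appear in the recommendation feed."""
    if item.get("content_blocked", False):
        return False
    return item.get("toxicity", 0.0) <= TOXICITY_THRESHOLD

def filter_recommendations(items: list) -> list:
    """Drop blocked or toxic items from the top-100 candidate list."""
    return [item for item in items if is_servable(item)]
```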
### Implementation Options
**Option A: Real-time API calls**
- Pros: Always up-to-date, no stale data
- Cons: Adds 20-30ms latency, more load on moderation service
**Option B: Shared cache/database**
- Pros: Fast (1-2ms lookup), scales better
- Cons: Eventual consistency (5min lag)
### Recommendation: Option B with Redis
- Moderation service writes `blocked_content:{item_id} = true` to Redis
- Recommendation service checks Redis before serving
- TTL: 24h, invalidated on moderation decision change
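Both sides of the Option B flow can be sketched against a `redis-py`-style client (the key format is from this proposal; the helper names `mark_blocked`/`is_blocked` are illustrative, not existing APIs in either project):

```python
import json

BLOCK_KEY = "blocked_content:{item_id}"
BLOCK_TTL_SECONDS = 24 * 60 * 60  # 24h TTL per the proposal

def mark_blocked(client, item_id: str, reason: str, score: float) -> None:
    """Moderation side: record a blocking decision, expiring after 24h."""
    payload = json.dumps({"blocked": True, "reason": reason, "score": score})
    client.setex(BLOCK_KEY.format(item_id=item_id), BLOCK_TTL_SECONDS, payload)

def is_blocked(client, item_id: str) -> bool:
    """Recommendation side: 1-2ms key lookup before serving an item."""
    return client.get(BLOCK_KEY.format(item_id=item_id)) is not None
```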
## Changes Required
**ai-recommendation-engine:**
- Add toxicity filter to `CollaborativeFilteringModel.recommend()`
- Check Redis for `blocked_content:{item_id}` keys
- Skip blocked items, backfill with next-best recommendations
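The skip-and-backfill behavior could look like the following sketch. This is not the real `CollaborativeFilteringModel` API: `ranked_candidates` and the injected `is_blocked` callable are assumptions standing in for the model's ranked output and the Redis lookup:

```python
def recommend_clean(ranked_candidates, is_blocked, n=10):
    """Walk the ranked candidate list, skipping blocked items and
    backfilling each skipped slot with the next-best candidate,
    until n clean items are collected."""
    results = []
    for item_id in ranked_candidates:
        if is_blocked(item_id):
            continue  # skip; a lower-ranked item backfills the slot
        results.append(item_id)
        if len(results) == n:
            break
    return results
```

Injecting the lookup as a callable keeps the ranking logic unit-testable without a live Redis.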
**ai-content-moderation:**
- Write blocking decisions to shared Redis
- Key format: `blocked_content:{item_id}` → `{"blocked": true, "reason": "hate_speech", "score": 0.91}`
- Publish Redis pub/sub events for real-time invalidation
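The pub/sub invalidation path might be sketched as below, letting subscribers refresh cached entries before the 24h TTL expires. The channel name and message shape are assumptions, not part of the proposal:

```python
import json

INVALIDATION_CHANNEL = "moderation:decisions"  # hypothetical channel name

def publish_decision_change(client, item_id: str, blocked: bool) -> None:
    """Moderation side: notify subscribers (e.g. the recommendation
    service) that a decision changed, so stale cache entries can be
    invalidated immediately rather than waiting for TTL expiry."""
    message = json.dumps({"item_id": item_id, "blocked": blocked})
    client.publish(INVALIDATION_CHANNEL, message)
```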
## Testing Plan
- Unit tests: Mock Redis, verify filtering works
- Integration test: Deploy both services to staging
- Load test: Ensure latency impact < 5ms
- A/B test: Measure impact on user engagement
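The unit-test step could take roughly this shape, stubbing the Redis lookup with a plain set (test name and helper are illustrative):

```python
def test_blocked_items_are_filtered():
    """Verify that items flagged in the (stubbed) block set never
    reach the served list, and ordering of clean items is preserved."""
    blocked = {"item_2"}  # stands in for the Redis blocked_content keys
    candidates = ["item_1", "item_2", "item_3"]
    served = [i for i in candidates if i not in blocked]
    assert served == ["item_1", "item_3"]

test_blocked_items_are_filtered()
```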
## Success Metrics
- Zero toxic content (score > 0.6) in recommendations
- Latency increase < 5ms (P99)
- User reports of toxic content down 95%
## Timeline
- Week 1: Design + shared Redis schema (@dmitry)
- Week 2: Implement moderation-side writes (@bob_wilson)
- Week 3: Implement recommendation-side reads (@dmitry)
- Week 4: Integration testing (@sabrina_farmer)
- Week 5: Production rollout
## Dependencies
- Blocked by: ai-recommendation-engine!2 (merged) (cache optimization - need shared Redis)
- Blocked by: #3 (moderation API)
- Related: #9 (closed) (false positives - don't want to over-block)
cc @dmitry @bob_wilson @sabrina_farmer