Add model training pipeline with MLflow tracking
Overview
Set up an automated model training pipeline with experiment tracking.
Components
- Data preprocessing pipeline
- Feature engineering
- Model training (scheduled daily)
- Model evaluation and validation
- Model registry and versioning
MLflow Integration
- Track hyperparameters
- Log metrics (precision, recall, F1)
- Store model artifacts
- Compare experiment runs
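A minimal sketch of how the tracking items above could map onto the MLflow Python API. The experiment name, tracking URI, and function names are placeholders invented for illustration, not part of this ticket. MLflow is imported inside the logging functions so the metric helper can be used and tested on its own.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def log_training_run(
    params: Dict[str, object],
    metrics: Dict[str, float],
    model_path: str,
    experiment: str = "daily-training",  # hypothetical experiment name
    tracking_uri: str = "http://mlflow.example.internal:5000",  # placeholder URI
) -> str:
    """Log one training run to MLflow and return its run ID."""
    import mlflow  # lazy import: only needed when actually logging

    mlflow.set_tracking_uri(tracking_uri)
    mlflow.set_experiment(experiment)
    with mlflow.start_run() as run:
        mlflow.log_params(params)        # hyperparameters
        mlflow.log_metrics(metrics)      # precision, recall, F1
        mlflow.log_artifact(model_path)  # serialized model file
        return run.info.run_id


def best_runs(experiment: str = "daily-training"):
    """Return runs for the experiment as a DataFrame, best F1 first."""
    import mlflow

    return mlflow.search_runs(
        experiment_names=[experiment],
        order_by=["metrics.f1 DESC"],
    )
```

A training job would call `classification_metrics` on its validation counts and pass the result to `log_training_run`; `best_runs` covers the run comparison item.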
Infrastructure
- Kubernetes CronJob for training
- S3 for model artifacts
- MLflow server for tracking
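One possible shape for the daily CronJob, built as a Python dict and printed as JSON (which `kubectl apply -f` accepts). The job name, image, schedule, bucket, and tracking URI are all assumptions chosen for illustration; only `MLFLOW_TRACKING_URI` is an environment variable MLflow itself reads.

```python
import json
from typing import Any, Dict


def training_cronjob(
    image: str,
    schedule: str = "0 2 * * *",  # assumed: once a day at 02:00
    tracking_uri: str = "http://mlflow.example.internal:5000",  # placeholder
    artifact_bucket: str = "s3://example-model-artifacts",  # placeholder
) -> Dict[str, Any]:
    """Build a batch/v1 CronJob manifest for the training container."""
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": "model-training"},  # hypothetical name
        "spec": {
            "schedule": schedule,
            # Skip a new run if the previous one is still in progress.
            "concurrencyPolicy": "Forbid",
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "restartPolicy": "Never",
                            "containers": [
                                {
                                    "name": "train",
                                    "image": image,
                                    "env": [
                                        {"name": "MLFLOW_TRACKING_URI", "value": tracking_uri},
                                        # Hypothetical variable read by the training script.
                                        {"name": "ARTIFACT_BUCKET", "value": artifact_bucket},
                                    ],
                                }
                            ],
                        }
                    }
                }
            },
        },
    }


if __name__ == "__main__":
    manifest = training_cronjob("registry.example.com/ml/train:latest")
    print(json.dumps(manifest, indent=2))
```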
Acceptance Criteria
- Automated daily training runs
- Model versioning with semantic versioning
- Automatic deployment of models with accuracy > 90%
- Rollback capability to previous model version
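The versioning, deployment gate, and rollback criteria could be prototyped roughly as follows. This is an in-memory sketch only: the class and function names are hypothetical, the threshold is read as strictly greater than 0.90, and a real implementation would persist state in the model registry instead of a Python list.

```python
from dataclasses import dataclass, field
from typing import List, Optional

ACCURACY_THRESHOLD = 0.90  # from the acceptance criteria: accuracy > 90%


def bump_version(version: str, part: str = "minor") -> str:
    """Increment a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")


@dataclass
class Deployment:
    """Tracks deployed model versions, newest last."""

    history: List[str] = field(default_factory=list)

    @property
    def current(self) -> Optional[str]:
        return self.history[-1] if self.history else None

    def maybe_deploy(self, version: str, accuracy: float) -> bool:
        """Deploy only if accuracy is strictly above the threshold."""
        if accuracy > ACCURACY_THRESHOLD:
            self.history.append(version)
            return True
        return False

    def rollback(self) -> str:
        """Drop the current version and return the one before it."""
        if len(self.history) < 2:
            raise RuntimeError("no previous version to roll back to")
        self.history.pop()
        return self.history[-1]
```

For example, deploying `1.0.0` at 0.93 accuracy and `1.1.0` at 0.95, then calling `rollback()`, leaves `1.0.0` as the current version; a candidate at exactly 0.90 is rejected.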