feat(ml): implement training stage with MLflow tracking and model wrappers

- Create RandomForestModel and XGBoostModel wrappers with class weight support
- Implement temporal and random train/val/test splitting
- Add MLflow experiment tracking with full parameter and metric logging
- Create evaluation module for confusion matrix, feature importance, and classification reports
- Implement model training with sklearn/xgboost flavor logging and optional registry registration
- Store training run metadata in PostgreSQL
- Wire training stage into pipeline.py orchestrator
- Support both RandomForest and XGBoost models with configurable hyperparameters
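
As an illustration of the temporal vs. random splitting mentioned above, here is a minimal sketch using only the standard library. The function name `split_records`, the `timestamp` key, and the default ratios are assumptions made for this example; they are not taken from `services/ml/training/train.py`.

```python
import random
import warnings


def split_records(records, time_key="timestamp", ratios=(0.7, 0.15, 0.15),
                  temporal=True, seed=0):
    """Split a list of dict records into (train, val, test) lists.

    Hypothetical helper: names and defaults are illustrative only.
    """
    if len(ratios) != 3 or abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must be three fractions summing to 1.0")

    if temporal:
        # Oldest rows go to train, newest to test, so no future data leaks back.
        ordered = sorted(records, key=lambda r: r[time_key])
    else:
        # A random split is allowed but flagged, matching "warn on random split".
        warnings.warn(
            "random split may leak future information into the training set",
            UserWarning,
        )
        ordered = list(records)
        random.Random(seed).shuffle(ordered)

    n = len(ordered)
    train_end = round(n * ratios[0])
    val_end = train_end + round(n * ratios[1])
    return ordered[:train_end], ordered[train_end:val_end], ordered[val_end:]
```

With `temporal=True` the test set always holds the most recent rows, which is usually the closer proxy for how the model will be used after deployment.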
parent 16763b967e
commit f4c0f9a836
8 changed files with 900 additions and 14 deletions
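
The model dispatch step (select a model class from the `model_type` config, fail with the list of supported types otherwise) could be sketched roughly as follows. The registry keys `random_forest` and `xgboost`, the factory names, and `build_model` are assumptions for illustration; the actual wrappers in this commit may be structured differently.

```python
def _make_random_forest(params, class_weight):
    # Imported lazily so the dispatcher can be used without scikit-learn installed.
    from sklearn.ensemble import RandomForestClassifier
    return RandomForestClassifier(class_weight=class_weight, **params)


def _make_xgboost(params, class_weight):
    from xgboost import XGBClassifier
    # XGBClassifier takes no class_weight argument; a wrapper would typically
    # convert class weights to per-sample weights and pass them to
    # fit(..., sample_weight=...). That conversion is omitted here.
    return XGBClassifier(**params)


# Hypothetical registry: maps the configured model_type to a factory.
MODEL_FACTORIES = {
    "random_forest": _make_random_forest,
    "xgboost": _make_xgboost,
}


def build_model(model_type, params=None, class_weight=None):
    """Return an unfitted model for model_type, or raise with the supported list."""
    try:
        factory = MODEL_FACTORIES[model_type]
    except KeyError:
        supported = ", ".join(sorted(MODEL_FACTORIES))
        raise ValueError(
            f"Unknown model_type {model_type!r}; supported types: {supported}"
        ) from None
    return factory(dict(params or {}), class_weight)
```

Keeping the mapping in a single dictionary means the error message stays in sync with the supported types whenever a new model is registered.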
@@ -36,18 +36,18 @@

## 5. Training Stage

-- [ ] 5.1 Create `services/ml/training/train.py` — main training entry point: load labeled CSV, split, train, evaluate, log to MLflow
-- [ ] 5.2 Implement temporal train/validation/test splitting with configurable ratios, warn on random split
-- [ ] 5.3 Create `services/ml/training/models/random_forest.py` — RandomForestClassifier wrapper with class_weights support
-- [ ] 5.4 Create `services/ml/training/models/xgboost_model.py` — XGBClassifier wrapper with class_weights support
-- [ ] 5.5 Implement model dispatch — select model class based on `model_type` config, fail with supported types list for unknown types
-- [ ] 5.6 Implement MLflow experiment tracking — create run, log config artifact, dataset params, per-class sample counts, all hyperparameters
-- [ ] 5.7 Implement metrics logging — accuracy, f1_macro, f1_weighted, per-class precision/recall/F1
-- [ ] 5.8 Create `services/ml/training/evaluation.py` — generate confusion matrix plot, feature importance plot, classification report text
-- [ ] 5.9 Implement MLflow artifact logging — log confusion_matrix.png, feature_importance.png, classification_report.txt, pipeline_config.yaml
-- [ ] 5.10 Implement MLflow model registration — log model with sklearn/xgboost flavor, register in registry if configured
-- [ ] 5.11 Store training run metadata in PostgreSQL `training_runs` table
-- [ ] 5.12 Wire training into `pipeline.py`
+- [x] 5.1 Create `services/ml/training/train.py` — main training entry point: load labeled CSV, split, train, evaluate, log to MLflow
+- [x] 5.2 Implement temporal train/validation/test splitting with configurable ratios, warn on random split
+- [x] 5.3 Create `services/ml/training/models/random_forest.py` — RandomForestClassifier wrapper with class_weights support
+- [x] 5.4 Create `services/ml/training/models/xgboost_model.py` — XGBClassifier wrapper with class_weights support
+- [x] 5.5 Implement model dispatch — select model class based on `model_type` config, fail with supported types list for unknown types
+- [x] 5.6 Implement MLflow experiment tracking — create run, log config artifact, dataset params, per-class sample counts, all hyperparameters
+- [x] 5.7 Implement metrics logging — accuracy, f1_macro, f1_weighted, per-class precision/recall/F1
+- [x] 5.8 Create `services/ml/training/evaluation.py` — generate confusion matrix plot, feature importance plot, classification report text
+- [x] 5.9 Implement MLflow artifact logging — log confusion_matrix.png, feature_importance.png, classification_report.txt, pipeline_config.yaml
+- [x] 5.10 Implement MLflow model registration — log model with sklearn/xgboost flavor, register in registry if configured
+- [x] 5.11 Store training run metadata in PostgreSQL `training_runs` table
+- [x] 5.12 Wire training into `pipeline.py`

## 6. Inference Service (FastAPI)