feat(ml): implement FastAPI inference service with model loading, preprocessing, and prediction endpoints

Marko Djordjevic 2026-02-15 14:29:07 +01:00
parent f4c0f9a836
commit 3a83fd38e9
3 changed files with 945 additions and 10 deletions

@@ -51,16 +51,16 @@
## 6. Inference Service (FastAPI)
- [ ] 6.1 Create `services/ml/app/main.py` — FastAPI app with CORS, startup event to load model
- [ ] 6.2 Implement model loading — from MLflow registry (by name + stage) or from local .pkl file via joblib
- [ ] 6.3 Implement preprocessing parity — load pipeline config from MLflow artifact, apply same feature engineering as training
- [ ] 6.4 Create `POST /predict` endpoint — accept candles array, run preprocessing, predict, return per-candle labels + confidence + spans + model_info
- [ ] 6.5 Implement prediction span grouping — group consecutive same-label non-"O" predictions into spans with avg_confidence
- [ ] 6.6 Create `POST /predict/batch` endpoint — accept pair/timeframe/date range, load data, process in batch_size chunks, return predictions
- [ ] 6.7 Create `GET /model/info` endpoint — return model metadata, per-class metrics from MLflow
- [ ] 6.8 Create `GET /model/labels` endpoint — return label names and colors
- [ ] 6.9 Create `GET /health` endpoint — check model loaded status, MLflow connection, PostgreSQL connection
- [ ] 6.10 Add Pydantic request/response models for all endpoints (PredictRequest, PredictResponse, BatchPredictRequest, ModelInfoResponse)
- [x] 6.1 Create `services/ml/app/main.py` — FastAPI app with CORS, startup event to load model
- [x] 6.2 Implement model loading — from MLflow registry (by name + stage) or from local .pkl file via joblib
- [x] 6.3 Implement preprocessing parity — load pipeline config from MLflow artifact, apply same feature engineering as training
- [x] 6.4 Create `POST /predict` endpoint — accept candles array, run preprocessing, predict, return per-candle labels + confidence + spans + model_info
- [x] 6.5 Implement prediction span grouping — group consecutive same-label non-"O" predictions into spans with avg_confidence
- [x] 6.6 Create `POST /predict/batch` endpoint — accept pair/timeframe/date range, load data, process in batch_size chunks, return predictions
- [x] 6.7 Create `GET /model/info` endpoint — return model metadata, per-class metrics from MLflow
- [x] 6.8 Create `GET /model/labels` endpoint — return label names and colors
- [x] 6.9 Create `GET /health` endpoint — check model loaded status, MLflow connection, PostgreSQL connection
- [x] 6.10 Add Pydantic request/response models for all endpoints (PredictRequest, PredictResponse, BatchPredictRequest, ModelInfoResponse)
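
The span grouping described in task 6.5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this commit: the `Span` dataclass, the `group_spans` function, and the inclusive `start`/`end` index fields are hypothetical names. Only the "O" label and `avg_confidence` come from the task description.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Span:
    """Hypothetical span shape; field names are illustrative only."""
    label: str
    start: int  # index of the first candle in the span (inclusive)
    end: int    # index of the last candle in the span (inclusive)
    avg_confidence: float


def _make_span(labels: Sequence[str], confidences: Sequence[float],
               start: int, end: int) -> Span:
    # Average the per-candle confidences over the inclusive range.
    window = confidences[start:end + 1]
    return Span(labels[start], start, end, sum(window) / len(window))


def group_spans(labels: Sequence[str], confidences: Sequence[float],
                outside_label: str = "O") -> List[Span]:
    """Group consecutive same-label predictions into spans, skipping "O"."""
    if len(labels) != len(confidences):
        raise ValueError("labels and confidences must have the same length")

    spans: List[Span] = []
    start = None  # index where the currently open span began, if any

    for i, label in enumerate(labels):
        # A label change closes the open span at the previous candle.
        if start is not None and label != labels[start]:
            spans.append(_make_span(labels, confidences, start, i - 1))
            start = None
        # Any label other than the outside label opens a new span.
        if start is None and label != outside_label:
            start = i

    # Close a span that runs to the end of the sequence.
    if start is not None:
        spans.append(_make_span(labels, confidences, start, len(labels) - 1))

    return spans
```

Calling `group_spans(["O", "A", "A", "B", "O", "A"], [0.9, 0.8, 0.6, 0.7, 0.95, 0.5])` yields three spans: label "A" over indices 1 to 2, label "B" at index 3, and label "A" at index 5. Two adjacent spans with different labels are kept separate, and "O" candles never appear in a span.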
## 7. Next.js API Proxy Routes