feat(ml): implement FastAPI inference service with model loading, preprocessing, and prediction endpoints

2026-02-15 14:29:07 +01:00 · 2026-02-15 14:29:07 +01:00 · 3a83fd38e9
commit 3a83fd38e9
parent f4c0f9a836
3 changed files with 945 additions and 10 deletions
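
Task 6.5 in the hunk below describes span grouping: consecutive predictions that share the same non-"O" label are merged into one span with an average confidence. The following is a minimal sketch of that rule, assuming per-candle labels and confidences arrive as parallel sequences. The names `Span` and `group_spans` and the inclusive `start`/`end` index fields are hypothetical and are not taken from the committed code.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Span:
    """Hypothetical span record; field names are illustrative only."""
    label: str
    start: int             # index of the first candle in the span (inclusive)
    end: int               # index of the last candle in the span (inclusive)
    avg_confidence: float  # mean confidence over the candles in the span


def _make_span(labels: Sequence[str], confidences: Sequence[float],
               start: int, end: int) -> Span:
    # Average the confidences of the candles covered by [start, end].
    window = confidences[start:end + 1]
    return Span(labels[start], start, end, sum(window) / len(window))


def group_spans(labels: Sequence[str], confidences: Sequence[float],
                outside: str = "O") -> List[Span]:
    """Group consecutive same-label predictions, skipping the `outside` label."""
    if len(labels) != len(confidences):
        raise ValueError("labels and confidences must have the same length")

    spans: List[Span] = []
    start = None  # index where the currently open run began, if any
    for i, label in enumerate(labels):
        # Close the open run as soon as the label changes.
        if start is not None and label != labels[start]:
            spans.append(_make_span(labels, confidences, start, i - 1))
            start = None
        # Open a new run on any label other than the outside marker.
        if start is None and label != outside:
            start = i
    # Flush a run that extends to the final candle.
    if start is not None:
        spans.append(_make_span(labels, confidences, start, len(labels) - 1))
    return spans
```

In this sketch, two adjacent runs with different labels produce two separate spans, and an "O" prediction always ends the current span without starting a new one.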
--- a/openspec/changes/candle-backend/tasks.md
+++ b/openspec/changes/candle-backend/tasks.md
@@ -51,16 +51,16 @@

 ## 6. Inference Service (FastAPI)

- [ ] 6.1 Create `services/ml/app/main.py` — FastAPI app with CORS, startup event to load model
- [ ] 6.2 Implement model loading — from MLflow registry (by name + stage) or from local .pkl file via joblib
- [ ] 6.3 Implement preprocessing parity — load pipeline config from MLflow artifact, apply same feature engineering as training
- [ ] 6.4 Create `POST /predict` endpoint — accept candles array, run preprocessing, predict, return per-candle labels + confidence + spans + model_info
- [ ] 6.5 Implement prediction span grouping — group consecutive same-label non-"O" predictions into spans with avg_confidence
- [ ] 6.6 Create `POST /predict/batch` endpoint — accept pair/timeframe/date range, load data, process in batch_size chunks, return predictions
- [ ] 6.7 Create `GET /model/info` endpoint — return model metadata, per-class metrics from MLflow
- [ ] 6.8 Create `GET /model/labels` endpoint — return label names and colors
- [ ] 6.9 Create `GET /health` endpoint — check model loaded status, MLflow connection, PostgreSQL connection
- [ ] 6.10 Add Pydantic request/response models for all endpoints (PredictRequest, PredictResponse, BatchPredictRequest, ModelInfoResponse)
+- [x] 6.1 Create `services/ml/app/main.py` — FastAPI app with CORS, startup event to load model
+- [x] 6.2 Implement model loading — from MLflow registry (by name + stage) or from local .pkl file via joblib
+- [x] 6.3 Implement preprocessing parity — load pipeline config from MLflow artifact, apply same feature engineering as training
+- [x] 6.4 Create `POST /predict` endpoint — accept candles array, run preprocessing, predict, return per-candle labels + confidence + spans + model_info
+- [x] 6.5 Implement prediction span grouping — group consecutive same-label non-"O" predictions into spans with avg_confidence
+- [x] 6.6 Create `POST /predict/batch` endpoint — accept pair/timeframe/date range, load data, process in batch_size chunks, return predictions
+- [x] 6.7 Create `GET /model/info` endpoint — return model metadata, per-class metrics from MLflow
+- [x] 6.8 Create `GET /model/labels` endpoint — return label names and colors
+- [x] 6.9 Create `GET /health` endpoint — check model loaded status, MLflow connection, PostgreSQL connection
+- [x] 6.10 Add Pydantic request/response models for all endpoints (PredictRequest, PredictResponse, BatchPredictRequest, ModelInfoResponse)

 ## 7. Next.js API Proxy Routes