feat: add ML service scaffolding with Python FastAPI, Docker, and MLflow setup
parent 92abab5316
commit 1a653c5866
18 changed files with 1952 additions and 2593 deletions
39
openspec/changes/candle-backend/proposal.md
Normal file
@@ -0,0 +1,39 @@
## Why

The annotation tool currently creates labeled datasets but has no way to train models on them or get predictions back. Adding a Python ML backend closes the loop: annotations become training data, models produce predictions, and predictions guide further annotation, creating an active learning cycle for candlestick pattern recognition.
## What Changes

- Add a Python service (`services/ml/`) alongside the existing Next.js app, using FastAPI for the REST API
- Implement TA-Lib-based candlestick pattern recognition to auto-generate annotations programmatically
- Build a configurable ML training pipeline (feature engineering → annotation ingestion → training → evaluation) with MLflow tracking and DVC for data versioning
- Support multiple model types: RandomForest and XGBoost initially, with an architecture ready for LSTM/GRU and transformer-based models later
- Serve trained models via a FastAPI inference API that accepts OHLCV candles and returns pattern predictions with confidence scores
- Add Next.js API proxy routes (`/api/predict`, `/api/predict/batch`, `/api/model/info`) to connect the frontend to the Python backend
- Add a prediction visualization layer on the chart (distinct from human annotations) with confidence filtering and disagreement detection
- Add a prediction controls panel for toggling predictions, filtering by label/confidence, and viewing per-class model metrics
- Implement a feedback loop: users can confirm, correct, or dismiss model predictions as new annotations
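
The confidence filtering and disagreement detection mentioned in the list above could look roughly like the following minimal sketch. The `Span` type, its fields, and both function names are hypothetical and not taken from the proposal. It assumes spans are inclusive candle index ranges and treats a disagreement as a prediction that overlaps a human annotation while carrying a different label.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Span:
    """Hypothetical span shape: inclusive candle indices plus a label."""
    start: int
    end: int
    label: str
    confidence: float = 1.0  # assumption: human annotations count as fully confident


def filter_by_confidence(predictions: List[Span], threshold: float) -> List[Span]:
    """Keep only predictions whose confidence is at or above the threshold."""
    return [p for p in predictions if p.confidence >= threshold]


def overlaps(a: Span, b: Span) -> bool:
    """Two inclusive index ranges overlap if each starts before the other ends."""
    return a.start <= b.end and b.start <= a.end


def find_disagreements(
    predictions: List[Span], annotations: List[Span]
) -> List[Tuple[Span, Span]]:
    """Return (prediction, annotation) pairs that overlap but differ in label."""
    return [
        (p, a)
        for p in predictions
        for a in annotations
        if overlaps(p, a) and p.label != a.label
    ]
```

Filtering before comparing keeps low-confidence predictions from flooding the disagreement list; whether the real UI applies the two steps in that order is an open design choice.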
## Capabilities

### New Capabilities

- `feature-engineering`: TA-Lib indicator computation and candle feature extraction from raw OHLCV data, producing enriched datasets for training and inference
- `annotation-ingestion`: Converting span annotations (human and programmatic) into labeled training datasets with BIO or windowed encoding, including TA-Lib CDL* pattern auto-labeling
- `ml-training`: Configurable model training pipeline with temporal splits, class balancing, MLflow experiment tracking, artifact logging, and model registry integration
- `ml-inference`: REST API serving trained models: accepts OHLCV candles, runs preprocessing, and returns predictions with confidence scores and model metadata
- `prediction-ui`: Frontend prediction layer with chart visualization, controls panel, confidence filtering, disagreement detection, and feedback loop for active learning
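
As an illustration of the BIO encoding and temporal splits named in the capabilities above, here is a minimal sketch using only the standard library. The function names and the `(start, end, label)` tuple shape are assumptions made for this example, not the pipeline's actual interface. It also assumes inclusive candle indices and lets a later span overwrite an earlier one where they overlap.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def spans_to_bio(n_candles: int, spans: Sequence[Tuple[int, int, str]]) -> List[str]:
    """Turn (start, end, label) spans into one BIO tag per candle.

    The first candle of a span gets "B-<label>", the remaining candles get
    "I-<label>", and every candle outside all spans gets "O".
    """
    tags = ["O"] * n_candles
    for start, end, label in spans:
        if not (0 <= start <= end < n_candles):
            raise ValueError(f"span ({start}, {end}) is outside 0..{n_candles - 1}")
        tags[start] = f"B-{label}"
        for i in range(start + 1, end + 1):
            tags[i] = f"I-{label}"
    return tags


def temporal_split(rows: Sequence[T], train_fraction: float = 0.8) -> Tuple[List[T], List[T]]:
    """Split time-ordered rows without shuffling, so the test set is strictly later."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return list(rows[:cut]), list(rows[cut:])
```

Splitting by position rather than at random keeps future candles out of the training set, which is the usual reason for preferring temporal splits on price data.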
### Modified Capabilities

- `backend-api`: New proxy routes (`/api/predict`, `/api/predict/batch`, `/api/model/info`) added to forward requests to the Python inference service
- `span-annotation`: Span export format consumed by the ML pipeline for training; prediction-confirmed spans can be saved as new annotations
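
One way the step from a reviewed prediction to a saved annotation might work is sketched below. The `Prediction` and `Annotation` types, the `source` values, and the action strings are all hypothetical. The sketch also assumes that dismissing a prediction produces no annotation, which the proposal does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Prediction:
    """Hypothetical model output for a span of candles (inclusive indices)."""
    start: int
    end: int
    label: str
    confidence: float


@dataclass(frozen=True)
class Annotation:
    """Hypothetical stored annotation; `source` records where it came from."""
    start: int
    end: int
    label: str
    source: str


def apply_feedback(
    prediction: Prediction, action: str, corrected_label: Optional[str] = None
) -> Optional[Annotation]:
    """Map a user's review action to a new annotation, or None if dismissed."""
    if action == "confirm":
        return Annotation(prediction.start, prediction.end, prediction.label, "model-confirmed")
    if action == "correct":
        if not corrected_label:
            raise ValueError("a corrected label is required for the 'correct' action")
        return Annotation(prediction.start, prediction.end, corrected_label, "model-corrected")
    if action == "dismiss":
        return None  # assumption: dismissed predictions are not stored
    raise ValueError(f"unknown action: {action!r}")
```

Recording the `source` keeps model-derived annotations distinguishable from purely human ones when they are later exported for training.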
## Impact

- **New dependencies**: Python 3.11+, FastAPI, uvicorn, scikit-learn, XGBoost, TA-Lib (C library + Python wrapper), MLflow, DVC, pandas, numpy, joblib
- **New service**: Python FastAPI service running on port 8001, needs to be added to docker-compose
- **Data flow**: Annotation JSON/CSV exports feed into Python pipeline; inference results flow back to the frontend via Next.js proxy routes
- **Infrastructure**: MLflow tracking server (port 5000), DVC remote storage for dataset versioning
- **Existing code changes**: New API routes in Next.js, new React components for prediction panel, chart overlay modifications for prediction rendering
- **Config**: Pipeline YAML config (`config/pipeline.yaml`) controls all ML stages; env vars for inference API URL and feature flags
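
The env-var side of the configuration could be read along these lines. This is only a sketch: the variable names `ML_INFERENCE_URL` and `ML_PREDICTIONS_ENABLED`, the `MlSettings` type, and the defaults are invented for illustration. Only the port 8001 comes from the proposal, and the `localhost` host is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class MlSettings:
    """Hypothetical settings object for reaching the inference service."""
    inference_url: str
    predictions_enabled: bool


_TRUTHY = {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> MlSettings:
    """Read the (hypothetical) env vars, falling back to local defaults."""
    # Strip a trailing slash so route paths can be appended consistently.
    url = env.get("ML_INFERENCE_URL", "http://localhost:8001").rstrip("/")
    flag = env.get("ML_PREDICTIONS_ENABLED", "false").strip().lower()
    return MlSettings(inference_url=url, predictions_enabled=flag in _TRUTHY)
```

Passing the environment as a mapping makes the loader easy to exercise in tests without touching the real process environment.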