1. Project Scaffolding & Infrastructure
- 1.1 Create services/ml/ directory structure: config/, features/, features/custom/, training/, training/models/, inference/, data/raw/, data/enriched/, data/labeled/, data/annotations/
- 1.2 Create services/ml/pyproject.toml (or requirements.txt) with dependencies: fastapi, uvicorn, scikit-learn, xgboost, pandas, numpy, joblib, mlflow, pyyaml, ta-lib, dvc, sqlalchemy, psycopg2-binary, pydantic
- 1.3 Create services/ml/Dockerfile with Python 3.11, TA-Lib C library installation (libta-lib-dev), and pip install of dependencies
- 1.4 Create config/pipeline.yaml with the full pipeline configuration (all stages, default hyperparameters, MLflow/DVC settings)
- 1.5 Add PostgreSQL, ml-service, and mlflow containers to docker-compose.yml with shared data volume
- 1.6 Initialize DVC in services/ml/ with local remote storage backend
- 1.7 Create PostgreSQL database schema: training_runs table (run_id, model_type, experiment_name, pipeline_config_hash, dataset_version, metrics_summary JSON, status, created_at, completed_at)
- 1.8 Create services/ml/app/db.py — SQLAlchemy engine and session setup for PostgreSQL connection
2. Pipeline Config & Entry Point
- 2.1 Create services/ml/app/config.py — Pydantic model for pipeline YAML config with validation (stages, data paths, hyperparameters)
- 2.2 Create services/ml/pipeline.py — main orchestrator that reads config and runs enabled stages in sequence
- 2.3 Add CLI argument parsing: --config, --stage (run individual stage), support for python pipeline.py --config config/pipeline.yaml
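Items 2.2 and 2.3 could be sketched roughly as follows. This is a minimal illustration, not the project's actual orchestrator: the stage names, the `STAGES` registry, and the helpers `parse_args`, `select_stages`, and `run_pipeline` are all hypothetical, and a plain dict stands in for the validated Pydantic config so the example needs only the standard library.

```python
import argparse

# Hypothetical stage functions; real ones would live in features/, app/, and training/.
def run_features(config):
    return "features done"

def run_annotations(config):
    return "annotations done"

def run_training(config):
    return "training done"

# Insertion order doubles as pipeline order.
STAGES = {
    "features": run_features,
    "annotations": run_annotations,
    "training": run_training,
}

def parse_args(argv=None):
    """Parse --config and the optional --stage flag."""
    parser = argparse.ArgumentParser(description="Run ML pipeline stages")
    parser.add_argument("--config", default="config/pipeline.yaml",
                        help="path to the pipeline YAML config")
    parser.add_argument("--stage", choices=list(STAGES), default=None,
                        help="run only this stage instead of all enabled stages")
    return parser.parse_args(argv)

def select_stages(enabled, only=None):
    """Return the stage names to run, in pipeline order."""
    if only is not None:
        return [only]
    return [name for name in STAGES if enabled.get(name, False)]

def run_pipeline(config, only=None):
    """Run the selected stages; `config` is a dict standing in for the Pydantic model."""
    results = {}
    for name in select_stages(config.get("stages", {}), only):
        results[name] = STAGES[name](config)
    return results
```

A caller would load and validate the YAML file named by `args.config`, then pass the result and `args.stage` to `run_pipeline`. Keeping stage selection in its own function makes the "enabled stages in sequence" rule easy to test without touching any data.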
3. Feature Engineering Stage
- 3.1 Create services/ml/features/talib_features.py — compute TA-Lib indicators from config list, append columns with {indicator}_{param} naming, fail with clear error if TA-Lib not installed
- 3.2 Create services/ml/features/candle_features.py — compute body_size, body_direction, upper_wick, lower_wick, wick_ratio, body_to_range, gap, range with division-by-zero handling
- 3.3 Create services/ml/features/custom_loader.py — dynamic import of custom feature functions from config paths, call with DataFrame, append result as column
- 3.4 Implement NaN warmup row handling — drop rows with NaN in indicator columns, log count of dropped rows
- 3.5 Wire feature engineering into pipeline.py — read raw OHLCV CSV, run enabled feature steps, write enriched CSV to data.enriched_path
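The candle features in item 3.2 are not defined in this plan, so the sketch below picks one plausible set of definitions purely for illustration. In particular, `wick_ratio` as upper wick over lower wick, `gap` as open minus previous close, and a fallback of 0.0 for any zero denominator are assumptions. It works on plain dicts rather than a pandas DataFrame to stay dependency-free; `safe_div` and `candle_features` are hypothetical names.

```python
def safe_div(num, den, default=0.0):
    """Divide, returning `default` when the denominator is zero."""
    return num / den if den != 0 else default

def candle_features(candles):
    """Compute per-candle shape features from dicts with open/high/low/close keys."""
    rows = []
    prev_close = None
    for c in candles:
        o, h, l, cl = c["open"], c["high"], c["low"], c["close"]
        body = abs(cl - o)
        rng = h - l
        upper = h - max(o, cl)
        lower = min(o, cl) - l
        rows.append({
            "body_size": body,
            "body_direction": (cl > o) - (cl < o),  # 1 bullish, -1 bearish, 0 flat
            "upper_wick": upper,
            "lower_wick": lower,
            "wick_ratio": safe_div(upper, lower),   # assumed definition
            "body_to_range": safe_div(body, rng),   # flat candles have range 0
            "gap": 0.0 if prev_close is None else o - prev_close,
            "range": rng,
        })
        prev_close = cl
    return rows
```

A flat candle (open, high, low, and close all equal) exercises both zero-denominator paths, which makes it a useful test fixture for the division-by-zero handling.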
4. Annotation Ingestion Stage
- 4.1 Create services/ml/app/annotation_ingestion.py — load annotations JSON from data.annotations_path, filter by min_confidence
- 4.2 Implement windowed classification encoding — extract fixed-size windows centered on each annotation span, flatten into single rows, handle boundary padding
- 4.3 Implement BIO sequence labeling encoding — assign B-{label}/I-{label}/O tags per candle, handle overlapping annotations with multiple tag columns
- 4.4 Implement TA-Lib CDL* programmatic labeling — run configured CDL functions, convert +100/-100 to label names (bullish_/bearish_ prefix)
- 4.5 Implement human/programmatic label merge strategies — human_priority, programmatic_priority, both (separate columns)
- 4.6 Implement context padding — include N candles before/after each annotation span
- 4.7 Add dataset statistics logging — counts per label, class distribution %, avg span length, human/programmatic agreement rate
- 4.8 Wire annotation ingestion into pipeline.py — read enriched CSV + annotations JSON, run encoding, write labeled CSV to data.labeled_path
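The BIO encoding in item 4.3, combined with the confidence filter from item 4.1, might look like the sketch below. It is deliberately simplified: annotations are assumed to carry inclusive candle indices (`start`, `end`) rather than timestamps, and overlapping annotations are resolved by letting the earlier one win instead of writing multiple tag columns as the plan calls for. The function name `bio_encode` and the annotation field names are hypothetical.

```python
def bio_encode(n_candles, annotations, min_confidence=0.0):
    """Return one BIO tag per candle for a single tag column."""
    tags = ["O"] * n_candles
    for ann in annotations:
        # Drop low-confidence annotations before encoding.
        if ann.get("confidence", 1.0) < min_confidence:
            continue
        # Clip the span to the available candles.
        start = max(ann["start"], 0)
        end = min(ann["end"], n_candles - 1)
        # Simplification: skip spans that collide with an already tagged candle.
        if any(tags[i] != "O" for i in range(start, end + 1)):
            continue
        for i in range(start, end + 1):
            prefix = "B-" if i == start else "I-"
            tags[i] = prefix + ann["label"]
    return tags
```

Supporting overlaps as described in 4.3 would mean keeping a list of tag columns and placing a colliding span into the first column where it fits, rather than skipping it.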
5. Training Stage
- 5.1 Create services/ml/training/train.py — main training entry point: load labeled CSV, split, train, evaluate, log to MLflow
- 5.2 Implement temporal train/validation/test splitting with configurable ratios, warn on random split
- 5.3 Create services/ml/training/models/random_forest.py — RandomForestClassifier wrapper with class_weights support
- 5.4 Create services/ml/training/models/xgboost_model.py — XGBClassifier wrapper with class_weights support
- 5.5 Implement model dispatch — select model class based on model_type config, fail with supported types list for unknown types
- 5.6 Implement MLflow experiment tracking — create run, log config artifact, dataset params, per-class sample counts, all hyperparameters
- 5.7 Implement metrics logging — accuracy, f1_macro, f1_weighted, per-class precision/recall/F1
- 5.8 Create services/ml/training/evaluation.py — generate confusion matrix plot, feature importance plot, classification report text
- 5.9 Implement MLflow artifact logging — log confusion_matrix.png, feature_importance.png, classification_report.txt, pipeline_config.yaml
- 5.10 Implement MLflow model registration — log model with sklearn/xgboost flavor, register in registry if configured
- 5.11 Store training run metadata in PostgreSQL training_runs table
- 5.12 Wire training into pipeline.py
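As a rough illustration of items 5.2 and 5.5, the sketch below splits rows in their existing order (assumed to be sorted by timestamp already) and dispatches on a `model_type` string. The names `split_dataset`, `build_model`, and `MODEL_REGISTRY` are hypothetical, and the registry holds placeholder factories rather than the real RandomForestClassifier and XGBClassifier wrappers so the example runs without scikit-learn or xgboost.

```python
import random
import warnings

def split_dataset(rows, ratios=(0.7, 0.15, 0.15), temporal=True, seed=0):
    """Split rows into train/validation/test; rows are assumed time-ordered."""
    train_r, val_r, test_r = ratios
    if abs(train_r + val_r + test_r - 1.0) > 1e-9:
        raise ValueError("split ratios must sum to 1")
    rows = list(rows)
    if not temporal:
        # A shuffled split lets the model train on candles from the future.
        warnings.warn("random split can leak future candles into training",
                      stacklevel=2)
        random.Random(seed).shuffle(rows)
    train_end = round(len(rows) * train_r)
    val_end = train_end + round(len(rows) * val_r)
    return rows[:train_end], rows[train_end:val_end], rows[val_end:]

# Placeholder factories standing in for the wrappers in training/models/.
MODEL_REGISTRY = {
    "random_forest": lambda params: ("RandomForestClassifier", params),
    "xgboost": lambda params: ("XGBClassifier", params),
}

def build_model(model_type, params=None):
    """Look up a model factory, failing with the list of supported types."""
    try:
        factory = MODEL_REGISTRY[model_type]
    except KeyError:
        supported = ", ".join(sorted(MODEL_REGISTRY))
        raise ValueError(
            f"unknown model_type {model_type!r}; supported types: {supported}"
        ) from None
    return factory(params or {})
```

Using `round` rather than `int` for the boundaries avoids losing a row to floating-point error when a ratio times the row count lands just below a whole number.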
6. Inference Service (FastAPI)
- 6.1 Create services/ml/app/main.py — FastAPI app with CORS, startup event to load model
- 6.2 Implement model loading — from MLflow registry (by name + stage) or from local .pkl file via joblib
- 6.3 Implement preprocessing parity — load pipeline config from MLflow artifact, apply same feature engineering as training
- 6.4 Create POST /predict endpoint — accept candles array, run preprocessing, predict, return per-candle labels + confidence + spans + model_info
- 6.5 Implement prediction span grouping — group consecutive same-label non-"O" predictions into spans with avg_confidence
- 6.6 Create POST /predict/batch endpoint — accept pair/timeframe/date range, load data, process in batch_size chunks, return predictions
- 6.7 Create GET /model/info endpoint — return model metadata, per-class metrics from MLflow
- 6.8 Create GET /model/labels endpoint — return label names and colors
- 6.9 Create GET /health endpoint — check model loaded status, MLflow connection, PostgreSQL connection
- 6.10 Add Pydantic request/response models for all endpoints (PredictRequest, PredictResponse, BatchPredictRequest, ModelInfoResponse)
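The span grouping in item 6.5 is a single pass over the per-candle predictions. The sketch below assumes predictions arrive as two parallel lists of plain labels and confidences, and that spans are reported with inclusive candle indices. The function name `group_spans` and the output field names other than `avg_confidence` are hypothetical.

```python
def _finish(span):
    """Replace the collected confidences with their mean."""
    confs = span.pop("_confs")
    span["avg_confidence"] = sum(confs) / len(confs)
    return span

def group_spans(labels, confidences):
    """Group consecutive identical non-"O" labels into spans."""
    spans = []
    current = None
    for i, (label, conf) in enumerate(zip(labels, confidences)):
        if current is not None and label == current["label"]:
            # Same label as the open span: extend it.
            current["end"] = i
            current["_confs"].append(conf)
            continue
        if current is not None:
            # Label changed: close the open span.
            spans.append(_finish(current))
            current = None
        if label != "O":
            current = {"label": label, "start": i, "end": i, "_confs": [conf]}
    if current is not None:
        spans.append(_finish(current))
    return spans
```

Two adjacent spans with different labels come out as separate spans with no gap between them, and a run of "O" predictions never opens a span.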
7. Next.js API Proxy Routes
- 7.1 Create src/app/api/predict/route.ts — POST proxy to ${INFERENCE_API_URL}/predict with timeout handling
- 7.2 Create src/app/api/predict/batch/route.ts — POST proxy to ${INFERENCE_API_URL}/predict/batch with INFERENCE_BATCH_TIMEOUT
- 7.3 Create src/app/api/model/info/route.ts — GET proxy to ${INFERENCE_API_URL}/model/info
- 7.4 Add environment variables to .env.local: INFERENCE_API_URL, INFERENCE_API_TIMEOUT, INFERENCE_BATCH_TIMEOUT, NEXT_PUBLIC_PREDICTIONS_ENABLED
8. Span Annotation Export & Feedback
- 8.1 Create src/app/api/span-annotations/export/route.ts — GET endpoint exporting span annotations as JSON in ML pipeline format
- 8.2 Add source and model_prediction fields to span annotation schema (Drizzle migration) — source defaults to "human", model_prediction is nullable JSON
- 8.3 Update span annotation POST endpoint to accept optional source and model_prediction fields
- 8.4 Support negative annotations — span with label "O", source "human_correction", and model_prediction metadata
9. Prediction UI — State & Controls
- 9.1 Create src/types/predictions.ts — PredictionSpan, PredictionState, ModelInfoResponse interfaces
- 9.2 Create prediction state management in page.tsx (or dedicated context) — spans, isLoading, error, modelInfo, visible, confidenceThreshold, selectedLabels, autoPredict
- 9.3 Create src/components/PredictionPanel.tsx — controls panel with master toggle, model info display, action buttons, confidence slider, label checkboxes with metrics
- 9.4 Implement on-demand prediction fetching — "Run on Visible" sends visible candles to /api/predict, "Predict All" sends batch request
- 9.5 Implement prediction caching — Map keyed by pair_timeframe_range_modelVersion, invalidate on model version change
10. Prediction UI — Chart Rendering
- 10.1 Add histogram series to CandleChart for prediction rendering — per-bar colors from label config at 10-20% opacity
- 10.2 Add series markers for prediction span labels — show {label} ({confidence}%) below bars at span start
- 10.3 Implement confidence threshold filtering — only render predictions above threshold
- 10.4 Implement label type filtering — toggle visibility per label from PredictionPanel checkboxes
- 10.5 Implement prediction layer visibility toggle — show/hide histogram series and markers
11. Prediction UI — Disagreements & Feedback
- 11.1 Implement disagreement detection — compare human spans vs prediction spans with >50% overlap, classify as missed_by_model, missed_by_human, label_mismatch
- 11.2 Render disagreement highlights — red dashed border (missed_by_model), yellow highlight (missed_by_human), orange border (label_mismatch)
- 11.3 Add "Show only disagreements" filter toggle in PredictionPanel
- 11.4 Implement prediction-to-annotation feedback — click missed_by_human prediction opens span annotation dialog pre-filled with predicted label/times
- 11.5 Add "Not a pattern" dismiss action — saves negative annotation with label "O" and model_prediction metadata
- 11.6 Display prediction summary in PredictionPanel — prediction count, agreement count, disagreement count
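The classification in item 11.1 is language-neutral logic. It is sketched here in Python to match the pipeline code, although the UI implementation would presumably be TypeScript. The plan does not say what the 50% is measured against, so the sketch assumes the overlap is taken relative to the shorter of the two spans, with inclusive candle indices. `overlap_fraction` and `find_disagreements` are hypothetical names.

```python
def overlap_fraction(a, b):
    """Overlap between two inclusive spans, relative to the shorter one (assumed)."""
    inter = min(a["end"], b["end"]) - max(a["start"], b["start"]) + 1
    if inter <= 0:
        return 0.0
    shorter = min(a["end"] - a["start"], b["end"] - b["start"]) + 1
    return inter / shorter

def find_disagreements(human, predicted, threshold=0.5):
    """Classify spans as missed_by_model, missed_by_human, or label_mismatch."""
    out = []
    matched = set()  # indices of predictions that overlap some human span
    for h in human:
        hits = [j for j, p in enumerate(predicted)
                if overlap_fraction(h, p) > threshold]
        matched.update(hits)
        if not hits:
            out.append({"kind": "missed_by_model", "human": h})
        elif all(predicted[j]["label"] != h["label"] for j in hits):
            # Report the mismatch against the best-overlapping prediction.
            best = max(hits, key=lambda j: overlap_fraction(h, predicted[j]))
            out.append({"kind": "label_mismatch", "human": h,
                        "prediction": predicted[best]})
    for j, p in enumerate(predicted):
        if j not in matched:
            out.append({"kind": "missed_by_human", "prediction": p})
    return out
```

Human spans that overlap a prediction with the same label produce no entry, so the agreement count for item 11.6 can be derived as the number of human spans minus the `missed_by_model` and `label_mismatch` entries.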
12. Inference API Connection & Error Handling
- 12.1 Implement inference API health polling — poll /api/model/info every 30 seconds when API unavailable, auto-reconnect
- 12.2 Show "Model server offline" banner when inference API unavailable, disable prediction controls
- 12.3 Ensure annotation tools work independently — prediction API errors never block human annotation
- 12.4 Add loading states for prediction fetching — skeleton/shimmer overlay during prediction requests
13. Documentation & Deployment
- 13.1 Update docker-compose.yml with all service environment variables and health checks
- 13.2 Update DEPLOYMENT.md with Python service setup instructions, TA-Lib installation, MLflow server, PostgreSQL, DVC init
- 13.3 Update README.md with ML pipeline overview, architecture diagram, and usage instructions
- 13.4 Update CLAUDE_DESCRIPTION.md with new ML service capabilities and file structure