candle-annotator/openspec/changes/candle-backend/specs/ml-inference/spec.md at 1a653c58663693c748fccf61f45b776b7ad83561

Marko Djordjevic 1a653c5866 feat: add ML service scaffolding with Python FastAPI, Docker, and MLflow setup

2026-02-15 11:58:31 +01:00


ADDED Requirements

Requirement: Model loading from MLflow registry

When stages.inference.model_source is "mlflow", the system SHALL load the model from the MLflow model registry using the model name (stages.inference.mlflow_model_name) and stage (stages.inference.mlflow_model_stage).

Scenario: Load production model

WHEN model_source is "mlflow", model name is "candlestick_pattern_v1", and stage is "Production"
THEN the system loads the model registered as "candlestick_pattern_v1" at the "Production" stage from MLflow

Scenario: Model not found in registry

WHEN the specified model name or stage does not exist in the MLflow registry
THEN the system SHALL return a clear error indicating the model was not found
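
A minimal sketch of the registry lookup, assuming the `models:/<name>/<stage>` URI form that MLflow uses to address registered models. `ModelNotFoundError`, `build_model_uri`, and `load_registry_model` are hypothetical names, not part of this spec. The loader is injectable so the sketch can be exercised without an MLflow server; when omitted it falls back to `mlflow.pyfunc.load_model`.

```python
from typing import Any, Callable, Optional


class ModelNotFoundError(Exception):
    """Hypothetical error type for a model name or stage missing from the registry."""


def build_model_uri(name: str, stage: str) -> str:
    # MLflow addresses registered models by stage as models:/<name>/<stage>.
    return f"models:/{name}/{stage}"


def load_registry_model(
    name: str,
    stage: str,
    loader: Optional[Callable[[str], Any]] = None,
) -> Any:
    if loader is None:
        # Imported lazily so the sketch stays importable without MLflow installed.
        import mlflow.pyfunc

        loader = mlflow.pyfunc.load_model
    uri = build_model_uri(name, stage)
    try:
        return loader(uri)
    except Exception as exc:
        # Assumption: every loader failure is reported as "not found".
        # A real service would likely inspect the MLflow error before deciding.
        raise ModelNotFoundError(
            f"Model {name!r} at stage {stage!r} was not found in the MLflow registry"
        ) from exc
```

With the values from the scenario above, `build_model_uri("candlestick_pattern_v1", "Production")` yields `models:/candlestick_pattern_v1/Production`.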

Requirement: Model loading from local file

When stages.inference.model_source is "local", the system SHALL load the model from the file path specified by stages.inference.local_model_path using joblib.

Scenario: Load local model

WHEN model_source is "local" and local_model_path is "models/best_model.pkl"
THEN the system loads the model from that file path

Scenario: Local model file missing

WHEN the local_model_path does not exist
THEN the system SHALL return an error indicating the model file was not found
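
The local path can be sketched the same way. The function name `load_local_model` and the error type `ModelFileNotFoundError` are assumptions for illustration; `joblib.load` is the real joblib call and is imported lazily so the existence check can run on its own.

```python
from pathlib import Path
from typing import Any


class ModelFileNotFoundError(FileNotFoundError):
    """Hypothetical error type for a missing local model file."""


def load_local_model(local_model_path: str) -> Any:
    path = Path(local_model_path)
    # Check up front so the caller gets an explicit "not found" error
    # instead of whatever joblib raises for a missing file.
    if not path.is_file():
        raise ModelFileNotFoundError(f"Model file not found: {path}")
    import joblib  # lazy import: only needed once the file is known to exist

    return joblib.load(path)
```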

Requirement: Preprocessing parity

The inference service SHALL replicate the exact preprocessing (feature engineering) used during training. The system SHALL load the pipeline config artifact from the MLflow run that produced the model and apply the same feature engineering steps (TA-Lib indicators, candle features) with the same parameters.

Scenario: Matching preprocessing

WHEN the model was trained with RSI(14) and EMA(20) features
THEN inference SHALL compute RSI(14) and EMA(20) on the input candles before running the model

Scenario: Config mismatch warning

WHEN the current pipeline config differs from the config stored with the model
THEN the system SHALL log a warning about the mismatch
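
One way the mismatch check could look, assuming both configs have already been flattened into plain dictionaries keyed by parameter name (the keys `rsi_period` and `ema_period` in the usage note are invented for illustration). The logger name and both function names are hypothetical.

```python
import logging
from typing import Any, Dict, List

logger = logging.getLogger("ml_inference")  # hypothetical logger name

_MISSING = object()  # sentinel so an absent key differs from an explicit None


def find_config_mismatches(stored: Dict[str, Any], current: Dict[str, Any]) -> List[str]:
    """Return the sorted keys whose values differ between the two configs."""
    keys = sorted(set(stored) | set(current))
    return [k for k in keys if stored.get(k, _MISSING) != current.get(k, _MISSING)]


def warn_on_mismatch(stored: Dict[str, Any], current: Dict[str, Any]) -> bool:
    """Log a warning if the configs differ; return True when they do."""
    mismatched = find_config_mismatches(stored, current)
    if mismatched:
        logger.warning(
            "Pipeline config differs from the config stored with the model: %s",
            ", ".join(mismatched),
        )
    return bool(mismatched)
```

For example, a model stored with `{"rsi_period": 14, "ema_period": 20}` compared against a current config of `{"rsi_period": 14, "ema_period": 50}` would report `ema_period` as the mismatched key.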

Requirement: Predict endpoint

The system SHALL provide a POST /predict endpoint on the FastAPI service (port 8001). The endpoint SHALL accept a JSON body with pair (string), timeframe (string), and candles (array of objects with time, open, high, low, close, volume). It SHALL return predictions with per-candle labels and confidence scores, prediction spans (grouped continuous predictions), and model metadata.

Scenario: Successful prediction

WHEN POST /predict is called with 100 valid candle objects
THEN the system returns a JSON response with predictions array (one entry per candle with time, label, confidence), spans array (continuous same-label predictions grouped with start_time, end_time, label, avg_confidence), and model_info object

Scenario: Empty candles array

WHEN POST /predict is called with an empty candles array
THEN the system returns HTTP 400 with an error message

Scenario: Invalid candle data

WHEN POST /predict is called with candle objects missing required fields
THEN the system returns HTTP 422 with validation error details
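
The three scenarios above map to three status codes. The sketch below is framework-agnostic and only illustrates that mapping; in a FastAPI service the 422 case would normally come from Pydantic request validation rather than hand-written checks. `validate_predict_body` and the error message wording are assumptions.

```python
from typing import Any, Dict, List, Tuple

REQUIRED_CANDLE_FIELDS = ("time", "open", "high", "low", "close", "volume")


def validate_predict_body(body: Dict[str, Any]) -> Tuple[int, List[str]]:
    """Return (http_status, errors) for a /predict request body."""
    errors: List[str] = []
    for field in ("pair", "timeframe"):
        if not isinstance(body.get(field), str):
            errors.append(f"{field}: must be a string")
    candles = body.get("candles")
    if not isinstance(candles, list):
        errors.append("candles: must be an array")
        return 422, errors
    for index, candle in enumerate(candles):
        if not isinstance(candle, dict):
            errors.append(f"candles[{index}]: must be an object")
            continue
        missing = [name for name in REQUIRED_CANDLE_FIELDS if name not in candle]
        if missing:
            errors.append(f"candles[{index}]: missing {', '.join(missing)}")
    if errors:
        return 422, errors  # malformed payload
    if not candles:
        return 400, ["candles: must not be empty"]  # well-formed but empty
    return 200, []
```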

Requirement: Batch predict endpoint

The system SHALL provide a POST /predict/batch endpoint that accepts pair, timeframe, start_date, and end_date. The system SHALL load OHLCV data from its own data store for the specified range, process it in chunks of stages.inference.batch_size rows, and return predictions for the full range.

Scenario: Batch prediction

WHEN POST /predict/batch is called with pair "EURUSD", timeframe "1H", start_date and end_date spanning 6 months
THEN the system loads the data, processes in batches, and returns predictions for the full range

Scenario: No data for range

WHEN the requested date range has no OHLCV data available
THEN the system returns HTTP 404 with a message indicating no data found for the range
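
The chunked processing could be sketched as follows. `iter_batches` and `predict_in_batches` are hypothetical helpers, and `predict_fn` stands in for whatever function turns a batch of feature rows into predictions. The sketch assumes features are computed before chunking; if they were computed per chunk, rolling indicators such as RSI(14) would need warm-up rows carried over from the previous chunk.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

Row = TypeVar("Row")
Pred = TypeVar("Pred")


def iter_batches(rows: Sequence[Row], batch_size: int) -> Iterator[Sequence[Row]]:
    """Yield consecutive slices of at most batch_size rows, in order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def predict_in_batches(
    rows: Sequence[Row],
    batch_size: int,
    predict_fn: Callable[[Sequence[Row]], List[Pred]],
) -> List[Pred]:
    """Run predict_fn on each batch and concatenate results for the full range."""
    predictions: List[Pred] = []
    for batch in iter_batches(rows, batch_size):
        predictions.extend(predict_fn(batch))
    return predictions
```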

Requirement: Model info endpoint

The system SHALL provide a GET /model/info endpoint that returns metadata about the currently loaded model: model_name, model_version, model_type, trained_at, dataset_version, feature_engineering enabled status, list of all labels the model knows, and per-class metrics (precision, recall, F1, training sample count for each label).

Scenario: Get model info

WHEN GET /model/info is called and a model is loaded
THEN the system returns JSON with model metadata and per-class metrics

Scenario: No model loaded

WHEN GET /model/info is called and no model has been loaded
THEN the system returns HTTP 503 with a message indicating no model is available

Requirement: Model labels endpoint

The system SHALL provide a GET /model/labels endpoint that returns the list of all pattern labels the current model can predict, along with their display colors.

Scenario: Get model labels

WHEN GET /model/labels is called
THEN the system returns a JSON array of label objects with name and color fields

Requirement: Health check endpoint

The system SHALL provide a GET /health endpoint that returns the service status including whether a model is loaded, the MLflow connection status, and the PostgreSQL connection status.

Scenario: Healthy service

WHEN GET /health is called and all dependencies are available
THEN the system returns HTTP 200 with { "status": "healthy", "model_loaded": true, "mlflow": "connected", "database": "connected" }

Scenario: Degraded service

WHEN GET /health is called but the MLflow server is unreachable
THEN the system returns HTTP 200 with { "status": "degraded", "model_loaded": true, "mlflow": "disconnected", "database": "connected" }
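
A small sketch of how the health payload might be assembled from three boolean checks. The function name `build_health` is hypothetical. The spec only defines the "degraded" case for an unreachable MLflow server; treating a missing model or an unreachable database as "degraded" too is an assumption of this sketch.

```python
from typing import Any, Dict


def build_health(model_loaded: bool, mlflow_ok: bool, database_ok: bool) -> Dict[str, Any]:
    # Assumption: any failed check downgrades the status to "degraded".
    all_ok = model_loaded and mlflow_ok and database_ok
    return {
        "status": "healthy" if all_ok else "degraded",
        "model_loaded": model_loaded,
        "mlflow": "connected" if mlflow_ok else "disconnected",
        "database": "connected" if database_ok else "disconnected",
    }
```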

Requirement: Prediction confidence scores

Each prediction SHALL include a confidence score between 0.0 and 1.0 derived from the model's probability output. For tree-based models, this is the max class probability from predict_proba().

Scenario: Confidence from predict_proba

WHEN the model predicts class "bull_flag" with probability 0.87
THEN the prediction confidence for that candle is 0.87
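
For a single candle, taking the maximum class probability can be sketched as below. It assumes a scikit-learn-style classifier, where the columns of `predict_proba()` follow the order of the estimator's `classes_` attribute; `label_with_confidence` is a hypothetical helper and the probabilities in the usage note are made up.

```python
from typing import Sequence, Tuple


def label_with_confidence(
    classes: Sequence[str], probabilities: Sequence[float]
) -> Tuple[str, float]:
    """Pick the most probable class and use its probability as the confidence."""
    if not classes or len(classes) != len(probabilities):
        raise ValueError("classes and probabilities must be non-empty and the same length")
    # Index of the largest probability in this candle's predict_proba row.
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return classes[best], float(probabilities[best])
```

Given classes `["O", "bull_flag", "bear_flag"]` and a probability row `[0.08, 0.87, 0.05]`, the helper returns `("bull_flag", 0.87)`.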

Requirement: Prediction span grouping

The system SHALL group consecutive candle predictions with the same non-"O" label into prediction spans. Each span SHALL have start_time, end_time, label, and avg_confidence (mean confidence of candles in the span).

Scenario: Group consecutive predictions

WHEN candles at T1, T2, T3 are all predicted as "bull_flag" with confidences 0.85, 0.90, 0.80
THEN the system creates one span: { start_time: T1, end_time: T3, label: "bull_flag", avg_confidence: 0.85 }

Scenario: Break on label change

WHEN candle T1 is "bull_flag" and candle T2 is "bear_flag"
THEN the system creates two separate spans
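
The grouping rule can be sketched with `itertools.groupby`. The sketch assumes predictions arrive as dictionaries with `time`, `label`, and `confidence` keys, already sorted by time with no gaps between candles; `group_spans` is a hypothetical name.

```python
from itertools import groupby
from typing import Any, Dict, List


def group_spans(
    predictions: List[Dict[str, Any]], outside_label: str = "O"
) -> List[Dict[str, Any]]:
    """Group runs of consecutive identical labels into spans, skipping "O"."""
    spans: List[Dict[str, Any]] = []
    # groupby starts a new group whenever the label changes,
    # which gives the "break on label change" behaviour.
    for label, run in groupby(predictions, key=lambda p: p["label"]):
        if label == outside_label:
            continue
        items = list(run)
        spans.append(
            {
                "start_time": items[0]["time"],
                "end_time": items[-1]["time"],
                "label": label,
                "avg_confidence": sum(p["confidence"] for p in items) / len(items),
            }
        )
    return spans
```

Because `avg_confidence` is a floating-point mean, callers comparing it to an expected value should use a tolerance or round it before display.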
