candle-annotator/openspec/specs/ml-inference/spec.md at 925e7284e33f1af4d058fe210335c44a2f50a316

Marko Djordjevic 925e7284e3 Archive code-review-fix change and sync specs to main

- Synced 14 capability delta specs to main specs
- Created 6 new main specs: api-authentication, error-boundary, input-validation, security-headers, shared-types
- Updated 8 existing specs with security, validation, and performance requirements
- Archived change to openspec/changes/archive/2026-02-20-code-review-fix/

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-02-20 08:54:59 +01:00


ADDED Requirements

Requirement: Model loading from MLflow registry

When stages.inference.model_source is "mlflow", the system SHALL load the model from the MLflow model registry using the model name (stages.inference.mlflow_model_name) and stage (stages.inference.mlflow_model_stage).

Scenario: Load production model

WHEN model_source is "mlflow", model name is "candlestick_pattern_v1", and stage is "Production"
THEN the system loads the model registered as "candlestick_pattern_v1" at the "Production" stage from MLflow

Scenario: Model not found in registry

WHEN the specified model name or stage does not exist in the MLflow registry
THEN the system SHALL return a clear error indicating the model was not found
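
A minimal, non-normative sketch of the registry lookup follows. It assumes an MLflow version that still supports stage-based `models:/<name>/<stage>` URIs and uses the generic pyfunc flavor; the helper names and the `ModelNotFoundError` type are hypothetical and do not come from this spec.

```python
def build_model_uri(model_name: str, stage: str) -> str:
    """Build a registry URI of the form models:/<name>/<stage>."""
    return f"models:/{model_name}/{stage}"


class ModelNotFoundError(Exception):
    """Hypothetical error type for a missing model name or stage."""


def load_registry_model(model_name: str, stage: str):
    # Imported lazily so the URI helper works without mlflow installed.
    import mlflow.pyfunc
    from mlflow.exceptions import MlflowException

    uri = build_model_uri(model_name, stage)
    try:
        return mlflow.pyfunc.load_model(uri)
    except MlflowException as exc:
        # Assumption: any registry error is reported as "not found".
        # A real service may want to separate connectivity failures.
        raise ModelNotFoundError(
            f"Model '{model_name}' at stage '{stage}' was not found in the MLflow registry"
        ) from exc
```

With the values from the scenario above, `build_model_uri("candlestick_pattern_v1", "Production")` yields the URI that would be passed to MLflow.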

Requirement: Model loading from local file

When stages.inference.model_source is "local", the system SHALL load the model from the file path specified by stages.inference.local_model_path using joblib.

Scenario: Load local model

WHEN model_source is "local" and local_model_path is "models/best_model.pkl"
THEN the system loads the model from that file path

Scenario: Local model file missing

WHEN the local_model_path does not exist
THEN the system SHALL return an error indicating the model file was not found

Requirement: Preprocessing parity

The inference service SHALL replicate the exact preprocessing (feature engineering) used during training. The system SHALL load the pipeline config artifact from the MLflow run that produced the model and apply the same feature engineering steps (TA-Lib indicators, candle features) with the same parameters.

Scenario: Matching preprocessing

WHEN the model was trained with RSI(14) and EMA(20) features
THEN inference SHALL compute RSI(14) and EMA(20) on the input candles before running the model

Scenario: Config mismatch warning

WHEN the current pipeline config differs from the config stored with the model
THEN the system SHALL log a warning about the mismatch
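
The mismatch check could be sketched as below. This is an illustration only: it assumes the pipeline config is a flat dictionary, and the key names (`rsi_period`, `ema_period`), the logger name, and both function names are hypothetical.

```python
import logging

logger = logging.getLogger("inference")


def diff_feature_config(trained: dict, current: dict) -> list[str]:
    """Return the sorted keys whose values differ between the two configs."""
    keys = set(trained) | set(current)
    return sorted(k for k in keys if trained.get(k) != current.get(k))


def resolve_feature_config(trained: dict, current: dict) -> dict:
    """Warn on mismatch, then return the training-time config for inference."""
    mismatched = diff_feature_config(trained, current)
    if mismatched:
        logger.warning(
            "Pipeline config differs from the config stored with the model: %s",
            ", ".join(mismatched),
        )
    return trained


trained_cfg = {"rsi_period": 14, "ema_period": 20}
current_cfg = {"rsi_period": 14, "ema_period": 50}
active_cfg = resolve_feature_config(trained_cfg, current_cfg)
```

Returning the training-time config keeps inference on RSI(14) and EMA(20) even when the current config has drifted, which is one way to satisfy the parity requirement while still surfacing the warning.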

Requirement: Predict endpoint

The system SHALL provide a POST /predict endpoint on the FastAPI service (port 8001). The endpoint SHALL accept a JSON body with pair (string), timeframe (string), and candles (array of objects with time, open, high, low, close, volume). It SHALL return predictions with per-candle labels and confidence scores, prediction spans (grouped continuous predictions), and model metadata.

Scenario: Successful prediction

WHEN POST /predict is called with 100 valid candle objects
THEN the system returns a JSON response with predictions array (one entry per candle with time, label, confidence), spans array (continuous same-label predictions grouped with start_time, end_time, label, avg_confidence), and model_info object

Scenario: Empty candles array

WHEN POST /predict is called with an empty candles array
THEN the system returns HTTP 400 with an error message

Scenario: Invalid candle data

WHEN POST /predict is called with candle objects missing required fields
THEN the system returns HTTP 422 with validation error details
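
The status-code mapping in the scenarios above can be illustrated with a framework-free sketch. The function name and the error messages are hypothetical. In a FastAPI service, missing fields would normally be rejected with 422 by request-model validation, so only the empty-array case would likely need an explicit check to produce 400.

```python
REQUIRED_CANDLE_FIELDS = ("time", "open", "high", "low", "close", "volume")


def check_predict_body(body: dict) -> tuple[int, str]:
    """Return (status_code, detail) for a POST /predict request body."""
    for field in ("pair", "timeframe"):
        if not isinstance(body.get(field), str):
            return 422, f"{field} must be a string"
    candles = body.get("candles")
    if not isinstance(candles, list):
        return 422, "candles must be an array"
    if not candles:
        # Structurally valid but unusable input maps to 400, not 422.
        return 400, "candles array must not be empty"
    for index, candle in enumerate(candles):
        if not isinstance(candle, dict):
            return 422, f"candle {index} must be an object"
        missing = [name for name in REQUIRED_CANDLE_FIELDS if name not in candle]
        if missing:
            return 422, f"candle {index} is missing: {', '.join(missing)}"
    return 200, "ok"
```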

Requirement: Batch predict endpoint

The system SHALL provide a POST /predict/batch endpoint that accepts pair, timeframe, start_date, and end_date. The system SHALL load OHLCV data from its own data store for the specified range, process in chunks of stages.inference.batch_size, and return predictions for the full range.

Scenario: Batch prediction

WHEN POST /predict/batch is called with pair "EURUSD", timeframe "1H", start_date and end_date spanning 6 months
THEN the system loads the data, processes in batches, and returns predictions for the full range

Scenario: No data for range

WHEN the requested date range has no OHLCV data available
THEN the system returns HTTP 404 with a message indicating no data found for the range
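
Chunked processing might look like the sketch below. `iter_chunks`, `predict_range`, and `predict_fn` are hypothetical names; `batch_size` stands in for stages.inference.batch_size.

```python
from typing import Callable, Iterator, Sequence


def iter_chunks(rows: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most batch_size rows."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def predict_range(rows: Sequence, batch_size: int, predict_fn: Callable) -> list:
    """Run predict_fn on each chunk and concatenate results in order."""
    predictions = []
    for chunk in iter_chunks(rows, batch_size):
        predictions.extend(predict_fn(chunk))
    return predictions
```

One design point to keep in mind: look-back indicators such as RSI(14) lose context at chunk boundaries if features are computed per chunk. Computing features over the full range first and chunking only the model call is one way to avoid that.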

Requirement: Model info endpoint

The system SHALL provide a GET /model/info endpoint that returns metadata about the currently loaded model: model_name, model_version, model_type, trained_at, dataset_version, feature_engineering enabled status, list of all labels the model knows, and per-class metrics (precision, recall, F1, training sample count for each label).

Scenario: Get model info

WHEN GET /model/info is called and a model is loaded
THEN the system returns JSON with model metadata and per-class metrics

Scenario: No model loaded

WHEN GET /model/info is called and no model has been loaded
THEN the system returns HTTP 503 with a message indicating no model is available

Requirement: Model labels endpoint

The system SHALL provide a GET /model/labels endpoint that returns the list of all pattern labels the current model can predict, along with their display colors.

Scenario: Get model labels

WHEN GET /model/labels is called
THEN the system returns a JSON array of label objects with name and color fields

Requirement: Health check endpoint

The system SHALL provide a GET /health endpoint that returns the service status including whether a model is loaded, the MLflow connection status, and the PostgreSQL connection status.

Scenario: Healthy service

WHEN GET /health is called and all dependencies are available
THEN the system returns HTTP 200 with { "status": "healthy", "model_loaded": true, "mlflow": "connected", "database": "connected" }

Scenario: Degraded service

WHEN GET /health is called but the MLflow server is unreachable
THEN the system returns HTTP 200 with { "status": "degraded", "model_loaded": true, "mlflow": "disconnected", "database": "connected" }

Requirement: Prediction confidence scores

Each prediction SHALL include a confidence score between 0.0 and 1.0 derived from the model's probability output. For tree-based models, this is the max class probability from predict_proba().

Scenario: Confidence from predict_proba

WHEN the model predicts class "bull_flag" with probability 0.87
THEN the prediction confidence for that candle is 0.87
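
As an illustration, the confidence for one candle can be taken from a single row of predict_proba() output. The label list and probabilities here are made-up example values; with scikit-learn estimators the column order follows the model's `classes_` attribute.

```python
def label_and_confidence(class_labels, probabilities):
    """Pick the most probable class and use its probability as confidence."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_labels[best], float(probabilities[best])


# Example values only: one probability row for a three-class model.
labels = ["O", "bull_flag", "bear_flag"]
row = [0.08, 0.87, 0.05]
label, confidence = label_and_confidence(labels, row)
```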

Requirement: Prediction span grouping

The system SHALL group consecutive candle predictions with the same non-"O" label into prediction spans. Each span SHALL have start_time, end_time, label, and avg_confidence (mean confidence of candles in the span).

Scenario: Group consecutive predictions

WHEN candles at T1, T2, T3 are all predicted as "bull_flag" with confidences 0.85, 0.90, 0.80
THEN the system creates one span: { start_time: T1, end_time: T3, label: "bull_flag", avg_confidence: 0.85 }

Scenario: Break on label change

WHEN candle T1 is "bull_flag" and candle T2 is "bear_flag"
THEN the system creates two separate spans
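
The grouping rule can be sketched with `itertools.groupby`. The sketch assumes predictions are already in ascending time order and that each one is a dictionary with time, label, and confidence keys; `group_spans` is a hypothetical name.

```python
from itertools import groupby


def group_spans(predictions, outside_label="O"):
    """Group consecutive same-label predictions into spans, skipping "O"."""
    spans = []
    for label, run in groupby(predictions, key=lambda p: p["label"]):
        if label == outside_label:
            continue
        run = list(run)
        spans.append({
            "start_time": run[0]["time"],
            "end_time": run[-1]["time"],
            "label": label,
            # Mean confidence over the candles in this span.
            "avg_confidence": sum(p["confidence"] for p in run) / len(run),
        })
    return spans
```

Because `groupby` starts a new group whenever the key changes, a "bull_flag" candle followed directly by a "bear_flag" candle produces two spans, as the second scenario requires.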

Requirement: CORS restricted to explicit origins

The FastAPI ML service SHALL configure CORS with explicit allowed origins instead of wildcard *. The default allowed origins SHALL be ["http://localhost:3000"]. Additional origins MAY be configured via the CORS_ORIGINS environment variable (comma-separated).

Scenario: Request from an allowed origin

WHEN a request from http://localhost:3000 hits the ML service
THEN the CORS headers allow the request

Scenario: Unknown origin blocked

WHEN a request from http://evil.com hits the ML service
THEN the CORS headers do not include Access-Control-Allow-Origin for that origin
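
A possible way to build the origin list is sketched below. It reads "additional origins" as entries appended to the default, which is an interpretation of the requirement; `allowed_origins` is a hypothetical helper.

```python
import os

DEFAULT_ORIGINS = ["http://localhost:3000"]


def allowed_origins(environ=os.environ) -> list[str]:
    """Merge the default origin with comma-separated CORS_ORIGINS entries."""
    raw = environ.get("CORS_ORIGINS", "")
    extras = [origin.strip() for origin in raw.split(",") if origin.strip()]
    merged = []
    for origin in DEFAULT_ORIGINS + extras:
        # Drop the wildcard explicitly and skip duplicates.
        if origin != "*" and origin not in merged:
            merged.append(origin)
    return merged


# In the FastAPI app the result would typically be passed to the CORS
# middleware, for example:
#   app.add_middleware(CORSMiddleware, allow_origins=allowed_origins())
```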

Requirement: Generic error responses from ML service

All FastAPI endpoints SHALL return generic error messages for unexpected exceptions. The response SHALL be { "detail": "Internal server error" } with HTTP 500. Full exception details SHALL be logged via logging.error with traceback.

Scenario: Internal error generic response

WHEN a predict endpoint throws an unexpected exception
THEN the client receives { "detail": "Internal server error" } and the traceback is logged server-side
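
The split between what is logged and what is returned can be sketched without the web framework. The logger name and `to_generic_response` are hypothetical; in FastAPI this logic would typically sit in a handler registered with `@app.exception_handler(Exception)`.

```python
import logging

logger = logging.getLogger("ml_service")

GENERIC_ERROR = {"detail": "Internal server error"}


def to_generic_response(exc: Exception) -> tuple[int, dict]:
    """Log the full exception server-side and return a generic body."""
    # exc_info attaches the traceback to the log record.
    logger.error("Unhandled exception in ML service", exc_info=exc)
    return 500, dict(GENERIC_ERROR)


try:
    raise RuntimeError("connection string leaked in message")
except Exception as exc:
    status, body = to_generic_response(exc)
```

The exception text reaches the log only; the client-facing body never includes it.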

Requirement: Model file integrity check

Before loading a model via joblib.load(), the system SHALL verify the model file's SHA256 hash against a manifest file (models/checksums.sha256). If the hash does not match or the manifest entry is missing, the system SHALL NOT deserialize the file and the load SHALL fail with an error.

Scenario: Valid model file

WHEN joblib.load("models/abc123.pkl") is called and the file's SHA256 matches the manifest
THEN the model loads successfully

Scenario: Tampered model file

WHEN the model file's SHA256 does not match the manifest entry
THEN the system refuses to load the model and returns HTTP 500 with { "detail": "Model integrity check failed" }
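
A sketch of the verification step follows. The spec does not define the manifest layout, so this assumes the `sha256sum` convention of one `<hex digest>  <file name>` entry per line; `ModelIntegrityError` and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


class ModelIntegrityError(Exception):
    """Hypothetical error raised when a model file fails verification."""


def sha256_of(path: Path) -> str:
    """Hash the file in 1 MiB blocks to avoid reading it all into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def read_manifest(manifest: Path) -> dict[str, str]:
    """Parse '<hex digest>  <file name>' lines into {file name: digest}."""
    entries = {}
    for line in manifest.read_text().splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            entries[parts[1].lstrip("*").strip()] = parts[0].lower()
    return entries


def verify_model_file(model_path: Path, manifest: Path) -> None:
    """Raise ModelIntegrityError unless the file matches its manifest entry."""
    expected = read_manifest(manifest).get(model_path.name)
    if expected is None:
        raise ModelIntegrityError(f"No manifest entry for {model_path.name}")
    if sha256_of(model_path) != expected:
        raise ModelIntegrityError(f"SHA256 mismatch for {model_path.name}")
```

Running `verify_model_file` before `joblib.load` matters because unpickling an untrusted file can execute arbitrary code.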

Requirement: Thread-safe model reads

The ML service SHALL use a lock for model reads during prediction, not just model writes. All code that reads the current model reference SHALL acquire the _model_swap_lock or use an atomic reference swap pattern.

Scenario: Concurrent prediction and model swap

WHEN a prediction request is in progress and a model load request arrives
THEN the prediction completes with the old model (or waits), and the new model is loaded atomically
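
One way to combine the lock with an atomic reference swap is sketched here. `ModelHolder` and `predict_with` are hypothetical names, and the model is represented as a plain callable for illustration.

```python
import threading


class ModelHolder:
    """Guards the current model reference for both reads and writes."""

    def __init__(self, model=None):
        self._model_swap_lock = threading.Lock()
        self._model = model

    def get(self):
        # Readers take the lock only long enough to copy the reference.
        with self._model_swap_lock:
            return self._model

    def swap(self, new_model):
        # The new model is fully loaded before this call; the swap itself
        # is a single assignment under the lock.
        with self._model_swap_lock:
            old, self._model = self._model, new_model
        return old


def predict_with(holder: ModelHolder, candles):
    model = holder.get()  # snapshot once; a later swap cannot change it
    if model is None:
        raise RuntimeError("no model loaded")
    return model(candles)
```

Holding the lock only for the reference copy, rather than for the whole prediction, lets an in-flight request finish with the old model while a new one is swapped in.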

Requirement: Path traversal prevention on model operations

The FastAPI endpoints that accept run_id for model load and delete SHALL validate the run_id format and verify that the resolved file path is within the expected models/ directory using Path.resolve().

Scenario: Valid run_id resolves within models directory

WHEN POST /model/load receives { run_id: "abc123" }
THEN the path resolves to models/abc123.pkl within the models directory

Scenario: Path traversal attempt blocked

WHEN POST /model/load receives { run_id: "../../etc/passwd" }
THEN the endpoint returns HTTP 400 before attempting any file operation
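
The two checks could be sketched as follows. The run_id pattern is an assumption, since the spec does not define the format, and `resolve_model_path` is a hypothetical helper. `Path.is_relative_to` requires Python 3.9 or later.

```python
import re
from pathlib import Path

MODELS_DIR = Path("models")
# Assumed format: letters, digits, underscore, and hyphen only.
RUN_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def resolve_model_path(run_id: str, models_dir: Path = MODELS_DIR) -> Path:
    """Validate run_id, then confirm the resolved path stays in models_dir."""
    if not RUN_ID_PATTERN.fullmatch(run_id):
        raise ValueError("invalid run_id format")
    root = models_dir.resolve()
    candidate = (root / f"{run_id}.pkl").resolve()
    # Second line of defence in case the pattern is ever loosened.
    if not candidate.is_relative_to(root):
        raise ValueError("run_id resolves outside the models directory")
    return candidate
```

An endpoint would map the `ValueError` to HTTP 400 before any file is opened or deleted.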

Requirement: Real health check

The GET /health endpoint SHALL perform actual connectivity checks instead of returning hardcoded status. It SHALL execute SELECT 1 against PostgreSQL and attempt an MLflow API call.

Scenario: All services healthy

WHEN PostgreSQL responds to SELECT 1 and MLflow API is reachable
THEN the health endpoint returns { "status": "healthy", "database": "connected", "mlflow": "connected" }

Scenario: Database unreachable

WHEN PostgreSQL does not respond to SELECT 1
THEN the health endpoint returns { "status": "degraded", "database": "disconnected" } with HTTP 200
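
The status logic can be sketched with the connectivity checks passed in as callables. Both function names are hypothetical, and treating a missing model as "degraded" is an assumption the spec does not state. In the service, `check_database` would execute SELECT 1 and `check_mlflow` would make an MLflow API call.

```python
from typing import Callable


def probe(check: Callable[[], object]) -> bool:
    """Run a connectivity check; any exception counts as unreachable."""
    try:
        check()
        return True
    except Exception:
        return False


def health_status(model_loaded: bool, check_database, check_mlflow) -> dict:
    """Build the /health body from live checks instead of fixed values."""
    db_ok = probe(check_database)
    mlflow_ok = probe(check_mlflow)
    all_ok = model_loaded and db_ok and mlflow_ok
    return {
        "status": "healthy" if all_ok else "degraded",
        "model_loaded": model_loaded,
        "mlflow": "connected" if mlflow_ok else "disconnected",
        "database": "connected" if db_ok else "disconnected",
    }
```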

Requirement: Candle time-sort validation

The POST /predict endpoint SHALL validate that input candles are sorted by time in ascending order. If candles are not sorted, the endpoint SHALL sort them before processing.

Scenario: Unsorted candles auto-sorted

WHEN candles are provided in random time order
THEN the endpoint sorts them by time before feature engineering and prediction
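
A small sketch of the check-then-sort step, assuming every candle's time value is mutually comparable (for example epoch seconds or ISO 8601 strings in one format); `sort_candles` is a hypothetical name.

```python
def sort_candles(candles: list) -> list:
    """Return candles in ascending time order, sorting only when needed."""
    times = [candle["time"] for candle in candles]
    if all(earlier <= later for earlier, later in zip(times, times[1:])):
        return candles
    # sorted() is stable, so candles with equal times keep their input order.
    return sorted(candles, key=lambda candle: candle["time"])
```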
