candle-annotator/openspec/changes/code-review-fix/specs/ml-inference/spec.md

## ADDED Requirements

### Requirement: CORS restricted to explicit origins
The FastAPI ML service SHALL configure CORS with explicit allowed origins instead of wildcard `*`. The default allowed origins SHALL be `["http://localhost:3000"]`. Additional origins MAY be configured via the `CORS_ORIGINS` environment variable (comma-separated).

#### Scenario: Allowed origin request permitted
- **WHEN** a request from `http://localhost:3000` hits the ML service
- **THEN** the CORS headers allow the request

#### Scenario: Unknown origin blocked
- **WHEN** a request from `http://evil.com` hits the ML service
- **THEN** the CORS headers do not include `Access-Control-Allow-Origin` for that origin
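
As an illustration of the origin configuration described above, the following is a minimal sketch of how the default list and the `CORS_ORIGINS` value could be combined. The helper name `resolve_cors_origins` is hypothetical, and dropping a configured `*` entry is an assumption of this sketch rather than something the requirement states.

```python
DEFAULT_ORIGINS = ["http://localhost:3000"]


def resolve_cors_origins(env_value=None):
    """Return the default origins plus any extras from a comma-separated string."""
    origins = list(DEFAULT_ORIGINS)
    for raw in (env_value or "").split(","):
        origin = raw.strip()
        # Assumption: a wildcard supplied through configuration is ignored,
        # so the service never falls back to allowing every origin.
        if origin and origin != "*" and origin not in origins:
            origins.append(origin)
    return origins


# Possible wiring in the FastAPI app (not executed in this sketch):
#   from fastapi.middleware.cors import CORSMiddleware
#   app.add_middleware(
#       CORSMiddleware,
#       allow_origins=resolve_cors_origins(os.getenv("CORS_ORIGINS")),
#   )
```

With an explicit list, the CORS middleware only echoes `Access-Control-Allow-Origin` for origins in that list, which is what the "Unknown origin blocked" scenario relies on.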

### Requirement: Generic error responses from ML service
All FastAPI endpoints SHALL return a generic error message for unexpected exceptions. The response SHALL be `{ "detail": "Internal server error" }` with HTTP 500. The full exception, including its traceback, SHALL be logged server-side via `logging.error` (for example with `exc_info=True`) and SHALL NOT appear in the response body.

#### Scenario: Internal error generic response
- **WHEN** a predict endpoint throws an unexpected exception
- **THEN** the client receives `{ "detail": "Internal server error" }` and the traceback is logged server-side
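
The split between what is logged and what is returned can be sketched without any framework. The helper `to_error_response` and the logger name `ml_service` are hypothetical; in the service this logic would presumably sit inside a FastAPI exception handler registered for `Exception`.

```python
import logging

logger = logging.getLogger("ml_service")  # hypothetical logger name

GENERIC_ERROR_BODY = {"detail": "Internal server error"}


def to_error_response(exc):
    """Log the full exception server-side and return a generic (status, body) pair."""
    # exc_info=exc attaches this exception's traceback to the log record.
    logger.error("Unhandled exception in ML service", exc_info=exc)
    # Return a copy so callers cannot mutate the shared constant.
    return 500, dict(GENERIC_ERROR_BODY)
```

The returned body never includes `str(exc)`, so messages that mention file paths, SQL, or credentials stay in the server log.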

### Requirement: Model file integrity check
Before deserializing a model via `joblib.load()`, the system SHALL verify the model file's SHA256 hash against a manifest file (`models/checksums.sha256`). If the hash does not match or the manifest entry is missing, the load SHALL fail with an error and `joblib.load()` SHALL NOT be called on that file.

#### Scenario: Valid model file
- **WHEN** a load of `models/abc123.pkl` is requested and the file's SHA256 matches its manifest entry
- **THEN** the model loads successfully

#### Scenario: Tampered model file
- **WHEN** the model file's SHA256 does not match the manifest entry
- **THEN** the system refuses to load the model and returns HTTP 500 with `{ "detail": "Model integrity check failed" }`
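
A minimal sketch of the hash check follows. It assumes the manifest uses the `sha256sum` line format (`<hex digest>  <file name>`) keyed by file name, which the requirement does not specify. The names `ModelIntegrityError`, `sha256_of`, `read_manifest`, and `verify_model_file` are hypothetical.

```python
import hashlib
from pathlib import Path


class ModelIntegrityError(Exception):
    """Raised when a model file has no manifest entry or its hash differs."""


def sha256_of(path, chunk_size=1 << 20):
    """Hash the file in chunks so large models are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_manifest(manifest_path):
    """Parse '<hex digest>  <file name>' lines into {file name: digest}."""
    entries = {}
    for line in Path(manifest_path).read_text().splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            digest, name = parts
            # sha256sum marks binary mode with a leading '*' on the name.
            entries[name.strip().lstrip("*")] = digest.lower()
    return entries


def verify_model_file(model_path, manifest_path):
    """Raise ModelIntegrityError unless the file matches its manifest entry."""
    name = Path(model_path).name
    expected = read_manifest(manifest_path).get(name)
    if expected is None:
        raise ModelIntegrityError(f"no manifest entry for {name}")
    if sha256_of(model_path) != expected:
        raise ModelIntegrityError(f"hash mismatch for {name}")
```

The caller would run `verify_model_file(...)` first and call `joblib.load(...)` only if no exception is raised, since unpickling an untrusted file can execute arbitrary code.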

### Requirement: Thread-safe model reads
The ML service SHALL use a lock for model reads during prediction, not just model writes. All code that reads the current model reference SHALL acquire the `_model_swap_lock` or use an atomic reference swap pattern.

#### Scenario: Concurrent prediction and model swap
- **WHEN** a prediction request is in progress and a model load request arrives
- **THEN** the prediction completes with the old model (or waits), and the new model is loaded atomically
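
One way to satisfy this requirement is the reference-snapshot pattern sketched below: readers hold the lock only long enough to copy the current reference, and a swap replaces the reference in one step. The `ModelHolder` class and the `predict` helper are hypothetical. Only the lock name `_model_swap_lock` comes from the requirement.

```python
import threading


class ModelHolder:
    """Holds the current model; readers take a snapshot, writers swap the reference."""

    def __init__(self, model=None):
        self._model_swap_lock = threading.Lock()
        self._model = model

    def get(self):
        # Hold the lock only while copying the reference, not during inference.
        with self._model_swap_lock:
            return self._model

    def swap(self, new_model):
        # The caller fully loads new_model before calling swap().
        with self._model_swap_lock:
            old, self._model = self._model, new_model
        return old


def predict(holder, features):
    model = holder.get()  # snapshot: a later swap does not affect this call
    if model is None:
        raise RuntimeError("no model loaded")
    return model.predict(features)
```

Because a prediction works on its own snapshot, an in-flight request finishes with the old model while new requests pick up the swapped one. Holding the lock for the whole prediction would also meet the requirement, but it would serialize requests.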

### Requirement: Path traversal prevention on model operations
The FastAPI endpoints that accept `run_id` for model load and delete SHALL validate the `run_id` format and verify that the resolved file path is within the expected `models/` directory using `Path.resolve()`.

#### Scenario: Valid run_id resolves within models directory
- **WHEN** POST `/model/load` receives `{ "run_id": "abc123" }`
- **THEN** the path resolves to `models/abc123.pkl` within the models directory

#### Scenario: Path traversal attempt blocked
- **WHEN** POST `/model/load` receives `{ "run_id": "../../etc/passwd" }`
- **THEN** the endpoint returns HTTP 400 before attempting any file operation
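
The two checks named in the requirement (format validation, then containment after `Path.resolve()`) can be sketched as follows. The allowed `run_id` pattern and the helper name `resolve_model_path` are assumptions, since the spec does not define the exact format.

```python
import re
from pathlib import Path

# Assumed format: letters, digits, underscore, hyphen; 1 to 64 characters.
RUN_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def resolve_model_path(run_id, models_dir="models"):
    """Return the resolved model path, or raise ValueError for an unsafe run_id."""
    if not isinstance(run_id, str) or not RUN_ID_PATTERN.fullmatch(run_id):
        raise ValueError("invalid run_id")
    base = Path(models_dir).resolve()
    candidate = (base / f"{run_id}.pkl").resolve()
    try:
        # relative_to raises ValueError when candidate is not under base.
        candidate.relative_to(base)
    except ValueError:
        raise ValueError("run_id resolves outside the models directory") from None
    return candidate
```

An endpoint would catch the `ValueError` and respond with HTTP 400 before any file is opened or deleted. The containment check stays in place even though the pattern already rejects `/` and `.`, so that loosening the pattern later does not reopen the traversal.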

### Requirement: Real health check
The `GET /health` endpoint SHALL perform actual connectivity checks instead of returning hardcoded status. It SHALL execute `SELECT 1` against PostgreSQL and attempt an MLflow API call.

#### Scenario: All services healthy
- **WHEN** PostgreSQL responds to `SELECT 1` and MLflow API is reachable
- **THEN** the health endpoint returns `{ "status": "healthy", "database": "connected", "mlflow": "connected" }`

#### Scenario: Database unreachable
- **WHEN** PostgreSQL does not respond to `SELECT 1`
- **THEN** the health endpoint returns `{ "status": "degraded", "database": "disconnected" }` with HTTP 200
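
The aggregation logic behind these scenarios can be sketched independently of the actual probes. Here each probe is a callable that raises on failure. In the service, the `database` probe would run `SELECT 1` and the `mlflow` probe would make an MLflow API call. The function name `run_health_checks` is hypothetical, and this sketch always reports every probe, whereas the degraded scenario above lists only the failing field.

```python
def run_health_checks(checks):
    """Run each named probe; a probe signals failure by raising any exception."""
    report = {}
    for name, probe in checks.items():
        try:
            probe()
            report[name] = "connected"
        except Exception:
            report[name] = "disconnected"
    healthy = all(state == "connected" for state in report.values())
    # The HTTP status stays 200 in both cases; only the body changes.
    return {"status": "healthy" if healthy else "degraded", **report}
```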

### Requirement: Candle time-sort validation
The `POST /predict` endpoint SHALL check whether the input candles are sorted by time in ascending order. If they are not, the endpoint SHALL sort them by time in ascending order before feature engineering and prediction, rather than rejecting the request.

#### Scenario: Unsorted candles auto-sorted
- **WHEN** candles are provided out of ascending time order
- **THEN** the endpoint sorts them by time before feature engineering and prediction
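
A minimal sketch of the check-then-sort step follows, assuming each candle is a dict with a `time` field. The field name and the helper `ensure_time_sorted` are hypothetical.

```python
def ensure_time_sorted(candles, key="time"):
    """Return candles in ascending time order, sorting only when needed."""
    times = [c[key] for c in candles]
    # Already sorted: compare each timestamp with its successor.
    if all(a <= b for a, b in zip(times, times[1:])):
        return candles
    # sorted() is stable, so candles with equal times keep their input order.
    return sorted(candles, key=lambda c: c[key])
```

The function returns a new list when sorting is needed and leaves the caller's list unmodified, so the original request payload stays available for logging.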