Changes: - Updated docker-compose.yml MLflow service port binding from 5000:5000 to 127.0.0.1:5000:5000 to restrict access to localhost only for security - Marked task 1.7 as complete in tasks.md Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
## ADDED Requirements
### Requirement: CORS restricted to explicit origins
The FastAPI ML service SHALL configure CORS with explicit allowed origins instead of wildcard `*`. The default allowed origins SHALL be `["http://localhost:3000"]`. Additional origins MAY be configured via the `CORS_ORIGINS` environment variable (comma-separated).
#### Scenario: Allowed origin request accepted
- **WHEN** a request from `http://localhost:3000` hits the ML service
- **THEN** the response includes `Access-Control-Allow-Origin: http://localhost:3000`, so the browser allows the request
#### Scenario: Unknown origin blocked
- **WHEN** a request from `http://evil.com` hits the ML service
- **THEN** the CORS headers do not include `Access-Control-Allow-Origin` for that origin
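
A minimal sketch of how the allowed-origin list might be assembled, assuming that entries from `CORS_ORIGINS` are appended to the default and that a literal `*` entry is rejected. The function name `allowed_origins` and the rejection behaviour are illustrative assumptions, not part of this spec.

```python
import os
from typing import List, Optional

# Default from the requirement above.
DEFAULT_ORIGINS = ["http://localhost:3000"]


def allowed_origins(raw: Optional[str] = None) -> List[str]:
    """Return the default origins plus extras from a comma-separated string.

    When `raw` is None the value is read from the CORS_ORIGINS environment
    variable. Rejecting "*" is an assumption made for this sketch.
    """
    if raw is None:
        raw = os.environ.get("CORS_ORIGINS", "")
    origins = list(DEFAULT_ORIGINS)
    for item in raw.split(","):
        origin = item.strip().rstrip("/")  # tolerate spaces and trailing slashes
        if origin == "*":
            raise ValueError("wildcard origin is not allowed")
        if origin and origin not in origins:
            origins.append(origin)
    return origins


# The resulting list would then be handed to FastAPI's CORS middleware, e.g.
# app.add_middleware(CORSMiddleware, allow_origins=allowed_origins())
```

With `CORS_ORIGINS` unset, the function returns only the default origin, so a request from `http://evil.com` would not match any entry.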
### Requirement: Generic error responses from ML service
All FastAPI endpoints SHALL return generic error messages for unexpected exceptions. The response SHALL be `{ "detail": "Internal server error" }` with HTTP 500. Full exception details SHALL be logged via `logging.error` with traceback.
#### Scenario: Internal error generic response
- **WHEN** a predict endpoint raises an unexpected exception
- **THEN** the client receives `{ "detail": "Internal server error" }` and the traceback is logged server-side
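
The behaviour above could be sketched as a small helper that logs the full traceback and returns only the generic body. The helper name `handle_unexpected` and its `(status, body)` return shape are assumptions for illustration; in the service it would presumably be called from a handler registered with `@app.exception_handler(Exception)`.

```python
import logging

GENERIC_ERROR = {"detail": "Internal server error"}


def handle_unexpected(exc: Exception):
    """Log the exception with its traceback and return a generic response.

    Returns a (status_code, body) pair; nothing from `exc` reaches the body.
    """
    # exc_info accepts an exception instance and attaches its traceback.
    logging.error("Unhandled exception in ML service", exc_info=exc)
    return 500, dict(GENERIC_ERROR)


# Example: the exception message stays in the server log only.
try:
    raise RuntimeError("connection string: postgres://user:secret@db")
except RuntimeError as err:
    status, body = handle_unexpected(err)
```

Returning a copy of `GENERIC_ERROR` keeps callers from mutating the shared constant.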
### Requirement: Model file integrity check
Before loading a model via `joblib.load()`, the system SHALL verify the model file's SHA256 hash against a manifest file (`models/checksums.sha256`). If the hash does not match or the manifest entry is missing, the load SHALL fail with an error.
#### Scenario: Valid model file
- **WHEN** `joblib.load("models/abc123.pkl")` is called and the file's SHA256 matches the manifest
- **THEN** the model loads successfully
#### Scenario: Tampered model file
- **WHEN** the model file's SHA256 does not match the manifest entry
- **THEN** the system refuses to load the model and returns HTTP 500 with `{ "detail": "Model integrity check failed" }`
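
One way the integrity check might look, as a sketch only. It assumes the manifest uses the `sha256sum` line format (`<hex digest>  <file name>`) keyed by bare file name; the spec does not state the manifest format, and `ModelIntegrityError`, `read_manifest`, and `verify_model` are hypothetical names.

```python
import hashlib
from pathlib import Path
from typing import Dict


class ModelIntegrityError(Exception):
    """Raised when a model file is missing from the manifest or its hash differs."""


def sha256_of(path: Path) -> str:
    """Hash the file in 1 MiB chunks so large models are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_manifest(manifest: Path) -> Dict[str, str]:
    """Parse '<hex>  <name>' lines (assumed format) into {file name: digest}."""
    entries = {}
    for line in manifest.read_text().splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            name = Path(parts[1].strip().lstrip("*")).name
            entries[name] = parts[0].lower()
    return entries


def verify_model(model: Path, manifest: Path) -> None:
    """Raise ModelIntegrityError unless the manifest lists a matching hash."""
    expected = read_manifest(manifest).get(model.name)
    if expected is None or expected != sha256_of(model):
        raise ModelIntegrityError("Model integrity check failed")


# Only after verify_model() returns would the service call joblib.load(model).
```

The check runs before deserialization because unpickling a tampered file can execute arbitrary code.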
### Requirement: Thread-safe model reads
The ML service SHALL use a lock for model reads during prediction, not just model writes. All code that reads the current model reference SHALL acquire the `_model_swap_lock` or use an atomic reference swap pattern.
#### Scenario: Concurrent prediction and model swap
- **WHEN** a prediction request is in progress and a model load request arrives
- **THEN** the prediction completes with the old model (or waits), and the new model is loaded atomically
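
A sketch of the reference-swap pattern, under the assumption that the model is held in one shared object. `ModelHolder` and `predict` are illustrative names; only `_model_swap_lock` comes from the requirement. Each prediction takes a snapshot of the reference under the lock and then works with that local reference, so a concurrent swap cannot change the model mid-request.

```python
import threading


class ModelHolder:
    """Holds the current model; reads and writes both go through the lock."""

    def __init__(self, model=None):
        self._model_swap_lock = threading.Lock()
        self._model = model

    def get(self):
        # Short critical section: copy the reference, then release the lock.
        with self._model_swap_lock:
            return self._model

    def swap(self, new_model):
        # Replace the reference in one step and hand back the previous model.
        with self._model_swap_lock:
            old, self._model = self._model, new_model
        return old


def predict(holder: ModelHolder, features):
    model = holder.get()  # snapshot once; never re-read during the request
    if model is None:
        raise RuntimeError("no model loaded")
    return model(features)  # stand-in for model.predict(...)
```

Holding the lock only while copying the reference keeps predictions from serializing behind one another; the slow work of loading the new model from disk can happen before `swap` is called.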
### Requirement: Path traversal prevention on model operations
The FastAPI endpoints that accept `run_id` for model load and delete SHALL validate the `run_id` format and verify that the resolved file path is within the expected `models/` directory using `Path.resolve()`.
#### Scenario: Valid run_id resolves within models directory
- **WHEN** POST `/model/load` receives `{ "run_id": "abc123" }`
- **THEN** the path resolves to `models/abc123.pkl` within the models directory
#### Scenario: Path traversal attempt blocked
- **WHEN** POST `/model/load` receives `{ "run_id": "../../etc/passwd" }`
- **THEN** the endpoint returns HTTP 400 before attempting any file operation
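
The two checks above (format validation, then a containment check on the resolved path) might be sketched as follows. The accepted `run_id` pattern is an assumption, since the spec does not define the format, and `InvalidRunId` and `resolve_model_path` are hypothetical names; the endpoint would map `InvalidRunId` to HTTP 400.

```python
import re
from pathlib import Path

# Assumed format: 1-64 letters, digits, underscores, or hyphens.
RUN_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


class InvalidRunId(ValueError):
    """Raised for a malformed run_id or a path that escapes the models directory."""


def resolve_model_path(run_id: str, models_dir: Path = Path("models")) -> Path:
    # Step 1: reject anything that is not a plain identifier (no "/", "..", or ".").
    if not RUN_ID_PATTERN.fullmatch(run_id):
        raise InvalidRunId("run_id has an invalid format")
    # Step 2: resolve symlinks and ".." segments, then confirm containment.
    base = models_dir.resolve()
    candidate = (base / f"{run_id}.pkl").resolve()
    if base not in candidate.parents:
        raise InvalidRunId("resolved path is outside the models directory")
    return candidate
```

The second check looks redundant after the first, but it still catches a symlink inside `models/` that points elsewhere.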
### Requirement: Real health check
The `GET /health` endpoint SHALL perform actual connectivity checks instead of returning hardcoded status. It SHALL execute `SELECT 1` against PostgreSQL and attempt an MLflow API call.
#### Scenario: All services healthy
- **WHEN** PostgreSQL responds to `SELECT 1` and MLflow API is reachable
- **THEN** the health endpoint returns `{ "status": "healthy", "database": "connected", "mlflow": "connected" }`
#### Scenario: Database unreachable
- **WHEN** PostgreSQL does not respond to `SELECT 1`
- **THEN** the health endpoint returns `{ "status": "degraded", "database": "disconnected" }` with HTTP 200
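
A sketch of how the status body could be built from two probes. The probes are passed in as callables so the example stays independent of any database driver or MLflow client; `probe` and `health_status` are hypothetical names. Reporting the `mlflow` field in the degraded case as well is an assumption, since the scenario above only shows `status` and `database`.

```python
from typing import Callable, Dict


def probe(check: Callable[[], object]) -> str:
    """Run one connectivity check; any exception counts as disconnected."""
    try:
        check()
        return "connected"
    except Exception:
        return "disconnected"


def health_status(check_database: Callable[[], object],
                  check_mlflow: Callable[[], object]) -> Dict[str, str]:
    """Build the /health body from live probe results.

    check_database is expected to execute `SELECT 1` against PostgreSQL;
    check_mlflow is expected to make one MLflow API call.
    """
    database = probe(check_database)
    mlflow = probe(check_mlflow)
    healthy = database == "connected" and mlflow == "connected"
    return {
        "status": "healthy" if healthy else "degraded",
        "database": database,
        "mlflow": mlflow,
    }
```

The endpoint would return this body with HTTP 200 in both cases, matching the scenarios above.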
### Requirement: Candle time-sort validation
The `POST /predict` endpoint SHALL validate that input candles are sorted by time in ascending order. If candles are not sorted, the endpoint SHALL sort them before processing.
#### Scenario: Unsorted candles auto-sorted
- **WHEN** candles are provided out of chronological order
- **THEN** the endpoint sorts them by time before feature engineering and prediction
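
The validate-then-sort step could be sketched like this, assuming each candle is a mapping with a `time` key (the spec does not name the field) and using the hypothetical helper name `ensure_sorted`.

```python
from typing import Any, Dict, List

Candle = Dict[str, Any]


def ensure_sorted(candles: List[Candle]) -> List[Candle]:
    """Return candles in ascending time order.

    Already-sorted input is returned unchanged; otherwise a sorted copy is
    returned and the caller's list is left untouched.
    """
    times = [c["time"] for c in candles]
    if all(a <= b for a, b in zip(times, times[1:])):
        return candles
    return sorted(candles, key=lambda c: c["time"])


candles = [
    {"time": 3, "close": 1.2},
    {"time": 1, "close": 1.0},
    {"time": 2, "close": 1.1},
]
ordered = ensure_sorted(candles)
```

Python's `sorted` is stable, so candles that share a timestamp keep their original relative order.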