bind: MLflow port to 127.0.0.1:5000:5000 in docker-compose.yml

Changes:
- Updated docker-compose.yml MLflow service port binding from 5000:5000 to 127.0.0.1:5000:5000
  to restrict access to localhost only for security
- Marked task 1.7 as complete in tasks.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Marko Djordjevic 2026-02-18 10:58:11 +01:00
parent 9efa1dbbcc
commit c327ba3370
19 changed files with 1002 additions and 2 deletions


@@ -0,0 +1,66 @@
## ADDED Requirements
### Requirement: CORS restricted to explicit origins
The FastAPI ML service SHALL configure CORS with explicit allowed origins instead of wildcard `*`. The default allowed origins SHALL be `["http://localhost:3000"]`. Additional origins MAY be configured via the `CORS_ORIGINS` environment variable (comma-separated).
#### Scenario: Allowed origin accepted
- **WHEN** a request from `http://localhost:3000` hits the ML service
- **THEN** the response includes `Access-Control-Allow-Origin: http://localhost:3000`
#### Scenario: Unknown origin blocked
- **WHEN** a request from `http://evil.com` hits the ML service
- **THEN** the CORS headers do not include `Access-Control-Allow-Origin` for that origin
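A minimal sketch of how the allowed-origin list could be derived, assuming that `CORS_ORIGINS` extends the default rather than replacing it. The function name `allowed_origins` and the explicit wildcard rejection are illustrative assumptions, not part of the requirement.

```python
import os

# Default from the requirement above.
DEFAULT_ORIGINS = ["http://localhost:3000"]


def allowed_origins(env=None):
    """Return the default origins plus any listed in CORS_ORIGINS (hypothetical helper)."""
    env = os.environ if env is None else env
    raw = env.get("CORS_ORIGINS", "")
    # Comma-separated list; ignore empty items and surrounding whitespace.
    extra = [item.strip() for item in raw.split(",") if item.strip()]
    # Assumption: a wildcard in the variable is treated as a configuration error.
    if "*" in extra:
        raise ValueError("wildcard origin is not allowed")
    # Preserve order and drop duplicates.
    return list(dict.fromkeys(DEFAULT_ORIGINS + extra))
```

The resulting list would typically be passed as `allow_origins` when registering FastAPI's `CORSMiddleware`.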
### Requirement: Generic error responses from ML service
All FastAPI endpoints SHALL return a generic error message for unexpected exceptions. The response SHALL be `{ "detail": "Internal server error" }` with HTTP 500. Full exception details SHALL be logged server-side via `logging.error` with the traceback attached (for example, `exc_info=True`) and SHALL NOT appear in the response body.
#### Scenario: Internal error generic response
- **WHEN** a predict endpoint throws an unexpected exception
- **THEN** the client receives `{ "detail": "Internal server error" }` and the traceback is logged server-side
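The behavior above can be sketched independently of the web framework. The helper name `handle_unexpected` and the logger name `ml_service` are hypothetical; in a FastAPI app, logic like this would presumably live in an exception handler registered for `Exception`.

```python
import logging

logger = logging.getLogger("ml_service")  # hypothetical logger name

GENERIC_ERROR = {"detail": "Internal server error"}


def handle_unexpected(exc):
    """Log the full traceback server-side and return a (status, body) pair
    that reveals nothing about the exception to the client."""
    # Passing the exception instance as exc_info attaches its traceback.
    logger.error("Unhandled exception in ML service", exc_info=exc)
    return 500, dict(GENERIC_ERROR)
```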
### Requirement: Model file integrity check
Before loading a model via `joblib.load()`, the system SHALL verify the model file's SHA256 hash against a manifest file (`models/checksums.sha256`). If the hash does not match or the manifest entry is missing, the system SHALL NOT call `joblib.load()` and the load SHALL fail with an error.
#### Scenario: Valid model file
- **WHEN** `joblib.load("models/abc123.pkl")` is called and the file's SHA256 matches the manifest
- **THEN** the model loads successfully
#### Scenario: Tampered model file
- **WHEN** the model file's SHA256 does not match the manifest entry
- **THEN** the system refuses to load the model and returns HTTP 500 with `{ "detail": "Model integrity check failed" }`
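A minimal sketch of the check, assuming the manifest uses the `sha256sum` text layout (`<hex digest>  <file name>` per line) and is keyed by bare file name. The names `IntegrityError`, `read_manifest`, and `verify_model` are hypothetical. The caller would invoke `joblib.load()` only after `verify_model` returns without raising.

```python
import hashlib
from pathlib import Path


class IntegrityError(Exception):
    """Raised when a model file fails the checksum check (hypothetical type)."""


def sha256_of(path):
    """Hash the file in chunks so large models are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_manifest(manifest_path):
    """Parse lines of the assumed form '<hex digest>  <file name>'."""
    entries = {}
    for line in Path(manifest_path).read_text().splitlines():
        parts = line.split()
        if len(parts) == 2:
            entries[parts[1].lstrip("*")] = parts[0].lower()
    return entries


def verify_model(model_path, manifest_path):
    """Raise IntegrityError on a missing entry or a hash mismatch."""
    expected = read_manifest(manifest_path).get(Path(model_path).name)
    if expected is None:
        raise IntegrityError("no manifest entry for model file")
    if sha256_of(model_path) != expected:
        raise IntegrityError("model hash does not match manifest")
```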
### Requirement: Thread-safe model reads
The ML service SHALL use a lock for model reads during prediction, not just model writes. All code that reads the current model reference SHALL acquire the `_model_swap_lock` or use an atomic reference swap pattern.
#### Scenario: Concurrent prediction and model swap
- **WHEN** a prediction request is in progress and a model load request arrives
- **THEN** the prediction completes with the old model (or waits), and the new model is loaded atomically
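One way to satisfy this is to hold the lock only long enough to copy or replace the reference, so that an in-flight prediction keeps using the model it started with. The sketch below assumes a model is any callable; `ModelHolder` and `predict` are hypothetical names, and only `_model_swap_lock` comes from the requirement.

```python
import threading


class ModelHolder:
    """Guards the current model reference for both readers and writers."""

    def __init__(self, model=None):
        self._model_swap_lock = threading.Lock()
        self._model = model

    def get(self):
        # Readers take the lock too, but only to copy the reference.
        with self._model_swap_lock:
            return self._model

    def swap(self, new_model):
        # The new model should be fully loaded before calling swap, so the
        # lock is held only for the assignment itself.
        with self._model_swap_lock:
            old, self._model = self._model, new_model
        return old


def predict(holder, features):
    model = holder.get()  # local reference; unaffected by later swaps
    if model is None:
        raise RuntimeError("no model loaded")
    return model(features)
```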
### Requirement: Path traversal prevention on model operations
The FastAPI endpoints that accept `run_id` for model load and delete SHALL validate the `run_id` format and verify that the resolved file path is within the expected `models/` directory using `Path.resolve()`.
#### Scenario: Valid run_id resolves within models directory
- **WHEN** POST `/model/load` receives `{ "run_id": "abc123" }`
- **THEN** the path resolves to `models/abc123.pkl` within the models directory
#### Scenario: Path traversal attempt blocked
- **WHEN** POST `/model/load` receives `{ "run_id": "../../etc/passwd" }`
- **THEN** the endpoint returns HTTP 400 before attempting any file operation
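The two checks (format validation, then containment after `Path.resolve()`) can be sketched as follows. The allowed character set in `RUN_ID_PATTERN` is an assumption, since the requirement does not define the `run_id` format, and `resolve_model_path` is a hypothetical helper. An endpoint would presumably map the `ValueError` to HTTP 400.

```python
import re
from pathlib import Path

# Assumed format: 1 to 64 letters, digits, underscores, or hyphens.
RUN_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def resolve_model_path(run_id, models_dir="models"):
    """Return the resolved model path, or raise ValueError before any file access."""
    if not isinstance(run_id, str) or not RUN_ID_PATTERN.fullmatch(run_id):
        raise ValueError("invalid run_id")
    base = Path(models_dir).resolve()
    candidate = (base / f"{run_id}.pkl").resolve()
    # Defense in depth: the resolved path must sit inside the models directory.
    if base not in candidate.parents:
        raise ValueError("path escapes models directory")
    return candidate
```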
### Requirement: Real health check
The `GET /health` endpoint SHALL perform actual connectivity checks instead of returning hardcoded status. It SHALL execute `SELECT 1` against PostgreSQL and attempt an MLflow API call.
#### Scenario: All services healthy
- **WHEN** PostgreSQL responds to `SELECT 1` and MLflow API is reachable
- **THEN** the health endpoint returns `{ "status": "healthy", "database": "connected", "mlflow": "connected" }`
#### Scenario: Database unreachable
- **WHEN** PostgreSQL does not respond to `SELECT 1`
- **THEN** the health endpoint returns `{ "status": "degraded", "database": "disconnected" }` with HTTP 200
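A sketch of the aggregation logic, with the actual probes injected as callables so the example stays self-contained. `check_database` would run `SELECT 1` and `check_mlflow` would make an MLflow API call; both names, and `health_status` itself, are hypothetical. Unlike the degraded scenario above, this sketch always reports both components.

```python
def _probe(check):
    """Run one connectivity check; any exception counts as disconnected."""
    try:
        check()
        return "connected"
    except Exception:
        return "disconnected"


def health_status(check_database, check_mlflow):
    """Build the /health payload from two zero-argument check callables."""
    report = {
        "database": _probe(check_database),
        "mlflow": _probe(check_mlflow),
    }
    healthy = all(state == "connected" for state in report.values())
    return {"status": "healthy" if healthy else "degraded", **report}
```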
### Requirement: Candle time-sort validation
The `POST /predict` endpoint SHALL validate that input candles are sorted by time in ascending order. If candles are not sorted, the endpoint SHALL sort them before processing.
#### Scenario: Unsorted candles auto-sorted
- **WHEN** candles are provided in random time order
- **THEN** the endpoint sorts them by time before feature engineering and prediction
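The validate-then-sort step can be sketched as below, assuming each candle is a mapping with a `time` field whose values are mutually comparable (for example, epoch seconds). The field name and the helper `ensure_sorted` are assumptions.

```python
def ensure_sorted(candles, key="time"):
    """Return candles in ascending time order, sorting only when needed."""
    already_sorted = all(a[key] <= b[key] for a, b in zip(candles, candles[1:]))
    if already_sorted:
        return list(candles)
    # sorted() is stable, so candles with equal timestamps keep their input order.
    return sorted(candles, key=lambda candle: candle[key])
```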