Archive code-review-fix change and sync specs to main

- Synced 14 capability delta specs to main specs
- Created 6 new main specs, including api-authentication, error-boundary, input-validation, security-headers, shared-types
- Updated 8 existing specs with security, validation, and performance requirements
- Archived change to openspec/changes/archive/2026-02-20-code-review-fix/

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Marko Djordjevic 2026-02-20 08:54:59 +01:00
parent adb93a2d2e
commit 925e7284e3
32 changed files with 691 additions and 4 deletions

@@ -113,3 +113,44 @@ The system SHALL log the full pipeline YAML config as an MLflow artifact with ea
#### Scenario: Config artifact logged
- **WHEN** a training run starts
- **THEN** the full pipeline.yaml content is logged as "pipeline_config.yaml" artifact in the MLflow run
### Requirement: Training resource limits
The `POST /training/start` endpoint SHALL enforce resource limits: the training dataset file size SHALL NOT exceed 500MB, and the training thread SHALL have a configurable timeout (default: 30 minutes). If the timeout is exceeded, the training run SHALL be marked as failed.
#### Scenario: Dataset too large
- **WHEN** the training dataset exceeds 500MB
- **THEN** training fails immediately with `{ "detail": "Dataset too large. Maximum 500MB." }`
#### Scenario: Training timeout
- **WHEN** a training run exceeds the 30-minute timeout
- **THEN** the training status is set to "failed" with reason "Training timed out"
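The two limits above could be enforced roughly as follows. This is a minimal sketch using only the standard library, not the project's actual implementation: `check_dataset_size`, `run_with_timeout`, `DatasetTooLargeError`, and the `status` dict are hypothetical names introduced for illustration.

```python
import os
import threading

MAX_DATASET_BYTES = 500 * 1024 * 1024  # 500MB limit from the requirement
DEFAULT_TIMEOUT_SECONDS = 30 * 60      # 30-minute default from the requirement


class DatasetTooLargeError(Exception):
    """Hypothetical error; an endpoint would map it to the `detail` payload."""


def check_dataset_size(path: str, max_bytes: int = MAX_DATASET_BYTES) -> None:
    """Raise before any training work starts if the file is over the limit."""
    if os.path.getsize(path) > max_bytes:
        raise DatasetTooLargeError("Dataset too large. Maximum 500MB.")


def run_with_timeout(train_fn, status: dict,
                     timeout: float = DEFAULT_TIMEOUT_SECONDS) -> bool:
    """Run train_fn in a thread; return False and mark failure on timeout."""
    worker = threading.Thread(target=train_fn, daemon=True)
    worker.start()
    worker.join(timeout)  # wait at most `timeout` seconds
    if worker.is_alive():
        # A Python thread cannot be killed from outside, so this only
        # records the failure; the worker itself keeps running.
        status["status"] = "failed"
        status["reason"] = "Training timed out"
        return False
    return True
```

Because the timed-out thread is only marked as failed and not stopped, a real implementation would probably also need a cooperative cancellation flag or a separate process.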
### Requirement: run_id validation on training endpoints
The FastAPI training endpoints (`DELETE /training/runs/{run_id}`, `GET /training/runs/{run_id}`) SHALL validate that `run_id` matches `/^[a-zA-Z0-9_-]+$/` before any database or file operation.
#### Scenario: Valid run_id
- **WHEN** `DELETE /training/runs/run-2024-01-15_v3` is called
- **THEN** the request proceeds normally
#### Scenario: Invalid run_id
- **WHEN** `DELETE /training/runs/../../admin` is called
- **THEN** the endpoint returns HTTP 400 with `{ "detail": "Invalid run_id format" }`
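The format check itself can be sketched in a few lines of Python. The helper name `is_valid_run_id` is an assumption, not taken from the codebase, and the endpoint is assumed to translate a `False` result into the HTTP 400 response shown above.

```python
import re

# Same character class as the spec's /^[a-zA-Z0-9_-]+$/.
RUN_ID_PATTERN = re.compile(r"[a-zA-Z0-9_-]+")


def is_valid_run_id(run_id: str) -> bool:
    """Return True only if the whole string consists of allowed characters."""
    # fullmatch is used instead of ^...$ because in Python `$` also matches
    # just before a trailing newline, which would let "abc\n" through.
    return RUN_ID_PATTERN.fullmatch(run_id) is not None
```

For example, `is_valid_run_id("run-2024-01-15_v3")` is `True`, while `is_valid_run_id("../../admin")` is `False` because `.` and `/` are outside the allowed set.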
### Requirement: Environment variable configuration (credentials)
The project SHALL use environment variables for runtime configuration. Credentials SHALL NOT be hardcoded in any committed file.
#### Scenario: .env file gitignored
- **WHEN** `.gitignore` is inspected
- **THEN** it includes `.env` (bare, not just `.env*.local`)
#### Scenario: .env not tracked by git
- **WHEN** `git ls-files .env` is run
- **THEN** `.env` is NOT tracked by git
#### Scenario: .env.example has placeholder credentials
- **WHEN** `.env.example` is inspected
- **THEN** it contains `POSTGRES_PASSWORD=change_me_to_a_strong_password` (not a real password)
#### Scenario: No credentials in Python source
- **WHEN** `services/ml/app/db.py` is inspected
- **THEN** there are no SQL comments containing usernames or passwords, and the code fails fast if `DATABASE_URL` env var is not set
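The fail-fast behaviour in the last scenario might look like the sketch below. The function name `get_database_url`, the injectable `env` parameter, and the error message are illustrative assumptions. The actual contents of `services/ml/app/db.py` are not shown in this change.

```python
import os


def get_database_url(env=None) -> str:
    """Return DATABASE_URL, raising immediately if it is missing or empty."""
    # `env` defaults to the real process environment; passing a plain dict
    # makes the function easy to exercise in tests.
    env = os.environ if env is None else env
    url = env.get("DATABASE_URL")
    if not url:
        raise RuntimeError("DATABASE_URL environment variable is not set")
    return url
```

Calling this once at import or startup time surfaces a missing variable right away, not on the first query.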