Archive code-review-fix change and sync specs to main

- Synced 14 capability delta specs to main specs
- Created 6 new main specs: api-authentication, error-boundary, input-validation, security-headers, shared-types
- Updated 8 existing specs with security, validation, and performance requirements
- Archived change to openspec/changes/archive/2026-02-20-code-review-fix/

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 08:54:59 +01:00
commit 925e7284e3
parent adb93a2d2e
32 changed files with 691 additions and 4 deletions
--- a/openspec/specs/ml-training/spec.md
+++ b/openspec/specs/ml-training/spec.md
@@ -113,3 +113,44 @@ The system SHALL log the full pipeline YAML config as an MLflow artifact with ea
 #### Scenario: Config artifact logged
 - **WHEN** a training run starts
 - **THEN** the full pipeline.yaml content is logged as "pipeline_config.yaml" artifact in the MLflow run
+
+### Requirement: Training resource limits
+The `POST /training/start` endpoint SHALL enforce resource limits: the training dataset file size SHALL NOT exceed 500MB, and the training thread SHALL have a configurable timeout (default: 30 minutes). If the timeout is exceeded, the training thread SHALL be marked as failed.
+
+#### Scenario: Dataset too large
+- **WHEN** the training dataset exceeds 500MB
+- **THEN** training fails immediately with `{ "detail": "Dataset too large. Maximum 500MB." }`
+
+#### Scenario: Training timeout
+- **WHEN** a training run exceeds the 30-minute timeout
+- **THEN** the training status is set to "failed" with reason "Training timed out"
+
+### Requirement: run_id validation on training endpoints
+The FastAPI training endpoints (`DELETE /training/runs/{run_id}`, `GET /training/runs/{run_id}`) SHALL validate that `run_id` matches `/^[a-zA-Z0-9_-]+$/` before any database or file operation.
+
+#### Scenario: Valid run_id
+- **WHEN** `DELETE /training/runs/run-2024-01-15_v3` is called
+- **THEN** the request proceeds normally
+
+#### Scenario: Invalid run_id
+- **WHEN** `DELETE /training/runs/../../admin` is called
+- **THEN** the endpoint returns HTTP 400 with `{ "detail": "Invalid run_id format" }`
+
+### Requirement: Environment variable configuration (credentials)
+The project SHALL use environment variables for runtime configuration. Credentials SHALL NOT be hardcoded in any committed file.
+
+#### Scenario: .env file gitignored
+- **WHEN** `.gitignore` is inspected
+- **THEN** it includes `.env` (bare, not just `.env*.local`)
+
+#### Scenario: .env not tracked by git
+- **WHEN** `git ls-files .env` is run
+- **THEN** `.env` is NOT tracked by git
+
+#### Scenario: .env.example has placeholder credentials
+- **WHEN** `.env.example` is inspected
+- **THEN** it contains `POSTGRES_PASSWORD=change_me_to_a_strong_password` (not a real password)
+
+#### Scenario: No credentials in Python source
+- **WHEN** `services/ml/app/db.py` is inspected
+- **THEN** there are no SQL comments containing usernames or passwords, and the code fails fast if `DATABASE_URL` env var is not set
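
The checks described in the added requirements could be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function names are hypothetical, "500MB" is assumed to mean 500 × 1024 × 1024 bytes, and the FastAPI wiring (returning HTTP 400, marking the run as failed) is left out.

```python
import os
import re
from typing import Mapping

# Character class taken from the spec; fullmatch() is used instead of ^...$
# because in Python "$" also matches just before a trailing newline.
RUN_ID_PATTERN = re.compile(r"[a-zA-Z0-9_-]+")

# Assumption: the spec's "500MB" is read as 500 MiB.
MAX_DATASET_BYTES = 500 * 1024 * 1024


def is_valid_run_id(run_id: str) -> bool:
    """Return True only if run_id consists of letters, digits, '_' or '-'."""
    return RUN_ID_PATTERN.fullmatch(run_id) is not None


def check_dataset_size(size_bytes: int) -> None:
    """Raise ValueError if the dataset is larger than the allowed maximum."""
    if size_bytes > MAX_DATASET_BYTES:
        raise ValueError("Dataset too large. Maximum 500MB.")


def require_database_url(env: Mapping[str, str] = os.environ) -> str:
    """Fail fast when DATABASE_URL is missing or empty."""
    url = env.get("DATABASE_URL")
    if not url:
        raise RuntimeError("DATABASE_URL environment variable is not set")
    return url
```

In an endpoint, `is_valid_run_id` would run before any database or file access, so a path such as `../../admin` is rejected without touching storage. Passing the environment as a parameter keeps `require_database_url` easy to test without modifying the real process environment.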