bind: MLflow port to 127.0.0.1:5000:5000 in docker-compose.yml

Changes: - Updated docker-compose.yml MLflow service port binding from 5000:5000 to 127.0.0.1:5000:5000 to restrict access to localhost only for security - Marked task 1.7 as complete in tasks.md Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-18 10:58:11 +01:00 · 2026-02-18 10:58:11 +01:00 · c327ba3370
commit c327ba3370
parent 9efa1dbbcc
19 changed files with 1002 additions and 2 deletions
--- a/docker-compose.yml
+++ b/docker-compose.yml
@ -52,7 +52,7 @@ services:
  mlflow:
    image: ghcr.io/mlflow/mlflow:v2.10.0
    ports:
-      - "5000:5000"
+      - "127.0.0.1:5000:5000"
    volumes:
      - mlflow-data:/mlflow
    command: >
--- a/openspec/changes/code-review-fix/.openspec.yaml
+++ b/openspec/changes/code-review-fix/.openspec.yaml
@ -0,0 +1,2 @@
+schema: spec-driven
+created: 2026-02-18
--- a/openspec/changes/code-review-fix/design.md
+++ b/openspec/changes/code-review-fix/design.md
@ -0,0 +1,133 @@
+## Context
+
+The Candle Annotator is a full-stack application (Next.js frontend + FastAPI ML service + PostgreSQL) for annotating candlestick chart data and training ML models. A code review found 93 issues (13 Critical, 35 Major, 45 Minor). The app currently runs in Docker Compose with no authentication, hardcoded credentials, exposed ports, and multiple frontend stability problems. Zero tests exist.
+
+The application is used internally for ML development — it is not yet public-facing, but the security posture must be production-ready before any deployment beyond localhost.
+
+## Goals / Non-Goals
+
+**Goals:**
+- Eliminate all 13 critical issues (security + frontend stability)
+- Fix all 35 major issues (API hardening, code quality, infrastructure)
+- Fix all 45 minor issues (polish, accessibility, deprecated APIs)
+- Establish a type-safe foundation with centralized shared types
+- Make the Docker deployment secure by default (localhost bindings, non-root, no leaked creds)
+- Add basic input validation on all API boundaries
+
+**Non-Goals:**
+- Full test suite (tracked separately — this change fixes code, not adds tests)
+- Migrating away from joblib/pickle to ONNX/safetensors (integrity checks are sufficient for now)
+- Replacing the ML service framework (FastAPI stays)
+- Adding a full RBAC/user management system (API key auth is sufficient)
+- Rewriting CandleChart.tsx from scratch (targeted fixes only, no full decomposition into hooks)
+- Adding Celery/RQ task queue for training (out of scope, timeout + size limits are sufficient)
+
+## Decisions
+
+### D-1. Authentication: API Key via Environment Variable
+
+**Choice**: Simple `X-API-Key` header checked in Next.js middleware and FastAPI `Depends()`.
+
+**Alternatives considered**:
+- NextAuth.js session auth — overkill for a single-user internal tool
+- OAuth2 — unnecessary complexity for the use case
+- mTLS — too heavy for dev/staging environments
+
+**Rationale**: API key is the simplest approach that closes the "zero auth" gap. The key lives in `.env` (which will be gitignored). Next.js middleware intercepts all `/api/*` routes. FastAPI uses a shared dependency. Both read from `API_KEY` env var.
+
+### D-2. Input Validation: Zod on Next.js, Pydantic Already on FastAPI
+
+**Choice**: Add Zod schemas in each proxy route file. FastAPI already uses Pydantic models.
+
+**Alternatives considered**:
+- Shared validation library — no shared runtime between TS and Python
+- tRPC — would require significant refactoring of the API layer
+
+**Rationale**: Zod is the standard for Next.js API route validation. Each proxy route gets a schema that validates before forwarding to the ML service. The `run_id` validation (alphanumeric + hyphens + underscores) is applied at both layers.
+
+### D-3. Shared Types: `src/types/index.ts` Barrel File
+
+**Choice**: Create `src/types/` with domain type files (`annotations.ts`, `candles.ts`, `predictions.ts`, `charts.ts`) and a barrel `index.ts`. Replace all 6+ duplicate interface definitions.
+
+**Alternatives considered**:
+- Zod schemas as single source of truth (infer types) — good but adds Zod dependency to components that don't validate
+- Keep types co-located — current approach, causes drift
+
+**Rationale**: Separate type files keep imports clean. Components import types, API routes import both types and Zod schemas. No circular dependencies.
+
+### D-4. Error Handling: Generic Messages to Client, Structured Server Logs
+
+**Choice**: All API routes return `{ error: "Internal server error" }` for 500s. Full error logged via `console.error` (Next.js) and `logging.error` (FastAPI) with request context.
+
+**Alternatives considered**:
+- Error tracking service (Sentry) — good addition later, not needed to fix the leak now
+- Custom error codes — over-engineering for this stage
+
+**Rationale**: Minimal change. Replace `error.message` with static string in catch blocks. The actual error is already logged in most routes.
+
+### D-5. Frontend Stale Closures: Refs for Mutable State
+
+**Choice**: Use `useRef` for values that change frequently and are read in event handlers (`modelInfo`, `selectedSpanId`, `drawingState`, `dragState`, `annotations`). Keep `useState` for values that trigger re-renders.
+
+**Alternatives considered**:
+- Zustand store — solves the closure problem but is a large refactor
+- `useReducer` — consolidates state but doesn't fix closure staleness
+
+**Rationale**: Refs are the targeted fix. Each stale closure issue (C-9, C-11, C-12) is fixed by reading from a ref instead of the closure variable. Minimal blast radius.
+
+### D-6. Docker Security: Localhost Bindings + Env Var Interpolation
+
+**Choice**: Bind all ports to `127.0.0.1`, use `${POSTGRES_PASSWORD}` interpolation in `docker-compose.yml`, add `.env` to `.gitignore`.
+
+**Alternatives considered**:
+- Docker secrets — better for production swarm, overkill for compose
+- Vault — enterprise-grade, way too heavy
+
+**Rationale**: Env var interpolation is native to docker-compose and is the standard pattern. `.env` file is read automatically by compose. Binding to localhost prevents external access without requiring firewall rules.
+
+### D-7. AbortController for Prediction Requests
+
+**Choice**: Each prediction fetch creates an `AbortController`. A ref holds the current controller. New requests abort the previous one. Stale responses are discarded by checking if the signal was aborted.
+
+**Alternatives considered**:
+- Request ID tracking (increment counter, discard if stale) — works but doesn't free network resources
+- Debouncing — reduces frequency but doesn't prevent race conditions
+
+**Rationale**: AbortController is the browser-native solution. It both prevents stale responses AND frees up network resources. Combined with the ref pattern from D-5, this cleanly solves C-10.
+
+### D-8. Security Headers via next.config.js
+
+**Choice**: Add `headers()` function to `next.config.js` with CSP, X-Frame-Options, X-Content-Type-Options, Referrer-Policy, Permissions-Policy.
+
+**Alternatives considered**:
+- Middleware-based headers — works but `next.config.js` is the canonical Next.js approach
+- Reverse proxy (nginx) headers — adds infrastructure complexity
+
+**Rationale**: Next.js config `headers()` is zero-dependency and applies to all routes automatically.
+
+## Risks / Trade-offs
+
+**[API key auth is minimal]** → Sufficient for internal tool. Can upgrade to session auth later without breaking changes. The middleware pattern makes swapping auth strategies straightforward.
+
+**[Refs pattern increases cognitive complexity]** → Developers must remember to update both state and ref. Mitigated by co-locating ref updates next to setState calls and adding brief comments.
+
+**[93 issues is a large changeset]** → Risk of introducing regressions. Mitigated by phased approach (security first, then stability, then polish) and the model-tier task dispatch (complex changes get opus-level attention).
+
+**[No tests to catch regressions]** → This is explicitly a non-goal for this change, but it's the biggest risk. Mitigated by making each fix small and targeted, avoiding broad refactors.
+
+**[Breaking docker-compose.yml env var pattern]** → Existing users with hardcoded `.env` files will need to update. Mitigated by updating `.env.example` with clear instructions.
+
+## Migration Plan
+
+1. **Phase 1 (Security Critical)**: `.gitignore`, credentials, port bindings, run_id validation, upload limits, CORS — all low-effort, high-impact
+2. **Phase 2 (Before Production)**: Auth middleware, Zod validation, transaction fixes, error message cleanup, security headers, Dockerfile hardening
+3. **Phase 3 (Frontend Stability)**: AbortController, stale closure fixes, memory leaks, response.ok checks, shared types, Error Boundary
+4. **Phase 4 (Polish)**: Rate limiting, accessibility, deprecated APIs, dark theme, debounce, constants extraction
+
+Each phase can be deployed independently. Rollback is git revert on the phase branch.
+
+## Open Questions
+
+- **API key distribution**: How will the API key be shared with legitimate clients? Manual `.env` copy for now?
+- **Rate limiting implementation**: Next.js middleware-based or external (e.g., nginx)? Deferred to Phase 4.
+- **Model integrity**: SHA256 manifest file checked into git, or computed on first load? Leaning toward manifest file.
--- a/openspec/changes/code-review-fix/proposal.md
+++ b/openspec/changes/code-review-fix/proposal.md
@ -0,0 +1,92 @@
+## Why
+
+A comprehensive code review identified **93 issues** (13 Critical, 35 Major, 45 Minor) across the entire codebase. The application has zero authentication, hardcoded credentials in committed files, path traversal vectors, unsafe deserialization, frontend race conditions/memory leaks, and zero test files. These must be fixed before any production deployment.
+
+## What Changes
+
+### Security (Critical)
+- Add authentication middleware to all API endpoints (Next.js + FastAPI)
+- Fix `.env` gitignore and remove credentials from git history
+- Replace all hardcoded database credentials with env var interpolation
+- Bind PostgreSQL port to localhost only
+- Validate `run_id` parameters to prevent SSRF/path traversal
+- Add file upload size limits
+- Fix CORS wildcard on ML service
+- Add model file integrity checks (SHA256) for joblib/pickle files
+
+### Frontend Stability (Critical + Major)
+- Fix stale closures in `fetchPredictions`, CandleChart click handler, SpanAnnotationManager keyboard handler
+- Add AbortController to cancel stale prediction requests
+- Fix SpanAnnotationManager preview primitive memory leak
+- Add `response.ok` checks on all fetch calls
+- Add React Error Boundary
+- Fix hardcoded 1-minute candle interval assumption
+- Use `chart.applyOptions()` for theme changes instead of re-creating chart
+- Fix `fitContent()` called on every span change
+
+### API Hardening (Major)
+- Add Zod validation on all proxy routes
+- Wrap chart cascade delete in transaction + add span annotation deletion
+- Stop leaking error details to clients (frontend + backend)
+- Require `chartId` for bulk annotation deletes
+- Add input size limits on batch prediction
+
+### Code Quality (Major + Minor)
+- Centralize duplicate interface definitions into `src/types/`
+- Replace pervasive `any` types with proper interfaces
+- Move dev dependencies to `devDependencies`
+- Remove dead code (migrate.ts, get_db_session, dead filter)
+- Replace deprecated Python APIs (on_event, declarative_base, utcnow)
+
+### Infrastructure (Major + Minor)
+- Add security response headers (CSP, X-Frame-Options)
+- ML Dockerfile: run as non-root user
+- Create `.dockerignore`
+- Add rate limiting middleware
+- Pin Docker base images to digests
+- Implement real health checks on ML service
+- Use `next/font` instead of CSS `@import`
+
+### UX Polish (Minor)
+- Add confirmation dialog for delete-all annotations
+- Fix accessibility (ARIA labels, focus trapping, keyboard nav)
+- Add CSV injection protection to exports
+- Fix dark theme on settings pages
+- Add debounce to confidence slider
+- Add click-outside handler to chart dropdown
+
+## Capabilities
+
+### New Capabilities
+- `api-authentication`: API key or session-based auth middleware for Next.js routes and FastAPI endpoints
+- `input-validation`: Zod schemas for all proxy routes, file upload limits, run_id format validation, batch size limits
+- `security-headers`: CSP, X-Frame-Options, and other security response headers via Next.js config
+- `error-boundary`: React Error Boundary wrapping critical component subtrees
+- `shared-types`: Centralized TypeScript interfaces in `src/types/` replacing duplicate definitions across 6+ files
+
+### Modified Capabilities
+- `docker-deployment`: Bind ports to localhost, add `.dockerignore`, run ML as non-root, pin images, fix healthcheck, TA-Lib HTTPS download
+- `backend-api`: Transaction-wrapped cascade deletes, generic error responses, scoped bulk delete, rate limiting
+- `ml-inference`: CORS fix, model integrity checks, path traversal prevention, thread-safe model reads, input size limits
+- `ml-training`: run_id validation, resource limits, path traversal prevention
+- `chart-canvas`: Fix stale closures, theme change via applyOptions, fix 1-min interval assumption, extract magic numbers, use refs for click handler
+- `span-annotation`: Fix keyboard handler stale closure, preview primitive memory leak, fitContent removal, primitive recreation optimization
+- `prediction-ui`: AbortController for requests, bounded prediction cache, response.ok checks
+- `postgres-data-layer`: Remove `.env` from git, env var interpolation for credentials, SSL/TLS support
+- `ui-shell`: Error boundary integration, confirmation dialogs, accessibility improvements, dark theme fixes, next/font migration
+- `label-management`: (no spec-level changes, only implementation cleanup)
+- `feature-engineering`: (no spec-level changes, deprecated API cleanup only)
+
+## Impact
+
+- **All 27 API route files** in `src/app/api/` — auth middleware, error handling, validation
+- **`src/app/page.tsx`** — AbortController, response checks, cache bounds, confirmation dialog
+- **`src/components/CandleChart.tsx`** — stale closures, theme, interval fix, refs
+- **`src/components/SpanAnnotationManager.tsx`** — memory leak, keyboard handler, fitContent, primitives
+- **`services/ml/app/main.py`** — CORS, auth, error messages, model integrity, path validation, deprecated APIs
+- **`services/ml/Dockerfile`** — non-root user, HTTPS download
+- **`docker-compose.yml`** — port bindings, credential interpolation
+- **`next.config.js`** — security headers
+- **`package.json`** — devDependencies reorganization
+- **New files**: `src/middleware.ts`, `src/types/*.ts`, `src/components/ErrorBoundary.tsx`, `.dockerignore`
+- **No breaking API changes** — all fixes are additive or internal
--- a/openspec/changes/code-review-fix/specs/api-authentication/spec.md
+++ b/openspec/changes/code-review-fix/specs/api-authentication/spec.md
@ -0,0 +1,39 @@
+## ADDED Requirements
+
+### Requirement: Next.js API key middleware
+The system SHALL enforce API key authentication on all `/api/*` routes via Next.js middleware (`src/middleware.ts`). The middleware SHALL read the expected key from the `API_KEY` environment variable. Requests MUST include the key in the `X-API-Key` header. If the key is missing or incorrect, the middleware SHALL return HTTP 401 with `{ "error": "Unauthorized" }`. The `/api/health` endpoint SHALL be exempt from authentication.
+
+#### Scenario: Valid API key
+- **WHEN** a request to `/api/candles` includes header `X-API-Key: <correct key>`
+- **THEN** the request proceeds to the route handler normally
+
+#### Scenario: Missing API key
+- **WHEN** a request to `/api/candles` has no `X-API-Key` header
+- **THEN** the middleware returns HTTP 401 with `{ "error": "Unauthorized" }`
+
+#### Scenario: Invalid API key
+- **WHEN** a request to `/api/candles` includes header `X-API-Key: wrong-key`
+- **THEN** the middleware returns HTTP 401 with `{ "error": "Unauthorized" }`
+
+#### Scenario: Health endpoint exempt
+- **WHEN** a request to `/api/health` has no `X-API-Key` header
+- **THEN** the request proceeds normally (health check is unauthenticated)
+
+#### Scenario: API_KEY not configured
+- **WHEN** the `API_KEY` environment variable is not set
+- **THEN** the middleware SHALL allow all requests (auth disabled) and log a warning at startup
+
+### Requirement: FastAPI API key dependency
+The FastAPI ML service SHALL enforce API key authentication via a shared `Depends()` dependency. The dependency SHALL read the expected key from the `API_KEY` environment variable. Requests MUST include the key in the `X-API-Key` header. The `/health` endpoint SHALL be exempt.
+
+#### Scenario: Valid API key on ML service
+- **WHEN** a request to `/predict` includes the correct `X-API-Key` header
+- **THEN** the request proceeds to the endpoint handler
+
+#### Scenario: Unauthorized ML service request
+- **WHEN** a request to `/predict` has no `X-API-Key` header and `API_KEY` is configured
+- **THEN** the service returns HTTP 401 with `{ "detail": "Unauthorized" }`
+
+#### Scenario: Next.js proxy forwards API key
+- **WHEN** the Next.js proxy route calls the ML service
+- **THEN** it SHALL include the `X-API-Key` header from its own environment variable
--- a/openspec/changes/code-review-fix/specs/backend-api/spec.md
+++ b/openspec/changes/code-review-fix/specs/backend-api/spec.md
@ -0,0 +1,74 @@
+## ADDED Requirements
+
+### Requirement: Generic error responses
+All Next.js API routes SHALL return generic error messages to clients for 500-level errors. The response body SHALL be `{ "error": "Internal server error" }`. The full error details SHALL be logged server-side via `console.error` with request context.
+
+#### Scenario: Internal error returns generic message
+- **WHEN** an API route handler throws an unexpected error
+- **THEN** the client receives HTTP 500 with `{ "error": "Internal server error" }` and the full error is logged server-side
+
+#### Scenario: No stack traces in response
+- **WHEN** a database query fails in any API route
+- **THEN** the response does NOT contain table names, file paths, connection strings, or stack traces
+
+### Requirement: Scoped bulk annotation delete
+The `DELETE /api/annotations` endpoint SHALL require a `chartId` query parameter when `all=true` is specified. Unscoped delete-all (without chartId) SHALL be rejected with HTTP 400.
+
+#### Scenario: Scoped bulk delete
+- **WHEN** `DELETE /api/annotations?all=true&chartId=5` is called
+- **THEN** all annotations for chart 5 are deleted
+
+#### Scenario: Unscoped bulk delete rejected
+- **WHEN** `DELETE /api/annotations?all=true` is called without chartId
+- **THEN** the route returns HTTP 400 with `{ "error": "chartId is required for bulk delete" }`
+
+### Requirement: Transaction-wrapped chart cascade delete
+The `DELETE /api/charts/[id]` route SHALL wrap all related deletions (annotations, span annotations, candles, chart) in a single database transaction using `db.transaction()`.
+
+#### Scenario: Cascade delete in transaction
+- **WHEN** `DELETE /api/charts/5` is called
+- **THEN** span annotations, annotations, candles, and the chart record for chart 5 are all deleted within a single transaction
+
+#### Scenario: Partial failure rolls back
+- **WHEN** the candles deletion fails mid-transaction
+- **THEN** all deletions are rolled back and the chart remains intact
+
+### Requirement: Span annotations included in chart cascade delete
+The `DELETE /api/charts/[id]` route SHALL delete span annotations for the chart in addition to annotations and candles.
+
+#### Scenario: Span annotations deleted with chart
+- **WHEN** `DELETE /api/charts/5` is called
+- **THEN** all `span_annotations` rows with `chart_id=5` are deleted before the chart record
+
+### Requirement: parseInt validation
+All API routes that parse integer query parameters SHALL use `parseInt(value, 10)` with radix 10 and check for `isNaN()`. Invalid integer parameters SHALL return HTTP 400.
+
+#### Scenario: Valid integer parameter
+- **WHEN** `GET /api/candles?chartId=5` is called
+- **THEN** chartId is parsed as integer 5
+
+#### Scenario: Invalid integer parameter
+- **WHEN** `GET /api/candles?chartId=abc` is called
+- **THEN** the route returns HTTP 400 with `{ "error": "Invalid chartId" }`
+
+### Requirement: CSV injection protection on exports
+All CSV export routes SHALL prefix cell values starting with `=`, `+`, `-`, or `@` with a single quote (`'`) to prevent spreadsheet formula injection.
+
+#### Scenario: Dangerous cell value escaped
+- **WHEN** an annotation note contains `=CMD("calc")`
+- **THEN** the exported CSV cell contains `'=CMD("calc")`
+
+#### Scenario: Normal values unchanged
+- **WHEN** an annotation note contains `regular text`
+- **THEN** the exported CSV cell contains `regular text` (no prefix)
+
+### Requirement: response.ok checks on all fetch calls
+All `fetch()` calls in frontend components (`page.tsx`, `CandleChart.tsx`) SHALL check `response.ok` before calling `response.json()`. If `!response.ok`, the code SHALL throw an error or handle the failure explicitly.
+
+#### Scenario: Successful response parsed
+- **WHEN** a fetch call returns HTTP 200
+- **THEN** `response.json()` is called and the data is used normally
+
+#### Scenario: Error response handled
+- **WHEN** a fetch call returns HTTP 500
+- **THEN** the code detects `!response.ok` and shows an error message instead of attempting JSON parse
--- a/openspec/changes/code-review-fix/specs/chart-canvas/spec.md
+++ b/openspec/changes/code-review-fix/specs/chart-canvas/spec.md
@ -0,0 +1,48 @@
+## ADDED Requirements
+
+### Requirement: Refs for event handler closure values
+The CandleChart component SHALL use `useRef` for state values that are read inside event handlers (`drawingState`, `selectedLineId`, `dragState`, `annotations`). The ref SHALL be updated alongside every `setState` call. Event handlers SHALL read from the ref instead of the closure variable.
+
+#### Scenario: Click handler reads current state
+- **WHEN** the user clicks on the chart after state has changed
+- **THEN** the click handler reads the current value from the ref (not a stale closure value)
+
+#### Scenario: Reduced re-subscription frequency
+- **WHEN** refs are used for mutable state in event handlers
+- **THEN** the `useEffect` dependency array for chart event subscriptions is smaller, reducing re-subscription frequency
+
+### Requirement: Theme change without chart re-creation
+The CandleChart component SHALL apply theme changes using `chart.applyOptions()` instead of destroying and re-creating the entire chart instance. This preserves scroll position, zoom level, and attached primitives.
+
+#### Scenario: Theme toggle preserves state
+- **WHEN** the user toggles between light and dark theme
+- **THEN** the chart colors update without losing scroll position, zoom level, or annotation primitives
+
+### Requirement: Dynamic candle interval detection
+The CandleChart component SHALL determine the actual candle interval from the data (by examining the time difference between consecutive candles) instead of hardcoding 60 seconds. The detected interval SHALL be used for span annotation iteration loops.
+
+#### Scenario: 1-minute candles
+- **WHEN** candle data has 60-second intervals
+- **THEN** the span iteration step is 60 seconds
+
+#### Scenario: 1-hour candles
+- **WHEN** candle data has 3600-second intervals
+- **THEN** the span iteration step is 3600 seconds (not 60)
+
+#### Scenario: No performance degradation on high timeframes
+- **WHEN** iterating over a span on daily candles
+- **THEN** the loop iterates over actual candle count (not 1440x more iterations)
+
+### Requirement: Named constants for magic numbers
+The CandleChart component SHALL extract hardcoded magic numbers (8px padding, 60s interval, color values) into named constants at the module level.
+
+#### Scenario: Constants used instead of magic numbers
+- **WHEN** the CandleChart source is inspected
+- **THEN** magic numbers like `8`, `60`, hardcoded colors are replaced with descriptive constant names
+
+### Requirement: Module-level empty Set default
+The `new Set<string>()` default prop value SHALL be defined as a module-level constant instead of being recreated on every render.
+
+#### Scenario: Stable default reference
+- **WHEN** CandleChart renders without `hiddenLabels` prop
+- **THEN** it uses a module-level `EMPTY_SET` constant (same reference across renders)
--- a/openspec/changes/code-review-fix/specs/docker-deployment/spec.md
+++ b/openspec/changes/code-review-fix/specs/docker-deployment/spec.md
@ -0,0 +1,152 @@
+## ADDED Requirements
+
+### Requirement: ML service non-root user
+The ML service Dockerfile SHALL create a non-root user and run the application as that user. The Dockerfile SHALL include `RUN useradd -m -r appuser` and `USER appuser` directives.
+
+#### Scenario: Container runs as non-root
+- **WHEN** the ML service container starts
+- **THEN** the application process runs as user `appuser` (not root)
+
+### Requirement: TA-Lib downloaded over HTTPS with checksum
+The ML service Dockerfile SHALL download TA-Lib source over HTTPS (not HTTP). The download SHALL be verified with a SHA256 checksum before extraction.
+
+#### Scenario: HTTPS download
+- **WHEN** the Dockerfile downloads TA-Lib source
+- **THEN** the URL uses `https://` protocol
+
+#### Scenario: Checksum verification
+- **WHEN** the TA-Lib tarball is downloaded
+- **THEN** a `sha256sum -c` check runs before extraction, and the build fails if the checksum does not match
+
+### Requirement: .dockerignore file exists
+The project SHALL include a `.dockerignore` file at the repository root that excludes `.git`, `.env`, `.env*`, `node_modules`, `.next`, `data/`, `*.md`, `__pycache__/`, `mlruns/`, and `models/`.
+
+#### Scenario: Docker context excludes sensitive files
+- **WHEN** `docker build` runs
+- **THEN** `.env`, `.git`, and `node_modules` are not included in the build context
+
+## MODIFIED Requirements
+
+### Requirement: Docker Compose configuration
+The project SHALL include docker-compose.yml for simplified deployment orchestration.
+
+#### Scenario: Service definition
+- **WHEN** docker-compose.yml is parsed
+- **THEN** defines service named 'candle-annotator' using Dockerfile from current directory
+
+#### Scenario: Port mapping
+- **WHEN** docker-compose up runs
+- **THEN** maps host port 3000 to container port 3000
+
+#### Scenario: Volume mounting for ML data
+- **WHEN** docker-compose up runs
+- **THEN** mounts named volume 'ml-data' to /app/ml-data in the candle-annotator container
+
+#### Scenario: Frontend depends on PostgreSQL
+- **WHEN** docker-compose up runs
+- **THEN** the candle-annotator service starts only after the postgres service is healthy (`depends_on: postgres: condition: service_healthy`)
+
+#### Scenario: Frontend DATABASE_URL uses env var interpolation
+- **WHEN** the candle-annotator service starts
+- **THEN** the `DATABASE_URL` environment variable uses `${POSTGRES_PASSWORD}` interpolation: `postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres:5432/${POSTGRES_DB}`
+
+#### Scenario: Restart policy
+- **WHEN** container crashes or stops
+- **THEN** docker-compose automatically restarts container unless explicitly stopped (restart: unless-stopped)
+
+#### Scenario: No SQLite volume
+- **WHEN** docker-compose.yml is parsed
+- **THEN** there is no `candle-data` volume defined or mounted
+
+#### Scenario: PostgreSQL port bound to localhost only
+- **WHEN** docker-compose up runs
+- **THEN** the postgres service port mapping is `127.0.0.1:5432:5432` (not `5432:5432`)
+
+#### Scenario: MLflow port bound to localhost only
+- **WHEN** docker-compose up runs
+- **THEN** the mlflow service port mapping is `127.0.0.1:5000:5000`
+
+#### Scenario: ML service port bound to localhost only
+- **WHEN** docker-compose up runs
+- **THEN** the ml-service port mapping is `127.0.0.1:8001:8001`
+
+#### Scenario: Credentials via env var interpolation
+- **WHEN** docker-compose.yml is parsed
+- **THEN** all database credentials use `${POSTGRES_USER}`, `${POSTGRES_PASSWORD}`, and `${POSTGRES_DB}` variable interpolation from `.env`
+
+### Requirement: Environment variable configuration
+The project SHALL use environment variables for runtime configuration.
+
+#### Scenario: .env.example file with placeholder credentials
+- **WHEN** repository is cloned
+- **THEN** `.env.example` contains `POSTGRES_PASSWORD=change_me_to_a_strong_password` (not a real password)
+
+#### Scenario: .env file gitignored
+- **WHEN** `.gitignore` is inspected
+- **THEN** it includes `.env` (not just `.env*.local`)
+
+#### Scenario: DATABASE_URL configuration
+- **WHEN** `DATABASE_URL` environment variable is set
+- **THEN** the Next.js application connects to the PostgreSQL database at the specified URL
+
+#### Scenario: No DATABASE_PATH variable
+- **WHEN** environment variables are inspected
+- **THEN** there is no `DATABASE_PATH` variable (SQLite path is removed)
+
+#### Scenario: PORT configuration
+- **WHEN** PORT environment variable is set
+- **THEN** Next.js server listens on specified port (default: 3000)
+
+#### Scenario: NODE_ENV configuration
+- **WHEN** NODE_ENV environment variable is set to 'production'
+- **THEN** Next.js runs in production mode with optimizations enabled
+
+#### Scenario: API_KEY configuration
+- **WHEN** `API_KEY` environment variable is set
+- **THEN** both Next.js middleware and FastAPI dependency use this key for authentication
+
+### Requirement: Container security
+The Docker setup SHALL follow security best practices.
+
+#### Scenario: Non-root user
+- **WHEN** container runs
+- **THEN** application process runs as non-root user 'appuser' (UID 1000)
+
+#### Scenario: Read-only filesystem where possible
+- **WHEN** container runs
+- **THEN** only /app/data directory requires write permissions, all other files are read-only to appuser
+
+#### Scenario: No sensitive data in image
+- **WHEN** Docker image is built
+- **THEN** .env files, secrets, and database files are not included in image layers
+
+#### Scenario: Minimal attack surface
+- **WHEN** container runs
+- **THEN** only port 3000 is exposed, no SSH, no unnecessary services, alpine base reduces package vulnerabilities
+
+#### Scenario: No node_modules in production image
+- **WHEN** the Next.js production Docker image is built
+- **THEN** the `COPY --from=builder /app/node_modules` line is removed (standalone output bundles needed deps)
+
+### Requirement: Production build optimization
+The Docker image SHALL be optimized for production use with minimal size.
+
+#### Scenario: Use alpine base images
+- **WHEN** Dockerfile specifies base images
+- **THEN** uses node:18-alpine for both build and runtime stages
+
+#### Scenario: Multi-stage build cleanup
+- **WHEN** Docker image is built
+- **THEN** build artifacts, devDependencies, and source files are not included in final image
+
+#### Scenario: Layer caching optimization
+- **WHEN** Dockerfile is structured
+- **THEN** package.json and package-lock.json are copied and dependencies installed before source code copy for better layer caching
+
+#### Scenario: Final image size
+- **WHEN** Docker image build completes
+- **THEN** final image size is under 200MB (excluding data volume)
+
+#### Scenario: Base images pinned to digest
+- **WHEN** Dockerfiles specify base images
+- **THEN** images use `@sha256:<hash>` pinning for reproducible builds
--- a/openspec/changes/code-review-fix/specs/error-boundary/spec.md
+++ b/openspec/changes/code-review-fix/specs/error-boundary/spec.md
@ -0,0 +1,23 @@
+## ADDED Requirements
+
+### Requirement: React Error Boundary component
+The application SHALL include a React Error Boundary component that catches JavaScript errors in its child component tree and renders a fallback UI instead of crashing to a white screen. The fallback SHALL display: an error message, a "Reload" button that calls `window.location.reload()`, and a "Try Again" button that resets the error boundary state.
+
+#### Scenario: Child component throws during render
+- **WHEN** CandleChart or any child component throws an error during rendering
+- **THEN** the Error Boundary catches the error and renders the fallback UI instead of a white screen
+
+#### Scenario: Fallback UI interaction
+- **WHEN** the user sees the error fallback and clicks "Try Again"
+- **THEN** the Error Boundary resets its state and attempts to re-render the child component tree
+
+#### Scenario: Error logging
+- **WHEN** the Error Boundary catches an error
+- **THEN** the error and component stack are logged via `console.error`
+
+### Requirement: Error Boundary placement in layout
+The root layout (`src/app/layout.tsx`) SHALL wrap `{children}` with the Error Boundary component so that any render error in any page is caught.
+
+#### Scenario: Layout wraps children
+- **WHEN** the application renders any page
+- **THEN** the page content is wrapped inside an Error Boundary
--- a/openspec/changes/code-review-fix/specs/input-validation/spec.md
+++ b/openspec/changes/code-review-fix/specs/input-validation/spec.md
@ -0,0 +1,76 @@
+## ADDED Requirements
+
+### Requirement: Zod validation on proxy routes
+All Next.js proxy routes SHALL validate request bodies using Zod schemas before forwarding to the ML service. If validation fails, the route SHALL return HTTP 400 with `{ "error": "<zod error message>" }` without contacting the ML service.
+
+#### Scenario: Valid predict request
+- **WHEN** POST `/api/predict` receives `{ pair: "EURUSD", timeframe: "1H", candles: [{time: 1, open: 1, high: 2, low: 0.5, close: 1.5}] }`
+- **THEN** the request passes validation and is forwarded to the ML service
+
+#### Scenario: Invalid predict request
+- **WHEN** POST `/api/predict` receives `{ pair: 123, candles: "not-an-array" }`
+- **THEN** the route returns HTTP 400 with a Zod validation error message
+
+#### Scenario: Valid training start request
+- **WHEN** POST `/api/training/start` receives `{ model_type: "random_forest" }`
+- **THEN** the request passes validation and is forwarded
+
+#### Scenario: Valid model load request
+- **WHEN** POST `/api/model/load` receives `{ run_id: "abc-123" }`
+- **THEN** the request passes validation and is forwarded
+
+#### Scenario: Valid pattern detection request
+- **WHEN** POST `/api/patterns/detect` receives `{ candles: [...], patterns: ["CDLENGULFING"] }`
+- **THEN** the request passes validation and is forwarded
+
+### Requirement: run_id format validation
+All routes and endpoints that accept a `run_id` parameter SHALL validate that the value matches the pattern `/^[a-zA-Z0-9_-]+$/`. If validation fails, the route SHALL return HTTP 400 with `{ "error": "Invalid run_id format" }`.
+
+#### Scenario: Valid run_id in URL path
+- **WHEN** DELETE `/api/training/runs/abc-123_v2` is called
+- **THEN** the request proceeds normally
+
+#### Scenario: Path traversal attempt in run_id
+- **WHEN** DELETE `/api/training/runs/../../admin/delete-all` is called
+- **THEN** the route returns HTTP 400 with `{ "error": "Invalid run_id format" }`
+
+#### Scenario: Python service run_id validation
+- **WHEN** POST `/model/load` receives `{ run_id: "../../../etc/passwd" }`
+- **THEN** the FastAPI endpoint returns HTTP 400 with `{ "detail": "Invalid run_id format" }`
+
+### Requirement: File upload size limit
+The `POST /api/upload` route SHALL reject files larger than 10MB. The route SHALL check `file.size` before processing and return HTTP 413 with `{ "error": "File too large. Maximum size is 10MB." }` if exceeded. The route SHALL also limit the number of rows inserted to 500,000.
+
+#### Scenario: File within size limit
+- **WHEN** a 5MB CSV file is uploaded
+- **THEN** the upload proceeds normally
+
+#### Scenario: File exceeds size limit
+- **WHEN** a 15MB CSV file is uploaded
+- **THEN** the route returns HTTP 413 with `{ "error": "File too large. Maximum size is 10MB." }`
+
+#### Scenario: Row count limit
+- **WHEN** a CSV file contains 600,000 rows
+- **THEN** the route returns HTTP 400 with `{ "error": "Too many rows. Maximum is 500,000." }`
+
+### Requirement: Batch prediction input size limit
+The FastAPI `/predict/batch` endpoint SHALL validate that the requested date range does not exceed 1 year. If exceeded, the endpoint SHALL return HTTP 400.
+
+#### Scenario: Date range within limit
+- **WHEN** POST `/predict/batch` requests a 6-month range
+- **THEN** the request proceeds normally
+
+#### Scenario: Date range exceeds limit
+- **WHEN** POST `/predict/batch` requests a 2-year range
+- **THEN** the endpoint returns HTTP 400 with `{ "detail": "Date range exceeds maximum of 1 year" }`
+
+### Requirement: File type validation on upload
+The `POST /api/upload` route SHALL validate that the uploaded file has a `.csv` extension and a text-based MIME type. Other file types SHALL be rejected with HTTP 400.
+
+#### Scenario: Valid CSV upload
+- **WHEN** a file named `data.csv` with MIME type `text/csv` is uploaded
+- **THEN** the upload proceeds normally
+
+#### Scenario: Invalid file type
+- **WHEN** a file named `model.pkl` is uploaded
+- **THEN** the route returns HTTP 400 with `{ "error": "Only CSV files are accepted" }`
--- a/openspec/changes/code-review-fix/specs/ml-inference/spec.md
+++ b/openspec/changes/code-review-fix/specs/ml-inference/spec.md
@ -0,0 +1,66 @@
+## ADDED Requirements
+
+### Requirement: CORS restricted to explicit origins
+The FastAPI ML service SHALL configure CORS with explicit allowed origins instead of wildcard `*`. The default allowed origins SHALL be `["http://localhost:3000"]`. Additional origins MAY be configured via the `CORS_ORIGINS` environment variable (comma-separated).
+
+#### Scenario: Same-origin request allowed
+- **WHEN** a request from `http://localhost:3000` hits the ML service
+- **THEN** the CORS headers allow the request
+
+#### Scenario: Unknown origin blocked
+- **WHEN** a request from `http://evil.com` hits the ML service
+- **THEN** the CORS headers do not include `Access-Control-Allow-Origin` for that origin
+
+### Requirement: Generic error responses from ML service
+All FastAPI endpoints SHALL return generic error messages for unexpected exceptions. The response SHALL be `{ "detail": "Internal server error" }` with HTTP 500. Full exception details SHALL be logged via `logging.error` with traceback.
+
+#### Scenario: Internal error generic response
+- **WHEN** a predict endpoint throws an unexpected exception
+- **THEN** the client receives `{ "detail": "Internal server error" }` and the traceback is logged server-side
+
+### Requirement: Model file integrity check
+When loading a model via `joblib.load()`, the system SHALL verify the model file's SHA256 hash against a manifest file (`models/checksums.sha256`). If the hash does not match or the manifest entry is missing, the load SHALL fail with an error.
+
+#### Scenario: Valid model file
+- **WHEN** `joblib.load("models/abc123.pkl")` is called and the file's SHA256 matches the manifest
+- **THEN** the model loads successfully
+
+#### Scenario: Tampered model file
+- **WHEN** the model file's SHA256 does not match the manifest entry
+- **THEN** the system refuses to load the model and returns HTTP 500 with `{ "detail": "Model integrity check failed" }`
+
+### Requirement: Thread-safe model reads
+The ML service SHALL use a lock for model reads during prediction, not just model writes. All code that reads the current model reference SHALL acquire the `_model_swap_lock` or use an atomic reference swap pattern.
+
+#### Scenario: Concurrent prediction and model swap
+- **WHEN** a prediction request is in progress and a model load request arrives
+- **THEN** the prediction completes with the old model (or waits), and the new model is loaded atomically
+
+### Requirement: Path traversal prevention on model operations
+The FastAPI endpoints that accept `run_id` for model load and delete SHALL validate the `run_id` format and verify that the resolved file path is within the expected `models/` directory using `Path.resolve()`.
+
+#### Scenario: Valid run_id resolves within models directory
+- **WHEN** POST `/model/load` receives `{ run_id: "abc123" }`
+- **THEN** the path resolves to `models/abc123.pkl` within the models directory
+
+#### Scenario: Path traversal attempt blocked
+- **WHEN** POST `/model/load` receives `{ run_id: "../../etc/passwd" }`
+- **THEN** the endpoint returns HTTP 400 before attempting any file operation
+
+### Requirement: Real health check
+The `GET /health` endpoint SHALL perform actual connectivity checks instead of returning hardcoded status. It SHALL execute `SELECT 1` against PostgreSQL and attempt an MLflow API call.
+
+#### Scenario: All services healthy
+- **WHEN** PostgreSQL responds to `SELECT 1` and MLflow API is reachable
+- **THEN** the health endpoint returns `{ "status": "healthy", "database": "connected", "mlflow": "connected" }`
+
+#### Scenario: Database unreachable
+- **WHEN** PostgreSQL does not respond to `SELECT 1`
+- **THEN** the health endpoint returns `{ "status": "degraded", "database": "disconnected" }` with HTTP 200
+
+### Requirement: Candle time-sort validation
+The `POST /predict` endpoint SHALL validate that input candles are sorted by time in ascending order. If candles are not sorted, the endpoint SHALL sort them before processing.
+
+#### Scenario: Unsorted candles auto-sorted
+- **WHEN** candles are provided in random time order
+- **THEN** the endpoint sorts them by time before feature engineering and prediction
--- a/openspec/changes/code-review-fix/specs/ml-training/spec.md
+++ b/openspec/changes/code-review-fix/specs/ml-training/spec.md
@ -0,0 +1,23 @@
+## ADDED Requirements
+
+### Requirement: Training resource limits
+The `POST /training/start` endpoint SHALL enforce resource limits: the training dataset file size SHALL not exceed 500MB, and the training thread SHALL have a configurable timeout (default: 30 minutes). If the timeout is exceeded, the training thread SHALL be marked as failed.
+
+#### Scenario: Dataset too large
+- **WHEN** the training dataset exceeds 500MB
+- **THEN** training fails immediately with `{ "detail": "Dataset too large. Maximum 500MB." }`
+
+#### Scenario: Training timeout
+- **WHEN** a training run exceeds the 30-minute timeout
+- **THEN** the training status is set to "failed" with reason "Training timed out"
+
+### Requirement: run_id validation on training endpoints
+The FastAPI training endpoints (`DELETE /training/runs/{run_id}`, `GET /training/runs/{run_id}`) SHALL validate that `run_id` matches `/^[a-zA-Z0-9_-]+$/` before any database or file operation.
+
+#### Scenario: Valid run_id
+- **WHEN** `DELETE /training/runs/run-2024-01-15_v3` is called
+- **THEN** the request proceeds normally
+
+#### Scenario: Invalid run_id
+- **WHEN** `DELETE /training/runs/../../admin` is called
+- **THEN** the endpoint returns HTTP 400 with `{ "detail": "Invalid run_id format" }`
--- a/openspec/changes/code-review-fix/specs/postgres-data-layer/spec.md
+++ b/openspec/changes/code-review-fix/specs/postgres-data-layer/spec.md
@ -0,0 +1,40 @@
+## MODIFIED Requirements
+
+### Requirement: Environment variable configuration (credentials)
+The project SHALL use environment variables for runtime configuration. Credentials SHALL NOT be hardcoded in any committed file.
+
+#### Scenario: .env file gitignored
+- **WHEN** `.gitignore` is inspected
+- **THEN** it includes `.env` (bare, not just `.env*.local`)
+
+#### Scenario: .env removed from git history
+- **WHEN** `git ls-files .env` is run
+- **THEN** `.env` is NOT tracked by git
+
+#### Scenario: .env.example has placeholder credentials
+- **WHEN** `.env.example` is inspected
+- **THEN** it contains `POSTGRES_PASSWORD=change_me_to_a_strong_password` (not a real password)
+
+#### Scenario: No credentials in Python source
+- **WHEN** `services/ml/app/db.py` is inspected
+- **THEN** there are no SQL comments containing usernames or passwords, and the code fails fast if `DATABASE_URL` env var is not set
+
+## ADDED Requirements
+
+### Requirement: models directory gitignored
+The `.gitignore` file SHALL include `models/` and `*.pkl` patterns to prevent model files from being committed.
+
+#### Scenario: Model files excluded
+- **WHEN** a model file is saved to `models/best.pkl`
+- **THEN** `git status` does not show it as untracked
+
+### Requirement: devDependencies correctly categorized
+The `package.json` SHALL list `@types/*`, `typescript`, `eslint`, `eslint-config-next`, `autoprefixer`, and `postcss` under `devDependencies` (not `dependencies`).
+
+#### Scenario: Type packages in devDependencies
+- **WHEN** `package.json` is inspected
+- **THEN** `@types/node`, `@types/react`, `@types/react-dom`, `@types/papaparse`, `@types/pg` are in `devDependencies`
+
+#### Scenario: Build tools in devDependencies
+- **WHEN** `package.json` is inspected
+- **THEN** `typescript`, `eslint`, `eslint-config-next`, `autoprefixer`, `postcss` are in `devDependencies`
--- a/openspec/changes/code-review-fix/specs/prediction-ui/spec.md
+++ b/openspec/changes/code-review-fix/specs/prediction-ui/spec.md
@ -0,0 +1,41 @@
+## ADDED Requirements
+
+### Requirement: AbortController for prediction requests
+The prediction fetching functions (`fetchPredictions`, `handleFetchBatchPredictions`) SHALL use `AbortController` to cancel previous in-flight requests when a new request is initiated. A ref SHALL hold the current controller. Stale responses SHALL be discarded.
+
+#### Scenario: Rapid clicks cancel previous request
+- **WHEN** the user clicks "Run on Visible" twice quickly
+- **THEN** the first request is aborted and only the second response is rendered
+
+#### Scenario: Batch prediction cancellation
+- **WHEN** the user clicks "Predict All" while a previous batch prediction is in progress
+- **THEN** the previous batch request is aborted
+
+### Requirement: Bounded prediction cache
+The `predictionCacheRef` Map SHALL have a maximum size of 100 entries. When the cache exceeds the limit, the oldest entry (first inserted) SHALL be evicted. The cache SHALL function as a simple FIFO bounded map.
+
+#### Scenario: Cache eviction at limit
+- **WHEN** the cache has 100 entries and a new prediction result is cached
+- **THEN** the oldest entry is deleted before the new one is inserted
+
+#### Scenario: Cache size never exceeds limit
+- **WHEN** 200 unique prediction ranges are fetched
+- **THEN** the cache contains at most 100 entries at any point
+
+### Requirement: Stable modelInfo reference for cache key
+The `generateCacheKey` function SHALL read `modelInfo` from a ref (not from the `predictionState` closure) to prevent stale cache keys.
+
+#### Scenario: Model change produces correct cache key
+- **WHEN** the user loads a new model and then runs predictions
+- **THEN** the cache key includes the new model version (not the old one from a stale closure)
+
+### Requirement: Confirmation dialog for delete-all annotations
+The "Delete All" annotations action SHALL show a confirmation dialog before executing. The dialog SHALL display "Are you sure you want to delete all annotations?" with "Cancel" and "Delete" buttons.
+
+#### Scenario: User confirms deletion
+- **WHEN** the user clicks "Delete All" and then clicks "Delete" in the confirmation dialog
+- **THEN** all annotations for the active chart are deleted
+
+#### Scenario: User cancels deletion
+- **WHEN** the user clicks "Delete All" and then clicks "Cancel" in the confirmation dialog
+- **THEN** no annotations are deleted
--- a/openspec/changes/code-review-fix/specs/security-headers/spec.md
+++ b/openspec/changes/code-review-fix/specs/security-headers/spec.md
@ -0,0 +1,21 @@
+## ADDED Requirements
+
+### Requirement: Security response headers
+The Next.js application SHALL add security response headers to all routes via the `headers()` function in `next.config.js`. The following headers SHALL be set:
+- `X-Frame-Options: DENY`
+- `X-Content-Type-Options: nosniff`
+- `Referrer-Policy: strict-origin-when-cross-origin`
+- `Permissions-Policy: camera=(), microphone=(), geolocation=()`
+- `Content-Security-Policy: default-src 'self'; script-src 'self' 'unsafe-eval' 'unsafe-inline'; style-src 'self' 'unsafe-inline' https://fonts.googleapis.com; font-src 'self' https://fonts.gstatic.com; img-src 'self' data: blob:`
+
+#### Scenario: Headers present on HTML response
+- **WHEN** a browser requests any page
+- **THEN** the response includes all five security headers
+
+#### Scenario: Headers present on API response
+- **WHEN** a client requests any `/api/*` endpoint
+- **THEN** the response includes all five security headers
+
+#### Scenario: Clickjacking prevented
+- **WHEN** a third-party site attempts to embed the application in an iframe
+- **THEN** the browser blocks the embedding due to `X-Frame-Options: DENY`
--- a/openspec/changes/code-review-fix/specs/shared-types/spec.md
+++ b/openspec/changes/code-review-fix/specs/shared-types/spec.md
@ -0,0 +1,33 @@
+## ADDED Requirements
+
+### Requirement: Centralized TypeScript interfaces
+The project SHALL define shared TypeScript interfaces in `src/types/` with the following files:
+- `candles.ts`: `Candle` interface with `time`, `open`, `high`, `low`, `close`, optional `volume`
+- `annotations.ts`: `Annotation`, `AnnotationType` interfaces
+- `charts.ts`: `Chart` interface
+- `predictions.ts`: `PredictionSpan`, `PredictionState`, `ModelInfo` interfaces
+- `span-annotations.ts`: `SpanAnnotation`, `SpanLabelType`, `SubSpan` interfaces
+- `index.ts`: barrel file re-exporting all types
+
+#### Scenario: Import shared Candle type
+- **WHEN** a component needs the `Candle` interface
+- **THEN** it imports from `@/types` instead of defining its own
+
+#### Scenario: No duplicate interface definitions
+- **WHEN** the codebase is searched for `interface Candle` or `interface SpanAnnotation`
+- **THEN** each interface is defined exactly once in `src/types/` (not in component files)
+
+#### Scenario: Type consistency across components
+- **WHEN** `page.tsx`, `CandleChart.tsx`, `SpanAnnotationManager.tsx`, `Toolbox.tsx`, `SpanAnnotationList.tsx`, and `SpanPopover.tsx` reference shared types
+- **THEN** they all import from `@/types` and use the same interface definitions
+
+### Requirement: Replace pervasive any types
+All `any` type annotations in component files SHALL be replaced with proper typed interfaces from `src/types/`. Specifically: `geometry` fields SHALL use a defined `Geometry` interface, `sub_spans` SHALL use `SubSpan[]`, candle arrays SHALL use `Candle[]`, and prediction cache values SHALL use `PredictionSpan[]`.
+
+#### Scenario: Geometry field typed
+- **WHEN** a component accesses `annotation.geometry`
+- **THEN** the type is `Geometry | null` (not `any`)
+
+#### Scenario: Prediction cache typed
+- **WHEN** `predictionCacheRef` is accessed
+- **THEN** its type is `Map<string, PredictionSpan[]>` (not `Map<string, any>`)
--- a/openspec/changes/code-review-fix/specs/span-annotation/spec.md
+++ b/openspec/changes/code-review-fix/specs/span-annotation/spec.md
@ -0,0 +1,40 @@
+## ADDED Requirements
+
+### Requirement: Keyboard handler uses stable selectedSpanId
+The SpanAnnotationManager keyboard handler SHALL use a `useRef` for `selectedSpanId` to prevent stale closure reads. The `handleDeleteSpan` function SHALL be wrapped in `useCallback` with proper dependencies.
+
+#### Scenario: Delete correct span via keyboard
+- **WHEN** user selects span A, then selects span B, then presses Delete
+- **THEN** span B is deleted (not span A from a stale closure)
+
+### Requirement: Preview primitive uses ref instead of state
+The SpanAnnotationManager preview primitive (shown during span drawing on mouse move) SHALL use a `useRef` instead of `useState` to avoid a state/effect feedback loop and unnecessary re-renders.
+
+#### Scenario: Mouse move during drawing does not cause re-render cascade
+- **WHEN** the user moves the mouse while drawing a span annotation
+- **THEN** the preview primitive updates via ref mutation without triggering a React re-render
+
+### Requirement: Preview primitive cleanup on unmount
+The SpanAnnotationManager SHALL detach the preview primitive in the `useEffect` cleanup function when the component unmounts, preventing a leaked canvas primitive.
+
+#### Scenario: Component unmounts during drawing
+- **WHEN** the user navigates away while mid-draw
+- **THEN** the preview primitive is detached from the chart
+
+### Requirement: fitContent not called on every span change
+The SpanAnnotationManager reconciliation effect SHALL NOT call `chart.timeScale().fitContent()` on every span annotation change. `fitContent()` SHALL only be called on initial chart load.
+
+#### Scenario: Span annotation added preserves zoom
+- **WHEN** the user adds a new span annotation
+- **THEN** the chart maintains the current scroll position and zoom level (no fitContent reset)
+
+### Requirement: Incremental primitive updates
+The SpanAnnotationManager SHALL update only the selection state of existing primitives on selection change instead of detaching all and re-attaching all. Full recreation SHALL only occur when the annotation list itself changes (add/remove/edit).
+
+#### Scenario: Selection change is O(1)
+- **WHEN** the user clicks a different span annotation
+- **THEN** only the previously selected and newly selected primitives are updated (not all N primitives)
+
+#### Scenario: Annotation add triggers full reconciliation
+- **WHEN** a new span annotation is added
+- **THEN** the primitive list is reconciled (new primitive added, existing ones kept)
--- a/openspec/changes/code-review-fix/specs/ui-shell/spec.md
+++ b/openspec/changes/code-review-fix/specs/ui-shell/spec.md
@ -0,0 +1,97 @@
+## ADDED Requirements
+
+### Requirement: Accessibility on interactive elements
+All interactive elements (buttons, dropdowns, modals) SHALL have `aria-label` attributes describing their function. Toggle buttons SHALL use `aria-pressed`. The keyboard shortcuts modal SHALL have `role="dialog"` and `aria-modal="true"`.
+
+#### Scenario: Button has aria-label
+- **WHEN** a toolbar button renders
+- **THEN** it has an `aria-label` attribute describing its action (e.g., "Draw span annotation")
+
+#### Scenario: Modal has dialog role
+- **WHEN** the keyboard shortcuts modal opens
+- **THEN** it has `role="dialog"` and `aria-modal="true"`
+
+### Requirement: Focus trapping in modals
+Modal dialogs (keyboard shortcuts, confirmation dialogs) SHALL trap focus within the modal while open. Tab and Shift+Tab SHALL cycle through focusable elements within the modal.
+
+#### Scenario: Focus trapped in modal
+- **WHEN** a modal is open and the user presses Tab
+- **THEN** focus cycles through focusable elements within the modal only
+
+### Requirement: Dark theme on settings pages
+The annotation-types and span-label-types settings pages SHALL use theme-aware CSS variables instead of hardcoded light colors. They SHALL render correctly in both light and dark themes.
+
+#### Scenario: Settings page in dark mode
+- **WHEN** the theme is set to dark and user navigates to annotation-types page
+- **THEN** the page renders with dark background and light text (no hardcoded white backgrounds)
+
+### Requirement: next/font for Google Fonts
+The application SHALL use `next/font/google` to load Google Fonts instead of CSS `@import` in `globals.css`. This prevents render-blocking font loading.
+
+#### Scenario: Font loaded via next/font
+- **WHEN** the application loads
+- **THEN** Google Fonts are loaded via `next/font/google` (not via CSS `@import url()` in globals.css)
+
+### Requirement: Confidence slider debounce
+The confidence threshold slider in PredictionPanel SHALL debounce chart re-renders. The slider SHALL update the displayed value immediately but only trigger chart updates after 150ms of inactivity (or on `onPointerUp`).
+
+#### Scenario: Dragging slider does not re-render chart per pixel
+- **WHEN** the user drags the confidence slider
+- **THEN** the chart re-renders at most once every 150ms (not on every pixel movement)
+
+### Requirement: ChartSelector closes on outside click
+The custom chart selector dropdown SHALL close when the user clicks outside of it.
+
+#### Scenario: Click outside closes dropdown
+- **WHEN** the chart selector dropdown is open and the user clicks elsewhere on the page
+- **THEN** the dropdown closes
+
+### Requirement: Dead code removal
+The following dead code SHALL be removed:
+- `src/lib/db/migrate.ts` (SQLite migration code)
+- `get_db_session()` in `services/ml/app/db.py` (unused session leak)
+- Dead filter code (TODO comment, no-op) in `page.tsx`
+- Dead `inference*` package reference in `pyproject.toml`
+
+#### Scenario: No dead migration code
+- **WHEN** `src/lib/db/migrate.ts` is searched for
+- **THEN** the file does not exist
+
+#### Scenario: No unused db session function
+- **WHEN** `services/ml/app/db.py` is inspected
+- **THEN** `get_db_session()` function is absent
+
+### Requirement: Deprecated Python API replacements
+The ML service SHALL replace deprecated Python APIs:
+- `@app.on_event("startup")` replaced with lifespan pattern
+- `declarative_base()` replaced with `class Base(DeclarativeBase)`
+- `datetime.utcnow()` replaced with `datetime.now(datetime.UTC)`
+
+#### Scenario: No deprecated startup event
+- **WHEN** `services/ml/app/main.py` is inspected
+- **THEN** startup logic uses the FastAPI lifespan pattern (not `@app.on_event`)
+
+#### Scenario: No deprecated utcnow
+- **WHEN** `datetime.utcnow()` is searched for in the ML service
+- **THEN** zero matches are found (all replaced with `datetime.now(datetime.UTC)`)
+
+### Requirement: Unused import cleanup
+Components SHALL not have unused imports. Specifically, `Toolbox.tsx` SHALL remove unused `TrendingUp` and `ChevronUp` imports.
+
+#### Scenario: No unused imports in Toolbox
+- **WHEN** `Toolbox.tsx` is inspected
+- **THEN** only imported symbols that are used in the component are present
+
+### Requirement: Tooltip component functional or removed
+The `ui/tooltip.tsx` component SHALL either be implemented using Radix Tooltip or removed if unused.
+
+#### Scenario: Tooltip is functional
+- **WHEN** the Tooltip component is used
+- **THEN** it renders an actual tooltip on hover (not a no-op passthrough)
+
+### Requirement: SpanAnnotationList confidence check for zero
+The confidence display in SpanAnnotationList SHALL use `!= null` instead of a falsy check, so that a confidence value of `0` is correctly displayed.
+
+#### Scenario: Confidence zero displayed
+- **WHEN** a span annotation has confidence value `0`
+- **THEN** the list displays "0%" (not hidden as if confidence is absent)
--- a/openspec/changes/code-review-fix/tasks.md
+++ b/openspec/changes/code-review-fix/tasks.md
@ -6,7 +6,7 @@
 - [x] 1.4 `[haiku]` Remove SQL comment with credentials from `services/ml/app/db.py` and add fail-fast check for missing `DATABASE_URL`
 - [x] 1.5 `[sonnet]` Update `docker-compose.yml` to use `${POSTGRES_USER}`, `${POSTGRES_PASSWORD}`, `${POSTGRES_DB}` env var interpolation in all DATABASE_URL values
 - [x] 1.6 `[haiku]` Bind PostgreSQL port to `127.0.0.1:5432:5432` in `docker-compose.yml`
- [ ] 1.7 `[haiku]` Bind MLflow port to `127.0.0.1:5000:5000` in `docker-compose.yml`
+- [x] 1.7 `[haiku]` Bind MLflow port to `127.0.0.1:5000:5000` in `docker-compose.yml`
 - [ ] 1.8 `[haiku]` Bind ML service port to `127.0.0.1:8001:8001` in `docker-compose.yml`

 ## 2. Security Critical — Input Validation & CORS