candle-annotator/openspec/specs/docker-deployment/spec.md

## ADDED Requirements

### Requirement: Multi-stage Dockerfile
The project SHALL include a Dockerfile with multi-stage build for optimized production images.

#### Scenario: Build stage setup
- **WHEN** Dockerfile build stage executes
- **THEN** uses Node.js 18-alpine base image, copies package files, installs ALL dependencies including devDependencies, copies source code, and runs `npm run build`

#### Scenario: Runtime stage setup
- **WHEN** Dockerfile runtime stage executes
- **THEN** uses Node.js 18-alpine base image, creates non-root user 'appuser', copies only the standalone build output (which already bundles the required production dependencies) from build stage, and sets USER to appuser

#### Scenario: Working directory structure
- **WHEN** container runs
- **THEN** application files are in /app directory, the `ml-data` volume mounts to /app/ml-data, and permissions allow appuser to write to /app/ml-data

#### Scenario: Environment variables in Dockerfile
- **WHEN** Dockerfile defines environment
- **THEN** sets NODE_ENV=production, PORT=3000, and HOSTNAME=0.0.0.0 for Next.js standalone server

#### Scenario: Exposed ports
- **WHEN** container is built
- **THEN** Dockerfile exposes port 3000 for HTTP traffic

#### Scenario: Container startup
- **WHEN** container starts
- **THEN** executes `node server.js` (Next.js standalone output) as the CMD

### Requirement: Docker Compose configuration
The project SHALL include docker-compose.yml for simplified deployment orchestration.

#### Scenario: Service definition
- **WHEN** docker-compose.yml is parsed
- **THEN** defines service named 'candle-annotator' using Dockerfile from current directory

#### Scenario: Port mapping
- **WHEN** docker-compose up runs
- **THEN** maps host port 3000 to container port 3000

#### Scenario: Volume mounting for ML data
- **WHEN** docker-compose up runs
- **THEN** mounts named volume 'ml-data' to /app/ml-data in the candle-annotator container

#### Scenario: Frontend depends on PostgreSQL
- **WHEN** docker-compose up runs
- **THEN** the candle-annotator service starts only after the postgres service is healthy (`depends_on: postgres: condition: service_healthy`)

#### Scenario: Frontend DATABASE_URL uses env var interpolation
- **WHEN** the candle-annotator service starts
- **THEN** the `DATABASE_URL` environment variable uses `${POSTGRES_PASSWORD}` interpolation: `postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres:5432/${POSTGRES_DB}`
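
The interpolation above can be sketched in TypeScript for illustration. Compose performs the substitution itself, so `buildDatabaseUrl` and the `PostgresEnv` type are hypothetical names, not part of the project. The sketch also shows one caveat: Compose substitutes values literally, so a password containing reserved URL characters would need percent-encoding to produce a valid connection string.

```typescript
// Hypothetical helper mirroring the Compose interpolation
// postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres:5432/${POSTGRES_DB}
type PostgresEnv = {
  POSTGRES_USER?: string;
  POSTGRES_PASSWORD?: string;
  POSTGRES_DB?: string;
};

function buildDatabaseUrl(
  env: PostgresEnv,
  host: string = "postgres", // Compose service name from the spec
  port: number = 5432
): string {
  const { POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_DB } = env;
  if (!POSTGRES_USER || !POSTGRES_PASSWORD || !POSTGRES_DB) {
    throw new Error("POSTGRES_USER, POSTGRES_PASSWORD and POSTGRES_DB must all be set");
  }
  // Percent-encode so characters such as '@' or '/' in a password cannot break the URL.
  const user = encodeURIComponent(POSTGRES_USER);
  const password = encodeURIComponent(POSTGRES_PASSWORD);
  const db = encodeURIComponent(POSTGRES_DB);
  return `postgresql://${user}:${password}@${host}:${port}/${db}`;
}
```

Calling it with `process.env` would fail fast when a variable is missing from `.env`, rather than producing a URL with empty credentials.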

#### Scenario: Restart policy
- **WHEN** container crashes or stops
- **THEN** docker-compose automatically restarts container unless explicitly stopped (restart: unless-stopped)

#### Scenario: No SQLite volume
- **WHEN** docker-compose.yml is parsed
- **THEN** there is no `candle-data` volume defined or mounted

#### Scenario: PostgreSQL port bound to localhost only
- **WHEN** docker-compose up runs
- **THEN** the postgres service port mapping is `127.0.0.1:5432:5432` (not `5432:5432`)

#### Scenario: MLflow port bound to localhost only
- **WHEN** docker-compose up runs
- **THEN** the mlflow service port mapping is `127.0.0.1:5000:5000`

#### Scenario: ML service port bound to localhost only
- **WHEN** docker-compose up runs
- **THEN** the ml-service port mapping is `127.0.0.1:8001:8001`
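
The three localhost-only scenarios above share one rule: the short-syntax mapping must carry an explicit `127.0.0.1` host IP. A minimal sketch of a check for that rule follows; `isLocalhostOnly` and `internalPorts` are hypothetical names, and only the IPv4 form used in this spec is handled.

```typescript
// Hypothetical check for Compose short-syntax mappings: [HOST_IP:]HOST_PORT:CONTAINER_PORT
function isLocalhostOnly(mapping: string): boolean {
  const parts = mapping.split(":");
  // With only two parts there is no host IP, so Docker publishes on all interfaces.
  return parts.length === 3 && parts[0] === "127.0.0.1";
}

// Mappings required by the scenarios above.
const internalPorts: Record<string, string> = {
  postgres: "127.0.0.1:5432:5432",
  mlflow: "127.0.0.1:5000:5000",
  "ml-service": "127.0.0.1:8001:8001",
};

// Names of services whose ports would be reachable from other hosts.
const publiclyExposed: string[] = Object.keys(internalPorts).filter(
  (service) => !isLocalhostOnly(internalPorts[service])
);
```

A mapping such as `5432:5432` fails the check, which is the case the PostgreSQL scenario explicitly rules out.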

#### Scenario: Credentials via env var interpolation
- **WHEN** docker-compose.yml is parsed
- **THEN** all database credentials use `${POSTGRES_USER}`, `${POSTGRES_PASSWORD}`, and `${POSTGRES_DB}` variable interpolation from `.env`

### Requirement: Environment variable configuration
The project SHALL use environment variables for runtime configuration.

#### Scenario: .env.example file with placeholder credentials
- **WHEN** repository is cloned
- **THEN** `.env.example` contains `POSTGRES_PASSWORD=change_me_to_a_strong_password` (not a real password)

#### Scenario: .env file gitignored
- **WHEN** `.gitignore` is inspected
- **THEN** it includes `.env` (not just `.env*.local`)

#### Scenario: DATABASE_URL configuration
- **WHEN** `DATABASE_URL` environment variable is set
- **THEN** the Next.js application connects to the PostgreSQL database at the specified URL

#### Scenario: No DATABASE_PATH variable
- **WHEN** environment variables are inspected
- **THEN** there is no `DATABASE_PATH` variable (SQLite path is removed)

#### Scenario: PORT configuration
- **WHEN** PORT environment variable is set
- **THEN** Next.js server listens on specified port (default: 3000)
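
As an illustration of the defaulting behavior described here, the following sketch resolves a port from a raw environment value. It is not Next.js's own implementation; `resolvePort` is a hypothetical helper, and rejecting invalid values (rather than silently falling back) is an assumption.

```typescript
// Hypothetical helper: resolve the listening port from the raw PORT value.
function resolvePort(raw: string | undefined, fallback: number = 3000): number {
  // Unset or blank PORT falls back to the default from the spec.
  if (raw === undefined || raw.trim() === "") return fallback;
  const port = Number(raw);
  // Accept only whole numbers in the valid TCP port range; NaN fails this check.
  const valid = port % 1 === 0 && port >= 1 && port <= 65535;
  if (!valid) throw new Error(`Invalid PORT value: ${raw}`);
  return port;
}
```

For example, `resolvePort(process.env.PORT)` yields 3000 when `PORT` is not set.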

#### Scenario: NODE_ENV configuration
- **WHEN** NODE_ENV environment variable is set to 'production'
- **THEN** Next.js runs in production mode with optimizations enabled

#### Scenario: API_KEY configuration
- **WHEN** `API_KEY` environment variable is set
- **THEN** both Next.js middleware and FastAPI dependency use this key for authentication
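
One way the Next.js side of this check could look is sketched below. The function names are hypothetical, and two behaviors are assumptions not stated in this spec: requests are rejected when `API_KEY` is unset, and the comparison avoids returning early on the first mismatching character.

```typescript
// Hypothetical comparison that does not return early on the first mismatch.
function constantTimeEqual(a: string, b: string): boolean {
  // Fold the length difference into the result, then walk the longer string.
  let diff = a.length ^ b.length;
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    // charCodeAt returns NaN past the end of a string; `| 0` coerces it to 0.
    diff |= (a.charCodeAt(i) | 0) ^ (b.charCodeAt(i) | 0);
  }
  return diff === 0;
}

// presentedKey: value sent by the client; expectedKey: value of API_KEY.
function isAuthorized(presentedKey: string | null, expectedKey: string | undefined): boolean {
  // Assumption: fail closed, so an unset API_KEY never allows anonymous access.
  if (!expectedKey || presentedKey === null) return false;
  return constantTimeEqual(presentedKey, expectedKey);
}
```

The FastAPI dependency would need an equivalent check against the same variable so both services accept exactly the same key.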

### Requirement: Health check endpoint
The API SHALL provide a health check endpoint for container orchestration.

#### Scenario: Health check endpoint responds
- **WHEN** GET request sent to `/api/health`
- **THEN** system returns 200 status with JSON `{ "status": "ok", "timestamp": <unix_timestamp> }`

#### Scenario: Database connection check
- **WHEN** GET request sent to `/api/health?check=db`
- **THEN** system attempts a PostgreSQL query and returns 200 if successful, 503 if database unavailable
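
The two health-check scenarios above can be sketched as a pure function that maps the database probe result to a response. `buildHealthResponse` and `HealthResult` are hypothetical names; the timestamp unit (seconds) and the body returned with a 503 are assumptions, since the spec only fixes the status codes and the 200 body.

```typescript
type HealthResult = {
  status: number;
  body: { status: "ok" | "error"; timestamp: number };
};

// dbOk is undefined when no database check was requested (plain /api/health),
// and true or false for the outcome of the query under /api/health?check=db.
function buildHealthResponse(dbOk: boolean | undefined, nowMs: number): HealthResult {
  // Assumed unit: Unix timestamp in whole seconds.
  const timestamp = Math.floor(nowMs / 1000);
  if (dbOk === false) {
    return { status: 503, body: { status: "error", timestamp } };
  }
  return { status: 200, body: { status: "ok", timestamp } };
}
```

A route handler would run the query (for example a trivial `SELECT 1`) inside a try/catch, pass the outcome in as `dbOk`, and pass `Date.now()` as `nowMs`.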

#### Scenario: Health check in Dockerfile
- **WHEN** Dockerfile defines HEALTHCHECK
- **THEN** runs `curl -f http://localhost:3000/api/health || exit 1` every 30 seconds with 3 retries (curl must be installed in the runtime stage, since the node:18-alpine image does not include it)

### Requirement: .dockerignore file
The project SHALL include .dockerignore to exclude unnecessary files from Docker context.

#### Scenario: Excluded files
- **WHEN** Docker build context is created
- **THEN** .dockerignore excludes node_modules, .next, .git, data/, *.md, .env*, and test files

#### Scenario: Included files
- **WHEN** Docker build context is created
- **THEN** includes package.json, package-lock.json, source code in src/, and required config files

### Requirement: Next.js standalone output
The build SHALL use Next.js standalone output mode for minimal production bundle.

#### Scenario: next.config.js standalone setting
- **WHEN** next.config.js is read
- **THEN** output property is set to 'standalone'

#### Scenario: Standalone build output
- **WHEN** npm run build executes
- **THEN** Next.js creates .next/standalone directory with minimal runtime files and dependencies

#### Scenario: Copy standalone files to image
- **WHEN** Dockerfile runtime stage executes
- **THEN** copies .next/standalone/ contents to /app, copies .next/static to /app/.next/static, and copies public/ to /app/public

### Requirement: Production build optimization
The Docker image SHALL be optimized for production use with minimal size.

#### Scenario: Use alpine base images
- **WHEN** Dockerfile specifies base images
- **THEN** uses node:18-alpine for both build and runtime stages

#### Scenario: Multi-stage build cleanup
- **WHEN** Docker image is built
- **THEN** intermediate build artifacts, devDependencies, and source files are not included in final image

#### Scenario: Layer caching optimization
- **WHEN** Dockerfile is structured
- **THEN** package.json and package-lock.json are copied and dependencies installed before source code copy for better layer caching

#### Scenario: Final image size
- **WHEN** Docker image build completes
- **THEN** final image size is under 200MB (excluding data volume)

#### Scenario: Base images pinned to digest
- **WHEN** Dockerfiles specify base images
- **THEN** images use `@sha256:<hash>` pinning for reproducible builds
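
A check for digest pinning could be sketched as follows. `findUnpinnedBaseImages` is a hypothetical helper that only inspects `FROM` lines textually, so it would also flag `FROM scratch` or a `FROM` that references an earlier build stage by name.

```typescript
// Returns the FROM lines that are not pinned to a full sha256 digest.
function findUnpinnedBaseImages(dockerfile: string): string[] {
  // A sha256 digest is 64 lowercase hex characters.
  const pinned = /@sha256:[0-9a-f]{64}(\s|$)/;
  return dockerfile
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => /^FROM\s+/i.test(line) && !pinned.test(line));
}
```

Run against each Dockerfile in the repository, an empty result would indicate that every base image reference carries a digest.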

### Requirement: ML service non-root user
The ML service Dockerfile SHALL create a non-root user and run the application as that user. The Dockerfile SHALL include `RUN useradd -m -r appuser` and `USER appuser` directives.

#### Scenario: Container runs as non-root
- **WHEN** the ML service container starts
- **THEN** the application process runs as user `appuser` (not root)

### Requirement: TA-Lib downloaded over HTTPS with checksum
The ML service Dockerfile SHALL download TA-Lib source over HTTPS (not HTTP). The download SHALL be verified with a SHA256 checksum before extraction.

#### Scenario: HTTPS download
- **WHEN** the Dockerfile downloads TA-Lib source
- **THEN** the URL uses `https://` protocol

#### Scenario: Checksum verification
- **WHEN** the TA-Lib tarball is downloaded
- **THEN** a `sha256sum -c` check runs before extraction, and the build fails if the checksum does not match

### Requirement: .dockerignore file exists
The project SHALL include a `.dockerignore` file at the repository root that excludes `.git`, `.env`, `.env*`, `node_modules`, `.next`, `data/`, `*.md`, `__pycache__/`, `mlruns/`, and `models/`.

#### Scenario: Docker context excludes sensitive files
- **WHEN** `docker build` runs
- **THEN** `.env`, `.git`, and `node_modules` are not included in the build context
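
The exclusion list in this requirement can be verified with a small script. The sketch below is a hypothetical check that compares entries literally; it does not evaluate `.dockerignore` glob or negation semantics.

```typescript
// Patterns required by this requirement.
const REQUIRED_IGNORES: string[] = [
  ".git", ".env", ".env*", "node_modules", ".next",
  "data/", "*.md", "__pycache__/", "mlruns/", "models/",
];

// Returns the required patterns that do not appear as a line in the file.
function missingIgnorePatterns(dockerignore: string): string[] {
  const present = dockerignore
    .split("\n")
    .map((line) => line.trim())
    // Skip blank lines and comments.
    .filter((line) => line !== "" && line.charAt(0) !== "#");
  return REQUIRED_IGNORES.filter((pattern) => present.indexOf(pattern) === -1);
}
```

An empty result means every required pattern is listed; anything returned names the entries still to be added.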

### Requirement: Database persistence
The deployment SHALL ensure PostgreSQL data persists across container restarts.

#### Scenario: PostgreSQL volume
- **WHEN** docker-compose up runs
- **THEN** the `postgres-data` named volume is mounted to `/var/lib/postgresql/data` in the postgres container

#### Scenario: Container restart preserves data
- **WHEN** the postgres container is stopped and restarted
- **THEN** all database tables and data remain intact

#### Scenario: PostgreSQL database name
- **WHEN** the postgres service starts
- **THEN** the `POSTGRES_DB` environment variable is set to `candle_annotator`

### Requirement: Deployment documentation
DEPLOYMENT.md SHALL include comprehensive Docker deployment instructions.

#### Scenario: Docker deployment section
- **WHEN** DEPLOYMENT.md is read
- **THEN** includes dedicated "Docker Deployment" section with prerequisites, build steps, and run commands

#### Scenario: Quick start commands
- **WHEN** following deployment docs
- **THEN** provides complete commands: `docker-compose up -d` for production and `docker-compose up --build` for rebuilding

#### Scenario: Environment setup instructions
- **WHEN** following deployment docs
- **THEN** explains how to copy .env.example to .env and configure required variables

#### Scenario: Volume backup instructions
- **WHEN** following deployment docs
- **THEN** provides a command to back up the PostgreSQL database, e.g. `docker-compose exec -T postgres sh -c 'pg_dump -U "$POSTGRES_USER" "$POSTGRES_DB"' > backup.sql`

#### Scenario: Troubleshooting section
- **WHEN** deployment issues occur
- **THEN** DEPLOYMENT.md includes troubleshooting for common Docker issues: port conflicts, permission errors, build failures

#### Scenario: Update and maintenance
- **WHEN** updating deployed application
- **THEN** documentation provides steps: pull new code, rebuild image, restart containers with data preservation

### Requirement: Container security
The Docker setup SHALL follow security best practices.

#### Scenario: Non-root user
- **WHEN** container runs
- **THEN** application process runs as non-root user 'appuser' (UID 1000)

#### Scenario: Read-only filesystem where possible
- **WHEN** container runs
- **THEN** only /app/ml-data directory requires write permissions, all other files are read-only to appuser

#### Scenario: No sensitive data in image
- **WHEN** Docker image is built
- **THEN** .env files, secrets, and database files are not included in image layers

#### Scenario: Minimal attack surface
- **WHEN** container runs
- **THEN** only port 3000 is exposed, no SSH daemon or other unnecessary services run, and the alpine base image keeps the installed package set (and thus the vulnerability surface) small

#### Scenario: No node_modules in production image
- **WHEN** the Next.js production Docker image is built
- **THEN** the `COPY --from=builder /app/node_modules` line is removed (standalone output bundles needed deps)