- Archived change to openspec/changes/archive/2026-02-17-ml-db-consolidation/
- Created new postgres-data-layer spec with PostgreSQL connection, schema definitions, Drizzle migrations, npm deps, and SQLite migration requirements
- Updated docker-deployment spec: Docker Compose now PostgreSQL-based (postgres dependency, ml-data volume, DATABASE_URL); env vars updated (DATABASE_URL added, DATABASE_PATH removed); database persistence updated to PostgreSQL volumes; health check updated to PostgreSQL
- Updated ml-training spec: added database name scenario (candle_annotator) and new direct annotation data access requirement

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
ADDED Requirements
Requirement: Multi-stage Dockerfile
The project SHALL include a Dockerfile with multi-stage build for optimized production images.
Scenario: Build stage setup
- WHEN Dockerfile build stage executes
- THEN uses Node.js 18-alpine base image, copies package files, installs ALL dependencies including devDependencies, copies source code, and runs `npm run build`
Scenario: Runtime stage setup
- WHEN Dockerfile runtime stage executes
- THEN uses Node.js 18-alpine base image, creates non-root user 'appuser', copies only production dependencies and built files from build stage, and sets USER to appuser
Scenario: Working directory structure
- WHEN container runs
- THEN application files are in /app directory, database volume mounts to /app/data, and permissions allow appuser to write to /app/data
Scenario: Environment variables in Dockerfile
- WHEN Dockerfile defines environment
- THEN sets NODE_ENV=production, PORT=3000, and HOSTNAME=0.0.0.0 for Next.js standalone server
Scenario: Exposed ports
- WHEN container is built
- THEN Dockerfile exposes port 3000 for HTTP traffic
Scenario: Container startup
- WHEN container starts
- THEN executes `node server.js` (Next.js standalone output) as the CMD
Requirement: Docker Compose configuration
The project SHALL include docker-compose.yml for simplified deployment orchestration.
Scenario: Service definition
- WHEN docker-compose.yml is parsed
- THEN defines service named 'candle-annotator' using Dockerfile from current directory
Scenario: Port mapping
- WHEN docker-compose up runs
- THEN maps host port 3000 to container port 3000
Scenario: Volume mounting for ML data
- WHEN docker-compose up runs
- THEN mounts named volume 'ml-data' to /app/ml-data in the candle-annotator container
Scenario: Frontend depends on PostgreSQL
- WHEN docker-compose up runs
- THEN the candle-annotator service starts only after the postgres service is healthy (`depends_on: postgres: condition: service_healthy`)
Scenario: Frontend DATABASE_URL
- WHEN the candle-annotator service starts
- THEN the `DATABASE_URL` environment variable is set to `postgresql://ml_user:ml_password@postgres:5432/candle_annotator`
Scenario: Restart policy
- WHEN container crashes or stops
- THEN docker-compose automatically restarts container unless explicitly stopped (restart: unless-stopped)
Scenario: No SQLite volume
- WHEN docker-compose.yml is parsed
- THEN there is no `candle-data` volume defined or mounted
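The Docker Compose scenarios above could be checked mechanically. The following is a minimal sketch, assuming docker-compose.yml has already been parsed into a plain object by some YAML library. The `ComposeFile` shape, the `checkCompose` helper, and the `sample` object are hypothetical names used for illustration only and are not part of the project.

```typescript
// Hypothetical shape covering only the compose keys the scenarios mention.
interface ComposeService {
  build?: string;
  ports?: string[];
  volumes?: string[];
  restart?: string;
  environment?: Record<string, string>;
  depends_on?: Record<string, { condition: string }>;
}

interface ComposeFile {
  services: Record<string, ComposeService>;
  volumes?: Record<string, unknown>;
}

// Returns one message per violated scenario; an empty array means all checks pass.
function checkCompose(compose: ComposeFile): string[] {
  const app = compose.services["candle-annotator"];
  if (!app) {
    return ["service 'candle-annotator' is not defined"];
  }
  const problems: string[] = [];
  if (app.build !== ".") {
    problems.push("service must build from the current directory");
  }
  if ((app.ports || []).indexOf("3000:3000") === -1) {
    problems.push("host port 3000 must map to container port 3000");
  }
  if ((app.volumes || []).indexOf("ml-data:/app/ml-data") === -1) {
    problems.push("ml-data must be mounted at /app/ml-data");
  }
  const dep = app.depends_on ? app.depends_on["postgres"] : undefined;
  if (!dep || dep.condition !== "service_healthy") {
    problems.push("service must wait for postgres to be healthy");
  }
  if (!app.environment || !app.environment["DATABASE_URL"]) {
    problems.push("DATABASE_URL must be set");
  }
  if (app.restart !== "unless-stopped") {
    problems.push("restart policy must be unless-stopped");
  }
  // "No SQLite volume": candle-data may be neither declared nor mounted.
  const declared = Object.keys(compose.volumes || {});
  const mounted = (app.volumes || []).map((v) => v.split(":")[0]);
  if (declared.concat(mounted).indexOf("candle-data") !== -1) {
    problems.push("candle-data volume must not be defined or mounted");
  }
  return problems;
}

// Object form of a compose file that satisfies the scenarios in this spec.
const sample: ComposeFile = {
  services: {
    "candle-annotator": {
      build: ".",
      ports: ["3000:3000"],
      volumes: ["ml-data:/app/ml-data"],
      restart: "unless-stopped",
      environment: {
        DATABASE_URL:
          "postgresql://ml_user:ml_password@postgres:5432/candle_annotator",
      },
      depends_on: { postgres: { condition: "service_healthy" } },
    },
    postgres: {
      volumes: ["postgres-data:/var/lib/postgresql/data"],
      environment: { POSTGRES_DB: "candle_annotator" },
    },
  },
  volumes: { "ml-data": {}, "postgres-data": {} },
};
```

Calling `checkCompose(sample)` yields an empty list. Returning messages rather than throwing lets a test report every violated scenario at once.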
Requirement: Environment variable configuration
The project SHALL use environment variables for runtime configuration.
Scenario: .env.example file
- WHEN repository is cloned
- THEN includes .env.example file documenting all configurable environment variables with example values
Scenario: DATABASE_URL configuration
- WHEN `DATABASE_URL` environment variable is set
- THEN the Next.js application connects to the PostgreSQL database at the specified URL
Scenario: No DATABASE_PATH variable
- WHEN environment variables are inspected
- THEN there is no `DATABASE_PATH` variable (SQLite path is removed)
Scenario: PORT configuration
- WHEN PORT environment variable is set
- THEN Next.js server listens on specified port (default: 3000)
Scenario: NODE_ENV configuration
- WHEN NODE_ENV environment variable is set to 'production'
- THEN Next.js runs in production mode with optimizations enabled
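As an illustration of the configuration scenarios above, here is a minimal sketch of a helper that reads these variables. The `loadConfig` function and `AppConfig` type are hypothetical, and the "development" fallback for NODE_ENV is an assumption, since the spec only describes the 'production' case.

```typescript
interface AppConfig {
  databaseUrl: string;
  port: number;
  nodeEnv: string;
}

// `env` is passed in explicitly (instead of reading process.env inside)
// so the function can be exercised without touching the real environment.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const databaseUrl = env["DATABASE_URL"];
  if (!databaseUrl) {
    throw new Error("DATABASE_URL is required");
  }

  // PORT falls back to 3000 when unset, matching the PORT scenario.
  const port = env["PORT"] ? Number(env["PORT"]) : 3000;
  const isValidPort = port >= 1 && port <= 65535 && Math.floor(port) === port;
  if (!isValidPort) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }

  // Assumption: default to "development" when NODE_ENV is unset.
  const nodeEnv = env["NODE_ENV"] || "development";

  return { databaseUrl, port, nodeEnv };
}
```

In the application this would be called as `loadConfig(process.env)`. Failing fast on a missing `DATABASE_URL` surfaces a misconfigured container at startup rather than on the first query.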
Requirement: Health check endpoint
The API SHALL provide a health check endpoint for container orchestration.
Scenario: Health check endpoint responds
- WHEN GET request sent to `/api/health`
- THEN system returns 200 status with JSON `{ status: 'ok', timestamp: <unix_timestamp> }`
Scenario: Database connection check
- WHEN GET request sent to `/api/health?check=db`
- THEN system attempts a PostgreSQL query and returns 200 if successful, 503 if database unavailable
Scenario: Health check in Dockerfile
- WHEN Dockerfile defines HEALTHCHECK
- THEN runs `curl -f http://localhost:3000/api/health || exit 1` every 30 seconds with 3 retries
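The endpoint behaviour described above can be sketched as a framework-independent function. This is only an illustration under stated assumptions: `healthCheck` and `pingDb` are hypothetical names, the timestamp is taken to be in seconds, and the `status: "error"` body for the 503 case is a guess, since the spec fixes only the status code.

```typescript
interface HealthResult {
  status: number; // HTTP status code to send
  body: { status: string; timestamp: number };
}

// `pingDb` is a hypothetical callback that runs a trivial PostgreSQL query
// (for example SELECT 1) and rejects when the database is unreachable.
async function healthCheck(
  check: string | null,
  pingDb: () => Promise<void>
): Promise<HealthResult> {
  // Unix timestamp in seconds (assumption: the spec does not say s vs. ms).
  const timestamp = Math.floor(Date.now() / 1000);

  // Only /api/health?check=db touches the database.
  if (check === "db") {
    try {
      await pingDb();
    } catch (err) {
      // Assumption: the 503 body mirrors the 200 body with status "error".
      return { status: 503, body: { status: "error", timestamp } };
    }
  }

  return { status: 200, body: { status: "ok", timestamp } };
}
```

A route handler would pass the `check` query parameter and a real database ping, then turn the result into a JSON response with the given status code. Keeping the plain `/api/health` path free of database access means the Dockerfile HEALTHCHECK reports on the web process alone.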
Requirement: .dockerignore file
The project SHALL include .dockerignore to exclude unnecessary files from Docker context.
Scenario: Excluded files
- WHEN Docker build context is created
- THEN .dockerignore excludes node_modules, .next, .git, data/, `*.md`, .env, and test files
Scenario: Included files
- WHEN Docker build context is created
- THEN includes package.json, package-lock.json, source code in src/, and required config files
Requirement: Next.js standalone output
The build SHALL use Next.js standalone output mode for minimal production bundle.
Scenario: next.config.js standalone setting
- WHEN next.config.js is read
- THEN output property is set to 'standalone'
Scenario: Standalone build output
- WHEN npm run build executes
- THEN Next.js creates .next/standalone directory with minimal runtime files and dependencies
Scenario: Copy standalone files to image
- WHEN Dockerfile runtime stage executes
- THEN copies .next/standalone/ contents to /app, copies .next/static to /app/.next/static, and copies public/ to /app/public
Requirement: Production build optimization
The Docker image SHALL be optimized for production use with minimal size.
Scenario: Use alpine base images
- WHEN Dockerfile specifies base images
- THEN uses node:18-alpine for both build and runtime stages
Scenario: Multi-stage build cleanup
- WHEN Docker image is built
- THEN build artifacts, devDependencies, and source files are not included in final image
Scenario: Layer caching optimization
- WHEN Dockerfile is structured
- THEN package.json and package-lock.json are copied and dependencies installed before source code copy for better layer caching
Scenario: Final image size
- WHEN Docker image build completes
- THEN final image size is under 200MB (excluding data volume)
Requirement: Database persistence
The deployment SHALL ensure PostgreSQL data persists across container restarts.
Scenario: PostgreSQL volume
- WHEN docker-compose up runs
- THEN the `postgres-data` named volume is mounted to `/var/lib/postgresql/data` in the postgres container
Scenario: Container restart preserves data
- WHEN the postgres container is stopped and restarted
- THEN all database tables and data remain intact
Scenario: PostgreSQL database name
- WHEN the postgres service starts
- THEN the `POSTGRES_DB` environment variable is set to `candle_annotator`
Requirement: Deployment documentation
DEPLOYMENT.md SHALL include comprehensive Docker deployment instructions.
Scenario: Docker deployment section
- WHEN DEPLOYMENT.md is read
- THEN includes dedicated "Docker Deployment" section with prerequisites, build steps, and run commands
Scenario: Quick start commands
- WHEN following deployment docs
- THEN provides complete commands: `docker-compose up -d` for production and `docker-compose up --build` for rebuilding
Scenario: Environment setup instructions
- WHEN following deployment docs
- THEN explains how to copy .env.example to .env and configure required variables
Scenario: Volume backup instructions
- WHEN following deployment docs
- THEN provides commands to back up the PostgreSQL database, for example `docker-compose exec -T postgres pg_dump -U ml_user candle_annotator > backup.sql`
Scenario: Troubleshooting section
- WHEN deployment issues occur
- THEN DEPLOYMENT.md includes troubleshooting for common Docker issues: port conflicts, permission errors, build failures
Scenario: Update and maintenance
- WHEN updating deployed application
- THEN documentation provides steps: pull new code, rebuild image, restart containers with data preservation
Requirement: Container security
The Docker setup SHALL follow security best practices.
Scenario: Non-root user
- WHEN container runs
- THEN application process runs as non-root user 'appuser' (UID 1000)
Scenario: Read-only filesystem where possible
- WHEN container runs
- THEN only /app/data directory requires write permissions, all other files are read-only to appuser
Scenario: No sensitive data in image
- WHEN Docker image is built
- THEN .env files, secrets, and database files are not included in image layers
Scenario: Minimal attack surface
- WHEN container runs
- THEN only port 3000 is exposed, no SSH, no unnecessary services, alpine base reduces package vulnerabilities