candle-annotator/openspec/changes/archive/2026-02-17-ml-db-consolidation/proposal.md
Marko Djordjevic 38df874255 chore: archive ml-db-consolidation change and sync specs
- Archived change to openspec/changes/archive/2026-02-17-ml-db-consolidation/
- Created new postgres-data-layer spec with PostgreSQL connection, schema definitions, Drizzle migrations, npm deps, and SQLite migration requirements
- Updated docker-deployment spec: Docker Compose now PostgreSQL-based (postgres dependency, ml-data volume, DATABASE_URL); env vars updated (DATABASE_URL added, DATABASE_PATH removed); database persistence updated to PostgreSQL volumes; health check updated to PostgreSQL
- Updated ml-training spec: added database name scenario (candle_annotator) and new direct annotation data access requirement

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-17 18:22:28 +01:00


## Why
The project currently runs two separate databases: SQLite (via Drizzle ORM) for the Next.js frontend and PostgreSQL for the ML service. This creates unnecessary operational complexity: two different ORMs, two migration systems, two backup strategies, and no way for the ML service to query annotation or candle data directly. Consolidating on PostgreSQL as the single database simplifies deployment, enables direct cross-service data access, and reduces the infrastructure footprint.
## What Changes
- **BREAKING**: Replace SQLite/better-sqlite3/Drizzle with PostgreSQL/Drizzle (`postgres` or `pg` driver) for the Next.js frontend
- Remove the `candle-data` Docker volume (SQLite file storage) and `DATABASE_PATH` env var
- Migrate all frontend tables (charts, candles, annotations, annotation_types, span_annotations, span_label_types) into the existing PostgreSQL instance
- Update Drizzle schema and config to target PostgreSQL instead of SQLite
- Regenerate Drizzle migrations for PostgreSQL dialect (column types change: `integer` → `serial`, `real` → `double precision`, timestamps as proper `timestamp` types, etc.)
- Update the ML service to share the same PostgreSQL database (or a separate schema within it) so it can directly query candle/annotation data instead of relying on CSV/JSON exports
- Update docker-compose.yml to remove SQLite volume dependency and point the frontend at PostgreSQL
- Update environment variables: frontend gets `DATABASE_URL` pointing to PostgreSQL
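
The column-type changes listed above can be sketched as a small lookup. This is an illustration only, not the project's migration tooling: the `SqliteColumn` shape, the `isTimestamp` flag, the `toPostgresType` helper, and the example column names are all hypothetical, and it assumes timestamps are currently stored in SQLite integer columns.

```typescript
// Hypothetical description of a column as declared in the SQLite schema.
interface SqliteColumn {
  name: string;
  type: "integer" | "real" | "text";
  primaryKey?: boolean; // auto-incrementing integer primary key
  isTimestamp?: boolean; // assumption: integer column holding a point in time
}

// Pick the PostgreSQL column type following the mapping in this proposal.
function toPostgresType(col: SqliteColumn): string {
  if (col.type === "integer") {
    if (col.primaryKey) return "serial";
    if (col.isTimestamp) return "timestamp";
    return "integer";
  }
  if (col.type === "real") return "double precision";
  return "text";
}

// Example columns (names are made up for illustration).
const exampleColumns: SqliteColumn[] = [
  { name: "id", type: "integer", primaryKey: true },
  { name: "close", type: "real" },
  { name: "created_at", type: "integer", isTimestamp: true },
];

const mapped = exampleColumns.map((c) => `${c.name} ${toPostgresType(c)}`);
console.log(mapped.join(", "));
// prints "id serial, close double precision, created_at timestamp"
```

In practice the mapping would be expressed by rewriting the Drizzle schema file and regenerating migrations rather than by a runtime helper; the function only makes the rules explicit.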
## Capabilities
### New Capabilities
- `postgres-data-layer`: Unified PostgreSQL data access layer for the Next.js frontend, replacing the SQLite/better-sqlite3 setup with Drizzle's PostgreSQL driver
### Modified Capabilities
- `docker-deployment`: Container configuration changes — remove SQLite volume, add PostgreSQL dependency for the frontend service, update environment variables
- `ml-training`: ML service can now query annotations and candle data directly from PostgreSQL instead of requiring CSV/JSON file exports
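
Because the frontend now depends on `DATABASE_URL` rather than `DATABASE_PATH`, a startup check along the following lines could catch stale configuration early. This is a minimal sketch under stated assumptions: `readDatabaseUrl`, `Env`, and `DbTarget` are hypothetical names, the credentials in the usage example are placeholders, and the environment is passed in as a plain object (in Node it would be `process.env`).

```typescript
// Minimal shape of the environment; in Node this would be process.env.
type Env = Record<string, string | undefined>;

// Hypothetical summary of where the connection string points.
interface DbTarget {
  host: string;
  port: number;
  database: string;
}

function readDatabaseUrl(env: Env): DbTarget {
  const raw = env["DATABASE_URL"];
  if (!raw) {
    throw new Error("DATABASE_URL is required now that the frontend uses PostgreSQL");
  }
  if (env["DATABASE_PATH"]) {
    // The SQLite-era variable is removed by this change; flag stale configs.
    console.warn("DATABASE_PATH is set but no longer used");
  }
  const url = new URL(raw);
  if (url.protocol !== "postgres:" && url.protocol !== "postgresql:") {
    throw new Error(`Expected a postgres:// URL, got ${url.protocol}`);
  }
  return {
    host: url.hostname,
    port: url.port ? Number(url.port) : 5432, // 5432 is PostgreSQL's default port
    database: url.pathname.replace(/^\//, ""),
  };
}

// Usage with placeholder credentials and the compose service name as host.
const target = readDatabaseUrl({
  DATABASE_URL: "postgres://app:secret@postgres:5432/candle_annotator",
});
console.log(target.host, target.port, target.database);
```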
## Impact
- **Database schema**: All 6 frontend tables move to PostgreSQL with type adaptations (SQLite integers → PostgreSQL serial/integer/timestamp)
- **ORM layer**: `src/lib/db/index.ts` switches from `better-sqlite3` to `postgres` driver; schema types in `src/lib/db/schema.ts` change to PostgreSQL equivalents
- **Dependencies**: Remove `better-sqlite3`, add `postgres` (or `pg`) npm package for Drizzle's PostgreSQL adapter
- **Migrations**: Existing SQLite migrations become obsolete; new PostgreSQL migrations needed
- **Docker**: `candle-annotator` service gains `depends_on: postgres`, loses `candle-data` volume mount
- **Environment**: `.env` and `.env.example` updated with PostgreSQL connection string for frontend
- **ML service**: `services/ml/app/db.py` gains access to frontend tables (candles, annotations) for direct querying
- **Data migration**: Existing SQLite data needs a one-time migration script to PostgreSQL
- **API routes**: All Next.js API routes using `db` from `src/lib/db` continue working (Drizzle abstracts the driver change), but queries using SQLite-specific syntax, or better-sqlite3's synchronous query methods (`.get()`, `.all()`, `.run()`), may need adjustment
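
For the one-time data migration, one possible building block is a helper that turns rows read from SQLite into a single parameterized multi-row `INSERT` for PostgreSQL. The sketch below is not the project's migration script: `buildInsert`, `quoteIdent`, and the example column names are hypothetical, and it assumes rows arrive as plain objects with identical keys. Reading from SQLite and executing against PostgreSQL are left out.

```typescript
type Value = string | number | boolean | null;
type Row = Record<string, Value>;

interface InsertStatement {
  text: string; // SQL with $1, $2, ... placeholders
  values: Value[]; // bind parameters in placeholder order
}

// Quote an identifier for PostgreSQL, doubling any embedded double quotes.
function quoteIdent(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// Build one multi-row INSERT. Column order is taken from the first row;
// every later row must provide the same columns.
function buildInsert(table: string, rows: Row[]): InsertStatement {
  const first = rows[0];
  if (first === undefined) throw new Error("no rows to insert");
  const columns = Object.keys(first);
  const values: Value[] = [];
  const tuples = rows.map((row) => {
    const placeholders = columns.map((col) => {
      const value = row[col];
      if (value === undefined) throw new Error(`row is missing column ${col}`);
      values.push(value);
      return `$${values.length}`;
    });
    return `(${placeholders.join(", ")})`;
  });
  const text =
    `INSERT INTO ${quoteIdent(table)} (${columns.map(quoteIdent).join(", ")}) ` +
    `VALUES ${tuples.join(", ")}`;
  return { text, values };
}

// Example with made-up candle rows.
const statement = buildInsert("candles", [
  { id: 1, close: 101.5 },
  { id: 2, close: 99.25 },
]);
console.log(statement.text);
// prints: INSERT INTO "candles" ("id", "close") VALUES ($1, $2), ($3, $4)
```

The resulting `text` and `values` could be passed to a driver that accepts positional parameters. PostgreSQL limits a statement to 65535 bind parameters, so large tables would need to be sent in chunks. Because migrated rows carry their original ids, the sequences behind `serial` columns would also need resetting afterwards (for example with `setval`).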