candle-annotator/openspec/changes/ml-db-consolidation/design.md

## Context

The candle annotator runs two databases:

1. **SQLite** (`data/candles.db`) — serves the Next.js frontend via Drizzle ORM (`better-sqlite3` driver). Contains 6 tables: charts, candles, annotations, annotation_types, span_annotations, span_label_types.
2. **PostgreSQL** (`postgres:5432/ml_db`) — serves the Python ML service via SQLAlchemy. Contains 1 table: training_runs.

The ML service cannot directly query annotation/candle data. Data flows through CSV/JSON file exports. PostgreSQL already runs in Docker for the ML service, so consolidating means adding frontend tables there — not introducing a new service.

## Goals / Non-Goals

**Goals:**
- Single PostgreSQL instance for all application data
- Drizzle ORM continues to manage frontend schema (just switches dialect)
- ML service gains direct read access to candle/annotation tables
- Simplified Docker setup (one fewer volume, one database to back up)
- One-time data migration path from SQLite to PostgreSQL

**Non-Goals:**
- Changing the ML service ORM (SQLAlchemy stays)
- Merging Drizzle and SQLAlchemy migration systems (each manages its own tables)
- Changing API route logic or query patterns beyond what's needed for the dialect switch
- Multi-tenancy or schema separation (all tables go in the `public` schema)
- Migrating away from Drizzle ORM

## Decisions

### 1. Drizzle PostgreSQL driver: `drizzle-orm/node-postgres` with `pg`

**Choice**: Use `pg` (node-postgres) as the driver.

**Why**: `pg` is the most mature PostgreSQL driver for Node.js. Drizzle supports it natively via `drizzle-orm/node-postgres`. The `postgres` (postgres.js) driver is also an option, but `pg` has broader ecosystem support and is easier to debug.

**Alternative considered**: `postgres` (postgres.js) — lighter, promise-native, but less battle-tested with Drizzle migrations.

### 2. Shared database, single `public` schema

**Choice**: All tables (frontend + ML) live in the same database (currently `ml_db`, renamed in Decision 3) and the default `public` schema.

**Why**: The table sets don't overlap (frontend has charts/candles/annotations, ML has training_runs). Separate schemas add complexity with no benefit for 7 total tables. The ML service already connects to `ml_db`.

**Alternative considered**: Separate PostgreSQL schemas (`app` and `ml`) — cleaner isolation but adds schema-prefix complexity to queries and cross-schema references. Not worth it at this scale.

### 3. Rename database from `ml_db` to `candle_annotator`

**Choice**: Rename the PostgreSQL database to `candle_annotator` since it now serves the whole application, not just ML.

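**Why**: `ml_db` is misleading when the database holds frontend data too. Renaming during consolidation is the natural time to do it. If the existing PostgreSQL data volume is kept, the rename needs an explicit `ALTER DATABASE ml_db RENAME TO candle_annotator`, because the official Postgres Docker image applies `POSTGRES_DB` only when it first initializes an empty data directory.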

**Alternative considered**: Keep `ml_db` — avoids a rename step but creates lasting confusion.

### 4. Fresh Drizzle migrations (drop SQLite migrations)

**Choice**: Delete all existing SQLite migrations in `drizzle/`, rewrite the schema file with `pgTable` equivalents, and run `drizzle-kit generate` to produce a fresh initial PostgreSQL migration.

**Why**: SQLite migrations are dialect-specific (e.g., `integer` for booleans, no native timestamps). Converting them one-by-one is fragile. A clean start from the PostgreSQL schema is simpler and produces idiomatic SQL.

**Alternative considered**: Manually converting each SQLite migration to PostgreSQL — error-prone and provides no benefit since there's no production data that needs incremental migration history.
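
A minimal sketch of what the Drizzle Kit configuration might look like after the dialect switch. The schema path `./db/schema.ts` is an assumption (this document names only `db/index.ts` and the `drizzle/` folder), and the sketch assumes a drizzle-kit version that supports `defineConfig` with a `dialect` field.

```typescript
// drizzle.config.ts (sketch; file paths are assumptions, not the confirmed project layout)
import { defineConfig } from "drizzle-kit";

export default defineConfig({
  dialect: "postgresql",            // previously "sqlite"
  schema: "./db/schema.ts",         // hypothetical location of the pgTable schema
  out: "./drizzle",                 // fresh PostgreSQL migrations are generated here
  dbCredentials: {
    url: process.env.DATABASE_URL!, // same env var the frontend reads at runtime
  },
});
```

With a config along these lines, `drizzle-kit generate` would write the new initial migration into `drizzle/` once the old SQLite migration files have been removed.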

### 5. Type mappings: SQLite → PostgreSQL

| SQLite type | PostgreSQL type | Notes |
|---|---|---|
| `integer` (PK, autoIncrement) | `serial` | Auto-incrementing integer |
| `integer` (timestamps) | `timestamp` | Use `defaultNow()` where applicable |
| `integer` (booleans like `is_active`) | `boolean` | True PostgreSQL booleans |
| `real` | `doublePrecision` | OHLC price data |
| `text` | `text` | No change |
| `text` (JSON strings) | `jsonb` | For `geometry`, `sub_spans`, `model_prediction` |
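
To illustrate the mappings in the table, here is a sketch of a `pgTable` definition that uses one column of each kind. The table `mapping_example` and the columns `close`, `label`, and `created_at` are hypothetical; the real tables (charts, candles, annotations, and so on) define their own columns.

```typescript
import {
  pgTable,
  serial,
  text,
  boolean,
  timestamp,
  doublePrecision,
  jsonb,
} from "drizzle-orm/pg-core";

// Hypothetical table used only to show each type mapping once.
export const mappingExample = pgTable("mapping_example", {
  // integer PK with autoIncrement -> serial
  id: serial("id").primaryKey(),
  // integer timestamp -> timestamp, defaulting to the insert time
  createdAt: timestamp("created_at").defaultNow().notNull(),
  // integer 0/1 flag -> boolean
  isActive: boolean("is_active").notNull().default(true),
  // real -> double precision (OHLC prices)
  close: doublePrecision("close").notNull(),
  // text stays text
  label: text("label").notNull(),
  // JSON stored as text -> jsonb
  geometry: jsonb("geometry"),
});
```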

### 6. Connection management for Next.js

**Choice**: Use a connection pool via `pg.Pool` with `max: 10` connections. Connection string from `DATABASE_URL` env var.

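**Why**: SQLite ran in-process against a single file, so no pooling was needed. PostgreSQL connections are network connections that are comparatively expensive to open, so concurrent API requests should share a pool rather than connect per request. A limit of 10 connections is a reasonable starting point for the frontend workload.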
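
A minimal sketch of how `db/index.ts` could wire the pool into Drizzle. The `createDb` helper is a hypothetical name introduced for illustration; the actual module may export the `db` instance directly.

```typescript
import { Pool } from "pg";
import { drizzle } from "drizzle-orm/node-postgres";

// Hypothetical factory: builds a pooled Drizzle client from a connection string.
// The pool opens connections lazily, so nothing connects until the first query.
export function createDb(connectionString: string) {
  const pool = new Pool({
    connectionString,
    max: 10, // upper bound on concurrent connections from the frontend
  });
  const db = drizzle(pool);
  return { pool, db };
}
```

In the application this would be called once with `process.env.DATABASE_URL`. One thing to watch in Next.js development mode is that module reloads can construct the pool more than once, so the instance may need to be cached across reloads.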

### 7. ML service direct access to frontend tables

**Choice**: The ML service reads frontend tables (candles, annotations, span_annotations) directly via SQLAlchemy using its existing connection. No new SQLAlchemy models needed — raw SQL queries or lightweight table reflections are sufficient for read-only access.

**Why**: The ML service only needs to read training data. Adding full SQLAlchemy models for tables owned by Drizzle creates a dual-ownership problem. Raw queries or `Table` reflections keep it simple.

## Risks / Trade-offs

**[Schema drift between Drizzle and SQLAlchemy]** → Both ORMs manage tables in the same database. Drizzle owns frontend tables, SQLAlchemy owns ML tables. Neither should modify the other's tables. This is enforced by convention, not tooling.

**[Connection pool exhaustion]** → Adding the frontend's database traffic to the same PostgreSQL instance increases load. Mitigation: PostgreSQL 16 is built to serve many concurrent client connections. The `pg.Pool` max of 10 plus SQLAlchemy's pool of 5 totals 15 connections, well within PostgreSQL's default `max_connections` of 100.
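
The connection budget can be worked through as a short calculation. The pool sizes and the limit of 100 come from the text above; the overflow of 10 for the ML service is an assumption based on SQLAlchemy's default `max_overflow`, and it should be adjusted if the service configures its engine differently.

```typescript
// Connection budget check (sketch).
interface PoolBudget {
  name: string;
  size: number;     // connections kept open in steady state
  overflow: number; // extra connections allowed under burst load
}

const POSTGRES_MAX_CONNECTIONS = 100; // PostgreSQL default max_connections

const pools: PoolBudget[] = [
  // pg.Pool treats `max` as a hard cap, so there is no overflow.
  { name: "frontend pg.Pool", size: 10, overflow: 0 },
  // Assumes SQLAlchemy's default max_overflow of 10 on top of pool_size 5.
  { name: "ML service SQLAlchemy pool", size: 5, overflow: 10 },
];

function steadyConnections(budgets: PoolBudget[]): number {
  return budgets.reduce((sum, b) => sum + b.size, 0);
}

function peakConnections(budgets: PoolBudget[]): number {
  return budgets.reduce((sum, b) => sum + b.size + b.overflow, 0);
}

const steady = steadyConnections(pools);
const peak = peakConnections(pools);
console.log(`steady=${steady}, peak=${peak}, limit=${POSTGRES_MAX_CONNECTIONS}`);
```

Under these assumptions the steady-state total is 15 and the burst peak is 25, both comfortably below the default limit.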

**[Data loss during migration]** → SQLite data must be migrated before switching. Mitigation: Write a migration script that exports the SQLite data and imports it into PostgreSQL, and run it before deploying the new code. Keep the SQLite file as a backup.

**[Drizzle push/generate differences]** → PostgreSQL dialect may generate slightly different migration SQL than expected. Mitigation: Review generated migrations before applying. Use `drizzle-kit push` for development, `drizzle-kit generate` + `drizzle-kit migrate` for production.

**[Boolean conversion]** → SQLite uses `0/1` for booleans, PostgreSQL uses `true/false`. Mitigation: The migration script handles conversion. Drizzle's `boolean()` type handles this transparently at the ORM level going forward.
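
A sketch of the per-row value conversions the migration script would need, covering booleans as well as the timestamp and JSON mappings from Decision 5. The row shape is hypothetical, and the sketch assumes SQLite timestamps are stored as Unix epoch seconds; if they are stored as milliseconds, the multiplication must be dropped.

```typescript
// Hypothetical row shapes for illustration; real tables have their own columns.
interface SqliteRow {
  id: number;
  is_active: number;       // 0 or 1
  created_at: number;      // Unix epoch seconds (assumption)
  geometry: string | null; // JSON serialized as text
}

interface PostgresRow {
  id: number;
  is_active: boolean;
  created_at: Date;
  geometry: unknown;
}

// Reject anything other than 0/1 so bad data surfaces during migration.
function toBoolean(value: number): boolean {
  if (value !== 0 && value !== 1) {
    throw new Error(`Unexpected boolean value: ${value}`);
  }
  return value === 1;
}

// Convert epoch seconds to a Date, which the pg driver sends as a timestamp.
function toTimestamp(epochSeconds: number): Date {
  return new Date(epochSeconds * 1000);
}

// Parse JSON text so it can be stored as jsonb; NULL stays NULL.
function toJson(text: string | null): unknown {
  return text === null ? null : JSON.parse(text);
}

function convertRow(row: SqliteRow): PostgresRow {
  return {
    id: row.id,
    is_active: toBoolean(row.is_active),
    created_at: toTimestamp(row.created_at),
    geometry: toJson(row.geometry),
  };
}
```

Failing loudly on unexpected values is a deliberate choice here: the SQLite file is kept as a backup, so aborting the migration is safer than silently coercing bad data.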

## Migration Plan

1. **Update schema and dependencies** — Rewrite Drizzle schema for PostgreSQL, swap npm packages
2. **Generate fresh migrations** — `drizzle-kit generate` from the new PostgreSQL schema
3. **Update docker-compose.yml** — Rename database, add frontend dependency on postgres, remove `candle-data` volume
4. **Update environment variables** — `DATABASE_URL` for the frontend service
5. **Write data migration script** — `scripts/migrate-sqlite-to-postgres.ts` that reads SQLite and inserts into PostgreSQL with type conversions
6. **Update db/index.ts** — Switch from `better-sqlite3` to `pg` pool, update migration runner
7. **Test locally** — Run migrations, migrate data, verify API routes work
8. **Deploy** — Stop current services, run PostgreSQL migrations, run data migration, deploy new code
9. **Rollback** — If issues arise, revert docker-compose and code, restore SQLite volume. The SQLite file is kept as backup for 1 week post-migration.

## Open Questions

- Should the ML service user (`ml_user`) have write access to frontend tables, or should we create a separate read-only role? (Recommendation: keep `ml_user` with full access for simplicity, revisit if the team grows.)
- Do we need to preserve SQLite migration history in git for reference, or delete the `drizzle/` folder contents entirely? (Recommendation: delete and start fresh.)