candle-annotator/openspec/changes/ml-db-consolidation/tasks.md
Marko Djordjevic bfe437857b feat: add Python migration script and successfully test SQLite to PostgreSQL data migration
- Created scripts/migrate-sqlite-to-postgres.py as alternative to TypeScript version
- Handles all type conversions: timestamps, booleans, and JSONB fields
- Successfully migrated all 2,836 rows from SQLite to PostgreSQL
- Verified data integrity: all 6 tables migrated correctly
- Charts: 1, Candles: 2,592, Annotations: 4, Span annotations: 223
2026-02-17 14:01:21 +01:00

1. Dependencies and Configuration

  • 1.1 Remove better-sqlite3 and @types/better-sqlite3 from package.json
  • 1.2 Add pg and @types/pg to package.json dependencies
  • 1.3 Run npm install to update node_modules and lockfile
  • 1.4 Update drizzle.config.ts to target postgresql dialect with DATABASE_URL env var
  • 1.5 Update .env.example — replace DATABASE_PATH with DATABASE_URL=postgresql://ml_user:ml_password@postgres:5432/candle_annotator
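
Task 1.5 swaps a file path for a connection URL, so a malformed value would otherwise only surface when the app first tries to connect. A minimal sketch of an early sanity check, assuming Node's built-in WHATWG `URL` class is enough to parse the value; `parseDatabaseUrl` and `DatabaseTarget` are hypothetical names, not part of the codebase:

```typescript
// Hypothetical helper: splits a PostgreSQL connection URL into its parts
// so a misconfigured DATABASE_URL is caught before any connection attempt.
interface DatabaseTarget {
  user: string;
  host: string;
  port: number;
  database: string;
}

function parseDatabaseUrl(raw: string): DatabaseTarget {
  const url = new URL(raw); // throws a TypeError on an unparseable string
  if (url.protocol !== "postgresql:" && url.protocol !== "postgres:") {
    throw new Error(`Unsupported protocol in DATABASE_URL: ${url.protocol}`);
  }
  // The database name is the path without its leading slash.
  const database = url.pathname.replace(/^\//, "");
  if (!database) {
    throw new Error("DATABASE_URL is missing a database name");
  }
  return {
    user: decodeURIComponent(url.username),
    host: url.hostname,
    // Fall back to 5432, the PostgreSQL default port, when none is given.
    port: url.port ? Number(url.port) : 5432,
    database,
  };
}

// The value from task 1.5.
const target = parseDatabaseUrl(
  "postgresql://ml_user:ml_password@postgres:5432/candle_annotator",
);
```

The host `postgres` only resolves inside the compose network, so a check like this confirms the shape of the URL, not that the server is reachable.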

2. Drizzle Schema Migration (SQLite → PostgreSQL)

  • 2.1 Rewrite src/lib/db/schema.ts: replace every sqliteTable with pgTable and apply the type mappings (auto-increment integer IDs→serial, integer timestamps→timestamp, integer flags→boolean, real→doublePrecision, JSON stored as text→jsonb)
  • 2.2 Delete all existing SQLite migration files in drizzle/ directory
  • 2.3 Run drizzle-kit generate to produce fresh PostgreSQL migration SQL
  • 2.4 Review generated migration SQL for correctness

3. Database Connection Layer

  • 3.1 Rewrite src/lib/db/index.ts — replace better-sqlite3 driver with pg.Pool (max: 10), read DATABASE_URL from env, fail if missing
  • 3.2 Update migration runner to use PostgreSQL-compatible execution (skip during build phase via NEXT_PHASE check)
  • 3.3 Update any imports that changed, and verify the db export still works for API routes
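
The configuration logic in 3.1 and 3.2 can be sketched without touching the driver itself. This assumes the returned object is what gets passed to `new Pool(...)` from `pg` (whose config accepts `connectionString` and `max`), and that Next.js sets `NEXT_PHASE` to `phase-production-build` during `next build`; `buildPoolConfig` and `shouldRunMigrations` are hypothetical names:

```typescript
// Shape of process.env: every variable may be missing.
type Env = Record<string, string | undefined>;

// The subset of pg's pool options this sketch cares about.
interface PoolSettings {
  connectionString: string;
  max: number;
}

// Task 3.1: read DATABASE_URL and fail fast if it is missing.
function buildPoolConfig(env: Env): PoolSettings {
  const connectionString = env.DATABASE_URL;
  if (!connectionString) {
    throw new Error("DATABASE_URL is not set");
  }
  return { connectionString, max: 10 };
}

// Task 3.2: no database is available while `next build` runs,
// so migrations are skipped in that phase.
function shouldRunMigrations(env: Env): boolean {
  return env.NEXT_PHASE !== "phase-production-build";
}
```

In `src/lib/db/index.ts` this would presumably be called as `buildPoolConfig(process.env)`, with the migration runner guarded by `shouldRunMigrations(process.env)`.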

4. API Route Adjustments

  • 4.1 Audit all Next.js API routes using db for SQLite-specific syntax (e.g., integer booleans, raw SQL fragments)
  • 4.2 Fix any SQLite-specific query patterns to work with PostgreSQL (boolean handling, timestamp handling, jsonb operations)
  • 4.3 Update health check endpoint (/api/health) to verify PostgreSQL connectivity

5. Docker and Deployment

  • 5.1 Update docker-compose.yml: set POSTGRES_DB to candle_annotator, add the DATABASE_URL env var to the candle-annotator service, and add depends_on: postgres with a health check condition
  • 5.2 Remove candle-data volume from docker-compose.yml (SQLite volume)
  • 5.3 Update Dockerfile if it references SQLite or DATABASE_PATH
  • 5.4 Update ML service database connection — change database name from ml_db to candle_annotator in environment config

6. ML Service Direct Data Access

  • 6.1 Add SQLAlchemy table reflections or raw queries in the ML service for reading candles, annotations, span_annotations, charts tables
  • 6.2 Update ML training pipeline to query candle/annotation data from PostgreSQL instead of CSV/JSON exports
  • 6.3 Remove or deprecate any CSV/JSON export code paths that are no longer needed

7. Data Migration Script

  • 7.1 Create scripts/migrate-sqlite-to-postgres.ts — read all 6 tables from SQLite, apply type conversions (timestamps, booleans, JSON→jsonb), insert into PostgreSQL
  • 7.2 Make the script idempotent (skip existing rows, or clear and re-insert when a flag is passed)
  • 7.3 Test migration script with existing SQLite data
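
The conversions in 7.1 and the skip behaviour in 7.2 could look roughly like this. The sketch assumes SQLite stored timestamps as integer Unix seconds (drop the `* 1000` if they are already milliseconds), booleans as 0/1, and JSON as text. The helper names and the `id` conflict column are illustrative, not taken from the actual script:

```typescript
// Integer Unix seconds -> Date, suitable for a timestamp column.
function toTimestamp(unixSeconds: number | null): Date | null {
  return unixSeconds === null ? null : new Date(unixSeconds * 1000);
}

// SQLite 0/1 flag -> real boolean.
function toBoolean(flag: number | null): boolean | null {
  return flag === null ? null : flag !== 0;
}

// JSON stored as text -> parsed value. Parsing here also rejects rows whose
// text is not valid JSON before they reach PostgreSQL.
// Caveat: when passing values as raw `pg` query parameters, arrays should be
// re-serialized with JSON.stringify, since the driver treats JS arrays as
// PostgreSQL arrays rather than JSON.
function toJsonb(text: string | null): unknown {
  return text === null ? null : JSON.parse(text);
}

// Builds a parameterized INSERT that skips rows whose id already exists,
// so re-running the migration does not duplicate data.
function buildInsert(table: string, columns: string[]): string {
  const cols = columns.map((c) => `"${c}"`).join(", ");
  const params = columns.map((_, i) => `$${i + 1}`).join(", ");
  return `INSERT INTO "${table}" (${cols}) VALUES (${params}) ON CONFLICT ("id") DO NOTHING`;
}
```

`ON CONFLICT ... DO NOTHING` covers the "skip" mode; the "clear and re-insert" mode would instead empty each table before inserting, which is simpler but destructive if the target already holds newer data.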

8. Testing and Verification

  • 8.1 Run the full application locally with PostgreSQL — verify all API routes work
  • 8.2 Verify ML service can query candle/annotation data from shared database
  • 8.3 Run docker compose up and verify all services start correctly with new configuration
  • 8.4 Update DEPLOYMENT.md with new deployment steps (PostgreSQL migration, data migration script, rollback procedure)
  • 8.5 Update README.md and CLAUDE_DESCRIPTION.md with database architecture changes