fix: resolve numpy type conversion issues in ML service data access

- Convert numpy.int64 to Python int before passing to SQLAlchemy queries
- Prevents psycopg2.ProgrammingError: can't adapt type 'numpy.int64'
- Applied to get_candles(), get_span_annotations(), and get_point_annotations()
- All ML service database access tests now passing successfully
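
The conversion described in the commit message can be sketched as follows. This is a minimal illustration and not the code from the commit: `as_python_int` and `candle_query_params` are hypothetical names, and the parameter names `chart_id` and `limit` are assumptions, since the actual signatures of `get_candles()` and the annotation getters are not shown here. The sketch relies on `operator.index()`, which returns a built-in `int` for any object implementing `__index__` (NumPy integer scalars do) and rejects floats.

```python
import operator
from typing import Any, Dict


def as_python_int(value: Any) -> int:
    """Coerce an integer-like value (for example numpy.int64) to a built-in int.

    operator.index() calls the value's __index__ method and raises TypeError
    for floats and strings, so non-integers are not silently truncated.
    """
    return int(operator.index(value))


def candle_query_params(chart_id: Any, limit: Any) -> Dict[str, int]:
    """Hypothetical helper: build bind parameters made only of built-in ints."""
    return {"chart_id": as_python_int(chart_id), "limit": as_python_int(limit)}


if __name__ == "__main__":
    try:
        import numpy as np  # optional; only needed for this demonstration
    except ImportError:
        np = None
    if np is not None:
        # IDs read back from a pandas/NumPy structure are often numpy.int64.
        params = candle_query_params(np.int64(7), np.int64(500))
        print({name: type(val).__name__ for name, val in params.items()})
```

Converting at the boundary of the data access layer keeps the rest of the ML code free to use NumPy types, while the database driver only ever sees values it can adapt.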
parent 5377431c9d
commit d1557a3846
6 changed files with 437 additions and 119 deletions
@@ -4,9 +4,28 @@
 
 Candle Annotator is a complete machine learning platform for candlestick pattern recognition, combining manual annotation tools with an integrated Python ML pipeline. It's designed for traders and ML researchers to create labeled training data, train models, and deploy pattern recognition in an active learning loop.
 
-**Current Version**: 3.0.0 (ML Pipeline Integration - Feature engineering, training, and inference service)
+**Current Version**: 3.1.0 (Database Consolidation - PostgreSQL for all application data)
 
-## Recent Changes (v3.0.0)
+## Recent Changes (v3.1.0)
 
+### Database Consolidation to PostgreSQL
+- **Unified Database**: Migrated from SQLite to PostgreSQL for all application data (frontend + ML service)
+- **Shared Data Access**: ML service can now directly query candle and annotation tables from PostgreSQL
+- **No More CSV Exports**: ML service reads training data directly from database, eliminating export/import steps
+- **Schema Updates**: All tables converted to PostgreSQL types (serial, timestamp, double precision, jsonb)
+- **Type Safety**: Fixed numpy type conversion issues in ML service data access layer
+- **Migration Script**: One-time migration script (`scripts/migrate-sqlite-to-postgres.ts`) for upgrading from SQLite
+- **Rollback Support**: Documented rollback procedure if migration fails
+
+### Backend Infrastructure Updates
+- **Drizzle ORM**: Updated to PostgreSQL dialect with pg driver
+- **Database Connection**: Connection pooling (max: 10 connections) for Next.js API routes
+- **ML Service ORM**: SQLAlchemy table reflections for read-only access to frontend tables
+- **Environment Variables**: `DATABASE_URL` replaces `DATABASE_PATH`
+- **Docker Compose**: Updated with shared PostgreSQL instance, removed `candle-data` volume
+- **Health Checks**: Enhanced health endpoints verify PostgreSQL connectivity
+
+## Previous Changes (v3.0.0)
+
 ### ML Pipeline Integration
 - **Python ML Service**: FastAPI-based inference service for real-time pattern prediction
@@ -72,8 +91,8 @@ Candle Annotator is a complete machine learning platform for candlestick pattern
 ### Backend (Next.js)
 - **Runtime**: Node.js 18.x
 - **API**: Next.js API Routes
-- **Database**: SQLite with better-sqlite3
-- **ORM**: Drizzle ORM
+- **Database**: PostgreSQL 16 with pg driver
+- **ORM**: Drizzle ORM (PostgreSQL dialect)
 - **CSV**: papaparse
 
 ### ML Service (Python)
@@ -83,7 +102,8 @@ Candle Annotator is a complete machine learning platform for candlestick pattern
 - **Data Processing**: pandas, numpy
 - **Experiment Tracking**: MLflow (model registry, artifact storage)
 - **Data Versioning**: DVC
-- **Database**: PostgreSQL 16 (training run metadata)
+- **Database**: PostgreSQL 16 (shared with frontend - reads candles/annotations, writes training runs)
+- **ORM**: SQLAlchemy (for training runs) + table reflection (for frontend data)
 - **Model Persistence**: joblib
 - **Validation**: Pydantic
 
@@ -103,8 +123,9 @@ Candle Annotator is a complete machine learning platform for candlestick pattern
 
 ### Data Management
 - **CSV Upload**: Import OHLC data (time, open, high, low, close)
-- **Persistent Storage**: SQLite database stores all annotations
-- **CSV Export**: Download labeled data for ML training
+- **Persistent Storage**: PostgreSQL database stores all annotations and candles
+- **Shared Database**: ML service directly queries candle/annotation data
+- **CSV Export**: Download labeled data (optional, no longer required for ML training)
 
 ### UI Components
 - **Toolbox**: Left sidebar with tool buttons, color picker, label management
@@ -274,7 +295,7 @@ candle_annotator/
 
 - **Single User**: No authentication, local data only
 - **No Undo**: Annotations can only be deleted, not undone
-- **SQLite Limits**: Not for concurrent multi-user access
+- **PostgreSQL Required**: Application requires PostgreSQL server to be running
 - **Memory**: Large CSV files (100k+ rows) slow performance
 - **Lines**: Free-form drawing, no snap-to-candle
 
@@ -322,14 +343,28 @@ Before marking features complete:
 
 ## Version History
 
-### v2.0.0 (Current)
+### v3.1.0 (Current)
+- Database consolidation to PostgreSQL
+- Shared database between frontend and ML service
+- Direct database access for ML training (no CSV exports)
+- Migration script for SQLite to PostgreSQL
+- Type safety improvements in ML service
+
+### v3.0.0
+- ML pipeline integration with FastAPI inference service
+- Feature engineering with TA-Lib
+- Model training with MLflow tracking
+- Prediction UI with disagreement detection
+- Active learning loop
+
+### v2.0.0
 - Label management sidebar with search/filter
 - Hacker theme with matrix colors and monospace font
 - Docker deployment with compose
 - Keyboard delete for labels
 - Label selection on chart
 
-### v1.0.0 (Previous)
+### v1.0.0
 - Basic annotation tools (Break Up, Break Down, Line, Delete)
 - Candlestick chart visualization
 - CSV import/export