fix: resolve numpy type conversion issues in ML service data access

- Convert numpy.int64 to Python int before passing to SQLAlchemy queries
- Prevents psycopg2.ProgrammingError: can't adapt type 'numpy.int64'
- Applied to get_candles(), get_span_annotations(), and get_point_annotations()
- All ML service database access tests now passing successfully
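The conversion described in the commit message can be sketched as follows. This is a minimal illustration, not the code from this commit: `to_native_int` and `native_params` are hypothetical helper names. The sketch relies on numpy integer scalars being registered as `numbers.Integral`, so it needs no numpy import.

```python
import numbers
from typing import Any


def to_native_int(value: Any) -> Any:
    """Return a plain Python int for integer-like scalars, else the value unchanged."""
    # bool is a subclass of int; leave it alone so flags stay True/False.
    if isinstance(value, bool):
        return value
    # numpy integer scalars (e.g. numpy.int64) register as numbers.Integral,
    # so this branch catches them without importing numpy here.
    if isinstance(value, numbers.Integral):
        return int(value)
    return value


def native_params(params: dict[str, Any]) -> dict[str, Any]:
    """Normalise every bind parameter before handing it to a query."""
    return {key: to_native_int(val) for key, val in params.items()}
```

A caller would pass something like `native_params({"chart_id": chart_id})` as the bind parameters of a query, so the database driver only ever sees built-in Python types.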
parent 5377431c9d
commit d1557a3846
6 changed files with 437 additions and 119 deletions
README.md | 134
@@ -33,7 +33,8 @@ Candle Annotator is a complete machine learning platform for candlestick pattern
 - **CSV Upload**: Import OHLC data with support for both Unix timestamps and date strings
 - **Replace Mode**: Uploading a new CSV deletes all old candles and replaces them with new data
 - **Initial Data**: Docker containers automatically load EURUSD.csv on first startup if database is empty
-- **SQLite Storage**: All candle data and annotations stored locally in SQLite database
+- **PostgreSQL Storage**: All candle data and annotations stored in PostgreSQL database
+- **Shared Database**: Frontend and ML service use the same database for seamless data access
 - **Data Persistence**: Annotations and candles persist between sessions
 
 ### Chart Visualization
@@ -134,21 +135,23 @@ The integrated ML pipeline provides:
 │ - /api/predict (proxy) │
 │ - /api/model/info (proxy) │
 │ └───────────┬─────────────────────────────────────┬─────────── │
-│ │ SQLite (annotations) │ HTTP │
+│ │ PostgreSQL │ HTTP │
 │ ▼ ▼ │
-│ ┌──────────────────┐ ┌──────────────────────┐ │
-│ │ SQLite Database │ │ ML Inference API │ │
-│ │ (Drizzle ORM) │ │ (FastAPI, Python) │ │
-│ └──────────────────┘ └──────────┬───────────┘ │
-└────────────────────────────────────────────────────┼──────────────┘
-│
-┌──────────────────────────────┴───────────────┐
-│ │
-┌───────────▼─────────┐ ┌──────────────▼────────┐
-│ MLflow Server │ │ PostgreSQL │
-│ (Experiments, │ │ (Training run │
-│ Model Registry) │ │ metadata) │
-└─────────────────────┘ └───────────────────────┘
+│ ┌──────────────────────────────────────────────────────────┐ │
+│ │ PostgreSQL Database (Shared) │ │
+│ │ - Frontend tables (candles, annotations, span_annotations) │
+│ │ - ML tables (training_runs) │ │
+│ │ - Accessed by: Next.js (Drizzle ORM) │ │
+│ │ ML Service (SQLAlchemy) │ │
+│ └──────────────────────┬──────────────────────────────────┘ │
+│ │ │
+│ ┌──────────┴──────────┐ │
+│ │ │ │
+│ ┌───────────▼─────────┐ ┌───────▼───────────────────┐ │
+│ │ ML Inference API │ │ MLflow Server │ │
+│ │ (FastAPI, Python) │ │ (Experiments, Registry) │ │
+│ └─────────────────────┘ └───────────────────────────┘ │
+└─────────────────────────────────────────────────────────────────┘
 ```
 
 ### ML Pipeline Workflow
@@ -189,8 +192,8 @@ The integrated ML pipeline provides:
 - **Charting**: lightweight-charts 4.x (TradingView)
 - **Icons**: lucide-react
 - **Backend**: Next.js API Routes
-- **Database**: SQLite with better-sqlite3
-- **ORM**: Drizzle ORM
+- **Database**: PostgreSQL 16 with pg driver
+- **ORM**: Drizzle ORM (PostgreSQL dialect)
 - **CSV Parsing**: papaparse
 
 ### ML Pipeline (Python)
@@ -200,7 +203,8 @@ The integrated ML pipeline provides:
 - **Data Processing**: pandas, numpy
 - **Experiment Tracking**: MLflow (model registry, artifact storage)
 - **Data Versioning**: DVC (Data Version Control)
-- **Database**: PostgreSQL 16 (training run metadata)
+- **Database**: PostgreSQL 16 (shared with frontend - reads candles/annotations, writes training runs)
+- **ORM**: SQLAlchemy (for training runs) + table reflection (for frontend data)
 - **Model Persistence**: joblib
 - **Validation**: Pydantic
 
@@ -222,8 +226,8 @@ See [DEPLOYMENT.md](./DEPLOYMENT.md#docker-deployment) for detailed Docker instr
 
 - Node.js 18.x or higher (for local development)
 - npm 9.x or higher (for local development)
+- PostgreSQL 16 or higher (for local development)
 - Docker & docker-compose (for containerized deployment)
-- Build tools for native modules (see DEPLOYMENT.md)
 
 ### Local Development Installation
 
@@ -238,12 +242,26 @@ See [DEPLOYMENT.md](./DEPLOYMENT.md#docker-deployment) for detailed Docker instr
 npm install
 ```
 
-3. Start the development server:
+3. Setup PostgreSQL database:
+```bash
+createdb candle_annotator
+createuser -P ml_user
+# Enter password: ml_password
+psql -c "GRANT ALL PRIVILEGES ON DATABASE candle_annotator TO ml_user;"
+```
+
+4. Create `.env` file:
+```bash
+cp .env.example .env
+# Edit .env to set DATABASE_URL=postgresql://ml_user:ml_password@localhost:5432/candle_annotator
+```
+
+5. Start the development server:
 ```bash
 npm run dev
 ```
 
-4. Open http://localhost:3000 in your browser
+6. Open http://localhost:3000 in your browser
 
 ### Usage
 
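Step 4 in the hunk above sets `DATABASE_URL`. One way to sanity-check such a value before starting the app is sketched below, using only the Python standard library. `describe_database_url` is a hypothetical helper, not part of this repository, and the set of accepted URL schemes is an assumption.

```python
from urllib.parse import urlsplit


def describe_database_url(url: str) -> dict:
    """Split a PostgreSQL URL into its parts and fail early on obvious mistakes."""
    parts = urlsplit(url)
    # Assumption: only these two scheme spellings are expected here.
    if parts.scheme not in ("postgresql", "postgres"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    database = parts.path.lstrip("/")
    if not parts.hostname or not database:
        raise ValueError("DATABASE_URL needs a host and a database name")
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port or 5432,  # fall back to PostgreSQL's default port
        "database": database,
    }
```

The password is deliberately left out of the returned dictionary so the result can be logged safely.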
@@ -289,28 +307,68 @@ timestamp,label_type,price
 
 ## Database Schema
 
-### Candles Table
+### Candles Table (PostgreSQL)
 
 ```typescript
 {
-id: integer (PK, auto-increment),
-time: integer (Unix timestamp, unique),
-open: real,
-high: real,
-low: real,
-close: real
+id: serial (PK, auto-increment),
+chart_id: integer (FK to charts.id),
+time: timestamp (not null, indexed with chart_id),
+open: double precision,
+high: double precision,
+low: double precision,
+close: double precision
 }
 ```
 
-### Annotations Table
+### Annotations Table (Point Annotations)
 
 ```typescript
 {
-id: integer (PK, auto-increment),
-timestamp: integer (Unix timestamp),
-label_type: text ('break_up' | 'break_down' | 'line'),
-geometry: text (JSON string for line coordinates, null for markers),
-created_at: integer (Unix timestamp)
+id: serial (PK, auto-increment),
+chart_id: integer (FK to charts.id),
+timestamp: timestamp (not null),
+label_type: text ('line' | 'rectangle'),
+geometry: jsonb (for line/rectangle coordinates, nullable),
+color: text (default '#3b82f6'),
+created_at: timestamp (default now())
+}
+```
+
+### Span Annotations Table (Pattern Labels)
+
+```typescript
+{
+id: serial (PK, auto-increment),
+chart_id: integer (FK to charts.id),
+start_time: timestamp (not null),
+end_time: timestamp (not null),
+label: text (pattern name, e.g., 'Bullish Engulfing'),
+confidence: integer (nullable),
+outcome: text (nullable),
+notes: text (nullable),
+sub_spans: jsonb (nullable),
+color: text (default '#2196F3'),
+source: text (default 'human'), # 'human' | 'model' | 'hybrid'
+model_prediction: jsonb (nullable),
+created_at: timestamp (default now())
+}
+```
+
+### Training Runs Table (ML Service)
+
+```typescript
+{
+id: serial (PK, auto-increment),
+run_id: text (unique, MLflow run ID),
+model_type: text (e.g., 'RandomForest', 'XGBoost'),
+experiment_name: text,
+pipeline_config_hash: text,
+dataset_version: text,
+metrics_summary: jsonb,
+status: text (e.g., 'running', 'completed', 'failed'),
+created_at: timestamp (default now()),
+completed_at: timestamp (nullable)
 }
 ```
 
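The schema hunk above replaces integer Unix timestamps with `timestamp` columns. A minimal sketch of the corresponding conversion on the Python side follows. It assumes the old integers were seconds since the epoch in UTC, which the diff does not state, and both function names are hypothetical.

```python
from datetime import datetime, timezone


def unix_to_timestamp(seconds: int) -> datetime:
    """Convert Unix seconds to a timezone-aware UTC datetime."""
    # int() also accepts integer-like scalars such as numpy.int64.
    return datetime.fromtimestamp(int(seconds), tz=timezone.utc)


def timestamp_to_unix(moment: datetime) -> int:
    """Convert back to Unix seconds; naive datetimes are treated as UTC."""
    if moment.tzinfo is None:
        moment = moment.replace(tzinfo=timezone.utc)
    return int(moment.timestamp())
```

Treating naive values as UTC is a design choice made for this sketch only; a real migration would need to match whatever time zone convention the stored data actually uses.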
@@ -411,9 +469,9 @@ candle_annotator/
 ### Key Technical Decisions
 
 1. **lightweight-charts v4**: Stable API with good candlestick and marker support
-2. **SQLite with better-sqlite3**: Synchronous access, perfect for single-user local apps
+2. **PostgreSQL**: Shared database enables ML service to directly query candle/annotation data without CSV exports
 3. **SVG Overlay for Lines**: Maintains separate rendering layer from chart, easier coordinate management
-4. **Drizzle ORM**: Type-safe queries with minimal overhead
+4. **Drizzle ORM**: Type-safe queries with minimal overhead, PostgreSQL dialect for production-grade features
 5. **Next.js App Router**: Server-side API routes co-located with frontend code
 
 ### Known Limitations
@@ -428,9 +486,9 @@ candle_annotator/
 See [DEPLOYMENT.md](./DEPLOYMENT.md) for detailed troubleshooting steps.
 
 Common issues:
-- **better-sqlite3 binding errors**: Run `npm rebuild better-sqlite3`
+- **PostgreSQL connection errors**: Check `DATABASE_URL` environment variable and verify PostgreSQL is running
 - **Port 3000 in use**: Use `PORT=3001 npm run dev`
-- **Database corruption**: Delete `data/candles.db` and restart
+- **Migration errors**: Ensure PostgreSQL is accessible before starting the application
 
 ## License
 
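For the PostgreSQL connection errors listed in the hunk above, a rough first check is whether anything accepts TCP connections on the database port. The sketch below uses only the standard library. `can_connect` is a hypothetical helper, and 5432 is PostgreSQL's default port rather than a value confirmed by this repository.

```python
import socket


def can_connect(host: str, port: int = 5432, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and opens the socket in one call.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A successful result only shows that some process is listening on that port; it says nothing about credentials or whether the `candle_annotator` database exists.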