feat(ml): add TA-Lib annotation generation and import workflow

Add complete workflow for using TA-Lib to bootstrap training data:

- generate_talib_annotations.py: Python script to run TA-Lib CDL* functions
  and output span annotations in UI-compatible format
- import_talib_annotations.ts: TypeScript script to import generated
  annotations into the UI database with auto-label-type creation
- npm script 'import-annotations' for easy execution
- TALIB_WORKFLOW.md: Comprehensive guide covering the full cycle:
  * Generate patterns with TA-Lib
  * Import into UI
  * Review and edit in browser
  * Export and train model
  * Compare predictions with TA-Lib detections
  * Iterate for improvement

This enables the intended workflow: use TA-Lib for initial annotations,
manually refine them, then train a model that learns from corrections.
Marko Djordjevic 2026-02-15 19:18:28 +01:00
parent 228f70daf3
commit 847ff67986
18 changed files with 5416 additions and 7 deletions

9
.env Normal file

@ -0,0 +1,9 @@
NODE_ENV=production
PORT=3000
DATABASE_PATH=/app/data/candles.db
# ML Inference Service Configuration
INFERENCE_API_URL=http://localhost:8001
INFERENCE_API_TIMEOUT=30000
INFERENCE_BATCH_TIMEOUT=120000
NEXT_PUBLIC_PREDICTIONS_ENABLED=true


@ -196,7 +196,8 @@ sudo make install
```bash
cd services/ml
-pip install -r requirements.txt
+uv sync
+#pip install -r requirements.txt
```
#### 3. Setup PostgreSQL
@ -217,7 +218,7 @@ DVC is used for dataset versioning:
```bash
cd services/ml
-dvc init
+dvc init #--subdir
dvc remote add -d local /path/to/dvc-storage
```

372
TALIB_WORKFLOW.md Normal file

@ -0,0 +1,372 @@
# TA-Lib Annotation Workflow
This guide shows how to use TA-Lib to generate initial pattern annotations, edit them in the UI, and train a model.
## Overview
1. **Generate** - Run TA-Lib CDL* functions to detect patterns automatically
2. **Import** - Import detected patterns into the UI database
3. **Review & Edit** - View, correct, and refine annotations in the web UI
4. **Train** - Export annotations and train your model
5. **Iterate** - Get predictions, compare with TA-Lib, retrain
## Step 1: Generate TA-Lib Annotations
### 1.1 Prepare Your Data
Export candles from your database:
```bash
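# From host: export candles to the path the ML service reads in step 1.2
docker-compose exec candle-annotator sh -c "sqlite3 /app/data/candles.db -csv -header 'SELECT time, open, high, low, close, volume FROM candles ORDER BY time;'" > services/ml/data/raw/OHLCV.csv
# Or copy your existing CSV to the same location
cp your_data.csv services/ml/data/raw/OHLCV.csv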
```
### 1.2 Run Pattern Detection
Enter the ML service container and run the generator:
```bash
# Enter container
docker-compose exec ml-service bash
# Generate annotations (all patterns, perfect matches only)
python generate_talib_annotations.py \
--input data/raw/OHLCV.csv \
--output talib_annotations.json
# Or specify patterns and lower confidence threshold
python generate_talib_annotations.py \
--input data/raw/OHLCV.csv \
--output talib_annotations.json \
--min-confidence 50 \
--patterns CDLENGULFING CDLHAMMER CDLDOJI
# Exit container
exit
```
This creates `talib_annotations.json` with detected patterns.
### 1.3 Review Detection Results
```bash
# Check what was detected
cat services/ml/talib_annotations.json | jq '.annotations | length'
cat services/ml/talib_annotations.json | jq '.annotations[0]'
# See pattern distribution
cat services/ml/talib_annotations.json | jq '[.annotations[].label] | group_by(.) | map({label: .[0], count: length}) | sort_by(.count) | reverse'
```
**Output example:**
```json
{
"start_time": 1700000000,
"end_time": 1700003600,
"label": "Bullish Engulfing",
"confidence": 1.0,
"source": "programmatic",
"notes": "TA-Lib CDLENGULFING detection"
}
```
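The same summary can be produced without `jq`. The following is a minimal sketch, assuming the file has a top-level `annotations` list whose entries carry the fields shown above. The helper name `summarize_annotations` is illustrative and not part of the repository.

```python
import json
from collections import Counter


def summarize_annotations(data: dict) -> list[tuple[str, int]]:
    """Return (label, count) pairs sorted by descending count."""
    # Assumption: each entry has a "label" key, as in the example above.
    counts = Counter(ann["label"] for ann in data.get("annotations", []))
    return counts.most_common()


if __name__ == "__main__":
    # Inline sample standing in for talib_annotations.json (made-up values).
    sample = json.loads("""
    {"annotations": [
      {"start_time": 1700000000, "end_time": 1700003600, "label": "Bullish Engulfing"},
      {"start_time": 1700007200, "end_time": 1700007200, "label": "Doji"},
      {"start_time": 1700010800, "end_time": 1700014400, "label": "Bullish Engulfing"}
    ]}
    """)
    for label, count in summarize_annotations(sample):
        print(f"{label}: {count}")
```

To run it against a real file, replace the inline sample with `json.load(open("talib_annotations.json"))`.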
## Step 2: Import into UI
### 2.1 Copy Annotations File
```bash
# Copy from ML service to project root
docker-compose cp ml-service:/app/talib_annotations.json ./talib_annotations.json
```
### 2.2 Get Your Chart ID
Open http://localhost:3000 and note your chart ID, which is shown in the chart selector. Alternatively, query the database:
```bash
docker-compose exec candle-annotator sh -c "sqlite3 /app/data/candles.db 'SELECT id, name FROM charts;'"
```
### 2.3 Import Annotations
```bash
# Import into chart 1
npm run import-annotations -- --file talib_annotations.json --chart-id 1
# Or clear existing annotations first
npm run import-annotations -- --file talib_annotations.json --chart-id 1 --clear
```
**Output:**
```
=== TA-Lib Annotation Import ===
Input file: talib_annotations.json
Chart ID: 1
Clear existing: no
Reading annotations file...
Found 147 annotations
Source: talib
Ensuring 12 label types exist...
✓ Bullish Engulfing (existing, id: 1)
+ Bearish Engulfing (created, id: 8, color: #ef4444)
✓ Bullish Hammer (existing, id: 2)
...
Importing 147 annotations for chart 1...
✓ Imported 147 annotations
=== Import Complete ===
```
## Step 3: Review & Edit in UI
### 3.1 Open the Annotator
1. Open http://localhost:3000
2. Select your chart from the dropdown
3. Scroll down to see the span annotations in the sidebar
### 3.2 Review TA-Lib Detections
You'll see all the TA-Lib detected patterns as span annotations:
- Green spans = Bullish patterns
- Red spans = Bearish patterns
- Source labeled as "programmatic"
### 3.3 Edit Annotations
**Correct false positives:**
- Click on a span annotation in the sidebar or chart
- Press Delete/Backspace to remove it
**Add missing patterns:**
- Select a span label type
- Click and drag on the chart to create new annotations
- Your annotations are marked as source "human"
**Adjust boundaries:**
- Delete the annotation
- Recreate it with correct start/end times
**Add new pattern types:**
- Go to "Manage Span Label Types"
- Add custom patterns TA-Lib doesn't detect
- Return to main page and annotate
### 3.4 Best Practices
- **Review all TA-Lib detections** - They are rule-based and can include false positives
- **Focus on quality over quantity** - Better to have 50 accurate annotations than 200 noisy ones
- **Add context** - TA-Lib only detects classic patterns; add your own insights
- **Diverse examples** - Make sure you have patterns in different market conditions
## Step 4: Export & Train
### 4.1 Export Annotations
```bash
# Export all span annotations (includes both human and TA-Lib)
curl http://localhost:3000/api/span-annotations/export > services/ml/data/annotations/export.json
# Verify export
cat services/ml/data/annotations/export.json | jq '.annotations | length'
```
### 4.2 Prepare OHLCV Data
```bash
# Copy candles to ML service
docker-compose exec candle-annotator sh -c "sqlite3 /app/data/candles.db -csv -header 'SELECT time, open, high, low, close, volume FROM candles ORDER BY time;'" > services/ml/data/raw/OHLCV.csv
```
### 4.3 Train Model
```bash
# Enter ML service
docker-compose exec ml-service bash
# Run full pipeline
python pipeline.py --config config/pipeline.yaml
# Exit
exit
```
### 4.4 Restart Inference
```bash
# Restart to load new model
docker-compose restart ml-service
# Verify model loaded
curl http://localhost:8001/model/info | jq '.model_info'
```
## Step 5: Compare & Iterate
### 5.1 Get Predictions
1. Open http://localhost:3000
2. Scroll to Predictions panel
3. Click "Run on Visible" or "Predict All"
### 5.2 Compare with TA-Lib
Now you can compare:
- **Your edited annotations** (human judgment)
- **TA-Lib raw detections** (programmatic)
- **Model predictions** (trained on your corrections)
The disagreement detection shows where these differ!
### 5.3 Iterate
Use the prediction summary to find:
- **Missed by model** - Patterns you annotated but model missed
- **Missed by human** - Model found patterns you didn't annotate
- **Label mismatch** - Same location, different pattern type
Add more annotations where the model struggles, then retrain.
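The three disagreement types can be illustrated with a small overlap check. This is a sketch under stated assumptions, not the application's disagreement detection: it assumes spans are dicts with `start_time`, `end_time`, and `label`, treats any time overlap as "same location", and uses the hypothetical names `spans_overlap` and `classify_disagreements`.

```python
def spans_overlap(a: dict, b: dict) -> bool:
    """True when two spans share at least one timestamp (inclusive bounds)."""
    return a["start_time"] <= b["end_time"] and b["start_time"] <= a["end_time"]


def classify_disagreements(human: list[dict], model: list[dict]) -> dict:
    """Bucket spans into the three disagreement types listed above."""
    result = {"missed_by_model": [], "missed_by_human": [], "label_mismatch": []}
    for h in human:
        matches = [m for m in model if spans_overlap(h, m)]
        if not matches:
            result["missed_by_model"].append(h)
        elif all(m["label"] != h["label"] for m in matches):
            # Overlapping predictions exist, but none carries the human label.
            result["label_mismatch"].append((h, matches[0]))
    for m in model:
        if not any(spans_overlap(m, h) for h in human):
            result["missed_by_human"].append(m)
    return result


if __name__ == "__main__":
    # Made-up spans for illustration only.
    human = [
        {"start_time": 0, "end_time": 10, "label": "Hammer"},
        {"start_time": 20, "end_time": 30, "label": "Doji"},
        {"start_time": 40, "end_time": 50, "label": "Bullish Engulfing"},
    ]
    model = [
        {"start_time": 5, "end_time": 12, "label": "Hammer"},
        {"start_time": 22, "end_time": 28, "label": "Spinning Top"},
        {"start_time": 60, "end_time": 70, "label": "Bearish Engulfing"},
    ]
    for kind, items in classify_disagreements(human, model).items():
        print(kind, len(items))
```

A stricter matcher could require a minimum overlap ratio instead of any shared timestamp; the inclusive check is used here only because it is the simplest rule to reason about.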
## Configuration Options
### Pattern Selection
Use `--patterns` to choose which patterns to detect:
```bash
python generate_talib_annotations.py \
--input data/raw/OHLCV.csv \
--output talib_annotations.json \
--patterns CDLENGULFING CDLHAMMER CDLDOJI CDLMORNINGSTAR CDLEVENINGSTAR
```
Common patterns:
- `CDLENGULFING` - Bullish/Bearish Engulfing
- `CDLHAMMER` - Hammer
- `CDLDOJI` - Doji
- `CDLMORNINGSTAR` / `CDLEVENINGSTAR` - Morning/Evening Star
- `CDLHARAMI` - Harami
- `CDL3WHITESOLDIERS` / `CDL3BLACKCROWS` - Three White Soldiers / Three Black Crows
See full list: https://ta-lib.org/function.html (search for CDL)
### Confidence Threshold
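TA-Lib CDL* functions return 0 when no pattern matches and a signed integer when one does, typically +100 for a bullish match and -100 for a bearish one. Lower the threshold to get more detections: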
```bash
# Get 50-100% matches (more patterns, potentially noisier)
python generate_talib_annotations.py \
--input data/raw/OHLCV.csv \
--output talib_annotations.json \
--min-confidence 50
```
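How a raw TA-Lib value could map onto the `label` and `confidence` fields is sketched below. This is an assumption about the generator's behaviour, not a copy of it: it treats the sign as direction, `abs(value) / 100` as confidence (capped at 1.0), and `--min-confidence` as a cut-off on the absolute value. The function name `to_detection` is hypothetical.

```python
from typing import Optional


def to_detection(value: int, name: str, min_confidence: int = 100) -> Optional[dict]:
    """Map one raw CDL* output value to a detection record, or None if filtered."""
    strength = abs(value)
    if strength == 0 or strength < min_confidence:
        return None  # no match, or below the threshold
    direction = "Bullish" if value > 0 else "Bearish"
    return {
        "label": f"{direction} {name}",
        # Cap at 1.0 in case a function reports a magnitude above 100.
        "confidence": min(strength / 100, 1.0),
    }


if __name__ == "__main__":
    raw = [0, 100, -100, 80, -50]  # made-up values for illustration
    for v in raw:
        print(v, "->", to_detection(v, "Engulfing", min_confidence=50))
```

With the default threshold of 100 only full-strength matches survive, which corresponds to the "perfect matches only" behaviour described in step 1.2.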
## Troubleshooting
### "No module named 'talib'"
TA-Lib is not installed in the `ml-service` image. Rebuild it:
```bash
# Rebuild ml-service with TA-Lib
docker-compose build --no-cache ml-service
docker-compose up -d ml-service
```
### "No patterns detected"
Try:
1. **Lower confidence threshold** - Use `--min-confidence 50`
2. **Check data quality** - Make sure OHLCV has valid data
3. **Try more patterns** - Don't specify `--patterns`, detect all
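For the data quality check, a few basic sanity rules catch most malformed exports. The sketch below assumes the CSV columns produced by the export query in step 1.1 (`time, open, high, low, close, volume`) and numeric timestamps; the function name `check_ohlcv` and the specific rules are illustrative choices, not part of the repository.

```python
import csv
import io

REQUIRED_COLUMNS = ("time", "open", "high", "low", "close", "volume")


def check_ohlcv(rows) -> list[str]:
    """Return human-readable problems found in an iterable of candle dicts."""
    problems = []
    prev_time = None
    for i, row in enumerate(rows, start=1):
        missing = [col for col in REQUIRED_COLUMNS if row.get(col) in (None, "")]
        if missing:
            problems.append(f"row {i}: missing {', '.join(missing)}")
            continue
        try:
            t, open_, high, low, close, _volume = (float(row[k]) for k in REQUIRED_COLUMNS)
        except ValueError:
            problems.append(f"row {i}: non-numeric value")
            continue
        # The high must be at or above open/close, the low at or below them.
        if high < max(open_, close) or low > min(open_, close):
            problems.append(f"row {i}: high/low do not bracket open/close")
        if prev_time is not None and t <= prev_time:
            problems.append(f"row {i}: time is not strictly increasing")
        prev_time = t
    return problems


if __name__ == "__main__":
    # Made-up sample: row 2 has an impossible high, row 3 repeats a timestamp.
    sample = io.StringIO(
        "time,open,high,low,close,volume\n"
        "1700000000,10,12,9,11,100\n"
        "1700003600,11,10,9,11,120\n"
        "1700003600,11,13,10,12,90\n"
    )
    for problem in check_ohlcv(csv.DictReader(sample)):
        print(problem)
```

To check a real export, pass `csv.DictReader(open("services/ml/data/raw/OHLCV.csv", newline=""))` instead of the inline sample.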
### Import script fails
Make sure:
1. **File exists** - Check path to JSON file
2. **Chart ID valid** - Run: `docker-compose exec candle-annotator sh -c "sqlite3 /app/data/candles.db 'SELECT id, name FROM charts;'"`
3. **tsx installed** - Run: `npm install`
### Annotations not showing in UI
1. **Refresh page** - Hard refresh (Ctrl+F5)
2. **Check chart ID** - Make sure you selected the correct chart
3. **Check database** - Run: `docker-compose exec candle-annotator sh -c "sqlite3 /app/data/candles.db 'SELECT COUNT(*) FROM span_annotations;'"`
## Tips & Best Practices
### Balancing Human & Programmatic Labels
When training, you can choose the merge strategy in `config/pipeline.yaml`:
```yaml
annotation_ingestion:
  merge_strategy: "human_priority"  # Use human labels where they overlap
  # or "programmatic_priority"      # Use TA-Lib where they overlap
  # or "both"                       # Keep both as separate features
```
**Recommended**: Start with `human_priority` - trust your corrections over TA-Lib.
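What `human_priority` means for overlapping spans can be sketched as follows. This is one plausible reading of the setting, not the pipeline's actual implementation: it assumes spans are dicts with `start_time` and `end_time`, and that a programmatic span is dropped whenever any human span overlaps it. The function names are hypothetical.

```python
def spans_overlap(a: dict, b: dict) -> bool:
    """True when two spans share at least one timestamp (inclusive bounds)."""
    return a["start_time"] <= b["end_time"] and b["start_time"] <= a["end_time"]


def merge_human_priority(human: list[dict], programmatic: list[dict]) -> list[dict]:
    """Keep every human span; keep programmatic spans only where no human span overlaps."""
    kept = [p for p in programmatic if not any(spans_overlap(p, h) for h in human)]
    return sorted(human + kept, key=lambda s: s["start_time"])


if __name__ == "__main__":
    # Made-up spans for illustration only.
    human = [{"start_time": 10, "end_time": 20, "label": "Hammer", "source": "human"}]
    programmatic = [
        {"start_time": 15, "end_time": 25, "label": "Doji", "source": "programmatic"},
        {"start_time": 30, "end_time": 40, "label": "Doji", "source": "programmatic"},
    ]
    for span in merge_human_priority(human, programmatic):
        print(span["start_time"], span["end_time"], span["label"], span["source"])
```

Under this reading, `programmatic_priority` would be the same function with the two arguments swapped.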
### Iterative Improvement
1. **Round 1**: Generate TA-Lib → Review → Train baseline model
2. **Round 2**: Get predictions → Find disagreements → Add corrections → Retrain
3. **Round 3**: Focus on low-confidence predictions → Add more examples → Retrain
4. **Repeat** until model performance meets your needs
### Pattern Coverage
Make sure you have examples of:
- **Bullish patterns** in uptrends
- **Bearish patterns** in downtrends
- **Neutral patterns** in sideways markets
- **False signals** (TA-Lib detected but actually not a tradeable pattern)
This teaches the model context, not just shape recognition.
### Quality Metrics
Track these in MLflow UI (http://localhost:5000):
- **Accuracy** - Overall correctness
- **F1 (macro)** - Average across all pattern types
- **Per-class F1** - Performance for each pattern individually
- **Confusion matrix** - Where the model makes mistakes
Focus on improving F1 for patterns you actually trade.
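For reference, per-class and macro F1 can be computed by hand from true and predicted labels. The sketch below uses the standard definition F1 = 2·TP / (2·TP + FP + FN) and averages it over classes without weighting; the function names and sample labels are illustrative, and the numbers MLflow shows come from the training pipeline rather than from this code.

```python
def per_class_f1(y_true: list[str], y_pred: list[str]) -> dict[str, float]:
    """F1 score for every label that appears in either list."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true: list[str], y_pred: list[str]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Made-up labels for illustration only.
    y_true = ["doji", "doji", "hammer", "none"]
    y_pred = ["doji", "hammer", "hammer", "none"]
    for label, score in per_class_f1(y_true, y_pred).items():
        print(f"{label}: {score:.3f}")
    print(f"macro F1: {macro_f1(y_true, y_pred):.3f}")
```

Because macro F1 weights every class equally, a rare pattern with poor recall pulls the average down as much as a common one, which is why the per-class view is worth checking alongside it.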
## Quick Reference
```bash
# Generate TA-Lib annotations
docker-compose exec ml-service python generate_talib_annotations.py \
--input data/raw/OHLCV.csv --output talib_annotations.json
# Copy to host
docker-compose cp ml-service:/app/talib_annotations.json ./
# Import to UI
npm run import-annotations -- --file talib_annotations.json --chart-id 1
# Export after editing
curl http://localhost:3000/api/span-annotations/export > services/ml/data/annotations/export.json
# Train model
docker-compose exec ml-service python pipeline.py --config config/pipeline.yaml
# Restart inference
docker-compose restart ml-service
# View results (`open` is macOS; use `xdg-open` on Linux)
open http://localhost:3000
open http://localhost:5000 # MLflow UI
```

2
next-env.d.ts vendored

@ -1,6 +1,6 @@
/// <reference types="next" />
/// <reference types="next/image-types/global" />
-import "./.next/types/routes.d.ts";
+import "./.next/dev/types/routes.d.ts";
// NOTE: This file should not be edited
// see https://nextjs.org/docs/app/api-reference/config/typescript for more information.

508
package-lock.json generated

@ -38,7 +38,8 @@
},
"devDependencies": {
"drizzle-kit": "^0.31.9",
-"tailwindcss": "^3.4.19"
+"tailwindcss": "^3.4.19",
+"tsx": "^4.21.0"
}
},
"node_modules/@alloc/quick-lru": {
@ -7473,6 +7474,511 @@
"version": "2.8.1",
"license": "0BSD"
},
"node_modules/tsx": {
"version": "4.21.0",
"resolved": "https://registry.npmjs.org/tsx/-/tsx-4.21.0.tgz",
"integrity": "sha512-5C1sg4USs1lfG0GFb2RLXsdpXqBSEhAaA/0kPL01wxzpMqLILNxIxIOKiILz+cdg/pLnOUxFYOR5yhHU666wbw==",
"dev": true,
"license": "MIT",
"peer": true,
"dependencies": {
"esbuild": "~0.27.0",
"get-tsconfig": "^4.7.5"
},
"bin": {
"tsx": "dist/cli.mjs"
},
"engines": {
"node": ">=18.0.0"
},
"optionalDependencies": {
"fsevents": "~2.3.3"
}
},
"node_modules/tsx/node_modules/@esbuild/aix-ppc64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/aix-ppc64/-/aix-ppc64-0.27.3.tgz",
"integrity": "sha512-9fJMTNFTWZMh5qwrBItuziu834eOCUcEqymSH7pY+zoMVEZg3gcPuBNxH1EvfVYe9h0x/Ptw8KBzv7qxb7l8dg==",
"cpu": [
"ppc64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"aix"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/android-arm": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/android-arm/-/android-arm-0.27.3.tgz",
"integrity": "sha512-i5D1hPY7GIQmXlXhs2w8AWHhenb00+GxjxRncS2ZM7YNVGNfaMxgzSGuO8o8SJzRc/oZwU2bcScvVERk03QhzA==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/android-arm64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/android-arm64/-/android-arm64-0.27.3.tgz",
"integrity": "sha512-YdghPYUmj/FX2SYKJ0OZxf+iaKgMsKHVPF1MAq/P8WirnSpCStzKJFjOjzsW0QQ7oIAiccHdcqjbHmJxRb/dmg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/android-x64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/android-x64/-/android-x64-0.27.3.tgz",
"integrity": "sha512-IN/0BNTkHtk8lkOM8JWAYFg4ORxBkZQf9zXiEOfERX/CzxW3Vg1ewAhU7QSWQpVIzTW+b8Xy+lGzdYXV6UZObQ==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"android"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/darwin-arm64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/darwin-arm64/-/darwin-arm64-0.27.3.tgz",
"integrity": "sha512-Re491k7ByTVRy0t3EKWajdLIr0gz2kKKfzafkth4Q8A5n1xTHrkqZgLLjFEHVD+AXdUGgQMq+Godfq45mGpCKg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/darwin-x64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/darwin-x64/-/darwin-x64-0.27.3.tgz",
"integrity": "sha512-vHk/hA7/1AckjGzRqi6wbo+jaShzRowYip6rt6q7VYEDX4LEy1pZfDpdxCBnGtl+A5zq8iXDcyuxwtv3hNtHFg==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/freebsd-arm64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/freebsd-arm64/-/freebsd-arm64-0.27.3.tgz",
"integrity": "sha512-ipTYM2fjt3kQAYOvo6vcxJx3nBYAzPjgTCk7QEgZG8AUO3ydUhvelmhrbOheMnGOlaSFUoHXB6un+A7q4ygY9w==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"freebsd"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/freebsd-x64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/freebsd-x64/-/freebsd-x64-0.27.3.tgz",
"integrity": "sha512-dDk0X87T7mI6U3K9VjWtHOXqwAMJBNN2r7bejDsc+j03SEjtD9HrOl8gVFByeM0aJksoUuUVU9TBaZa2rgj0oA==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"freebsd"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/linux-arm": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/linux-arm/-/linux-arm-0.27.3.tgz",
"integrity": "sha512-s6nPv2QkSupJwLYyfS+gwdirm0ukyTFNl3KTgZEAiJDd+iHZcbTPPcWCcRYH+WlNbwChgH2QkE9NSlNrMT8Gfw==",
"cpu": [
"arm"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/linux-arm64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/linux-arm64/-/linux-arm64-0.27.3.tgz",
"integrity": "sha512-sZOuFz/xWnZ4KH3YfFrKCf1WyPZHakVzTiqji3WDc0BCl2kBwiJLCXpzLzUBLgmp4veFZdvN5ChW4Eq/8Fc2Fg==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/linux-ia32": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/linux-ia32/-/linux-ia32-0.27.3.tgz",
"integrity": "sha512-yGlQYjdxtLdh0a3jHjuwOrxQjOZYD/C9PfdbgJJF3TIZWnm/tMd/RcNiLngiu4iwcBAOezdnSLAwQDPqTmtTYg==",
"cpu": [
"ia32"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/linux-loong64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/linux-loong64/-/linux-loong64-0.27.3.tgz",
"integrity": "sha512-WO60Sn8ly3gtzhyjATDgieJNet/KqsDlX5nRC5Y3oTFcS1l0KWba+SEa9Ja1GfDqSF1z6hif/SkpQJbL63cgOA==",
"cpu": [
"loong64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/linux-mips64el": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/linux-mips64el/-/linux-mips64el-0.27.3.tgz",
"integrity": "sha512-APsymYA6sGcZ4pD6k+UxbDjOFSvPWyZhjaiPyl/f79xKxwTnrn5QUnXR5prvetuaSMsb4jgeHewIDCIWljrSxw==",
"cpu": [
"mips64el"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/linux-ppc64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/linux-ppc64/-/linux-ppc64-0.27.3.tgz",
"integrity": "sha512-eizBnTeBefojtDb9nSh4vvVQ3V9Qf9Df01PfawPcRzJH4gFSgrObw+LveUyDoKU3kxi5+9RJTCWlj4FjYXVPEA==",
"cpu": [
"ppc64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/linux-riscv64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/linux-riscv64/-/linux-riscv64-0.27.3.tgz",
"integrity": "sha512-3Emwh0r5wmfm3ssTWRQSyVhbOHvqegUDRd0WhmXKX2mkHJe1SFCMJhagUleMq+Uci34wLSipf8Lagt4LlpRFWQ==",
"cpu": [
"riscv64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/linux-s390x": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/linux-s390x/-/linux-s390x-0.27.3.tgz",
"integrity": "sha512-pBHUx9LzXWBc7MFIEEL0yD/ZVtNgLytvx60gES28GcWMqil8ElCYR4kvbV2BDqsHOvVDRrOxGySBM9Fcv744hw==",
"cpu": [
"s390x"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/linux-x64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.27.3.tgz",
"integrity": "sha512-Czi8yzXUWIQYAtL/2y6vogER8pvcsOsk5cpwL4Gk5nJqH5UZiVByIY8Eorm5R13gq+DQKYg0+JyQoytLQas4dA==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"linux"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/netbsd-arm64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/netbsd-arm64/-/netbsd-arm64-0.27.3.tgz",
"integrity": "sha512-sDpk0RgmTCR/5HguIZa9n9u+HVKf40fbEUt+iTzSnCaGvY9kFP0YKBWZtJaraonFnqef5SlJ8/TiPAxzyS+UoA==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"netbsd"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/netbsd-x64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/netbsd-x64/-/netbsd-x64-0.27.3.tgz",
"integrity": "sha512-P14lFKJl/DdaE00LItAukUdZO5iqNH7+PjoBm+fLQjtxfcfFE20Xf5CrLsmZdq5LFFZzb5JMZ9grUwvtVYzjiA==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"netbsd"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/openbsd-arm64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/openbsd-arm64/-/openbsd-arm64-0.27.3.tgz",
"integrity": "sha512-AIcMP77AvirGbRl/UZFTq5hjXK+2wC7qFRGoHSDrZ5v5b8DK/GYpXW3CPRL53NkvDqb9D+alBiC/dV0Fb7eJcw==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"openbsd"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/openbsd-x64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/openbsd-x64/-/openbsd-x64-0.27.3.tgz",
"integrity": "sha512-DnW2sRrBzA+YnE70LKqnM3P+z8vehfJWHXECbwBmH/CU51z6FiqTQTHFenPlHmo3a8UgpLyH3PT+87OViOh1AQ==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"openbsd"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/openharmony-arm64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/openharmony-arm64/-/openharmony-arm64-0.27.3.tgz",
"integrity": "sha512-NinAEgr/etERPTsZJ7aEZQvvg/A6IsZG/LgZy+81wON2huV7SrK3e63dU0XhyZP4RKGyTm7aOgmQk0bGp0fy2g==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"openharmony"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/sunos-x64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/sunos-x64/-/sunos-x64-0.27.3.tgz",
"integrity": "sha512-PanZ+nEz+eWoBJ8/f8HKxTTD172SKwdXebZ0ndd953gt1HRBbhMsaNqjTyYLGLPdoWHy4zLU7bDVJztF5f3BHA==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"sunos"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/win32-arm64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/win32-arm64/-/win32-arm64-0.27.3.tgz",
"integrity": "sha512-B2t59lWWYrbRDw/tjiWOuzSsFh1Y/E95ofKz7rIVYSQkUYBjfSgf6oeYPNWHToFRr2zx52JKApIcAS/D5TUBnA==",
"cpu": [
"arm64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/win32-ia32": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/win32-ia32/-/win32-ia32-0.27.3.tgz",
"integrity": "sha512-QLKSFeXNS8+tHW7tZpMtjlNb7HKau0QDpwm49u0vUp9y1WOF+PEzkU84y9GqYaAVW8aH8f3GcBck26jh54cX4Q==",
"cpu": [
"ia32"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/@esbuild/win32-x64": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/@esbuild/win32-x64/-/win32-x64-0.27.3.tgz",
"integrity": "sha512-4uJGhsxuptu3OcpVAzli+/gWusVGwZZHTlS63hh++ehExkVT8SgiEf7/uC/PclrPPkLhZqGgCTjd0VWLo6xMqA==",
"cpu": [
"x64"
],
"dev": true,
"license": "MIT",
"optional": true,
"os": [
"win32"
],
"engines": {
"node": ">=18"
}
},
"node_modules/tsx/node_modules/esbuild": {
"version": "0.27.3",
"resolved": "https://registry.npmjs.org/esbuild/-/esbuild-0.27.3.tgz",
"integrity": "sha512-8VwMnyGCONIs6cWue2IdpHxHnAjzxnw2Zr7MkVxB2vjmQ2ivqGFb4LEG3SMnv0Gb2F/G/2yA8zUaiL1gywDCCg==",
"dev": true,
"hasInstallScript": true,
"license": "MIT",
"bin": {
"esbuild": "bin/esbuild"
},
"engines": {
"node": ">=18"
},
"optionalDependencies": {
"@esbuild/aix-ppc64": "0.27.3",
"@esbuild/android-arm": "0.27.3",
"@esbuild/android-arm64": "0.27.3",
"@esbuild/android-x64": "0.27.3",
"@esbuild/darwin-arm64": "0.27.3",
"@esbuild/darwin-x64": "0.27.3",
"@esbuild/freebsd-arm64": "0.27.3",
"@esbuild/freebsd-x64": "0.27.3",
"@esbuild/linux-arm": "0.27.3",
"@esbuild/linux-arm64": "0.27.3",
"@esbuild/linux-ia32": "0.27.3",
"@esbuild/linux-loong64": "0.27.3",
"@esbuild/linux-mips64el": "0.27.3",
"@esbuild/linux-ppc64": "0.27.3",
"@esbuild/linux-riscv64": "0.27.3",
"@esbuild/linux-s390x": "0.27.3",
"@esbuild/linux-x64": "0.27.3",
"@esbuild/netbsd-arm64": "0.27.3",
"@esbuild/netbsd-x64": "0.27.3",
"@esbuild/openbsd-arm64": "0.27.3",
"@esbuild/openbsd-x64": "0.27.3",
"@esbuild/openharmony-arm64": "0.27.3",
"@esbuild/sunos-x64": "0.27.3",
"@esbuild/win32-arm64": "0.27.3",
"@esbuild/win32-ia32": "0.27.3",
"@esbuild/win32-x64": "0.27.3"
}
},
"node_modules/tunnel-agent": {
"version": "0.6.0",
"license": "Apache-2.0",


@ -7,7 +7,8 @@
"dev": "next dev",
"build": "next build",
"start": "next start",
-"lint": "next lint"
+"lint": "next lint",
+"import-annotations": "tsx scripts/import_talib_annotations.ts"
},
"keywords": [],
"author": "",
@ -42,6 +43,7 @@
},
"devDependencies": {
"drizzle-kit": "^0.31.9",
-"tailwindcss": "^3.4.19"
+"tailwindcss": "^3.4.19",
+"tsx": "^4.21.0"
}
}


@ -0,0 +1,252 @@
#!/usr/bin/env tsx
/**
* Import TA-Lib generated annotations into the Candle Annotator database.
*
* This script reads a JSON file with annotations (from generate_talib_annotations.py)
* and imports them as span annotations that can be viewed and edited in the UI.
*
* Usage:
* npm run import-annotations -- --file talib_annotations.json --chart-id 1
*/
import { readFile } from 'fs/promises';
import { db } from '../src/lib/db';
import { spanAnnotations, spanLabelTypes } from '../src/lib/db/schema';
import { eq } from 'drizzle-orm';
interface Annotation {
start_time: number;
end_time: number;
label: string;
confidence?: number;
source?: string;
notes?: string;
}
interface ImportData {
annotations: Annotation[];
metadata?: {
source?: string;
count?: number;
};
}
interface LabelTypeMap {
[key: string]: {
id: number;
color: string;
};
}
async function ensureLabelTypes(labels: string[]): Promise<LabelTypeMap> {
/**
* Ensure span label types exist for all unique labels.
* Creates missing label types with auto-generated colors.
*/
const uniqueLabels = [...new Set(labels)];
const labelMap: LabelTypeMap = {};
console.log(`\nEnsuring ${uniqueLabels.length} label types exist...`);
// Fetch existing label types
const existing = await db.select().from(spanLabelTypes);
const existingMap = new Map(existing.map(lt => [lt.name, lt]));
// Color palette for auto-generated labels
const colors = [
'#22c55e', // green (bullish)
'#ef4444', // red (bearish)
'#3b82f6', // blue
'#f59e0b', // amber
'#8b5cf6', // purple
'#ec4899', // pink
'#06b6d4', // cyan
'#f97316', // orange
];
let colorIndex = 0;
let sortOrder = existing.length;
for (const label of uniqueLabels) {
if (existingMap.has(label)) {
// Renamed to avoid shadowing the outer `existing` array
const existingType = existingMap.get(label)!;
labelMap[label] = {
id: existingType.id,
color: existingType.color,
};
console.log(` ✓ ${label} (existing, id: ${existingType.id})`);
} else {
// Assign color based on bullish/bearish
let color: string;
if (label.toLowerCase().includes('bullish')) {
color = '#22c55e'; // green
} else if (label.toLowerCase().includes('bearish')) {
color = '#ef4444'; // red
} else {
color = colors[colorIndex % colors.length];
colorIndex++;
}
// Create new label type
const [newLabel] = await db.insert(spanLabelTypes).values({
name: label,
display_name: label,
color,
hotkey: null,
is_active: 1,
sort_order: sortOrder++,
}).returning();
labelMap[label] = {
id: newLabel.id,
color: newLabel.color,
};
console.log(` + ${label} (created, id: ${newLabel.id}, color: ${color})`);
}
}
return labelMap;
}
async function importAnnotations(
annotations: Annotation[],
chartId: number,
labelMap: LabelTypeMap
): Promise<number> {
/**
* Import annotations into the database.
* Returns the number of annotations imported.
*/
console.log(`\nImporting ${annotations.length} annotations for chart ${chartId}...`);
let imported = 0;
let skipped = 0;
for (const ann of annotations) {
const labelInfo = labelMap[ann.label];
if (!labelInfo) {
console.error(` ✗ Skipping: Unknown label "${ann.label}"`);
skipped++;
continue;
}
try {
await db.insert(spanAnnotations).values({
chart_id: chartId,
start_time: ann.start_time,
end_time: ann.end_time,
label: ann.label,
confidence: ann.confidence ?? null, // `??` keeps a legitimate confidence of 0
outcome: null,
notes: ann.notes || null,
sub_spans: null,
color: labelInfo.color,
source: ann.source || 'programmatic',
model_prediction: null,
});
imported++;
if (imported % 100 === 0) {
console.log(` ${imported}/${annotations.length} imported...`);
}
} catch (error: any) {
console.error(` ✗ Error importing annotation: ${error.message}`);
skipped++;
}
}
console.log(`\n✓ Imported ${imported} annotations`);
if (skipped > 0) {
console.log(`✗ Skipped ${skipped} annotations`);
}
return imported;
}
async function main() {
const args = process.argv.slice(2);
// Parse arguments
let filePath: string | null = null;
let chartId: number | null = null;
let clearExisting = false;
for (let i = 0; i < args.length; i++) {
if (args[i] === '--file' || args[i] === '-f') {
filePath = args[++i];
} else if (args[i] === '--chart-id' || args[i] === '-c') {
chartId = parseInt(args[++i], 10);
} else if (args[i] === '--clear') {
clearExisting = true;
} else if (args[i] === '--help' || args[i] === '-h') {
console.log(`
Import TA-Lib annotations into Candle Annotator database
Usage:
npm run import-annotations -- --file <json-file> --chart-id <id> [--clear]
Options:
--file, -f <path> Input JSON file (from generate_talib_annotations.py)
--chart-id, -c <id> Chart ID to import annotations into
--clear Clear existing annotations for this chart before import
--help, -h Show this help message
Example:
npm run import-annotations -- --file talib_annotations.json --chart-id 1
`);
process.exit(0);
}
}
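// parseInt returns NaN for non-numeric input, so reject that as well
if (!filePath || chartId === null || Number.isNaN(chartId)) {
console.error('Error: --file and a numeric --chart-id are required');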
console.error('Run with --help for usage information');
process.exit(1);
}
console.log('=== TA-Lib Annotation Import ===\n');
console.log(`Input file: ${filePath}`);
console.log(`Chart ID: ${chartId}`);
console.log(`Clear existing: ${clearExisting ? 'yes' : 'no'}`);
// Read annotations file
console.log('\nReading annotations file...');
const fileContent = await readFile(filePath, 'utf-8');
const data: ImportData = JSON.parse(fileContent);
console.log(`Found ${data.annotations.length} annotations`);
if (data.metadata) {
console.log(`Source: ${data.metadata.source || 'unknown'}`);
}
// Clear existing annotations if requested
if (clearExisting) {
console.log('\nClearing existing annotations...');
await db.delete(spanAnnotations)
.where(eq(spanAnnotations.chart_id, chartId));
console.log('Deleted existing annotations');
}
// Collect unique labels
const uniqueLabels = [...new Set(data.annotations.map(a => a.label))];
// Ensure label types exist
const labelMap = await ensureLabelTypes(uniqueLabels);
// Import annotations
const imported = await importAnnotations(data.annotations, chartId, labelMap);
console.log('\n=== Import Complete ===');
console.log(`\nNext steps:`);
console.log(`1. Open http://localhost:3000 and select chart ${chartId}`);
console.log(`2. Review the TA-Lib generated annotations`);
console.log(`3. Edit, delete, or add new annotations as needed`);
console.log(`4. Export and train: npm run ml:export-and-train`);
}
main().catch((error) => {
console.error('\n✗ Import failed:', error.message);
process.exit(1);
});

Binary file not shown.

Binary file not shown.

Binary file not shown.


@ -1,6 +1,22 @@
pyproject.toml
app/__init__.py
app/annotation_ingestion.py
app/config.py
app/db.py
app/main.py
app/preprocessing.py
candle_ml.egg-info/PKG-INFO
candle_ml.egg-info/SOURCES.txt
candle_ml.egg-info/dependency_links.txt
candle_ml.egg-info/requires.txt
candle_ml.egg-info/top_level.txt
features/candle_features.py
features/custom_loader.py
features/engineer.py
features/talib_features.py
training/__init__.py
training/evaluation.py
training/train.py
training/models/__init__.py
training/models/random_forest.py
training/models/xgboost_model.py


@ -0,0 +1,267 @@
#!/usr/bin/env python3
"""
Generate span annotations from TA-Lib candlestick pattern functions.

This script runs TA-Lib CDL* functions on OHLCV data and outputs
annotations in a format that can be imported into the Candle Annotator UI.

Usage:
    python generate_talib_annotations.py --input data/raw/OHLCV.csv --output talib_annotations.json
"""

import argparse
import json
import logging
import sys
from pathlib import Path
from typing import List, Dict, Any

import pandas as pd
import numpy as np

try:
    import talib
except ImportError:
    print("ERROR: TA-Lib not installed. Install with: pip install TA-Lib")
    print("Note: You may need to install the C library first. See DEPLOYMENT.md")
    sys.exit(1)

logging.basicConfig(level=logging.INFO, format='[%(levelname)s] %(message)s')
logger = logging.getLogger(__name__)

# TA-Lib candlestick pattern functions with friendly names
TALIB_PATTERNS = {
    'CDLENGULFING': 'Engulfing',
    'CDLHAMMER': 'Hammer',
    'CDLINVERTEDHAMMER': 'Inverted Hammer',
    'CDLSHOOTINGSTAR': 'Shooting Star',
    'CDLDOJI': 'Doji',
    'CDLDOJISTAR': 'Doji Star',
    'CDLMORNINGSTAR': 'Morning Star',
    'CDLEVENINGSTAR': 'Evening Star',
    'CDLHARAMI': 'Harami',
    'CDLHARAMICROSS': 'Harami Cross',
    'CDLPIERCING': 'Piercing',
    'CDLDARKCLOUDCOVER': 'Dark Cloud Cover',
    'CDLMARUBOZU': 'Marubozu',
    'CDLSPINNINGTOP': 'Spinning Top',
    # TA-Lib names these two functions with a digit (there is no CDLTHREE* variant)
    'CDL3BLACKCROWS': 'Three Black Crows',
    'CDL3WHITESOLDIERS': 'Three White Soldiers',
    'CDLABANDONEDBABY': 'Abandoned Baby',
    'CDLADVANCEBLOCK': 'Advance Block',
    'CDLBELTHOLD': 'Belt Hold',
    'CDLBREAKAWAY': 'Breakaway',
    'CDLCLOSINGMARUBOZU': 'Closing Marubozu',
    'CDLCONCEALBABYSWALL': 'Concealing Baby Swallow',
    'CDLCOUNTERATTACK': 'Counterattack',
    'CDLDRAGONFLYDOJI': 'Dragonfly Doji',
    'CDLGAPSIDESIDEWHITE': 'Up/Down Gap Side-by-Side White Lines',
    'CDLGRAVESTONEDOJI': 'Gravestone Doji',
    'CDLHANGINGMAN': 'Hanging Man',
    'CDLHIGHWAVE': 'High Wave',
    'CDLHIKKAKE': 'Hikkake',
    'CDLHIKKAKEMOD': 'Modified Hikkake',
    'CDLHOMINGPIGEON': 'Homing Pigeon',
    'CDLIDENTICAL3CROWS': 'Identical Three Crows',
    'CDLINNECK': 'In-Neck',
    'CDLKICKING': 'Kicking',
    'CDLKICKINGBYLENGTH': 'Kicking by Length',
    'CDLLADDERBOTTOM': 'Ladder Bottom',
    'CDLLONGLEGGEDDOJI': 'Long-Legged Doji',
    'CDLLONGLINE': 'Long Line',
    'CDLMATCHINGLOW': 'Matching Low',
    'CDLMATHOLD': 'Mat Hold',
    'CDLMORNINGDOJISTAR': 'Morning Doji Star',
    'CDLONNECK': 'On-Neck',
    'CDLRISEFALL3METHODS': 'Rising/Falling Three Methods',
    'CDLSEPARATINGLINES': 'Separating Lines',
    'CDLSHORTLINE': 'Short Line',
    'CDLSTALLEDPATTERN': 'Stalled Pattern',
    'CDLSTICKSANDWICH': 'Stick Sandwich',
    'CDLTAKURI': 'Takuri',
    'CDLTASUKIGAP': 'Tasuki Gap',
    'CDLTHRUSTING': 'Thrusting',
    'CDLTRISTAR': 'Tristar',
    'CDLUNIQUE3RIVER': 'Unique Three River',
    'CDLUPSIDEGAP2CROWS': 'Upside Gap Two Crows',
    'CDLXSIDEGAP3METHODS': 'Upside/Downside Gap Three Methods',
}


def load_ohlcv(input_path: str) -> pd.DataFrame:
    """
    Load OHLCV data from CSV file.

    Expected columns: time, open, high, low, close[, volume]
    """
    logger.info(f"Loading OHLCV data from {input_path}")
    df = pd.read_csv(input_path)

    required_cols = ['time', 'open', 'high', 'low', 'close']
    missing = [col for col in required_cols if col not in df.columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")

    logger.info(f"Loaded {len(df)} candles")
    return df


def detect_talib_patterns(df: pd.DataFrame, patterns: List[str] = None) -> pd.DataFrame:
    """
    Run TA-Lib CDL* functions on OHLCV data.

    Args:
        df: DataFrame with open, high, low, close columns
        patterns: List of pattern names to detect (default: all)

    Returns:
        DataFrame with pattern detection results
    """
    if patterns is None:
        patterns = list(TALIB_PATTERNS.keys())

    results = df[['time', 'open', 'high', 'low', 'close']].copy()

    # TA-Lib only accepts float64 arrays; integer-priced CSVs would otherwise
    # make every call fail with "input array type is not double".
    open_prices = df['open'].to_numpy(dtype=float)
    high_prices = df['high'].to_numpy(dtype=float)
    low_prices = df['low'].to_numpy(dtype=float)
    close_prices = df['close'].to_numpy(dtype=float)

    logger.info(f"Running {len(patterns)} TA-Lib pattern detection functions...")
    for pattern_func in patterns:
        if not hasattr(talib, pattern_func):
            logger.warning(f"Unknown TA-Lib function: {pattern_func}")
            continue
        try:
            func = getattr(talib, pattern_func)
            pattern_values = func(open_prices, high_prices, low_prices, close_prices)
            results[pattern_func] = pattern_values
        except Exception as e:
            logger.error(f"Error running {pattern_func}: {e}")
            results[pattern_func] = 0

    return results


def create_span_annotations(df: pd.DataFrame, min_confidence: int = 100) -> List[Dict[str, Any]]:
    """
    Convert TA-Lib pattern detection results to span annotations.

    TA-Lib typically returns:
    - 0: No pattern
    - 100: Bullish pattern
    - -100: Bearish pattern

    Some functions report other magnitudes (for example, CDLHIKKAKE uses
    +/-200 for a confirmed pattern), so the stored confidence is capped at 1.0.

    Args:
        df: DataFrame with pattern detection columns
        min_confidence: Minimum absolute signal value (100 = full-strength signals only)

    Returns:
        List of annotation dicts
    """
    annotations = []
    pattern_cols = [col for col in df.columns if col.startswith('CDL')]
    last_pos = len(df) - 1

    # Track row positions explicitly: iterrows() yields index labels, which
    # match positions only when the DataFrame has a default RangeIndex.
    for pos, (_, row) in enumerate(df.iterrows()):
        for pattern_col in pattern_cols:
            pattern_value = row[pattern_col]

            # Skip if no pattern detected or the signal is too weak
            if pattern_value == 0 or abs(pattern_value) < min_confidence:
                continue

            # Determine label
            friendly_name = TALIB_PATTERNS.get(pattern_col, pattern_col)
            if pattern_value > 0:
                label = f"Bullish {friendly_name}"
            else:
                label = f"Bearish {friendly_name}"

            # Create annotation
            # TA-Lib patterns are single-candle or small multi-candle patterns
            # We'll use a 3-candle span centered on the detection point
            start_pos = max(0, pos - 1)
            end_pos = min(last_pos, pos + 1)
            start_time = int(df.iloc[start_pos]['time'])
            end_time = int(df.iloc[end_pos]['time'])

            annotation = {
                'start_time': start_time,
                'end_time': end_time,
                'label': label,
                # Normalize to 0-1, capping signals stronger than 100
                'confidence': float(min(abs(pattern_value), 100)) / 100.0,
                'source': 'programmatic',
                'notes': f'TA-Lib {pattern_col} detection',
            }
            annotations.append(annotation)

    return annotations


def save_annotations(annotations: List[Dict[str, Any]], output_path: str):
    """
    Save annotations to JSON file in format compatible with the UI.
    """
    output_data = {
        'annotations': annotations,
        'metadata': {
            'source': 'talib',
            'count': len(annotations),
        }
    }

    output_file = Path(output_path)
    output_file.parent.mkdir(parents=True, exist_ok=True)
    with open(output_file, 'w') as f:
        json.dump(output_data, f, indent=2)

    logger.info(f"Saved {len(annotations)} annotations to {output_path}")


def main():
    parser = argparse.ArgumentParser(description='Generate span annotations from TA-Lib patterns')
    parser.add_argument('--input', '-i', required=True, help='Input OHLCV CSV file')
    parser.add_argument('--output', '-o', required=True, help='Output JSON file')
    parser.add_argument('--min-confidence', type=int, default=100,
                        help='Minimum confidence (0-100, default: 100 = perfect matches only)')
    parser.add_argument('--patterns', nargs='+',
                        help='Specific patterns to detect (default: all)')
    args = parser.parse_args()

    # Load data
    df = load_ohlcv(args.input)

    # Detect patterns
    patterns = args.patterns if args.patterns else list(TALIB_PATTERNS.keys())
    results_df = detect_talib_patterns(df, patterns)

    # Create annotations
    annotations = create_span_annotations(results_df, min_confidence=args.min_confidence)
    logger.info(f"Found {len(annotations)} pattern annotations")

    # Show label distribution
    from collections import Counter
    label_counts = Counter(ann['label'] for ann in annotations)
    logger.info("Pattern distribution:")
    for label, count in label_counts.most_common():
        logger.info(f"  {label}: {count}")

    # Save
    save_annotations(annotations, args.output)

    logger.info("Next steps:")
    logger.info("1. Review/edit annotations: import them into the UI with the 'import-annotations' npm script")
    logger.info("2. Or use directly for training: Copy to services/ml/data/annotations/export.json")


if __name__ == '__main__':
    main()
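
The span logic can be tried without TA-Lib installed by feeding it a hand-built signal column. The following is a minimal sketch of that idea, not the script's actual function: `spans_from_signals` and its `pattern_name` parameter are hypothetical, and capping confidence at 1.0 is an assumption for signals whose magnitude exceeds 100.

```python
from typing import Any, Dict, List, Sequence


def spans_from_signals(
    times: Sequence[int],
    signals: Sequence[int],
    pattern_name: str = "Hammer",  # hypothetical friendly name, for illustration only
    min_confidence: int = 100,
) -> List[Dict[str, Any]]:
    """Turn one CDL*-style signal column into 3-candle span annotations."""
    annotations = []
    last = len(times) - 1
    for pos, value in enumerate(signals):
        # Zero means no detection; weaker signals are filtered out.
        if value == 0 or abs(value) < min_confidence:
            continue
        direction = "Bullish" if value > 0 else "Bearish"
        annotations.append({
            # One candle either side of the detection, clipped at the data edges.
            "start_time": int(times[max(0, pos - 1)]),
            "end_time": int(times[min(last, pos + 1)]),
            "label": f"{direction} {pattern_name}",
            # Assumption: cap at 1.0 when a signal is stronger than 100.
            "confidence": min(abs(value), 100) / 100.0,
        })
    return annotations


# Four candles, a bullish signal on the second and a bearish one on the last.
example = spans_from_signals([10, 20, 30, 40], [0, 100, 0, -100])
```

The last detection shows the edge handling: its span ends on the final candle instead of running past the data.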

@ -0,0 +1,6 @@
artifact_location: /home/homoludens/projekti/bitcon/candle_annotator/services/ml/mlruns/artifacts/0
creation_time: 1771178634585
experiment_id: '0'
last_update_time: 1771178634585
lifecycle_stage: active
name: Default

3978
services/ml/uv.lock generated Normal file

File diff suppressed because it is too large.