Deployment Guide
Prerequisites
- Node.js 18.x or higher
- npm 9.x or higher
- Python and build tools (for native module compilation)
Local Development Setup
1. Install Dependencies
npm install
Note: The better-sqlite3 package requires native compilation. If you encounter build errors, ensure you have the necessary build tools:
Linux:
sudo apt-get install build-essential python3
macOS:
xcode-select --install
Windows:
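npm install --global windows-build-tools
Note: The windows-build-tools package is deprecated. Recent Node.js installers for Windows can install the required build tools during setup.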
2. Database Setup
The SQLite database will be automatically created when you start the application. The database file is located at:
./data/candles.db
To run migrations manually:
npx drizzle-kit generate
npx drizzle-kit migrate
3. Start Development Server
npm run dev
The application will be available at: http://localhost:3000
4. Verify Setup
- Open the application in your browser
- Upload a sample CSV file with OHLC data (columns: time, open, high, low, close)
- Verify the candlestick chart renders correctly
- Test annotation tools (Break Up, Break Down, Draw Line, Delete)
- Export annotations as CSV
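If no real market data is at hand for the verification step, a synthetic file can stand in. The following is a minimal sketch that produces random-walk OHLC rows with the columns listed above. The function name make_sample_csv and all price parameters are hypothetical and not part of the application.

```python
import csv
import io
import random


def make_sample_csv(rows=100, start_time=1700000000, step=60,
                    start_price=1.05, seed=42):
    """Return CSV text with time,open,high,low,close columns (synthetic data)."""
    rng = random.Random(seed)  # fixed seed keeps the output reproducible
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["time", "open", "high", "low", "close"])
    price = start_price
    for i in range(rows):
        open_ = price
        close = open_ + rng.uniform(-0.002, 0.002)
        # High and low must bracket both open and close.
        high = max(open_, close) + rng.uniform(0, 0.001)
        low = min(open_, close) - rng.uniform(0, 0.001)
        prices = [f"{v:.4f}" for v in (open_, high, low, close)]
        writer.writerow([start_time + i * step] + prices)
        price = close  # the next candle opens at this close
    return buf.getvalue()
```

Writing the returned text to a file (for example sample.csv) gives something to upload through the UI.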
CSV File Format
The application expects CSV files with the following format:
time,open,high,low,close
1700000000,1.0500,1.0520,1.0490,1.0510
1700000060,1.0510,1.0530,1.0505,1.0525
Time column formats:
- Unix timestamp (seconds): 1700000000
- Date string: 2024-01-15
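A file can be checked against this format before upload. The sketch below is an illustration only, not the application's own parser: load_candles and parse_time are hypothetical names, and treating date strings as midnight UTC is an assumption, since the guide does not say how the application interprets them.

```python
import csv
import io
from datetime import datetime, timezone

REQUIRED = ["time", "open", "high", "low", "close"]


def parse_time(value):
    """Accept Unix seconds ("1700000000") or a date string ("2024-01-15")."""
    value = value.strip()
    if value.isdigit():
        return int(value)
    # Assumption: date strings are interpreted as midnight UTC.
    dt = datetime.strptime(value, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


def load_candles(text):
    """Parse CSV text into candle dicts, raising ValueError on bad input."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    candles = []
    for line_no, row in enumerate(reader, start=2):
        candle = {"time": parse_time(row["time"])}
        for key in REQUIRED[1:]:
            candle[key] = float(row[key])
        body_high = max(candle["open"], candle["close"])
        body_low = min(candle["open"], candle["close"])
        if candle["high"] < body_high or candle["low"] > body_low:
            raise ValueError(f"line {line_no}: high/low do not bracket open/close")
        candles.append(candle)
    return candles
```

Calling load_candles on the contents of a file either returns the parsed rows or raises a ValueError that names the problem.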
Building for Production
npm run build
Note: Production builds with better-sqlite3 require the native module to be compiled for the target platform.
Running Production Build
npm run build
npm start
The production server will run on port 3000 by default.
Troubleshooting
better-sqlite3 Build Issues
If you encounter errors related to better-sqlite3 not finding bindings:
- Rebuild the module:
npm rebuild better-sqlite3
- If that fails, reinstall:
npm uninstall better-sqlite3
npm install better-sqlite3
- For development, you can use npm run dev, which handles the module better than production builds.
Database Issues
If the database becomes corrupted or you want to start fresh:
- Stop the application
- Delete the database file:
rm -f data/candles.db
- Restart the application (it will recreate the database)
Port Already in Use
If port 3000 is already in use, you can specify a different port:
PORT=3001 npm run dev
Environment Variables
The application doesn't require any environment variables for local development. Configuration defaults are hardcoded for simplicity; PORT can optionally be overridden as shown above.
File Structure
.
├── src/
│ ├── app/ # Next.js app router
│ │ ├── api/ # API routes
│ │ ├── layout.tsx # Root layout
│ │ └── page.tsx # Main page
│ ├── components/ # React components
│ │ ├── CandleChart.tsx
│ │ ├── SvgOverlay.tsx
│ │ ├── Toolbox.tsx
│ │ └── FileUpload.tsx
│ └── lib/ # Utilities
│ └── db/ # Database configuration
├── data/ # SQLite database directory
├── drizzle/ # Database migrations
└── public/ # Static assets
ML Service Setup (Optional)
The Candle Annotator includes an optional Python ML service for pattern recognition and prediction.
Prerequisites for ML Service
- Python 3.11+
- TA-Lib C library
- PostgreSQL 16
Local ML Service Setup
1. Install TA-Lib C Library
Linux (Debian/Ubuntu):
sudo apt-get update
sudo apt-get install libta-lib-dev
macOS:
brew install ta-lib
From Source:
wget http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz
tar -xzf ta-lib-0.4.0-src.tar.gz
cd ta-lib/
./configure --prefix=/usr
make
sudo make install
2. Install Python Dependencies
cd services/ml
uv sync
# Alternative using pip: pip install -r requirements.txt
3. Setup PostgreSQL
The ML service requires PostgreSQL for storing training run metadata:
# Create database
createdb ml_db
# Or using psql
psql -c "CREATE DATABASE ml_db;"
4. Initialize DVC
DVC is used for dataset versioning:
cd services/ml
dvc init # add --subdir if services/ml is not the root of the Git repository
dvc remote add -d local /path/to/dvc-storage
5. Run MLflow Tracking Server
MLflow tracks experiments and stores models:
mlflow server \
--backend-store-uri ./mlruns \
--default-artifact-root ./mlruns/artifacts \
--host 0.0.0.0 \
--port 5000
6. Configure Pipeline
Edit services/ml/config/pipeline.yaml to configure:
- Feature engineering settings
- Model hyperparameters
- Data paths
- MLflow experiment name
7. Start ML Service
cd services/ml
uvicorn app.main:app --host 0.0.0.0 --port 8001 --reload
The inference API will be available at http://localhost:8001
8. Configure Next.js App
Create .env.local in the project root:
INFERENCE_API_URL=http://localhost:8001
INFERENCE_API_TIMEOUT=30000
INFERENCE_BATCH_TIMEOUT=120000
NEXT_PUBLIC_PREDICTIONS_ENABLED=true
Running the ML Pipeline
The ML pipeline consists of:
- Feature Engineering - Extract TA-Lib indicators from OHLCV data
- Annotation Ingestion - Convert span annotations to labeled datasets
- Training - Train models with MLflow tracking
- Inference - Serve predictions via FastAPI
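To make the Annotation Ingestion stage more concrete, the sketch below shows one way span annotations could be turned into per-candle labels. It is not the pipeline's actual code. The (start_time, end_time, label) tuple shape, the inclusive bounds, the "none" default, and the label names are all assumptions for illustration.

```python
def label_candles(candle_times, spans, default="none"):
    """Assign each candle timestamp the label of the span that covers it.

    `spans` is a list of (start_time, end_time, label) tuples with inclusive
    bounds; this shape is an assumption for illustration only.
    """
    labels = []
    for t in candle_times:
        label = default
        for start, end, name in spans:
            if start <= t <= end:
                label = name  # later spans win if spans overlap
        labels.append(label)
    return labels
```

For example, with one-minute candles at times 0 to 240 and spans (60, 120, "hammer") and (240, 300, "doji"), the candles at 60 and 120 are labeled "hammer", the candle at 240 is labeled "doji", and the rest keep the default.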
Train a Model
cd services/ml
python pipeline.py --config config/pipeline.yaml
This will:
- Load raw OHLCV data from data/raw/
- Compute features and save to data/enriched/
- Load annotations and create labeled dataset in data/labeled/
- Train the model with MLflow tracking
- Save model artifacts
Run Individual Stages
# Feature engineering only
python pipeline.py --config config/pipeline.yaml --stage feature_engineering
# Training only (requires labeled data)
python pipeline.py --config config/pipeline.yaml --stage training
View Experiments
Open MLflow UI at http://localhost:5000
Test Inference API
# Check health
curl http://localhost:8001/health
# Get model info
curl http://localhost:8001/model/info
# Predict (requires candles JSON)
curl -X POST http://localhost:8001/predict \
-H "Content-Type: application/json" \
-d '{"candles": [...]}'
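The same predict call can be made from Python using only the standard library. This is a sketch under assumptions: the guide does not document the fields of each candle object or the response schema. A caller would have to choose the candle keys (time, open, high, low, close, mirroring the CSV columns, would be a guess), and the response is returned as undecoded JSON data without interpretation. The helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8001"  # ML service address from the setup steps


def build_predict_request(candles, base_url=BASE_URL):
    """Build (but do not send) a POST /predict request with a JSON body."""
    body = json.dumps({"candles": candles}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(candles, timeout=30):
    """Send the request and return the decoded JSON response as-is."""
    request = build_predict_request(candles)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Separating request construction from sending makes it possible to inspect the payload without a running service.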
Docker Deployment
Prerequisites
- Docker (20.10+)
- docker-compose (2.0+)
Build and Run with Docker Compose
The easiest way to deploy is with docker-compose:
docker-compose up --build
This will:
- Build the Next.js app and ML service Docker images
- Start PostgreSQL for ML service metadata
- Start MLflow tracking server
- Start the ML inference service (FastAPI)
- Start the Next.js web application
- Create named volumes for persistent storage:
  - candle-data - SQLite database for annotations
  - ml-data - OHLCV data, features, labeled datasets
  - mlflow-data - MLflow experiments and model artifacts
  - postgres-data - PostgreSQL data
- Enable automatic restart unless stopped
Services will be available at:
- Web UI: http://localhost:3000
- ML Inference API: http://localhost:8001
- MLflow UI: http://localhost:5000
- PostgreSQL: localhost:5432
Running in Detached Mode
docker-compose up -d --build
View logs:
docker-compose logs -f candle-annotator
Stop the service:
docker-compose down
Manual Docker Build and Run
If you prefer to build and run manually:
# Build image
docker build -t candle-annotator .
# Run container
docker run -d \
-p 3000:3000 \
-v candle-data:/app/data \
--restart unless-stopped \
candle-annotator
Environment Configuration
Create a .env file in the project root based on .env.example:
cp .env.example .env
Edit .env to customize:
NODE_ENV=production
PORT=3000
DATABASE_PATH=/app/data/candles.db
Pass environment variables to docker-compose:
docker-compose --env-file .env up -d
Data Persistence
The application stores the SQLite database in a Docker named volume candle-data. This ensures data persists across container restarts:
# View volumes
docker volume ls | grep candle
# Backup database
docker cp candle-annotator:/app/data/candles.db ./backup.db
# Restore database
docker cp ./backup.db candle-annotator:/app/data/candles.db
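Copying a SQLite file while the application is writing to it can yield an inconsistent copy. A safer approach is SQLite's online backup API, which Python exposes as Connection.backup. The sketch below assumes the script runs somewhere the database file is directly readable (for example against ./data/candles.db in local development, or inside the container). The function name backup_sqlite is hypothetical.

```python
import sqlite3


def backup_sqlite(src_path, dest_path):
    """Copy a SQLite database using the online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # Takes a consistent snapshot even if other connections are writing.
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```

A call such as backup_sqlite("data/candles.db", "backup.db") writes the snapshot to backup.db. Note that sqlite3.connect creates an empty database if the source path does not exist, so the path should be checked first.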
Accessing the Application
Once running, access the application at:
http://localhost:3000
Health check endpoint:
curl http://localhost:3000/api/health
With database check:
curl "http://localhost:3000/api/health?check=db"
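In scripts it is often useful to wait for the application to come up before continuing. The following sketch polls the health endpoint with the standard library. It assumes the endpoint answers with HTTP 200 when healthy, which the guide does not state explicitly. The function names and retry parameters are hypothetical.

```python
import time
import urllib.error
import urllib.request


def http_ok(url, timeout=3):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, or HTTP error status


def wait_until_healthy(url, attempts=10, delay=2.0,
                       check=http_ok, sleep=time.sleep):
    """Poll `check(url)` until it succeeds or attempts run out.

    Returns the number of tries it took, or None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        if check(url):
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

The check and sleep parameters exist so the polling logic can be exercised without a running server.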
Port Mapping
To run on a different port (e.g., 8080), modify docker-compose.yml:
services:
  candle-annotator:
    ports:
      - "8080:3000"
Or use environment variable in docker-compose:
services:
  candle-annotator:
    ports:
      - "${HOST_PORT:-3000}:3000"
Then run:
HOST_PORT=8080 docker-compose up -d
Container Health Checks
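Docker checks container health every 30 seconds using the /api/health endpoint. A check counts as failed if it takes longer than 3 seconds to respond, and the container is marked unhealthy after 3 consecutive failed checks. Docker's restart policy by itself acts on container exit rather than on health status, so restarting an unhealthy container may require additional tooling.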
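The thresholds described above (3-second timeout, 3 consecutive failures) can be modeled as a small state function. This is a simplified sketch for reasoning about the behavior, not Docker's implementation. In particular it starts from "healthy" and omits Docker's initial "starting" state. The function name is hypothetical.

```python
def health_status(check_durations, timeout=3.0, retries=3):
    """Return "healthy" or "unhealthy" after replaying a series of checks.

    `check_durations` holds the response time in seconds of each check, or
    None when the check failed outright.
    """
    consecutive_failures = 0
    status = "healthy"
    for duration in check_durations:
        failed = duration is None or duration > timeout
        if failed:
            consecutive_failures += 1
            if consecutive_failures >= retries:
                status = "unhealthy"
        else:
            consecutive_failures = 0
            status = "healthy"  # one success resets the failure streak
    return status
```

Two failures followed by a success leave the container healthy, while a failure, a 3.5-second response, and another failure in a row mark it unhealthy.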
View health status:
docker ps
Look for the STATUS column - it should show healthy.
Troubleshooting
Port already in use:
docker-compose down # Stop any existing containers
HOST_PORT=8080 docker-compose up -d # requires the ${HOST_PORT:-3000} mapping described under Port Mapping
Database permission errors:
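# Recreate the volume so it gets fresh permissions
# WARNING: removing the volume deletes the annotation database; back it up first
docker-compose down
docker volume rm candle-data # Compose may prefix the name with the project name; check docker volume ls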
docker-compose up --build
Rebuild without cache:
docker-compose build --no-cache
docker-compose up -d
View container logs:
docker-compose logs -f --tail=100
Update Procedure
To update the application:
git pull origin master
docker-compose down
docker-compose up --build -d
Or with no-cache rebuild:
git pull
docker-compose down
docker-compose build --no-cache
docker-compose up -d
Production Deployment
For production deployments, consider:
- Use a container registry (Docker Hub, ECR, GCR):
docker tag candle-annotator myregistry/candle-annotator:v1.0.0
docker push myregistry/candle-annotator:v1.0.0
- Run on a remote server (AWS, DigitalOcean, etc.):
# SSH into server, clone repo, then:
docker-compose up -d
- Add reverse proxy (nginx, traefik) for HTTPS:
# docker-compose.yml
services:
  nginx:
    image: nginx:alpine
    ports:
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
- Enable Docker logging for production monitoring:
docker-compose logs -f --tail=1000 > app.log &
Notes
- This application is designed for single-user local use only
- There is no authentication or user management
- The SQLite database is stored locally and not intended for concurrent access
- For production multi-user deployments, consider migrating to PostgreSQL or similar
- Docker deployment provides lightweight containerization ideal for standalone instances
- The multi-stage Dockerfile keeps image size minimal (~100MB)