Deployment Guide

Prerequisites

  • Node.js 18.x or higher
  • npm 9.x or higher
  • Python and build tools (for native module compilation)

Local Development Setup

1. Install Dependencies

npm install

Note: The better-sqlite3 package requires native compilation. If you encounter build errors, ensure you have the necessary build tools:

Linux:

sudo apt-get install build-essential python3

macOS:

xcode-select --install

Windows:

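npm install --global windows-build-tools

Note: windows-build-tools is deprecated. On current Node.js versions, select "Tools for Native Modules" in the Node.js installer, or install the Visual Studio Build Tools with the "Desktop development with C++" workload.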

2. Database Setup

The SQLite database will be automatically created when you start the application. The database file is located at:

./data/candles.db

To run migrations manually:

npx drizzle-kit generate
npx drizzle-kit migrate

3. Start Development Server

npm run dev

The application will be available at http://localhost:3000 (port 3000 is the default).

4. Verify Setup

  1. Open the application in your browser
  2. Upload a sample CSV file with OHLC data (columns: time, open, high, low, close)
  3. Verify the candlestick chart renders correctly
  4. Test annotation tools (Break Up, Break Down, Draw Line, Delete)
  5. Export annotations as CSV

CSV File Format

The application expects CSV files with the following format:

time,open,high,low,close
1700000000,1.0500,1.0520,1.0490,1.0510
1700000060,1.0510,1.0530,1.0505,1.0525

Time column formats:

  • Unix timestamp (seconds): 1700000000
  • Date string: 2024-01-15
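Before uploading, it can help to sanity-check a file against this format. The following is a minimal sketch and not part of the application: validate_ohlc_csv is a hypothetical helper that checks the header, the field count, and that each high is not below its low. It assumes Unix line endings and does not validate the time column.

```shell
# Hypothetical helper: sanity-check an OHLC CSV before uploading it.
# Prints the first problem found and returns non-zero; returns 0 if none.
validate_ohlc_csv() {
  awk -F',' '
    NR == 1 {
      if ($0 != "time,open,high,low,close") { print "bad header: " $0; exit 1 }
      next
    }
    NF != 5 { print "line " NR ": expected 5 fields, got " NF; exit 1 }
    ($3 + 0) < ($4 + 0) { print "line " NR ": high is below low"; exit 1 }
  ' "$1"
}
```

Usage: validate_ohlc_csv sample.csv && echo "format looks valid"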

Building for Production

npm run build

Note: Production builds with better-sqlite3 require the native module to be compiled for the target platform.

Running Production Build

npm run build
npm start

The production server will run on port 3000 by default.

Troubleshooting

better-sqlite3 Build Issues

If you encounter errors related to better-sqlite3 not finding bindings:

  1. Rebuild the module:

    npm rebuild better-sqlite3
    
  2. If that fails, reinstall:

    npm uninstall better-sqlite3
    npm install better-sqlite3
    
  3. For development, you can use npm run dev, which handles the module better than production builds.

Database Issues

If the database becomes corrupted or you want to start fresh:

  1. Stop the application
  2. Delete the database file:
    rm -f data/candles.db
    
  3. Restart the application (it will recreate the database)

Port Already in Use

If port 3000 is already in use, you can specify a different port:

PORT=3001 npm run dev
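To avoid guessing which port is free, the check can be scripted. This is a rough sketch under stated assumptions: port_in_use and start_dev_on_free_port are hypothetical helpers, the /dev/tcp redirection is a bash feature (not POSIX sh), and only listeners reachable on 127.0.0.1 are detected.

```shell
# Hypothetical helper: succeed if something accepts TCP connections on the
# given local port. Uses the /dev/tcp redirection provided by bash.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Hypothetical helper: start the dev server on the first free port
# in the range 3000..3010.
start_dev_on_free_port() {
  local port
  for port in $(seq 3000 3010); do
    if ! port_in_use "$port"; then
      PORT="$port" npm run dev
      return
    fi
  done
  echo "no free port between 3000 and 3010" >&2
  return 1
}
```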

Environment Variables

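The core application doesn't require any environment variables for local development; its defaults are hardcoded for simplicity. The optional ML integration and the Docker deployment use the variables described in their own sections below.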

File Structure

.
├── src/
│   ├── app/              # Next.js app router
│   │   ├── api/          # API routes
│   │   ├── layout.tsx    # Root layout
│   │   └── page.tsx      # Main page
│   ├── components/       # React components
│   │   ├── CandleChart.tsx
│   │   ├── SvgOverlay.tsx
│   │   ├── Toolbox.tsx
│   │   └── FileUpload.tsx
│   └── lib/              # Utilities
│       └── db/           # Database configuration
├── data/                 # SQLite database directory
├── drizzle/              # Database migrations
└── public/               # Static assets

ML Service Setup (Optional)

The Candle Annotator includes an optional Python ML service for pattern recognition and prediction.

Prerequisites for ML Service

  • Python 3.11+
  • TA-Lib C library
  • PostgreSQL 16

Local ML Service Setup

1. Install TA-Lib C Library

Linux (Debian/Ubuntu):

sudo apt-get update
sudo apt-get install libta-lib-dev

macOS:

brew install ta-lib

From Source:

wget http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz
tar -xzf ta-lib-0.4.0-src.tar.gz
cd ta-lib/
./configure --prefix=/usr
make
sudo make install

2. Install Python Dependencies

cd services/ml
uv sync
# Alternative without uv:
# pip install -r requirements.txt

3. Setup PostgreSQL

The ML service requires PostgreSQL for storing training run metadata:

# Create database
createdb ml_db

# Or using psql
psql -c "CREATE DATABASE ml_db;"

4. Initialize DVC

DVC is used for dataset versioning:

cd services/ml
dvc init  # add --subdir if services/ml is not the root of the Git repository
dvc remote add -d local /path/to/dvc-storage

5. Run MLflow Tracking Server

MLflow tracks experiments and stores models:

mlflow server \
  --backend-store-uri ./mlruns \
  --default-artifact-root ./mlruns/artifacts \
  --host 0.0.0.0 \
  --port 5000

6. Configure Pipeline

Edit services/ml/config/pipeline.yaml to configure:

  • Feature engineering settings
  • Model hyperparameters
  • Data paths
  • MLflow experiment name

7. Start ML Service

cd services/ml
uvicorn app.main:app --host 0.0.0.0 --port 8001 --reload

The inference API will be available at http://localhost:8001

8. Configure Next.js App

Create .env.local in the project root:

INFERENCE_API_URL=http://localhost:8001
INFERENCE_API_TIMEOUT=30000
INFERENCE_BATCH_TIMEOUT=120000
NEXT_PUBLIC_PREDICTIONS_ENABLED=true
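A quick way to confirm that .env.local contains all four variables is sketched below. check_env_file is a hypothetical helper, not part of the project; it only checks that each key is present, not that its value is sensible.

```shell
# Hypothetical helper: report which ML-related variables are missing from an
# env file. Returns non-zero if any key is absent.
check_env_file() {
  local file="$1" key missing=0
  for key in INFERENCE_API_URL INFERENCE_API_TIMEOUT \
             INFERENCE_BATCH_TIMEOUT NEXT_PUBLIC_PREDICTIONS_ENABLED; do
    # Match the key only at the start of a line, followed by "=".
    if ! grep -q "^${key}=" "$file"; then
      echo "missing: ${key}"
      missing=1
    fi
  done
  return "$missing"
}
```

Usage: check_env_file .env.local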

Running the ML Pipeline

The ML pipeline consists of:

  1. Feature Engineering - Extract TA-Lib indicators from OHLCV data
  2. Annotation Ingestion - Convert span annotations to labeled datasets
  3. Training - Train models with MLflow tracking
  4. Inference - Serve predictions via FastAPI

Train a Model

cd services/ml
python pipeline.py --config config/pipeline.yaml

This will:

  • Load raw OHLCV data from data/raw/
  • Compute features and save to data/enriched/
  • Load annotations and create labeled dataset in data/labeled/
  • Train the model with MLflow tracking
  • Save model artifacts

Run Individual Stages

# Feature engineering only
python pipeline.py --config config/pipeline.yaml --stage feature_engineering

# Training only (requires labeled data)
python pipeline.py --config config/pipeline.yaml --stage training

View Experiments

Open MLflow UI at http://localhost:5000

Test Inference API

# Check health
curl http://localhost:8001/health

# Get model info
curl http://localhost:8001/model/info

# Predict (requires candles JSON)
curl -X POST http://localhost:8001/predict \
  -H "Content-Type: application/json" \
  -d '{"candles": [...]}'
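The request body can be generated from a CSV file in the format described earlier. This is only a sketch: the guide specifies the top-level "candles" key but not the shape of each candle, so the field names below (time, open, high, low, close) are an assumption that mirrors the CSV columns, and csv_to_candles_json is a hypothetical helper. It also assumes Unix-second timestamps, since date strings would need quoting.

```shell
# Hypothetical helper: convert an OHLC CSV into a {"candles": [...]} payload.
# Assumes numeric values in every column (Unix-second timestamps).
csv_to_candles_json() {
  awk -F',' '
    NR == 1 { next }   # skip the header row
    {
      row = sprintf("{\"time\":%s,\"open\":%s,\"high\":%s,\"low\":%s,\"close\":%s}", $1, $2, $3, $4, $5)
      out = (out == "" ? row : out "," row)
    }
    END { printf "{\"candles\":[%s]}\n", out }
  ' "$1"
}
```

Usage: csv_to_candles_json sample.csv | curl -X POST http://localhost:8001/predict -H "Content-Type: application/json" -d @-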

Docker Deployment

Prerequisites

  • Docker (20.10+)
  • Docker Compose (2.0+); the docker compose and docker-compose spellings are used interchangeably below

Build and Run with Docker Compose

The easiest way to deploy is with docker-compose:

docker compose up --build

This will:

  1. Build the Next.js app and ML service Docker images
  2. Start PostgreSQL for ML service metadata
  3. Start MLflow tracking server
  4. Start the ML inference service (FastAPI)
  5. Start the Next.js web application
  6. Create named volumes for persistent storage:
    • candle-data - SQLite database for annotations
    • ml-data - OHLCV data, features, labeled datasets
    • mlflow-data - MLflow experiments and model artifacts
    • postgres-data - PostgreSQL data
  7. Enable automatic restart unless stopped

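Services will be available at the following addresses (assuming docker-compose.yml publishes the same ports used in the local setup above):

  • Web application: http://localhost:3000
  • ML inference API: http://localhost:8001
  • MLflow UI: http://localhost:5000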

Running in Detached Mode

docker-compose up -d --build

View logs:

docker-compose logs -f candle-annotator

Stop the service:

docker-compose down

Manual Docker Build and Run

If you prefer to build and run manually:

# Build image
docker build -t candle-annotator .

# Run container
docker run -d \
  -p 3000:3000 \
  -v candle-data:/app/data \
  --restart unless-stopped \
  candle-annotator

Environment Configuration

Create a .env file in the project root based on .env.example:

cp .env.example .env

Edit .env to customize:

NODE_ENV=production
PORT=3000
DATABASE_PATH=/app/data/candles.db

Pass environment variables to docker-compose:

docker-compose --env-file .env up -d

Data Persistence

The application stores the SQLite database in a Docker named volume candle-data. This ensures data persists across container restarts:

# View volumes
docker volume ls | grep candle

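# Backup database (docker compose cp addresses the container by service name)
docker compose cp candle-annotator:/app/data/candles.db ./backup.db

# Restore database (restart the service afterwards so it reopens the file)
docker compose cp ./backup.db candle-annotator:/app/data/candles.db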

Accessing the Application

Once running, access the application at:

http://localhost:3000

Health check endpoint:

curl http://localhost:3000/api/health

With database check:

curl "http://localhost:3000/api/health?check=db"
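In scripts, the health endpoint can be polled until the application is ready. A minimal sketch, assuming curl is installed on the host; wait_for_health is a hypothetical helper and the attempt count and delay are arbitrary defaults.

```shell
# Hypothetical helper: poll a URL until it answers without an HTTP error.
# Arguments: URL, number of attempts (default 30), delay in seconds (default 2).
wait_for_health() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i
  for i in $(seq 1 "$attempts"); do
    # -f makes curl exit non-zero on HTTP status 400 and above;
    # -s silences progress output; --max-time bounds each request.
    if curl -fs --max-time 3 -o /dev/null "$url"; then
      return 0
    fi
    sleep "$delay"
  done
  echo "gave up waiting for $url" >&2
  return 1
}
```

Usage: wait_for_health "http://localhost:3000/api/health?check=db" && echo "application is up"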

Port Mapping

To run on a different port (e.g., 8080), modify docker-compose.yml:

services:
  candle-annotator:
    ports:
      - "8080:3000"

Or use environment variable in docker-compose:

services:
  candle-annotator:
    ports:
      - "${HOST_PORT:-3000}:3000"

Then run:

HOST_PORT=8080 docker-compose up -d

Container Health Checks

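Docker checks container health every 30 seconds using the /api/health endpoint. A check fails if the endpoint returns an error or takes longer than 3 seconds to respond, and the container is marked unhealthy after 3 consecutive failures.

Note: Docker and Docker Compose report the unhealthy status but do not restart an unhealthy container by themselves; the restart policy applies only when the container process exits.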

View health status:

docker ps

Look for the STATUS column - it should show healthy.
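The health status can also be read directly, which is easier to use in scripts than the docker ps table. A small sketch: container_health is a hypothetical helper, and the DOCKER override exists only so the command can be previewed without running it. The template fails if the container defines no health check.

```shell
# Hypothetical helper: print the health status Docker has recorded for a
# container (starting, healthy, or unhealthy).
# Set DOCKER=echo to preview the command instead of running it.
container_health() {
  "${DOCKER:-docker}" inspect --format '{{.State.Health.Status}}' "$1"
}
```

Usage: container_health "$(docker compose ps -q candle-annotator)"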

Troubleshooting

Port already in use:

docker-compose down  # Stop any existing containers
HOST_PORT=8080 docker-compose up -d  # Requires the ${HOST_PORT:-3000}:3000 mapping from Port Mapping

Database permission errors:

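# Recreate the volume so it is initialized with the correct permissions.
# Warning: this deletes all stored annotations; back up the database first.
docker-compose down
docker volume rm candle-data  # Compose may prefix the name with the project name; check docker volume ls
docker-compose up --build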

Rebuild without cache:

docker-compose build --no-cache
docker-compose up -d

View container logs:

docker-compose logs -f --tail=100

ML service healthcheck failing:

If the candle-annotator service fails to start with error "dependency failed to start: container candle_annotator-ml-service-1 is unhealthy", this is because the ml-service healthcheck requires curl to be installed in the container. This was fixed in commit ecb2385 by adding curl to the ml-service Dockerfile.

If you encounter this issue:

  1. Rebuild the ml-service: docker compose build ml-service
  2. Restart services: docker compose up -d

Migration errors during build:

If you see Drizzle migration errors like "table annotations already exists" during the Docker build process, the database file from a previous build is being included in the build context. This was fixed by skipping migrations during the build phase; migrations now run only at runtime.

If you encounter this issue:

  1. Ensure data/ is in .dockerignore
  2. Rebuild: docker compose build --no-cache candle-annotator
  3. Restart: docker compose up -d
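Step 1 can be made repeatable with a small idempotent helper. This is a sketch only: ensure_dockerignore_entry is a hypothetical name, and it matches the exact line data/, so an equivalent pattern written differently (for example data or /data) is not recognized.

```shell
# Hypothetical helper: append an entry to .dockerignore unless that exact
# line is already present. Arguments: entry, file (default .dockerignore).
ensure_dockerignore_entry() {
  local entry="$1" file="${2:-.dockerignore}"
  touch "$file"
  # -x matches whole lines only, -F treats the entry as a literal string.
  if ! grep -qxF "$entry" "$file"; then
    # Add a newline first if the file does not already end with one.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
      echo >> "$file"
    fi
    echo "$entry" >> "$file"
  fi
}
```

Usage: ensure_dockerignore_entry "data/"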

Update Procedure

To update the application:

git pull origin master
docker-compose down
docker-compose up --build -d

Or with no-cache rebuild:

git pull
docker-compose down
docker-compose build --no-cache
docker-compose up -d

Production Deployment

For production deployments, consider:

  1. Use a container registry (Docker Hub, ECR, GCR):

    docker tag candle-annotator myregistry/candle-annotator:v1.0.0
    docker push myregistry/candle-annotator:v1.0.0
    
  2. Run on a remote server (AWS, DigitalOcean, etc.):

    # SSH into server, clone repo, then:
    docker-compose up -d
    
  3. Add reverse proxy (nginx, traefik) for HTTPS:

    # docker-compose.yml
    services:
      nginx:
        image: nginx:alpine
        ports:
          - "443:443"
        volumes:
          - ./nginx.conf:/etc/nginx/nginx.conf
    
  4. Enable Docker logging for production monitoring:

    docker-compose logs -f --tail=1000 > app.log &
    

Notes

  • This application is designed for single-user local use only
  • There is no authentication or user management
  • The SQLite database is stored locally and not intended for concurrent access
  • For production multi-user deployments, consider migrating to PostgreSQL or similar
  • Docker deployment provides lightweight containerization ideal for standalone instances
  • The multi-stage Dockerfile keeps image size minimal (~100MB)