# Deployment Guide

## Prerequisites

- Node.js 18.x or higher
- npm 9.x or higher
- PostgreSQL 16 or higher

## Local Development Setup

### 1. Install Dependencies

```bash
npm install
```

### 2. Database Setup

#### PostgreSQL Setup

The application uses PostgreSQL for all data storage. Set up the database:

```bash
# Create database
createdb candle_annotator

# Create user (if needed)
createuser -P ml_user
# Enter password: ml_password

# Grant privileges
psql -c "GRANT ALL PRIVILEGES ON DATABASE candle_annotator TO ml_user;"
```

#### Environment Configuration

Create a `.env` file in the project root:

```env
DATABASE_URL=postgresql://ml_user:ml_password@localhost:5432/candle_annotator
NODE_ENV=development
PORT=3000
```

#### Run Migrations

Database migrations run automatically on application startup. To run them manually:

```bash
npx drizzle-kit generate
npx drizzle-kit migrate
```

### 3. Start Development Server

```bash
npm run dev
```

The application will be available at:

- http://localhost:3000

### 4. Verify Setup

1. Open the application in your browser
2. Upload a sample CSV file with OHLC data (columns: time, open, high, low, close)
3. Verify the candlestick chart renders correctly
4. Test annotation tools (Break Up, Break Down, Draw Line, Delete)
5. Export annotations as CSV

## CSV File Format

The application expects CSV files with the following format:

```csv
time,open,high,low,close
1700000000,1.0500,1.0520,1.0490,1.0510
1700000060,1.0510,1.0530,1.0505,1.0525
```

**Time column formats:**

- Unix timestamp (seconds): `1700000000`
- Date string: `2024-01-15`

## Migrating from SQLite (if applicable)

If you have existing data in an SQLite database from a previous version, use the migration script:

```bash
# Run the migration script
npm run migrate:sqlite-to-postgres

# Or with TypeScript directly
npx ts-node scripts/migrate-sqlite-to-postgres.ts
```

This script will:

- Read all data from the SQLite database (`data/candles.db`)
- Convert data types (timestamps, booleans, JSON→jsonb)
- Insert data into PostgreSQL
- Skip work that was already done if run multiple times (idempotent)

## Building for Production

```bash
npm run build
```

## Running Production Build

```bash
npm run build
npm start
```

The production server will run on port 3000 by default.

## Troubleshooting

### Database Connection Issues

If the application fails to connect to PostgreSQL:

1. Verify PostgreSQL is running:

   ```bash
   pg_isready -h localhost -p 5432
   ```

2. Check the `DATABASE_URL` environment variable:

   ```bash
   echo $DATABASE_URL
   ```

3. Verify credentials:

   ```bash
   psql -U ml_user -d candle_annotator
   ```

### Database Issues

If you want to reset the database:

1. Stop the application
2. Drop and recreate the database:

   ```bash
   dropdb candle_annotator
   createdb candle_annotator
   psql -c "GRANT ALL PRIVILEGES ON DATABASE candle_annotator TO ml_user;"
   ```

3. Restart the application (migrations will run automatically)

### Port Already in Use

If port 3000 is already in use, you can specify a different port:

```bash
PORT=3001 npm run dev
```

## Environment Variables

Required environment variables:

- `DATABASE_URL` - PostgreSQL connection string (e.g., `postgresql://ml_user:ml_password@localhost:5432/candle_annotator`)
- `NODE_ENV` - Environment (`development` or `production`)
- `PORT` - Server port (default: 3000)

Optional variables for ML inference:

- `INFERENCE_API_URL` - ML service endpoint (default: `http://localhost:8001`)
- `INFERENCE_API_TIMEOUT` - Request timeout in ms (default: 30000)
- `INFERENCE_BATCH_TIMEOUT` - Batch processing timeout in ms (default: 120000)
- `NEXT_PUBLIC_PREDICTIONS_ENABLED` - Enable predictions UI (default: true)

## File Structure

```
.
├── src/
│   ├── app/              # Next.js app router
│   │   ├── api/          # API routes
│   │   ├── layout.tsx    # Root layout
│   │   └── page.tsx      # Main page
│   ├── components/       # React components
│   │   ├── CandleChart.tsx
│   │   ├── SvgOverlay.tsx
│   │   ├── Toolbox.tsx
│   │   └── FileUpload.tsx
│   └── lib/              # Utilities
│       └── db/           # Database configuration
├── data/                 # SQLite database directory
├── drizzle/              # Database migrations
└── public/               # Static assets
```

## ML Service Setup (Optional)

The Candle Annotator includes an optional Python ML service for pattern recognition and prediction.

### Prerequisites for ML Service

- Python 3.11+
- TA-Lib C library
- PostgreSQL 16

### Local ML Service Setup

#### 1. Install TA-Lib C Library

**Linux (Debian/Ubuntu):**

```bash
sudo apt-get update
sudo apt-get install libta-lib-dev
```

**macOS:**

```bash
brew install ta-lib
```

**From Source:**

```bash
wget http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz
tar -xzf ta-lib-0.4.0-src.tar.gz
cd ta-lib/
./configure --prefix=/usr
make
sudo make install
```

#### 2. Install Python Dependencies

```bash
cd services/ml
uv sync
#pip install -r requirements.txt
```

#### 3. Setup PostgreSQL

The ML service shares the same PostgreSQL database as the frontend (`candle_annotator`). If you have already set up the database in the main setup steps, no further action is needed: the ML service uses the same connection.

#### 4. Initialize DVC

DVC is used for dataset versioning:

```bash
cd services/ml
dvc init #--subdir
dvc remote add -d local /path/to/dvc-storage
```

#### 5. Run MLflow Tracking Server

MLflow tracks experiments and stores models:

```bash
mlflow server \
  --backend-store-uri ./mlruns \
  --default-artifact-root ./mlruns/artifacts \
  --host 0.0.0.0 \
  --port 5000
```

#### 6. Configure Pipeline

Edit `services/ml/config/pipeline.yaml` to configure:

- Feature engineering settings
- Model hyperparameters
- Data paths
- MLflow experiment name

#### 7. Start ML Service

```bash
cd services/ml
uvicorn app.main:app --host 0.0.0.0 --port 8001 --reload
```

The inference API will be available at http://localhost:8001

#### 8. Configure Next.js App

Create `.env.local` in the project root:

```env
INFERENCE_API_URL=http://localhost:8001
INFERENCE_API_TIMEOUT=30000
INFERENCE_BATCH_TIMEOUT=120000
NEXT_PUBLIC_PREDICTIONS_ENABLED=true
```

### Running the ML Pipeline

The ML pipeline consists of:

1. **Feature Engineering** - Extract TA-Lib indicators from OHLCV data
2. **Annotation Ingestion** - Convert span annotations to labeled datasets
3. **Training** - Train models with MLflow tracking
4. **Inference** - Serve predictions via FastAPI

#### Train a Model

```bash
cd services/ml
python pipeline.py --config config/pipeline.yaml
```

This will:

- Load raw OHLCV data from `data/raw/`
- Compute features and save to `data/enriched/`
- Load annotations and create labeled dataset in `data/labeled/`
- Train the model with MLflow tracking
- Save model artifacts

#### Run Individual Stages

```bash
# Feature engineering only
python pipeline.py --config config/pipeline.yaml --stage feature_engineering

# Training only (requires labeled data)
python pipeline.py --config config/pipeline.yaml --stage training
```

#### View Experiments

Open MLflow UI at http://localhost:5000

#### Test Inference API

```bash
# Check health
curl http://localhost:8001/health

# Get model info
curl http://localhost:8001/model/info

# Predict (requires candles JSON)
curl -X POST http://localhost:8001/predict \
  -H "Content-Type: application/json" \
  -d '{"candles": [...]}'
```

## Docker Deployment

### Prerequisites

- Docker (20.10+)
- Docker Compose (2.0+)

### Build and Run with Docker Compose

The easiest way to deploy is with Docker Compose:

```bash
docker compose up --build
```

This will:

1. Build the Next.js app and ML service Docker images
2. Start PostgreSQL (shared by frontend and ML service)
3. Start MLflow tracking server
4. Start the ML inference service (FastAPI)
5. Start the Next.js web application
6. Create named volumes for persistent storage:
   - `ml-data` - OHLCV data, features, labeled datasets
   - `mlflow-data` - MLflow experiments and model artifacts
   - `postgres-data` - PostgreSQL data (all application tables)
7. Enable automatic restart unless stopped

Services will be available at:

- **Web UI**: http://localhost:3000
- **ML Inference API**: http://localhost:8001
- **MLflow UI**: http://localhost:5000
- **PostgreSQL**: localhost:5432

### Running in Detached Mode

```bash
docker compose up -d --build
```

View logs:

```bash
docker compose logs -f candle-annotator
```

Stop the services:

```bash
docker compose down
```

### Manual Docker Build and Run

If you prefer to build and run manually:

```bash
# Build image
docker build -t candle-annotator .

# Run container
docker run -d \
  -p 3000:3000 \
  -v candle-data:/app/data \
  --restart unless-stopped \
  candle-annotator
```

### Environment Configuration

Create a `.env` file in the project root based on `.env.example`:

```bash
cp .env.example .env
```

Edit `.env` to customize:

```
NODE_ENV=production
PORT=3000
DATABASE_URL=postgresql://ml_user:ml_password@postgres:5432/candle_annotator
```

Pass environment variables to Docker Compose:

```bash
docker compose --env-file .env up -d
```

### Data Persistence

The application stores all data in PostgreSQL using the Docker named volume `postgres-data`. This ensures data persists across container restarts:

```bash
# View volumes
docker volume ls | grep postgres

# Backup database
docker exec candle_annotator-postgres-1 pg_dump -U ml_user candle_annotator > backup.sql

# Restore database
cat backup.sql | docker exec -i candle_annotator-postgres-1 psql -U ml_user -d candle_annotator
```

### Data Migration from SQLite

If you're upgrading from a SQLite-based version, you need to migrate your data:

1. **Before upgrading**, back up your SQLite database:

   ```bash
   docker cp candle_annotator-candle-annotator-1:/app/data/candles.db ./backup-sqlite.db
   ```

2. **Stop the old containers**:

   ```bash
   docker compose down
   ```

3. **Pull the new version** and start services:

   ```bash
   git pull origin master
   docker compose up -d
   ```

4. **Run the migration script** from your host machine:

   ```bash
   # Copy SQLite database to a location accessible to the script
   cp backup-sqlite.db data/candles.db

   # Run migration (requires ts-node and dependencies)
   npm install
   DATABASE_URL=postgresql://ml_user:ml_password@localhost:5432/candle_annotator \
     npx ts-node scripts/migrate-sqlite-to-postgres.ts
   ```

**Rollback Procedure** (if migration fails):

1. Stop new containers:

   ```bash
   docker compose down
   ```

2. Restore the SQLite-based docker-compose.yml and Dockerfile from git history:

   ```bash
   git checkout HEAD~1 -- docker-compose.yml Dockerfile
   ```

3. Restore SQLite database:

   ```bash
   mkdir -p data
   cp backup-sqlite.db data/candles.db
   ```

4. Start old version:

   ```bash
   docker compose up -d
   ```

### Accessing the Application

Once running, access the application at:

```
http://localhost:3000
```

Health check endpoint:

```bash
curl http://localhost:3000/api/health
```

With database check (the URL is quoted so the shell does not treat `?` as a glob character):

```bash
curl "http://localhost:3000/api/health?check=db"
```

### Port Mapping

To run on a different port (e.g., 8080), modify docker-compose.yml:

```yaml
services:
  candle-annotator:
    ports:
      - "8080:3000"
```

Or use an environment variable in docker-compose.yml:

```yaml
services:
  candle-annotator:
    ports:
      - "${HOST_PORT:-3000}:3000"
```

Then run:

```bash
HOST_PORT=8080 docker compose up -d
```

### Container Health Checks

Docker automatically checks container health every 30 seconds using the `/api/health` endpoint. The container will restart if:

- The health check fails 3 times consecutively
- A check takes longer than 3 seconds to respond

View health status:

```bash
docker ps
```

Look at the `STATUS` column; it should show `healthy`.
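
When debugging a container that never reaches `healthy`, it can help to reproduce the same policy from the host. The following is a minimal sketch, not the project's actual health check: the function names `probe` and `check_health` and their argument order are illustrative assumptions, and it presumes `curl` is installed on the host. The defaults mirror the values quoted above (3 attempts, 3-second timeout, 30-second interval).

```bash
# Illustrative helper functions; these names are not part of the project.

# probe URL TIMEOUT
# Succeeds only if the endpoint answers with an HTTP status below 400
# within TIMEOUT seconds.
probe() {
  curl --fail --silent --show-error --max-time "$2" --output /dev/null "$1"
}

# check_health URL [RETRIES] [TIMEOUT] [INTERVAL]
# Prints "healthy" and returns 0 on the first successful probe.
# Prints "unhealthy" and returns 1 after RETRIES consecutive failures.
check_health() {
  url=$1
  retries=${2:-3}
  timeout=${3:-3}
  interval=${4:-30}
  attempt=1
  while [ "$attempt" -le "$retries" ]; do
    if probe "$url" "$timeout"; then
      echo "healthy"
      return 0
    fi
    # Wait between attempts, but not after the last one.
    if [ "$attempt" -lt "$retries" ]; then
      sleep "$interval"
    fi
    attempt=$((attempt + 1))
  done
  echo "unhealthy"
  return 1
}
```

After sourcing these functions, `check_health http://localhost:3000/api/health` applies the defaults, while `check_health http://localhost:3000/api/health 3 3 0` retries immediately instead of waiting 30 seconds between attempts.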
### Troubleshooting

**Port already in use:**

`docker compose up` has no option for remapping ports, so set the host port through the `${HOST_PORT:-3000}` mapping described under Port Mapping:

```bash
docker compose down   # Stop any existing containers
HOST_PORT=8080 docker compose up -d
```

**Database connection errors:**

```bash
# Check PostgreSQL logs
docker compose logs postgres

# Verify database exists
docker exec -it candle_annotator-postgres-1 psql -U ml_user -d candle_annotator -c "\dt"

# Recreate database if needed (this deletes all stored data)
docker compose down
docker volume rm candle_annotator_postgres-data
docker compose up --build
```

**Rebuild without cache:**

```bash
docker compose build --no-cache
docker compose up -d
```

**View container logs:**

```bash
docker compose logs -f --tail=100
```

**ML service healthcheck failing:**

If the candle-annotator service fails to start with the error "dependency failed to start: container candle_annotator-ml-service-1 is unhealthy", the cause is that the ml-service healthcheck requires `curl` to be installed in the container. This was fixed in commit `ecb2385` by adding curl to the ml-service Dockerfile. If you encounter this issue:

1. Rebuild the ml-service: `docker compose build ml-service`
2. Restart services: `docker compose up -d`

**Migration errors during startup:**

If you see Drizzle migration errors during container startup:

1. Ensure PostgreSQL is fully started and healthy:

   ```bash
   docker compose ps postgres
   ```

2. Check migration logs:

   ```bash
   docker compose logs candle-annotator | grep -i migration
   ```

3. If needed, run migrations manually:

   ```bash
   docker exec -it candle_annotator-candle-annotator-1 npm run db:migrate
   ```

### Update Procedure

To update the application:

```bash
git pull origin master
docker compose down
docker compose up --build -d
```

Or with a no-cache rebuild:

```bash
git pull
docker compose down
docker compose build --no-cache
docker compose up -d
```

### Production Deployment

For production deployments, consider:

1. **Use a container registry** (Docker Hub, ECR, GCR):

   ```bash
   docker tag candle-annotator myregistry/candle-annotator:v1.0.0
   docker push myregistry/candle-annotator:v1.0.0
   ```

2. **Run on a remote server** (AWS, DigitalOcean, etc.):

   ```bash
   # SSH into server, clone repo, then:
   docker compose up -d
   ```

3. **Add reverse proxy** (nginx, traefik) for HTTPS:

   ```yaml
   # docker-compose.yml
   services:
     nginx:
       image: nginx:alpine
       ports:
         - "443:443"
       volumes:
         - ./nginx.conf:/etc/nginx/nginx.conf
   ```

4. **Enable Docker logging** for production monitoring:

   ```bash
   docker compose logs -f --tail=1000 > app.log &
   ```

## Notes

- This application is designed for **single-user local use** only
- There is no authentication or user management
- PostgreSQL is used for all application data (frontend and ML service)
- The shared database enables the ML service to directly query candle and annotation data
- Docker deployment provides lightweight containerization ideal for standalone instances
- The multi-stage Dockerfile keeps image size minimal (~100MB)