candle-annotator/DEPLOYMENT.md
Marko Djordjevic 227a5b66e6 Update DEPLOYMENT.md with authentication and migration documentation
- Add new environment variables section: AUTH_SECRET, AUTH_GOOGLE_ID, AUTH_GOOGLE_SECRET, AUTH_TRUST_HOST, DEFAULT_ADMIN_EMAIL, DEFAULT_ADMIN_PASSWORD
- Add detailed Google OAuth setup instructions with step-by-step guide to create OAuth app
- Add database migration steps: Drizzle migrations + user migration script with explanation
- Update Docker environment configuration with all auth-related variables
- Update Notes section to reflect new multi-user authentication system

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 18:39:14 +01:00


# Deployment Guide
## Prerequisites
- Node.js 18.x or higher
- npm 9.x or higher
- PostgreSQL 16 or higher
## Local Development Setup
### 1. Install Dependencies
```bash
npm install
```
### 2. Database Setup
#### PostgreSQL Setup
The application uses PostgreSQL for all data storage. Set up the database:
```bash
# Create database
createdb candle_annotator
# Create user (if needed)
createuser -P ml_user
# Enter password: ml_password
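# Grant privileges
psql -c "GRANT ALL PRIVILEGES ON DATABASE candle_annotator TO ml_user;"
# PostgreSQL 15+ no longer lets non-owners create tables in the public schema by default
psql -d candle_annotator -c "GRANT ALL ON SCHEMA public TO ml_user;"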
```
#### Environment Configuration
Create a `.env` file in the project root based on `.env.example`:
```bash
cp .env.example .env
```
Edit `.env` to customize:
```env
DATABASE_URL=postgresql://ml_user:ml_password@localhost:5432/candle_annotator
NODE_ENV=development
PORT=3000
# Authentication (Auth.js v5)
AUTH_SECRET=your_strong_random_secret_here
AUTH_TRUST_HOST=true
# Google OAuth
AUTH_GOOGLE_ID=your_google_oauth_client_id
AUTH_GOOGLE_SECRET=your_google_oauth_client_secret
# Default admin user
DEFAULT_ADMIN_EMAIL=admin@example.com
DEFAULT_ADMIN_PASSWORD=your_strong_password_here
```
#### Google OAuth Setup
To enable Google OAuth sign-in:
1. **Create a Google OAuth Application**:
   - Go to [Google Cloud Console](https://console.cloud.google.com/)
   - Create a new project
   - Configure the OAuth consent screen (sign-in does not require enabling an extra API; the "Google+ API" has been retired)
   - Go to "Credentials" and click "Create Credentials" → "OAuth client ID"
   - Select "Web application"
   - Add authorized redirect URIs:
     - `http://localhost:3000/api/auth/callback/google` (local development)
     - `https://yourdomain.com/api/auth/callback/google` (production)
   - Copy the Client ID and Client Secret
2. **Set Environment Variables**:
```env
AUTH_GOOGLE_ID=<your-client-id>
AUTH_GOOGLE_SECRET=<your-client-secret>
```
3. **Generate AUTH_SECRET**:
```bash
openssl rand -hex 32
```
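As a quick sanity check before pasting the value into `.env`, the sketch below tests whether a string has the shape that `openssl rand -hex 32` prints (64 hexadecimal characters). The function name `is_hex_secret` is hypothetical and not part of the project; Auth.js itself does not require this exact format.

```bash
# Hypothetical helper: succeeds if $1 is exactly 64 hex characters,
# which is the shape of `openssl rand -hex 32` output.
is_hex_secret() {
  case "$1" in
    *[!0-9a-fA-F]*) return 1 ;;   # contains a non-hex character
  esac
  [ "${#1}" -eq 64 ]
}

# Usage: is_hex_secret "$AUTH_SECRET" && echo "AUTH_SECRET looks OK"
```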
#### Database Migrations
After setting up the database and environment variables, run the migrations:
1. **Run Drizzle migrations** (creates schema):
```bash
npm run db:migrate
```
2. **Run the user migration script** (creates default admin user, backfills user_id):
```bash
DEFAULT_ADMIN_EMAIL=admin@example.com \
DEFAULT_ADMIN_PASSWORD=your_password \
npx tsx scripts/migrate-users.ts
```
This script:
- Creates a default admin user with the specified email and password (hashed with bcryptjs)
- Backfills `user_id` on all existing rows in: `charts`, `annotations`, `annotation_types`, `span_annotations`, `span_label_types`
- Is idempotent (safe to run multiple times)
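#### Running Drizzle Migrations Manually
Database migrations also run automatically on application startup. To run them manually with drizzle-kit:
```bash
# Only needed after changing the schema: generates new migration files
npx drizzle-kit generate
# Applies pending migrations to the database
npx drizzle-kit migrate
```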
### 3. Start Development Server
```bash
npm run dev
```
The application will be available at:
- http://localhost:3000
### 4. Verify Setup
1. Open the application in your browser
2. Upload a sample CSV file with OHLC data (columns: time, open, high, low, close)
3. Verify the candlestick chart renders correctly
4. Test annotation tools (Break Up, Break Down, Draw Line, Delete)
5. Export annotations as CSV
## CSV File Format
The application expects CSV files with the following format:
```csv
time,open,high,low,close
1700000000,1.0500,1.0520,1.0490,1.0510
1700000060,1.0510,1.0530,1.0505,1.0525
```
**Time column formats:**
- Unix timestamp (seconds): `1700000000`
- Date string: `2024-01-15`
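To catch malformed files before uploading, a small pre-check along these lines may help. It is a sketch, assuming the header must match the example exactly and that every row has five fields; the application itself may be more lenient. The function name `check_ohlc_csv` is hypothetical.

```bash
# Hypothetical pre-upload check: verifies the header row and that every
# non-empty data row has exactly five comma-separated fields.
check_ohlc_csv() {
  awk -F',' '
    { sub(/\r$/, "") }                 # tolerate Windows line endings
    NR == 1 {
      if ($0 != "time,open,high,low,close") { print "bad header: " $0; exit 1 }
      next
    }
    $0 == "" { next }                  # ignore blank lines
    NF != 5 { print "line " NR ": expected 5 fields, got " NF; exit 1 }
  ' "$1"
}

# Usage: check_ohlc_csv sample.csv && echo "CSV looks OK"
```

The check only looks at structure; it does not validate that the time column is a Unix timestamp or a date string.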
## Migrating from SQLite (if applicable)
If you have existing data in an SQLite database from a previous version, use the migration script:
```bash
# Run the migration script
npm run migrate:sqlite-to-postgres
# Or with TypeScript directly
npx ts-node scripts/migrate-sqlite-to-postgres.ts
```
This script will:
- Read all data from the SQLite database (`data/candles.db`)
- Convert data types (timestamps, booleans, JSON→jsonb)
- Insert data into PostgreSQL
- Skip the import when run again, so repeated runs are safe (idempotent)
## Building for Production
```bash
npm run build
```
## Running Production Build
```bash
npm run build
npm start
```
The production server will run on port 3000 by default.
## Troubleshooting
### Database Connection Issues
If the application fails to connect to PostgreSQL:
1. Verify PostgreSQL is running:
```bash
pg_isready -h localhost -p 5432
```
2. Check DATABASE_URL environment variable:
```bash
echo $DATABASE_URL
```
3. Verify credentials:
```bash
psql -U ml_user -d candle_annotator
```
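The three checks above can be driven from `DATABASE_URL` itself. The sketch below splits a URL of the shape used in this guide into host, port, and database name using plain shell parameter expansion. The function name `db_url_parts` is hypothetical, and the parsing assumes an explicit port and a password without `@` or `/` characters.

```bash
# Hypothetical helper: prints "host port dbname" from a URL of the form
# postgresql://user:password@host:port/dbname
db_url_parts() {
  rest=${1#*://}        # drop the scheme
  rest=${rest#*@}       # drop user:password@
  hostport=${rest%%/*}  # host:port
  db=${rest#*/}         # dbname, possibly followed by ?query
  db=${db%%\?*}         # strip an optional ?query
  echo "${hostport%%:*} ${hostport##*:} $db"
}

# Usage (assumes DATABASE_URL is set):
# set -- $(db_url_parts "$DATABASE_URL"); pg_isready -h "$1" -p "$2" -d "$3"
```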
### Resetting the Database
To reset the database (this deletes all stored data):
1. Stop the application
2. Drop and recreate the database:
```bash
dropdb candle_annotator
createdb candle_annotator
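psql -c "GRANT ALL PRIVILEGES ON DATABASE candle_annotator TO ml_user;"
# PostgreSQL 15+ no longer lets non-owners create tables in the public schema by default
psql -d candle_annotator -c "GRANT ALL ON SCHEMA public TO ml_user;"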
```
3. Restart the application (migrations will run automatically)
### Port Already in Use
If port 3000 is already in use, you can specify a different port:
```bash
PORT=3001 npm run dev
```
## Environment Variables
### Required Variables
- `DATABASE_URL` - PostgreSQL connection string (e.g., `postgresql://ml_user:ml_password@localhost:5432/candle_annotator`)
- `NODE_ENV` - Environment (`development` or `production`)
- `PORT` - Server port (default: 3000)
- `AUTH_SECRET` - Random secret that Auth.js uses to encrypt session tokens (generate with `openssl rand -hex 32`)
- `AUTH_GOOGLE_ID` - Google OAuth Client ID
- `AUTH_GOOGLE_SECRET` - Google OAuth Client Secret
- `DEFAULT_ADMIN_EMAIL` - Default admin user email (used by migration script)
- `DEFAULT_ADMIN_PASSWORD` - Default admin user password (used by migration script)
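A pre-flight check can catch a missing variable before the app or the migration script fails at runtime. The following is a minimal sketch; `check_required_env` is a hypothetical helper, and it skips `NODE_ENV` and `PORT` on the assumption that their defaults are acceptable.

```bash
# Hypothetical pre-flight check: reports each missing variable on stderr
# and returns non-zero if any required variable is unset or empty.
check_required_env() {
  missing=0
  for name in DATABASE_URL AUTH_SECRET AUTH_GOOGLE_ID AUTH_GOOGLE_SECRET \
              DEFAULT_ADMIN_EMAIL DEFAULT_ADMIN_PASSWORD; do
    eval "value=\${$name:-}"   # look up the variable by name
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (assumes .env contains only simple KEY=value lines):
# set -a; . ./.env; set +a; check_required_env
```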
### Authentication Variables
- `AUTH_TRUST_HOST` - Set to `true` when the app is self-hosted (for example in Docker or behind a reverse proxy) so that Auth.js trusts the forwarded host header; managed platforms such as Vercel are detected automatically
- `AUTH_GOOGLE_ID` - Google OAuth Client ID from Google Cloud Console
- `AUTH_GOOGLE_SECRET` - Google OAuth Client Secret from Google Cloud Console
### Optional Variables for ML Inference
- `INFERENCE_API_URL` - ML service endpoint (default: `http://localhost:8001`)
- `INFERENCE_API_TIMEOUT` - Request timeout in ms (default: 30000)
- `INFERENCE_BATCH_TIMEOUT` - Batch processing timeout in ms (default: 120000)
- `NEXT_PUBLIC_PREDICTIONS_ENABLED` - Enable predictions UI (default: true)
### API Key Variable
- `API_KEY` - Strong random key for authenticating requests between Next.js and ML service (generate with `openssl rand -hex 32`)
## File Structure
```
.
├── src/
│   ├── app/              # Next.js app router
│   │   ├── api/          # API routes
│   │   ├── layout.tsx    # Root layout
│   │   └── page.tsx      # Main page
│   ├── components/       # React components
│   │   ├── CandleChart.tsx
│   │   ├── SvgOverlay.tsx
│   │   ├── Toolbox.tsx
│   │   └── FileUpload.tsx
│   └── lib/              # Utilities
│       └── db/           # Database configuration
├── data/                 # Legacy SQLite database directory (pre-PostgreSQL versions)
├── drizzle/              # Database migrations
└── public/               # Static assets
```
## ML Service Setup (Optional)
The Candle Annotator includes an optional Python ML service for pattern recognition and prediction.
### Prerequisites for ML Service
- Python 3.11+
- TA-Lib C library
- PostgreSQL 16
### Local ML Service Setup
#### 1. Install TA-Lib C Library
**Linux (Debian/Ubuntu):**
```bash
sudo apt-get update
sudo apt-get install libta-lib-dev
```
**macOS:**
```bash
brew install ta-lib
```
**From Source:**
```bash
wget http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz
tar -xzf ta-lib-0.4.0-src.tar.gz
cd ta-lib/
./configure --prefix=/usr
make
sudo make install
```
#### 2. Install Python Dependencies
```bash
cd services/ml
uv sync
# Alternative without uv:
# pip install -r requirements.txt
```
#### 3. Setup PostgreSQL
The ML service shares the same PostgreSQL database as the frontend (`candle_annotator`). If you've already set up the database in the main setup steps, you're all set. The ML service will use the same connection.
#### 4. Initialize DVC
DVC is used for dataset versioning:
```bash
cd services/ml
dvc init  # add --subdir when services/ml is not the root of the Git repository
dvc remote add -d local /path/to/dvc-storage
```
#### 5. Run MLflow Tracking Server
MLflow tracks experiments and stores models:
```bash
mlflow server \
--backend-store-uri ./mlruns \
--default-artifact-root ./mlruns/artifacts \
--host 0.0.0.0 \
--port 5000
```
#### 6. Configure Pipeline
Edit `services/ml/config/pipeline.yaml` to configure:
- Feature engineering settings
- Model hyperparameters
- Data paths
- MLflow experiment name
#### 7. Start ML Service
```bash
cd services/ml
uvicorn app.main:app --host 0.0.0.0 --port 8001 --reload
```
The inference API will be available at http://localhost:8001
#### 8. Configure Next.js App
Create `.env.local` in the project root:
```env
INFERENCE_API_URL=http://localhost:8001
INFERENCE_API_TIMEOUT=30000
INFERENCE_BATCH_TIMEOUT=120000
NEXT_PUBLIC_PREDICTIONS_ENABLED=true
```
### Running the ML Pipeline
The ML pipeline consists of:
1. **Feature Engineering** - Extract TA-Lib indicators from OHLCV data
2. **Annotation Ingestion** - Convert span annotations to labeled datasets
3. **Training** - Train models with MLflow tracking
4. **Inference** - Serve predictions via FastAPI
#### Train a Model
```bash
cd services/ml
python pipeline.py --config config/pipeline.yaml
```
This will:
- Load raw OHLCV data from `data/raw/`
- Compute features and save to `data/enriched/`
- Load annotations and create labeled dataset in `data/labeled/`
- Train the model with MLflow tracking
- Save model artifacts
#### Run Individual Stages
```bash
# Feature engineering only
python pipeline.py --config config/pipeline.yaml --stage feature_engineering
# Training only (requires labeled data)
python pipeline.py --config config/pipeline.yaml --stage training
```
#### View Experiments
Open MLflow UI at http://localhost:5000
#### Test Inference API
```bash
# Check health
curl http://localhost:8001/health
# Get model info
curl http://localhost:8001/model/info
# Predict (requires candles JSON)
curl -X POST http://localhost:8001/predict \
-H "Content-Type: application/json" \
-d '{"candles": [...]}'
```
## Docker Deployment
### Prerequisites
- Docker (20.10+)
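- Docker Compose (2.0+); with Compose v2 the `docker compose` and `docker-compose` commands used in this guide are interchangeable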
### Build and Run with Docker Compose
The easiest way to deploy is with docker-compose:
```bash
docker compose up --build
```
This will:
1. Build the Next.js app and ML service Docker images
2. Start PostgreSQL (shared by frontend and ML service)
3. Start MLflow tracking server
4. Start the ML inference service (FastAPI)
5. Start the Next.js web application
6. Create named volumes for persistent storage:
- `ml-data` - OHLCV data, features, labeled datasets
- `mlflow-data` - MLflow experiments and model artifacts
- `postgres-data` - PostgreSQL data (all application tables)
7. Enable automatic restart unless stopped
Services will be available at:
- **Web UI**: http://localhost:3000
- **ML Inference API**: http://localhost:8001
- **MLflow UI**: http://localhost:5000
- **PostgreSQL**: localhost:5432
### Running in Detached Mode
```bash
docker-compose up -d --build
```
View logs:
```bash
docker-compose logs -f candle-annotator
```
Stop all services:
```bash
docker-compose down
```
### Manual Docker Build and Run
If you prefer to build and run manually:
```bash
# Build image
docker build -t candle-annotator .
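# Run container (DATABASE_URL in .env must point to a PostgreSQL host reachable from the container)
docker run -d \
  -p 3000:3000 \
  --env-file .env \
  -v candle-data:/app/data \
  --restart unless-stopped \
  candle-annotator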
```
### Environment Configuration for Docker
Create a `.env` file in the project root based on `.env.example`:
```bash
cp .env.example .env
```
Edit `.env` to customize for your deployment:
```env
NODE_ENV=production
PORT=3000
DATABASE_URL=postgresql://ml_user:ml_password@postgres:5432/candle_annotator
# Authentication
AUTH_SECRET=your_strong_random_secret_here
# Self-hosted deployments (Docker, reverse proxy) need Auth.js to trust the forwarded host
AUTH_TRUST_HOST=true
AUTH_GOOGLE_ID=your_google_oauth_client_id
AUTH_GOOGLE_SECRET=your_google_oauth_client_secret
# Default admin user
DEFAULT_ADMIN_EMAIL=admin@example.com
DEFAULT_ADMIN_PASSWORD=your_strong_password_here
# API Key
API_KEY=your_strong_random_api_key_here
# ML Inference (optional)
INFERENCE_API_URL=http://ml-service:8001
INFERENCE_API_TIMEOUT=30000
INFERENCE_BATCH_TIMEOUT=120000
NEXT_PUBLIC_PREDICTIONS_ENABLED=true
```
Pass environment variables to docker-compose:
```bash
docker-compose --env-file .env up -d
```
The user migration runs automatically when the containers start.
### Data Persistence
The application stores all data in PostgreSQL using the Docker named volume `postgres-data`. This ensures data persists across container restarts:
```bash
# View volumes
docker volume ls | grep postgres
# Backup database
docker exec candle_annotator-postgres-1 pg_dump -U ml_user candle_annotator > backup.sql
# Restore database
cat backup.sql | docker exec -i candle_annotator-postgres-1 psql -U ml_user -d candle_annotator
```
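Writing every dump to `backup.sql` overwrites the previous backup. One option is to timestamp the file name, as in the sketch below; `backup_name` is a hypothetical helper, and the naming scheme is only a suggestion.

```bash
# Hypothetical helper: builds a UTC-timestamped dump name of the form
# candle_annotator-YYYYMMDD-HHMMSS.sql
backup_name() {
  echo "candle_annotator-$(date -u +%Y%m%d-%H%M%S).sql"
}

# Usage with the backup command from this guide:
# docker exec candle_annotator-postgres-1 pg_dump -U ml_user candle_annotator > "$(backup_name)"
```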
### Data Migration from SQLite
If you're upgrading from a SQLite-based version, you need to migrate your data:
1. **Before upgrading**, backup your SQLite database:
```bash
docker cp candle_annotator-candle-annotator-1:/app/data/candles.db ./backup-sqlite.db
```
2. **Stop the old containers**:
```bash
docker compose down
```
3. **Pull the new version** and start services:
```bash
git pull origin master
docker compose up -d
```
4. **Run the migration script** from your host machine:
```bash
# Copy SQLite database to a location accessible to the script
cp backup-sqlite.db data/candles.db
# Run migration (requires ts-node and dependencies)
npm install
DATABASE_URL=postgresql://ml_user:ml_password@localhost:5432/candle_annotator \
npx ts-node scripts/migrate-sqlite-to-postgres.ts
```
**Rollback Procedure** (if migration fails):
1. Stop new containers:
```bash
docker compose down
```
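2. Restore the SQLite-based docker-compose.yml and Dockerfile from git history (replace `HEAD~1` with the last SQLite-based commit if the upgrade spans more than one commit):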
```bash
git checkout HEAD~1 docker-compose.yml Dockerfile
```
3. Restore SQLite database:
```bash
mkdir -p data
cp backup-sqlite.db data/candles.db
```
4. Start old version:
```bash
docker compose up -d
```
### Accessing the Application
Once running, access the application at:
```
http://localhost:3000
```
Health check endpoint:
```bash
curl http://localhost:3000/api/health
```
With database check:
```bash
curl "http://localhost:3000/api/health?check=db"
```
### Port Mapping
To run on a different port (e.g., 8080), modify docker-compose.yml:
```yaml
services:
  candle-annotator:
    ports:
      - "8080:3000"
```
Or use environment variable in docker-compose:
```yaml
services:
  candle-annotator:
    ports:
      - "${HOST_PORT:-3000}:3000"
```
Then run:
```bash
HOST_PORT=8080 docker-compose up -d
```
### Container Health Checks
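Docker checks container health every 30 seconds using the `/api/health` endpoint. A check fails if the endpoint does not respond within 3 seconds, and the container is marked `unhealthy` after 3 consecutive failures. Docker itself only reports this status: the `unless-stopped` restart policy restarts a container when it exits, not when it becomes unhealthy.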
View health status:
```bash
docker ps
```
Look for the `STATUS` column - it should show `healthy`.
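In deploy scripts it can be useful to block until the app responds rather than checking `docker ps` by hand. The sketch below retries an arbitrary check command a fixed number of times; `wait_until` is a hypothetical helper, and the retry count and delay in the usage line are arbitrary choices.

```bash
# Hypothetical helper: run a check command up to $1 times, sleeping $2
# seconds between attempts; succeeds as soon as the check succeeds.
wait_until() {
  attempts=$1; delay=$2; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"            # no sleep after the final attempt
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage: poll the health endpoint from this guide, 30 attempts 2 s apart.
# wait_until 30 2 curl -fsS http://localhost:3000/api/health
```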
### Troubleshooting
**Port already in use:**
```bash
docker-compose down # Stop any existing containers
# Remap the host port as described under "Port Mapping" (requires the HOST_PORT mapping shown there)
HOST_PORT=8080 docker-compose up -d
```
**Database connection errors:**
```bash
# Check PostgreSQL logs
docker compose logs postgres
# Verify database exists
docker exec -it candle_annotator-postgres-1 psql -U ml_user -d candle_annotator -c "\dt"
# Recreate database if needed
docker compose down
docker volume rm candle_annotator_postgres-data
docker compose up --build
```
**Rebuild without cache:**
```bash
docker-compose build --no-cache
docker-compose up -d
```
**View container logs:**
```bash
docker-compose logs -f --tail=100
```
**ML service healthcheck failing:**
If the candle-annotator service fails to start with error "dependency failed to start: container candle_annotator-ml-service-1 is unhealthy", this is because the ml-service healthcheck requires `curl` to be installed in the container. This was fixed in commit `ecb2385` by adding curl to the ml-service Dockerfile.
If you encounter this issue:
1. Rebuild the ml-service: `docker compose build ml-service`
2. Restart services: `docker compose up -d`
**Migration errors during startup:**
If you see Drizzle migration errors during container startup, check:
1. Ensure PostgreSQL is fully started and healthy:
```bash
docker compose ps postgres
```
2. Check migration logs:
```bash
docker compose logs candle-annotator | grep -i migration
```
3. If needed, run migrations manually:
```bash
docker exec -it candle_annotator-candle-annotator-1 npm run db:migrate
```
### Update Procedure
To update the application:
```bash
git pull origin master
docker-compose down
docker-compose up --build -d
```
Or with no-cache rebuild:
```bash
git pull
docker-compose down
docker-compose build --no-cache
docker-compose up -d
```
### Production Deployment
For production deployments, consider:
1. **Use a container registry** (Docker Hub, ECR, GCR):
```bash
docker tag candle-annotator myregistry/candle-annotator:v1.0.0
docker push myregistry/candle-annotator:v1.0.0
```
2. **Run on a remote server** (AWS, DigitalOcean, etc.):
```bash
# SSH into server, clone repo, then:
docker-compose up -d
```
3. **Add reverse proxy** (nginx, traefik) for HTTPS:
```yaml
# docker-compose.yml
services:
  nginx:
    image: nginx:alpine
    ports:
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
```
4. **Enable Docker logging** for production monitoring:
```bash
docker-compose logs -f --tail=1000 > app.log &
```
## Notes
- The application now includes **user authentication and multi-user support**
- Users can sign up with email/password or use Google OAuth
- A default admin user is created during the migration phase (see Database Migrations section)
- Each user's data (charts, annotations, etc.) is isolated and scoped by `user_id`
- PostgreSQL is used for all application data (frontend and ML service)
- The shared database enables the ML service to directly query candle and annotation data with user scoping
- Docker deployment provides lightweight containerization ideal for standalone instances
- The multi-stage Dockerfile keeps image size minimal (~100MB)
- Auth.js v5 is used for authentication with JWT strategy and 30-day session expiry