Marko Djordjevic fba0b29d64 Add back navigation link to /app on settings page
- Added ChevronLeft icon from lucide-react
- Added Link import from next/link
- Created back navigation component at top of settings page with hover state
- Links back to /app with "← Back" text

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 13:37:48 +01:00

Candle Annotator

A web-based tool for manually annotating candlestick charts with pattern labels and trend lines. Built for creating labeled training data for machine learning models in trading analysis.

Overview

Candle Annotator is a complete machine learning platform for candlestick pattern recognition, combining:

Annotation Tools - TradingView-like charting interface for creating labeled training data:

  • Upload historical OHLC (Open, High, Low, Close) candle data from CSV files
  • Visualize candlestick charts with interactive zoom and pan
  • Annotate patterns with span labels (e.g., "Bullish Engulfing", "Doji", "Hammer")
  • Mark breakout patterns (Break Up, Break Down) directly on candles
  • Draw custom trend lines with two-click interaction
  • Export annotations for ML training

ML Pipeline - Python-based training and inference system:

  • Feature engineering with TA-Lib indicators (RSI, MACD, Bollinger Bands, etc.)
  • Automated pattern detection using TA-Lib CDL* functions
  • Train RandomForest and XGBoost models with MLflow experiment tracking
  • FastAPI inference service for real-time predictions
  • Integration with Next.js UI for prediction visualization

Active Learning Loop - Close the feedback cycle:

  • Model predictions displayed as overlays on the chart
  • Disagreement detection between human annotations and model predictions
  • One-click feedback to confirm, correct, or dismiss predictions as new training data
  • Continuous improvement through iterative annotation and retraining

Features

Data Management

  • CSV Upload: Import OHLC data with support for both Unix timestamps and date strings
    • Replace Mode: Uploading a new CSV deletes all old candles and replaces them with new data
    • Initial Data: Docker containers automatically load EURUSD.csv on first startup if the database is empty
  • PostgreSQL Storage: All candle data and annotations are stored in a PostgreSQL database
  • Shared Database: Frontend and ML service use the same database for seamless data access
  • Data Persistence: Annotations and candles persist between sessions

Chart Visualization

  • Interactive Candlestick Chart: Powered by lightweight-charts library
  • Dark Theme: Eye-friendly slate color scheme
  • Zoom & Pan: Mouse wheel zoom and drag-to-pan functionality
  • Crosshair: Precise price and time tracking

Annotation Tools

  • Break Up Markers: Green arrow markers below candles indicating upward breakouts
  • Break Down Markers: Red arrow markers above candles indicating downward breakouts
  • Trend Lines: Two-click line drawing with real-time preview
  • Delete Tool: Remove any annotation (markers or lines) by clicking on them
  • Tool Toggle: Click the active tool's button again to deactivate it

Label Management

  • Label Sidebar: View all annotations in collapsible sidebar with:
    • Click Selection: Click markers on chart or in sidebar to select/highlight
    • Keyboard Delete: Press Delete or Backspace to remove selected label
    • Individual Delete: Delete button on each list item
    • Search: Search annotations by timestamp
    • Filter: Filter by Break Up, Break Down, or All types
    • Count Display: See how many Break Up vs Break Down markers exist
    • Visual Highlight: Selected markers highlighted with glow effect

UI Theme

  • Hacker Theme: Terminal-inspired dark aesthetic with:
    • Matrix green (#00ff41) on dark background (#0a0e0a)
    • Monospace font (JetBrains Mono) throughout
    • Glow effects on button hover and active states
    • Custom scrollbars styled to match theme
    • High contrast for accessibility

Export & Deployment

  • CSV Export: Download all annotations with timestamp, label type, and price data
  • ML-Ready Format: Structured data suitable for training ML models
  • Docker Deployment: One-command deployment with persistent data volume
  • Health Check: Built-in /api/health endpoint for monitoring

ML Pipeline Features (Optional)

The integrated ML pipeline provides:

Feature Engineering

  • TA-Lib Indicators: Automatic computation of 150+ technical indicators (RSI, MACD, Bollinger Bands, ATR, Stochastic, etc.)
  • Candle Features: Body size, wick ratios, gap detection, price ranges
  • Custom Features: Plugin system for domain-specific feature functions
  • NaN Handling: Automatic warmup period detection and cleanup
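
As a rough illustration of the candle features listed above, the following Python sketch computes body size, wick ratios, and a gap against the previous close for a single candle. The function name, the returned keys, and the exact ratio definitions are assumptions made for this example; the project's own feature code may define them differently.

```python
def candle_features(open_, high, low, close, prev_close=None):
    """Compute simple shape features for one OHLC candle (illustrative definitions)."""
    price_range = high - low
    body = abs(close - open_)
    upper_wick = high - max(open_, close)
    lower_wick = min(open_, close) - low
    # Guard against zero-range candles so the ratios never divide by zero.
    denom = price_range if price_range > 0 else 1.0
    features = {
        "range": price_range,
        "body": body,
        "body_ratio": body / denom,
        "upper_wick_ratio": upper_wick / denom,
        "lower_wick_ratio": lower_wick / denom,
    }
    # The gap is only defined when a previous candle exists.
    if prev_close is not None:
        features["gap"] = open_ - prev_close
    return features
```

Ratios are expressed relative to the full high-low range, so they stay comparable across instruments with different price scales.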

Annotation Ingestion

  • Windowed Classification: Extract fixed-size windows around each pattern for classification models
  • BIO Sequence Labeling: Begin-Inside-Outside encoding for sequence models (future LSTM/GRU support)
  • Programmatic Labels: TA-Lib CDL* pattern functions for auto-labeling (23+ candlestick patterns)
  • Label Merging: Human-priority, programmatic-priority, or both strategies
  • Dataset Statistics: Class distribution, label counts, human/programmatic agreement metrics
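
To make the BIO encoding concrete, here is a minimal Python sketch that turns span annotations into one tag per candle. It assumes spans are given as candle indices with an inclusive end; the function name and the input shape are hypothetical and not taken from the project's code.

```python
def bio_encode(n_candles, spans):
    """Return one BIO tag per candle.

    spans: iterable of (start_index, end_index, label) with an inclusive end index.
    If spans overlap, later spans overwrite earlier ones.
    """
    tags = ["O"] * n_candles
    for start, end, label in spans:
        if not (0 <= start <= end < n_candles):
            raise ValueError(f"span out of range: {(start, end, label)}")
        tags[start] = f"B-{label}"          # first candle of the pattern
        for i in range(start + 1, end + 1):
            tags[i] = f"I-{label}"          # remaining candles inside the pattern
    return tags
```

For example, a three-candle "Hammer" span starting at index 1 in a six-candle series yields `O, B-Hammer, I-Hammer, I-Hammer, O, O`.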

Model Training

  • Model Types: RandomForest and XGBoost with class balancing
  • Temporal Splitting: Train/val/test splits that respect time series order, so no future candles leak into the training set
  • MLflow Integration: Automatic experiment tracking, hyperparameter logging, artifact storage
  • Model Registry: Versioned model storage with stage management (Production, Staging, Archived)
  • Evaluation Metrics: Accuracy, F1 (macro/weighted), per-class precision/recall/F1
  • Visualization: Confusion matrix, feature importance plots, classification reports
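
The temporal splitting idea can be sketched in a few lines of Python: sort by time, then cut the ordered data into contiguous train, validation, and test segments. The 70/15/15 default fractions, the function name, and the `time` key are assumptions for illustration only.

```python
def temporal_split(rows, train_frac=0.7, val_frac=0.15, time_key="time"):
    """Split rows into train/val/test segments in chronological order.

    Rows are never shuffled: every training row is older than every
    validation row, which is older than every test row.
    """
    ordered = sorted(rows, key=lambda r: r[time_key])
    n = len(ordered)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return ordered[:train_end], ordered[train_end:val_end], ordered[val_end:]
```

A random split would mix later candles into the training set, so validation scores would overstate how the model behaves on unseen future data.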

Inference Service

  • FastAPI REST API: High-performance inference with automatic OpenAPI docs
  • Preprocessing Parity: Loads pipeline config from MLflow to ensure training/inference consistency
  • Batch Processing: Efficient prediction for large time ranges
  • Span Grouping: Consecutive predictions merged into labeled spans with confidence scores
  • Model Metadata: Endpoint to query model version, metrics, and label configuration
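
Span grouping could look roughly like the Python sketch below, which merges consecutive per-candle predictions that share a label into one span. Using the mean as the span confidence and `"O"` as the background label are assumptions; the service may aggregate differently.

```python
from itertools import groupby


def group_spans(predictions, background="O"):
    """Merge consecutive same-label predictions into spans.

    predictions: list of (time, label, confidence) tuples sorted by time.
    Returns a list of dicts with label, start_time, end_time, and mean confidence.
    """
    spans = []
    for label, run in groupby(predictions, key=lambda p: p[1]):
        run = list(run)
        if label == background:
            continue  # background candles do not form a span
        spans.append({
            "label": label,
            "start_time": run[0][0],
            "end_time": run[-1][0],
            "confidence": sum(p[2] for p in run) / len(run),
        })
    return spans
```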

Prediction UI

  • Chart Overlay: Predictions rendered as histogram series with label-specific colors
  • Confidence Filtering: Slider to hide low-confidence predictions
  • Label Filtering: Toggle visibility per pattern type with per-class F1 scores
  • Disagreement Detection: Automatic comparison of human vs model predictions
  • Prediction Summary: Counts for total predictions, agreements, disagreements
  • Active Learning Feedback: Click predictions to convert them to annotations (future feature)
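
One plausible way to detect disagreements is to flag every pair of human and model spans that overlap in time but carry different labels. The Python sketch below follows that definition; it is an assumption for illustration, and the actual UI may also count other cases (for example, human spans with no prediction at all).

```python
def find_disagreements(human_spans, model_spans):
    """Return (human, model) span pairs that overlap in time with different labels.

    Each span is a dict with start_time, end_time (inclusive), and label.
    """
    disagreements = []
    for h in human_spans:
        for m in model_spans:
            overlaps = (
                h["start_time"] <= m["end_time"]
                and m["start_time"] <= h["end_time"]
            )
            if overlaps and h["label"] != m["label"]:
                disagreements.append((h, m))
    return disagreements
```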

Architecture

System Components

┌─────────────────────────────────────────────────────────────────┐
│                         Web Browser                              │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  Next.js Frontend (React 19, Tailwind, lightweight-charts)│  │
│  │  - Annotation tools                                        │  │
│  │  - Prediction visualization                                │  │
│  └──────────────────┬────────────────────────────────────────┘  │
└─────────────────────┼───────────────────────────────────────────┘
                      │ HTTP
                      ▼
┌─────────────────────────────────────────────────────────────────┐
│  Next.js API Routes (TypeScript)                                 │
│  - /api/candles, /api/annotations, /api/span-annotations        │
│  - /api/predict (proxy)                                          │
│  - /api/model/info (proxy)                                       │
│  └───────────┬─────────────────────────────────────┬───────────  │
│              │ PostgreSQL                          │ HTTP        │
│              ▼                                     ▼             │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │          PostgreSQL Database (Shared)                    │   │
│  │  - Frontend tables (candles, annotations, span_annotations) │
│  │  - ML tables (training_runs)                             │   │
│  │  - Accessed by: Next.js (Drizzle ORM)                    │   │
│  │                 ML Service (SQLAlchemy)                  │   │
│  └──────────────────────┬──────────────────────────────────┘    │
│                         │                                        │
│              ┌──────────┴──────────┐                             │
│              │                     │                             │
│  ┌───────────▼─────────┐  ┌───────▼───────────────────┐         │
│  │  ML Inference API   │  │    MLflow Server          │         │
│  │  (FastAPI, Python)  │  │  (Experiments, Registry)  │         │
│  └─────────────────────┘  └───────────────────────────┘         │
└─────────────────────────────────────────────────────────────────┘

ML Pipeline Workflow

1. Annotate Data (Web UI)
   ↓
2. Export Annotations (JSON)
   ↓
3. Feature Engineering (TA-Lib)
   ├─ Raw OHLCV → Enriched CSV (with indicators)
   ↓
4. Annotation Ingestion
   ├─ Annotations + Enriched CSV → Labeled Dataset
   ├─ Optional: TA-Lib CDL* auto-labeling
   ↓
5. Model Training
   ├─ Temporal train/val/test split
   ├─ RandomForest or XGBoost training
   ├─ MLflow experiment tracking
   ├─ Model registration
   ↓
6. Inference Service
   ├─ Load model from MLflow registry
   ├─ Serve predictions via FastAPI
   ↓
7. Prediction Visualization (Web UI)
   ├─ Display predictions on chart
   ├─ Detect disagreements
   ├─ Feedback loop: predictions → new annotations → retrain

Tech Stack

Frontend & Web Service

  • Frontend: Next.js 16 (App Router), React 19, TypeScript
  • Styling: Tailwind CSS 3, shadcn/ui components
  • Charting: lightweight-charts 4.x (TradingView)
  • Icons: lucide-react
  • Backend: Next.js API Routes
  • Database: PostgreSQL 16 with pg driver
  • ORM: Drizzle ORM (PostgreSQL dialect)
  • CSV Parsing: papaparse

ML Pipeline (Python)

  • API Framework: FastAPI with uvicorn
  • ML Libraries: scikit-learn (RandomForest), XGBoost
  • Feature Engineering: TA-Lib (Technical Analysis Library)
  • Data Processing: pandas, numpy
  • Experiment Tracking: MLflow (model registry, artifact storage)
  • Data Versioning: DVC (Data Version Control)
  • Database: PostgreSQL 16 (shared with frontend - reads candles/annotations, writes training runs)
  • ORM: SQLAlchemy (for training runs) + table reflection (for frontend data)
  • Model Persistence: joblib
  • Validation: Pydantic

Getting Started

The fastest way to get running with Docker:

docker-compose up --build

Then open http://localhost:3000

See DEPLOYMENT.md for detailed Docker instructions.

Prerequisites

  • Node.js 20.9 or higher (for local development; Next.js 16 no longer supports Node.js 18)
  • npm 9.x or higher (for local development)
  • PostgreSQL 16 or higher (for local development)
  • Docker & docker-compose (for containerized deployment)

Local Development Installation

  1. Clone the repository:

    git clone <repository-url>
    cd candle_annotator
    
  2. Install dependencies:

    npm install
    
  3. Setup PostgreSQL database:

    createdb candle_annotator
    createuser -P ml_user
    # Enter password: ml_password
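    psql -c "GRANT ALL PRIVILEGES ON DATABASE candle_annotator TO ml_user;"
    # PostgreSQL 15+ no longer lets non-owner roles create tables in the public schema by default
    psql -d candle_annotator -c "GRANT ALL ON SCHEMA public TO ml_user;"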
    
  4. Create .env file:

    cp .env.example .env
    # Edit .env to set DATABASE_URL=postgresql://ml_user:ml_password@localhost:5432/candle_annotator
    
  5. Start the development server:

    npm run dev
    
  6. Open http://localhost:3000 in your browser

Usage

  1. Upload Data: Click "Choose CSV File" and select a CSV with columns: time,open,high,low,close
  2. View Chart: The candlestick chart renders automatically after upload
  3. Add Annotations:
    • Click "Label: Break Up" or "Label: Break Down" then click on a candle
    • Click "Draw Line" then click two points to draw a trend line
    • Press Escape to cancel line drawing
  4. Delete Annotations: Click "Delete" tool, then click on markers or lines to remove them
  5. Export: Click "Export CSV" to download all annotations

CSV File Format

Input Format

Your CSV file should have these columns:

time,open,high,low,close
1700000000,1.0500,1.0520,1.0490,1.0510
1700000060,1.0510,1.0530,1.0505,1.0525

Time column accepts:

  • Unix timestamps (seconds): 1700000000
  • Date strings: 2024-01-15, 2024-01-15 10:30:00
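
The two accepted time formats could be normalized to Unix seconds along these lines. This Python sketch is illustrative only: the function name is hypothetical, and treating date strings as UTC is an assumption, since the source does not say which time zone the uploader applies.

```python
from datetime import datetime, timezone


def parse_time(value: str) -> int:
    """Convert a CSV time cell to Unix seconds.

    Accepts an integer Unix timestamp, 'YYYY-MM-DD', or 'YYYY-MM-DD HH:MM:SS'.
    Date strings are interpreted as UTC (an assumption of this sketch).
    """
    text = value.strip()
    if text.isdigit():
        return int(text)
    for fmt in ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d"):
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next supported format
        return int(parsed.replace(tzinfo=timezone.utc).timestamp())
    raise ValueError(f"unsupported time value: {value!r}")
```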

Export Format

The exported CSV includes:

timestamp,label_type,price
1700000000,break_up,1.0510
1700000120,break_down,1.0505
1700000000,line,1.0500

  • timestamp: Unix timestamp of the annotation
  • label_type: break_up, break_down, or line
  • price: Close price for markers, start price for lines
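
As a quick sanity check before training, the exported file can be summarized by label type. The Python sketch below assumes only the three columns described above; the function name is hypothetical.

```python
import csv
import io
from collections import Counter


def count_labels(csv_text):
    """Count exported annotations per label_type.

    csv_text: the full contents of an exported annotations CSV,
    with the header row timestamp,label_type,price.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["label_type"] for row in reader)
```

A heavily skewed count (for example, many break_up rows and almost no break_down rows) is a hint that the dataset needs more annotations or class balancing before training.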

Database Schema

Candles Table (PostgreSQL)

{
  id: serial (PK, auto-increment),
  chart_id: integer (FK to charts.id),
  time: timestamp (not null, indexed with chart_id),
  open: double precision,
  high: double precision,
  low: double precision,
  close: double precision
}

Annotations Table (Point Annotations)

{
  id: serial (PK, auto-increment),
  chart_id: integer (FK to charts.id),
  timestamp: timestamp (not null),
  label_type: text ('line' | 'rectangle'),
  geometry: jsonb (for line/rectangle coordinates, nullable),
  color: text (default '#3b82f6'),
  created_at: timestamp (default now())
}

Span Annotations Table (Pattern Labels)

{
  id: serial (PK, auto-increment),
  chart_id: integer (FK to charts.id),
  start_time: timestamp (not null),
  end_time: timestamp (not null),
  label: text (pattern name, e.g., 'Bullish Engulfing'),
  confidence: integer (nullable),
  outcome: text (nullable),
  notes: text (nullable),
  sub_spans: jsonb (nullable),
  color: text (default '#2196F3'),
  source: text (default 'human'; one of 'human' | 'model' | 'hybrid'),
  model_prediction: jsonb (nullable),
  created_at: timestamp (default now())
}

Training Runs Table (ML Service)

{
  id: serial (PK, auto-increment),
  run_id: text (unique, MLflow run ID),
  model_type: text (e.g., 'RandomForest', 'XGBoost'),
  experiment_name: text,
  pipeline_config_hash: text,
  dataset_version: text,
  metrics_summary: jsonb,
  status: text (e.g., 'running', 'completed', 'failed'),
  created_at: timestamp (default now()),
  completed_at: timestamp (nullable)
}

API Endpoints

POST /api/upload

Upload CSV file and store candle data

Behavior: Deletes all existing candles before inserting new data (replace mode)
Request: multipart/form-data with file field
Response: { success: true, count: number } or { error: string }

GET /api/candles

Retrieve all candle records

Response: Array of candle objects ordered by time

GET /api/annotations

Retrieve all annotations

Response: Array of annotation objects with parsed geometry

POST /api/annotations

Create a new annotation

Request: { timestamp: number, label_type: string, geometry?: object }
Response: Created annotation object with ID

DELETE /api/annotations/[id]

Delete an annotation by ID

Response: { success: true } or { error: string }

GET /api/export

Export annotations as downloadable CSV

Response: CSV file download with Content-Disposition header

Frontend Architecture

Component Structure

  • page.tsx: Main page composition, manages active tool state
  • Toolbox.tsx: Sidebar with tool buttons and export functionality
  • FileUpload.tsx: CSV upload component with status messages
  • CandleChart.tsx: Core chart wrapper with lightweight-charts integration
    • Initializes chart with dark theme
    • Handles marker annotations (Break Up/Down)
    • Manages click events for annotation creation
    • Exposes refreshData() method for parent updates
  • SvgOverlay.tsx: Transparent SVG layer for line drawing
    • Coordinate transformation between data and pixels
    • Two-click line drawing with preview
    • Line hit detection for deletion

Data Flow

  1. User uploads CSV → POST /api/upload → PostgreSQL storage
  2. Chart mounts → GET /api/candles + GET /api/annotations → Render
  3. User clicks with active tool → POST /api/annotations → Refresh chart
  4. User deletes → DELETE /api/annotations/[id] → Refresh chart
  5. User exports → GET /api/export → CSV download

Development

Project Structure

candle_annotator/
├── src/
│   ├── app/
│   │   ├── api/              # API route handlers
│   │   │   ├── upload/
│   │   │   ├── candles/
│   │   │   ├── annotations/
│   │   │   └── export/
│   │   ├── globals.css       # Tailwind styles
│   │   ├── layout.tsx        # Root layout with dark theme
│   │   └── page.tsx          # Main page
│   ├── components/
│   │   ├── ui/               # shadcn/ui components
│   │   ├── CandleChart.tsx
│   │   ├── SvgOverlay.tsx
│   │   ├── Toolbox.tsx
│   │   └── FileUpload.tsx
│   └── lib/
│       ├── db/
│       │   ├── index.ts      # Drizzle client
│       │   ├── schema.ts     # Table definitions
│       │   └── migrate.ts    # Migration runner
│       └── utils.ts          # Utility functions
├── data/                     # Legacy SQLite data directory (storage is now PostgreSQL)
├── drizzle/                  # Migration files
├── DEPLOYMENT.md             # Deployment instructions
└── README.md                 # This file

Key Technical Decisions

  1. lightweight-charts v4: Stable API with good candlestick and marker support
  2. PostgreSQL: Shared database enables ML service to directly query candle/annotation data without CSV exports
  3. SVG Overlay for Lines: Maintains separate rendering layer from chart, easier coordinate management
  4. Drizzle ORM: Type-safe queries with minimal overhead, PostgreSQL dialect for production-grade features
  5. Next.js App Router: Server-side API routes co-located with frontend code

Known Limitations

  • Single User: No authentication or concurrent access support
  • No Undo: Can only delete annotations, not undo placement
  • Memory: Large CSV files (100k+ rows) may cause slow uploads
  • Line Snapping: Lines don't snap to candles, free-form placement only

Troubleshooting

See DEPLOYMENT.md for detailed troubleshooting steps.

Common issues:

  • PostgreSQL connection errors: Check DATABASE_URL environment variable and verify PostgreSQL is running
  • Port 3000 in use: Use PORT=3001 npm run dev
  • Migration errors: Ensure PostgreSQL is accessible before starting the application

License

ISC

Contributing

This is a focused tool for a specific use case. For questions or issues, please open a GitHub issue.