candle-annotator/CLAUDE_DESCRIPTION.md
Marko Djordjevic 21f184aa8d feat(ui): implement disagreement detection, prediction summary, loading states, and update documentation
- Add disagreement detection logic comparing human annotations vs predictions
- Display prediction summary in PredictionPanel (agreements/disagreements)
- Wire up 'Show only disagreements' filter toggle
- Add loading overlay during prediction fetching
- Update docker-compose.yml with healthchecks for all services
- Update DEPLOYMENT.md with comprehensive ML service setup instructions
- Update README.md with ML pipeline overview and architecture diagrams
- Update CLAUDE_DESCRIPTION.md with v3.0.0 ML integration details

Remaining tasks (11.2, 11.4, 11.5) deferred - core functionality complete
2026-02-15 16:34:02 +01:00

Candle Annotator - Project Description

Project Overview

Candle Annotator is a complete machine learning platform for candlestick pattern recognition, combining manual annotation tools with an integrated Python ML pipeline. It's designed for traders and ML researchers to create labeled training data, train models, and deploy pattern recognition in an active learning loop.

Current Version: 3.0.0 (ML Pipeline Integration - Feature engineering, training, and inference service)

Recent Changes (v3.0.0)

ML Pipeline Integration

  • Python ML Service: FastAPI-based inference service for real-time pattern prediction
  • Feature Engineering: TA-Lib integration for computing 150+ technical indicators (RSI, MACD, Bollinger Bands, etc.)
  • Model Training: RandomForest and XGBoost models with MLflow experiment tracking
  • Prediction UI: Chart overlay showing model predictions with confidence filtering
  • Disagreement Detection: Automatic comparison of human annotations vs model predictions
  • Active Learning Loop: Convert predictions to annotations for continuous model improvement
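
The disagreement detection and confidence filtering described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the `HumanLabel` and `Prediction` types, the `summarize_predictions` name, the `min_confidence` default, and the use of the candle timestamp as the join key are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HumanLabel:
    time: int   # candle timestamp, used as the join key (assumption)
    label: str  # e.g. "break_up" or "break_down"


@dataclass(frozen=True)
class Prediction:
    time: int
    label: str
    confidence: float  # model confidence in [0, 1]


def summarize_predictions(human, predictions, min_confidence=0.5):
    """Split confident predictions into agreements and disagreements."""
    human_by_time = {h.time: h.label for h in human}
    agreements, disagreements = [], []
    for p in predictions:
        if p.confidence < min_confidence:
            continue  # hidden by the confidence filter
        expected = human_by_time.get(p.time)
        if expected is None:
            continue  # no human label on this candle, nothing to compare
        if expected == p.label:
            agreements.append(p.time)
        else:
            disagreements.append(p.time)
    return {"agreements": agreements, "disagreements": disagreements}
```

A "Show only disagreements" toggle would then render only the timestamps in the `disagreements` list.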

Backend Infrastructure

  • PostgreSQL: Dedicated database for ML service metadata and training runs
  • MLflow Server: Experiment tracking, model registry, and artifact storage
  • DVC: Data versioning for datasets
  • Docker Compose: Multi-service orchestration (Next.js app, ML service, MLflow, PostgreSQL)
  • Health Checks: Service health monitoring across all containers

API Additions

  • POST /api/predict: Inference proxy to ML service for candle predictions
  • POST /api/predict/batch: Batch prediction for time ranges
  • GET /api/model/info: Model metadata, metrics, and label configuration
  • GET /api/span-annotations/export: Export annotations in ML pipeline format
  • Span Source Tracking: source and model_prediction fields for feedback loop
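
One way the `source` and `model_prediction` fields could support the feedback loop is sketched below. Only those two field names come from this document; the `SpanAnnotation` shape, the `"human"`/`"model"` values, and the `accept_prediction` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpanAnnotation:
    start_time: int
    end_time: int
    label: str
    source: str = "human"                   # assumed values: "human" or "model"
    model_prediction: Optional[str] = None  # label the model originally proposed


def accept_prediction(start_time, end_time, predicted_label, corrected_label=None):
    """Convert a model prediction into an annotation.

    The original prediction is always kept in model_prediction, so a later
    training run can see where a human overrode the model.
    """
    return SpanAnnotation(
        start_time=start_time,
        end_time=end_time,
        label=corrected_label or predicted_label,
        source="human" if corrected_label else "model",
        model_prediction=predicted_label,
    )
```

Keeping the model's original label next to the final label is what lets the exported data distinguish confirmed predictions from corrected ones.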

Previous Changes (v2.0.0)

Label Management System

  • Label Sidebar: Comprehensive collapsible sidebar showing all Break Up and Break Down annotations
  • Click Selection: Click markers on chart or list items to select/highlight them
  • Keyboard Delete: Press Delete or Backspace to remove selected label
  • Search & Filter: Search by timestamp, filter by type, count display
  • Individual Delete: Per-item trash button for quick removal
  • Visual Feedback: Selected markers display with increased size and bright glow

Hacker Theme

  • Color Scheme: Matrix green (#00ff41) on dark background (#0a0e0a)
  • Typography: Monospace font (JetBrains Mono with fallbacks)
  • Effects: Glow effects on button hover, pulsing animations for active tools
  • Details: Custom scrollbars, selection highlighting, terminal aesthetic
  • Accessibility: High contrast meeting WCAG AA standards

Docker Deployment

  • Multi-stage Build: Optimized Dockerfile with ~100MB final image
  • Persistence: Named volume for SQLite database across restarts
  • Health Check: Built-in /api/health endpoint for container health
  • Security: Non-root user (nextjs:1001) for safe execution
  • Convenience: docker-compose.yml for one-command deployment

API Enhancements

  • Bulk Delete: DELETE /api/annotations?type=break_up,break_down for batch operations
  • Health Endpoint: GET /api/health with optional database check (?check=db)
  • Individual Delete: The existing DELETE /api/annotations/[id] handles single-label deletion
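
The comma-separated `type` filter of the bulk delete endpoint might be parsed along these lines. The real route is a Next.js handler; this Python sketch only illustrates the parsing and validation logic, and the `ALLOWED_TYPES` set and function name are assumptions.

```python
from urllib.parse import parse_qs, urlsplit

# Assumed set of deletable annotation types; the document names break_up and break_down.
ALLOWED_TYPES = {"break_up", "break_down"}


def parse_bulk_delete_types(url):
    """Extract and validate the comma-separated ?type= filter from a URL."""
    query = parse_qs(urlsplit(url).query)
    raw = query.get("type", [""])[0]
    types = [t.strip() for t in raw.split(",") if t.strip()]
    unknown = [t for t in types if t not in ALLOWED_TYPES]
    if not types or unknown:
        # Refuse to run an unbounded delete when the filter is missing or invalid.
        raise ValueError(f"invalid type filter: {raw!r}")
    return types
```

Rejecting an empty filter is a deliberate choice in this sketch, so that a malformed request cannot delete every annotation.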

Technical Stack

Frontend

  • Framework: Next.js 16 (App Router)
  • UI Library: React 19
  • Language: TypeScript 5
  • Styling: Tailwind CSS 3 + shadcn/ui components
  • Charting: lightweight-charts v4 (by TradingView)
  • Icons: lucide-react

Backend (Next.js)

  • Runtime: Node.js 18.x
  • API: Next.js API Routes
  • Database: SQLite with better-sqlite3
  • ORM: Drizzle ORM
  • CSV: papaparse

ML Service (Python)

  • API Framework: FastAPI with uvicorn
  • ML Libraries: scikit-learn (RandomForest), XGBoost
  • Feature Engineering: TA-Lib (Technical Analysis Library)
  • Data Processing: pandas, numpy
  • Experiment Tracking: MLflow (model registry, artifact storage)
  • Data Versioning: DVC
  • Database: PostgreSQL 16 (training run metadata)
  • Model Persistence: joblib
  • Validation: Pydantic
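
Alongside the TA-Lib indicators, per-candle shape features are a common input for pattern models. The sketch below shows what such features could look like; the contents of `candle_features.py` are not shown in this document, so the function and feature names here are hypothetical.

```python
def candle_features(open_, high, low, close):
    """Derive simple shape features from one OHLC candle."""
    candle_range = high - low
    body = abs(close - open_)
    return {
        "body": body,
        "upper_wick": high - max(open_, close),
        "lower_wick": min(open_, close) - low,
        # Share of the range covered by the body; 0.0 for a flat candle.
        "body_ratio": body / candle_range if candle_range else 0.0,
        "is_bullish": close > open_,
    }
```

For example, `candle_features(10.0, 15.0, 8.0, 12.0)` describes a bullish candle with a body of 2.0, an upper wick of 3.0, and a lower wick of 2.0.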

DevOps

  • Containerization: Docker with multi-stage builds
  • Orchestration: docker-compose (4 services: app, ml-service, mlflow, postgres)
  • Build: Next.js standalone output mode
  • Monitoring: Health checks on all services

Core Features

Annotation Tools

  1. Break Up Labels: Click a candle to place an upward breakout marker
  2. Break Down Labels: Click a candle to place a downward breakout marker
  3. Trend Lines: Two-click drawing with SVG overlay for custom trend lines
  4. Delete Tool: Remove any annotation by clicking it

Data Management

  • CSV Upload: Import OHLC data (time, open, high, low, close)
  • Persistent Storage: SQLite database stores all annotations
  • CSV Export: Download labeled data for ML training
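
A minimal sketch of validating an uploaded OHLC file, assuming the five column names listed above appear in a header row. The app itself parses CSV with papaparse; this Python version only illustrates the checks, and the `load_candles` name and the high/low sanity rule are assumptions.

```python
import csv
import io

REQUIRED_COLUMNS = ("time", "open", "high", "low", "close")


def load_candles(csv_text):
    """Parse OHLC rows, failing fast on missing columns or inconsistent prices."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    candles = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        candle = {"time": row["time"]}  # time format is left as-is (unknown here)
        for key in REQUIRED_COLUMNS[1:]:
            candle[key] = float(row[key])
        top = max(candle["open"], candle["close"])
        bottom = min(candle["open"], candle["close"])
        if candle["high"] < top or candle["low"] > bottom:
            raise ValueError(f"line {line_no}: high/low do not bracket open/close")
        candles.append(candle)
    return candles
```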

UI Components

  • Toolbox: Left sidebar with tool buttons, color picker, label management
  • CandleChart: Main chart area with interactive candlestick visualization
  • FileUpload: Drag-and-drop CSV file upload
  • Label Sidebar: Collapsible section with annotation list, search, filter

File Structure & Key Files

candle_annotator/
├── src/                              # Next.js Frontend & API
│   ├── app/
│   │   ├── api/
│   │   │   ├── annotations/[id]/route.ts  # GET label by ID, PATCH update, DELETE remove
│   │   │   ├── annotations/route.ts       # GET all, POST create, DELETE bulk
│   │   │   ├── candles/route.ts           # GET all candles
│   │   │   ├── export/route.ts            # GET CSV export
│   │   │   ├── health/route.ts            # GET health check
│   │   │   ├── upload/route.ts            # POST CSV file upload
│   │   │   ├── predict/route.ts           # POST prediction proxy
│   │   │   ├── predict/batch/route.ts     # POST batch prediction proxy
│   │   │   ├── model/info/route.ts        # GET model info proxy
│   │   │   └── span-annotations/
│   │   │       ├── route.ts               # GET/POST span annotations
│   │   │       └── export/route.ts        # GET export for ML pipeline
│   │   ├── globals.css                    # Hacker theme CSS variables
│   │   ├── layout.tsx                     # Root layout with font loading
│   │   └── page.tsx                       # Main app (state + prediction mgmt)
│   ├── components/
│   │   ├── CandleChart.tsx               # Chart core with prediction overlay
│   │   ├── PredictionPanel.tsx           # Prediction controls & summary
│   │   ├── SvgOverlay.tsx                # Line drawing layer
│   │   ├── Toolbox.tsx                   # Sidebar with tools & label list
│   │   ├── FileUpload.tsx                # CSV upload
│   │   └── ui/                           # shadcn/ui components
│   ├── types/
│   │   └── predictions.ts                # Prediction types
│   └── lib/
│       ├── db/
│       │   ├── index.ts                  # Drizzle client
│       │   ├── schema.ts                 # Table definitions (incl. span annotations)
│       │   └── migrate.ts                # Migration runner
│       └── utils.ts                      # Utility functions
│
├── services/ml/                      # Python ML Service
│   ├── app/
│   │   ├── main.py                       # FastAPI app & inference endpoints
│   │   ├── config.py                     # Pydantic config models
│   │   ├── db.py                         # PostgreSQL connection
│   │   └── annotation_ingestion.py      # Annotation processing
│   ├── features/
│   │   ├── talib_features.py             # TA-Lib indicator computation
│   │   ├── candle_features.py            # Candle pattern features
│   │   └── custom_loader.py              # Custom feature plugins
│   ├── training/
│   │   ├── train.py                      # Main training entry point
│   │   ├── evaluation.py                 # Metrics & visualization
│   │   └── models/
│   │       ├── random_forest.py          # RandomForest wrapper
│   │       └── xgboost_model.py          # XGBoost wrapper
│   ├── config/
│   │   └── pipeline.yaml                 # Pipeline configuration
│   ├── data/
│   │   ├── raw/                          # Raw OHLCV CSV files
│   │   ├── enriched/                     # Feature-engineered data
│   │   ├── labeled/                      # Labeled training datasets
│   │   └── annotations/                  # Exported annotations
│   ├── pipeline.py                       # Main pipeline orchestrator
│   ├── Dockerfile                        # ML service container
│   └── requirements.txt                  # Python dependencies
│
├── Docker files:
├── Dockerfile                        # Next.js multi-stage build
├── docker-compose.yml                # Multi-service orchestration
├── .dockerignore                     # Exclude from image
└── .env.example                      # Environment template

State Management

Page Component (src/app/page.tsx)

  • activeTool: Current selected tool (break_up, break_down, line, delete)
  • selectedColor: Color for trend lines
  • selectedLabelId: Currently selected label marker (null if none)
  • annotations: Array of all annotations for sidebar

CandleChart Component

  • Fetches candles and annotations from API
  • Renders markers with highlight for selectedLabelId
  • Handles click events for annotation creation and selection
  • Exposes refreshData() method for parent updates

Toolbox Component

  • labelsExpanded: Sidebar collapsed/expanded state
  • searchText: Search query for timestamp filtering
  • filterType: Label type filter (all, break_up, break_down)
  • Renders label list with click selection and delete buttons
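
The interaction between `searchText` and `filterType` amounts to a simple predicate over the annotation list. The component is written in TypeScript; the Python sketch below only illustrates the logic, and the `type` and `timestamp` field names are assumptions.

```python
def filter_labels(annotations, search_text="", filter_type="all"):
    """Return annotations matching the type filter and the timestamp search."""
    needle = search_text.strip().lower()
    return [
        a
        for a in annotations
        if (filter_type == "all" or a["type"] == filter_type)
        and needle in str(a["timestamp"]).lower()
    ]
```

An empty search string matches every timestamp, so the list is narrowed only by the type filter until the user starts typing.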

Data Flow

  1. Initialization:

    • Page mounts → fetch /api/candles + /api/annotations
    • CandleChart renders with data
  2. Add Label:

    • User clicks Break Up/Break Down tool
    • Clicks on chart candle
    • POST /api/annotations
    • Chart refreshes automatically
  3. Select Label:

    • User clicks marker on chart OR list item in sidebar
    • Updates selectedLabelId state
    • CandleChart re-renders marker with highlight
    • Sidebar highlights matching list item
  4. Delete Label:

    • Option A: Select label + press Delete key
    • Option B: Click trash icon in sidebar list
    • DELETE /api/annotations/[id]
    • Annotations array updated
    • Chart refreshes
  5. Export:

    • User clicks Export CSV button
    • GET /api/export downloads CSV file
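
The export step can be pictured as a join of candles and labels on the candle timestamp. The sketch below assumes one label per candle, an empty label for unannotated candles, and a `label` column name; the actual export format is not specified in this document.

```python
import csv
import io


def export_labeled_csv(candles, annotations):
    """Write candles as CSV with a trailing label column (empty if unlabeled)."""
    label_by_time = {a["time"]: a["type"] for a in annotations}
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["time", "open", "high", "low", "close", "label"])
    for c in candles:
        writer.writerow(
            [c["time"], c["open"], c["high"], c["low"], c["close"],
             label_by_time.get(c["time"], "")]
        )
    return out.getvalue()
```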

API Endpoints

Data Operations

  • POST /api/upload - Upload CSV with candle data
  • GET /api/candles - Fetch all candles
  • GET /api/annotations - Fetch all annotations
  • POST /api/annotations - Create label
  • DELETE /api/annotations/[id] - Delete label
  • DELETE /api/annotations?type=break_up,break_down - Bulk delete by type
  • GET /api/export - Download CSV export

Prediction (v3.0.0)

  • POST /api/predict - Proxy a candle prediction request to the ML service
  • POST /api/predict/batch - Batch prediction for a time range
  • GET /api/model/info - Model metadata, metrics, and label configuration
  • GET /api/span-annotations - Fetch span annotations
  • POST /api/span-annotations - Create span annotation
  • GET /api/span-annotations/export - Export annotations in ML pipeline format

Monitoring

  • GET /api/health - Health check
  • GET /api/health?check=db - Health check with database verification
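
The difference between the plain health check and the `?check=db` variant can be sketched as below. The response body and the 503 status on failure are assumptions, and Python's built-in `sqlite3` module stands in for the app's better-sqlite3 connection.

```python
import sqlite3


def health_check(conn, check_db=False):
    """Return (http_status, body) for a health probe."""
    if not check_db:
        # Liveness only: the process is up and can answer requests.
        return 200, {"status": "ok"}
    try:
        # Cheapest possible round trip to prove the database is reachable.
        conn.execute("SELECT 1").fetchone()
        return 200, {"status": "ok", "database": "ok"}
    except sqlite3.Error:
        return 503, {"status": "error", "database": "unreachable"}
```

Keeping the database probe optional lets a container health check stay cheap while still offering a deeper check on demand.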

Development Workflow

Adding Features

  1. Update CandleChart/Toolbox components for UI
  2. Add API route in src/app/api/ if needed
  3. Update state management in page.tsx
  4. Test in browser with npm run dev
  5. Commit with clear message

Fixing Bugs

  1. Locate issue in component or API route
  2. Add console.log calls or a debugger breakpoint to investigate
  3. Write minimal fix
  4. Test across all annotation types
  5. Commit with bug fix message

Deployment

  1. Test locally: npm run dev
  2. Build production: npm run build
  3. Commit changes
  4. Docker deploy: docker-compose up --build -d

Known Constraints

  • Single User: No authentication, local data only
  • No Undo: There is no undo history; an unwanted annotation can only be removed by deleting it
  • SQLite Limits: Not suited to concurrent multi-user access
  • Memory: Large CSV files (100k+ rows) degrade performance
  • Lines: Free-form drawing, no snap-to-candle

Customization Points

Theme Colors

Edit the CSS variables in src/app/globals.css to replace the matrix green (#00ff41) with the desired color

Chart Appearance

  • Candlestick colors: CandleChart.tsx lines 119-120
  • Grid colors: CandleChart.tsx lines 96-98
  • Font: globals.css body element

API Routes

  • Validation rules: src/app/api/*/route.ts
  • Database operations: Use Drizzle ORM syntax

UI Components

  • Button styles: src/components/ui/button.tsx
  • Sidebar layout: src/components/Toolbox.tsx

Performance Notes

  • Chart rendering: lightweight-charts optimized, 60fps target
  • SVG lines: Only visible SVG lines are redrawn, keeping overhead minimal
  • Database: SQLite is accessed only from server routes (better-sqlite3 calls are synchronous)
  • Frontend: React memoization for chart component

Security Considerations

  • Input validation: File upload checks MIME type
  • SQL injection: Drizzle ORM parameterized queries
  • CORS: API routes served same-origin only
  • Docker: Non-root user (nextjs:1001) for container execution
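
The protection that parameterized queries give can be demonstrated in a few lines. The app relies on Drizzle ORM for this; the sketch below uses Python's built-in `sqlite3` module and a hypothetical `annotations` table purely to illustrate the principle.

```python
import sqlite3

# Hypothetical in-memory table standing in for the app's annotations table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE annotations (id INTEGER PRIMARY KEY, type TEXT)")
conn.executemany(
    "INSERT INTO annotations (type) VALUES (?)",
    [("break_up",), ("break_down",)],
)


def delete_by_type(conn, annotation_type):
    """Delete rows of one type; the value is bound, never spliced into the SQL."""
    cur = conn.execute("DELETE FROM annotations WHERE type = ?", (annotation_type,))
    return cur.rowcount
```

Because the input is bound as a value, a string such as `break_up' OR '1'='1` is compared literally against the `type` column and matches no rows.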

Testing Checklist

Before marking features complete:

  • All existing features still work (lines, colors, delete, export)
  • New feature functions as designed
  • Error states handled gracefully
  • Browser console shows no errors
  • Docker build and run successful
  • Data persists across container restart

Version History

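v3.0.0 (Current)

  • Python ML service (FastAPI) for pattern inference
  • Feature engineering with TA-Lib; model training with RandomForest and XGBoost
  • MLflow experiment tracking and DVC data versioning
  • Prediction overlay with disagreement detection

v2.0.0 (Previous)

  • Label management sidebar with search/filter
  • Hacker theme with matrix colors and monospace font
  • Docker deployment with compose
  • Keyboard delete for labels
  • Label selection on chart

v1.0.0 (Initial)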

  • Basic annotation tools (Break Up, Break Down, Line, Delete)
  • Candlestick chart visualization
  • CSV import/export
  • SQLite persistence