feat(ui): implement disagreement detection, prediction summary, loading states, and update documentation

- Add disagreement detection logic comparing human annotations vs predictions
- Display prediction summary in PredictionPanel (agreements/disagreements)
- Wire up 'Show only disagreements' filter toggle
- Add loading overlay during prediction fetching
- Update docker-compose.yml with healthchecks for all services
- Update DEPLOYMENT.md with comprehensive ML service setup instructions
- Update README.md with ML pipeline overview and architecture diagrams
- Update CLAUDE_DESCRIPTION.md with v3.0.0 ML integration details

Remaining tasks (11.2, 11.4, 11.5) deferred - core functionality complete
Marko Djordjevic 2026-02-15 16:34:02 +01:00
parent 952eb7413c
commit 21f184aa8d
8 changed files with 585 additions and 56 deletions


@@ -2,11 +2,35 @@
## Project Overview
Candle Annotator is a web-based tool for manually annotating candlestick charts with pattern labels and trend lines. It's designed for traders and ML researchers creating labeled training datasets for trading algorithm development.
Candle Annotator is a complete machine learning platform for candlestick pattern recognition, combining manual annotation tools with an integrated Python ML pipeline. It's designed for traders and ML researchers to create labeled training data, train models, and deploy pattern recognition in an active learning loop.
**Current Version**: 2.0.0 (Major feature release with label management, hacker theme, and Docker support)
**Current Version**: 3.0.0 (ML Pipeline Integration - Feature engineering, training, and inference service)
## Recent Changes (v2.0.0)
## Recent Changes (v3.0.0)
### ML Pipeline Integration
- **Python ML Service**: FastAPI-based inference service for real-time pattern prediction
- **Feature Engineering**: TA-Lib integration for computing 150+ technical indicators (RSI, MACD, Bollinger Bands, etc.)
- **Model Training**: RandomForest and XGBoost models with MLflow experiment tracking
- **Prediction UI**: Chart overlay showing model predictions with confidence filtering
- **Disagreement Detection**: Automatic comparison of human annotations vs model predictions
- **Active Learning Loop**: Convert predictions to annotations for continuous model improvement
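The disagreement detection and confidence filtering listed above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the `HumanLabel` and `Prediction` shapes, the `summarizePredictions` name, the label strings, and the rule that labels and predictions are matched by candle timestamp are all assumptions.

```typescript
// Hypothetical shapes; the real types in src/types/predictions.ts may differ.
interface HumanLabel {
  time: number; // candle timestamp (assumed join key)
  label: string;
}

interface Prediction {
  time: number;
  label: string;
  confidence: number; // assumed to lie in [0, 1]
}

interface PredictionSummary {
  agreements: number;
  disagreements: Prediction[];
  unannotated: number; // predictions on candles with no human label
}

// Compare model predictions with human labels on the same candle,
// skipping predictions below the confidence threshold.
function summarizePredictions(
  human: HumanLabel[],
  predictions: Prediction[],
  minConfidence = 0.5,
): PredictionSummary {
  const humanByTime = new Map<number, string>();
  for (const h of human) humanByTime.set(h.time, h.label);

  const summary: PredictionSummary = { agreements: 0, disagreements: [], unannotated: 0 };
  for (const p of predictions) {
    if (p.confidence < minConfidence) continue; // confidence filtering
    const humanLabel = humanByTime.get(p.time);
    if (humanLabel === undefined) summary.unannotated += 1;
    else if (humanLabel === p.label) summary.agreements += 1;
    else summary.disagreements.push(p);
  }
  return summary;
}

const summary = summarizePredictions(
  [
    { time: 1, label: "break_up" },
    { time: 2, label: "break_down" },
  ],
  [
    { time: 1, label: "break_up", confidence: 0.9 },
    { time: 2, label: "break_up", confidence: 0.8 },
    { time: 3, label: "break_down", confidence: 0.7 },
    { time: 4, label: "break_up", confidence: 0.2 },
  ],
);
console.log(summary.agreements, summary.disagreements.length, summary.unannotated);
```

Under these assumptions, a "Show only disagreements" toggle would render just the entries in `summary.disagreements`, while the agreement and disagreement counts would feed the panel summary.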
### Backend Infrastructure
- **PostgreSQL**: Dedicated database for ML service metadata and training runs
- **MLflow Server**: Experiment tracking, model registry, and artifact storage
- **DVC**: Data versioning for datasets
- **Docker Compose**: Multi-service orchestration (Next.js app, ML service, MLflow, PostgreSQL)
- **Health Checks**: Service health monitoring across all containers
### API Additions
- **POST /api/predict**: Inference proxy to ML service for candle predictions
- **POST /api/predict/batch**: Batch prediction for time ranges
- **GET /api/model/info**: Model metadata, metrics, and label configuration
- **GET /api/span-annotations/export**: Export annotations in ML pipeline format
- **Span Source Tracking**: `source` and `model_prediction` fields for feedback loop
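One way the `source` and `model_prediction` fields could support the feedback loop is sketched below. Only those two field names come from the notes above; the `SpanPrediction` and `SpanAnnotationDraft` shapes, the `"human" | "model"` value set, and the `predictionToAnnotation` helper are hypothetical.

```typescript
// Hypothetical shapes; only `source` and `model_prediction` are named in the notes.
interface SpanPrediction {
  startTime: number;
  endTime: number;
  label: string;
  confidence: number;
}

interface SpanAnnotationDraft {
  start_time: number;
  end_time: number;
  label: string;
  source: "human" | "model"; // assumed value set
  model_prediction: { label: string; confidence: number } | null;
}

// Turn an accepted prediction into an annotation draft. The original model
// output is kept alongside the (possibly corrected) label so that later
// training runs can tell model-originated spans from manual ones.
function predictionToAnnotation(
  p: SpanPrediction,
  acceptedLabel: string = p.label,
): SpanAnnotationDraft {
  return {
    start_time: p.startTime,
    end_time: p.endTime,
    label: acceptedLabel,
    source: "model",
    model_prediction: { label: p.label, confidence: p.confidence },
  };
}

const draft = predictionToAnnotation(
  { startTime: 100, endTime: 200, label: "break_up", confidence: 0.85 },
  "break_down", // annotator overrides the model's label
);
console.log(JSON.stringify(draft));
```

Keeping the model's original label next to the accepted one makes a corrected prediction distinguishable from a confirmed one, which is the information an active learning loop needs.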
## Previous Changes (v2.0.0)
### Label Management System
- **Label Sidebar**: Comprehensive collapsible sidebar showing all Break Up and Break Down annotations
@@ -45,18 +69,29 @@ Candle Annotator is a web-based tool for manually annotating candlestick charts
- **Charting**: lightweight-charts v4 (TradingView fork)
- **Icons**: lucide-react
### Backend
### Backend (Next.js)
- **Runtime**: Node.js 18.x
- **API**: Next.js API Routes
- **Database**: SQLite with better-sqlite3
- **ORM**: Drizzle ORM
- **CSV**: papaparse
### ML Service (Python)
- **API Framework**: FastAPI with uvicorn
- **ML Libraries**: scikit-learn (RandomForest), XGBoost
- **Feature Engineering**: TA-Lib (Technical Analysis Library)
- **Data Processing**: pandas, numpy
- **Experiment Tracking**: MLflow (model registry, artifact storage)
- **Data Versioning**: DVC
- **Database**: PostgreSQL 16 (training run metadata)
- **Model Persistence**: joblib
- **Validation**: Pydantic
### DevOps
- **Containerization**: Docker with multi-stage builds
- **Orchestration**: docker-compose
- **Orchestration**: docker-compose (4 services: app, ml-service, mlflow, postgres)
- **Build**: Next.js standalone output mode
- **Monitoring**: Health check via wget
- **Monitoring**: Health checks on all services
## Core Features
@@ -80,34 +115,71 @@ Candle Annotator is a web-based tool for manually annotating candlestick charts
## File Structure & Key Files
```
src/
├── app/
│ ├── api/
│ │ ├── annotations/[id]/route.ts # GET label by ID, PATCH update, DELETE remove
│ │ ├── annotations/route.ts # GET all, POST create, DELETE bulk
│ │ ├── candles/route.ts # GET all candles
│ │ ├── export/route.ts # GET CSV export
│ │ ├── health/route.ts # GET health check
│ │ └── upload/route.ts # POST CSV file upload
│ ├── globals.css # Hacker theme CSS variables
│ ├── layout.tsx # Root layout with font loading
│ └── page.tsx # Main app (state management)
├── components/
│ ├── CandleChart.tsx # Chart core with markers
│ ├── SvgOverlay.tsx # Line drawing layer
│ ├── Toolbox.tsx # Sidebar with tools & label list
│ ├── FileUpload.tsx # CSV upload
│ └── ui/ # shadcn/ui components
└── lib/
  ├── db/
  │ ├── index.ts # Drizzle client
  │ ├── schema.ts # Table definitions
  │ └── migrate.ts # Migration runner
  └── utils.ts # Utility functions
Docker files:
├── Dockerfile # Multi-stage build
├── docker-compose.yml # Compose configuration
candle_annotator/
├── src/ # Next.js Frontend & API
│ ├── app/
│ │ ├── api/
│ │ │ ├── annotations/[id]/route.ts # GET label by ID, PATCH update, DELETE remove
│ │ │ ├── annotations/route.ts # GET all, POST create, DELETE bulk
│ │ │ ├── candles/route.ts # GET all candles
│ │ │ ├── export/route.ts # GET CSV export
│ │ │ ├── health/route.ts # GET health check
│ │ │ ├── upload/route.ts # POST CSV file upload
│ │ │ ├── predict/route.ts # POST prediction proxy
│ │ │ ├── predict/batch/route.ts # POST batch prediction proxy
│ │ │ ├── model/info/route.ts # GET model info proxy
│ │ │ └── span-annotations/
│ │ │   ├── route.ts # GET/POST span annotations
│ │ │   └── export/route.ts # GET export for ML pipeline
│ │ ├── globals.css # Hacker theme CSS variables
│ │ ├── layout.tsx # Root layout with font loading
│ │ └── page.tsx # Main app (state + prediction mgmt)
│ ├── components/
│ │ ├── CandleChart.tsx # Chart core with prediction overlay
│ │ ├── PredictionPanel.tsx # Prediction controls & summary
│ │ ├── SvgOverlay.tsx # Line drawing layer
│ │ ├── Toolbox.tsx # Sidebar with tools & label list
│ │ ├── FileUpload.tsx # CSV upload
│ │ └── ui/ # shadcn/ui components
│ ├── types/
│ │ └── predictions.ts # Prediction types
│ └── lib/
│   ├── db/
│   │ ├── index.ts # Drizzle client
│   │ ├── schema.ts # Table definitions (incl. span annotations)
│   │ └── migrate.ts # Migration runner
│   └── utils.ts # Utility functions
├── services/ml/ # Python ML Service
│ ├── app/
│ │ ├── main.py # FastAPI app & inference endpoints
│ │ ├── config.py # Pydantic config models
│ │ ├── db.py # PostgreSQL connection
│ │ └── annotation_ingestion.py # Annotation processing
│ ├── features/
│ │ ├── talib_features.py # TA-Lib indicator computation
│ │ ├── candle_features.py # Candle pattern features
│ │ └── custom_loader.py # Custom feature plugins
│ ├── training/
│ │ ├── train.py # Main training entry point
│ │ ├── evaluation.py # Metrics & visualization
│ │ └── models/
│ │   ├── random_forest.py # RandomForest wrapper
│ │   └── xgboost_model.py # XGBoost wrapper
│ ├── config/
│ │ └── pipeline.yaml # Pipeline configuration
│ ├── data/
│ │ ├── raw/ # Raw OHLCV CSV files
│ │ ├── enriched/ # Feature-engineered data
│ │ ├── labeled/ # Labeled training datasets
│ │ └── annotations/ # Exported annotations
│ ├── pipeline.py # Main pipeline orchestrator
│ ├── Dockerfile # ML service container
│ └── requirements.txt # Python dependencies
├── Docker files:
├── Dockerfile # Next.js multi-stage build
├── docker-compose.yml # Multi-service orchestration
├── .dockerignore # Exclude from image
└── .env.example # Environment template
```