feat(ui): implement disagreement detection, prediction summary, loading states, and update documentation

- Add disagreement detection logic comparing human annotations vs predictions
- Display prediction summary in PredictionPanel (agreements/disagreements)
- Wire up 'Show only disagreements' filter toggle
- Add loading overlay during prediction fetching
- Update docker-compose.yml with healthchecks for all services
- Update DEPLOYMENT.md with comprehensive ML service setup instructions
- Update README.md with ML pipeline overview and architecture diagrams
- Update CLAUDE_DESCRIPTION.md with v3.0.0 ML integration details

Remaining tasks (11.2, 11.4, 11.5) deferred - core functionality complete
2026-02-15 16:34:02 +01:00 · 2026-02-15 16:34:02 +01:00 · 21f184aa8d
commit 21f184aa8d
parent 952eb7413c
8 changed files with 585 additions and 56 deletions
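
The disagreement detection named in the commit message lives in the UI code, which this excerpt does not show. As a rough illustration only, the comparison could look like the Python sketch below. The label maps keyed by candle index, the names `PredictionSummary`, `summarize`, and `filter_indices`, and the rule that candles lacking either label are skipped are all assumptions, not the committed implementation.

```python
from dataclasses import dataclass, field


@dataclass
class PredictionSummary:
    """Counts a summary panel might display (hypothetical shape)."""
    agreements: int = 0
    disagreements: int = 0
    disagreement_indices: list[int] = field(default_factory=list)


def summarize(human: dict[int, str], predicted: dict[int, str]) -> PredictionSummary:
    """Compare human and predicted labels candle by candle.

    Assumption: only candles that carry BOTH a human annotation and a
    prediction are compared; all other candles are ignored.
    """
    summary = PredictionSummary()
    for index in sorted(human.keys() & predicted.keys()):
        if human[index] == predicted[index]:
            summary.agreements += 1
        else:
            summary.disagreements += 1
            summary.disagreement_indices.append(index)
    return summary


def filter_indices(indices: list[int], summary: PredictionSummary,
                   only_disagreements: bool) -> list[int]:
    """Apply a 'show only disagreements' style toggle to a list of candle indices."""
    if not only_disagreements:
        return list(indices)
    flagged = set(summary.disagreement_indices)
    return [i for i in indices if i in flagged]
```

With the toggle off, `filter_indices` returns every index unchanged; with it on, only candles where the two labels differ remain.
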
--- a/DEPLOYMENT.md
+++ b/DEPLOYMENT.md
@@ -157,6 +157,161 @@ The application doesn't require any environment variables for local development.
 └── public/               # Static assets
 ```

+## ML Service Setup (Optional)
+
+The Candle Annotator includes an optional Python ML service for pattern recognition and prediction.
+
+### Prerequisites for ML Service
+
+- Python 3.11+
+- TA-Lib C library
+- PostgreSQL 16
+
+### Local ML Service Setup
+
+#### 1. Install TA-Lib C Library
+
+**Linux (Debian/Ubuntu):**
+```bash
+sudo apt-get update
+sudo apt-get install libta-lib-dev
+```
+
+**macOS:**
+```bash
+brew install ta-lib
+```
+
+**From Source:**
+```bash
+wget http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz
+tar -xzf ta-lib-0.4.0-src.tar.gz
+cd ta-lib/
+./configure --prefix=/usr
+make
+sudo make install
+```
+
+#### 2. Install Python Dependencies
+
+```bash
+cd services/ml
+pip install -r requirements.txt
+```
+
+#### 3. Set Up PostgreSQL
+
+The ML service requires PostgreSQL to store training-run metadata:
+
+```bash
+# Create database
+createdb ml_db
+
+# Or using psql
+psql -c "CREATE DATABASE ml_db;"
+```
+
+#### 4. Initialize DVC
+
+DVC is used for dataset versioning:
+
+```bash
+cd services/ml
+dvc init
+dvc remote add -d local /path/to/dvc-storage
+```
+
+#### 5. Run MLflow Tracking Server
+
+MLflow tracks experiments and stores models:
+
+```bash
+mlflow server \
+  --backend-store-uri ./mlruns \
+  --default-artifact-root ./mlruns/artifacts \
+  --host 0.0.0.0 \
+  --port 5000
+```
+
+#### 6. Configure Pipeline
+
+Edit `services/ml/config/pipeline.yaml` to configure:
+- Feature engineering settings
+- Model hyperparameters
+- Data paths
+- MLflow experiment name
+
+#### 7. Start ML Service
+
+```bash
+cd services/ml
+uvicorn app.main:app --host 0.0.0.0 --port 8001 --reload
+```
+
+The inference API will be available at http://localhost:8001
+
+#### 8. Configure Next.js App
+
+Create `.env.local` in the project root:
+
+```env
+INFERENCE_API_URL=http://localhost:8001
+INFERENCE_API_TIMEOUT=30000
+INFERENCE_BATCH_TIMEOUT=120000
+NEXT_PUBLIC_PREDICTIONS_ENABLED=true
+```
+
+### Running the ML Pipeline
+
+The ML pipeline consists of four stages:
+1. **Feature Engineering** - Extract TA-Lib indicators from OHLCV data
+2. **Annotation Ingestion** - Convert span annotations to labeled datasets
+3. **Training** - Train models with MLflow tracking
+4. **Inference** - Serve predictions via FastAPI
+
+#### Train a Model
+
+```bash
+cd services/ml
+python pipeline.py --config config/pipeline.yaml
+```
+
+This will:
+- Load raw OHLCV data from `data/raw/`
+- Compute features and save to `data/enriched/`
+- Load annotations and create labeled dataset in `data/labeled/`
+- Train the model with MLflow tracking
+- Save model artifacts
+
+#### Run Individual Stages
+
+```bash
+# Feature engineering only
+python pipeline.py --config config/pipeline.yaml --stage feature_engineering
+
+# Training only (requires labeled data)
+python pipeline.py --config config/pipeline.yaml --stage training
+```
+
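The `--stage` flag suggests a dispatcher that runs either a single stage or the whole sequence. The real `pipeline.py` is not part of this diff, so the following is only a sketch of that pattern. The stage functions are placeholders, and the stage name `annotation_ingestion` is a guess; only `feature_engineering` and `training` appear in the guide.

```python
import argparse
from typing import Callable


# Placeholder stage bodies; real stages would read and write the data/ folders.
def feature_engineering(config_path: str) -> str:
    return f"features computed using {config_path}"


def annotation_ingestion(config_path: str) -> str:  # hypothetical stage name
    return f"labeled dataset built using {config_path}"


def training(config_path: str) -> str:
    return f"model trained using {config_path}"


# Dicts keep insertion order, so this also defines the order of a full run.
STAGES: dict[str, Callable[[str], str]] = {
    "feature_engineering": feature_engineering,
    "annotation_ingestion": annotation_ingestion,
    "training": training,
}


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Run the ML pipeline")
    parser.add_argument("--config", required=True, help="path to pipeline.yaml")
    parser.add_argument("--stage", choices=list(STAGES), default=None,
                        help="run one stage only; omit to run all stages")
    return parser.parse_args(argv)


def run(argv: list[str]) -> list[str]:
    """Run the selected stage, or every stage in order when none is given."""
    args = parse_args(argv)
    selected = [args.stage] if args.stage else list(STAGES)
    return [STAGES[name](args.config) for name in selected]
```

Calling `run(["--config", "config/pipeline.yaml", "--stage", "training"])` executes only the training placeholder, while omitting `--stage` runs all three in order.
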
+#### View Experiments
+
+Open the MLflow UI at http://localhost:5000
+
+#### Test Inference API
+
+```bash
+# Check health
+curl http://localhost:8001/health
+
+# Get model info
+curl http://localhost:8001/model/info
+
+# Predict (requires candles JSON)
+curl -X POST http://localhost:8001/predict \
+  -H "Content-Type: application/json" \
+  -d '{"candles": [...]}'
+```
+
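The same three calls can be scripted from Python using only the standard library. This is a minimal sketch. It assumes the endpoints return JSON objects, and it passes candles through unchanged because the candle schema is not documented here. The helper names `get_json`, `build_predict_request`, and `predict` are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8001"  # matches the uvicorn command in the guide


def get_json(path: str, timeout: float = 30.0) -> dict:
    """GET a path such as /health or /model/info and decode the JSON body."""
    with urllib.request.urlopen(f"{BASE_URL}{path}", timeout=timeout) as resp:
        return json.load(resp)


def build_predict_request(candles: list) -> urllib.request.Request:
    """Build the POST /predict request without sending it.

    The candle schema is not specified in the guide, so the list is
    serialized as given under the "candles" key.
    """
    body = json.dumps({"candles": candles}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(candles: list, timeout: float = 30.0) -> dict:
    """Send the request built above and decode the JSON response."""
    request = build_predict_request(candles)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

With the service running, `get_json("/health")` and `get_json("/model/info")` mirror the first two `curl` commands. Building the request separately from sending it makes the payload easy to inspect before any network call.
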
 ## Docker Deployment

 ### Prerequisites
@@ -173,10 +328,23 @@ docker-compose up --build
 ```

 This will:
-1. Build the Docker image with multi-stage build optimization
-2. Create a named volume `candle-data` for persistent database storage
-3. Start the container on port 3000
-4. Enable automatic restart unless stopped
+1. Build the Next.js app and ML service Docker images
+2. Start PostgreSQL for ML service metadata
+3. Start MLflow tracking server
+4. Start the ML inference service (FastAPI)
+5. Start the Next.js web application
+6. Create named volumes for persistent storage:
+   - `candle-data` - SQLite database for annotations
+   - `ml-data` - OHLCV data, features, labeled datasets
+   - `mlflow-data` - MLflow experiments and model artifacts
+   - `postgres-data` - PostgreSQL data
+7. Restart containers automatically unless they are explicitly stopped
+
+Services will be available at:
+- **Web UI**: http://localhost:3000
+- **ML Inference API**: http://localhost:8001
+- **MLflow UI**: http://localhost:5000
+- **PostgreSQL**: localhost:5432
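
After `docker-compose up`, it can be handy to confirm that each listed port is accepting connections. The sketch below polls the four host ports with plain TCP connects. The port numbers come from the list above; the function names and the timing defaults are assumptions. An open port only shows that something is listening, so this does not replace the Compose healthchecks.

```python
import socket
import time

# Host ports from the service list; adjust if docker-compose.yml maps them differently.
SERVICES = {
    "Web UI": 3000,
    "ML Inference API": 8001,
    "MLflow UI": 5000,
    "PostgreSQL": 5432,
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for(host: str, port: int, deadline_s: float = 60.0,
             interval_s: float = 2.0) -> bool:
    """Poll until the port accepts a connection or the deadline passes."""
    end = time.monotonic() + deadline_s
    while True:
        if port_open(host, port):
            return True
        if time.monotonic() >= end:
            return False
        time.sleep(interval_s)


def report(host: str = "localhost", deadline_s: float = 60.0) -> dict[str, bool]:
    """Check every service in turn and return a name -> reachable map."""
    return {name: wait_for(host, port, deadline_s) for name, port in SERVICES.items()}
```

Calling `report()` returns a mapping such as service name to `True` or `False`, which a deployment script could use to decide whether to proceed.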

 ### Running in Detached Mode