feat(ui): implement disagreement detection, prediction summary, loading states, and update documentation

- Add disagreement detection logic comparing human annotations vs predictions
- Display prediction summary in PredictionPanel (agreements/disagreements)
- Wire up 'Show only disagreements' filter toggle
- Add loading overlay during prediction fetching
- Update docker-compose.yml with healthchecks for all services
- Update DEPLOYMENT.md with comprehensive ML service setup instructions
- Update README.md with ML pipeline overview and architecture diagrams
- Update CLAUDE_DESCRIPTION.md with v3.0.0 ML integration details

Remaining tasks (11.2, 11.4, 11.5) deferred - core functionality complete
Marko Djordjevic 2026-02-15 16:34:02 +01:00
parent 952eb7413c
commit 21f184aa8d
8 changed files with 585 additions and 56 deletions

@@ -157,6 +157,161 @@ The application doesn't require any environment variables for local development.
└── public/ # Static assets
```
## ML Service Setup (Optional)
The Candle Annotator includes an optional Python ML service for pattern recognition and prediction.
### Prerequisites for ML Service
- Python 3.11+
- TA-Lib C library
- PostgreSQL 16
### Local ML Service Setup
#### 1. Install TA-Lib C Library
**Linux (Debian/Ubuntu):**
```bash
sudo apt-get update
sudo apt-get install libta-lib-dev
```
**macOS:**
```bash
brew install ta-lib
```
**From Source:**
```bash
wget http://prdownloads.sourceforge.net/ta-lib/ta-lib-0.4.0-src.tar.gz
tar -xzf ta-lib-0.4.0-src.tar.gz
cd ta-lib/
./configure --prefix=/usr
make
sudo make install
```
#### 2. Install Python Dependencies
```bash
cd services/ml
pip install -r requirements.txt
```
#### 3. Setup PostgreSQL
The ML service requires PostgreSQL for storing training run metadata:
```bash
# Create the database (run one of the two commands below, not both)
createdb ml_db
# Or, equivalently, via psql
psql -c "CREATE DATABASE ml_db;"
```
#### 4. Initialize DVC
DVC is used for dataset versioning:
```bash
cd services/ml
dvc init
dvc remote add -d local /path/to/dvc-storage
```
#### 5. Run MLflow Tracking Server
MLflow tracks experiments and stores models:
```bash
mlflow server \
  --backend-store-uri ./mlruns \
  --default-artifact-root ./mlruns/artifacts \
  --host 0.0.0.0 \
  --port 5000
```
#### 6. Configure Pipeline
Edit `services/ml/config/pipeline.yaml` to configure:
- Feature engineering settings
- Model hyperparameters
- Data paths
- MLflow experiment name
#### 7. Start ML Service
```bash
cd services/ml
uvicorn app.main:app --host 0.0.0.0 --port 8001 --reload
```
The inference API will be available at http://localhost:8001
#### 8. Configure Next.js App
Create `.env.local` in the project root:
```env
INFERENCE_API_URL=http://localhost:8001
INFERENCE_API_TIMEOUT=30000
INFERENCE_BATCH_TIMEOUT=120000
NEXT_PUBLIC_PREDICTIONS_ENABLED=true
```
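Before starting the app, it can help to sanity-check these values. The following is a minimal sketch, not part of the project: `parse_env` and `check_inference_env` are hypothetical helpers, the parser only handles plain `KEY=VALUE` lines (no quoting), and the expectations that the timeouts are positive integers and that the flag is literally `true` or `false` are assumptions inferred from the sample values above.

```python
from typing import Dict, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_inference_env(values: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    problems: List[str] = []
    url = values.get("INFERENCE_API_URL", "")
    if not url.startswith(("http://", "https://")):
        problems.append("INFERENCE_API_URL should be an http(s) URL")
    # Assumption: both timeouts are positive integers.
    for key in ("INFERENCE_API_TIMEOUT", "INFERENCE_BATCH_TIMEOUT"):
        raw = values.get(key, "")
        if not raw.isdigit() or int(raw) <= 0:
            problems.append(f"{key} should be a positive integer")
    # Assumption: the flag is the literal string "true" or "false".
    if values.get("NEXT_PUBLIC_PREDICTIONS_ENABLED") not in ("true", "false"):
        problems.append("NEXT_PUBLIC_PREDICTIONS_ENABLED should be true or false")
    return problems
```

Passing the contents of `.env.local` to `parse_env` and then to `check_inference_env` should return an empty list for the example file above.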
### Running the ML Pipeline
The ML pipeline consists of:
1. **Feature Engineering** - Extract TA-Lib indicators from OHLCV data
2. **Annotation Ingestion** - Convert span annotations to labeled datasets
3. **Training** - Train models with MLflow tracking
4. **Inference** - Serve predictions via FastAPI
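The first three stages can be pictured as an ordered chain in which each stage consumes the output of the previous one. The sketch below is illustrative only and is not taken from `pipeline.py`: the stage functions, the `run_pipeline` helper, and the in-memory data shapes are hypothetical, a simple high-low range stands in for the TA-Lib indicators, and a trivial average stands in for the real model. Inference is omitted because it is served separately via FastAPI.

```python
from typing import Any, Callable, Dict, List, Optional

Context = Dict[str, Any]


def feature_engineering(ctx: Context) -> None:
    # Stand-in for TA-Lib indicators: add the high-low range to each candle.
    ctx["features"] = [{**c, "range": c["high"] - c["low"]} for c in ctx["candles"]]


def annotation_ingestion(ctx: Context) -> None:
    # Label row i with 1 if it lies inside any annotated (start, end) span, inclusive.
    ctx["labels"] = [
        int(any(start <= i <= end for start, end in ctx["spans"]))
        for i in range(len(ctx["features"]))
    ]


def training(ctx: Context) -> None:
    # Placeholder "model": the mean range of the positively labeled rows.
    positives = [f["range"] for f, y in zip(ctx["features"], ctx["labels"]) if y == 1]
    mean = sum(positives) / len(positives) if positives else 0.0
    ctx["model"] = {"mean_positive_range": mean}


# Stage names are assumptions, apart from the two --stage values shown in this guide.
STAGES: Dict[str, Callable[[Context], None]] = {
    "feature_engineering": feature_engineering,
    "annotation_ingestion": annotation_ingestion,
    "training": training,
}
STAGE_ORDER: List[str] = list(STAGES)


def run_pipeline(ctx: Context, stage: Optional[str] = None) -> Context:
    """Run every stage in order, or only the named stage."""
    if stage is not None and stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    for name in ([stage] if stage else STAGE_ORDER):
        STAGES[name](ctx)
    return ctx
```

Keeping the stages in a name-to-function mapping is one simple way to support running either the whole chain or a single stage against intermediate results produced earlier.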
#### Train a Model
```bash
cd services/ml
python pipeline.py --config config/pipeline.yaml
```
This will:
- Load raw OHLCV data from `data/raw/`
- Compute features and save to `data/enriched/`
- Load annotations and create labeled dataset in `data/labeled/`
- Train the model with MLflow tracking
- Save model artifacts
#### Run Individual Stages
```bash
# Feature engineering only
python pipeline.py --config config/pipeline.yaml --stage feature_engineering
# Training only (requires labeled data)
python pipeline.py --config config/pipeline.yaml --stage training
```
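For orientation, a command line with this shape could be parsed as follows. This is a hedged sketch using the standard `argparse` module, not the actual contents of `pipeline.py`; the `build_parser` and `parse_args` names are hypothetical, and only the two stage values shown above are listed as choices.

```python
import argparse
from typing import List, Optional

# Only the stage names that appear in this guide; the real script may accept more.
STAGE_CHOICES = ["feature_engineering", "training"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Run the ML pipeline (illustrative sketch)")
    parser.add_argument("--config", required=True, help="Path to the pipeline YAML config")
    parser.add_argument(
        "--stage",
        choices=STAGE_CHOICES,
        default=None,
        help="Run a single stage; omit to run all stages",
    )
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    # argv=None makes argparse read sys.argv[1:], as in a normal CLI run.
    return build_parser().parse_args(argv)
```

With this layout, omitting `--stage` leaves `args.stage` as `None`, which a caller can treat as "run everything".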
#### View Experiments
Open the MLflow UI at http://localhost:5000 (the tracking server from step 5 must be running).
#### Test Inference API
```bash
# Check health
curl http://localhost:8001/health
# Get model info
curl http://localhost:8001/model/info
# Predict (replace [...] with a JSON array of candle objects)
curl -X POST http://localhost:8001/predict \
  -H "Content-Type: application/json" \
  -d '{"candles": [...]}'
```
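The same checks can be scripted. The sketch below uses only the Python standard library and the three endpoints shown above; the helper names are hypothetical, and the candle fields in the commented usage line (`open`, `high`, `low`, `close`, `volume`) are an assumption, since the request schema is not spelled out here.

```python
import json
import urllib.request
from typing import Any, Dict, List

BASE_URL = "http://localhost:8001"  # inference API address used in this guide


def get_json(path: str, base_url: str = BASE_URL, timeout: float = 30.0) -> Any:
    """GET a JSON endpoint such as /health or /model/info."""
    with urllib.request.urlopen(f"{base_url}{path}", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def build_predict_request(
    candles: List[Dict[str, Any]], base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build the POST /predict request without sending it."""
    body = json.dumps({"candles": candles}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(candles: List[Dict[str, Any]], base_url: str = BASE_URL, timeout: float = 30.0) -> Any:
    """Send the candles to /predict and return the decoded JSON response."""
    request = build_predict_request(candles, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (requires the ML service to be running; field names are assumed):
# print(get_json("/health"))
# print(predict([{"open": 1.0, "high": 2.0, "low": 0.5, "close": 1.5, "volume": 100}]))
```

Separating request construction from sending makes the payload easy to inspect before the service is up.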
## Docker Deployment
### Prerequisites
@@ -173,10 +328,23 @@ docker-compose up --build
```
This will:
1. Build the Docker image with multi-stage build optimization
2. Create a named volume `candle-data` for persistent database storage
3. Start the container on port 3000
4. Enable automatic restart unless stopped
1. Build the Next.js app and ML service Docker images
2. Start PostgreSQL for ML service metadata
3. Start MLflow tracking server
4. Start the ML inference service (FastAPI)
5. Start the Next.js web application
6. Create named volumes for persistent storage:
   - `candle-data` - SQLite database for annotations
   - `ml-data` - OHLCV data, features, labeled datasets
   - `mlflow-data` - MLflow experiments and model artifacts
   - `postgres-data` - PostgreSQL data
7. Enable automatic restart unless stopped
Services will be available at:
- **Web UI**: http://localhost:3000
- **ML Inference API**: http://localhost:8001
- **MLflow UI**: http://localhost:5000
- **PostgreSQL**: localhost:5432
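A quick way to confirm that the containers are listening is to probe the ports listed above. This is a minimal sketch with hypothetical helper names; it only tests that a TCP connection can be opened, so it is not a substitute for the service healthchecks defined in `docker-compose.yml`.

```python
import socket
from typing import Dict

# Ports listed in this guide for the Docker Compose services.
SERVICES: Dict[str, int] = {
    "Web UI": 3000,
    "ML Inference API": 8001,
    "MLflow UI": 5000,
    "PostgreSQL": 5432,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(host: str = "localhost") -> Dict[str, bool]:
    """Map each service name to whether its port currently accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}


# Example usage:
# for name, is_up in report().items():
#     print(f"{name}: {'up' if is_up else 'down'}")
```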
### Running in Detached Mode