Context
The Candle Annotator is a Next.js app with SQLite storage that lets users annotate candlestick charts with pattern labels. It currently has no ML capabilities — annotations are created manually and exported as CSV/JSON, but there's no way to train models or get predictions back into the UI.
The existing stack is: Next.js 16 (App Router), React 19, lightweight-charts v4, SQLite via Drizzle ORM, Docker deployment. The app runs as a single container on port 3000.
We need to add a Python ML service that sits alongside the Next.js app, connected via HTTP. The Python ecosystem (scikit-learn, XGBoost, TA-Lib, MLflow) is the right tool for this job; Node.js has no comparably mature equivalents for this stack.
Goals / Non-Goals
Goals:
- Stand up a Python FastAPI service at services/ml/ that handles feature engineering, annotation ingestion, training, and inference
- Use TA-Lib for programmatic candlestick pattern detection (CDL* functions)
- Train tree-based models (RandomForest, XGBoost) with MLflow tracking
- Serve predictions via REST API on port 8001
- Proxy inference requests through Next.js API routes to avoid CORS
- Render model predictions on the chart as a distinct visual layer
- Version datasets with DVC
Non-Goals:
- Deep learning models (LSTM, GRU, transformer) — architecture should accommodate them later, but not implemented now
- Multi-user or multi-tenant support
- Real-time streaming predictions (batch/on-demand only)
- Automated retraining pipelines or CI/CD for model deployment
- GPU inference or training optimization
Decisions
1. Separate Python service vs. embedded in Next.js
Decision: Standalone Python FastAPI service in services/ml/, communicating via HTTP.
Alternatives considered:
- Python subprocess spawned by Next.js — fragile process management, no independent scaling
- Python WASM in browser — TA-Lib and scikit-learn don't work in WASM
- Shared SQLite access from Python — SQLite doesn't handle concurrent writers well
Rationale: Clean separation of concerns. The Next.js app owns the UI and annotation data; the Python service owns ML. They communicate through well-defined REST APIs. Each can be developed, tested, and deployed independently.
2. Directory structure: services/ml/ in the monorepo
Decision: Place the Python service at services/ml/ within the existing repo.
Alternatives considered:
- Separate repository — adds overhead for a single-developer project
- Top-level ml/ directory: the services/ namespace leaves room for future services
Rationale: Monorepo keeps everything together. The services/ prefix signals it's a separate deployable unit, not part of the Next.js app.
3. Pipeline config via YAML
Decision: Single config/pipeline.yaml controls all pipeline stages (feature engineering, annotation ingestion, training, inference). Each stage has an enabled flag.
Rationale: Makes experiments reproducible — the full config is logged as an MLflow artifact with each training run. Stages can be toggled independently (e.g., skip feature engineering, use only programmatic labels).
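The stage toggles could be consumed roughly as follows. This is a minimal sketch: the dict stands in for the parsed contents of config/pipeline.yaml, and the key names, `STAGE_ORDER`, and `enabled_stages` are hypothetical. Only the four stages and the per-stage enabled flag come from the decision above.

```python
from typing import Any

# Hypothetical parsed form of config/pipeline.yaml. Only the stage names and
# the `enabled` flag come from the design; other keys are illustrative.
PIPELINE_CONFIG: dict[str, dict[str, Any]] = {
    "feature_engineering": {"enabled": False},
    "annotation_ingestion": {"enabled": True},
    "training": {"enabled": True, "model": "random_forest"},
    "inference": {"enabled": True},
}

# Assumed execution order of the stages.
STAGE_ORDER = [
    "feature_engineering",
    "annotation_ingestion",
    "training",
    "inference",
]


def enabled_stages(config: dict[str, dict[str, Any]]) -> list[str]:
    """Return the stages to run, in pipeline order, skipping disabled ones.

    A stage that is missing from the config is treated as disabled.
    """
    return [s for s in STAGE_ORDER if config.get(s, {}).get("enabled", False)]
```

Treating a missing stage as disabled is a design choice of this sketch; failing loudly on a missing stage would be an equally reasonable default.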
4. MLflow for experiment tracking, DVC for data versioning
Decision: MLflow tracks experiments, metrics, models. DVC versions datasets.
Alternatives considered:
- Weights & Biases — heavier, cloud-dependent
- Plain file logging — loses queryability and model registry
- Git LFS for data — doesn't handle dataset lineage
Rationale: MLflow runs locally (no cloud dependency), provides a model registry, and has native integrations with scikit-learn and XGBoost. DVC handles data versioning without bloating the git repo.
5. Annotation export format: JSON from existing API
Decision: The Python pipeline reads annotation data by calling the existing Next.js API endpoints (GET /api/annotations, span annotation exports) or from exported JSON/CSV files in data/annotations/.
Alternatives considered:
- Direct SQLite read from Python — concurrent access issues
- Shared PostgreSQL — overkill for single-user tool
Rationale: Using the existing API or file exports keeps the services decoupled. The annotation tool already has export functionality. For training, batch export to data/annotations/ is sufficient.
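For the file-based path, ingestion might look like the sketch below. It assumes each export in data/annotations/ is a JSON array of annotation objects. The actual export schema is not specified here, and `load_annotations` is a hypothetical helper name.

```python
import json
from pathlib import Path


def load_annotations(directory: str) -> list[dict]:
    """Read every *.json export in `directory` and merge them into one list.

    Assumption: each file holds a JSON array of annotation objects.
    Files are read in sorted filename order so the result is deterministic.
    """
    annotations: list[dict] = []
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        if not isinstance(data, list):
            raise ValueError(f"{path.name}: expected a JSON array of annotations")
        annotations.extend(data)
    return annotations
```

Because the function only reads files, it never touches the Next.js SQLite database, which keeps the two services decoupled as intended.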
6. Label encoding: windowed classification first, BIO later
Decision: Start with fixed-window classification (each annotation span → one training sample of N candles). The design leaves room for BIO sequence labeling, but v1 does not implement it.
Rationale: Window classification works with tree-based models (RandomForest, XGBoost) which are the initial model types. BIO encoding is needed for sequence models (BiLSTM-CRF) which are a non-goal for now.
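One way to turn spans into fixed-size samples is sketched below. The `Span` type and the choice to align each window so that it ends on the span's last candle are assumptions made for illustration. The design only states that each span becomes one sample of N candles.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """Hypothetical annotation span: inclusive candle indices plus a label."""
    start: int
    end: int
    label: str


def span_to_window(span: Span, n_candles: int, window: int) -> Optional[tuple[int, int]]:
    """Return the [lo, hi) index range of a `window`-sized slice that ends on
    the span's last candle, or None if that slice falls outside the series."""
    hi = span.end + 1
    lo = hi - window
    if lo < 0 or hi > n_candles:
        return None
    return lo, hi


def build_samples(candles: list[dict], spans: list[Span], window: int) -> list[tuple[list[dict], str]]:
    """Turn each span into one (window_of_candles, label) training sample.

    Spans too close to the start of the series to fill a window are skipped.
    """
    samples = []
    for span in spans:
        bounds = span_to_window(span, len(candles), window)
        if bounds is not None:
            lo, hi = bounds
            samples.append((candles[lo:hi], span.label))
    return samples
```

Every sample has exactly `window` candles regardless of the span's own length, which is what lets tree-based models consume a fixed-width feature vector.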
7. Next.js proxy routes for inference
Decision: Next.js API routes at /api/predict, /api/predict/batch, /api/model/info proxy to the Python service.
Rationale: Avoids CORS configuration. Lets us add auth or rate-limiting on the Next.js side later. The frontend only talks to one origin.
8. Prediction rendering: histogram series overlay
Decision: Use a lightweight-charts histogram series to render predictions as colored bars behind candles. Each bar's color maps to a predicted pattern label.
Alternatives considered:
- Custom canvas plugin — more control but significantly more code
- Series markers only — no area highlighting, just point markers
Rationale: Histogram series is the simplest approach that gives visual area coverage. Can upgrade to a canvas plugin later for hatched/dashed styling. Markers are added for label text with confidence scores.
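The data shaping behind this overlay can be sketched as follows. The rendering itself happens in the frontend, so this only illustrates the mapping from per-candle predictions to histogram points with `time`, `value`, and `color` fields. The prediction shape, the palette, and the constant bar height are all assumptions.

```python
# Hypothetical palette: labels and colors are illustrative, not from the design.
LABEL_COLORS = {
    "hammer": "rgba(38, 166, 154, 0.3)",
    "engulfing": "rgba(239, 83, 80, 0.3)",
}
FALLBACK_COLOR = "rgba(120, 120, 120, 0.3)"


def to_histogram_points(predictions: list[dict], bar_value: float = 1.0) -> list[dict]:
    """Map per-candle predictions to histogram points.

    Assumed input shape: {"time": <unix seconds>, "label": <str or None>}.
    Candles with no predicted pattern (label None) produce no bar.
    Every bar gets the same height; only the color encodes the label.
    """
    points = []
    for p in predictions:
        if p["label"] is None:
            continue
        points.append(
            {
                "time": p["time"],
                "value": bar_value,
                "color": LABEL_COLORS.get(p["label"], FALLBACK_COLOR),
            }
        )
    return points
```

Using a constant bar height keeps the overlay a pure area highlight; confidence is left to the markers, as described above.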
9. Docker: multi-container with docker-compose
Decision: Add three containers to the existing docker-compose: ml-service (the FastAPI service), mlflow (the tracking server), and postgres (state for the ML service). Shared volume for data/.

    services:
      candle-annotator:  # existing
      ml-service:        # new - FastAPI on 8001
      mlflow:            # new - tracking server on 5000
      postgres:          # new - PostgreSQL for ML service state
Rationale: Each service has its own Dockerfile and dependencies. The shared data/ volume lets both services access OHLCV and annotation files.
Risks / Trade-offs
[TA-Lib C library dependency] → TA-Lib requires installing a system-level C library before the Python wrapper works. Mitigated by pinning it in the Dockerfile (apt-get install libta-lib-dev) and providing clear setup instructions for local development.
[MLflow storage growth] → MLflow artifacts (models, plots, configs) accumulate over time. Mitigated by using a local mlruns/ directory with periodic manual cleanup. Not a concern at single-user scale.
[Preprocessing parity] → Feature engineering during inference must exactly match training. If the pipeline config changes between training and inference, predictions are invalid. Mitigated by logging the full pipeline config as an MLflow artifact and loading it during inference to replicate preprocessing.
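As a possible complement to loading the logged config, the service could refuse to run when the two configs differ. The sketch below is an added guard, not part of the stated mitigation, and both function names are hypothetical.

```python
import hashlib
import json


def config_fingerprint(config: dict) -> str:
    """Return a SHA-256 hash of a config dict that ignores key order.

    The config must be JSON-serializable.
    """
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def assert_same_preprocessing(training_config: dict, inference_config: dict) -> None:
    """Raise ValueError if the inference config differs from the training one."""
    if config_fingerprint(training_config) != config_fingerprint(inference_config):
        raise ValueError("pipeline config differs from the one used for training")
```

Storing the fingerprint next to the model would make a mismatch detectable without diffing whole config files.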
[Class imbalance] → Pattern classes will be heavily imbalanced (mostly "no pattern"). Mitigated by using class_weights: balanced and tracking per-class precision/recall, not just accuracy.
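For intuition, the "balanced" heuristic used by scikit-learn weights each class by n_samples / (n_classes * class_count). The sketch below reimplements that formula in plain Python for illustration only; `balanced_class_weights` is a hypothetical name, and in practice the library option would do this work.

```python
from collections import Counter


def balanced_class_weights(labels: list[str]) -> dict[str, float]:
    """Compute weight(c) = n_samples / (n_classes * count(c)) for each class.

    Rare classes get weights above 1, frequent classes below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}
```

With 8 "none" samples and 2 "hammer" samples, "none" is weighted 10 / (2 * 8) = 0.625 and "hammer" 10 / (2 * 2) = 2.5, so each class contributes equally to the loss in aggregate.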
[SQLite concurrent access] → If both the Next.js app and Python service try to access the SQLite DB simultaneously, writes can fail. Mitigated by keeping Python read-only on annotation data (via API calls or file exports), never writing to the Next.js SQLite DB directly.
[Temporal data leakage] → Random train/test splits on time series data leak future information. Mitigated by making temporal splits the default; the split strategy is configurable, but anything other than temporal has to be chosen explicitly.
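A temporal split can be sketched as below, assuming each sample carries a `time` key (a hypothetical field name). The latest samples go to the test set, so no test sample is earlier than any training sample.

```python
def temporal_split(samples: list[dict], test_fraction: float = 0.2) -> tuple[list[dict], list[dict]]:
    """Split samples into (train, test) by time instead of at random.

    Samples are sorted by their "time" key and the most recent
    `test_fraction` of them (at least one) becomes the test set.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    ordered = sorted(samples, key=lambda s: s["time"])
    n_test = max(1, round(len(ordered) * test_fraction))
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]
```

With windowed samples, windows near the cut can still overlap candles on both sides of it, so a gap between train and test may be worth considering.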
Resolved Questions
- Python service database: PostgreSQL — the Python service uses its own Postgres instance for storing training run references, pipeline configs, and any service-specific state. Added to docker-compose.
- DVC remote storage: Local backend — datasets versioned on the local filesystem, simplest setup for single-developer workflow.
- Prediction persistence: Ephemeral — predictions are fetched on demand from the inference API, not persisted in any database. The frontend caches them in memory keyed by time range + model version.