candle-annotator

Author	SHA1	Message	Date
Marko Djordjevic	e3184ea86c	save mlruns dir to git	2026-02-15 22:26:01 +01:00
Marko Djordjevic	f7b154f866	fix(ml): extract class labels from model when metadata is missing The model info returned empty labels array because the pkl file has no metadata dict. Now extracts labels from model.classes_ or model.model.classes_ as fallback.	2026-02-15 22:18:16 +01:00
Marko Djordjevic	be09cef098	fix(ml): add missing numpy import in main.py	2026-02-15 22:08:41 +01:00
Marko Djordjevic	40d6d1739e	fix(ml): add windowed feature flattening for inference parity The model was trained on 94-candle sliding windows flattened to 2820 features (94 candles x 30 features). Inference was sending raw per-candle features (27 columns). Changes: - Rewrite preprocessing to return (X, window_times) tuple - Add sliding window creation with correct feature ordering - Fill missing columns (average, barCount) with 0 for feature parity - Fill NaN from indicator warmup with 0 instead of dropping rows - Always compute all indicators (including MFI) for feature parity - Update predict and batch predict endpoints for new signature	2026-02-15 22:07:06 +01:00
Marko Djordjevic	4c7b3f2676	fix(ml): detect all-NaN volume column (not just missing column) The Pydantic model sets volume=None when absent, creating an all-NaN column rather than a missing column. Check isna().all() in addition to column existence.	2026-02-15 21:58:16 +01:00
Marko Djordjevic	317f925c43	fix(ml): handle missing volume data and skip volume-dependent indicators - Fill volume with 0 when column is absent from candle data - Skip MFI/OBV/AD/ADOSC indicators when no real volume data available - Fix pandas FutureWarning for inplace fillna in candle_features - Remove temporary debug NaN logging	2026-02-15 21:56:14 +01:00
Marko Djordjevic	b6b37160a7	fix(ml): cast OHLCV arrays to float64 for TA-Lib compatibility	2026-02-15 21:52:45 +01:00
Marko Djordjevic	f850728d44	fix(api): add GET /api/charts/[id] and fix batch prediction - Add GET handler to /api/charts/[id] route to fetch chart metadata - Fix batch prediction to use regular /predict endpoint with database candles - Remove /predict/batch usage (was designed for file-based predictions) - Make volume field optional in CandleData model (database candles don't have volume) - Convert timestamps to ISO dates for batch requests Known issue: TA-Lib indicators failing with 'input array type is not double' - May need to ensure candle data is float64/double type before processing	2026-02-15 21:49:22 +01:00
Marko Djordjevic	6d0d67e39b	fix(ml): make pair and timeframe optional in PredictRequest - Change pair and timeframe fields from required to optional - Frontend only sends candles array, not pair/timeframe metadata - These fields are only used for logging, not prediction logic - Update logging to handle None values with 'unknown' fallback - Fixes 422 validation error on /predict endpoint	2026-02-15 21:43:14 +01:00
Marko Djordjevic	5a7c901980	fix(frontend): update ModelInfoResponse types to match backend structure - Update TypeScript types to match flat backend response structure - Remove nested model_info and metrics objects - Remove label_config, use labels array and per_class_metrics array - Update all component references to use new structure - Generate default colors for prediction labels in CandleChart - Fix TypeScript type errors for nullable model_version - Remove accuracy/F1 metrics display (not in new response)	2026-02-15 21:39:38 +01:00
Marko Djordjevic	aa81d4f3d0	fix(ml): complete ML pipeline fixes and setup - Fix CCI indicator to use HLC prices instead of close only - Parse datetime column when loading enriched CSV - Strip timezone from annotation timestamps - Fix TA-Lib pattern names (CDL3WHITESOLDIERS, CDL3BLACKCROWS) - Exclude programmatic label columns from training features - Fix classification report to handle missing classes - Update MLflow tracking to use localhost:5000 - Grant PostgreSQL permissions to ml_user Pipeline now runs successfully end-to-end: - Feature engineering: 2543 rows, 31 columns - Annotation ingestion: 286 samples - Training: 89.47% test accuracy with Random Forest	2026-02-15 21:29:54 +01:00
Marko Djordjevic	ceb4103ec4	fix(ml): parse datetime column and fix TA-Lib pattern names - Add parse_dates parameter when loading enriched CSV - Strip timezone from annotation timestamps to match data - Fix pattern names: CDLTHREEWHITESOLDIERS -> CDL3WHITESOLDIERS - Fix pattern names: CDLTHREEBLACKCROWS -> CDL3BLACKCROWS	2026-02-15 21:13:20 +01:00
Marko Djordjevic	2b86524436	fix(ml): correct CCI indicator signature to use HLC prices	2026-02-15 21:09:47 +01:00
Marko Djordjevic	63486bc7b5	fix(ml): add CCI to hlc_indicators list CCI (Commodity Channel Index) requires high, low, and close prices	2026-02-15 21:08:20 +01:00
Marko Djordjevic	57240d4eea	fix(scripts): add created_at timestamps to annotation import Set created_at field for both span_label_types and span_annotations to satisfy NOT NULL constraint	2026-02-15 19:36:55 +01:00
Marko Djordjevic	a68a681c9b	fix(ml): handle date strings in TA-Lib annotation generator - Convert date strings to Unix timestamps in load_ohlcv() - Fix duplicate pattern names (CDL3WHITESOLDIERS/CDL3BLACKCROWS) - Ensure time column is always integer type	2026-02-15 19:30:38 +01:00
Marko Djordjevic	847ff67986	feat(ml): add TA-Lib annotation generation and import workflow Add complete workflow for using TA-Lib to bootstrap training data: - generate_talib_annotations.py: Python script to run TA-Lib CDL* functions and output span annotations in UI-compatible format - import_talib_annotations.ts: TypeScript script to import generated annotations into the UI database with auto-label-type creation - npm script 'import-annotations' for easy execution - TALIB_WORKFLOW.md: Comprehensive guide covering the full cycle: * Generate patterns with TA-Lib * Import into UI * Review and edit in browser * Export and train model * Compare predictions with TA-Lib detections * Iterate for improvement This enables the intended workflow: use TA-Lib for initial annotations, manually refine them, then train a model that learns from corrections.	2026-02-15 19:18:28 +01:00
Marko Djordjevic	3a83fd38e9	feat(ml): implement FastAPI inference service with model loading, preprocessing, and prediction endpoints	2026-02-15 14:29:07 +01:00
Marko Djordjevic	f4c0f9a836	feat(ml): implement training stage with MLflow tracking and model wrappers - Create RandomForestModel and XGBoostModel wrappers with class weight support - Implement temporal and random train/val/test splitting - Add MLflow experiment tracking with full parameter and metric logging - Create evaluation module for confusion matrix, feature importance, and classification reports - Implement model training with sklearn/xgboost flavor logging and optional registry registration - Store training run metadata in PostgreSQL - Wire training stage into pipeline.py orchestrator - Support both RandomForest and XGBoost models with configurable hyperparameters	2026-02-15 14:22:19 +01:00
Marko Djordjevic	16763b967e	feat(ml): implement annotation ingestion with windowed/BIO encoding and TA-Lib patterns	2026-02-15 12:28:58 +01:00
Marko Djordjevic	fd29ab91e0	feat(ml): implement feature engineering pipeline - Create pipeline.py with CLI argument parsing for running stages - Implement TA-Lib indicator computation with multi-output support - Add candle feature extraction (body_size, wicks, ratios, etc.) - Create custom feature loader with dynamic module import - Wire all feature engineering stages with NaN handling - Tasks completed: 2.2, 2.3, 3.1, 3.2, 3.3, 3.4, 3.5	2026-02-15 12:22:59 +01:00
Marko Djordjevic	ea339a54a7	feat(ml): add database schema, config parser, and DVC setup - Initialize DVC with local storage backend (task 1.6) - Create PostgreSQL schema for training_runs table (task 1.7) - Add SQLAlchemy database connection setup (task 1.8) - Create Pydantic config models for pipeline.yaml (task 2.1) - Add migration runner for database setup - Fix pyproject.toml package discovery config	2026-02-15 12:08:53 +01:00
Marko Djordjevic	1a653c5866	feat: add ML service scaffolding with Python FastAPI, Docker, and MLflow setup	2026-02-15 11:58:31 +01:00
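
Note on commit 40d6d1739e: the message describes flattening 94-candle sliding windows into 2820-value rows (94 candles x 30 features) and returning an (X, window_times) tuple. The sketch below illustrates that idea only and is not the repository's code. The function name make_flat_windows, the candle-major ordering inside each row, and the choice to stamp each window with its last candle's time are assumptions made for illustration.

```python
import numpy as np


def make_flat_windows(features, times, window=94):
    """Flatten sliding windows of per-candle features into one row per window.

    features: array-like of shape (n_candles, n_features)
    times:    sequence of n_candles timestamps
    Returns (X, window_times) where X has shape
    (n_candles - window + 1, window * n_features).
    """
    features = np.asarray(features, dtype=np.float64)
    # Replace NaN (e.g. from indicator warmup) with 0 instead of dropping rows,
    # as the commit message describes.
    features = np.where(np.isnan(features), 0.0, features)

    n_candles, n_features = features.shape
    if n_candles < window:
        # Not enough candles for even one window.
        return np.empty((0, window * n_features)), []

    # Assumed ordering: all features of candle 0, then candle 1, and so on.
    rows = [
        features[start:start + window].ravel()
        for start in range(n_candles - window + 1)
    ]
    X = np.stack(rows)

    # Assumption: each window is identified by the time of its last candle.
    window_times = list(times[window - 1:])
    return X, window_times


if __name__ == "__main__":
    demo = np.arange(100 * 30, dtype=np.float64).reshape(100, 30)
    X, window_times = make_flat_windows(demo, list(range(100)))
    print(X.shape, len(window_times))
```

Whatever ordering training used has to be reproduced exactly at inference time, which is the parity issue the commit addresses.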


73 commits