candle-annotator/openspec/changes/candle-backend/specs/prediction-ui/spec.md

## ADDED Requirements
### Requirement: Prediction state management
The system SHALL maintain a separate prediction state alongside the existing annotation state. The prediction state SHALL include: spans (array of prediction spans), isLoading, error, modelInfo, visible (toggle), confidenceThreshold (filter), selectedLabels (filter), and autoPredict (toggle). Prediction state SHALL be independent from annotation state.
#### Scenario: Initial prediction state
- **WHEN** the app loads
- **THEN** `spans` is empty, `visible` is true, `confidenceThreshold` defaults to 0.70, `autoPredict` is false, and `selectedLabels` includes all labels
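The state shape above could be sketched as follows. This is a minimal illustration, not the project's actual store: the `PredictionSpan` and `ModelInfo` field names, and the initial values of `isLoading`, `error`, and `modelInfo`, are assumptions that the scenario does not specify.

```typescript
// Hypothetical span shape; start_time/end_time/label follow the names used
// later in this spec, and confidence is assumed to be in the range 0.0 to 1.0.
interface PredictionSpan {
  start_time: number;
  end_time: number;
  label: string;
  confidence: number;
}

// Hypothetical model metadata, based on the fields shown in the controls panel.
interface ModelInfo {
  name: string;
  version: string;
  type: string;
  trainingDate: string;
}

interface PredictionState {
  spans: PredictionSpan[];
  isLoading: boolean;
  error: string | null;
  modelInfo: ModelInfo | null;
  visible: boolean;
  confidenceThreshold: number;
  selectedLabels: string[];
  autoPredict: boolean;
}

// Builds the state described in the "Initial prediction state" scenario.
function createInitialPredictionState(allLabels: string[]): PredictionState {
  return {
    spans: [],
    isLoading: false, // assumption: nothing is in flight at load
    error: null, // assumption
    modelInfo: null, // assumption: filled in once /api/model/info responds
    visible: true,
    confidenceThreshold: 0.7,
    selectedLabels: [...allLabels],
    autoPredict: false,
  };
}
```

Keeping this object separate from the annotation store is one way to satisfy the independence requirement: nothing in it references annotation data.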
### Requirement: On-demand prediction fetching
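The system SHALL fetch predictions on demand when the user clicks "Run on Visible". The system SHALL send the currently visible candles to `/api/predict` and update the prediction state with the results. When the user clicks "Predict All", the system SHALL send a batch request to `/api/predict/batch` for the full dataset. Predictions are ephemeral: they are not persisted to storage and are re-fetched on demand.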
#### Scenario: Run on visible candles
- **WHEN** user clicks "Run on Visible" button
- **THEN** the system sends the visible candle range to `/api/predict`, shows a loading state, and renders returned predictions on the chart
#### Scenario: Batch predict all
- **WHEN** user clicks "Predict All" button
- **THEN** the system sends a batch request to `/api/predict/batch` for the full dataset and renders all returned predictions
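A rough sketch of the fetch flow for both buttons is shown below. The request and response bodies are not defined in this spec, so the `{ spans: [...] }` response shape, the `PostJson` transport type, and the `update` callback are all assumptions made for illustration.

```typescript
interface PredictionSpan {
  start_time: number;
  end_time: number;
  label: string;
  confidence: number;
}

interface FetchState {
  spans: PredictionSpan[];
  isLoading: boolean;
  error: string | null;
}

// The transport is injected so the sketch does not depend on a global fetch.
type PostJson = (url: string, body: unknown) => Promise<unknown>;
type Update = (patch: Partial<FetchState>) => void;

// Sends a request to one of the two prediction endpoints and reports
// loading, success, and failure through the update callback.
async function fetchPredictions(
  endpoint: "/api/predict" | "/api/predict/batch",
  body: unknown,
  postJson: PostJson,
  update: Update,
): Promise<void> {
  update({ isLoading: true, error: null });
  try {
    // Assumption: the API answers with an object holding a spans array.
    const response = (await postJson(endpoint, body)) as { spans: PredictionSpan[] };
    update({ spans: response.spans, isLoading: false });
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    update({ isLoading: false, error: message });
  }
}
```

"Run on Visible" would call this with `/api/predict` and the visible candles, and "Predict All" with `/api/predict/batch`. Nothing is written to storage, which matches the ephemeral nature of predictions.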
### Requirement: Prediction caching
The system SHALL cache predictions in memory keyed by `${pair}_${timeframe}_${startTime}_${endTime}_${modelVersion}`. When the user scrolls to a range with cached predictions, the system SHALL use the cache instead of re-fetching. Cache SHALL be invalidated when the model version changes.
#### Scenario: Cache hit
- **WHEN** user scrolls back to a previously predicted range with the same model version
- **THEN** the system renders cached predictions without making an API call
#### Scenario: Cache invalidation on model change
- **WHEN** the model version changes (detected via `/api/model/info`)
- **THEN** all cached predictions are cleared
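The caching behaviour could be sketched with a plain `Map`. The key format is taken from the requirement; the `RangeKey` type, the `PredictionCache` class, and its method names are hypothetical.

```typescript
interface PredictionSpan {
  start_time: number;
  end_time: number;
  label: string;
  confidence: number;
}

interface RangeKey {
  pair: string;
  timeframe: string;
  startTime: number;
  endTime: number;
  modelVersion: string;
}

// Key format from the requirement text.
function cacheKey(k: RangeKey): string {
  return `${k.pair}_${k.timeframe}_${k.startTime}_${k.endTime}_${k.modelVersion}`;
}

class PredictionCache {
  private entries = new Map<string, PredictionSpan[]>();
  private modelVersion: string | null = null;

  get(k: RangeKey): PredictionSpan[] | undefined {
    return this.entries.get(cacheKey(k));
  }

  set(k: RangeKey, spans: PredictionSpan[]): void {
    this.entries.set(cacheKey(k), spans);
  }

  // Call with the version reported by /api/model/info.
  // A changed version clears every cached entry.
  onModelVersion(version: string): void {
    if (this.modelVersion !== null && this.modelVersion !== version) {
      this.entries.clear();
    }
    this.modelVersion = version;
  }

  size(): number {
    return this.entries.size;
  }
}
```

Because the model version is already part of the key, stale entries would never be returned after a version change; clearing them is still useful to release memory, and it matches the invalidation scenario.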
### Requirement: Prediction rendering on chart
The system SHALL render model predictions as a visual layer on the lightweight-charts instance, visually distinct from human annotations. Predictions SHALL use a histogram series with per-bar colors mapped to predicted pattern labels at reduced opacity (10-20%). Series markers SHALL be added at the start of each prediction span showing `{label} ({confidence}%)` positioned below bars.
#### Scenario: Render prediction spans
- **WHEN** predictions are loaded and visible is true
- **THEN** colored histogram bars appear behind candles for predicted patterns, with markers showing labels and confidence
#### Scenario: Predictions hidden
- **WHEN** the user toggles predictions off (visible = false)
- **THEN** the prediction histogram series and markers are removed from the chart
#### Scenario: Visual distinction from annotations
- **WHEN** both human annotations and model predictions exist for the same range
- **THEN** human annotations render as solid colored rectangles (above bars) and predictions render as low-opacity histogram bars (below bars), so the two layers are visually distinguishable
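One way to prepare the chart data is to convert spans into plain histogram points and marker objects before handing them to the chart. The sketch below does not call lightweight-charts itself; the `HistogramPoint` and `Marker` types are local stand-ins meant to resemble the objects a histogram series and series markers accept, and the color table, the 0.15 opacity, the bar value of 1, and the circle shape are assumptions. How the bar value maps to height depends on price-scale configuration, which is left out here.

```typescript
type Rgb = [number, number, number];

interface PredictionSpan {
  start_time: number;
  end_time: number;
  label: string;
  confidence: number;
}

// Local stand-ins for the data handed to the chart library.
interface HistogramPoint {
  time: number;
  value: number;
  color: string;
}

interface Marker {
  time: number;
  position: "belowBar";
  shape: "circle";
  color: string;
  text: string;
}

const FALLBACK_RGB: Rgb = [128, 128, 128];

function rgba(rgb: Rgb, opacity: number): string {
  return `rgba(${rgb[0]}, ${rgb[1]}, ${rgb[2]}, ${opacity})`;
}

// One low-opacity bar per candle time that falls inside a prediction span.
function toHistogramData(
  barTimes: number[],
  spans: PredictionSpan[],
  colors: Record<string, Rgb>,
  opacity = 0.15,
): HistogramPoint[] {
  const points: HistogramPoint[] = [];
  for (const time of barTimes) {
    const span = spans.find((s) => time >= s.start_time && time <= s.end_time);
    if (span === undefined) continue;
    points.push({ time, value: 1, color: rgba(colors[span.label] ?? FALLBACK_RGB, opacity) });
  }
  return points;
}

// One marker at the start of each span, formatted as "{label} ({confidence}%)".
function toMarkers(spans: PredictionSpan[], colors: Record<string, Rgb>): Marker[] {
  return spans.map((s): Marker => ({
    time: s.start_time,
    position: "belowBar",
    shape: "circle",
    color: rgba(colors[s.label] ?? FALLBACK_RGB, 1),
    text: `${s.label} (${Math.round(s.confidence * 100)}%)`,
  }));
}
```

When `visible` is false, the caller would pass empty arrays to the series or remove the series entirely, as in the "Predictions hidden" scenario.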
### Requirement: Confidence threshold filter
The system SHALL filter displayed predictions by confidence. Only predictions with confidence >= `confidenceThreshold` SHALL be rendered. The threshold is adjustable via a slider in the controls panel (range 0.0 to 1.0).
#### Scenario: Filter low confidence
- **WHEN** confidenceThreshold is 0.70 and a prediction has confidence 0.55
- **THEN** that prediction is not rendered on the chart
#### Scenario: Adjust threshold
- **WHEN** user moves the confidence slider from 0.70 to 0.50
- **THEN** previously hidden predictions with confidence between 0.50 and 0.70 become visible
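The threshold filter is a single comparison. A minimal sketch, in which the `clampThreshold` helper and the function names are illustrative assumptions:

```typescript
interface PredictionSpan {
  start_time: number;
  end_time: number;
  label: string;
  confidence: number;
}

// Keeps slider input inside the allowed range of 0.0 to 1.0.
function clampThreshold(value: number): number {
  return Math.min(1, Math.max(0, value));
}

// Only predictions with confidence >= threshold are rendered.
function filterByConfidence(spans: PredictionSpan[], threshold: number): PredictionSpan[] {
  const limit = clampThreshold(threshold);
  return spans.filter((s) => s.confidence >= limit);
}
```

Filtering at render time rather than at fetch time means that moving the slider from 0.70 to 0.50 reveals the hidden spans without another API call.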
### Requirement: Label type filter
The system SHALL allow users to toggle visibility of individual pattern labels via checkboxes in the controls panel. Only predictions for checked labels are rendered.
#### Scenario: Hide specific label
- **WHEN** user unchecks "double_bottom" in the label filter
- **THEN** all "double_bottom" predictions are hidden from the chart
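The label filter could be kept as a list of checked labels that each checkbox updates. In this sketch the function names are hypothetical, and `selectedLabels` is assumed to be a plain string array.

```typescript
interface PredictionSpan {
  start_time: number;
  end_time: number;
  label: string;
  confidence: number;
}

// Returns a new selection with the given label checked or unchecked.
function toggleLabel(selected: string[], label: string, checked: boolean): string[] {
  const without = selected.filter((l) => l !== label);
  return checked ? [...without, label] : without;
}

// Only predictions whose label is checked are rendered.
function filterByLabels(spans: PredictionSpan[], selected: string[]): PredictionSpan[] {
  return spans.filter((s) => selected.indexOf(s.label) !== -1);
}
```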
### Requirement: Prediction controls panel
The system SHALL display a prediction controls panel in the sidebar with: master on/off toggle, model info (name, version, type, training date), action buttons ("Run on Visible", "Predict All"), auto-predict toggle, confidence threshold slider, label checkboxes with per-class precision/recall metrics, prediction count, agreement count, and a "Show only disagreements" filter.
#### Scenario: Display model info
- **WHEN** the prediction panel loads and the inference API is available
- **THEN** the panel fetches `/api/model/info` and displays model name, version, type, and training date
#### Scenario: Inference API unavailable
- **WHEN** the prediction panel loads and `/api/model/info` returns an error
- **THEN** the panel shows "Model server offline — predictions unavailable" and all controls are disabled
#### Scenario: Per-class metrics display
- **WHEN** model info includes per-class metrics
- **THEN** each label checkbox shows precision and recall values (e.g., "bull_flag (P:0.89 R:0.76)")
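The checkbox text in the per-class metrics scenario could be produced by a small formatter. The `ClassMetrics` shape is an assumption, since the response format of `/api/model/info` is not defined here.

```typescript
// Hypothetical per-class metrics as they might arrive with the model info.
interface ClassMetrics {
  precision: number;
  recall: number;
}

// Formats a checkbox label such as "bull_flag (P:0.89 R:0.76)".
// Falls back to the bare label when no metrics are available.
function formatLabelWithMetrics(label: string, metrics?: ClassMetrics): string {
  if (metrics === undefined) return label;
  return `${label} (P:${metrics.precision.toFixed(2)} R:${metrics.recall.toFixed(2)})`;
}
```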
### Requirement: Disagreement detection
The system SHALL compare human annotation spans with model prediction spans to identify disagreements. For each human annotation, the system SHALL check whether any prediction span overlaps it by more than 50% in time. Disagreement types: "missed_by_model" (human annotated a pattern, but the model predicted "O", so no prediction span overlaps it), "missed_by_human" (model predicted a pattern, no overlapping human annotation), "label_mismatch" (both see a pattern but assign different labels).
#### Scenario: Missed by model
- **WHEN** a human annotation exists at T10-T20 but no prediction span overlaps it
- **THEN** the system identifies this as "missed_by_model"
#### Scenario: Missed by human
- **WHEN** a prediction span exists at T30-T40 with no overlapping human annotation
- **THEN** the system identifies this as "missed_by_human"
#### Scenario: Label mismatch
- **WHEN** a human annotation labels T10-T20 as "bull_flag" and the prediction labels the same range as "wedge_up"
- **THEN** the system identifies this as "label_mismatch"
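The comparison could be sketched as below. Two points are assumptions rather than part of the spec: the overlap fraction is measured relative to the human annotation's duration, and predictions labelled "O" are treated as "no pattern" and ignored.

```typescript
type DisagreementType = "missed_by_model" | "missed_by_human" | "label_mismatch";

interface Span {
  start_time: number;
  end_time: number;
  label: string;
}

interface Disagreement {
  type: DisagreementType;
  annotation?: Span;
  prediction?: Span;
}

// Fraction of the annotation's duration covered by the prediction (assumption).
function overlapFraction(annotation: Span, prediction: Span): number {
  const duration = annotation.end_time - annotation.start_time;
  if (duration <= 0) return 0;
  const overlap =
    Math.min(annotation.end_time, prediction.end_time) -
    Math.max(annotation.start_time, prediction.start_time);
  return Math.max(0, overlap) / duration;
}

function detectDisagreements(annotations: Span[], predictions: Span[]): Disagreement[] {
  const patterns = predictions.filter((p) => p.label !== "O");
  const matched = new Set<Span>();
  const result: Disagreement[] = [];

  for (const annotation of annotations) {
    const overlapping = patterns.filter((p) => overlapFraction(annotation, p) > 0.5);
    overlapping.forEach((p) => { matched.add(p); });
    if (overlapping.length === 0) {
      result.push({ type: "missed_by_model", annotation });
    } else if (!overlapping.some((p) => p.label === annotation.label)) {
      result.push({ type: "label_mismatch", annotation, prediction: overlapping[0] });
    }
  }

  // Any pattern prediction that matched no annotation was missed by the human.
  for (const prediction of patterns) {
    if (!matched.has(prediction)) {
      result.push({ type: "missed_by_human", prediction });
    }
  }
  return result;
}
```

The nested loops are quadratic in the number of spans, which is likely acceptable for a visible chart range; a sweep over time-sorted spans would be the natural refinement for large batch results.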
### Requirement: Disagreement rendering
The system SHALL render disagreements with distinct visual styles: "missed_by_model" shows a red dashed border around the human annotation, "missed_by_human" shows a yellow highlight around the prediction, "label_mismatch" shows an orange border with both labels displayed.
#### Scenario: Render missed_by_human highlight
- **WHEN** a "missed_by_human" disagreement is detected and disagreement rendering is enabled
- **THEN** the prediction span is highlighted with a yellow border/glow to draw attention
#### Scenario: Show only disagreements
- **WHEN** user enables the "Show only disagreements" filter
- **THEN** only prediction spans involved in disagreements are rendered, hiding agreement spans
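The three visual styles and the "Show only disagreements" filter could be represented as a lookup table plus a filter. The `DisagreementStyle` fields are hypothetical, and the solid border for "missed_by_human" and "label_mismatch" is an assumption (the requirement only specifies a dashed border for "missed_by_model").

```typescript
type DisagreementType = "missed_by_model" | "missed_by_human" | "label_mismatch";

interface Span {
  start_time: number;
  end_time: number;
  label: string;
}

interface Disagreement {
  type: DisagreementType;
  annotation?: Span;
  prediction?: Span;
}

interface DisagreementStyle {
  color: string;
  border: "dashed" | "solid";
  showBothLabels: boolean;
}

// Colors follow the requirement text; the remaining fields are illustrative.
const DISAGREEMENT_STYLES: Record<DisagreementType, DisagreementStyle> = {
  missed_by_model: { color: "red", border: "dashed", showBothLabels: false },
  missed_by_human: { color: "yellow", border: "solid", showBothLabels: false },
  label_mismatch: { color: "orange", border: "solid", showBothLabels: true },
};

// With the filter on, only predictions that take part in a disagreement remain.
function predictionsToRender(
  predictions: Span[],
  disagreements: Disagreement[],
  onlyDisagreements: boolean,
): Span[] {
  if (!onlyDisagreements) return predictions;
  const involved = new Set<Span>();
  for (const d of disagreements) {
    if (d.prediction !== undefined) involved.add(d.prediction);
  }
  return predictions.filter((p) => involved.has(p));
}
```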
### Requirement: Prediction-to-annotation feedback
When a user clicks on a "missed_by_human" prediction, the system SHALL open the span annotation dialog pre-filled with the prediction's start_time, end_time, and label. The user can confirm (save as a new annotation), correct (change the label, then save), or dismiss the prediction as "Not a pattern" (save a negative annotation).
#### Scenario: Confirm prediction as annotation
- **WHEN** user clicks a "missed_by_human" prediction and clicks Save in the pre-filled dialog
- **THEN** the system creates a new span annotation with the model's suggested label and timestamps
#### Scenario: Correct and save
- **WHEN** user clicks a "missed_by_human" prediction, changes the label in the dialog, and clicks Save
- **THEN** the system creates a new span annotation with the corrected label
#### Scenario: Dismiss as not-a-pattern
- **WHEN** user clicks a "missed_by_human" prediction and clicks "Not a pattern"
- **THEN** the system saves a negative annotation with label "O", source "human_correction", and records the model's original prediction and confidence
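The three outcomes could be modelled with two small helpers: one that pre-fills the dialog and one that builds the negative annotation. The `DialogDraft` and `NegativeAnnotation` types and the `model_label` / `model_confidence` field names are assumptions; the spec only says that the original prediction and confidence are recorded.

```typescript
interface PredictionSpan {
  start_time: number;
  end_time: number;
  label: string;
  confidence: number;
}

interface DialogDraft {
  start_time: number;
  end_time: number;
  label: string;
}

interface NegativeAnnotation {
  start_time: number;
  end_time: number;
  label: "O";
  source: "human_correction";
  model_label: string; // hypothetical field name
  model_confidence: number; // hypothetical field name
}

// Pre-fills the span annotation dialog from a clicked prediction.
function prefillDialog(prediction: PredictionSpan): DialogDraft {
  return {
    start_time: prediction.start_time,
    end_time: prediction.end_time,
    label: prediction.label,
  };
}

// Builds the record saved when the user clicks "Not a pattern".
function dismissAsNotAPattern(prediction: PredictionSpan): NegativeAnnotation {
  return {
    start_time: prediction.start_time,
    end_time: prediction.end_time,
    label: "O",
    source: "human_correction",
    model_label: prediction.label,
    model_confidence: prediction.confidence,
  };
}
```

Confirming saves the draft unchanged, while correcting saves a copy with a different label, for example `{ ...draft, label: "wedge_up" }`.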
### Requirement: Inference API connection monitoring
The system SHALL poll `/api/model/info` every 30 seconds when the inference API is unavailable. When the API becomes available, the system SHALL auto-reconnect and enable prediction controls. Human annotation SHALL never be blocked by inference API availability.
#### Scenario: Auto-reconnect
- **WHEN** the inference API was unavailable and becomes reachable
- **THEN** the prediction panel re-enables controls and shows "Model server online"
#### Scenario: Annotation independence
- **WHEN** the inference API is unavailable
- **THEN** all human annotation tools continue to work normally
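A possible shape for the connection monitor is sketched below. The `InferenceMonitor` class, its `probe` callback (expected to resolve to true when `/api/model/info` answers successfully), and the `onChange` callback are hypothetical names. The monitor only drives the prediction controls and never touches annotation tooling.

```typescript
const POLL_INTERVAL_MS = 30000; // 30 seconds, per the requirement

class InferenceMonitor {
  private online: boolean | null = null;
  private timer: ReturnType<typeof setInterval> | null = null;

  constructor(
    private readonly probe: () => Promise<boolean>,
    private readonly onChange: (online: boolean) => void,
  ) {}

  // Runs one availability check; a rejected probe counts as unavailable.
  async check(): Promise<void> {
    let reachable = false;
    try {
      reachable = await this.probe();
    } catch {
      reachable = false;
    }
    this.report(reachable);
  }

  // Records a result, notifies on change, and polls only while offline.
  report(reachable: boolean): void {
    if (reachable !== this.online) {
      this.online = reachable;
      this.onChange(reachable);
    }
    if (reachable) {
      this.stop();
    } else {
      this.startPolling();
    }
  }

  isPolling(): boolean {
    return this.timer !== null;
  }

  stop(): void {
    if (this.timer !== null) {
      clearInterval(this.timer);
      this.timer = null;
    }
  }

  private startPolling(): void {
    if (this.timer === null) {
      this.timer = setInterval(() => { void this.check(); }, POLL_INTERVAL_MS);
    }
  }
}
```

The prediction panel would pass an `onChange` handler that enables or disables its own controls and updates the status text, leaving the human annotation tools unaffected in either state.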