candle-annotator/openspec/specs/data-ingestion/spec.md
Marko Djordjevic 4121a87875 sync: apply multi-chart-management delta specs to main specs
- Created new chart-management capability spec
- Updated data-ingestion: chart-scoped candles, duplicate filename handling
- Updated backend-api: all endpoints gain chartId parameter, chart CRUD
- Updated chart-canvas: chart switching, scoped data fetching
- Updated label-management: annotations scoped to active chart
- Updated ui-shell: upload creates/selects chart, theme-aware link styling
2026-02-13 09:06:37 +01:00


## ADDED Requirements
### Requirement: CSV file upload
The system SHALL provide a file upload component that accepts CSV files containing OHLC candle data. The CSV format MUST have columns: `time`, `open`, `high`, `low`, `close`. The `time` column SHALL accept both `YYYY-MM-DD` date strings and Unix timestamps (integer seconds). Uploading a CSV SHALL create a new chart (named from the filename without extension) and insert all candle rows associated with that chart, rather than replacing existing data.
#### Scenario: Valid CSV upload
- **WHEN** user uploads a CSV file with valid headers (time, open, high, low, close) and valid data rows
- **THEN** system creates a new chart named from the filename (without .csv extension), parses all rows, and stores them in the `candles` table with the new chart's `chart_id`
#### Scenario: CSV with Unix timestamps
- **WHEN** user uploads a CSV where the `time` column contains Unix timestamps (e.g., 1700000000)
- **THEN** system stores the timestamps as integers in the database with the correct `chart_id` and renders candles correctly on the chart
#### Scenario: CSV with date strings
- **WHEN** user uploads a CSV where the `time` column contains date strings (e.g., "2024-01-15")
- **THEN** system converts dates to Unix timestamps and stores them in the database with the correct `chart_id`
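The two accepted `time` formats can be normalized with one helper. The following is a minimal sketch, not the project's actual implementation: the function name `normalizeTime` is hypothetical, and it assumes that a `YYYY-MM-DD` date maps to midnight UTC, which the requirement does not specify.

```typescript
// Hypothetical helper: converts a raw `time` cell into integer Unix seconds.
// Assumption: date strings are interpreted as 00:00:00 UTC on that day.
function normalizeTime(raw: string): number {
  const value = raw.trim();

  // Case 1: Unix timestamp given as integer seconds.
  if (/^\d+$/.test(value)) {
    return Number(value);
  }

  // Case 2: YYYY-MM-DD date string.
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (match) {
    const year = Number(match[1]);
    const month = Number(match[2]);
    const day = Number(match[3]);
    const ms = Date.UTC(year, month - 1, day);
    const check = new Date(ms);
    // Reject dates that silently roll over, such as 2024-02-31.
    if (
      check.getUTCFullYear() === year &&
      check.getUTCMonth() === month - 1 &&
      check.getUTCDate() === day
    ) {
      return ms / 1000;
    }
  }

  throw new Error(`Invalid time value: "${raw}"`);
}
```

Under these assumptions, `normalizeTime("2024-01-15")` yields `1705276800`, and a timestamp such as `"1700000000"` passes through unchanged as an integer.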
#### Scenario: Invalid CSV format
- **WHEN** user uploads a CSV missing required headers or containing malformed data
- **THEN** system displays an error message describing the issue, does not create a chart, and does not store any partial data
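The header check can run before any chart is created or any row is stored. The sketch below shows one possible shape for it. The names `findMissingHeaders` and `headerError` are hypothetical, and matching headers case-insensitively after trimming is an assumption that the requirement does not state.

```typescript
// Columns required by the CSV format in this spec.
const REQUIRED_HEADERS = ["time", "open", "high", "low", "close"];

// Returns the required column names absent from the header row (empty = valid).
// Assumption: headers are compared after trimming and lowercasing.
function findMissingHeaders(headers: string[]): string[] {
  const present = headers.map((h) => h.trim().toLowerCase());
  return REQUIRED_HEADERS.filter((name) => present.indexOf(name) === -1);
}

// Builds a user-facing error message, or null when all headers are present.
function headerError(headers: string[]): string | null {
  const missing = findMissingHeaders(headers);
  return missing.length === 0
    ? null
    : `CSV is missing required column(s): ${missing.join(", ")}`;
}
```

Because the check returns before any insert is attempted, a failing file creates no chart and leaves no partial data behind.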
#### Scenario: Duplicate filename upload
- **WHEN** user uploads a CSV whose filename (without extension) matches an existing chart name
- **THEN** system appends a numeric suffix to the chart name (e.g., "btc-daily-2") and creates a new chart with the suffixed name
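One way to derive a non-colliding chart name is sketched below. The function names are hypothetical. The suffix scheme (a hyphen plus a counter starting at 2) is inferred only from the "btc-daily-2" example and is not otherwise fixed by the requirement.

```typescript
// Strips a trailing ".csv" extension (case-insensitive) from the filename.
function chartNameFromFilename(filename: string): string {
  return filename.replace(/\.csv$/i, "");
}

// Returns `base` if it is unused, otherwise the first free "base-N" with N >= 2.
// Assumption: `existing` holds the names of all charts already stored.
function uniqueChartName(base: string, existing: string[]): string {
  if (existing.indexOf(base) === -1) {
    return base;
  }
  let n = 2;
  while (existing.indexOf(`${base}-${n}`) !== -1) {
    n += 1;
  }
  return `${base}-${n}`;
}
```

For example, uploading `btc-daily.csv` when charts `btc-daily` and `btc-daily-2` already exist would produce `btc-daily-3` under this scheme.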
### Requirement: CSV parsing with papaparse
The system SHALL use the `papaparse` library for CSV parsing. Files exceeding 10,000 rows SHALL be parsed in streaming mode. Parsed records SHALL be inserted into SQLite within a single database transaction for atomicity.
#### Scenario: Large file parsing
- **WHEN** user uploads a CSV with more than 10,000 rows
- **THEN** system uses streaming parse and batch inserts within a single transaction, and completes without loading the entire file into memory
#### Scenario: Transaction atomicity
- **WHEN** a parse error occurs midway through a CSV file
- **THEN** system rolls back the entire transaction and no partial data is stored
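The all-or-nothing behavior can be illustrated without committing to a particular SQLite driver. The sketch below is an assumption-laden model, not the project's code: `CandleStore`, `InMemoryStore`, and `importCandles` are hypothetical names, and the in-memory store only stands in for a real SQLite connection whose transaction would be opened, committed, or rolled back at the same points.

```typescript
interface CandleRow {
  chartId: number;
  time: number;
  open: number;
  high: number;
  low: number;
  close: number;
}

// Hypothetical transaction surface; a real driver would map these to SQL.
interface CandleStore {
  begin(): void;
  insertBatch(rows: CandleRow[]): void;
  commit(): void;
  rollback(): void;
}

// Stand-in for a database: staged rows become visible only on commit.
class InMemoryStore implements CandleStore {
  committed: CandleRow[] = [];
  private staged: CandleRow[] = [];
  begin(): void { this.staged = []; }
  insertBatch(rows: CandleRow[]): void { this.staged = this.staged.concat(rows); }
  commit(): void {
    this.committed = this.committed.concat(this.staged);
    this.staged = [];
  }
  rollback(): void { this.staged = []; }
}

// Inserts rows in batches inside one transaction; any error rolls everything back.
function importCandles(store: CandleStore, rows: CandleRow[], batchSize: number): number {
  store.begin();
  try {
    for (let i = 0; i < rows.length; i += batchSize) {
      const batch = rows.slice(i, i + batchSize);
      for (let j = 0; j < batch.length; j += 1) {
        const r = batch[j];
        const ok = isFinite(r.time) && isFinite(r.open) && isFinite(r.high) &&
          isFinite(r.low) && isFinite(r.close);
        if (!ok) {
          throw new Error(`Malformed row at index ${i + j}`);
        }
      }
      store.insertBatch(batch);
    }
    store.commit();
    return rows.length;
  } catch (err) {
    store.rollback();
    throw err;
  }
}
```

In this model a malformed row in a later batch discards the earlier batches as well, which matches the scenario's requirement that no partial data is stored.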
### Requirement: Candles database table
The system SHALL store candle data in a `candles` table with columns: `id` (integer primary key, auto-increment), `chart_id` (integer, foreign key to `charts.id`, NOT NULL), `time` (integer, Unix timestamp), `open` (real), `high` (real), `low` (real), `close` (real). The table MUST have a composite unique constraint on `(chart_id, time)`.
#### Scenario: Schema structure
- **WHEN** the database is initialized
- **THEN** the `candles` table exists with all required columns including `chart_id` and the composite unique constraint on `(chart_id, time)`
#### Scenario: Same timestamp across different charts
- **WHEN** two different charts have candles with the same Unix timestamp
- **THEN** both records are stored successfully because the unique constraint is per-chart
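The table described above could be created with DDL along the following lines. This is a sketch under stated assumptions: the constant name `CREATE_CANDLES_TABLE` is hypothetical, the `charts` table is assumed to be defined by the chart-management capability, and SQLite enforces foreign keys only when `PRAGMA foreign_keys = ON` is set on the connection.

```typescript
// Hypothetical constant holding SQLite DDL for the `candles` table.
// Assumption: a `charts` table with an integer `id` primary key already exists.
const CREATE_CANDLES_TABLE = `
CREATE TABLE IF NOT EXISTS candles (
  id       INTEGER PRIMARY KEY AUTOINCREMENT,
  chart_id INTEGER NOT NULL REFERENCES charts(id),
  time     INTEGER NOT NULL, -- Unix timestamp in seconds
  open     REAL,
  high     REAL,
  low      REAL,
  close    REAL,
  UNIQUE (chart_id, time)    -- one candle per timestamp per chart
);
`;
```

Because the unique constraint covers the pair `(chart_id, time)`, two charts can each hold a candle at the same timestamp, while a second candle at that timestamp within one chart is rejected.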