Commit graph

17 commits

Author SHA1 Message Date
aefa35a5ca Merge pull request 'Budget Intelligence & TUI — programmatic spend data, live dashboard, mission cost tracking' (#5) from budget-intel into main 2026-04-08 09:12:08 +00:00
35d8cb5de6
feat: tui.js — live ANSI terminal dashboard (B.A. objective)
Implements token-monitor#3 objective 3:

node tui.js starts a live terminal dashboard:
  - Full layout: Anthropic Teams (severity-colored) + xAI section
  - 60s auto-refresh with 1s countdown
  - Flicker-free via cursor-home (not clear-screen)
  - [r] forces immediate refresh
  - [q] or Ctrl-C exits cleanly, restores terminal + cursor
  - Severity coloring: green=ok, yellow=warning, red=critical/maxed
  - xAI section hidden when XAI_MANAGEMENT_KEY not set (graceful degradation)
  - Budget decision: recommended provider + avoid list prominently displayed
  - Alerts section (up to 3, warning/critical only)
  - No external npm dependencies — pure Node stdlib + project analyze.js

Data source: spawns analyze.js --budget-json internally (reuses cached probe
data if fresh, avoids double-probe on each TUI refresh)

Co-authored-by: Hannibal Smith <hannibal@a-team>
2026-04-08 08:29:40 +00:00
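
The flicker-free redraw described in commit 35d8cb5de6 can be sketched as follows. This is a minimal illustration and not the actual tui.js code: `buildFrame` and `currentLines` are hypothetical names. The idea is to move the cursor home and overwrite each line in place, erasing only leftovers, instead of clearing the whole screen on every tick.

```javascript
// Hypothetical helper, not taken from tui.js.
const ESC = '\x1b[';
const CURSOR_HOME = ESC + 'H';     // move cursor to row 1, column 1
const ERASE_LINE_END = ESC + 'K';  // erase from cursor to end of line
const ERASE_BELOW = ESC + 'J';     // erase from cursor to end of screen

// Build one frame: home the cursor, rewrite every line over the old one,
// and wipe whatever the previous (possibly longer) frame left behind.
function buildFrame(lines) {
  const body = lines.map((line) => line + ERASE_LINE_END).join('\n');
  return CURSOR_HOME + body + '\n' + ERASE_BELOW;
}

// Usage sketch (commented out so the snippet has no side effects):
// process.stdout.write(ESC + '?25l');                              // hide cursor
// setInterval(() => process.stdout.write(buildFrame(currentLines())), 1000);
// process.on('exit', () => process.stdout.write(ESC + '?25h'));    // restore cursor
```

Because the screen is never blanked between frames, the terminal does not show an empty intermediate state, which is what causes visible flicker with a clear-screen approach.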
8daa396549
feat: --budget-json and --mission cost tracking (Face + Murdock objectives)
Implements token-monitor#3 objectives 1 and 2:

--budget-json: structured agent-consumable budget decision schema
  - budget_decision with recommended_provider, avoid list, reason
  - providers: all Teams with utilization, USD estimates, severity
  - xai: per-key spend breakdown from management API (if XAI_MANAGEMENT_KEY set)
  - alerts: warnings/critical for maxed providers + low xAI prepaid balance
  - config from ~/.config/token-monitor/config.json (default .50/week/seat)

--mission <ref>: mission cost attribution via Forgejo issue time windows
  - resolves start/end timestamps from Forgejo issue comments
  - requires FORGEJO_TOKEN env var + ~/.config/token-monitor/mission-repos.json
  - repo mapping: { 'bookmarko': 'trentuna/bookmarko', ... }

--mission-window <iso-start> <iso-end>: same without Forgejo dep

Both: utilization delta × weekly seat cost = estimated Anthropic spend
Both: exact xAI spend via management API for time window (if key set)
Existing --json flag unchanged (backward compat preserved)

Co-authored-by: Hannibal Smith <hannibal@a-team>
2026-04-08 08:28:18 +00:00
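
The Anthropic estimate in commit 8daa396549 (utilization delta × weekly seat cost) can be sketched as below. This assumes utilization is a fraction between 0 and 1 and that the seat cost is passed in explicitly. `estimateAnthropicSpendUsd` is a hypothetical name, and clamping a negative delta to zero (for a window that spans a quota reset) is an assumption, not documented behavior.

```javascript
// Hypothetical helper illustrating: utilization delta x weekly seat cost.
// utilStart / utilEnd: 7d utilization as fractions (0..1) at the window edges.
// weeklySeatCostUsd: configured cost of one seat per week, in USD.
function estimateAnthropicSpendUsd(utilStart, utilEnd, weeklySeatCostUsd) {
  // Assumption: if utilization dropped (quota reset inside the window),
  // report zero rather than a negative spend.
  const delta = Math.max(0, utilEnd - utilStart);
  return delta * weeklySeatCostUsd;
}

// Example: a mission that moved utilization from 25% to 75% of a seat
// costing 10 USD per week is attributed half of that week's cost.
const exampleSpend = estimateAnthropicSpendUsd(0.25, 0.75, 10);
```

The result is an estimate only. It attributes a share of a flat seat price to the window, whereas the xAI figure in the same commit comes from exact billing data.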
ab9c60b67c
Handle policy_rejected status (Anthropic April 4 billing change)
- anthropic-teams.js: detect HTTP 400 extra-usage policy blocks, return
  status='policy_rejected' with quota headers still readable
- report.js: display policy_rejected as CRITICAL with 'POLICY BLOCKED' label
- getSeverity: treat policy_rejected as critical

Currently the direct API (used by monitor) returns 200; pi's OAuth path
returns 400. This fix future-proofs against the block extending to direct
API calls, and correctly classifies the status if it does.

Refs: trentuna/commons#17, trentuna/token-monitor#4
2026-04-08 05:38:46 +00:00
e52ba2921c
Fix recommend.js: include allowed_warning in provider selection
allowed_warning providers can serve requests — only the budget is
approaching its limit. Previously they were excluded from both Phase 1
and Phase 2 selection, causing unnecessary escalation to shelley-proxy
emergency fallback when team-vigilio was at 79% 7d (allowed_warning)
and team-ludo was showing invalid_key in the health-pulse cache.

Now:
- Phase 1: first provider under threshold with status allowed or allowed_warning
- Phase 2: lowest-utilization provider with either status; reason notes warning

Effect: next wake picks team-vigilio (79% 7d, warning) instead of
shelley-proxy. Shelley-proxy is now a true last resort again.
2026-04-07 23:17:39 +00:00
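
The selection rule after commit e52ba2921c can be sketched roughly as follows. This is not the recommend.js source: the field names (`name`, `status`, `util7d`), the return shape, and the function name are hypothetical. The 75% default and the emergency case come from the original recommend.js commit (6e6d93f3bf) in this log.

```javascript
// Statuses that can still serve requests, per the fix described above.
const USABLE_STATUSES = new Set(['allowed', 'allowed_warning']);

// chain: providers in preference order, e.g.
//   [{ name: 'team-vigilio', status: 'allowed_warning', util7d: 0.79 }, ...]
// threshold: 7d utilization fraction above which we prefer to switch.
function selectProvider(chain, threshold = 0.75) {
  const usable = chain.filter((p) => USABLE_STATUSES.has(p.status));

  // Phase 1: first usable provider in chain order that is under the threshold.
  const under = usable.find((p) => p.util7d < threshold);
  if (under) {
    return { provider: under.name, phase: 1, emergency: false };
  }

  // Phase 2: everyone is over the threshold, so take the least-utilized one
  // and note when that provider is in a warning state.
  if (usable.length > 0) {
    const lowest = usable.reduce((a, b) => (b.util7d < a.util7d ? b : a));
    const warning = lowest.status === 'allowed_warning';
    return { provider: lowest.name, phase: 2, emergency: false, warning };
  }

  // Phase 3: nothing usable, signal that the emergency fallback is needed.
  return { provider: null, phase: 3, emergency: true };
}
```

With the scenario from the commit message (team-ludo cached as invalid_key, team-vigilio at 79% with allowed_warning), this sketch returns team-vigilio in Phase 2 instead of escalating.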
ab35cc8346
recommend.js: probe fresh when all cached providers are invalid_key
invalid_key (HTTP 401) can be transient during key rotation or
temporary API issues — unlike rejected/exhausted which are stable
budget states. When cache shows all chain providers as invalid_key,
bypass cache and probe fresh so recovery is immediate instead of
waiting for the 20-minute TTL to expire.
2026-04-07 15:55:32 +00:00
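
The cache-bypass condition from commit ab35cc8346 might look something like the sketch below. The function name and the shape of the cached data (an object keyed by provider name with a `status` field) are assumptions made for illustration.

```javascript
// Decide whether to ignore the cache and probe again.
// cachedProviders: e.g. { 'team-vigilio': { status: 'invalid_key' }, ... }
// chain: provider names in preference order.
function shouldProbeFresh(cachedProviders, chain) {
  if (chain.length === 0) return false;
  // Only bypass when every provider in the chain is cached as invalid_key.
  // A missing entry or any other status keeps the cached result in use.
  return chain.every((name) => cachedProviders[name]?.status === 'invalid_key');
}
```

Stable budget states such as rejected do not trigger the bypass, so the normal TTL still protects against extra API calls in those cases.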
6e6d93f3bf
add recommend.js — budget-aware provider selection
Selects optimal Teams provider from chain based on real 7d utilization.
Uses cached monitor data (no extra API calls if fresh cache exists).

- Phase 1: first provider in chain with 7d util < SWITCH_THRESHOLD (default 75%)
- Phase 2: all over threshold → pick lowest 7d allowed provider
  (that is, among providers with status allowed, the one with the lowest 7d utilization)
- Phase 3: all rejected → emergency=true, signals shelley-proxy needed
- Always fails safe: returns team-vigilio on any error
2026-04-07 07:50:29 +00:00
350097a46d
add configure-key-limits.js — per-key QPS/QPM rate limit script
- PUT /auth/api-keys/{id} with fieldMask qps,qpm
- Defines limits per role: ba=2/30, vigilio=3/30, analysts=2/20
- --dry-run and --show flags included
- Blocked on UpdateApiKey ACL for management key (needs console.x.ai)
- See token-monitor#2 for Ludo action required
2026-04-06 11:00:09 +00:00
2371e02d57
feat: xAI Management API billing module (token-monitor#2)
- providers/xai-billing.js: fetchXaiUsage, fetchXaiInvoicePreview,
  aggregateByKey, renderXaiSection
- analyze.js: --xai flag for standalone view; xAI section appended
  to full report when XAI_MANAGEMENT_KEY is set
- Verified against live API: per-key spend by unit type, prepaid
  balance, current billing period total
- Usage endpoint: POST /v1/billing/teams/{id}/usage (analyticsRequest)
- Invoice endpoint: GET /v1/billing/teams/{id}/postpaid/invoice/preview
2026-04-06 08:19:27 +00:00
b504977853
docs: add phase3-piggyback.md — piggyback header capture + repo location recommendation 2026-04-06 02:27:23 +00:00
8ced108f74
docs: README overhaul — add analyze.js, wake integration, Quick start, fix provider table and architecture
Six changes:
- Add ## Quick start block (monitor.js, analyze.js, token-status.sh)
- Add ## Analysis section with all 8 analyze.js subcommands and output descriptions
- Add ## Wake integration section — token-status.sh docs, output format, cache guard note
- Provider support table: add google-gemini and xai-* rows
- Architecture block: add analyze.js, gemini.js, xai.js, docs/analyze.md
- Related: add token-status.sh as first item, fix issue link to trentuna/token-monitor#1

164/164 tests pass.
2026-04-06 02:26:51 +00:00
c7e6438398
Fix xai probe: double /v1 URL bug, use /v1/models instead of chat completion
Two bugs caused all xai providers to show 'error' in the monitor:

1. Double /v1 in URL: models.json baseUrl is https://api.x.ai/v1 (OpenAI-
   compatible convention), and the probe was appending /v1/chat/completions,
   producing https://api.x.ai/v1/v1/chat/completions → HTTP 4xx.
   Fix: strip trailing /vN from baseUrl before constructing the probe URL.

2. Wrong model: probe used grok-3-mini, which requires specific x.ai
   console permissions not granted to our keys. Keys have access to
   grok-4-1-fast-reasoning only.
   Fix: use GET /v1/models instead — lightweight, no model guessing,
   returns 200 (valid key) or 401 (invalid). Includes available models
   in result for visibility.

158/158 tests pass (unit tests for parseXaiHeaders unchanged).
2026-04-05 06:31:29 +00:00
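
The URL fix in commit c7e6438398 (strip a trailing /vN from baseUrl before building the probe URL) can be illustrated with a small sketch. `buildModelsUrl` is a hypothetical name, and handling of a trailing slash is an added assumption.

```javascript
// Build the probe URL without doubling the version segment.
function buildModelsUrl(baseUrl) {
  const root = baseUrl
    .replace(/\/+$/, '')      // drop any trailing slashes (assumption)
    .replace(/\/v\d+$/, '');  // drop a trailing /v1, /v2, ... segment
  return root + '/v1/models';
}

// Both forms of the base URL resolve to the same endpoint.
const fromVersioned = buildModelsUrl('https://api.x.ai/v1');
const fromBare = buildModelsUrl('https://api.x.ai');
```

Normalizing the base first means the probe works whether or not the configured baseUrl follows the OpenAI-compatible convention of ending in /v1.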
34898b1196
Phase 2: analysis layer (analyze.js), cache guard, log hygiene
- analyze.js: burn rate, weekly reconstruction, cycle stagger, rotation
  rank, underspend alerts, log prune with weekly archive
- logger.js: getCachedRun(maxAgeMinutes) — skip probing if recent data exists
- monitor.js: cache guard at wake — 20-min dedup, zero extra API calls
- test.js: fix type assertion for gemini-api/xai-api providers (+5 passing);
  add 14 new tests for cache guard and analyze.js (162 total, all green)
- docs/analyze.md: usage reference

Co-authored-by: Hannibal Smith <hannibal@trentuna.com>
2026-04-05 04:49:05 +00:00
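
The cache guard from commit 34898b1196 can be sketched as below. The real getCachedRun(maxAgeMinutes) presumably reads the project's own log. This illustration takes the log entries and the current time as parameters instead, and the `timestamp` field (an ISO 8601 string) is an assumption.

```javascript
// Return the most recent run that is at most maxAgeMinutes old, or null.
// entries: array of objects with an ISO 8601 `timestamp` (assumed field name).
// nowMs: current time in milliseconds, injectable for testing.
function findCachedRun(entries, maxAgeMinutes, nowMs = Date.now()) {
  const maxAgeMs = maxAgeMinutes * 60 * 1000;
  let newest = null;
  let newestMs = -Infinity;
  for (const entry of entries) {
    const ts = Date.parse(entry.timestamp);
    if (Number.isNaN(ts)) continue;            // skip malformed entries
    const age = nowMs - ts;
    if (age < 0 || age > maxAgeMs) continue;   // too old, or in the future
    if (ts > newestMs) {
      newest = entry;
      newestMs = ts;
    }
  }
  return newest;
}
```

A caller would probe only when this returns null, which is how a 20-minute window avoids extra API calls on closely spaced wakes.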
1b4e299461
build: add gemini and xai provider modules
Expands token-monitor with two new provider types:

- providers/gemini.js — Google Gemini API (body-based quota, no headers)
  - Probes generateContent endpoint (1 token), falls back gemini-2.0-flash → gemini-2.5-flash
  - Parses QuotaFailure violations + RetryInfo from 429 JSON body
  - Returns: status, quota_violations[], retry_delay_seconds, severity

- providers/xai.js — x.ai/Grok (OpenAI-compatible header schema)
  - Reads x-ratelimit-{limit,remaining}-{requests,tokens} headers
  - Handles: no_key, ok, rate_limited, invalid_key states
  - Warning threshold: < 10% remaining on requests or tokens

Both providers handle missing API keys gracefully (status: no_key).
Classification via providers/index.js using baseUrl patterns.
140/140 tests passing.

Closes recon findings from trentuna/a-team#91.
2026-04-04 17:52:37 +00:00
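
The xAI warning rule from commit 1b4e299461 (less than 10% remaining on requests or tokens) could be expressed along these lines. The function names are hypothetical, and the sketch assumes the headers arrive as a plain object with lowercase keys.

```javascript
// Fraction of quota remaining for one kind ('requests' or 'tokens'),
// or null when the headers are missing or unusable.
function remainingFraction(headers, kind) {
  const limit = Number(headers[`x-ratelimit-limit-${kind}`]);
  const remaining = Number(headers[`x-ratelimit-remaining-${kind}`]);
  if (!Number.isFinite(limit) || !Number.isFinite(remaining) || limit <= 0) {
    return null;
  }
  return remaining / limit;
}

// 'warning' when either quota has under 10% left, otherwise 'ok'.
function xaiSeverity(headers) {
  const fractions = ['requests', 'tokens']
    .map((kind) => remainingFraction(headers, kind))
    .filter((f) => f !== null);
  return fractions.some((f) => f < 0.10) ? 'warning' : 'ok';
}
```

Treating missing headers as "no signal" rather than as a warning is a design choice in this sketch. The actual module may classify that case differently.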
988618e165
test: add gemini and xai parser unit tests 2026-04-04 17:51:38 +00:00
07a544c50d
build: token-monitor v0.1.0 — modular LLM API quota visibility
Implements modular provider probing with two distinct header schemas:
- Teams direct (unified schema): 5h/7d utilization floats, status, reset countdown
- Shelley proxy (classic schema): token/request counts + Exedev-Gateway-Cost (USD/call)
- api-ateam: reports no billing data (confirmed non-existent by recon)

Key: uses claude-haiku-4-5-20251001 for minimal probe calls (1 token).
Rate-limit headers present on ALL responses (200 and 429).

113/113 tests passing.

Built from Face recon (trentuna/a-team#91) — live header capture confirmed
unified schema with utilization floats replaces old per-count schema.
2026-04-04 17:01:05 +00:00
760049a25e Initial commit 2026-04-04 16:35:33 +00:00