token-monitor

Author	SHA1	Message	Date
Vigilio Desto	ab9c60b67c	Handle policy_rejected status (Anthropic April 4 billing change) - anthropic-teams.js: detect HTTP 400 extra-usage policy blocks, return status='policy_rejected' with quota headers still readable - report.js: display policy_rejected as CRITICAL with 'POLICY BLOCKED' label - getSeverity: treat policy_rejected as critical Currently the direct API (used by monitor) returns 200; pi's OAuth path returns 400. This fix future-proofs against the block extending to direct API calls, and correctly classifies the status if it does. Refs: trentuna/commons#17, trentuna/token-monitor#4	2026-04-08 05:38:46 +00:00
Vigilio Desto	e52ba2921c	Fix recommend.js: include allowed_warning in provider selection allowed_warning providers can serve requests — only the budget is approaching its limit. Previously they were excluded from both Phase 1 and Phase 2 selection, causing unnecessary escalation to shelley-proxy emergency fallback when team-vigilio was at 79% 7d (allowed_warning) and team-ludo was showing invalid_key in the health-pulse cache. Now: - Phase 1: first provider under threshold with status allowed or allowed_warning - Phase 2: lowest-utilization provider with either status; reason notes warning Effect: next wake picks team-vigilio (79% 7d, warning) instead of shelley-proxy. Shelley-proxy is now a true last resort again.	2026-04-07 23:17:39 +00:00
Vigilio Desto	ab35cc8346	recommend.js: probe fresh when all cached providers are invalid_key invalid_key (HTTP 401) can be transient during key rotation or temporary API issues — unlike rejected/exhausted which are stable budget states. When cache shows all chain providers as invalid_key, bypass cache and probe fresh so recovery is immediate instead of waiting for the 20-minute TTL to expire.	2026-04-07 15:55:32 +00:00
Vigilio Desto	6e6d93f3bf	add recommend.js — budget-aware provider selection Selects optimal Teams provider from chain based on real 7d utilization. Uses cached monitor data (no extra API calls if fresh cache exists). - Phase 1: first provider in chain with 7d util < SWITCH_THRESHOLD (default 75%) - Phase 2: all over threshold → pick lowest 7d allowed provider - Phase 3: all rejected → emergency=true, signals shelley-proxy needed - Always fails safe: returns team-vigilio on any error	2026-04-07 07:50:29 +00:00
Vigilio Desto	350097a46d	add configure-key-limits.js — per-key QPS/QPM rate limit script - PUT /auth/api-keys/{id} with fieldMask qps,qpm - Defines limits per role: ba=2/30, vigilio=3/30, analysts=2/20 - --dry-run and --show flags included - Blocked on UpdateApiKey ACL for management key (needs console.x.ai) - See token-monitor#2 for Ludo action required	2026-04-06 11:00:09 +00:00
Vigilio Desto	2371e02d57	feat: xAI Management API billing module (token-monitor#2) - providers/xai-billing.js: fetchXaiUsage, fetchXaiInvoicePreview, aggregateByKey, renderXaiSection - analyze.js: --xai flag for standalone view; xAI section appended to full report when XAI_MANAGEMENT_KEY is set - Verified against live API: per-key spend by unit type, prepaid balance, current billing period total - Usage endpoint: POST /v1/billing/teams/{id}/usage (analyticsRequest) - Invoice endpoint: GET /v1/billing/teams/{id}/postpaid/invoice/preview	2026-04-06 08:19:27 +00:00
H.M. Murdock	b504977853	docs: add phase3-piggyback.md — piggyback header capture + repo location recommendation	2026-04-06 02:27:23 +00:00
B.A. Baracus	8ced108f74	docs: README overhaul — add analyze.js, wake integration, Quick start, fix provider table and architecture Six changes: - Add ## Quick start block (monitor.js, analyze.js, token-status.sh) - Add ## Analysis section with all 8 analyze.js subcommands and output descriptions - Add ## Wake integration section — token-status.sh docs, output format, cache guard note - Provider support table: add google-gemini and xai-* rows - Architecture block: add analyze.js, gemini.js, xai.js, docs/analyze.md - Related: add token-status.sh as first item, fix issue link to trentuna/token-monitor#1 164/164 tests pass.	2026-04-06 02:26:51 +00:00
Vigilio Desto	c7e6438398	Fix xai probe: double /v1 URL bug, use /v1/models instead of chat completion Two bugs caused all xai providers to show 'error' in the monitor: 1. Double /v1 in URL: models.json baseUrl is https://api.x.ai/v1 (OpenAI- compatible convention), and the probe was appending /v1/chat/completions, producing https://api.x.ai/v1/v1/chat/completions → HTTP 4xx. Fix: strip trailing /vN from baseUrl before constructing the probe URL. 2. Wrong model: probe used grok-3-mini, which requires specific x.ai console permissions not granted to our keys. Keys have access to grok-4-1-fast-reasoning only. Fix: use GET /v1/models instead — lightweight, no model guessing, returns 200 (valid key) or 401 (invalid). Includes available models in result for visibility. 158/158 tests pass (unit tests for parseXaiHeaders unchanged).	2026-04-05 06:31:29 +00:00
Hannibal Smith	34898b1196	Phase 2: analysis layer (analyze.js), cache guard, log hygiene - analyze.js: burn rate, weekly reconstruction, cycle stagger, rotation rank, underspend alerts, log prune with weekly archive - logger.js: getCachedRun(maxAgeMinutes) — skip probing if recent data exists - monitor.js: cache guard at wake — 20-min dedup, zero extra API calls - test.js: fix type assertion for gemini-api/xai-api providers (+5 passing); add 14 new tests for cache guard and analyze.js (162 total, all green) - docs/analyze.md: usage reference Co-authored-by: Hannibal Smith <hannibal@trentuna.com>	2026-04-05 04:49:05 +00:00
Hannibal Smith	1b4e299461	build: add gemini and xai provider modules Expands token-monitor with two new provider types: - providers/gemini.js — Google Gemini API (body-based quota, no headers) - Probes generateContent endpoint (1 token), falls back gemini-2.0-flash → gemini-2.5-flash - Parses QuotaFailure violations + RetryInfo from 429 JSON body - Returns: status, quota_violations[], retry_delay_seconds, severity - providers/xai.js — x.ai/Grok (OpenAI-compatible header schema) - Reads x-ratelimit-{limit,remaining}-{requests,tokens} headers - Handles: no_key, ok, rate_limited, invalid_key states - Warning threshold: < 10% remaining on requests or tokens Both providers handle missing API keys gracefully (status: no_key). Classification via providers/index.js using baseUrl patterns. 140/140 tests passing. Closes recon findings from trentuna/a-team#91.	2026-04-04 17:52:37 +00:00
B.A. Baracus	988618e165	test: add gemini and xai parser unit tests	2026-04-04 17:51:38 +00:00
Hannibal Smith	07a544c50d	build: token-monitor v0.1.0 — modular LLM API quota visibility Implements modular provider probing with two distinct header schemas: - Teams direct (unified schema): 5h/7d utilization floats, status, reset countdown - Shelley proxy (classic schema): token/request counts + Exedev-Gateway-Cost (USD/call) - api-ateam: reports no billing data (confirmed non-existent by recon) Key: uses claude-haiku-4-5-20251001 for minimal probe calls (1 token). Rate-limit headers present on ALL responses (200 and 429). 113/113 tests passing. Built from Face recon (trentuna/a-team#91) — live header capture confirmed unified schema with utilization floats replaces old per-count schema.	2026-04-04 17:01:05 +00:00
Vigilio Desto	760049a25e	Initial commit	2026-04-04 16:35:33 +00:00
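
The provider selection described in the recommend.js commits (6e6d93f3bf, e52ba2921c) can be sketched roughly as below. This is a minimal illustration, not the repository's code. The function name selectProvider, the record shape { name, status, util7d }, the returned object, and the null provider in Phase 3 are all assumptions. Only the three phases, the 75% default threshold, the allowed / allowed_warning statuses, and the team-vigilio fail-safe come from the commit messages.

```javascript
// Hypothetical sketch of the three-phase selection described in the log.
const SWITCH_THRESHOLD = 0.75; // default from the commit message (75% 7d utilization)
const USABLE = new Set(['allowed', 'allowed_warning']); // both can serve requests

// chain: ordered array of { name, status, util7d } (assumed record shape)
function selectProvider(chain, threshold = SWITCH_THRESHOLD) {
  try {
    const usable = chain.filter((p) => USABLE.has(p.status));

    // Phase 1: first usable provider, in chain order, under the threshold.
    const under = usable.find((p) => p.util7d < threshold);
    if (under) return { provider: under.name, emergency: false, phase: 1 };

    // Phase 2: every usable provider is over threshold; pick the lowest utilization.
    if (usable.length > 0) {
      const lowest = usable.reduce((a, b) => (b.util7d < a.util7d ? b : a));
      const reason =
        lowest.status === 'allowed_warning'
          ? 'lowest 7d utilization (warning)'
          : 'lowest 7d utilization';
      return { provider: lowest.name, emergency: false, phase: 2, reason };
    }

    // Phase 3: nothing usable; signal that the emergency fallback is needed.
    // Returning provider: null here is an assumption of this sketch.
    return { provider: null, emergency: true, phase: 3 };
  } catch (err) {
    // Fail safe: per the log, any error returns team-vigilio.
    return { provider: 'team-vigilio', emergency: false, phase: 0, reason: String(err) };
  }
}
```

With the situation from e52ba2921c (team-vigilio at 79% 7d with allowed_warning, team-ludo showing invalid_key), this sketch skips Phase 1 because 0.79 is over the threshold and returns team-vigilio from Phase 2 rather than escalating to the emergency fallback.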

14 commits
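
The double /v1 fix from c7e6438398 (strip a trailing /vN from baseUrl before building the probe URL) could look something like the sketch below. The helper name buildProbeUrl and its default path argument are assumptions for illustration. The commit message only states that the trailing version segment is stripped and that the probe now targets GET /v1/models.

```javascript
// Hypothetical helper: strip a trailing /vN segment so the version is not doubled.
function buildProbeUrl(baseUrl, path = '/v1/models') {
  // Remove a trailing "/v1", "/v2", ... (with or without a final slash).
  const root = baseUrl.replace(/\/v\d+\/?$/, '');
  // Drop any remaining trailing slashes before appending the probe path.
  return root.replace(/\/+$/, '') + path;
}

console.log(buildProbeUrl('https://api.x.ai/v1'));
// prints "https://api.x.ai/v1/models"
```

A base URL without a version suffix passes through unchanged apart from trailing slashes, so the same helper works for both conventions.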