# Usage & Cost Tracking
Benchmarked against: Anthropic Usage and Cost API / Claude Code Analytics API
Rule source: EGS v1.2 + Company Constitution §6
Tools: `list_models`, `get_stats`
Cost awareness is a core operating principle at SuperPortia. The Captain pays for a 20x Max Plan plus large API deposits; every token, every API call, and every engine invocation has real cost. This page defines how costs are tracked, what the cost structure looks like, and how to optimize spending.
## Cost structure
SuperPortia uses multiple AI engines, each with different cost profiles:
### Claude (primary, high cost)
| Resource | Billing | Shared with |
|---|---|---|
| All Models quota (Opus, Sonnet, Haiku) | Max Plan monthly | Claude.ai Chat + Claude Code CLI |
| Sonnet-only quota | Separate monthly allowance | Claude.ai Chat + Claude Code CLI |
| API credits | Per-token (extra usage / LiteLLM) | Direct API calls only |
Key insight: the All Models quota is shared between claude.ai (the Chat tab, 小西) and Claude Code (the Code tab). Heavy CLI usage directly reduces Chat availability.
### Low-cost engines
| Engine | Cost | Billing | Best for |
|---|---|---|---|
| Groq (Llama 3.3 70B) | Free | Free tier | Research, analysis, simple tasks |
| Gemini (2.5 Flash) | ~$0.014/search | Per-request | Authoritative search, citations |
| DeepSeek (R1/V3) | Cents | Per-token | Analysis, reasoning |
| Mistral | Cents | Per-token | European alternative |
| Zhipu (GLM-5) | Cents | Per-token | Chinese NLP, tool-calling |
### Infrastructure
| Service | Cost | Billing |
|---|---|---|
| Cloudflare Workers | Free tier (100K req/day) | Monthly |
| Cloudflare D1 | Free tier (5M reads/day) | Monthly |
| Cloudflare Vectorize | Free tier (30M queries/month) | Monthly |
| Cloudflare R2 | Free tier (10GB storage) | Monthly |
| Supabase | Free tier / Pro plan | Monthly |
## Cost hierarchy
The engine selection principle (Captain decision, 2026-02-27):
> CP value is NOT "cheapest possible" but "minimum cost that gets the job done RIGHT."
| Task importance | Engine | Rationale |
|---|---|---|
| Trivial (random searches, cleanup) | Groq (free) | Not worth spending on |
| Standard (research, analysis, summaries) | Gemini / DeepSeek (cents) | Quality matters, cost is minimal |
| Important (intel analysis, key research) | Gemini with citations | Need authoritative, verified results |
| Critical (code editing, file operations) | Claude | Only engine with full tool access |
| Architecture (decisions, design, delegation) | Opus (direct) | Worth every token |
Anti-pattern: Using free Groq for important tasks led to hallucinated version numbers, poor intel quality, and bad meeting summaries. The cost of bad output far exceeds the savings.
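The hierarchy above amounts to a small selection policy, which can be sketched as a lookup. This is purely illustrative (there is no `select_engine` function in the system); the tier keys and engine labels mirror the table, and the fail-toward-quality default for unknown tiers is an assumption consistent with the anti-pattern note.

```python
# Illustrative sketch of the engine-selection policy table above.
# Tier names and engine labels mirror the table; nothing here is a real API.
TIER_TO_ENGINE = {
    "trivial": "groq",          # free; not worth spending on
    "standard": "deepseek",     # cents; quality matters, cost is minimal
    "important": "gemini",      # with citations; authoritative results
    "critical": "claude",       # only engine with full tool access
    "architecture": "opus",     # worth every token
}

def select_engine(importance: str) -> str:
    """Return the cheapest engine that gets the job done right."""
    try:
        return TIER_TO_ENGINE[importance]
    except KeyError:
        # Unknown importance: fail toward quality, not toward "free".
        return "claude"
```

The fallback direction is the point: the anti-pattern above shows that defaulting to the free tier for unclassified work is the expensive mistake.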
## Role-based cost guidance
| Role | Cost level | Should do | Should NOT do |
|---|---|---|---|
| Opus | $$$$ | Architecture, decisions, delegation, reviews | Repetitive coding, data searching, format conversion |
| Sonnet (小A) | $$ | Coding, execution, standard analysis | Architecture decisions |
| Groq/Gemini | Cents | External research, web search, translation | File operations, code editing |
| Cron/Bash | Free | Scheduled checks, automation | (none) |
See Cost Awareness governance rule for the full policy.
## Tracking tools
### Engine usage
```python
# List all available engines and their status
list_models()
```
Returns available providers, models, API key status, and default selections.
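The output can be post-processed to see which engines are actually usable. The return shape below is an assumption for illustration only; the real tool's schema is not documented on this page.

```python
# Sketch only: assumes list_models() returns a dict shaped roughly like
# {"providers": [{"name": ..., "has_key": ..., "models": [...]}]}.
def engines_ready(status: dict) -> list[str]:
    """Names of providers that have an API key configured."""
    return [p["name"] for p in status.get("providers", []) if p.get("has_key")]

example = {
    "providers": [
        {"name": "groq", "has_key": True, "models": ["llama-3.3-70b"]},
        {"name": "mistral", "has_key": False, "models": []},
    ]
}
```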
### UB usage
```python
# Get UB statistics
get_stats()
```
Returns total entries, category breakdown, and source distribution, which is useful for understanding ingestion volume and storage growth.
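One low-tech way to turn this into growth tracking is to diff two snapshots. The `total_entries` field name is an assumption based on the description above, not a documented schema.

```python
# Sketch: estimate ingestion rate from two get_stats() snapshots taken
# `days` apart. Field name "total_entries" is assumed, not documented.
def entries_per_day(before: dict, after: dict, days: float) -> float:
    return (after["total_entries"] - before["total_entries"]) / days

rate = entries_per_day(
    {"total_entries": 1200},
    {"total_entries": 1500},
    days=10,
)
# rate: 30.0 new entries per day
```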
### WO-based tracking
Every Work Order records `actual_hours` on completion, which provides a rough measure of agent time spent. Combined with the engine selected for each WO, this yields a cost-per-task estimate.
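A minimal sketch of that cost-per-task estimate, assuming a per-engine hourly rate table. The rates here are made-up placeholders; real numbers would come from billing data.

```python
# Illustrative: combine a WO's actual_hours with a per-engine rate.
# Rates are placeholders, not real prices.
RATE_PER_HOUR = {"groq": 0.0, "gemini": 0.05, "claude": 2.00}

def wo_cost(engine: str, actual_hours: float) -> float:
    """Rough dollar estimate for one completed Work Order."""
    return RATE_PER_HOUR.get(engine, 0.0) * actual_hours
```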
## Cost optimization strategies
| Strategy | How | Savings |
|---|---|---|
| Delegate search | Use `intel_search` (Groq) instead of Opus WebSearch | Massive: Opus tokens vs. free |
| Batch research | Run patrol once, not individual searches | Fewer API calls |
| UB-first | `search_brain` before any external search | Free (already ingested) |
| Engine matching | Match engine to task importance | Avoid over-spending |
| Progressive Disclosure | Load skills on demand, not at startup | Fewer tokens in context |
| Concise prompts | Rules are slim; skills load only when needed | Smaller context window |
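The UB-first strategy is essentially a fallback chain: consult the (free) knowledge base before paying for an external call. A minimal sketch, with `search_brain` and `external_search` passed in as stand-ins for the real tools:

```python
# Sketch of the "UB-first" strategy: free lookup first, paid search only
# on a miss. Both callables are stand-ins for the actual tools.
def cheap_search(query: str, search_brain, external_search):
    hits = search_brain(query)
    if hits:                       # already ingested: zero marginal cost
        return hits
    return external_search(query)  # fall back to a paid engine
```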
## Monitoring gaps (future work)
| Gap | What's needed |
|---|---|
| Per-session token tracking | Count Opus tokens per session |
| Per-WO cost estimation | Engine cost × tokens used |
| Monthly cost dashboard | Aggregate across all agents and engines |
| Budget alerts | Warn when approaching quota limits |
| Cost-per-capability | Track cost to build/maintain each system capability |
These are inspection mirror gaps: we know we need them, and we are building them incrementally.
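The budget-alert gap, for instance, could start as small as a threshold check. The 80% default and message format below are illustrative, not a spec.

```python
# Sketch of a budget alert: warn when usage crosses a fraction of quota.
# Threshold and message format are illustrative choices.
def budget_alert(used_tokens: int, quota: int, warn_at: float = 0.8):
    """Return a warning string near the quota limit, else None."""
    frac = used_tokens / quota
    if frac >= warn_at:
        return f"WARNING: {frac:.0%} of quota used"
    return None
```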
## Related pages
| Page | Relationship |
|---|---|
| Cost Awareness | Governance rule for cost discipline |
| Engine Overview | All engines and their capabilities |
| Fleet Management | Fleet-wide operations |
| SRE Status | System health affects cost (retries, failures) |