# Usage & Cost Tracking
Benchmarked against: Anthropic Usage and Cost API / Claude Code Analytics API
Rule source: EGS v1.2 + Company Constitution §6
Tools: `list_models`, `get_stats`
Cost awareness is a core operating principle at SuperPortia. The Captain pays for a 20x Max Plan plus large API deposits; every token, every API call, and every engine invocation has real cost. This page defines how costs are tracked, what the cost structure looks like, and how to optimize spending.
## Cost structure
SuperPortia uses multiple AI engines, each with different cost profiles:
### Claude (primary, high cost)
| Resource | Billing | Shared with |
|---|---|---|
| All Models quota (Opus, Sonnet, Haiku) | Max Plan monthly | Claude.ai Chat + Claude Code CLI |
| Sonnet-only quota | Separate monthly allowance | Claude.ai Chat + Claude Code CLI |
| API credits | Per-token (extra usage / LiteLLM) | Direct API calls only |
Key insight: the All Models quota is shared between claude.ai (the Chat tab, 小西) and Claude Code (the Code tab). Heavy CLI usage directly reduces Chat availability.
### Low-cost engines
| Engine | Cost | Billing | Best for |
|---|---|---|---|
| Groq (Llama 3.3 70B) | Free | Free tier | Research, analysis, simple tasks |
| Gemini (2.5 Flash) | ~$0.014/search | Per-request | Authoritative search, citations |
| DeepSeek (R1/V3) | Cents | Per-token | Analysis, reasoning |
| Mistral | Cents | Per-token | European alternative |
| Zhipu (GLM-5) | Cents | Per-token | Chinese NLP, tool-calling |
### Infrastructure
| Service | Cost | Billing |
|---|---|---|
| Cloudflare Workers | Free tier (100K req/day) | Monthly |
| Cloudflare D1 | Free tier (5M reads/day) | Monthly |
| Cloudflare Vectorize | Free tier (30M queries/month) | Monthly |
| Cloudflare R2 | Free tier (10GB storage) | Monthly |
| Supabase | Free tier / Pro plan | Monthly |
## Cost hierarchy
The engine selection principle (Captain decision, 2026-02-27):
> CP value is NOT "cheapest possible" but "minimum cost that gets the job done RIGHT."
| Task importance | Engine | Rationale |
|---|---|---|
| Trivial (random searches, cleanup) | Groq (free) | Not worth spending on |
| Standard (research, analysis, summaries) | Gemini / DeepSeek (cents) | Quality matters, cost is minimal |
| Important (intel analysis, key research) | Gemini with citations | Need authoritative, verified results |
| Critical (code editing, file operations) | Claude | Only engine with full tool access |
| Architecture (decisions, design, delegation) | Opus (direct) | Worth every token |
Anti-pattern: Using free Groq for important tasks led to hallucinated version numbers, poor intel quality, and bad meeting summaries. The cost of bad output far exceeds the savings.
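The hierarchy above amounts to a small selection policy, which can be sketched as a lookup. This is purely illustrative (there is no `select_engine` function in the system); the tier keys and engine labels mirror the table, and the fail-toward-quality default for unknown tiers is an assumption consistent with the anti-pattern note.

```python
# Illustrative sketch of the engine-selection policy table above.
# Tier names and engine labels mirror the table; nothing here is a real API.
TIER_TO_ENGINE = {
    "trivial": "groq",          # free; not worth spending on
    "standard": "deepseek",     # cents; quality matters, cost is minimal
    "important": "gemini",      # with citations; authoritative results
    "critical": "claude",       # only engine with full tool access
    "architecture": "opus",     # worth every token
}

def select_engine(importance: str) -> str:
    """Return the cheapest engine that gets the job done right."""
    try:
        return TIER_TO_ENGINE[importance]
    except KeyError:
        # Unknown importance: fail toward quality, not toward "free".
        return "claude"
```

The fallback direction is the point: the anti-pattern above shows that defaulting to the free tier for unclassified work is the expensive mistake.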
## Role-based cost guidance
| Role | Cost level | Should do | Should NOT do |
|---|---|---|---|
| Opus | $$$$ | Architecture, decisions, delegation, reviews | Repetitive coding, data searching, format conversion |
| Sonnet (小A) | $$ | Coding, execution, standard analysis | Architecture decisions |
| Groq/Gemini | Cents | External research, web search, translation | File operations, code editing |
| Cron/Bash | Free | Scheduled checks, automation | (none) |
See Cost Awareness governance rule for the full policy.
## Tracking tools
### Engine usage
```python
# List all available engines and their status
list_models()
```
Returns available providers, models, API key status, and default selections.
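The output can be post-processed to see which engines are actually usable. The return shape below is an assumption for illustration only; the real tool's schema is not documented on this page.

```python
# Sketch only: assumes list_models() returns a dict shaped roughly like
# {"providers": [{"name": ..., "has_key": ..., "models": [...]}]}.
def engines_ready(status: dict) -> list[str]:
    """Names of providers that have an API key configured."""
    return [p["name"] for p in status.get("providers", []) if p.get("has_key")]

example = {
    "providers": [
        {"name": "groq", "has_key": True, "models": ["llama-3.3-70b"]},
        {"name": "mistral", "has_key": False, "models": []},
    ]
}
```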
### UB usage
```python
# Get UB statistics
get_stats()
```
Returns total entries, category breakdown, and source distribution, which is useful for understanding ingestion volume and storage growth.
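One low-tech way to turn this into growth tracking is to diff two snapshots. The `total_entries` field name is an assumption based on the description above, not a documented schema.

```python
# Sketch: estimate ingestion rate from two get_stats() snapshots taken
# `days` apart. Field name "total_entries" is assumed, not documented.
def entries_per_day(before: dict, after: dict, days: float) -> float:
    return (after["total_entries"] - before["total_entries"]) / days

rate = entries_per_day(
    {"total_entries": 1200},
    {"total_entries": 1500},
    days=10,
)
# rate: 30.0 new entries per day
```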
### WO-based tracking
Every Work Order records `actual_hours` on completion, which provides a rough measure of agent time spent. Combined with the engine selected for each WO, this yields a cost-per-task estimate.
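A minimal sketch of that cost-per-task estimate, assuming a per-engine hourly rate table. The rates here are made-up placeholders; real numbers would come from billing data.

```python
# Illustrative: combine a WO's actual_hours with a per-engine rate.
# Rates are placeholders, not real prices.
RATE_PER_HOUR = {"groq": 0.0, "gemini": 0.05, "claude": 2.00}

def wo_cost(engine: str, actual_hours: float) -> float:
    """Rough dollar estimate for one completed Work Order."""
    return RATE_PER_HOUR.get(engine, 0.0) * actual_hours
```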
## Cost optimization strategies
| Strategy | How | Savings |
|---|---|---|
| Delegate search | Use `intel_search` (Groq) instead of Opus WebSearch | Massive: Opus tokens vs. free |
| Batch research | Run patrol once, not individual searches | Fewer API calls |
| UB-first | `search_brain` before any external search | Free (already ingested) |
| Engine matching | Match engine to task importance | Avoid over-spending |
| Progressive Disclosure | Load skills on demand, not at startup | Fewer tokens in context |
| Concise prompts | Rules are slim; skills load only when needed | Smaller context window |
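The UB-first strategy is essentially a fallback chain: consult the (free) knowledge base before paying for an external call. A minimal sketch, with `search_brain` and `external_search` passed in as stand-ins for the real tools:

```python
# Sketch of the "UB-first" strategy: free lookup first, paid search only
# on a miss. Both callables are stand-ins for the actual tools.
def cheap_search(query: str, search_brain, external_search):
    hits = search_brain(query)
    if hits:                       # already ingested: zero marginal cost
        return hits
    return external_search(query)  # fall back to a paid engine
```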
## Monitoring gaps (future work)
| Gap | What's needed |
|---|---|
| Per-session token tracking | Count Opus tokens per session |
| Per-WO cost estimation | Engine cost × tokens used |
| Monthly cost dashboard | Aggregate across all agents and engines |
| Budget alerts | Warn when approaching quota limits |
| Cost-per-capability | Track cost to build/maintain each system capability |
These are inspection mirror gaps: we know we need them, and we are building them incrementally.
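The budget-alert gap, for instance, could start as small as a threshold check. The 80% default and message format below are illustrative, not a spec.

```python
# Sketch of a budget alert: warn when usage crosses a fraction of quota.
# Threshold and message format are illustrative choices.
def budget_alert(used_tokens: int, quota: int, warn_at: float = 0.8):
    """Return a warning string near the quota limit, else None."""
    frac = used_tokens / quota
    if frac >= warn_at:
        return f"WARNING: {frac:.0%} of quota used"
    return None
```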
## Related pages
| Page | Relationship |
|---|---|
| Cost Awareness | Governance rule for cost discipline |
| Engine Overview | All engines and their capabilities |
| Fleet Management | Fleet-wide operations |
| SRE Status | System health affects cost (retries, failures) |