
Usage & Cost Tracking

Benchmarked against: Anthropic (Usage and Cost API / Claude Code Analytics API)
Rule source: EGS v1.2 + Company Constitution §6
Tools: list_models, get_stats

Cost awareness is a core operating principle at SuperPortia. The Captain pays for a 20x Max Plan plus large API deposits, so every token, every API call, and every engine invocation has a real cost. This page defines how costs are tracked, what the cost structure looks like, and how to optimize spending.


Cost structure

SuperPortia uses multiple AI engines, each with different cost profiles:

Claude (primary, high cost)

| Resource | Billing | Shared with |
|---|---|---|
| All Models quota (Opus, Sonnet, Haiku) | Max Plan monthly | Claude.ai Chat + Claude Code CLI |
| Sonnet-only quota | Separate monthly allowance | Claude.ai Chat + Claude Code CLI |
| API credits | Per-token (extra usage / LiteLLM) | Direct API calls only |

Key insight: The All Models quota is shared between claude.ai (Chat tab = 小西) and Claude Code (Code tab = 小克). Heavy CLI usage directly reduces Chat availability.

Low-cost engines

| Engine | Cost | Billing | Best for |
|---|---|---|---|
| Groq (Llama 3.3 70B) | Free | Free tier | Research, analysis, simple tasks |
| Gemini (2.5 Flash) | ~$0.014/search | Per-request | Authoritative search, citations |
| DeepSeek (R1/V3) | Cents | Per-token | Analysis, reasoning |
| Mistral | Cents | Per-token | European alternative |
| Zhipu (GLM-5) | Cents | Per-token | Chinese NLP, tool-calling |

Infrastructure

| Service | Cost | Billing |
|---|---|---|
| Cloudflare Workers | Free tier (100K req/day) | Monthly |
| Cloudflare D1 | Free tier (5M reads/day) | Monthly |
| Cloudflare Vectorize | Free tier (30M queries/month) | Monthly |
| Cloudflare R2 | Free tier (10GB storage) | Monthly |
| Supabase | Free tier / Pro plan | Monthly |

Cost hierarchy

The engine selection principle (Captain decision, 2026-02-27):

CP value is NOT "cheapest possible" but "minimum cost that gets the job done RIGHT."

| Task importance | Engine | Rationale |
|---|---|---|
| Trivial (random searches, cleanup) | Groq (free) | Not worth spending on |
| Standard (research, analysis, summaries) | Gemini / DeepSeek (cents) | Quality matters, cost is minimal |
| Important (intel analysis, key research) | Gemini with citations | Need authoritative, verified results |
| Critical (code editing, file operations) | Claude | Only engine with full tool access |
| Architecture (decisions, design, delegation) | Opus (direct) | Worth every token |

Anti-pattern: Using free Groq for important tasks led to hallucinated version numbers, poor intel quality, and bad meeting summaries. The cost of bad output far exceeds the savings.
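
The importance-to-engine mapping in the cost hierarchy table can be sketched as a simple lookup. This is a minimal illustration only: the `Importance` enum, the `select_engine` function, and the string labels are hypothetical names introduced here, not actual engine identifiers used by SuperPortia tooling.

```python
from enum import Enum


class Importance(Enum):
    """Task importance levels, taken from the cost hierarchy table."""
    TRIVIAL = "trivial"
    STANDARD = "standard"
    IMPORTANT = "important"
    CRITICAL = "critical"
    ARCHITECTURE = "architecture"


# Mapping copied from the table. The string labels are illustrative
# placeholders (assumption), not real engine IDs.
ENGINE_FOR_IMPORTANCE = {
    Importance.TRIVIAL: "groq",
    Importance.STANDARD: "gemini-or-deepseek",
    Importance.IMPORTANT: "gemini-with-citations",
    Importance.CRITICAL: "claude",
    Importance.ARCHITECTURE: "opus",
}


def select_engine(importance: Importance) -> str:
    """Return the engine label that the hierarchy assigns to a task."""
    return ENGINE_FOR_IMPORTANCE[importance]
```

Keeping the mapping in one table-like structure makes the anti-pattern above easy to spot in review: an important task routed to the free tier shows up as a mismatch against the lookup.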


Role-based cost guidance

| Role | Cost level | Should do | Should NOT do |
|---|---|---|---|
| Opus (小克) | $$$$ | Architecture, decisions, delegation, reviews | Repetitive coding, data searching, format conversion |
| Sonnet (小A) | $$ | Coding, execution, standard analysis | Architecture decisions |
| Groq/Gemini | Cents | External research, web search, translation | File operations, code editing |
| Cron/Bash | Free | Scheduled checks, automation | - |

See the Cost Awareness governance rule for the full policy.


Tracking tools

Engine usage

# List all available engines and their status
list_models()

Returns available providers, models, API key status, and default selections.

UB usage

# Get UB statistics
get_stats()

Returns total entries, category breakdown, and source distribution, which is useful for understanding ingestion volume and storage growth.

WO-based tracking

Every Work Order (WO) captures actual_hours on completion, which gives a rough measure of agent time spent. Combined with the engine selected for each WO, this yields a cost-per-task estimate.
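
As a rough sketch of how actual_hours and the engine selection might be combined, the snippet below totals hours per engine. Only `actual_hours` comes from the text above; the `WorkOrder` dataclass, its `wo_id` and `engine` fields, and `hours_by_engine` are assumptions made for illustration, and the real WO schema may differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class WorkOrder:
    wo_id: str           # hypothetical identifier field
    engine: str          # hypothetical: engine selected for this WO
    actual_hours: float  # captured on completion, per the text above


def hours_by_engine(work_orders):
    """Sum actual_hours for each engine across completed Work Orders."""
    totals = defaultdict(float)
    for wo in work_orders:
        totals[wo.engine] += wo.actual_hours
    return dict(totals)


# Example with made-up Work Orders (illustrative data only).
sample = [
    WorkOrder("WO-1", "opus", 1.5),
    WorkOrder("WO-2", "groq", 0.25),
    WorkOrder("WO-3", "opus", 0.5),
]
summary = hours_by_engine(sample)
```

Hours per engine are only a proxy for cost; converting them to currency would need per-engine rates, which this page does not define.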


Cost optimization strategies

| Strategy | How | Savings |
|---|---|---|
| Delegate search | Use intel_search (Groq) instead of Opus WebSearch | Massive: Opus tokens vs. free |
| Batch research | Run patrol once, not individual searches | Fewer API calls |
| UB-first | search_brain before any external search | Free (already ingested) |
| Engine matching | Match engine to task importance | Avoid over-spending |
| Progressive Disclosure | Load skills on demand, not at startup | Fewer tokens in context |
| Concise prompts | Rules are slim; skills load only when needed | Smaller context window |
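
The UB-first strategy can be sketched as a lookup with a fallback. The real signatures of search_brain and intel_search are not documented on this page, so the sketch takes them as injected callables that accept a query string and return a list of results; that shape, and the name `ub_first_search`, are assumptions.

```python
from typing import Callable, List


def ub_first_search(
    query: str,
    search_brain: Callable[[str], List[str]],
    external_search: Callable[[str], List[str]],
) -> List[str]:
    """Query the UB first; call the external engine only on a miss."""
    hits = search_brain(query)
    if hits:
        # Already ingested, so no external API call is needed.
        return hits
    return external_search(query)
```

Passing the search functions in as parameters keeps the fallback order testable with stubs, without calling any paid engine.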

Monitoring gaps (future work)

| Gap | What's needed |
|---|---|
| Per-session token tracking | Count Opus tokens per session |
| Per-WO cost estimation | Engine cost × tokens used |
| Monthly cost dashboard | Aggregate across all agents and engines |
| Budget alerts | Warn when approaching quota limits |
| Cost-per-capability | Track cost to build/maintain each system capability |

These are inspection mirror gaps: we know we need them and are building them incrementally.
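
Two of the gaps listed above reduce to simple arithmetic, sketched here under stated assumptions. The function names, the per-million-token pricing unit, and the 80% warning threshold are all hypothetical choices for illustration; none of them is an existing SuperPortia tool or an agreed policy value.

```python
def estimate_cost_usd(tokens_used: int, usd_per_million_tokens: float) -> float:
    """Per-WO cost estimation: engine cost x tokens used.

    The per-million-token rate is an assumed pricing unit; real rates
    must come from each provider's price list.
    """
    return tokens_used / 1_000_000 * usd_per_million_tokens


def quota_alert(used: float, limit: float, warn_fraction: float = 0.8) -> bool:
    """Budget alert: True once usage reaches warn_fraction of the limit.

    The 0.8 default is an illustrative threshold, not a policy value.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return used / limit >= warn_fraction
```

For example, `quota_alert(90_000, 100_000)` would flag a Cloudflare Workers day that has used 90K of the 100K free-tier requests listed in the Infrastructure table.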


| Page | Relationship |
|---|---|
| Cost Awareness | Governance rule for cost discipline |
| Engine Overview | All engines and their capabilities |
| Fleet Management | Fleet-wide operations |
| SRE Status | System health affects cost (retries, failures) |