Pricing

Benchmarked against: Anthropic, Pricing
Rule source: Company Constitution §6, Cost Awareness governance rule
Captain insight: Paying for the 20x Max Plan plus large API deposits means every token has a real cost

SuperPortia's cost structure spans Claude subscription plans, low-cost API engines, and infrastructure services. Understanding these costs is essential for engine selection and operational efficiency.

Claude pricing

Subscription plans

Resource	Billing	Shared with	Notes
All Models quota (Opus, Sonnet, Haiku)	Max Plan monthly	claude.ai Chat + Claude Code CLI	Chat (小西) and Code (小克) share this quota
Sonnet-only quota	Separate monthly allowance	claude.ai Chat + Claude Code CLI	Independent; does not consume the All Models quota
API credits (extra usage / LiteLLM)	Per-token	Direct API calls only	Applies once the plan quota is exceeded

Critical insight: All Models quota is shared between claude.ai (Chat tab = 小西) and Claude Code (Code tab = 小克). Heavy CLI usage directly reduces Chat availability, and vice versa.

Claude per-token rates (API / extra usage)

Model	Input	Output
Opus 4.6	Most expensive	Most expensive
Sonnet 4.6	Moderate	Moderate
Haiku 4.5	Cheapest Claude	Cheapest Claude

Exact per-token pricing follows Anthropic's published rates. Check anthropic.com/pricing for current numbers.

Low-cost engine pricing

Engine	Default Model	Cost per request	Monthly estimate (100 req/day)
Groq	Llama 3.3 70B	Free	$0
Groq Search	Compound	Free	$0
Gemini	2.5 Flash	~$0.003	~$9
Gemini Search	+ Google Grounding	~$0.014	~$42
DeepSeek	R1 / V3	Cents	~$5-10
Mistral	Latest	Cents	~$5-10
Zhipu	GLM-5	Cents	~$5-10
Ingest	MTAAA Pipeline	Free (internal)	$0
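The monthly estimates in the table are the per-request cost multiplied by the request volume. A minimal sketch of that arithmetic, assuming a 30-day month (the table's figures are consistent with this) and using only the engines with a numeric per-request cost; `monthly_estimate` is an illustrative helper, not part of any SuperPortia tool:

```python
# Per-request costs in USD, copied from the table above.
COST_PER_REQUEST = {
    "Groq": 0.0,
    "Groq Search": 0.0,
    "Gemini": 0.003,
    "Gemini Search": 0.014,
}


def monthly_estimate(cost_per_request: float,
                     requests_per_day: int = 100,
                     days: int = 30) -> float:
    """Estimated monthly spend in USD for one engine (30-day month assumed)."""
    return round(cost_per_request * requests_per_day * days, 2)


for engine, cost in COST_PER_REQUEST.items():
    print(f"{engine}: ${monthly_estimate(cost):.2f}/month")
```

Changing `requests_per_day` shows how quickly the paid engines scale: at 1,000 requests per day the same per-request rates give ten times the monthly figure.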

Infrastructure pricing

All infrastructure currently operates within free tiers:

Service	Free tier	Current usage	Overage cost
Cloudflare Workers	100K requests/day	Well within	$0.50/million
Cloudflare D1	5M reads/day, 100K writes/day	Well within	$0.001/million reads
Cloudflare Vectorize	30M queries/month	Well within	Usage-based
Cloudflare R2	10GB storage, 1M reads/month	Well within	$0.015/GB/month
Supabase	Free tier / Pro plan	Varies by project	Plan-based
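To judge how far usage is from incurring charges, the free-tier limit and overage rate can be combined into a rough estimate. The sketch below uses the Cloudflare Workers row from the table (100K requests/day free, $0.50 per million beyond that). It is deliberately simplified: it assumes overage is charged per request above the daily limit, which may not match how the provider actually bills, and `overage_cost` is a hypothetical helper.

```python
def overage_cost(requests: int,
                 free_limit: int = 100_000,
                 rate_per_million: float = 0.50) -> float:
    """Rough overage in USD for one day of requests (simplified billing model)."""
    # Only requests beyond the free limit are charged in this model.
    billable = max(0, requests - free_limit)
    return billable / 1_000_000 * rate_per_million


print(overage_cost(80_000))   # within the free tier
print(overage_cost(250_000))  # 150K requests over the limit
```

Under these assumptions, even a day at 2.5x the free limit costs well under a dollar, which is why the engine costs above dominate the budget rather than infrastructure.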

Cost comparison: engine selection impact

The same task at different engine levels:

Task	Free (Groq)	Cheap (Gemini)	Standard (Sonnet)	Premium (Opus)
Web search	$0	$0.014	$0.10+	$0.50+
Text summary	$0	$0.003	$0.05+	$0.30+
Code generation	N/A	N/A	$0.10-0.50	$0.50-2.00
File operations	N/A	N/A	$0.10-0.50	$0.50-2.00
WO dispatch	$0	$0.01	$0.50-1.00	$1.00-2.00

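Key takeaway: Delegating research to Groq or Gemini instead of running it on Opus cuts the per-query cost by roughly 10-100x. For example, a web search costs about $0.014 on Gemini versus $0.50 or more on Opus, and on Groq it is free.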
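The size of the saving can be checked directly from the comparison table. The following sketch takes the lower-bound figures for two rows and computes how many times more an expensive engine costs than a cheap one; `cost_ratio` is an illustrative helper, and the free Groq column is excluded because a ratio against $0 is undefined.

```python
# Lower-bound per-task costs in USD, taken from the comparison table.
WEB_SEARCH = {"Gemini": 0.014, "Sonnet": 0.10, "Opus": 0.50}
TEXT_SUMMARY = {"Gemini": 0.003, "Sonnet": 0.05, "Opus": 0.30}


def cost_ratio(costs: dict[str, float], expensive: str, cheap: str) -> float:
    """Return how many times more `expensive` costs than `cheap`."""
    if costs[cheap] == 0:
        raise ValueError("ratio is undefined for a free engine")
    return costs[expensive] / costs[cheap]


print(round(cost_ratio(WEB_SEARCH, "Opus", "Gemini"), 1))    # about 35.7
print(round(cost_ratio(TEXT_SUMMARY, "Opus", "Gemini"), 1))  # about 100.0
```

Because the Sonnet and Opus figures are lower bounds ("+"), the real ratios are at least this large.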

Cost optimization strategies

Strategy	Savings	How
UB-first search	Massive	Call `search_brain()` before any external search; the answer may already be ingested
Delegate research	10-100x	Use `intel_search` (Groq, free) or `search_web` instead of Opus WebSearch
Batch patrol	Linear	Run one patrol per domain instead of many individual searches
Engine matching	Variable	Match engine to task importance per the selection guide
Progressive Disclosure	Token savings	Load skills on demand, not at startup
Concise prompts	Token savings	Keep rules slim; skills load only when invoked
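The first two strategies combine into a simple fallback order: check the UB first, and go to an external engine only on a miss. The sketch below illustrates that control flow. The real signatures and return types of `search_brain` and `intel_search` are not documented here, so the sketch takes them as plain callables that accept a query string and return a list of hits; the `fake_*` functions are stand-ins for illustration only.

```python
from typing import Callable


def ub_first_lookup(query: str,
                    search_brain: Callable[[str], list],
                    external_search: Callable[[str], list]) -> tuple[str, list]:
    """Try the internal lookup first; call the external engine only on a miss."""
    hits = search_brain(query)
    if hits:
        return ("brain", hits)
    return ("external", external_search(query))


# Hypothetical stand-ins; the real tools may behave differently.
def fake_search_brain(query: str) -> list:
    known = {"d1 free tier": ["5M reads/day, 100K writes/day"]}
    return known.get(query.lower(), [])


def fake_intel_search(query: str) -> list:
    return [f"external result for: {query}"]


print(ub_first_lookup("D1 free tier", fake_search_brain, fake_intel_search))
print(ub_first_lookup("R2 egress", fake_search_brain, fake_intel_search))
```

Returning the source alongside the hits makes it easy to count how often the UB answers a query without any external spend.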

Monitoring and tracking

# Check engine availability and API key status
list_models()

# Check UB volume (indicates ingestion costs)
get_stats()

# Check WO history (each WO has engine + actual_hours)
list_work_orders(include_completed=True)
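Until per-WO cost estimation exists, work order history can serve as a rough proxy for where effort is going. The sketch below totals `actual_hours` per engine. It assumes `list_work_orders` returns a list of dicts with `engine` and `actual_hours` keys, which is a guess at the shape based on the comment above; the sample records are invented for illustration.

```python
from collections import defaultdict


def hours_by_engine(work_orders: list[dict]) -> dict[str, float]:
    """Sum actual_hours per engine; missing or null hours count as zero."""
    totals: dict[str, float] = defaultdict(float)
    for wo in work_orders:
        totals[wo["engine"]] += wo.get("actual_hours") or 0.0
    return dict(totals)


# Hypothetical records standing in for list_work_orders(include_completed=True).
sample = [
    {"id": "WO-1", "engine": "opus", "actual_hours": 1.5},
    {"id": "WO-2", "engine": "gemini", "actual_hours": 0.25},
    {"id": "WO-3", "engine": "opus", "actual_hours": None},
    {"id": "WO-4", "engine": "opus", "actual_hours": 0.5},
]

print(hours_by_engine(sample))
```

Hours are not cost, but a breakdown like this shows which engines carry most of the workload and where delegation to a cheaper engine would matter most.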

Monitoring gaps (planned)

Feature	Status
Per-session token counter	Planned
Per-WO cost estimation	Planned
Monthly cost dashboard	Planned
Budget alerts	Planned
Cost-per-capability tracking	Planned

Related pages

Page	Relationship
Choosing an Engine	Selection framework
Usage & Cost	Admin tracking tools
Cost Awareness	Governance rule
Engine Overview	Full engine catalog
