Cost Awareness

Benchmarked against: Anthropic, Streaming refusals (cost/quota management)
Version: 2.0 | Scope: All agents

Every token costs real money. SuperPortia runs on the Claude Max plan, where the All Models quota is shared between Claude Chat and Claude Code.


The cost principle

CP value (cost-performance) does NOT mean "cheapest possible." It means "the minimum cost that gets the job done RIGHT."

  • Free engines (Groq) = ONLY for trivial tasks (random searches, unimportant cleanup)
  • Cheap engines (Gemini, DeepSeek) = Important tasks (intel analysis, research, meeting notes)
  • Claude = Code operations, architecture, decisions only
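
As an illustration of the principle above, here is a minimal Python sketch that picks the cheapest tier among those trusted with a task, rather than the cheapest tier overall. The `Tier` class, the task-class labels, the numeric cost ranks, and the assumption that the cheap tier could also take trivial work are all hypothetical and not part of SuperPortia.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    cost_rank: int        # lower is cheaper; illustrative ordering only
    handles: frozenset    # task classes this tier is trusted with


# Tier names follow the list above; ranks and labels are assumptions.
TIERS = [
    Tier("free (Groq)", 0, frozenset({"trivial"})),
    Tier("cheap (Gemini, DeepSeek)", 1, frozenset({"trivial", "important"})),
    Tier("Claude", 2, frozenset({"code", "architecture", "decision"})),
]


def pick_tier(task_class: str) -> Tier:
    """Return the lowest-cost tier that is trusted with this task class."""
    qualified = [t for t in TIERS if task_class in t.handles]
    if not qualified:
        raise ValueError(f"no tier is trusted with task class {task_class!r}")
    return min(qualified, key=lambda t: t.cost_rank)


if __name__ == "__main__":
    for task in ("trivial", "important", "code"):
        print(task, "->", pick_tier(task).name)
```

The quality filter runs before the cost comparison, so an important task never lands on the free tier even though that tier is cheaper.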

Role table

Role | Agent | Cost | Best for
Chief Engineer | Claude Code (Opus) | $$$$ | Architecture, decisions, delegation
Executor | Antigravity / Workers | Free | Coding, executing WOs
Intel Officer | Groq / Gemini / DeepSeek | cents | External research
Courier | cron + bash | Free | Scheduled checks
Strategist | Claude AI Chat (Sonnet) | $$ | Strategy analysis, reviews

Search flow (mandatory)

1. Search UB first (search_brain) — FREE
2. UB empty → delegate to Groq/Gemini — cents
3. Results → ingest to UB (ingest_fragment) — becomes reusable asset
4. NEVER use Opus WebSearch/WebFetch directly
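
The four steps above can be sketched in Python as follows. This is a minimal sketch, not the actual SuperPortia implementation: the real signatures of `search_brain` and `ingest_fragment` are not given here, so they are passed in as plain callables, and `delegate_search` is a hypothetical stand-in for the Groq/Gemini delegation step.

```python
from typing import Callable, List


def find_answer(
    query: str,
    search_brain: Callable[[str], List[str]],
    delegate_search: Callable[[str], List[str]],
    ingest_fragment: Callable[[str], None],
) -> List[str]:
    """UB-first lookup with delegation on a miss (assumed signatures)."""
    # Step 1: search UB first; this costs nothing.
    hits = search_brain(query)
    if hits:
        return hits

    # Step 2: UB is empty, so delegate to a cheap engine.
    results = delegate_search(query)

    # Step 3: ingest every result so the next lookup is a free UB hit.
    for fragment in results:
        ingest_fragment(fragment)

    # Step 4 is enforced by omission: no code path calls Opus
    # WebSearch/WebFetch.
    return results


if __name__ == "__main__":
    # In-memory stand-ins, for demonstration only.
    store: List[str] = []
    print(find_answer(
        "quota",
        lambda q: [f for f in store if q in f],
        lambda q: [q + ": delegated result"],
        store.append,
    ))
```

Because step 3 writes back to UB, repeating the same query should be served from UB without paying for a second delegated search.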

Engine selection guide

Task type | Engine | Cost | Notes
Trivial search, cleanup | Groq | Free | Fast, but prone to hallucination on important tasks
Research, intel analysis | Gemini | ~$0.014/search | Authoritative, has citations
Reasoning, analysis | DeepSeek | Very cheap | Good for complex analysis
Chinese NLP, tool-calling | Zhipu (GLM-5) | Cheap | Best for Chinese tasks
Code editing, file ops | Claude | ~$1-2/run | ONLY engine that can modify files
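
The guide above could be encoded as a simple lookup, with a rough cost estimate for Gemini searches. This is a sketch under stated assumptions: the task-type keys and function names are hypothetical, and the per-search figure is the approximate value from the guide, not a quoted price.

```python
# Task-type keys are illustrative labels; engines follow the guide above.
ENGINE_GUIDE = {
    "trivial_search": "Groq",
    "cleanup": "Groq",
    "research": "Gemini",
    "intel_analysis": "Gemini",
    "reasoning": "DeepSeek",
    "chinese_nlp": "Zhipu (GLM-5)",
    "tool_calling": "Zhipu (GLM-5)",
    "code_editing": "Claude",
    "file_ops": "Claude",
}

# Approximate figure taken from the guide (~$0.014 per search).
GEMINI_COST_PER_SEARCH_USD = 0.014


def engine_for(task_type: str) -> str:
    """Look up the engine for a task type; unknown types raise KeyError."""
    return ENGINE_GUIDE[task_type]


def estimate_gemini_cost(num_searches: int) -> float:
    """Rough cost in USD for a batch of Gemini searches."""
    return round(num_searches * GEMINI_COST_PER_SEARCH_USD, 3)


if __name__ == "__main__":
    print(engine_for("research"))
    print(estimate_gemini_cost(100))
```

At these approximate rates, about 100 Gemini searches cost roughly as much as a single Claude run at the low end of its $1-2 range.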

What Opus should do

  • Architecture design and key decisions
  • Reviewing others' work
  • Establishing standards and patterns
  • Delegating tasks to cheaper engines
  • Complex multi-step reasoning

What Opus should NOT do

  • Repetitive coding (delegate)
  • Data searching (delegate to search_brain or intel_search)
  • Format conversion
  • Long explanations of simple concepts

Billing structure

Quota | Scope | Notes
All Models | claude.ai + Claude Code CLI (shared) | This is what 小克 uses
Sonnet Only | Separate quota | Does not consume All Models
Extra Usage | Overflow or direct API (LiteLLM) | Pay-per-token