Cost Awareness
Benchmarked against: Anthropic - Streaming refusals (cost/quota management)
Version: 2.0 | Scope: All agents
Every token costs real money. SuperPortia operates on the Claude Max Plan, where the All Models quota is shared between Claude Chat and Claude Code.
The cost principle
CP (cost-performance) value is NOT "cheapest possible" but "minimum cost that gets the job done RIGHT."
- Free engines (Groq) = ONLY for trivial tasks (random searches, unimportant cleanup)
- Cheap engines (Gemini, DeepSeek) = Important tasks (intel analysis, research, meeting notes)
- Claude = Code operations, architecture, decisions only
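The principle above can be read as "pick the cheapest engine that is still trusted with the task". The following is a minimal sketch of that reading, not an existing SuperPortia component. The tier constants, the `Engine` class, and `cheapest_adequate` are hypothetical names, and `cost_rank` is an ordinal ranking for illustration, not a real price.

```python
from dataclasses import dataclass

# Task tiers mirroring the three bullets above (hypothetical constants).
TRIVIAL, IMPORTANT, CRITICAL = 0, 1, 2


@dataclass(frozen=True)
class Engine:
    name: str
    cost_rank: int  # 0 = free; higher = more expensive (ordinal, not a price)
    max_tier: int   # highest task tier this engine is trusted with


ENGINES = [
    Engine("Groq", cost_rank=0, max_tier=TRIVIAL),
    Engine("Gemini", cost_rank=1, max_tier=IMPORTANT),
    Engine("DeepSeek", cost_rank=1, max_tier=IMPORTANT),
    Engine("Claude", cost_rank=2, max_tier=CRITICAL),
]


def cheapest_adequate(task_tier: int) -> Engine:
    """Return the lowest-cost engine that is trusted with the given tier."""
    adequate = [e for e in ENGINES if e.max_tier >= task_tier]
    if not adequate:
        raise ValueError(f"no engine is trusted with tier {task_tier}")
    # min() keeps the first engine on ties, so list order is the tie-breaker.
    return min(adequate, key=lambda e: e.cost_rank)
```

Filtering by adequacy first and minimizing cost second is what keeps "cheapest possible" from winning: a free engine is never chosen for a task above its trusted tier.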
Role table
| Role | Agent | Cost | Best for |
|---|---|---|---|
| Chief Engineer | Claude Code (Opus) | $$$$ | Architecture, decisions, delegation |
| Executor | Antigravity / Workers | Free | Coding, executing WOs |
| Intel Officer | Groq / Gemini / DeepSeek | cents | External research |
| Courier | cron + bash | Free | Scheduled checks |
| Strategist | Claude AI Chat (Sonnet) | $$ | Strategy analysis, reviews |
Search flow (mandatory)
1. Search UB first (search_brain) — FREE
2. UB empty → delegate to Groq/Gemini — cents
3. Results → ingest to UB (ingest_fragment) — becomes reusable asset
4. NEVER use Opus WebSearch/WebFetch directly
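The four steps above could be wired together roughly as follows. This is a sketch under assumptions: the real signatures of `search_brain` and `ingest_fragment` are not given in this document, so they are passed in as plain callables, and `delegate_search` is a hypothetical stand-in for whichever Groq/Gemini wrapper is in use.

```python
from typing import Callable, List


def cost_aware_search(
    query: str,
    search_brain: Callable[[str], List[str]],     # assumed: returns UB hits
    delegate_search: Callable[[str], List[str]],  # hypothetical cheap-engine call
    ingest_fragment: Callable[[str], None],       # assumed: stores one result in UB
) -> List[str]:
    """Search UB first; delegate only on a miss, then ingest what came back."""
    # Step 1: UB lookup is free, so it always runs first.
    hits = search_brain(query)
    if hits:
        return hits

    # Step 2: UB was empty, so pay cents for a delegated search.
    results = delegate_search(query)

    # Step 3: ingest every result so the next identical query is free.
    for fragment in results:
        ingest_fragment(fragment)

    # Step 4 is enforced by omission: this path never calls Opus web tools.
    return results
```

Passing the three operations in as arguments keeps the flow testable with in-memory fakes and makes no claim about how the real tools are invoked.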
Engine selection guide
| Task type | Engine | Cost | Notes |
|---|---|---|---|
| Trivial search, cleanup | Groq | Free | Fast, but prone to hallucination on important tasks |
| Research, intel analysis | Gemini | ~$0.014/search | Authoritative, has citations |
| Reasoning, analysis | DeepSeek | Very cheap | Good for complex analysis |
| Chinese NLP, tool-calling | Zhipu (GLM-5) | Cheap | Best for Chinese tasks |
| Code editing, file ops | Claude | ~$1-2/run | ONLY engine that can modify files |
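One way to make the guide above machine-checkable is a plain lookup table. The sketch below is illustrative only: the task-type keys and the `select_engine` function are hypothetical names, and the engine strings are copied from the table.

```python
# Task type -> engine, transcribed from the engine selection guide.
# The snake_case keys are hypothetical labels chosen for this sketch.
ENGINE_BY_TASK = {
    "trivial_search": "Groq",
    "cleanup": "Groq",
    "research": "Gemini",
    "intel_analysis": "Gemini",
    "reasoning": "DeepSeek",
    "analysis": "DeepSeek",
    "chinese_nlp": "Zhipu (GLM-5)",
    "tool_calling": "Zhipu (GLM-5)",
    "code_editing": "Claude",
    "file_ops": "Claude",
}


def select_engine(task_type: str) -> str:
    """Return the engine the guide assigns to a task type."""
    try:
        return ENGINE_BY_TASK[task_type]
    except KeyError:
        # Fail loudly rather than silently defaulting to the most expensive engine.
        raise ValueError(f"unknown task type: {task_type!r}") from None
```

For example, `select_engine("research")` returns `"Gemini"`. Raising on unknown task types, rather than falling back to Claude, keeps an unclassified task from quietly consuming the most expensive quota.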
What Opus should do
- Architecture design and key decisions
- Reviewing others' work
- Establishing standards and patterns
- Delegating tasks to cheaper engines
- Complex multi-step reasoning
What Opus should NOT do
- Repetitive coding (delegate)
- Data searching (delegate to search_brain or intel_search)
- Format conversion
- Long explanations of simple concepts
Billing structure
| Quota | Scope | Notes |
|---|---|---|
| All Models | claude.ai + Claude Code CLI (shared) | This is what 小克 uses |
| Sonnet Only | Separate quota | Does not consume All Models |
| Extra Usage | Overflow or direct API (LiteLLM) | Pay-per-token |