Engine-based Tool Routing
- Benchmarked against: Anthropic - Programmatic Tool Calling
- Architecture: supervisor.py flat/nested dual-mode routing
- Key insight: Different engines support different tool-calling protocols
Not all engines handle tool calling the same way. SuperPortia's supervisor automatically routes tool calls through the correct mode based on the engine's capabilities.
The problem
LLM engines differ in how they handle tool calls:
| Engine | Tool call format | Multi-agent support | Issue |
|---|---|---|---|
| Claude | Native JSON | Full (agent-as-tool) | None |
| Gemini | Native JSON | Full (agent-as-tool) | None |
| Groq (Llama 3.3) | XML format | Limited | Uses XML instead of JSON |
| DeepSeek | JSON (partial) | Limited | Unreliable with nested agents |
| Mistral | JSON (partial) | Limited | Unreliable with nested agents |
| Zhipu | JSON (partial) | Limited | Unreliable with nested agents |
If you send a nested agent-as-tool call to Groq, it fails because Groq uses XML tool calling instead of JSON.
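The table above can also be written down as data, which makes the difference between engines easy to check in code. This is only a sketch: the `EngineCapabilities` type, its field names, the `ENGINE_CAPABILITIES` mapping, and `can_nest` are hypothetical names chosen for illustration and are not taken from supervisor.py. The values simply restate the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EngineCapabilities:
    """Hypothetical record restating one row of the table above."""
    tool_call_format: str  # "json", "json-partial", or "xml"
    multi_agent: str       # "full" or "limited"


# Assumed encoding of the table; not the project's real data structure.
ENGINE_CAPABILITIES = {
    "claude": EngineCapabilities("json", "full"),
    "gemini": EngineCapabilities("json", "full"),
    "groq": EngineCapabilities("xml", "limited"),
    "deepseek": EngineCapabilities("json-partial", "limited"),
    "mistral": EngineCapabilities("json-partial", "limited"),
    "zhipu": EngineCapabilities("json-partial", "limited"),
}


def can_nest(engine: str) -> bool:
    """True only for engines with native JSON calls and full multi-agent support.

    Unknown engines are treated conservatively and return False.
    """
    caps = ENGINE_CAPABILITIES.get(engine)
    if caps is None:
        return False
    return caps.tool_call_format == "json" and caps.multi_agent == "full"
```

Treating unknown engines as not nest-capable is a design choice of this sketch: a new engine would fall back to the more compatible path until its capabilities are recorded.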
Dual-mode routing
The supervisor automatically detects the engine and selects the appropriate mode:
Nested mode
- Engines: Claude, Gemini
- How: Supervisor delegates to worker agents, each with their own tool set
- Advantage: Cleaner separation of concerns; workers can have specialized context
- Use when: Complex multi-step tasks requiring different expertise
Flat mode
- Engines: Groq, DeepSeek, Mistral, Zhipu
- How: Supervisor directly calls tools without intermediate worker agents
- Advantage: Compatible with engines that have limited tool-calling support
- Use when: Simple tasks, cost-sensitive operations
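The difference between the two modes can be illustrated with a toy dispatcher. Everything here is an assumption made for the example: the worker names, the `search` and `summarize` tools, and the `call_tool` function are hypothetical and do not reflect the actual supervisor.py code. The sketch only shows the shape of the idea: in nested mode a call goes through the worker that owns the tool, and in flat mode the supervisor invokes the tool directly.

```python
from typing import Callable, Dict

Tool = Callable[[str], str]


def search(query: str) -> str:
    """Stand-in tool; a real tool would call an external service."""
    return f"results for {query}"


def summarize(text: str) -> str:
    """Stand-in tool."""
    return f"summary of {text}"


# Nested mode: hypothetical worker agents, each with its own tool set.
WORKERS: Dict[str, Dict[str, Tool]] = {
    "researcher": {"search": search},
    "writer": {"summarize": summarize},
}

# Flat mode: the same tools, exposed directly to the supervisor.
FLAT_TOOLS: Dict[str, Tool] = {
    name: tool for tools in WORKERS.values() for name, tool in tools.items()
}


def call_tool(mode: str, tool_name: str, arg: str) -> str:
    """Route one tool call and tag the result with whoever executed it."""
    if mode == "nested":
        # Delegate to the worker that owns the requested tool.
        for worker, tools in WORKERS.items():
            if tool_name in tools:
                return f"[{worker}] {tools[tool_name](arg)}"
        raise KeyError(f"no worker owns tool {tool_name!r}")
    if mode == "flat":
        # No intermediate worker: the supervisor calls the tool itself.
        return f"[supervisor] {FLAT_TOOLS[tool_name](arg)}"
    raise ValueError(f"unknown mode {mode!r}")
```

For example, `call_tool("nested", "search", "llm routing")` is executed by the `researcher` worker, while the same call with `"flat"` is executed by the supervisor.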
Supervisor implementation
The supervisor (supervisor.py) implements automatic mode switching:
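    # Simplified logic
    def select_mode(engine: str) -> str:
        """Return "nested" for engines that support worker agents, else "flat"."""
        # Engines with native JSON tool calls and full agent-as-tool support.
        nested_capable = {"claude", "gemini"}
        if engine in nested_capable:
            return "nested"
        # Every other engine (Groq, DeepSeek, Mistral, Zhipu) falls back to flat mode.
        return "flat"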
Flat mode tool restrictions
Some tools are excluded from flat mode due to known issues:
| Tool | Flat mode | Reason |
|---|---|---|
| ingest_text | Excluded | Groq output explodes into garbage tokens (Unicode escape sequences) |
| Complex nested calls | Excluded | XML format cannot express nested structures |
When a tool is excluded from flat mode, the orchestrator layer handles it instead.
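A minimal sketch of that exclusion follows, assuming the restriction is applied as a simple name filter. The `FLAT_MODE_EXCLUDED` set, the `partition_tools` function, and the `search_notes` tool name in the usage note are hypothetical; only `ingest_text` comes from the table above.

```python
from typing import List, Tuple

# Tools withheld from flat mode. Only ingest_text is named in the table above.
FLAT_MODE_EXCLUDED = frozenset({"ingest_text"})


def partition_tools(mode: str, requested: List[str]) -> Tuple[List[str], List[str]]:
    """Split requested tools into (supervisor_tools, orchestrator_tools).

    In flat mode, excluded tools are deferred to the orchestrator layer.
    In any other mode, nothing is deferred.
    """
    if mode != "flat":
        return list(requested), []
    allowed = [t for t in requested if t not in FLAT_MODE_EXCLUDED]
    deferred = [t for t in requested if t in FLAT_MODE_EXCLUDED]
    return allowed, deferred
```

With a hypothetical request of `["search_notes", "ingest_text"]`, flat mode would keep `search_notes` for the supervisor and defer `ingest_text`, while nested mode would keep both.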
Engine routing table
| Task type | Recommended engine | Mode | Cost |
|---|---|---|---|
| Research / analysis | Groq | Flat | Free |
| Intel with citations | Gemini | Nested | ~$0.014 |
| Code editing | Claude | Nested | $$$$ |
| Classification | Groq / DeepSeek | Flat | Free / cents |
| Chinese NLP | Zhipu | Flat | Cents |
| Complex orchestration | Claude / Gemini | Nested | $$-$$$$ |
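The routing table above could be encoded as a lookup, sketched here under stated assumptions: the task-type keys, the `Route` type, and the `recommend` function are hypothetical names, and costs are omitted because the table gives them only approximately. Where the table lists two engines, both are kept in the order shown.

```python
from typing import Dict, NamedTuple, Tuple


class Route(NamedTuple):
    """Hypothetical record for one row of the routing table."""
    engines: Tuple[str, ...]  # recommended engines, in the table's order
    mode: str                 # "flat" or "nested"


# Assumed keys; the table itself uses free-text task descriptions.
ROUTING_TABLE: Dict[str, Route] = {
    "research_analysis": Route(("groq",), "flat"),
    "intel_with_citations": Route(("gemini",), "nested"),
    "code_editing": Route(("claude",), "nested"),
    "classification": Route(("groq", "deepseek"), "flat"),
    "chinese_nlp": Route(("zhipu",), "flat"),
    "complex_orchestration": Route(("claude", "gemini"), "nested"),
}


def recommend(task_type: str) -> Route:
    """Look up the recommended engines and mode for a task type."""
    try:
        return ROUTING_TABLE[task_type]
    except KeyError:
        raise ValueError(f"no routing entry for task type: {task_type!r}") from None
```

Raising on an unknown task type, rather than silently defaulting to a free engine, keeps a mistyped key from sending important work to the cheapest option.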
Dispatch engine routing
The dispatch_work_order tool provides a simplified routing interface:
| Engine | Type | Cost | Best for |
|---|---|---|---|
| groq | LLM | Free | Research, analysis, simple tasks |
| groq-search | LLM + Web | Free | Intel gathering with web search |
| gemini | LLM | Cheap | General tasks |
| gemini-search | LLM + Web | Cheap | Authoritative research with citations |
| deepseek | LLM | Very cheap | Reasoning, analysis |
| mistral | LLM | Cheap | European model, alternative |
| zhipu | LLM | Cheap | Chinese NLP, agent tool-calling |
| claude | LLM | Expensive | Code editing, file operations only |
| ingest | Pipeline | Free | Batch file ingestion |
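Because the engine is selected by name, a caller may want to validate the name before dispatching. The sketch below is an assumption-laden illustration, not the dispatch_work_order interface, whose actual parameters are not described here. `DISPATCH_ENGINES`, `validate_engine`, and `engines_of_type` are hypothetical; the names, types, and cost labels restate the table above.

```python
from typing import Dict, List, Tuple

# engine name -> (type, cost label), restating the table above.
DISPATCH_ENGINES: Dict[str, Tuple[str, str]] = {
    "groq": ("LLM", "Free"),
    "groq-search": ("LLM + Web", "Free"),
    "gemini": ("LLM", "Cheap"),
    "gemini-search": ("LLM + Web", "Cheap"),
    "deepseek": ("LLM", "Very cheap"),
    "mistral": ("LLM", "Cheap"),
    "zhipu": ("LLM", "Cheap"),
    "claude": ("LLM", "Expensive"),
    "ingest": ("Pipeline", "Free"),
}


def validate_engine(name: str) -> str:
    """Normalize an engine name and reject anything not in the table."""
    key = name.strip().lower()
    if key not in DISPATCH_ENGINES:
        valid = ", ".join(sorted(DISPATCH_ENGINES))
        raise ValueError(f"unknown engine {name!r}; expected one of: {valid}")
    return key


def engines_of_type(engine_type: str) -> List[str]:
    """List engine names of a given type, in the table's order."""
    return [n for n, (t, _cost) in DISPATCH_ENGINES.items() if t == engine_type]
```

For instance, `engines_of_type("LLM + Web")` yields the two web-search variants, `groq-search` and `gemini-search`.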
Known limitations
| Limitation | Impact | Workaround |
|---|---|---|
| Groq XML tool calls | Cannot use nested mode | Auto-switches to flat |
| Groq token explosion on ingest | Garbage output | ingest_text excluded from flat mode |
| DeepSeek nesting | Unreliable worker agents | Auto-switches to flat |
| Free engines + important tasks | Poor quality results | Use Gemini/Claude for important work |
Captain's principle: "Minimum cost that gets the job done RIGHT." Free engines are only for trivial tasks.
Related pages
| Page | Relationship |
|---|---|
| Tool Discovery | Finding available tools |
| Dispatch Modes | Engine dispatch reference |
| Choosing an Engine | Engine selection guide |