Ingest Fragment API

Benchmarked against: Anthropic Files API
Architecture: MTAAA 5-node classification pipeline
Spec: docs/MTAAA-Spec-v1.4-DRAFT.md

The Ingest Fragment API is SuperPortia's primary method for adding knowledge to the Universal Brain (UB). Every piece of content (files, text, URLs, screenshots) passes through the MTAAA pipeline for automatic classification.

Pipeline overview

Node	Role	Output
file_detector	Identifies input type and format	`{type, mime, encoding}`
content_extractor	Extracts readable content	`{text, metadata}`
feature_learner	Identifies entities, topics, patterns	`{entities, tags, features}`
schema_matcher	Maps to MTAAA 3D taxonomy (Topic x Type x Lifecycle)	`{topic, type, lifecycle}`
archivist	Writes to UB with full metadata	`{entry_id, status}`
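
The node sequence above can be sketched as a simple chain in which each node contributes one output record. This is only an illustration under stated assumptions: the node bodies, the `run_pipeline` helper, and every returned value are hypothetical placeholders, not the MTAAA implementation. Outputs are stored per node because `file_detector` and `schema_matcher` both emit a `type` field.

```python
from typing import Callable, Dict, List

Outputs = Dict[str, dict]

# Hypothetical stand-ins for the five nodes; real node logic is not shown on this page.
def file_detector(raw: str, outputs: Outputs) -> dict:
    return {"type": "text", "mime": "text/plain", "encoding": "utf-8"}

def content_extractor(raw: str, outputs: Outputs) -> dict:
    return {"text": raw, "metadata": {}}

def feature_learner(raw: str, outputs: Outputs) -> dict:
    return {"entities": [], "tags": [], "features": {}}

def schema_matcher(raw: str, outputs: Outputs) -> dict:
    # Placeholder classification using example values from the 3D taxonomy.
    return {"topic": "AI Agents > Architecture", "type": "Specification", "lifecycle": "versioned"}

def archivist(raw: str, outputs: Outputs) -> dict:
    # Placeholder entry id and status; nothing is written anywhere.
    return {"entry_id": "ub-demo", "status": "success"}

NODES: List[Callable[[str, Outputs], dict]] = [
    file_detector, content_extractor, feature_learner, schema_matcher, archivist,
]

def run_pipeline(raw: str) -> Outputs:
    """Run the nodes in order, keeping each node's output under its own name."""
    outputs: Outputs = {}
    for node in NODES:
        outputs[node.__name__] = node(raw, outputs)
    return outputs
```

Each node receives the outputs of the nodes before it, so a later node such as `schema_matcher` could read `outputs["feature_learner"]` if needed.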

Input types

Type	Parameter	Example
File	`input_type: "file"`	`/path/to/document.pdf`
Text	`input_type: "text"`	Raw text content
URL	`input_type: "url"`	`https://example.com/article`
Screenshot	`input_type: "screenshot"`	Image file path

API reference

`ingest_fragment`

# Basic file ingestion
result = ingest_fragment(
    path="/path/to/file.py",
    input_type="file",
    source="manual"
)

# Text ingestion
result = ingest_fragment(
    path="The content to ingest as text...",
    input_type="text",
    source="api"
)

# URL ingestion
result = ingest_fragment(
    path="https://example.com/article",
    input_type="url",
    source="manual"
)

Parameters

Parameter	Type	Required	Description
`path`	string	Yes	File path, text content, or URL
`input_type`	string	No	`file` (default), `text`, `url`, `screenshot`
`source`	string	No	`manual` (default), `nq_alpha`, `ss_vault`, `downloads`, `api`
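
As a minimal sketch of the parameter rules in the table, a caller could validate arguments before invoking `ingest_fragment`. The `build_ingest_args` helper is hypothetical and not part of the API; whether the real function rejects unknown values in the same way is an assumption.

```python
# Allowed values copied from the parameter table above.
VALID_INPUT_TYPES = {"file", "text", "url", "screenshot"}
VALID_SOURCES = {"manual", "nq_alpha", "ss_vault", "downloads", "api"}

def build_ingest_args(path: str, input_type: str = "file", source: str = "manual") -> dict:
    """Hypothetical helper: apply the documented defaults and reject unknown values."""
    if not path:
        raise ValueError("path is required")
    if input_type not in VALID_INPUT_TYPES:
        raise ValueError(f"unknown input_type: {input_type!r}")
    if source not in VALID_SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    return {"path": path, "input_type": input_type, "source": source}
```

The returned dictionary can then be unpacked into the call, for example `ingest_fragment(**build_ingest_args("/path/to/file.py"))`.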

Response

{
  "entry_id": "ub-a1b2c3d4e5f6",
  "category": "source_code",
  "title": "Auto-generated title from content",
  "write_status": "success",
  "vectorized": true
}
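
A caller will usually want to check the response before relying on the new entry. The sketch below assumes the response arrives as a Python dictionary with the fields shown above, and that any `write_status` other than `"success"` indicates a failed write; this page only documents the success value, so treat that rule and the `check_ingest_result` helper as hypothetical.

```python
from typing import Tuple

def check_ingest_result(result: dict) -> Tuple[str, bool]:
    """Hypothetical helper: return (entry_id, vectorized) or raise if the write failed."""
    missing = [key for key in ("entry_id", "write_status") if key not in result]
    if missing:
        raise KeyError(f"response is missing fields: {missing}")
    # Assumption: only "success" means the entry was written.
    if result["write_status"] != "success":
        raise RuntimeError(f"ingest failed with status {result['write_status']!r}")
    return result["entry_id"], bool(result.get("vectorized", False))
```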

Auto-tagging

The pipeline automatically adds governance metadata:

Tag	Source	Example
`source_ship`	`SP_SHIP_ID` env var	`SS1`
`ss_agent_id`	`SP_AGENT_ID` env var	`mac-cli`
Timestamp	System clock	`2026-03-05T10:45:00Z`
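
The table above can be read as a small function from the environment and the clock to a metadata record. The sketch below is an assumption about how such tags might be assembled, not the pipeline's actual code; the `governance_tags` helper and the `timestamp` key name are hypothetical.

```python
import os
from datetime import datetime, timezone
from typing import Mapping, Optional

def governance_tags(env: Optional[Mapping[str, str]] = None,
                    now: Optional[datetime] = None) -> dict:
    """Hypothetical helper: build the governance metadata described in the table."""
    env = os.environ if env is None else env
    now = datetime.now(timezone.utc) if now is None else now
    return {
        "source_ship": env.get("SP_SHIP_ID"),    # None if the variable is unset
        "ss_agent_id": env.get("SP_AGENT_ID"),   # None if the variable is unset
        # UTC timestamp in the same format as the example value above.
        "timestamp": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

Passing `env` and `now` explicitly keeps the helper easy to test; in normal use both default to the process environment and the system clock.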

5-handler routing

MTAAA routes content through specialized handlers based on detected type:

Handler	Content types	Key features
text_subgraph	Articles, notes, decisions, research	Full NLP classification
code_handler	Source code, scripts, configs	Language detection, function extraction
image_handler	Screenshots, photos, diagrams	Multimodal description
structured_handler	JSON, CSV, YAML	Schema inference
mixed_handler	PDFs, notebooks, rich documents	Multi-section processing
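
To illustrate the routing idea, the sketch below picks a handler name from a file extension. The real pipeline routes on the detected content type, so the extension table, the `pick_handler` function, and the fallback to `text_subgraph` are all assumptions made for this example.

```python
from pathlib import PurePosixPath

# Hypothetical extension-to-handler table based on the content types listed above.
HANDLER_BY_SUFFIX = {
    ".md": "text_subgraph",
    ".txt": "text_subgraph",
    ".py": "code_handler",
    ".sh": "code_handler",
    ".png": "image_handler",
    ".jpg": "image_handler",
    ".json": "structured_handler",
    ".csv": "structured_handler",
    ".yaml": "structured_handler",
    ".pdf": "mixed_handler",
    ".ipynb": "mixed_handler",
}

def pick_handler(path: str) -> str:
    """Return the handler name for a path; unknown extensions fall back to text."""
    suffix = PurePosixPath(path).suffix.lower()
    return HANDLER_BY_SUFFIX.get(suffix, "text_subgraph")
```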

MTAAA 3D classification

Every entry is classified along three dimensions using the Controlled Vocabulary:

Dimension	What it answers	Example values
Topic	What is this about?	`AI Agents > Architecture`, `Trading > Strategy`
Type	What kind of content?	`Specification`, `Decision Record`, `Research`
Lifecycle	How current?	`versioned`, `persistent`, `ephemeral`

The LLM classifier selects values from the Controlled Vocabulary (CV) only; freeform values are not allowed.
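
The closed-vocabulary rule can be sketched as a membership check on each dimension. The vocabulary below contains only the example values from the table, not the full Controlled Vocabulary, and `validate_classification` is a hypothetical helper rather than part of MTAAA.

```python
from typing import Dict, List, Set

# Example values only; the real Controlled Vocabulary is defined elsewhere.
CONTROLLED_VOCABULARY: Dict[str, Set[str]] = {
    "topic": {"AI Agents > Architecture", "Trading > Strategy"},
    "type": {"Specification", "Decision Record", "Research"},
    "lifecycle": {"versioned", "persistent", "ephemeral"},
}

def validate_classification(classification: Dict[str, str]) -> List[str]:
    """Return one error message per dimension whose value is missing or not in the CV."""
    errors = []
    for dimension, allowed in CONTROLLED_VOCABULARY.items():
        value = classification.get(dimension)
        if value not in allowed:
            errors.append(f"{dimension}: {value!r} is not in the Controlled Vocabulary")
    return errors
```

An empty list means the classification uses only permitted values on all three dimensions.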

Ingestion quality checklist

Before calling `ingest_fragment()`:

Title: Will be auto-generated, but you can set it explicitly for important entries
Language: All UB entries must be in English (Captain decision, 2026-02-28)
Duplicates: Run `search_brain()` first to check for existing entries
Tags: Auto-assigned by pipeline; add manual tags for mandatory categories (see Controlled Vocabulary)
Self-contained: Content should be understandable without external context
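
The duplicate check from the list above could be wrapped as follows. The signature of `search_brain` is not documented on this page, so the sketch assumes it accepts a query string and returns a list of matches; both functions are passed in as arguments so that the example does not depend on the real implementations. `ingest_if_new` is a hypothetical helper.

```python
from typing import Callable, List

def ingest_if_new(path: str,
                  query: str,
                  search_brain: Callable[[str], List[dict]],
                  ingest_fragment: Callable[..., dict]) -> dict:
    """Hypothetical helper: skip ingestion when a search already finds matching entries."""
    existing = search_brain(query)  # assumed to return a list of matching entries
    if existing:
        return {"skipped": True, "existing": existing}
    return ingest_fragment(path=path, input_type="file", source="manual")
```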

Batch ingestion

For multiple files, create a work order with file paths in the description:

# WO description for batch ingest
/path/to/file1.md
/path/to/file2.py
/path/to/file3.json

Then dispatch the work order with engine: "ingest". This engine is free and incurs no LLM cost.
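
For illustration, a work order description in the format shown above can be turned into a list of paths like this. How the ingest engine actually parses the description is not specified here; treating lines that start with `#` as comments and skipping blank lines are assumptions, and `parse_batch_description` is a hypothetical helper.

```python
from typing import List

def parse_batch_description(description: str) -> List[str]:
    """Extract one file path per non-empty, non-comment line."""
    paths = []
    for line in description.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comment lines carry no paths.
        if not line or line.startswith("#"):
            continue
        paths.append(line)
    return paths
```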

Where content lands

Stage	Table	Status
UB Dock	`entries`	Unclassified, searchable by keyword
UB Main	`classified_entries`	Fully classified (3D), vector-indexed

Related pages

Page	Relationship
UB Entry CRUD	Reading and updating entries
Controlled Vocabulary	Classification taxonomy
Search Brain	Finding ingested content
