
ingest_fragment

Benchmarked against: Anthropic (Memory tool)

MCP Tool: Available on both Cloud UB and Local UBI servers

Cost: Free (internal operation)

ingest_fragment is the primary content ingestion tool: the entry point for all knowledge entering Universal Brain. It runs the full MTAAA pipeline to classify and store content.


Usage

# Ingest a file
ingest_fragment(path="/path/to/spec.md", input_type="file")

# Ingest text directly
ingest_fragment(path="LangGraph 1.0.9 released with new checkpoint API...", input_type="text")

# Ingest from URL
ingest_fragment(path="https://example.com/article", input_type="url")

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| path | string | required | File path, text content, or URL |
| input_type | string | "file" | file / text / url / screenshot |
| source | string | "manual" | manual / api / nq_alpha / ss_vault / downloads |
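
The parameter table can be mirrored in a small client-side check that runs before the tool call. This is a minimal sketch, not part of Universal Brain: `build_ingest_args` is a hypothetical helper, and it assumes the values listed in the table are the only ones accepted.

```python
# Allowed values copied from the parameter table above.
ALLOWED_INPUT_TYPES = {"file", "text", "url", "screenshot"}
ALLOWED_SOURCES = {"manual", "api", "nq_alpha", "ss_vault", "downloads"}


def build_ingest_args(path: str, input_type: str = "file", source: str = "manual") -> dict:
    """Hypothetical helper: validate arguments before calling ingest_fragment()."""
    if not path:
        raise ValueError("path is required")
    if input_type not in ALLOWED_INPUT_TYPES:
        raise ValueError(f"unknown input_type: {input_type!r}")
    if source not in ALLOWED_SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    return {"path": path, "input_type": input_type, "source": source}


# Defaults are filled in when only the path is given.
args = build_ingest_args("/path/to/spec.md")
```

The returned dictionary can then be passed to the tool call as keyword arguments, so a typo in `input_type` fails locally instead of reaching the server.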

Response

{
  "entry_id": "ub-abc123def456",
  "category": "knowledge",
  "title": "LangGraph 1.0.9 Release Notes",
  "vectorized": true
}
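
A caller will usually want the response as a typed object rather than a raw dictionary. The sketch below assumes the response arrives as a JSON string with exactly the four fields shown above; `IngestResult` and `parse_ingest_response` are illustrative names, not part of the tool.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class IngestResult:
    """Hypothetical typed view of the ingest_fragment response."""
    entry_id: str
    category: str
    title: str
    vectorized: bool


def parse_ingest_response(raw: str) -> IngestResult:
    """Parse the JSON response; a missing field raises KeyError."""
    data = json.loads(raw)
    return IngestResult(
        entry_id=data["entry_id"],
        category=data["category"],
        title=data["title"],
        vectorized=bool(data["vectorized"]),
    )


sample = '{"entry_id": "ub-abc123def456", "category": "knowledge", "title": "LangGraph 1.0.9 Release Notes", "vectorized": true}'
result = parse_ingest_response(sample)
```

Checking `result.vectorized` before relying on search is a reasonable precaution, since an entry that was stored but not vectorized may not be retrievable by semantic search.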

The MTAAA pipeline

Every ingestion runs through the 5-node classification pipeline:

  1. Caller → Agent calls ingest_fragment()
  2. UBI Router → Detects content type, routes to handler
  3. Handler (e.g., ๆ–‡ๅญ—้‹็ˆบ) → Classifies content
  4. 3D Classification → Topic × Type × Lifecycle from Controlled Vocabulary
  5. Result → Entry stored in D1 + vectorized for search

See File Ingestion (MTAAA) for full pipeline details.
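
The five nodes above can be sketched as a small routing function. This is a toy model under stated assumptions, not the real MTAAA implementation: the vocabulary values, the `text_handler` rule, and the in-memory `store` list (standing in for D1 plus vectorization) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Classification:
    topic: str
    entry_type: str
    lifecycle: str


# Hypothetical controlled vocabulary; the real one is defined by UB Governance.
VOCAB = {
    "topic": {"engineering", "general"},
    "entry_type": {"release-notes", "note"},
    "lifecycle": {"active", "archived"},
}


def text_handler(content: str) -> Classification:
    """Node 3: placeholder rule standing in for real classification."""
    if "released" in content.lower():
        return Classification("engineering", "release-notes", "active")
    return Classification("general", "note", "active")


# Node 2: the router maps an input type to a handler (only text is sketched).
HANDLERS: Dict[str, Callable[[str], Classification]] = {"text": text_handler}


def run_pipeline(path: str, input_type: str, store: List[dict]) -> dict:
    handler = HANDLERS.get(input_type)
    if handler is None:
        raise ValueError(f"no handler for input_type {input_type!r}")
    classification = handler(path)
    # Node 4: every axis must come from the controlled vocabulary.
    for axis, allowed in VOCAB.items():
        if getattr(classification, axis) not in allowed:
            raise ValueError(f"{axis} not in controlled vocabulary")
    # Node 5: store the entry (a list here, in place of D1 and the vector index).
    entry = {"content": path, "classification": classification}
    store.append(entry)
    return entry


store: List[dict] = []
entry = run_pipeline("LangGraph 1.0.9 released with new checkpoint API...", "text", store)
```

Keeping the vocabulary check separate from the handler means a handler cannot introduce a new label on its own, which is the point of a controlled vocabulary.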

Ingestion rules

From Company Constitution §1-§2 and UB Governance:

| Rule | Detail |
| --- | --- |
| Language | All entries must be in English |
| Title | Descriptive, searchable |
| Content | Self-contained (reader needs no other context) |
| Tags | Lowercase, hyphenated, max 8 per entry |
| No duplicates | Call search_brain() first to check |
| Timestamp | Include Taipei time |
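
Two of these rules are mechanical enough to check in code: the tag format and the Taipei timestamp. The sketch below is one possible reading of the rules, not an official validator. It assumes tags may contain digits and that an ISO 8601 string is an acceptable timestamp format. Taipei is treated as a fixed UTC+8 offset, since it observes no daylight saving time.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import List, Optional

# Lowercase words joined by single hyphens, e.g. "release-notes".
TAG_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")
MAX_TAGS = 8
TAIPEI = timezone(timedelta(hours=8))


def validate_tags(tags: List[str]) -> List[str]:
    """Return a list of problems; an empty list means the tags pass."""
    problems = []
    if len(tags) > MAX_TAGS:
        problems.append(f"too many tags: {len(tags)} > {MAX_TAGS}")
    for tag in tags:
        if not TAG_PATTERN.match(tag):
            problems.append(f"tag not lowercase-hyphenated: {tag!r}")
    return problems


def taipei_timestamp(now: Optional[datetime] = None) -> str:
    """Format a timezone-aware datetime (default: current time) in Taipei time."""
    now = now or datetime.now(timezone.utc)
    return now.astimezone(TAIPEI).isoformat(timespec="seconds")


problems = validate_tags(["langgraph", "release-notes"])
stamp = taipei_timestamp(datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc))
```

Returning a list of problems rather than raising on the first one lets a caller report every tag issue in a single pass.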

Quality checklist

Before calling ingest_fragment():

  1. Is the title descriptive and searchable?
  2. Is the content self-contained?
  3. Do the tags follow the controlled vocabulary (lowercase, hyphenated, max 8)?
  4. Did you search UB first to avoid duplicates?
  5. Is the content in English?
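
The duplicate check in step 4 is the easiest one to automate. The sketch below wraps the search and the ingest in one function. The exact signature and return shape of search_brain() are not given here, so both tools are passed in as callables, and the sketch assumes the search returns a sequence that is empty when nothing matches. `ingest_if_new` is a hypothetical name.

```python
from typing import Callable, Optional, Sequence


def ingest_if_new(
    title: str,
    content: str,
    search: Callable[[str], Sequence[dict]],
    ingest: Callable[..., dict],
) -> Optional[dict]:
    """Search by title first; ingest only when no match is found."""
    matches = search(title)
    if matches:
        # A likely duplicate exists, so skip ingestion.
        return None
    return ingest(path=content, input_type="text")


# Stand-ins for the real tools, used only to exercise the sketch.
known_titles = {"LangGraph 1.0.9 Release Notes"}


def fake_search(query: str) -> Sequence[dict]:
    return [{"title": query}] if query in known_titles else []


def fake_ingest(path: str, input_type: str) -> dict:
    return {"entry_id": "ub-example", "vectorized": True}


skipped = ingest_if_new("LangGraph 1.0.9 Release Notes", "LangGraph 1.0.9 released...", fake_search, fake_ingest)
created = ingest_if_new("A new topic", "Self-contained English text.", fake_search, fake_ingest)
```

An exact title match is a weak duplicate test. In practice the search results would still need a human or agent review before deciding to skip.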