UB Source Tracking
Benchmarked against: Anthropic (Citations)
Tools: ingest_fragment, search_brain, get_entry
Rule: UB Governance (EGS Chapter 9)
Every piece of knowledge in the Universal Brain has provenance — who created it, where it came from, when it was ingested, and which ship produced it. This tracking enables trust, audit, and quality control.
Provenance fields
Every UB entry automatically captures:
| Field | Source | Purpose |
|---|---|---|
| entry_id | Auto-generated | Unique identifier (e.g., ub-396f44b70763) |
| source_ship | SP_SHIP_ID env var | Which ship ingested this (SS1, SS2, SS3) |
| agent_id | SP_AGENT_ID env var | Which agent created this |
| source | Ingestion parameter | Origin: manual, api, nq_alpha, ss_vault, downloads |
| created_at | Auto-generated | UTC timestamp |
| updated_at | Auto-generated | Last modification time |
| tags | Agent or pipeline | Categorization tags |
| entities | MTAAA pipeline | Extracted named entities |
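As a rough illustration of how these fields might be assembled at ingestion time, the sketch below builds a provenance record from the two environment variables. The `Provenance` dataclass and `build_provenance` helper are hypothetical names, not part of the UB API, and the ID format (`ub-` plus 12 hex characters) is only inferred from the example ID in the table.

```python
import os
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Provenance:
    """Hypothetical container mirroring the provenance fields table."""
    entry_id: str
    source_ship: str
    agent_id: str
    source: str
    created_at: datetime
    updated_at: datetime
    tags: List[str] = field(default_factory=list)
    entities: List[str] = field(default_factory=list)


def build_provenance(source: str, tags: Optional[List[str]] = None) -> Provenance:
    """Assemble a record; the ID format is an assumption based on the example ID."""
    now = datetime.now(timezone.utc)  # timezone-aware UTC timestamp
    return Provenance(
        entry_id="ub-" + secrets.token_hex(6),  # 6 random bytes = 12 hex characters
        source_ship=os.environ.get("SP_SHIP_ID", "unknown"),
        agent_id=os.environ.get("SP_AGENT_ID", "unknown"),
        source=source,
        created_at=now,
        updated_at=now,
        tags=list(tags or []),
    )
```

Falling back to `"unknown"` when an environment variable is unset is a choice made for this sketch; a real pipeline might prefer to fail loudly instead.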
Tag system
Tags follow a controlled vocabulary: lowercase, hyphenated, at most 8 per entry.
Mandatory tags by content type
| Content type | Required tags |
|---|---|
| Research/Intel | research, [domain], [YYYY-MM] |
| Decision Record | decision, [project], captain-approved |
| Incident/RCA | incident, rca, P0-P3 |
| Spec/Design | spec, [project], [version] |
| Session Record | session, [ship] |
| Session Handoff | session-handoff, [ship] |
| Correction | correction, [topic] |
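The fixed part of each row in the table above can be checked mechanically. The following is a minimal sketch: the content-type keys and the `missing_required_tags` helper are hypothetical, and bracketed placeholders such as [domain] or [project] are deliberately left unchecked because their allowed values are not listed here.

```python
from typing import Dict, List, Set

# Fixed tags per content type, copied from the table above.
# The dictionary keys are illustrative names, not official identifiers.
REQUIRED_TAGS: Dict[str, Set[str]] = {
    "research": {"research"},
    "decision": {"decision", "captain-approved"},
    "incident": {"incident", "rca"},
    "spec": {"spec"},
    "session": {"session"},
    "session-handoff": {"session-handoff"},
    "correction": {"correction"},
}


def missing_required_tags(content_type: str, tags: List[str]) -> Set[str]:
    """Return the fixed required tags that are absent from `tags`."""
    required = REQUIRED_TAGS.get(content_type)
    if required is None:
        raise ValueError(f"unknown content type: {content_type!r}")
    return required - set(tags)
```

For example, `missing_required_tags("decision", ["decision", "cloud-ub"])` returns `{"captain-approved"}`.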
Tag format rules
| Rule | Example | Anti-example |
|---|---|---|
| Lowercase | cloud-ub | Cloud-UB |
| Hyphenated | engine-selection | engine_selection |
| No spaces | work-order | work order |
| Date format | 2026-03 | March 2026 |
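The format rules lend themselves to a small validator. The sketch below is one possible reading of them: `invalid_tags` and `check_tags` are hypothetical helpers, and the regular expression only enforces lowercase, hyphen-separated tokens. It does not verify that a date-like tag is a real YYYY-MM month, and it would reject uppercase priority tags such as P0, so adjust it if those are exempt from the lowercase rule.

```python
import re
from typing import List

MAX_TAGS = 8  # per-entry limit stated for the tag system
# One or more lowercase alphanumeric tokens joined by single hyphens.
TAG_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def invalid_tags(tags: List[str]) -> List[str]:
    """Return the tags that break the format rules (empty list means all pass)."""
    return [t for t in tags if not TAG_RE.fullmatch(t)]


def check_tags(tags: List[str]) -> List[str]:
    """Collect human-readable problems for a whole tag list."""
    problems = [f"bad format: {t!r}" for t in invalid_tags(tags)]
    if len(tags) > MAX_TAGS:
        problems.append(f"too many tags: {len(tags)} > {MAX_TAGS}")
    return problems
```

All four examples in the table pass `invalid_tags`, and all four anti-examples are reported.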
MTAAA 3D classification
Beyond tags, the MTAAA pipeline classifies entries along three dimensions:
| Dimension | What it captures | Example values |
|---|---|---|
| Topic | Subject matter | "AI Agents > Architecture", "Infrastructure > Cloud" |
| Type | Content type | "Specification", "Decision Record", "Research Note" |
| Lifecycle | Currency | "versioned", "persistent", "ephemeral" |
Classification uses a Controlled Vocabulary (CV): the LLM selects only from predefined categories and cannot emit freeform values.
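One way to enforce a closed vocabulary is to validate the model's output before storing it. The sketch below assumes the classification arrives as a dict with `topic`, `type`, and `lifecycle` keys, which is a guess at the shape rather than the documented MTAAA output. The vocabulary shown contains only the example values from the table, not the full CV.

```python
from typing import Dict, Set

# Illustrative vocabulary only: these are the example values from the table,
# not the complete Controlled Vocabulary.
CONTROLLED_VOCABULARY: Dict[str, Set[str]] = {
    "topic": {"AI Agents > Architecture", "Infrastructure > Cloud"},
    "type": {"Specification", "Decision Record", "Research Note"},
    "lifecycle": {"versioned", "persistent", "ephemeral"},
}


def validate_classification(classification: Dict[str, str]) -> Dict[str, str]:
    """Raise ValueError if any dimension is missing or outside the vocabulary."""
    for dimension, allowed in CONTROLLED_VOCABULARY.items():
        value = classification.get(dimension)
        if value not in allowed:
            raise ValueError(
                f"{dimension}={value!r} is not in the controlled vocabulary"
            )
    return classification
```

Rejecting the whole classification on the first bad dimension keeps freeform values out of the store; a pipeline could instead retry the LLM call or fall back to a default category.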
Searching with provenance
# Search returns entries with full metadata
results = search_brain("Cloud UB architecture")
# Each result includes:
# - entry_id, title, content (preview)
# - tags, source_ship, agent_id
# - created_at, relevance score
Filtering by provenance
# Browse by category
search_by_category(category="knowledge", subcategory="architecture")
# Get full entry with all metadata
get_entry(entry_id="ub-396f44b70763")
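The calls above browse by category or fetch a single entry. To narrow a result set by provenance fields such as source_ship or agent_id, one option is to filter on the client side. This sketch assumes each search result is a dict carrying the metadata keys listed under "Searching with provenance"; the actual return type of search_brain is not specified here, and `filter_by_provenance` is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, List


def filter_by_provenance(
    results: Iterable[Dict[str, Any]], **wanted: str
) -> List[Dict[str, Any]]:
    """Keep results whose metadata equals every requested field value."""
    return [r for r in results if all(r.get(k) == v for k, v in wanted.items())]


# Stand-in data shaped like the metadata described above (not real entries).
sample = [
    {"entry_id": "ub-000000000001", "source_ship": "SS1", "agent_id": "agent-a"},
    {"entry_id": "ub-000000000002", "source_ship": "SS2", "agent_id": "agent-a"},
]
from_ss1 = filter_by_provenance(sample, source_ship="SS1")
```

With real data, `sample` would be replaced by the list returned from a search call, provided its items expose the same keys.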
Quality checklist
Before calling ingest_fragment():
- Title: Descriptive, searchable, English
- Content: Self-contained (reader needs no other context)
- Tags: Follow controlled vocabulary above
- No duplicates: search_brain() first to check
- Source: Set correctly (manual, api, etc.)
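Parts of this checklist can be automated ahead of ingestion. The sketch below is a hypothetical `preflight` helper, not part of the UB tools: it takes the search function as a parameter (standing in for search_brain), assumes each hit is a dict with a `title` key, and treats an exact title match as a duplicate. Whether a title is descriptive or content is self-contained still needs human judgment.

```python
from typing import Callable, Dict, List, Sequence

# Source values listed in the provenance fields table.
ALLOWED_SOURCES = {"manual", "api", "nq_alpha", "ss_vault", "downloads"}


def preflight(
    title: str,
    content: str,
    tags: Sequence[str],
    source: str,
    search: Callable[[str], Sequence[Dict]],
) -> List[str]:
    """Return checklist violations; an empty list means ready to ingest."""
    problems = []
    if not title.strip():
        problems.append("title is empty")
    if not content.strip():
        problems.append("content is empty")
    if len(tags) > 8:
        problems.append("more than 8 tags")
    if source not in ALLOWED_SOURCES:
        problems.append(f"unknown source: {source!r}")
    # Duplicate check: exact title match among search hits (a crude heuristic).
    if any(hit.get("title") == title for hit in search(title)):
        problems.append("an entry with this title already exists")
    return problems
```

Passing the search function in keeps the helper testable without a live Universal Brain connection.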
Freshness tracking
| Tag pattern | Meaning |
|---|---|
| verified-2026-03 | Perishable knowledge verified this month |
| stale | Known to be outdated, needs re-verification |
| timeless | Framework/method knowledge, doesn't expire |
| ephemeral | Temporary, can be cleaned up |
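The tag patterns above can be turned into a simple status lookup. In this sketch the `freshness` helper and its status strings are hypothetical, and the precedence when an entry carries several freshness tags (stale first, then timeless, then ephemeral, then verified-YYYY-MM) is an assumption rather than a documented rule.

```python
import re
from datetime import date
from typing import List

VERIFIED_RE = re.compile(r"verified-(\d{4})-(\d{2})")


def freshness(tags: List[str], today: date) -> str:
    """Map an entry's freshness tags to a status string."""
    # Explicit markers win over verification dates (assumed precedence).
    for marker in ("stale", "timeless", "ephemeral"):
        if marker in tags:
            return marker
    for tag in tags:
        match = VERIFIED_RE.fullmatch(tag)
        if match:
            verified = (int(match.group(1)), int(match.group(2)))
            if verified == (today.year, today.month):
                return "verified-this-month"
            return "verified-other-month"
    return "untracked"
```

A periodic job could use the "verified-other-month" status to queue perishable entries for re-verification.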
Related pages
| Page | Relationship |
|---|---|
| UB Governance | Full governance rules |
| File Ingestion | MTAAA pipeline |
| Controlled Vocabulary | CV reference |
| Search Brain | Search details |