Backup & Recovery
Benchmarked against: Anthropic Zero Data Retention / Data residency
Architecture: Cloud UB (D1) + R2 backup + local mirrors
Status: Partially implemented; backup strategy defined, automation in progress
Backup and recovery defines how SuperPortia protects its data and recovers from failures. The fleet's knowledge, work orders, and agent communications are critical assets that must survive hardware failures, service outages, and operational errors.
What needs backup
| Data | Criticality | Loss impact |
|---|---|---|
| UB entries | Critical | All institutional knowledge lost |
| Work orders | Critical | Task history and audit trail lost |
| WO transitions | High | Compliance audit trail lost |
| Agent messages | Medium | Communication history lost |
| Agent registry | Low | Ephemeral; rebuilt on next heartbeat |
| Source code | Critical | Mitigated: Git (GitHub) provides backup |
| CLAUDE.md + rules | Critical | Mitigated: Git (GitHub) provides backup |
| Vector embeddings | Low | Regenerable from entry content |
Backup strategy
Tier 1: Cloud UB (D1 → R2)
The primary backup path for all Cloud UB data:
| Aspect | Detail |
|---|---|
| Source | Cloud UB D1 (all tables) |
| Destination | Cloudflare R2 bucket |
| Format | JSON export per table |
| Frequency | Daily (planned; currently manual) |
| Retention | All backups kept (R2 storage is cheap) |
| Automation | Cron trigger → Worker endpoint → R2 write |
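The planned automation could be sketched as a Worker `scheduled` handler wired to a cron trigger. The binding names (`DB`, `BACKUPS`) and the table list below are illustrative assumptions, not the real configuration:

```typescript
// Sketch of the planned daily D1 -> R2 backup Worker (cron-triggered).
// Binding names (DB, BACKUPS) and the table list are assumptions.
const TABLES = ["ub_entries", "work_orders", "wo_transitions", "agent_messages"];

// Pure helper: deterministic R2 object key for one table's daily export.
export function backupKey(table: string, date: Date): string {
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD
  return `backups/${day}/${table}.json`;
}

// Minimal shapes of the D1 and R2 bindings this sketch relies on.
interface Env {
  DB: { prepare(sql: string): { all(): Promise<{ results: unknown[] }> } };
  BACKUPS: { put(key: string, value: string): Promise<unknown> };
}

export default {
  // Invoked by the cron trigger configured in wrangler.toml ([triggers] crons).
  async scheduled(_event: unknown, env: Env): Promise<void> {
    const now = new Date();
    for (const table of TABLES) {
      const { results } = await env.DB.prepare(`SELECT * FROM ${table}`).all();
      await env.BACKUPS.put(backupKey(table, now), JSON.stringify(results));
    }
  },
};
```

Keeping the key scheme date-prefixed means each day's export lands under its own `backups/YYYY-MM-DD/` prefix, which makes the "all backups kept" retention policy trivial to browse.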
Tier 2: Local mirror
Each ship maintains a local SQLite copy of UB data:
| Aspect | Detail |
|---|---|
| Source | Cloud UB D1 |
| Destination | Local SQLite on ship filesystem |
| Sync | Currently manual; future: automated periodic sync |
| Purpose | Fast local access + offline resilience |
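The future automated sync only needs to pull rows changed since the last sync. A minimal sketch of that staleness filter, assuming UB entries carry an `id` and an ISO 8601 `updated_at` column (an assumption about the schema):

```typescript
// Sketch of the planned periodic Cloud UB -> local sync.
// The entry shape (id, updated_at) is an assumption about the UB schema.
interface Entry {
  id: string;
  updated_at: string; // ISO 8601 timestamp
}

// Pure helper: given the cloud rows and the mirror's last sync time,
// return only the rows the local mirror is missing or holds stale copies of.
export function entriesToSync(cloud: Entry[], lastSyncedAt: string): Entry[] {
  const cutoff = Date.parse(lastSyncedAt);
  return cloud.filter((e) => Date.parse(e.updated_at) > cutoff);
}
```

The returned rows would then be upserted into the ship's local SQLite copy, and the sync timestamp advanced only after the write succeeds.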
Tier 3: Git repositories
Source code and configuration are backed up through standard Git workflow:
| Aspect | Detail |
|---|---|
| Source | Local filesystem |
| Destination | GitHub (private repositories) |
| Trigger | Per-commit push |
| Includes | Code, CLAUDE.md, rules, skills, docs, scripts |
| Excludes | UB data, secrets (.env), large binaries |
Recovery procedures
Scenario 1: Cloud UB D1 data loss
Severity: Critical. Recovery time: minutes to hours, depending on backup freshness.
1. Identify scope of loss (full DB vs specific tables)
2. Locate latest R2 backup
3. Download backup JSON files
4. Import into new D1 database (or restore to existing)
5. Rebuild Vectorize index from entry content
6. Verify with sre_status()
7. Notify all agents to reconnect
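Step 4 amounts to turning each JSON table export back into INSERT statements. A minimal sketch, assuming each backup file is an array of flat row objects (the function below is illustrative, not part of the real tooling):

```typescript
// Sketch of step 4: converting one JSON-exported row back into a
// parameterized D1 INSERT. Table and column names come from the backup itself.
type Row = Record<string, string | number | null>;

export function rowToInsert(
  table: string,
  row: Row,
): { sql: string; bindings: (string | number | null)[] } {
  const cols = Object.keys(row);
  const placeholders = cols.map(() => "?").join(", ");
  return {
    sql: `INSERT INTO ${table} (${cols.join(", ")}) VALUES (${placeholders})`,
    bindings: cols.map((c) => row[c]),
  };
}
```

In practice each pair would be executed via the D1 binding (`env.DB.prepare(sql).bind(...bindings).run()`) or batched through `wrangler d1 execute`; parameterized bindings avoid quoting bugs when entry content contains quotes or newlines.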
Scenario 2: Cloud UB Worker failure
Severity: High. Recovery time: minutes.
1. Check Cloudflare dashboard for Worker status
2. If code issue: redeploy from Git
- `npx wrangler deploy`
3. If Cloudflare outage: wait for resolution
4. Local UBI tools remain available for ship-local work
5. Monitor with cloud_ub_health()
Scenario 3: Ship hardware failure
Severity: High (for the affected ship). Recovery time: hours to days.
1. Other ships continue operating via Cloud UB
2. Replace/repair hardware
3. Clone Git repository to new machine
4. Install dependencies (Node.js, Python, etc.)
5. Configure MCP servers with ship identity
6. Run agent_heartbeat() to register
7. Local UBI rebuilds from Cloud UB sync
Scenario 4: Vectorize index corruption
Severity: Low. Recovery time: minutes.
1. Semantic search degrades to keyword-only
2. Rebuild index from D1 entry content
3. Re-embed all entries (batch process)
4. Verify with search_brain() test queries
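The batch re-embed in steps 2-3 is mostly a chunking loop. A sketch, where `embed()` and `index.upsert()` stand in for the real embedding model and Vectorize client (both hypothetical here); only the batching helper is concrete:

```typescript
// Sketch of steps 2-3: re-embedding all entries in fixed-size batches.
export function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Illustrative rebuild loop; embed() and index.upsert() are placeholders
// for the real embedding call and Vectorize client, not actual APIs.
//
// async function rebuildIndex(entries: { id: string; content: string }[]) {
//   for (const batch of chunk(entries, 100)) {
//     const vectors = await Promise.all(batch.map((e) => embed(e.content)));
//     await index.upsert(batch.map((e, i) => ({ id: e.id, values: vectors[i] })));
//   }
// }
```

Batching bounds memory use and keeps each upsert call well under typical request-size limits, which matters when the entry count grows.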
Scenario 5: Accidental data deletion
Severity: Varies. Recovery time: minutes, if caught quickly.
1. UB entries: check R2 backup for deleted entries
2. Work orders: check wo_transitions history
3. Messages: archived messages are soft-deleted, recoverable
4. Source code: git reflog / git checkout
5. If Captain approval was given for deletion: document as intentional
Disaster recovery matrix
| Scenario | RTO (Recovery Time) | RPO (Data Loss) | Automated? |
|---|---|---|---|
| D1 data loss | 1–2 hours | Up to 24 hours (last backup) | No (manual) |
| Worker failure | 5–15 minutes | Zero (stateless) | Partial (auto-restart) |
| Ship hardware failure | Hours to days | Zero (Cloud UB has the data) | No |
| Vectorize corruption | 30 minutes | Zero (regenerable) | No |
| Network outage | Wait for resolution | Zero | N/A |
| Accidental deletion | Minutes | Varies by backup frequency | No |
Current gaps
| Gap | Impact | Planned fix |
|---|---|---|
| No automated D1 → R2 backup | Up to 24h data loss risk | Cron-triggered Worker endpoint |
| No automated local sync | Local mirrors may be stale | Periodic Cloud UB → local sync |
| No backup verification | Backups might be corrupt | Post-backup integrity check |
| No point-in-time recovery | Can only restore to last backup | WAL-based incremental backup |
| No cross-region replication | Single Cloudflare region | Multi-region D1 (when available) |
These gaps are documented as inspection mirror items: known capabilities to build.
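The "No backup verification" gap could be closed with a post-backup integrity check that runs before a backup is trusted. A minimal sketch, assuming each export is an array of flat row objects; the per-table required keys are an assumption to adapt to the real schema:

```typescript
// Sketch of a post-backup integrity check (the "No backup verification" gap).
// requiredKeys per table is an assumption; adapt to the real schema.
export function verifyExport(
  rows: Record<string, unknown>[],
  expectedCount: number,
  requiredKeys: string[],
): string[] {
  const problems: string[] = [];
  if (rows.length !== expectedCount) {
    problems.push(`row count ${rows.length} != expected ${expectedCount}`);
  }
  rows.forEach((row, i) => {
    for (const key of requiredKeys) {
      if (!(key in row)) problems.push(`row ${i} missing key "${key}"`);
    }
  });
  return problems; // empty array means the export passed
}
```

The expected count would come from a `SELECT COUNT(*)` against the live table at export time; any non-empty result should fail the backup job loudly rather than silently archiving a corrupt export.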
Mutual rescue architecture
The dual-ship design provides inherent resilience:
Captain quote (2026-02-25): "If one goes down, the other can rescue."
Related pages
| Page | Relationship |
|---|---|
| Data Residency | Where data lives |
| Fleet Management | Fleet architecture |
| SRE Status | Health monitoring |
| Cloud UB MCP | Cloud UB details |
| Company Constitution | §1 Knowledge goes to UB only |