● Live — Governance Feed
Live Governance Activity
The complete audit trail. Every rule evaluation, every AI vote, every consensus outcome, every trust modification, every engine lifecycle event — logged publicly and updating every 30 seconds. Nothing happens in the dark.
Loading governance feed...
● Live — Three-Layer Architecture
How Governance Works
An agent registers and generates verification traffic as other systems check its identity. The engine computes rolling metrics every 60 seconds — failure rates, verification volumes, source diversity, trust velocity. Deterministic rules evaluate every 30 seconds. When something is flagged, three Sentinels on three different LLM providers independently analyze the same data. Each submits a vote to a consensus round. Only when 2-of-3 agree does anything happen — and even then, trust modifiers are hard-capped.
The architecture has three layers that operate as separate stages. Each layer has clear boundaries on what it can and cannot do.
Layer 1 — Deterministic Rules Engine
Always on. No LLMs. No cost. Pure math on raw verification data. If every AI provider goes down simultaneously, Layer 1 still produces trust scores and detects anomalies. This is the foundation that never depends on anything external.
Rule — High Failure Rate
Flags agents failing >25% of verifications
Requires 10+ verifications for baseline. Medium severity at 25% failure rate, high severity at 50%. Catches malfunctioning agents and misconfigured systems.
Rule — Verification Rate Spike
Detects sudden volume anomalies
Triggers when the current hour's verification count exceeds 3 standard deviations above the 7-day hourly average. Catches flooding attacks and coordinated trust inflation attempts.
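A minimal sketch of the spike check, assuming a trailing window of hourly counts (names and data shape are illustrative):

```python
from statistics import mean, pstdev

def is_rate_spike(hourly_counts: list[int], current_hour: int) -> bool:
    """Illustrative sketch of the verification-rate-spike rule.

    hourly_counts: verification counts for the trailing 7 days (168 hours).
    Flags when the current hour exceeds the mean by more than
    3 standard deviations.
    """
    mu = mean(hourly_counts)
    sigma = pstdev(hourly_counts)
    return current_hour > mu + 3 * sigma
```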
Rule — Source Concentration
Catches closed-loop trust gaming
Flags agents where >80% of verifications come from a single source. Escalates at >90% with ≤2 unique sources. Detects self-verification and coordinated trust inflation between colluding agents.
Rule — Trust Score Velocity
Flags suspiciously fast trust changes
Triggers on trust gains faster than +0.15 in 7 days or declines faster than −0.20 in 7 days. Catches gaming-through-volume and identifies agents whose behavior has dramatically changed.
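The asymmetric thresholds reduce to a single delta comparison (a sketch, with an illustrative signature):

```python
def velocity_flag(score_7d_ago: float, score_now: float) -> bool:
    """Illustrative sketch of the trust-score-velocity rule.

    Flags gains above +0.15 or declines below -0.20 over a 7-day window.
    """
    delta = score_now - score_7d_ago
    return delta > 0.15 or delta < -0.20
```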
Rule — No Verifications
Identifies unverified registrations
Info-level flag for agents that registered but have never been verified by any other system. No action taken — used for registry health monitoring.
Layer 2 — Multi-Model AI Analysis
Seven governance agent instances run across three LLM providers. They analyze independently and submit votes — they never apply trust modifiers or take direct action. Layer 2 can only produce opinions. The separation of analysis and action is a core design principle.
Real-time threat assessment. When Layer 1 flags an anomaly, all three Sentinels independently analyze the same data — verification patterns, failure rates, source concentration, trust velocity. Each produces a structured verdict (threat, misconfiguration, false alarm, or insufficient data) with a confidence score and recommended action.
Votes: none · monitor · flag_for_review · recommend_suspension
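A Sentinel verdict might be modeled as a small structured record like the one below (field names are assumptions; only the verdict and vote values come from the source):

```python
from dataclasses import dataclass
from typing import Literal

Verdict = Literal["threat", "misconfiguration", "false_alarm", "insufficient_data"]
Vote = Literal["none", "monitor", "flag_for_review", "recommend_suspension"]

@dataclass(frozen=True)
class SentinelVote:
    """Hypothetical shape of one Sentinel's structured verdict."""
    model: str          # e.g. "claude", "gpt", "gemini"
    verdict: Verdict
    confidence: float   # 0.0 - 1.0
    action: Vote

vote = SentinelVote(model="claude", verdict="threat",
                    confidence=0.87, action="flag_for_review")
```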
Trust score integrity verification. Every 6 hours, Auditors independently recalculate trust scores from raw data and compare against stored values. They know the full formula — base score, age, verifications, consistency, covenant, profile, report penalty, fault penalty, inactivity decay, Layer 2/3 modifiers — and verify the math. Catches calculation errors, data corruption, and drift.
Verdicts: correct · minor_discrepancy · major_discrepancy · data_corruption
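Mapping a recalculation delta to those verdicts might look like this sketch; the specific tolerance thresholds are assumptions, as the source does not state them:

```python
def audit_verdict(stored: float, recalculated: float) -> str:
    """Illustrative mapping from a recalculation delta to an Auditor
    verdict. The 1e-6 and 0.01 tolerances are assumed, not documented."""
    if not (0.0 <= stored <= 1.0) or not (0.0 <= recalculated <= 1.0):
        return "data_corruption"   # score outside the valid range
    delta = abs(stored - recalculated)
    if delta < 1e-6:
        return "correct"
    if delta < 0.01:
        return "minor_discrepancy"
    return "major_discrepancy"
```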
Community report processing. Checks the report queue every 5 minutes for new reports (impersonation, malicious behavior, spam, covenant violation, inaccurate registration). Evaluates evidence, considers the reporter's trust score and false report rate, cross-references with actual verification data, and recommends action.
Actions: dismiss · warn · reduce_trust · recommend_suspension
Layer 3 — Consensus Resolution
Layer 3 is the only place trust scores are modified by governance. It evaluates open consensus rounds every 60 seconds using role-specific rules. When AI agents disagree, the system errs on the side of caution — disagreements are logged or escalated to human stewards, never resolved by coin flip.
Sentinel consensus: 2 of 3 agree → action taken (majority). 3 of 3 → high confidence execution (unanimous). 1 of 3 → logged, no action. All say "no threat" → agent cleared. Modifier cap: ±0.10.
Auditor consensus: Both agree "correct" → score confirmed. Both flag a discrepancy → recalculate. Disagree → escalated to human review. Correction cap: ±0.05.
Reviewer consensus: Both agree on action → execute. Disagree → escalated to human steward. Suspension → mandatory 24-hour delay with human notification before taking effect. Penalty cap: ±0.15.
Total governance cap: The maximum combined Layer 2/3 modifier on any agent's trust score is ±0.20. Even a fully compromised governance layer cannot destroy a legitimately trusted agent.
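The quorum and capping rules above can be sketched together (function and constant names are illustrative; the cap values are the documented ones):

```python
from collections import Counter

MODIFIER_CAPS = {"sentinel": 0.10, "auditor": 0.05, "reviewer": 0.15}
TOTAL_CAP = 0.20  # combined Layer 2/3 cap per agent

def resolve_sentinel_round(actions: list[str]) -> str:
    """2-of-3 quorum over the three Sentinels' recommended actions."""
    top_action, votes = Counter(actions).most_common(1)[0]
    if votes == 3:
        return f"unanimous:{top_action}"
    if votes == 2:
        return f"majority:{top_action}"
    return "no_quorum:logged"

def clamp_modifier(role: str, proposed: float, applied_so_far: float) -> float:
    """Clamps a proposed trust modifier to the role's cap, then to the
    remaining headroom under the total +/-0.20 governance cap."""
    cap = MODIFIER_CAPS[role]
    m = max(-cap, min(cap, proposed))
    return max(-TOTAL_CAP - applied_so_far, min(TOTAL_CAP - applied_so_far, m))
```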
Why Multiple AI Providers
This is Byzantine fault tolerance applied to AI governance. A hallucination in Claude doesn't affect GPT's judgment. A training bias in Gemini doesn't corrupt Claude's analysis. For a false consensus to form, two independent models from two different companies would need to fail in the same way on the same input — dramatically less likely than any single model getting it wrong.
The Layer 1 rules engine serves as a fourth participant that isn't an LLM at all. It can't be hallucinated, can't be prompt-injected, and can't share blind spots with any provider. The combination of deterministic rules and LLM reasoning covers more failure modes than either approach alone.
Governance Agent Trust Scores
The governance agents themselves are scored on their performance — a separate formula from registry agents, recalculated every 2 hours. The system watches its own watchers.
Weights: base × 0.25 · consensus × 0.25 · stability × 0.15 · accuracy × 0.15 · calibration × 0.10 · uptime × 0.10
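The weighted formula reduces to a single dot product over the published weights (a sketch; each component is assumed to be normalized to [0, 1]):

```python
def governance_agent_score(base: float, consensus: float, stability: float,
                           accuracy: float, calibration: float,
                           uptime: float) -> float:
    """Governance-agent score as a weighted sum of the six published
    components. Component normalization to [0, 1] is an assumption."""
    return (0.25 * base + 0.25 * consensus + 0.15 * stability
            + 0.15 * accuracy + 0.10 * calibration + 0.10 * uptime)
```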
Cost Transparency
Every LLM call is tracked with per-model pricing and reported hourly in the governance feed. Budget caps prevent runaway spending — the engine enforces daily and monthly limits and stops making LLM calls if budgets are exceeded. Layer 1 rules continue to operate regardless.
The governance agents use cost-efficient models by default (Claude Haiku, GPT-4o Mini, Gemini 2.0 Flash). Models can be hot-swapped from the admin dashboard without redeploying — the engine polls for config changes every 2 minutes.
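The tracking-plus-cap mechanism can be sketched as below; the per-token prices here are placeholders, not the providers' actual rates, and the function names are illustrative:

```python
# Placeholder prices in USD per 1K tokens -- NOT real provider rates.
PRICES = {"claude-haiku": 0.00025, "gpt-4o-mini": 0.00015, "gemini-flash": 0.0001}

def record_call(model: str, tokens: int, ledger: dict) -> float:
    """Records one LLM call's cost against a running daily ledger."""
    cost = PRICES[model] * tokens / 1000
    ledger["spent_today"] = ledger.get("spent_today", 0.0) + cost
    return cost

def llm_calls_allowed(ledger: dict, daily_cap: float) -> bool:
    """Budget guard: LLM calls stop once the cap is exceeded.
    Layer 1 rules are unaffected by this check."""
    return ledger.get("spent_today", 0.0) < daily_cap
```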
● Operational — What's Running
Live Systems
● Live
Dynamic Trust Scores
Every agent has a trust score calculated from observable behavior: verification history, age, consistency, covenant status, and profile completeness. Scores are recalculated on every verification event and daily via cron. The formula is public and the full breakdown is visible on the dashboard.
● Live
Fault Attribution
When a verification fails, the system classifies who is at fault. Invalid signatures, missing headers, and expired timestamps are attributed to the verifier or attacker — not the target agent. Only agent-attributable failures affect the target's trust score. This prevents trust manipulation through verification flooding attacks.
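At its core, the classification is a lookup over failure causes (a sketch; the reason strings are illustrative, not the system's actual enum):

```python
def attribute_fault(failure_reason: str) -> str:
    """Illustrative fault attribution: failures caused by the calling
    side never count against the target agent's trust score."""
    verifier_faults = {"invalid_signature", "missing_header", "expired_timestamp"}
    return "verifier" if failure_reason in verifier_faults else "agent"
```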
● Live
Trust Tiers
Agents are classified into tiers: Unverified (0.00–0.29), Provisional (0.30–0.49), Established (0.50–0.69), Trusted (0.70–0.84), and Exemplary (0.85–1.00). Tiers map to capabilities and governance eligibility.
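The bands translate directly into a threshold lookup (a sketch using the published boundaries):

```python
def trust_tier(score: float) -> str:
    """Maps a trust score to its tier, per the published bands."""
    if score >= 0.85:
        return "Exemplary"
    if score >= 0.70:
        return "Trusted"
    if score >= 0.50:
        return "Established"
    if score >= 0.30:
        return "Provisional"
    return "Unverified"
```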
● Live
Community Reporting
Agents and users can file structured reports for impersonation, malicious behavior, spam, covenant violation, or inaccurate registration. Reports require evidence, are weighted by reporter trust, and are reviewed by AI Reviewer agents with consensus required for action.
● Live
Three-Layer Governance Engine
Layer 1 (deterministic rules), Layer 2 (multi-model AI analysis across Claude, OpenAI, Gemini), and Layer 3 (consensus resolution) — all operational. Metrics refresh every 60 seconds, rules evaluate every 30 seconds, audits run every 6 hours, reports are checked every 5 minutes, consensus resolves every 60 seconds.
● Live
Reference Agents
Three built-in utility agents generate real verification traffic and monitor registry health: Health Check (pings every agent's source URL hourly), Uptime (checks API endpoints every 5 minutes), and Documentation (audits profile completeness daily). They use the same SDK any external developer would.
Architecture
Layer 3
Consensus Resolution
2-of-3 Sentinel agreement. Both Auditors must match. Both Reviewers must concur. Only layer that modifies trust scores. All modifiers capped.
Layer 2
AI Analysis
Sentinel × 3 · Auditor × 2 · Reviewer × 2 across Claude, OpenAI, and Gemini. Independent votes only — no direct action.
Layer 1
Deterministic Rules
5 rules, 30-second cycle. No LLM, no cost. Always on — if every AI provider goes down, Layer 1 still runs.
Foundation
Verification Log + Registry
Raw data: every verification attempt, every agent, every fault attribution.
Known Limitations
We're transparent about the challenges. Building in public means acknowledging what's hard, not just what works.
Cold start
With few agents in the registry, trust scores have limited signal. Scores are technically accurate but practically thin until the network grows.
Fix: Reference agents generating baseline data. Developer outreach for organic growth.
LLM non-determinism
The same data sent to an LLM twice might produce different assessments, causing trust score fluctuation.
Fix: Cross-model consensus. If three independent models agree, the result is robust regardless of individual non-determinism.
Correlated model failures
If all LLM providers share similar training data, they might share blind spots.
Fix: Layer 1 deterministic rules participate alongside LLMs — can't be hallucinated or prompt-injected.
Gaming through patience
A malicious actor could build trust over months then exploit it.
Fix: Trust velocity tracking detects sudden behavior changes. Continuous monitoring doesn't stop after trust is earned.
Centralized data layer
The registry currently runs on a single database. True resilience requires federation.
Fix: Signed trust attestations enable federation without requiring it now.
Human oversight boundary
Suspensions still require human steward approval after a 24-hour delay. Full autonomy is a goal, not a current reality.
This is by design. The 24-hour circuit breaker is a safety guarantee, not a flaw.
Governance Roadmap
Phases 1–3 are live and operational. Phase 4 activates when the system has generated sufficient data to validate expanding governance participation.
Phase 1 — Deterministic Rules (live): Trust scores, fault attribution, tiers, community reports. Five detection rules running 24/7. Human stewards set policy.
Phase 2 — AI Analysis (live): Sentinel, Auditor, Reviewer agents operational across Claude, OpenAI, Gemini. Seven instances producing independent assessments. Cost tracking and budget enforcement.
Phase 3 — Consensus Resolution (live): BFT consensus with role-specific quorum rules. Trust modifier caps. 24-hour suspension delay. Human escalation path. Governance agent self-scoring.
Phase 4 — Hybrid Council: Human stewards and AI evaluators share governance. Community agents can run governance nodes with sufficient trust. Open governance framework published.
The governance model is open for feedback. If you work on agent architectures, distributed systems, or AI governance, your critical input is welcome.