Sales embeds brand voice in a system prompt. Support copies compliance rules from a Notion doc. Marketing uses its own tone guidelines. When legal updates the data handling policy, nothing propagates.


TL;DR

  • Governance fragmentation is the second structural challenge in multi-agent enterprises: every team maintains its own version of the rules, in its own tools, with no synchronization mechanism.
  • The symptoms are subtle — contradictory pricing, inconsistent tone, stale compliance language — and they compound across every agent interaction.
  • Governance routing with 92% precision delivers the right organizational context to the right agent for the right task, using a fast path (~850ms) that requires no LLM calls.
  • Progressive context delivery tracks what's already been delivered across multi-step workflows, cutting token usage by 50%.
  • Well-authored governance variables are 20–50 percentage points more discoverable than poorly-authored ones — authoring quality is an operational concern, not a cosmetic one.

I wrote previously about memory silos — the problem where agents learn things but don't share them across workflows. That's the first structural challenge when organizations scale from one agent to many.

This post is about the second challenge, and in some ways it's more dangerous: governance fragmentation. Not what agents know, but what rules they follow.

The Default State

14 agent configs across 3 teams — each with its own rule source, each stale, legal updates that don't propagate, no synchronization mechanism

Here's what governance looks like in most organizations running AI agents today:

The sales team builds an outbound agent. The system prompt includes a paragraph about brand voice, copied from the marketing team's style guide six months ago. It includes pricing guidelines — "never discount more than 15% without VP approval" — that the VP revised last quarter, but the update never reached the system prompt.

The support team runs a customer-facing bot. Compliance rules were pasted from a Notion doc. The doc has since been updated twice. The bot's version is two revisions behind. It still references a data retention policy that legal deprecated in January.

Marketing has a content generation workflow. It uses its own tone guidelines, different from the ones embedded in the sales agent. Not contradictory exactly — just inconsistent. The brand sounds slightly different depending on which team's agent the customer encounters.

When legal updates the data handling policy, there's no mechanism to propagate it to the 14 agent configurations across three teams. Someone has to manually find every system prompt, every Notion doc, every workflow configuration, and update each one. In practice, this doesn't happen. The update reaches the agents legal knows about and misses the ones they don't.

There is no versioning, no single source of truth, and no way to ensure all agents operate under the same organizational rules.

Why This Is Harder Than It Looks

The instinct is to centralize: put all policies in one place and have every agent read from it. Problem solved.

But "one place" doesn't solve the routing problem. A pricing policy, a brand voice guide, a compliance requirement, and a product FAQ all live in the same store. When the outbound agent needs to write a cold email, which ones should it get?

We tried RAG for this. Semantic similarity returned email server configuration docs when the query was "write a cold outbound email." A pricing policy competed with a product FAQ that happened to mention "pricing." RAG finds what's similar. Governance needs what's authoritative and applicable.

The other problem with naive centralization: context volume. If you dump every organizational policy into every agent's context window, you hit two walls. First, token costs scale linearly with the number of policies. Second, and worse, the model's attention dilutes across the volume. Liu et al. showed that models under-use information in long-context middles. More governance context can actually decrease compliance because the critical policy gets buried in the noise.

How Governance Routing Works

Governance routing architecture — variables enriched with HyPE, scope inference, and content embedding, then routed via fast mode (~850ms) or full mode (~2-5s), with progressive session-aware delivery

The architecture we built separates governance into three concerns: how policies are stored, how they're routed to agents, and how they're delivered across multi-step workflows.

Storage: Governance Variables

Organizational context is stored as governance variables — structured objects with name, description, tags, content, heading hierarchy, and content-aware embeddings. These aren't documents in a vector store. They're enriched, metadata-rich objects with three things computed at creation time:

Hypothetical Prompt Enrichment (HyPE) generates synthetic queries representing how an agent might need this variable. A "Brand Voice Guidelines" variable gets queries like "how should I write a cold email?" and "what tone should support responses use?" These are embedded alongside the variable, so routing can match on intent, not surface-level keyword overlap.

Governance scope inference determines whether the variable is always-on (like a compliance policy that applies to every interaction) or conditionally relevant (like a product-specific pricing sheet). Always-on variables get a routing boost so they're never missed.

A content-aware embedding is computed from metadata and a content preview, providing a fallback similarity signal when neither HyPE queries nor keywords match.
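To make the shape concrete, here's a minimal sketch of what a governance variable could look like as a data structure. The field names, the `GovernanceScope` enum, and the example values are illustrative assumptions, not the actual schema:

```python
from dataclasses import dataclass, field
from enum import Enum

class GovernanceScope(Enum):
    ALWAYS_ON = "always_on"      # applies to every interaction, e.g. compliance
    CONDITIONAL = "conditional"  # relevant only when the task matches

@dataclass
class GovernanceVariable:
    name: str                          # e.g. "Brand Voice Guidelines"
    description: str
    tags: list[str]
    content: str
    heading_hierarchy: list[str]       # section path within a longer policy
    scope: GovernanceScope
    # Computed once, at creation time:
    hype_queries: list[str] = field(default_factory=list)          # HyPE synthetic queries
    hype_embeddings: list[list[float]] = field(default_factory=list)
    content_embedding: list[float] = field(default_factory=list)

brand_voice = GovernanceVariable(
    name="Brand Voice Guidelines",
    description="Tone and style rules for all customer-facing copy",
    tags=["brand", "tone", "marketing"],
    content="Write in a warm, direct voice...",
    heading_hierarchy=["Marketing", "Brand"],
    scope=GovernanceScope.CONDITIONAL,
    hype_queries=[
        "how should I write a cold email?",
        "what tone should support responses use?",
    ],
)
```

The point of the enrichment fields is that everything expensive (synthetic queries, embeddings) happens at authoring time, so routing can stay cheap at request time.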

Routing: Two Tiers

Fast mode (~850ms average) makes zero LLM calls. Each candidate variable is scored using a weighted composite of embedding similarity and keyword overlap against variable metadata and HyPE queries, plus a governance scope boost for always-on variables. Results are partitioned into critical and supplementary sets.
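A fast-mode scorer along these lines can be sketched in a few lines. The weights, the boost value, and the partition thresholds below are invented for illustration; the real system's tuning is not shown here:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def keyword_overlap(query_terms, var_terms):
    q, v = set(query_terms), set(var_terms)
    return len(q & v) / len(q) if q else 0.0

def fast_mode_score(query_emb, query_terms, var,
                    w_embed=0.6, w_kw=0.4, always_on_boost=0.15):
    # Best embedding match across the variable's HyPE queries and its
    # content embedding -- intent match beats surface similarity.
    candidates = var["hype_embeddings"] + [var["content_embedding"]]
    embed_sim = max((cosine(query_emb, e) for e in candidates), default=0.0)
    kw = keyword_overlap(query_terms, var["terms"])  # metadata + HyPE keywords
    score = w_embed * embed_sim + w_kw * kw
    if var["always_on"]:
        score += always_on_boost  # always-on policies are never missed
    return score

def partition(scored, critical_threshold=0.6, supp_threshold=0.35):
    # Split scored (variable, score) pairs into critical / supplementary sets.
    critical = [v for v, s in scored if s >= critical_threshold]
    supplementary = [v for v, s in scored
                     if supp_threshold <= s < critical_threshold]
    return critical, supplementary
```

Nothing here requires an LLM call, which is what keeps the fast path under a second.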

Full mode (~2–5s) adds a two-stage LLM pipeline for high-stakes decisions: embedding pre-filter reduces candidates, then structured LLM analysis classifies context as critical or supplementary with section-level extraction capability. This means the system can pull the relevant section from a long policy document rather than injecting the entire thing.
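The two stages compose roughly like this. `embed_score` and `llm_classify` are assumed callables standing in for the real embedding scorer and the structured LLM call; the verdict dict shape is an assumption as well:

```python
def full_mode_route(variables, embed_score, llm_classify, prefilter_k=8):
    """Two-stage routing sketch: cheap embedding pre-filter, then
    structured LLM classification with section-level extraction."""
    # Stage 1: embedding pre-filter keeps the LLM cost bounded.
    candidates = sorted(variables, key=embed_score, reverse=True)[:prefilter_k]

    # Stage 2: the LLM classifies each candidate and extracts only the
    # relevant section, so a long policy isn't injected wholesale.
    critical, supplementary = [], []
    for var in candidates:
        verdict = llm_classify(var)  # e.g. {"class": "critical", "section": "..."}
        bucket = {"critical": critical,
                  "supplementary": supplementary}.get(verdict["class"])
        if bucket is not None:
            bucket.append({"name": var["name"], "content": verdict["section"]})
    return critical, supplementary
```

The design choice worth noting: the expensive stage only ever sees `prefilter_k` candidates, so latency stays bounded as the number of governance variables grows.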

Against 25 governance variables across 5 categories and 20 task types, routing achieves 92% precision and 88% recall. Nearly all selected variables are relevant, and most relevant variables are selected.

Delivery: Progressive Context

This is the piece that matters most for multi-step agents — and it's the one most implementations miss.

An agent operating in an autonomous loop (plan → act → observe → re-plan → act) may invoke governance routing at every step. Without session awareness, the same compliance policy is re-injected into every step. Five steps, same policy five times. The agent's context window fills with redundant governance context that should have been delivered once.

Progressive delivery maintains a session state tracking which variables (and which sections) have already been delivered. On each routing call, already-delivered variables are excluded. Only new or newly relevant content is resolved and injected.

Supplementary items are intentionally not recorded in the session state. This allows promotion: a guideline initially deemed peripheral can surface as critical when the agent's task evolves. If the agent shifts from "write an email" to "propose pricing," governance context about pricing should appear even if it was previously classified as supplementary.
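The session logic above fits in a small class. This is a sketch under the assumptions that delivered items are keyed by variable name plus section and that routing results arrive as dicts; the real state model may differ:

```python
class GovernanceSession:
    """Session-aware progressive delivery: critical deliveries are
    recorded and never re-injected; supplementary items stay untracked
    so they can be promoted to critical if the task evolves."""

    def __init__(self):
        self.delivered = set()  # (variable_name, section) pairs already injected

    def filter_new(self, critical, supplementary):
        new_critical = []
        for item in critical:
            key = (item["name"], item.get("section"))
            if key not in self.delivered:
                self.delivered.add(key)  # record the delivery
                new_critical.append(item)
        # Supplementary items pass through untracked: if a later step
        # classifies one as critical, it is still eligible for delivery.
        return new_critical, supplementary
```

A five-step agent loop then injects each policy once instead of five times, which is where the token reduction comes from.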

The result: 50% token reduction across multi-step workflows with no quality degradation.

The Authoring Problem No One Talks About

Discovery rate by category — well-authored variables (82–92%) vs. poorly-authored (0% in 3 of 5 categories), 20–50 percentage point gap

Here's the result that surprised us most. We tested governance routing against both well-authored and poorly-authored versions of the same variables.

Well-authored governance variables are 20–50 percentage points more discoverable than poorly-authored equivalents. In 3 of 5 categories (brand, product, support), poorly-authored variables scored 0% discovery rate.

Zero. The routing system literally couldn't find them.

This means governance quality isn't just a routing problem — it's an authoring problem. A brilliantly designed routing engine is useless if the governance variables are titled "Q3 Update" with no description, no tags, and content that assumes context the system doesn't have.

This is why we invested in AI-assisted authoring. Users describe what they need in natural language. The system generates structured governance variables with clear names, descriptions, tags, and content organization. Interactive enhancement lets operators refine variables based on observed issues. The authoring tools aren't nice-to-have features — they're operationally significant infrastructure.

What Governance Fragmentation Actually Costs

The individual symptoms look small:

  • A cold email offers a discount the company no longer honors
  • A support response uses legal language from a deprecated policy
  • A marketing agent and a sales agent describe the same product feature differently
  • A pricing conversation goes wrong because the agent doesn't know about the enterprise tier changes from last month

Each one is survivable. But they compound. And unlike memory silos — where the cost is missed opportunity — governance fragmentation creates active contradictions. The customer gets told two different things by two different agents from the same company. That's not a missed opportunity. That's a trust violation.

Singapore's Model AI Governance Framework for Agentic AI, the first dedicated framework for agentic systems, says it directly: organizations must implement "technical controls and processes" that bound agent behavior proactively. The policy must find the agent, not the other way around.

100% adversarial governance compliance across 50 scenarios designed to bypass constraints validates that the routing layer is robust. But the compliance number only matters if the governance content is accurate, current, and discoverable. That's an organizational discipline backed by tooling, not a feature you ship once.

The Test for Your Organization

Three questions to diagnose governance fragmentation:

1. Where do your agent rules live? If the answer is "system prompts" or "scattered across Notion docs and workflow configs," you have fragmentation. Governance variables need a single, versioned, team-owned source of truth.

2. When legal updates a policy, how many places need to change? If the answer is more than one, you don't have governance routing — you have a manual propagation process that will fail under pressure.

3. Do your agents get the right rules, or all the rules? If every agent gets every policy, you're paying in tokens and losing in attention. If agents get no policies unless they ask, you're relying on agents to know what they don't know. Governance routing gives agents the right rules for the right task, proactively.

The memory silo problem is about what agents know. The governance fragmentation problem is about what agents are allowed to do. Both are invisible until they start costing you. The difference is that governance failures are customer-facing — contradictory, non-compliant, off-brand — in ways that erode trust faster than stale context ever could.


Frequently Asked Questions

Can't I just use a long system prompt with all my rules? You can, but it doesn't scale. As rules multiply, the model's attention on any individual rule dilutes. You also can't version, update, or route them independently. A system prompt is a monolith. Governance variables are modular, routable, and independently maintainable.

How is this different from RAG over my policy documents? RAG optimizes for semantic similarity — what's most like the query. Governance routing optimizes for authority and applicability — what governs this task. When you query "write a cold email," RAG might return email server docs. Governance routing returns brand voice and compliance policies. The mechanisms are different because the concerns are different.

Who owns governance variables? Domain teams own their governance variables: marketing owns brand voice, legal owns compliance, finance owns pricing. The routing layer ensures every agent consumes the current version regardless of which team maintains it. Governance variable visibility and access levels (organization-wide, private, admins-only) are enforced at the API layer.

What happens when governance variables conflict with each other? The routing layer classifies results as critical or supplementary, giving agents a priority signal. When two governance variables genuinely conflict — e.g., a sales discount policy versus a margin protection policy — the system surfaces both as critical, and the conflict becomes explicit rather than hidden in separate system prompts.