Zero Cross-Entity Leakage Across 3,800 Results

100 entities, overlapping names, same industry, similar roles — and zero actual memory bleed. Here's how entity isolation works when embeddings can't save you.

TL;DR

Under adversarial conditions (100 entities in the same industry, with similar roles, overlapping names, and similar deal sizes), entity-scoped retrieval produces zero true cross-entity leakage across 3,800 results.
The 104 flagged results (2.74% of the 3,800 results returned by 500 queries) were all false positives from shared name tokens, not actual memory bleed.
Isolation is enforced by CRM key pre-filtering at the storage layer, not by embedding distinctiveness. This is an architectural choice, not an optimization.
If you're relying on embedding distance to keep entities separate, you will fail at scale.

When you build a shared memory layer — one that any agent across an organization can write to and read from — you introduce a risk that per-agent memory systems never face: cross-entity contamination.

The enrichment agent processes content about Company A. The outbound agent queries memory about Company B. If Company A and Company B are both Series B SaaS companies in the same vertical, with CTOs who have similar titles and overlapping pain points, the question becomes: can the system reliably keep them apart?

This isn't theoretical. It's the first question enterprise security teams ask. And the answer depends entirely on how isolation is enforced.

The Embedding Problem

The intuitive approach to entity isolation is embedding distance. Each entity's memories have their own embeddings. When you query for Entity B, the embeddings for Entity A should be far enough away in vector space that they don't surface.

This fails. Here's why.

Two CTOs at Series B SaaS companies in the same vertical, both evaluating cloud migration, both concerned about API reliability, both with $500K budget authority — their memories will have nearly identical embeddings. The semantic content is the same. The entities are different. Embedding distance can't tell the difference.

It gets worse with scale. At 100 entities in the same industry segment, the embedding space for their memories is dense with near-neighbors. At 10,000, it's a minefield. A query about "cloud migration concerns" for Entity #4,217 will surface results from dozens of similar entities unless something other than embedding similarity is keeping them apart.
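A toy illustration of this failure mode, using hand-made three-dimensional vectors in place of real embeddings. The vectors, entity names, and the `cosine` and `global_top_k` helpers are all hypothetical; the point is only that a search ranked purely by similarity has no notion of which entity a memory belongs to.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical memories: (entity, text, toy embedding).
# entity_a and entity_b say nearly the same thing, so their vectors nearly coincide.
memories = [
    ("entity_a", "CTO worried about API reliability during cloud migration", [0.90, 0.40, 0.10]),
    ("entity_b", "CTO concerned about API reliability in cloud migration", [0.89, 0.41, 0.11]),
    ("entity_c", "Office lease renewal due in March", [0.05, 0.10, 0.99]),
]

def global_top_k(query_vec, k):
    """Rank every memory by similarity alone, with no entity filter."""
    ranked = sorted(memories, key=lambda m: cosine(query_vec, m[2]), reverse=True)
    return ranked[:k]

# A query that is meant to be about entity_b only.
query = [0.90, 0.40, 0.10]
top = global_top_k(query, k=2)
entities_returned = {entity for entity, _, _ in top}
print(entities_returned)
```

In this sketch the query intended for entity_b returns memories from both entity_a and entity_b, and the entity_a memory ranks first. No similarity threshold separates the two without also discarding the legitimate result.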

How We Tested This

We designed an adversarial evaluation specifically to stress entity isolation. The setup:

100 entities in the same industry
Similar roles across entities (CTOs, VPs of Engineering, Directors of Platform)
Overlapping names — entities sharing first or last name tokens
Similar deal sizes — removing another potential distinguishing signal
Unique markers embedded in each entity's memories — specific facts that exist only for that entity

Then we ran 500 queries across these entities (100 entities × 5 query types) and examined every one of the 3,800 returned results for cross-entity contamination.

The result: zero true cross-entity leakage.

We flagged 104 results (2.74%) for manual review. Every single one was a false positive — results that contained name tokens shared between entities (e.g., two contacts both named "Sarah"), not actual memory from a different entity bleeding into the results. The unique markers for each entity never appeared in another entity's query results.
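One way to picture the difference between a flag and a true leak is the sketch below. It is not the evaluation harness itself: the function name, marker strings, and name tokens are invented for illustration. A result counts as a leak only if it contains another entity's unique marker; a result that merely shares a name token with another entity is flagged for review.

```python
import re

def classify_result(text, queried_entity, unique_markers, name_tokens):
    """Return 'leak', 'flag', or 'clean' for one retrieved result."""
    lowered = text.lower()
    # True leak: a marker that exists only in another entity's memories.
    for entity, marker in unique_markers.items():
        if entity != queried_entity and marker.lower() in lowered:
            return "leak"
    # Flag: a name token that another entity also uses (possible false positive).
    words = set(re.findall(r"[a-z0-9-]+", lowered))
    for entity, tokens in name_tokens.items():
        if entity != queried_entity and words & tokens:
            return "flag"
    return "clean"

# Hypothetical test data: both entities have a contact named Sarah.
unique_markers = {"entity_a": "MARKER-A-7731", "entity_b": "MARKER-B-1187"}
name_tokens = {"entity_a": {"sarah", "chen"}, "entity_b": {"sarah", "patel"}}

results_for_a = [
    "Sarah Chen approved the pilot. MARKER-A-7731",
    "Budget review scheduled for next quarter.",
]
labels = [classify_result(r, "entity_a", unique_markers, name_tokens) for r in results_for_a]
print(labels)  # ['flag', 'clean']

# The reported flag rate: 104 flagged results out of 3,800 returned.
flag_rate = 104 / 3800
print(f"{flag_rate:.2%}")  # 2.74%
```

The first result is entity_a's own memory, but it is flagged because "Sarah" is also a name token for entity_b. That is the kind of false positive described above.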

Figure: Adversarial isolation test with 100 entities, 3,800 results, and zero true leakage.

CRM Key Pre-Filtering

The isolation mechanism is simple and deliberate: CRM key pre-filtering at the storage layer.

Every memory entry carries an entity scope — a recordId tied to CRM keys (email, website URL, phone number, custom identifiers). Every retrieval operation is partitioned first by organization ID, then filtered by entity scope, before any embedding similarity search runs.

Query flow:
1. Organization partition (hard tenant isolation)
2. Entity scope filter (CRM key match)
3. Vector similarity search (within scoped results only)
4. Post-filtering by metadata
5. Optional reflection loop

Figure: Three-layer entity isolation model (org partition, entity scope, content redaction).

The vector search at step 3 never sees memories from other entities. It's not that those memories are ranked lower — they're excluded from the candidate set entirely. The embedding search operates within a pre-filtered subset, not the full store.
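The first three steps of the query flow can be sketched as follows. This is a minimal in-memory illustration under assumed names (`Memory`, `retrieve`, and the field names are hypothetical), not the production storage layer. Steps 4 and 5 are omitted.

```python
import math
from dataclasses import dataclass

@dataclass
class Memory:
    org_id: str
    record_id: str
    text: str
    embedding: list

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(store, org_id, record_id, query_vec, k=5):
    # Step 1: organization partition (hard tenant isolation).
    in_org = [m for m in store if m.org_id == org_id]
    # Step 2: entity scope filter (exact match on the entity's record).
    scoped = [m for m in in_org if m.record_id == record_id]
    # Step 3: similarity search over the scoped candidates only.
    ranked = sorted(scoped, key=lambda m: cosine(query_vec, m.embedding), reverse=True)
    return ranked[:k]

# Three memories with identical embeddings: similarity cannot separate them.
store = [
    Memory("org_1", "rec_a", "A: cloud migration concerns", [1.0, 0.0]),
    Memory("org_1", "rec_b", "B: cloud migration concerns", [1.0, 0.0]),
    Memory("org_2", "rec_b", "other org, same record id", [1.0, 0.0]),
]

results = retrieve(store, "org_1", "rec_b", [1.0, 0.0])
print([m.text for m in results])  # ['B: cloud migration concerns']
```

Even though all three memories have the same embedding, only the one in the requested organization and entity scope reaches the ranking step.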

This is an architectural decision, not an optimization. Isolation must be enforced by storage-level scoping, not by embedding distinctiveness. Embeddings are probabilistic. Scoping is deterministic. For multi-tenant security, probabilistic isn't good enough.

Why This Matters Beyond Security

Entity isolation isn't just a security concern. It's a data quality concern.

When memories leak between entities, the contamination is subtle. The agent doesn't produce obviously wrong outputs — it produces outputs that are slightly off. A pain point from Company A colors the analysis of Company B. A budget figure from one deal influences the approach to another. The agent sounds confident. The information is plausible. But it's wrong in ways that are hard to detect.

This is the worst kind of bug: one that looks like correct behavior. The sales rep reads the agent's analysis, doesn't notice the contamination, and acts on subtly wrong intelligence. No error message. No crash. Just a slow erosion of trust when predictions don't match reality and no one can figure out why.

At scale, with thousands of entities across hundreds of agent workflows, even a 1% leakage rate adds up: at a few thousand retrievals per day, that is dozens of contaminated interactions daily. Each one is a small trust violation. They compound.

The Multi-Layer Isolation Model

Entity isolation in Governed Memory operates at three levels:

Organization partition. All operations are partitioned by orgId at the storage layer. This is hard tenant isolation — Organization A cannot access Organization B's memories under any query condition. This is the outermost boundary and is non-negotiable.

Entity scope. Within an organization, memories are scoped to entities using CRM keys. A query for contact jane@acme.com only searches memories linked to that contact's record. Entity types are open-ended — contacts, companies, deals, vendors, partners — and all share the same isolation mechanism.

Content redaction. A two-phase redaction pipeline scrubs PII and secrets before and after LLM extraction. Phase 1 replaces sensitive patterns with typed placeholders before the LLM sees the content. Phase 2 scans extracted values, catching cases where the LLM reconstructs PII-like patterns from contextual cues. Detection covers four tiers: secrets (API keys, private keys), financial PII (credit cards, IBAN), identity PII (SSNs), and contact PII (emails, phones, IPs).

The layers are independent. Organization partition prevents cross-tenant access. Entity scoping prevents cross-entity contamination within a tenant. Redaction prevents sensitive data from persisting in extracted memories. A failure in one layer doesn't compromise the others.
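The two-phase redaction described above can be sketched with a pair of regular expressions. This is a simplified illustration: the patterns cover only emails and US-style SSNs, the `extract` argument is a stand-in for the LLM extraction step, and all names here are hypothetical.

```python
import re

# Deliberately simple patterns; a real detector would cover far more cases.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text):
    """Replace each sensitive match with a typed placeholder such as [EMAIL]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

def two_phase(raw, extract):
    safe_input = redact(raw)         # Phase 1: scrub before the model sees it.
    extracted = extract(safe_input)  # Stand-in for LLM extraction.
    # Phase 2: scrub extracted values in case the model emitted PII-like text.
    return {key: redact(value) for key, value in extracted.items()}

seen_by_model = []

def fake_extract(text):
    """Pretend extractor that also emits an email-like string on its own."""
    seen_by_model.append(text)
    return {"summary": text, "follow_up": "email jane.doe@acme.com"}

result = two_phase("Reach Jane at jane@acme.com, SSN 123-45-6789.", fake_extract)
print(result["summary"])    # Reach Jane at [EMAIL], SSN [SSN].
print(result["follow_up"])  # email [EMAIL]
```

The stand-in extractor never receives the original email or SSN, and the email-like string it produces is caught by the second pass.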

What to Ask Your Memory Provider

If you're evaluating a shared memory system for multi-tenant or multi-entity use, these are the questions that matter:

How is entity isolation enforced? If the answer involves embedding distance or similarity thresholds, that's a red flag. Isolation should be enforced by storage-level scoping — deterministic, not probabilistic.

What happens with adversarial similarity? Test with entities that share industry, role, name tokens, and deal characteristics. If the system relies on content distinctiveness for isolation, it will fail exactly when it matters most — when entities look similar.

Is isolation testable? Can you run a query for Entity A and verify that zero results from Entity B appear? Not "results from Entity B are ranked low" — zero results. If you can't test for this, you can't guarantee it.

Does redaction run before or after extraction? Post-extraction-only redaction means the LLM saw the raw PII during processing. With a two-phase approach (pre and post extraction), the model processes typed placeholders rather than the original sensitive values, and the second pass catches anything PII-like that it reconstructs.

The answer to "is our memory system secure enough for production?" isn't about encryption at rest or SOC 2 compliance — those are table stakes. It's about whether the system maintains hard boundaries between entities under adversarial conditions where the content itself provides no natural separation.

Zero leakage across 3,800 adversarial results isn't a benchmark score. It's a design constraint.

Frequently Asked Questions

What are CRM keys exactly? CRM keys are identity fields that uniquely identify an entity: recordId, email, websiteUrl, phoneNumber, or custom identifiers. They're the same identifiers your CRM uses. Memory entries are linked to entities through these keys, and retrieval filters on them before any semantic search runs.
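As a rough sketch of how such keys might resolve to a single entity record, the example below normalizes each key and looks it up in an index. The key names `email`, `websiteUrl`, and `phoneNumber` come from the answer above; the normalization rules, the `KeyIndex` class, and the sample values are assumptions made for illustration.

```python
import re

def normalize_key(key_type, value):
    """Normalize a CRM key so equivalent spellings map to the same entry."""
    value = value.strip()
    if key_type == "email":
        return value.lower()
    if key_type == "websiteUrl":
        # Drop the scheme, a leading "www.", and any trailing slash.
        value = re.sub(r"^(https?://)?(www\.)?", "", value.lower())
        return value.rstrip("/")
    if key_type == "phoneNumber":
        return re.sub(r"\D", "", value)  # keep digits only
    return value

class KeyIndex:
    """Maps (key type, normalized value) pairs to a recordId."""

    def __init__(self):
        self._index = {}

    def link(self, record_id, key_type, value):
        self._index[(key_type, normalize_key(key_type, value))] = record_id

    def resolve(self, key_type, value):
        return self._index.get((key_type, normalize_key(key_type, value)))

index = KeyIndex()
index.link("rec_acme", "websiteUrl", "https://www.Acme.com/")
index.link("rec_jane", "email", "jane@acme.com")

print(index.resolve("websiteUrl", "acme.com"))    # rec_acme
print(index.resolve("email", " Jane@Acme.com "))  # rec_jane
print(index.resolve("email", "sarah@other.com"))  # None
```

Because the lookup is an exact match on a normalized key, an unknown key resolves to nothing rather than to the nearest similar entity.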

Does entity isolation affect retrieval quality? No. The LoCoMo benchmark result (74.8% overall accuracy) is achieved with full entity isolation active. Isolation scopes the search space — it doesn't degrade it. In fact, by removing irrelevant results from other entities, it can improve precision.

What about cross-entity queries — like "show me all contacts evaluating cloud migration"? Cross-entity aggregation is a deliberate operation at the application layer, not a retrieval leak. The system supports it through structured property queries across the schema-enforced memory tier — filtering by property values across entities within an organization. This is a controlled, authorized operation, not a side effect of loose isolation.

Can entities share memories intentionally? Entity types include companies, deals, and other aggregate objects. A contact's memories and their company's memories are separate entity scopes. Cross-entity context is assembled at the application layer — for example, the multi-entity memory pattern compiles context from a contact, their company, and their deal into a single context block. But the underlying memories maintain separate entity scopes.
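A minimal sketch of that application-layer assembly, assuming a flat in-memory store and hypothetical record IDs. Each scope is fetched separately by its own record ID and only then combined into one context block, so the stored memories never change scope.

```python
def fetch_scoped(store, record_id):
    """Return memory texts for exactly one entity scope."""
    return [m["text"] for m in store if m["record_id"] == record_id]

def assemble_context(store, contact_id, company_id, deal_id):
    """Build one context block from three separately scoped fetches."""
    sections = [("Contact", contact_id), ("Company", company_id), ("Deal", deal_id)]
    lines = []
    for label, record_id in sections:
        lines.append(f"[{label}: {record_id}]")
        lines.extend(f"- {text}" for text in fetch_scoped(store, record_id))
    return "\n".join(lines)

# Hypothetical store: one unrelated contact is present but never requested.
store = [
    {"record_id": "contact_jane", "text": "Prefers email over calls"},
    {"record_id": "company_acme", "text": "Evaluating cloud migration"},
    {"record_id": "deal_42", "text": "Security review pending"},
    {"record_id": "contact_sarah", "text": "Unrelated contact at another company"},
]

context = assemble_context(store, "contact_jane", "company_acme", "deal_42")
print(context)
```

Only the three requested scopes contribute to the block. The unrelated contact stays out because it was never named in the request, not because it ranked low.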