Hamed Taheri
Co-founder & Product Lead, Personize.ai. Building governed memory infrastructure for enterprise AI agents. Vancouver.
The Case for Instruction Decomposition in Agentic AI Systems
A single prompt gives the LLM too much freedom. Decomposing instructions into sequential steps creates guided agency: multi-agent-like behavior from a single context window, with controllable complexity and shared state.
Reasoning as a Service: Stateless Orchestration, Client-Side Execution, Addressable Governance
What happens when you strip an AI API down to reasoning only? The tools run on the developer's servers. The state lives in signed payloads, not databases. Governance becomes infrastructure. And evaluation becomes a built-in second opinion.
Code-Orchestrated Agents vs. Tool-Calling: The Architecture Decision That Matters Most
Stripe, Shopify, and Salesforce all converged on the same pattern: LLM decides, code executes. Here's the architectural reasoning, the trade-offs, and when tool-calling actually makes sense.
The Multi-Entity Memory Pattern
Most AI systems remember contacts. The ones that work remember contacts, their companies, their deals, and the relationships between all of them — then recall across entity boundaries at inference time.
Encoding Solution Architecture Into an AI Skill
The early stages of AI implementation are mostly discovery — assembling scattered context into a coherent system design. We spent two years encoding that process. Here's what we found.
LLM Function Calling in Production: What the Benchmarks Actually Say
The best models fail 30% of the time on complex tool-calling scenarios. Seven documented error patterns, infinite loop failures, and silent cascading errors. Here's what the data says before you ship function calling to production.
Adversarial Governance Compliance — Our Methodology and What Near-Perfect Accuracy Tells Us
Delivering the right context to agents is one problem. Ensuring they respect what they must never do is another. Here's how we designed our adversarial governance experiment, what our results show, and why this work is never finished.
Dual Memory: Why You Need Both Free-Text Facts and Typed Properties
No schema anticipated 38% of the valuable information. Another 12% was only usable with type enforcement. Neither modality alone captures the full picture, and both come from one extraction pass.
The Four-Layer Architecture Behind Governed Memory
Dual memory, governance routing, reflection-bounded retrieval, and schema lifecycle — the architecture we built when RAG wasn't enough.
14 Agent Configs, 3 Teams, Zero Source of Truth
Sales embeds brand voice in a system prompt. Support copies compliance rules from a Notion doc. Marketing uses its own tone guidelines. When legal updates the data handling policy, nothing propagates.
Progressive Context Delivery: How We Cut Token Usage 50% in Multi-Step Agents
When agents re-plan and act in loops, re-injecting the same governance context at every step is expensive and makes outputs worse. Here's the fix.
Reflection-Bounded Retrieval: +25.7pp Completeness on Hard Queries
When the information an agent needs is scattered across 3–5 sources, a single retrieval pass misses most of it. Here's what actually works — and the surprising finding about what drives the gain.
Why Schema-Enforced Memory Is the CRM Integration Layer AI Has Been Missing
Free-text memories can go into a prompt. They can't sync to Salesforce, filter by deal stage, or aggregate across 10,000 entities. That's the downstream dead end.
Schemas Are Living Documents: The Closed-Loop Refinement Pipeline
Schemas age. Models get updated. Content types shift. New agent workflows produce data the schema wasn't designed for. Here's how to build a schema that keeps up.
Seven Memories Per Entity Is All You Need
Output quality saturates at roughly seven governed memories per entity. More context isn't better context — it's expensive noise.
The $450K Email Your AI Sent Wrong
Your enrichment agent knows the CTO is evaluating three vendors. Your outbound agent sends a generic cold email anyway. This is how memory silos cost you deals.
Two-Phase Redaction: Scrubbing PII Before and After LLM Extraction
Most redaction pipelines scrub PII from the output. We scrub it before the LLM sees the content and again after extraction. Here's why both phases are necessary.
99.6% Fact Recall, 74.8% on LoCoMo — What the Numbers Actually Mean
Transparent breakdown of our experimental results: what we tested, what the numbers prove, what they don't, and why we benchmark against ourselves honestly.
Zero Cross-Entity Leakage Across 3,800 Results
100 entities, overlapping names, same industry, similar roles — and zero actual memory bleed. Here's how entity isolation works when embeddings can't save you.
Your Agents Know Things. They Just Don't Tell Each Other.
Every workflow learns something. No workflow shares it. This is where organizational intelligence goes to die.
Dogfooding Governed Memory: Building Smart Notifications for Our Own Product
I installed our own SDK as a customer with a standard API key. No internal shortcuts. This is what I built and what happened.
What's Relevant? What Do We Know? What Are the Rules?
Three questions that reveal whether your AI agents have what they need — or whether you're building on gaps.
7 Patterns for Building Governed AI Knowledge Bases
A response to The New Stack's excellent taxonomy. They got six right. Here's the pattern nobody's building yet, and a practical blueprint for how to build it.
Amazon, LinkedIn, and the Race to Build Agentic Knowledge Bases (Part 2)
Google, Microsoft, and Salesforce are each solving a piece of the agent governance puzzle. Here's what the pattern reveals — and the gap nobody has closed.
Amazon, LinkedIn, and the Race to Build Agentic Knowledge Bases (Part 1)
The biggest companies in tech are converging on the same conclusion: AI agents without organizational knowledge are a liability.
Who's Actually in Charge of Your AI Agents?
Same company, same task, three different AI agents, three completely different answers. Your customers notice. Do you?
3 Shortcomings of RAG as Memory
The gap between 'stored' and 'remembered' is where agent quality lives.
Why Agents Fail Without Memory
If your AI agents forget everything between conversations, they're not agents — they're expensive autocomplete.