Why the raw schema serves a query planner and not an agent, what every data platform shipped in 2026 to fix it, and why a semantic layer alone isn't enough.


Point a capable agent at your data warehouse and ask it a real business question. Not "rows where region = 'EMEA'," but "which accounts are most at risk of churning next quarter." With a handful of tables it looks like magic. Across a real schema it does something worse than fail: it answers, confidently, and it's wrong. It joined two tables that should never be joined. It used a revenue column that includes refunds because nobody told it the company reports net. It picked the staging copy of the table, not the certified one. The SQL runs. The number is plausible. The number is wrong.

That failure mode is the entire problem, and it is not a model problem. The team at dbt Labs put it more precisely than I could: text-to-SQL can return a plausible but incorrect answer silently, while a modeled semantic layer either returns the right result or explicitly fails. Silent confidence on a wrong answer is the most expensive output a data system can produce. A human analyst who isn't sure says so. An agent reading a raw schema has no way to be unsure, because the schema never told it what anything means.

This is why, across 2026, every major data platform quietly arrived at the same conclusion and shipped the same fix. Look at what they actually built and you find the argument I have been making about CRMs, generalized to the entire analytics stack.

The raw schema has the wrong reader

A relational schema is an extraordinary piece of engineering built for one consumer: a query planner returning exact rows for exact filters. Reports. Dashboards. Human analysts who already know that rev_net_v2 is the column finance trusts and rev is the one from 2019 that nobody deleted. The schema encodes structure. It does not encode meaning. The meaning lives in the heads of the five people who have worked there long enough to know which table is real.
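A minimal sketch of that silence, using Python's built-in sqlite3 module. The column names rev and rev_net_v2 come from the example above; the orders table, the account names, and every amount are invented for illustration. Both queries run without error, and nothing in the schema says which total finance trusts.

```python
import sqlite3

# Hypothetical table: `rev` is the legacy gross column,
# `rev_net_v2` is the net-of-refunds column finance trusts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (account TEXT, rev REAL, rev_net_v2 REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("acme", 1000.0, 900.0),    # partially refunded
        ("globex", 500.0, 500.0),   # no refund
        ("initech", 250.0, 0.0),    # fully refunded
    ],
)


def total(column: str) -> float:
    # The column name is interpolated only because it is a fixed demo value.
    return conn.execute(f"SELECT SUM({column}) FROM orders").fetchone()[0]


gross_total = total("rev")        # what a literal reader of the schema might pick
net_total = total("rev_net_v2")   # what the company actually reports

print("SUM(rev)        =", gross_total)
print("SUM(rev_net_v2) =", net_total)
```

The two totals differ, and the database raises no warning for either. The schema guarantees that the query is valid, not that it answers the question.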

An agent has none of that tribal knowledge. It reads the schema literally, and the schema is silent on every question that actually matters: which metric is authoritative, how two entities relate, what a word means in this company versus the dictionary. The insight that "active user" means something specific and contested here does not live in a column. It lives in a Slack thread and a tenured analyst's memory, and the database filed neither.

I argued this same point about CRMs: the system was built for a human clicking a dashboard, and it breaks when the primary reader becomes an agent. Watching the data-platform vendors converge on the fix in real time has been the strongest external confirmation I could have asked for. They are not adding intelligence. They are adding the layer the schema was always missing.

What everyone shipped in 2026

Three competitors, three products, one conclusion.

Databricks announced Genie Ontology at its Data and AI Summit: a self-improving context layer that extracts metric definitions, business terms, calculations, and relationships from tables, queries, dashboards, and pipelines, then organizes them into a graph of what the data actually means. It even borrows from PageRank. A mechanism they call OntoRank weighs a definition by where it came from, how authoritative the author is, how often people rely on it, how closely it ties to certified assets, and how fresh it is, so the agent answers from the source that carries the most weight. On their own 28-question internal benchmark, Genie hit 84.5 percent first-attempt accuracy against 52.4 percent for the strongest general-purpose coding agent.
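To make the weighting idea concrete, here is a toy sketch. It is not OntoRank: the description above lists the signals, not the formula, so the field names, the weights, and the flat weighted sum are all assumptions made up for this example. It is also not a graph algorithm in the PageRank sense.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    """One candidate definition of a metric, with hypothetical 0..1 signals."""
    name: str
    source_trust: float      # where it came from (certified pipeline vs ad hoc query)
    author_authority: float  # how authoritative the author is
    usage: float             # how often people rely on it
    certified_link: float    # how closely it ties to certified assets
    freshness: float         # 1.0 means recently updated


# Hypothetical weights, chosen only for illustration; they sum to 1.
WEIGHTS = {
    "source_trust": 0.25,
    "author_authority": 0.20,
    "usage": 0.20,
    "certified_link": 0.25,
    "freshness": 0.10,
}


def score(definition: Definition) -> float:
    # Flat weighted sum over the five signals.
    return sum(w * getattr(definition, field) for field, w in WEIGHTS.items())


def most_authoritative(candidates: list) -> Definition:
    return max(candidates, key=score)


candidates = [
    # Heavily used but stale, uncertified, and of unclear origin.
    Definition("rev", 0.3, 0.4, 0.9, 0.1, 0.1),
    # Less used, but certified, fresh, and written by the finance team.
    Definition("rev_net_v2", 0.9, 0.9, 0.6, 1.0, 0.8),
]

best = most_authoritative(candidates)
print(best.name)
```

The sketch shows why popularity alone is a poor signal. The legacy column is the most used, and it still loses once provenance and certification are counted.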

Snowflake shipped ontology-grounded Cortex agents and made the sharpest version of the case: enterprise data is full of meaning that never appears in a schema, and that inherent domain knowledge is what you must capture to power reliable agents. On a deliberately hard biomedical benchmark, a plain semantic view scored 50 percent; layering in ontology grounding (a knowledge graph, retrieval over enriched concept profiles, and curated term mappings) took it to 78.2 percent.

dbt Labs published a 2026 benchmark on the question directly. With modeled data, Claude Sonnet 4.6 went from 90.0 percent on raw text-to-SQL to 98.2 percent through the semantic layer; GPT-5.3 Codex went from 84.1 to 100. Their conclusion: use the semantic layer when the answer has to be right, and raw text-to-SQL only for throwaway exploration.

Read those numbers with discipline, because here is where most write-ups get careless. Every one of these is a vendor benchmark, run by the company selling the layer, on a small and self-chosen question set. Databricks' is 28 questions on an internal suite, not the public BIRD benchmark some secondhand coverage claimed. Snowflake's is 22 questions in one domain. dbt's is dbt's. No single percentage here is load-bearing. What is load-bearing is the direction, because three companies that compete hard for the same customers, measuring independently, all point the same way: ground the agent in meaning and accuracy jumps; don't, and it confabulates.
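The arithmetic behind that caution, using only the figures quoted above. The labels are mine, and the per-question figure assumes each question counts equally toward the score, which the vendors may or may not do.

```python
# (baseline or comparison, grounded) accuracy in percent, as quoted above.
reported = {
    "Databricks: coding agent vs Genie": (52.4, 84.5),
    "Snowflake: semantic view vs ontology grounding": (50.0, 78.2),
    "dbt: Claude Sonnet 4.6": (90.0, 98.2),
    "dbt: GPT-5.3 Codex": (84.1, 100.0),
}

# Gain in percentage points for each reported pair.
gains = {
    name: round(grounded - baseline, 1)
    for name, (baseline, grounded) in reported.items()
}


def points_per_question(n_questions: int) -> float:
    """Percentage points one question is worth if all are weighted equally."""
    return round(100 / n_questions, 1)


for name, gain in gains.items():
    print(f"{name}: +{gain} points")

print("one question on a 28-question suite:", points_per_question(28), "points")
print("one question on a 22-question suite:", points_per_question(22), "points")
```

Every gain is positive, which is the direction. But on suites this small a single question is worth several points, which is why no individual percentage deserves much weight.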

And the convergence goes beyond competing products. Back in September 2025, Snowflake, Salesforce, BlackRock, dbt Labs, and RelationalAI launched the Open Semantic Interchange, a vendor-neutral specification for how semantic metadata is defined and shared, with the outputs donated to the Apache Software Foundation. Databricks later joined the working group. When direct competitors start co-authoring a shared standard, they are not chasing a feature. They are conceding that this layer is now foundational infrastructure, and that fighting over its file format helps no one.

Semantic layer and ontology are not the same thing

Here is the distinction almost every summary blurs, and it is the part worth getting right.

A semantic layer governs metrics and dimensions. You define revenue once, as net of refunds, in the currency finance uses, and every agent and dashboard queries that one definition instead of reinventing it in raw SQL. It kills the "which revenue column" problem. That alone is most of the accuracy gain in the dbt numbers.
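A minimal sketch of "define it once, and fail loudly otherwise." This is not the API of dbt's semantic layer or of any real product. The registry, the table name orders_certified, the columns amount and refunds, and the function compile_metric are all hypothetical.

```python
class UnknownMetricError(KeyError):
    """Raised when a requested metric has no governed definition."""


# Hypothetical governed definitions: each metric is defined exactly once.
METRICS = {
    "revenue": {
        "table": "orders_certified",       # the certified table, not staging
        "expr": "SUM(amount - refunds)",   # net of refunds, as finance reports it
    },
}
ALLOWED_DIMENSIONS = {"region", "quarter"}


def compile_metric(metric: str, group_by: str) -> str:
    """Turn a metric request into SQL, or refuse explicitly."""
    if metric not in METRICS:
        raise UnknownMetricError(f"no governed definition for metric {metric!r}")
    if group_by not in ALLOWED_DIMENSIONS:
        raise ValueError(f"dimension {group_by!r} is not modeled")
    m = METRICS[metric]
    return (
        f"SELECT {group_by}, {m['expr']} AS {metric} "
        f"FROM {m['table']} GROUP BY {group_by}"
    )


sql = compile_metric("revenue", "region")
print(sql)

try:
    compile_metric("active_users", "region")
except UnknownMetricError as exc:
    # An explicit failure instead of a plausible guess.
    print("refused:", exc)
```

The agent never chooses a revenue column. It asks for revenue by name and gets either the governed SQL or an error it can report back.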

An ontology sits above that and encodes the things a metrics catalog can't: how entities relate, what contains what, which terms are synonyms, what a concept means in a regulated domain. It is the difference between knowing the definition of a word and knowing the language. Snowflake's own results make the gap concrete: the semantic view got them to 50 percent, and it was the ontology on top, the hierarchy and relationships and mappings, that carried them to 78.2 percent. The metrics layer answers "what is revenue." The ontology answers "how does this account relate to that contract, that subsidiary, and that renewal, and which of those even counts as the same customer."
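A toy illustration of the kind of question a metrics catalog cannot answer. The entities, relation names, and synonym list below are invented, and a real ontology would be far richer than three dictionaries and a list of triples.

```python
# Hypothetical synonym map: several business words, one canonical concept.
SYNONYMS = {"client": "customer", "account": "customer", "logo": "customer"}

# Hypothetical (subject, relation, object) triples.
TRIPLES = [
    ("acme_uk", "subsidiary_of", "acme_global"),
    ("acme_de", "subsidiary_of", "acme_global"),
    ("contract_17", "signed_by", "acme_uk"),
    ("renewal_2026", "renews", "contract_17"),
]


def canonical_term(term: str) -> str:
    """Map a business word to its canonical concept, if one is recorded."""
    lowered = term.lower()
    return SYNONYMS.get(lowered, lowered)


def ultimate_parent(entity: str) -> str:
    """Walk subsidiary_of links up to the top-level entity."""
    parents = {s: o for s, rel, o in TRIPLES if rel == "subsidiary_of"}
    seen = set()
    while entity in parents and entity not in seen:  # guard against cycles
        seen.add(entity)
        entity = parents[entity]
    return entity


def same_customer(a: str, b: str) -> bool:
    return ultimate_parent(a) == ultimate_parent(b)


def customer_for_renewal(renewal: str) -> str:
    """Follow renewal -> contract -> signer -> top-level customer."""
    contract = next(o for s, rel, o in TRIPLES if s == renewal and rel == "renews")
    signer = next(o for s, rel, o in TRIPLES if s == contract and rel == "signed_by")
    return ultimate_parent(signer)


print(canonical_term("Client"))
print(same_customer("acme_uk", "acme_de"))
print(customer_for_renewal("renewal_2026"))
```

None of these answers is a metric. They come from relationships and vocabulary, which is the part that lives above the semantic layer.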

Most teams rushing to bolt a semantic layer onto their warehouse will stop at the metrics and declare victory. They will get the first jump and miss the second. The relationships and meaning are harder to capture and worth more.

This is curation, not more data

One reflex makes all of this worse: the belief that the fix for a confused agent is more context. Dump the whole schema, every column, every relationship, every historical table into the prompt and let the model sort it out. It is exactly backwards. A schema dumped wholesale is noise; the agent now has more ways to be wrong, not fewer. The reason OntoRank exists, the reason Snowflake curates term mappings by hand, is that the value is in selection. The right few authoritative definitions beat the entire catalog. An ontology is a curation artifact. The constraint was never how much your data platform holds. It is how little of it the agent needs to be right.
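Selection over volume can be sketched in a few lines. The catalog entries, authority scores, and naive word-overlap matching below are all made up for the example. A real system would use far better retrieval, but the shape is the same: filter to certified definitions, rank, and hand the agent only the top few.

```python
# Hypothetical catalog: (name, description, certified, authority score 0..1).
CATALOG = [
    ("rev", "gross revenue, legacy 2019 column", False, 0.2),
    ("rev_net_v2", "net revenue after refunds, finance-certified", True, 0.9),
    ("stg_orders", "staging copy of orders", False, 0.1),
    ("orders_certified", "certified orders table", True, 0.8),
    ("active_user", "user with a billable session in the past 28 days", True, 0.7),
]


def curate_context(question: str, k: int = 2) -> list[str]:
    """Return at most k certified, relevant definitions, most authoritative first."""
    words = set(question.lower().split())
    relevant = [
        (authority, name, desc)
        for name, desc, certified, authority in CATALOG
        # Keep only certified entries that share at least one word with the question.
        if certified and words & set(desc.lower().replace(",", " ").split())
    ]
    relevant.sort(reverse=True)  # highest authority first
    return [f"{name}: {desc}" for _, name, desc in relevant[:k]]


context = curate_context("what was net revenue from certified orders last quarter")
for line in context:
    print(line)
```

The staging table and the legacy column never reach the prompt, so the agent cannot pick them by mistake.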

Where this leaves you

The honest read is not "buy the ontology product and you're done." Every benchmark above is self-reported, the meaning-capture is genuinely hard, and an ontology that is wrong or stale will mislead an agent more authoritatively than no ontology at all. Authority cuts both ways.

But the strategic picture is unambiguous, and it is the same one I drew for the CRM. The systems whose primary user is becoming an agent were all built for a different reader, and they all need a layer between the raw store and the agent that carries meaning, relationships, and authority. The data platforms saw it first because text-to-SQL made the failure impossible to ignore: the wrong answer was right there in the dashboard. Your CRM, your ticketing system, your product analytics, every system an agent is about to start operating, has the same gap and has not yet had its reckoning.

The vendors built the semantic door. The room behind it, governed and authoritative meaning that an agent can trust without a tenured analyst in the loop, is still mostly empty. That room is where the next few years of this work actually happen.


If this resonated, the same argument applied to CRMs, where the agent stops operating the system and becomes its primary user, is in Who's Running Your CRM.