Goldman Just Wrote Our Memo.

May 1, 2026 · SeerAI Team

On the Goldman Sachs Global Institute paper, what it means for AI infrastructure, and why the data layer beneath the model is where the next decade of value will live.

Last week the Goldman Sachs Global Institute published a paper, “When AI Learns How the World Works,” arguing that the next frontier of AI is no longer larger language models. It is world models: systems that understand how physical reality behaves, simulate consequences before acting, and reason across the structures and dynamics of the real world. The paper is, in effect, an institutional articulation of the category we have been building in since 2020.

This post explains why the Goldman framing matters, the thesis we have been quietly operating on for five years, and the one architectural decision that we believe will separate the platforms that survive this transition from the ones that get rebuilt around them.

What Goldman Says

Strip the prose away and Goldman makes five claims.

  • World models are the next paradigm: where LLMs predict tokens, world models predict consequences.
  • There are two flavors, physical (gravity, friction, motion, geography) and social (agents, incentives, behavior), and both require ground truth about the structured reality they sit inside.
  • Compute is no longer the bottleneck; what matters now is, in Goldman’s words, “the scope and quality of the simulation itself, and how faithfully it captures dynamics that matter.”
  • Aggregate AI infrastructure spend may be materially understated relative to current consensus.
  • And digital twins, once an R&D curiosity, are becoming a procurement category.

“The frontier of AI no longer lies only in larger models and more data. It lies in better representations of reality.”

— Goldman Sachs Global Institute, April 2026

The Real Thesis: 20% Models, 80% Data

Here is the argument we want readers to hold above everything else in this post, because it is the one most of the field is still missing. The model is not where the durable return on AI investment lives. AI is roughly 20% models and 80% data, and we believe that ratio is going to widen, not narrow.

The model layer will evolve the way LLMs already have. Yann LeCun’s JEPA. Fei-Fei Li’s spatial intelligence work at World Labs. NVIDIA’s physical simulators. Google. Meta. Whoever is next. Each new generation will be more capable, more efficient, and within twelve to eighteen months, broadly available. Every major player will have a competitive world model. The same compression cycle the field just lived through with LLMs will repeat. Margins on models compress. Differentiation collapses into table stakes.

Data does not work that way. Data does not evolve in generations; it accumulates. The company that owns access to the most physical-world data, plus the authoritative knowledge graph that holds the context, relationships, and provenance of that data, owns the layer that does not get commoditized. Models will compete. The data and context underneath them will compound.

Geodesic, our platform, is not a model. It is the federated access, knowledge graph, and reasoning layer that every world model will eventually need to do useful work outside a lab. Whoever wins at the model layer is, by construction, our customer.

What a Digital Twin Actually Is

If you take one technical idea from this post, take this one. The way most people define a digital twin is wrong, and that wrongness is the reason most digital twin programs fail to deliver anything an AI system can actually use.

Ask ten executives what a digital twin is and nine will describe a 3D rendering. A photorealistic model of a city, a factory, a building. Beautiful visualization, real-time graphics. NVIDIA’s own consulting partners now openly warn customers about this trap: “starting with visualization instead of data integration — a beautiful twin with no live data has no operational value.”

A real digital twin is not a picture of the world. It is a contextual matrix of the data that describes the world. It is the answer to four questions, asked simultaneously, about every meaningful object, event, and signal inside the system you are modeling: what is this data, where does it live, when was it true, and how does it connect to everything else? Get those four right and the rendering becomes a downstream choice. Get the rendering first and you have a beautiful surface with nothing real underneath it.

“The digital twin is the data and the connections of that data, not the picture of it. The rendering is the last mile of delivery; the data is the first mile, and by far the harder one.”
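The four questions can be made concrete as a record shape. This is a toy sketch, not any real platform's schema; the names (`TwinRecord`, `links`) and the pump example are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TwinRecord:
    """One node in the contextual matrix: the four questions as fields."""
    what: str                    # what is this data (semantic type + value)
    where: tuple                 # where does it live (lat, lon)
    when: str                    # when was it true (ISO-8601 timestamp)
    links: tuple = ()            # how does it connect (ids of related records)

# A sensor reading only becomes a twin node once all four answers are attached;
# a rendering, if you want one, is derived from records like this downstream.
reading = TwinRecord(
    what="pump_3.outlet_pressure=4.2bar",
    where=(51.5074, -0.1278),
    when="2026-04-30T14:05:00Z",
    links=("pump_3", "line_A", "maintenance_ticket_881"),
)
```

The point of the sketch is that the picture is absent entirely: everything a model needs to reason with is in the data and its connections.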

All Data Is Spatiotemporal

There is one architectural decision that separates Geodesic from every other data platform on the market, and it is the decision that makes the digital-twin definition above actually work in production.

Every fact about the physical world has a where and a when. Every sensor reading. Every transaction. Every asset. Every event. The where and the when are not metadata. They are not optional fields on a record. They are the primary keys of physical reality. If you do not know where something happened and when it happened, you do not know what happened.

Yet every other major data platform treats space and time as second-class citizens. Snowflake and Databricks were built around tables and rows; geography and time are columns you bolt on. Palantir’s ontology can be configured to handle them but is not architecturally native. Esri is spatially native but treats time as an afterthought, and remains structurally siloed from the broader enterprise stack. The result, across all of them, is the same integration tax the industry keeps paying: spatial joins that take hours, temporal alignments that have to be hand-built for every dataset, cross-source queries that fail at any meaningful scale.
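The "hand-built temporal alignment" named above is worth seeing. On a tables-and-rows platform, joining two streams whose timestamps never quite line up means someone writes bespoke as-of glue code per dataset pair. A minimal sketch of that glue (the sensor and weather streams are invented for illustration):

```python
from bisect import bisect_right

def asof_join(left, right, tolerance):
    """For each (t, v) in left, attach the latest right value at or before t,
    if it falls within tolerance. This is the glue code that gets rewritten,
    by hand, for every pair of sources on a bolt-on platform."""
    r_times = [t for t, _ in right]
    out = []
    for t, v in left:
        i = bisect_right(r_times, t) - 1
        match = right[i][1] if i >= 0 and t - r_times[i] <= tolerance else None
        out.append((t, v, match))
    return out

sensor  = [(10, 4.1), (20, 4.3), (35, 4.0)]   # (seconds, pressure in bar)
weather = [(8, 17.2), (19, 17.9)]             # (seconds, temperature in C)

# Each pressure reading picks up the nearest-earlier temperature, or None.
aligned = asof_join(sensor, weather, tolerance=10)
```

Multiply this by every source pair and every spatial join, and the integration tax becomes the dominant cost of the twin.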

Geodesic was designed from the first line of code with a different premise. The where and the when are equal citizens with the what. Every datum, regardless of source or format, is indexed natively in space and time. We believe being natively spatiotemporal is the only way to build a real digital twin of the world. A world model that understands physical reality has to reason across where and when, fluidly, at any scale, across any source. Any platform that treats space and time as add-ons will hit a wall the moment it tries to scale.
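One way to picture "indexed natively in space and time" is a composite key built from a spatial cell and a time bucket, so a where-and-when query is a direct lookup rather than a join bolted on afterward. This is a toy sketch under assumed cell and bucket sizes, not Geodesic's actual scheme:

```python
from collections import defaultdict

CELL_DEG = 0.1        # ~11 km grid cells at mid-latitudes (illustrative)
BUCKET_SEC = 3600     # hourly time buckets (illustrative)

def st_key(lat, lon, t):
    """Composite spatiotemporal key: (space cell, space cell, time bucket)."""
    return (round(lat / CELL_DEG), round(lon / CELL_DEG), int(t // BUCKET_SEC))

index = defaultdict(list)

def ingest(datum, lat, lon, t):
    # The where and the when are the primary key, not bolted-on columns.
    index[st_key(lat, lon, t)].append(datum)

def query(lat, lon, t):
    return index[st_key(lat, lon, t)]

ingest("ship_7:speed=12kn", 51.50, -0.12, 7200)
ingest("buoy_3:wave=1.8m",  51.52, -0.14, 7500)

# Same cell, same hour: one lookup returns both records,
# regardless of source, with no cross-source join.
nearby = query(51.51, -0.13, 7300)
```

Production systems use hierarchical space-time indexes rather than a flat grid, but the premise is the same: when space and time are the key, cross-source queries are lookups, not reconciliation projects.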

“If your digital twin cannot natively reason across space and time, it is not a twin. It is a diagram with sensors attached.”

The Category Beneath the Category

Step back from the technical argument. What Goldman’s paper signals, and what the institutional capital flowing into world models is starting to price, is a new category of AI infrastructure: the layer that turns physical-world data into something a model, an agent, or an enterprise can reason on. Most discussion of this layer treats it as plumbing. We think that is exactly wrong. The layer is the product.

We have a way of describing this that we have started using more publicly: SeerAI answers questions you didn’t know you could ask. The reason it resonates is that it is not a feature claim. It is a category claim. What Geodesic delivers is not software in the traditional sense. It is access to knowledge that was previously inaccessible, structured in a way that lets the user keep digging.

The last twenty years of enterprise software were Software as a Service: a tool sat there, a box of features, inert until someone opened it. The value was capped by the imagination of the user. The platform we are building is something architecturally different: every question answered exposes three more worth asking, and every dataset connected reveals adjacencies that were invisible before. The platform compounds the user’s curiosity into action. We think that dynamic earns its own name. Knowledge as a Service. KaaS.

Goldman’s paper does not use that vocabulary. It does not need to. The institutional point is the same: the next decade of AI value will not live in the largest model. It will live in the most faithful, queryable, semantically rich representation of the world that model has to reason against. The companies that own that representation own the compounding asset of the agentic era.


Reference: Goldman Sachs Global Institute, “When AI Learns How the World Works,” April 2026, by George Lee and Dan Keyserling.

AI Infrastructure · World Models · Digital Twins · Knowledge Graphs · Data Architecture · Spatiotemporal Data · AI Strategy
