Event Sourcing meets MCP: The whole story for LLMs

Event Sourcing provides LLMs with the richest context. The Model Context Protocol makes this context accessible via a standard interface.

(Image: Robot with its hand over its mouth, surrounded by digital inscriptions of the word "Data". Jirsak / Shutterstock.com)

By Golo Roden

Last August, I argued here that Event Sourcing is the perfect foundation for AI. The core thesis was: Without complete, context-rich data, even the most powerful model remains blind. Event Sourcing provides exactly this data because it stores not only the current state but the entire business history. Nothing has changed about this thesis. But one question remained open: How do you make this data accessible to a Large Language Model?

the next big thing – Golo Roden

Golo Roden is the founder and CTO of the native web GmbH. He works on the design and development of web and cloud applications and APIs, with a focus on event-driven and service-based distributed architectures. His guiding principle is that software development is not an end in itself but must always serve the underlying business domain.

For about a year and a half, the Model Context Protocol (MCP) has existed as an open standard that connects LLMs with any external data sources. This elegantly bridges the gap between the data an event store holds and what a language model can see of it. This interplay deserves a closer look.

Anyone working with large language models quickly learns an important lesson: the quality of the answers depends less on the model than on the context you give it. An average model with excellent context delivers better results than a top model lacking context. This applies to simple prompts as well as complex analyses of company data.

Here, context means not only the question you ask but, above all, the data the model can access. And therein lies the problem: Most databases store only the current state. A relational database tells you that a customer has the status “Premium.” But it doesn't tell you since when, why, or which interactions led to it. It shows that a product costs €9.99, but not whether this price was lowered yesterday or has remained unchanged for ten years. It shows that an order has the status “open,” but not whether it was changed three times, canceled once, and then placed again.

For an LLM, this is like handing it a book containing only the last chapter. It can describe the current state but cannot explain connections, recognize patterns, or understand developments. It simply lacks the story told in the preceding chapters.

Event Sourcing fundamentally solves this problem. Instead of storing the state and overwriting it with each change, individual state changes are recorded as business events. Each event describes what happened, when it happened, and in what business context. The sum of all events results not only in the current state but in the complete, chronological, and immutable history. The last chapter becomes the entire book.
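The core idea can be sketched in a few lines: append immutable events and derive the current state by replaying them. This is a minimal illustration, not a real event store; the event and field names anticipate the library example used later in the article.

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    type: str          # what happened, in business language
    subject: str       # which book or reader it concerns
    data: dict = field(default_factory=dict)

def replay(events: list[Event]) -> dict:
    """Fold the full event history into the current state per subject."""
    state: dict[str, dict] = {}
    for e in events:
        book = state.setdefault(e.subject, {"borrowed": False})
        if e.type == "book-acquired":
            book.update(e.data)
        elif e.type == "book-borrowed":
            book["borrowed"] = True
        elif e.type == "book-returned":
            book["borrowed"] = False
    return state

history = [
    Event("book-acquired", "/books/42", {"title": "Refactoring"}),
    Event("book-borrowed", "/books/42"),
    Event("book-returned", "/books/42"),
    Event("book-borrowed", "/books/42"),
]
print(replay(history))  # the state is derived; the history stays intact
```

The current state is always reproducible from the log, but the log itself, with its three intermediate steps, is what a snapshot-only database would have thrown away.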

However, the best data foundation is of little use if an LLM cannot access it. Until now, connecting external data sources to a language model typically required individual integrations: custom APIs, custom glue code, custom maintenance. A unique solution had to be built for each combination of data source and LLM. This scales poorly and creates dependencies that develop into serious technical debt over time.

The Model Context Protocol changes that. MCP is an open standard introduced by Anthropic in November 2024 that now enjoys broad industry support; OpenAI, Google, and other major vendors have adopted it as well. MCP defines a uniform interface between LLMs and external data sources. An MCP server provides tools and data that an LLM can use via an MCP client. The interaction takes place in natural language: the model formulates requests, the server delivers the appropriate data. This all happens via a standardized protocol based on JSON-RPC, so proprietary technology is not required on either the client or server side.
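On the wire, such an interaction is plain JSON-RPC 2.0. The `tools/call` method is how an MCP client invokes a tool exposed by a server; the tool name `read_events` and its argument here are purely illustrative, not from any real server.

```python
import json

# Sketch of an MCP tool invocation as a JSON-RPC 2.0 request.
# "tools/call" is the standard MCP method for invoking a server tool;
# the tool name and arguments are hypothetical example values.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "read_events",                  # hypothetical tool name
        "arguments": {"subject": "/books/42"},  # hypothetical argument
    },
}

wire_format = json.dumps(request)
print(wire_format)
```

Because the envelope is the same for every server and every data source, an LLM client only has to speak this one protocol, regardless of what sits behind it.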

What's crucial about MCP is its data source agnosticism. The standard defines how communication works, but not what kind of data lies behind it. Whether an MCP server connects to a relational database, a file system, an API, or an event store makes no difference at the protocol level. MCP servers now exist for dozens of systems, from GitHub and Slack to databases of various kinds.

The question MCP doesn't answer is therefore all the more important: What quality does the data flowing through this channel have? If MCP is the bridge between LLM and data source, then the data source determines how sturdy this bridge can be. A CRUD database connected via MCP still only provides snapshots. An event store, on the other hand, provides the whole story.

To make this tangible, consider a concrete example I use regularly: modeling a public library as a domain. The associated events are manageable yet rich in business terms: A BookAcquired event describes the acquisition of a new book with title, author, and ISBN. BookBorrowed documents the borrowing; BookReturned the return. On the reader side, there is ReaderApplied for registration and ReaderAccepted for the activation of the library card. For the concrete implementation, I rely on the CloudEvents standard, whose type field carries a business-meaningful name in reverse domain notation, such as io.eventsourcingdb.library.book-borrowed. Each book and each reader forms its own subject, i.e., its own event stream, which completely maps the respective history.
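A single event from this domain could look as follows in CloudEvents shape. The attribute names (specversion, id, source, subject, type, time, data) come from the CloudEvents specification; the concrete values, including the reader and due-date fields, are invented example data.

```python
import json

# One library event in CloudEvents shape. Attribute names follow the
# CloudEvents spec; all concrete values are illustrative example data.
book_borrowed = {
    "specversion": "1.0",
    "id": "7f9c2b34-0001",
    "source": "https://library.example/events",
    "subject": "/books/42",
    "type": "io.eventsourcingdb.library.book-borrowed",
    "time": "2026-03-01T10:15:00Z",
    "data": {"readerId": "/readers/23", "dueDate": "2026-03-22"},
}

print(json.dumps(book_borrowed, indent=2))
```

Note how much of the meaning is carried by the event itself: the type says what happened, the subject says to whom, and the data carries the business details.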

Now imagine an LLM having access to this city library's event store via an MCP server. It could search the existing subjects, i.e., identify individual books and readers. It could list the event types and thus understand the structure of the domain without anyone having to explain it. And it could answer specific questions that would be difficult or impossible to answer with a CRUD database.

“Which books were acquired in the past twelve months but have never been borrowed?”

This question requires knowing that a book exists (BookAcquired) and that a specific subsequent event (BookBorrowed) is absent. In a CRUD database, a specially maintained field would have to exist for this and be updated correctly on every change. In the event store, the answer falls directly out of the event sequence.
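The shape of that answer is simple set arithmetic over the log: subjects with an acquisition event minus subjects with any borrowing event. A minimal sketch, with events reduced to (type, subject) pairs for brevity:

```python
# "Acquired but never borrowed" = presence of one event type,
# absence of another, per subject. Example data only.
events = [
    ("book-acquired", "/books/1"),
    ("book-acquired", "/books/2"),
    ("book-borrowed", "/books/1"),
]

acquired = {subject for etype, subject in events if etype == "book-acquired"}
borrowed = {subject for etype, subject in events if etype == "book-borrowed"}
never_borrowed = sorted(acquired - borrowed)
print(never_borrowed)  # → ['/books/2']
```

No precomputed flag is needed; the query works on the raw history, and a time window ("in the past twelve months") would just be an additional filter on the event timestamps.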

“Show me the complete borrowing history of reader 23.”

In a CRUD database, you see at best which book the reader currently has borrowed. In the event store, you see every single borrowing, every return, every late return, and every fee charged. The LLM can derive patterns from this, such as whether certain genres are preferred or whether returns are regularly late.
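Retrieving that history is nothing more than a filter over the chronologically ordered log. The event shape below is illustrative:

```python
# A reader's complete history is a filter over the event log,
# which is already in chronological order. Example data only.
events = [
    {"type": "book-borrowed", "reader": "/readers/23", "book": "/books/7"},
    {"type": "book-borrowed", "reader": "/readers/5",  "book": "/books/7"},
    {"type": "book-returned", "reader": "/readers/23", "book": "/books/7"},
    {"type": "fee-charged",   "reader": "/readers/23", "book": "/books/7"},
]

history = [e for e in events if e["reader"] == "/readers/23"]
for e in history:
    print(e["type"], e["book"])
```

What an LLM receives here is a sequence it can read like a short narrative, which is exactly the input shape language models handle best.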

“Which books are frequently borrowed but rarely returned on time?”

This is also a question that can only be answered by combining multiple events over time. An LLM that sees this data can not only provide the answer but also formulate hypotheses: Is it due to the length of the book? Its popularity? Specific readers who generally return late?
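Correlating the two event types per book might look like the following sketch. The flat tuples, the late-return flag, and the thresholds are all invented for illustration:

```python
from collections import Counter

# "Frequently borrowed but rarely returned on time" means correlating
# two event types per book over time. Example data and thresholds only.
borrows = Counter()   # how often each book was borrowed
late = Counter()      # how often it came back after the due date

events = [
    ("book-borrowed", "/books/7", None),
    ("book-returned", "/books/7", True),   # True = returned late
    ("book-borrowed", "/books/7", None),
    ("book-returned", "/books/7", True),
    ("book-borrowed", "/books/9", None),
    ("book-returned", "/books/9", False),
]

for etype, book, was_late in events:
    if etype == "book-borrowed":
        borrows[book] += 1
    elif etype == "book-returned" and was_late:
        late[book] += 1

suspects = [b for b in borrows if borrows[b] >= 2 and late[b] / borrows[b] > 0.5]
print(suspects)  # → ['/books/7']
```

The interesting part is not the arithmetic but that the raw material for it, every borrowing and every return with its timing, exists at all.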

Crucially, none of these questions require someone to have built a special read model or programmed an analysis in advance. The LLM formulates the question, the MCP server provides the data, and the model interprets it. The flexibility arises because the raw data is complete in terms of business logic and history. What works with a city library in this example works in reality with any domain: insurance claims, logistics chains, e-commerce order processes, healthcare treatment histories; the list is endless. And the richer the events are semantically, the more an LLM can do with them.

There is another reason why Event Sourcing and LLMs fit together so well, and it lies in the form of the data itself. Events are inherently formulated in business language. An event type like io.eventsourcingdb.library.book-borrowed already says what happened. The event data contains the details: which book, which reader, at what time. With such data, an LLM doesn't have to guess what is meant. It doesn't read cryptic status codes, technical foreign keys, or a type column with the value 3. It reads exactly what happened: a book was borrowed, a book was returned, a fee was charged, a reader was blocked. The semantics are in the data itself.

Furthermore, there is the chronological order. Events are naturally sorted by time. An LLM can therefore read the events like a narrative: first the book was acquired, then borrowed, then returned, then borrowed again, then returned late, then a fee was charged. This narrative structure corresponds exactly to what language models handle best. They are trained to recognize connections in sequential data, identify patterns, and draw conclusions. A sequence of events is structurally closer to a text than to a relational table.


Finally, each event is self-contained and context-rich. It not only describes what has changed but already carries the business context within it. A BookBorrowed event contains not only a book ID and a reader ID but is in the context of a subject that represents the entire history of that book. This built-in context reduces the need to provide the LLM with additional explanations. The events speak for themselves.

Anyone following the current developments in context engineering will recognize the parallel: it's about providing a language model with the richest, most structured, and most relevant context possible. Events fulfill all three criteria without requiring additional preparation. They are, in a sense, already written in the language that LLMs understand best. In other words, if your software already provides good data, analysis via artificial intelligence becomes significantly easier.

The connection between Event Sourcing and MCP is not mere theory. For example, in mid-March 2026, an MCP server was released as a free extension for the EventSourcingDB database. It allows interacting with the event store in natural language via any LLM: reading and writing events, searching subjects and event types, registering event schemas, executing EventQL queries, and even querying the built-in EventQL documentation. The MCP server runs as a standalone process alongside the database, supports TLS encryption and token-based authentication, and is available as a Docker image.

In practice, this means: A developer can ask her LLM which subjects exist in the event store and receive a structured answer. She can ask which events exist for a specific book, and the LLM will read them chronologically from the store. She can formulate an analytical question, and the LLM will translate it into an EventQL query, execute it, and interpret the result. All this happens via a standardized interface, without a single line of integration code. What previously required experienced developers with knowledge of the query language and data structure becomes accessible to a broader audience through natural language.

However, it would be wrong to narrow this idea down to a single product. The principle applies to any event store that offers or will offer an MCP server. The value lies not in the specific implementation but in the combination: an event store that holds the complete business history and a standardized protocol that makes this history accessible to language models. EventSourcingDB is merely a concrete example of what this combination can look like in practice today.

MCP thus solves a real problem: it standardizes LLM access to external data sources and makes individual integrations unnecessary. This is a significant step. But MCP alone is not enough. Standardized access to insufficient data yields standardized insufficient results.

The real question remains: What data lies behind the MCP server? Anyone connecting a CRUD database gives an LLM access to the current state. That's better than no access at all, but it remains at snapshots without history. The LLM sees what is. It doesn't see what was, what has changed, or why.

Anyone connecting an event store, on the other hand, gives an LLM access to the entire business history. It sees not only the state but the path to it. It can recognize patterns, establish connections, and understand developments. It can explain, not just describe.

The combination of Event Sourcing and MCP is so powerful because it addresses two complementary problems simultaneously. Event Sourcing solves the data problem: it provides complete, context-rich, and business-language-formulated data. MCP solves the access problem: it makes this data accessible to language models via a standardized interface. Neither solves the entire challenge on its own. Together, however, they close a gap that previously ensured that LLMs either had no access to the right data or had access to the wrong data.

It is remarkable how stable this combination is against changes. Models will continue to evolve, new LLMs will emerge, existing ones will become more powerful. The way we formulate prompts will change. MCP will also continue to evolve as a protocol. What will not change are the stored events. They remain the immutable foundation upon which everything else is built. Anyone who starts storing events today is not investing in a short-lived technology but in data that gains value with every new model and every future access method.

Anyone who read my article on Event Sourcing as a foundation for AI last August will find the next logical step here. The data is there. Now it is also accessible.

(rme)


This article was originally published in German. It was translated with technical assistance and editorially reviewed before publication.