Lean instead of bloated: What Aggregates and Read Models really are
DDD, CQRS, and Event Sourcing promise flexibility, but many teams end up with CRUD with extra steps. The reason? A misunderstanding.
By Golo Roden
When developers first encounter Domain-Driven Design (DDD), CQRS, and Event Sourcing, they already bring mental models with them. Years of working with objects and tables have shaped how they think about data.
And so, they immediately have a familiar image in mind when they hear the word “aggregate”: an aggregate must be like an object, and objects are mapped to tables. This intuition feels right. But it isn't, and it leads to a system that suspiciously resembles CRUD with additional steps.
I've seen this pattern countless times. Teams build something they call an event-driven system and end up with a single books table that contains every field their Book aggregate has. They've basically recreated a relational database, just with events as the transport mechanism. The strengths of DDD, CQRS, and Event Sourcing, the flexibility these concepts promise – all of it remains unused.
The Aggregate Misconception
The problem lies in what developers typically take an aggregate to be: a container for all data about a thing. They imagine a Book aggregate and start listing properties:
BookAggregate {
  id: string
  title: string
  author: string
  isbn: string
  currentBorrower: string | null
  dueDate: Date | null
  location: string
  condition: string
  purchasePrice: number
  acquisitionDate: Date
  lastInspectionDate: Date
  popularityScore: number
}
This looks like an object. It has all the fields. It maps cleanly to a database table. And therein lies the mistake: treating the aggregate as a data container.
If you think this way, your aggregate becomes a bloated representation of everything you might ever want to know about a book. It mirrors the structure of your Read Model because you haven't yet realized that they are fundamentally different concepts serving fundamentally different purposes.
What an Aggregate Actually Is
According to DDD, an aggregate is a boundary of consistency for decision-making. That's all. Its purpose is to ensure that business rules are adhered to when commands are processed. It only needs the information required to decide whether a command is valid.
Consider the BorrowBook command. To decide if a book can be borrowed, you only need to know one thing: is the book currently available? You don't need the title, the author, the ISBN, the purchase price, the location, or the last inspection date. None of this information helps you decide whether this specific command should succeed or not. This means the aggregate can be very lean, as it only contains the state relevant to the decision.
For our library example, a correctly designed Book aggregate might look like this:
BookAggregate {
  isAvailable: boolean
  currentBorrower: string | null
}
This is enough to decide:
- Can this book be borrowed? (isAvailable === true)
- Can this person return it? (currentBorrower === personId)
Everything else, any other information about the book, belongs elsewhere – in Read Models, not in the aggregate.
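To make the idea of a decision boundary more tangible, here is a minimal sketch of such a lean aggregate in TypeScript. It is not a definitive implementation: the split into a decide and an evolve function is one common way to structure this, and the payload fields (personId) as well as the names initialBookState, pastEvents, currentState, and newEvents are assumptions made for illustration.

```typescript
// Hypothetical command and event shapes: the article names BorrowBook,
// BookBorrowed and BookReturned, but the payload fields are assumptions.
type BookCommand =
  | { type: "BorrowBook"; personId: string }
  | { type: "ReturnBook"; personId: string };

type BookEvent =
  | { type: "BookBorrowed"; personId: string }
  | { type: "BookReturned"; personId: string };

// The aggregate state holds only what the decisions below require.
interface BookState {
  isAvailable: boolean;
  currentBorrower: string | null;
}

// Assumption: the book has already been acquired and sits on the shelf.
const initialBookState: BookState = { isAvailable: true, currentBorrower: null };

// decide: enforce the business rules and return the events that result.
function decide(state: BookState, command: BookCommand): BookEvent[] {
  switch (command.type) {
    case "BorrowBook":
      if (!state.isAvailable) {
        throw new Error("Book is currently not available.");
      }
      return [{ type: "BookBorrowed", personId: command.personId }];
    case "ReturnBook":
      if (state.currentBorrower !== command.personId) {
        throw new Error("Book was not borrowed by this person.");
      }
      return [{ type: "BookReturned", personId: command.personId }];
  }
}

// evolve: fold a single event into the state.
function evolve(state: BookState, event: BookEvent): BookState {
  switch (event.type) {
    case "BookBorrowed":
      return { ...state, isAvailable: false, currentBorrower: event.personId };
    case "BookReturned":
      return { ...state, isAvailable: true, currentBorrower: null };
  }
}

// Usage: rebuild the state from past events, then decide on a new command.
const pastEvents: BookEvent[] = [{ type: "BookBorrowed", personId: "alice" }];
const currentState = pastEvents.reduce(evolve, initialBookState);
const newEvents = decide(currentState, { type: "ReturnBook", personId: "alice" });
```

Note that neither function ever touches a title, an ISBN, or a purchase price. The state stays as small as the rules it has to protect.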
The Read Model Misconception
Once developers accept that an aggregate has certain fields, the next mistake follows: “If my aggregate has these fields, my Read Model table should have these fields too.”
The result is predictable. You create a books table with columns for id, title, author, isbn, borrower, dueDate, location, condition, purchasePrice, and every other field you can think of. Every query, whatever it actually needs, has to work against this one monolithic structure. Performance suffers. Flexibility disappears.
This is CRUD thinking applied to Event Sourcing. The events exist, but they are just a transport layer. The system still revolves around a single canonical representation of the data, just like a traditional relational database.
What Read Models Actually Are
Read Models are projections optimized for specific queries. They serve use cases, not data structures. And here's the crucial insight: Read Models are derived from events, not from aggregates. The responsibilities are divided as follows:
- Your aggregate decides what happens.
- Events record what happened.
- Read Models are built from these events to answer specific questions efficiently.
There is no requirement, no rule, no architectural principle that states Read Models must mirror the structure of aggregates.
In fact, the opposite is true. From an event stream, you can build many different Read Models. This is the strength of CQRS that gets lost when you think in tables.
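As a rough sketch of what "many Read Models from one event stream" can mean in code, the following TypeScript example folds the same list of events into two differently shaped views. The event names BookAcquired, BookBorrowed, and BookReturned come from the library example; all payload fields, the types CatalogEntry and Loan, and the function names are assumptions made for this illustration.

```typescript
// Hypothetical event payloads: only the event names are taken from the article.
type LibraryEvent =
  | { type: "BookAcquired"; bookId: string; title: string; author: string; isbn: string }
  | { type: "BookBorrowed"; bookId: string; memberId: string; dueDate: string }
  | { type: "BookReturned"; bookId: string; memberId: string };

// Read Model 1: what a catalog search needs.
interface CatalogEntry {
  bookId: string;
  title: string;
  author: string;
  isbn: string;
  isAvailable: boolean;
}

function projectCatalog(events: LibraryEvent[]): Record<string, CatalogEntry> {
  const catalog: Record<string, CatalogEntry> = {};
  for (const event of events) {
    const entry = catalog[event.bookId];
    switch (event.type) {
      case "BookAcquired":
        catalog[event.bookId] = {
          bookId: event.bookId,
          title: event.title,
          author: event.author,
          isbn: event.isbn,
          isAvailable: true,
        };
        break;
      case "BookBorrowed":
        if (entry) entry.isAvailable = false;
        break;
      case "BookReturned":
        if (entry) entry.isAvailable = true;
        break;
    }
  }
  return catalog;
}

// Read Model 2: what a member's "My Books" page needs.
interface Loan {
  bookId: string;
  title: string;
  dueDate: string;
}

function projectLoansByMember(events: LibraryEvent[]): Record<string, Loan[]> {
  const titles: Record<string, string> = {};
  const loans: Record<string, Loan[]> = {};
  for (const event of events) {
    switch (event.type) {
      case "BookAcquired":
        titles[event.bookId] = event.title;
        break;
      case "BookBorrowed": {
        const current = loans[event.memberId] || [];
        const title = titles[event.bookId] || "(unknown title)";
        loans[event.memberId] = current.concat([
          { bookId: event.bookId, title: title, dueDate: event.dueDate },
        ]);
        break;
      }
      case "BookReturned": {
        const current = loans[event.memberId] || [];
        loans[event.memberId] = current.filter((loan) => loan.bookId !== event.bookId);
        break;
      }
    }
  }
  return loans;
}

// Usage: one stream, two views with entirely different shapes.
const libraryEvents: LibraryEvent[] = [
  { type: "BookAcquired", bookId: "b1", title: "Book One", author: "Author A", isbn: "isbn-1" },
  { type: "BookAcquired", bookId: "b2", title: "Book Two", author: "Author B", isbn: "isbn-2" },
  { type: "BookBorrowed", bookId: "b1", memberId: "m1", dueDate: "2024-05-01" },
];
const catalogView = projectCatalog(libraryEvents);
const loansView = projectLoansByMember(libraryEvents);
```

Neither projection knows the aggregate exists. Both depend only on the events, which is why their shapes are free to differ.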
The Library Example: One Write Model, Many Read Models
Let's make this concrete with our library. We have a Book aggregate that handles decisions:
BookAggregate {
  isAvailable: boolean
  currentBorrower: string | null
}
Events flow through the system: BookAcquired, BookBorrowed, BookReturned, BookRemoved, and so on. These events contain rich information about what happened.
Now consider the different questions people want answered:
- The catalog search needs to show available books with their titles, authors, and ISBNs. It's not interested in borrowing history or physical location.
- The member dashboard (the “My Books” page) needs to show which books the member has borrowed, when they are due, and if any are overdue. It doesn't need ISBNs or physical locations.
- The statistics panel for librarians needs to know which books are most popular, average borrowing times, and trends over time. It doesn't need current availability.
- The overdue report needs borrower names, contact information, book titles, and how many days overdue. It doesn't need purchase prices or condition ratings.
- The inventory management system needs physical locations, condition ratings, and last inspection dates. It doesn't need borrower information.
Each of these is a separate Read Model, built from the same events, optimized for its specific use case.
Many Small Read Models Instead of One Big Table
Here's how these Read Models might look:
Catalog Search Read Model:
{
  bookId: string
  title: string
  author: string
  isbn: string
  isAvailable: boolean
}
Borrower Dashboard Read Model:
{
  memberId: string
  books: [
    {
      bookId: string
      title: string
      dueDate: Date
      daysOverdue: number
    }
  ]
}
Library Statistics Read Model:
{
  bookId: string
  title: string
  totalBorrows: number
  averageDuration: number
  popularityRank: number
}
Overdue Books Read Model:
{
  bookId: string
  title: string
  borrowerId: string
  borrowerName: string
  contactEmail: string
  daysOverdue: number
}
Inventory Read Model:
{
  bookId: string
  location: string
  condition: string
  lastInspectionDate: Date
}
Each Read Model
- has only the fields needed for its use case,
- can be stored in a different database if necessary (PostgreSQL for transactions, Elasticsearch for search, Redis for fast lookups),
- can be rebuilt from events at any time, and
- evolves independently of other Read Models.
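The claim that a Read Model can be rebuilt at any time can be illustrated with a small in-memory sketch of the Inventory Read Model. It keeps itself up to date event by event, and it can also be thrown away and replayed from the full history. BookAcquired and BookRemoved are named in the article; BookRelocated and BookInspected are hypothetical events invented here, as are all payload fields and the class name InventoryProjection. A real system would persist the entries in a database rather than in an object.

```typescript
// BookRelocated and BookInspected are hypothetical events for this sketch;
// all payload fields are assumptions as well.
type InventoryEvent =
  | { type: "BookAcquired"; bookId: string; location: string; condition: string }
  | { type: "BookRelocated"; bookId: string; location: string }
  | { type: "BookInspected"; bookId: string; condition: string; inspectedAt: string }
  | { type: "BookRemoved"; bookId: string };

interface InventoryEntry {
  bookId: string;
  location: string;
  condition: string;
  lastInspectionDate: string | null;
}

class InventoryProjection {
  private entries: Record<string, InventoryEntry> = {};

  // Incremental update: called once for every new event.
  apply(event: InventoryEvent): void {
    const entry = this.entries[event.bookId];
    switch (event.type) {
      case "BookAcquired":
        this.entries[event.bookId] = {
          bookId: event.bookId,
          location: event.location,
          condition: event.condition,
          lastInspectionDate: null,
        };
        break;
      case "BookRelocated":
        if (entry) entry.location = event.location;
        break;
      case "BookInspected":
        if (entry) {
          entry.condition = event.condition;
          entry.lastInspectionDate = event.inspectedAt;
        }
        break;
      case "BookRemoved":
        delete this.entries[event.bookId];
        break;
    }
  }

  // Full rebuild: discard everything and replay the history.
  rebuild(history: InventoryEvent[]): void {
    this.entries = {};
    for (const event of history) {
      this.apply(event);
    }
  }

  get(bookId: string): InventoryEntry | undefined {
    return this.entries[bookId];
  }
}

// Usage: replaying the history yields the current inventory view.
const inventoryEvents: InventoryEvent[] = [
  { type: "BookAcquired", bookId: "b1", location: "Shelf A1", condition: "new" },
  { type: "BookAcquired", bookId: "b2", location: "Shelf A2", condition: "new" },
  { type: "BookRelocated", bookId: "b1", location: "Shelf B2" },
  { type: "BookInspected", bookId: "b1", condition: "good", inspectedAt: "2024-03-01" },
  { type: "BookRemoved", bookId: "b2" },
];
const inventory = new InventoryProjection();
inventory.rebuild(inventoryEvents);
```

Because the stored state is disposable, changing the shape of this Read Model amounts to changing the apply method and replaying the events.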
The Multiplication Effect
This is where Event Sourcing shows its true strength. From a stream of events, you derive many specialized Read Models. Each is small, focused, and fast. Adding a new Read Model requires no changes to the Write Model or existing Read Models. You simply build another projection from the same events.
Need a new report? Create a new Read Model. Need to optimize a slow query? Restructure that specific Read Model without touching anything else. Need to support a new use case? Add another projection.
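As an illustration of adding a Read Model after the fact, the following sketch derives borrowing statistics by replaying events that were recorded long before anyone asked for such a report. The timestamp field at, the type names, and the function projectStatistics are assumptions; the article does not specify what the events contain. The popularity rank from the statistics example is left out to keep the sketch short.

```typescript
// Hypothetical payloads: a bookId and an ISO timestamp per event.
type LoanEvent =
  | { type: "BookBorrowed"; bookId: string; at: string }
  | { type: "BookReturned"; bookId: string; at: string };

interface BookStatistics {
  bookId: string;
  totalBorrows: number;
  averageDurationDays: number; // 0 while no loan has been completed yet
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function projectStatistics(events: LoanEvent[]): Record<string, BookStatistics> {
  // Intermediate bookkeeping that is private to this projection.
  const borrowedAt: Record<string, number> = {};
  const totals: Record<string, { borrows: number; completed: number; daysSum: number }> = {};

  for (const event of events) {
    const total = totals[event.bookId] || { borrows: 0, completed: 0, daysSum: 0 };
    totals[event.bookId] = total;
    const timestamp = Date.parse(event.at);

    if (event.type === "BookBorrowed") {
      total.borrows += 1;
      borrowedAt[event.bookId] = timestamp;
    } else {
      const start = borrowedAt[event.bookId];
      if (start !== undefined) {
        total.completed += 1;
        total.daysSum += (timestamp - start) / MS_PER_DAY;
        delete borrowedAt[event.bookId];
      }
    }
  }

  const statistics: Record<string, BookStatistics> = {};
  for (const bookId of Object.keys(totals)) {
    const total = totals[bookId];
    if (!total) continue;
    statistics[bookId] = {
      bookId: bookId,
      totalBorrows: total.borrows,
      averageDurationDays: total.completed > 0 ? total.daysSum / total.completed : 0,
    };
  }
  return statistics;
}

// Usage: the history already exists; the new report is just one more fold.
const loanHistory: LoanEvent[] = [
  { type: "BookBorrowed", bookId: "b1", at: "2024-01-01T00:00:00Z" },
  { type: "BookReturned", bookId: "b1", at: "2024-01-08T00:00:00Z" },
  { type: "BookBorrowed", bookId: "b1", at: "2024-02-01T00:00:00Z" },
  { type: "BookReturned", bookId: "b1", at: "2024-02-04T00:00:00Z" },
  { type: "BookBorrowed", bookId: "b2", at: "2024-02-10T00:00:00Z" },
];
const statisticsView = projectStatistics(loanHistory);
```

Nothing in the Write Model and none of the other projections had to change for this report to exist. The only input is the event history.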
This flexibility is the promise of CQRS. But it only materializes when you stop viewing Read Models as mirrors of your aggregates.
Why This Matters
The practical benefits are significant:
- Performance improves because each Read Model is small and specialized. Queries hit exactly the data they need, no more. Indexes can be optimized for specific access patterns.
- Flexibility increases because you can add, modify, or remove Read Models without affecting the Write Model or other Read Models. Teams can own their Read Models independently.
- Clarity emerges because each Read Model has a clear purpose. There is no ambiguity about which data is for which use case. The structure of each Read Model reflects the questions it answers.
- Independence follows because different teams can work on different Read Models without coordinating schema changes. The events are the contract, not the database tables.
Unlearning the Table
The hardest part of Event Sourcing is unlearning the mental models that have served you well in CRUD systems. Objects and tables are useful concepts, but they are not the right lens through which to understand aggregates and Read Models.
Stop asking, “What fields does my aggregate have?” Start asking, “What do I need to know to make this decision?”
Stop asking, “What table do I need for this aggregate?” Start asking, “What questions do my users need answered?”
The aggregate is your decision boundary, lean and focused. Events are your historical record of what happened. Read Models are your optimized views for specific queries.
These are three different concepts. They don't need to have the same structure. In fact, they probably shouldn't.