Relational Core, Event-Driven Soul: Evolving Zitadel for Scale

  1. The "Hard Truths" of Pure Event Sourcing
    1. The Performance Wall
    2. The Complexity Tax
  2. The Evolution: Relational Core, Event-Driven Soul
    1. How It Works Now
  3. Technical Deep Dive: The Repository Pattern
  4. What This Means for You
  5. Reliability at Scale
    1. Ready to see the difference?

When we started building Zitadel, we were purists. We chose an architecture based on Event Sourcing and CQRS because it felt like the most honest way to build a verifiable, auditable identity system. Every state change was an immutable event, creating a perfect audit trail.

It was elegant. It was conceptually powerful. And for a long time, it served us well.

But at Zitadel, our 2026 vision is built on a single, non-negotiable principle: Uncompromising Developer Experience. We believe that your identity infrastructure should be invisible—it should just work, scale indefinitely, and be easy to operate.

As Zitadel has been deployed in increasingly demanding B2B SaaS environments—handling millions of requests and complex multi-tenant hierarchies—we’ve had to be honest with ourselves. The "pure" Event Sourcing model, while great for auditing, imposes a "complexity tax" on performance and operation.

That’s why we are evolving. We are shifting our storage layer to a model that stores objects in their natural, relational form while preserving the power of events. Here is the why, the how, and what it means for you.

The "Hard Truths" of Pure Event Sourcing

An Identity and Access Management (IAM) system is, at its core, an OLTP (Online Transaction Processing) system. It needs to handle a high volume of reads and writes with millisecond latency. Authenticating a user shouldn't require replaying history.

As we scaled, we identified two critical friction points with our original architecture:

1. The Performance Wall

To get the current state of a user in a pure Event Sourced model, the system often has to fetch the user's events and recompute the state from them. While we used "projections" (read models) to mitigate this, they introduced their own latency.

Using PostgreSQL as a traditional event store meant the query optimizer struggled. Creating effective indexes for dynamic event payloads was difficult, and answering simple questions like "Which users does this admin manage?" turned into complex, distributed data problems. We found ourselves re-implementing database joins in application logic—a recipe for latency.
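
To make that concrete, here is a rough illustration (the table and column names are invented for this post, not our actual schema): against a generic event log the planner has to filter semi-structured payloads, while a relational table answers the same question with a single indexed lookup.

```go
package example

// Illustrative only: these queries use invented table and column names,
// not Zitadel's real schema.

// Against a generic event log, the planner has to dig through JSONB payloads
// that are hard to index effectively.
const managedUsersFromEvents = `
SELECT DISTINCT e.payload->>'user_id'
FROM   events e
WHERE  e.event_type IN ('user.added', 'user.grant.added')
AND    e.payload->>'managed_by' = $1`

// Against a relational table, the same question is a plain indexed lookup.
const managedUsersFromTable = `
SELECT u.id
FROM   users u
WHERE  u.managed_by = $1`
```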

2. The Complexity Tax

Complexity is the enemy of security. We noticed that pure CQRS had a steep learning curve that slowed down our Open Source community. Simple features required too much boilerplate code.

Take the simple act of checking if a username is unique. In a pure event-sourced system, you can't just ask the database. You have to query a projection, and if that projection is even slightly behind (eventual consistency), you might allow a duplicate. We found ourselves maintaining dedicated "lookup tables" just to work around this, adding unnecessary layers to what should be a solved problem.

More importantly, it made operations harder for you. "Replaying" events to fix a projection bug is a powerful concept, but in production, you just want your database to be consistent now—and you want a UNIQUE constraint to actually mean unique.
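
Here is a sketch of how the relational model changes that (the schema, names, and error handling are illustrative assumptions, not our actual code): with the username stored as a plain column, a UNIQUE constraint makes the database the single arbiter of uniqueness.

```go
package example

import (
	"context"
	"errors"

	"github.com/jackc/pgx/v5"
	"github.com/jackc/pgx/v5/pgconn"
)

// ErrUsernameTaken is returned when the database rejects a duplicate username.
var ErrUsernameTaken = errors.New("username already taken")

// createUser inserts the user row directly; a UNIQUE (org_id, username)
// constraint on the table rejects duplicates at write time, no projection or
// lookup table involved.
func createUser(ctx context.Context, tx pgx.Tx, id, orgID, username string) error {
	_, err := tx.Exec(ctx,
		`INSERT INTO users (id, org_id, username) VALUES ($1, $2, $3)`,
		id, orgID, username)

	var pgErr *pgconn.PgError
	if errors.As(err, &pgErr) && pgErr.Code == "23505" { // unique_violation
		return ErrUsernameTaken
	}
	return err
}
```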

The Evolution: Relational Core, Event-Driven Soul

We aren't abandoning the principles that make Zitadel special. Auditability, Time-Travel Forensics, and Event-Driven Integration are here to stay.

Instead, we are moving to a hybrid approach: Write to relational tables first, then generate events.

How It Works Now

  1. Relational Source of Truth: The current state of a User, Organization, or Project is stored as a standard row in a PostgreSQL table. This leverages the full power of the relational engine for fast, indexed queries.
  2. Event Generation: Within the same transaction, we generate the corresponding event and write it to our immutable event log. Since we are already writing to the database, appending the event in the same transaction has close to no performance impact compared to the overhead of our previous distributed coordination.
  3. No More "Eventually Consistent" Headaches: Because we write the state and the event atomically, the API response is immediately consistent. You don't have to wait for a projection to catch up.

This gives you the speed of a traditional database with the audit trail of an event store.
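
As a minimal sketch of that write path (the table names, the event type, and the database access code are simplified assumptions, not our actual implementation), an update to a user and the matching audit event share a single transaction:

```go
package example

import (
	"context"
	"encoding/json"

	"github.com/jackc/pgx/v5/pgxpool"
)

// updateEmail writes the new state and the corresponding event atomically:
// either both are committed or neither is, so reads are immediately
// consistent and the audit log stays complete.
func updateEmail(ctx context.Context, pool *pgxpool.Pool, userID, email string) error {
	tx, err := pool.Begin(ctx)
	if err != nil {
		return err
	}
	defer tx.Rollback(ctx) // harmless if the transaction was already committed

	// 1. Update the relational source of truth.
	if _, err := tx.Exec(ctx,
		`UPDATE users SET email = $1, updated_at = now() WHERE id = $2`,
		email, userID); err != nil {
		return err
	}

	// 2. Append the event to the immutable log in the same transaction.
	payload, err := json.Marshal(map[string]string{"email": email})
	if err != nil {
		return err
	}
	if _, err := tx.Exec(ctx,
		`INSERT INTO events (aggregate_id, event_type, payload) VALUES ($1, $2, $3)`,
		userID, "user.email.changed", payload); err != nil {
		return err
	}

	return tx.Commit(ctx)
}
```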

Technical Deep Dive: The Repository Pattern

To execute this, we are refactoring our storage layer using the Repository Pattern. This decouples our business logic from the underlying storage, allowing us to implement optimized SQL queries without cluttering the codebase.

  • Simplified Contributions: For our open-source contributors, this is a massive win. Adding a field to a user profile no longer requires understanding complex event reducers. You just add a column and update the repository.
  • Gradual Migration: We are rolling this out safely. Starting with Zitadel v5, we are migrating critical paths like the Session API and Settings to this new model.
  • Feature Flags: We have implemented strictly scoped feature flags (see Issue #10332) that allow us to switch between the old and new storage logic. This ensures that existing deployments can migrate gradually with zero downtime.
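
To give a feel for the shape of this (the interface and method names below are illustrative, not our actual API), business logic depends only on a narrow contract, and providing another backing store, including a trivial in-memory one for tests, just means implementing that contract:

```go
package example

import (
	"context"
	"errors"
)

// User is a trimmed-down example entity.
type User struct {
	ID       string
	Username string
}

// ErrNotFound is returned when no user matches the given ID.
var ErrNotFound = errors.New("user not found")

// UserRepository is the narrow contract the business logic depends on.
// Both the legacy event-sourced store and the new relational store can
// satisfy it, and a feature flag decides which one is wired in.
type UserRepository interface {
	ByID(ctx context.Context, id string) (*User, error)
	Create(ctx context.Context, u *User) error
}

// memoryUsers is an in-memory implementation, handy for unit tests.
type memoryUsers struct {
	users map[string]User
}

func (m *memoryUsers) ByID(_ context.Context, id string) (*User, error) {
	u, ok := m.users[id]
	if !ok {
		return nil, ErrNotFound
	}
	return &u, nil
}

func (m *memoryUsers) Create(_ context.Context, u *User) error {
	if m.users == nil {
		m.users = make(map[string]User)
	}
	m.users[u.ID] = *u
	return nil
}
```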

"I love the pattern; even the storage is easy to implement and test." — Zitadel Engineering Team

What This Means for You

This isn't just an architectural cleanup; it’s a direct upgrade to your production environment.

  • 🚀 Lower Latency: Login flows and permission checks become standard SQL lookups. No replaying, no computing.
  • 📉 Operational Simplicity: Managing Zitadel becomes as simple as managing any standard Go + PostgreSQL application. Vacuuming, indexing, and replicating data work exactly as your DBAs expect.
  • 🔮 Future-Ready Analytics: By decoupling state from events, we can optimize the events table purely for OLAP (Online Analytical Processing). This opens the door for powerful security analytics and AI-driven threat detection in the future without impacting login performance.

Reliability at Scale

We are building Zitadel to be the default choice for Cloud-Native Identity. Whether you are running a multi-tenant B2B SaaS on Kubernetes or a high-traffic consumer app, you need a system that adapts to your scale.

This evolution doubles down on our promise: Simplicity for the developer, control for the enterprise.

A Note on Compliance & GDPR:

Decoupling state from events also simplifies data governance. With the new model, we can introduce retention policies per event type in the audit table. This makes handling GDPR deletion requests or defining specific data retention periods significantly easier, as you can now manage audit logs separately from the active user state.
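
As a purely hypothetical sketch (this is not a shipped feature, and the schema is invented for illustration), such a policy could prune old audit events of one type without touching the active user rows:

```go
package example

import (
	"context"
	"time"

	"github.com/jackc/pgx/v5/pgxpool"
)

// pruneAuditEvents deletes audit events of one type that are older than the
// retention window. The users table, i.e. the active state, is untouched.
func pruneAuditEvents(ctx context.Context, pool *pgxpool.Pool, eventType string, retention time.Duration) (int64, error) {
	cutoff := time.Now().Add(-retention)
	tag, err := pool.Exec(ctx,
		`DELETE FROM events WHERE event_type = $1 AND created_at < $2`,
		eventType, cutoff)
	if err != nil {
		return 0, err
	}
	return tag.RowsAffected(), nil
}
```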

Ready to see the difference?

We are rolling out these changes in the upcoming releases. We’d love to hear your thoughts on this architectural shift.
