
Comment by dagss

1 year ago

I feel event sourcing is a real-world, pragmatic approach to the declarative programming that this paper advocates.

For state changes you add events to the database to describe something that happened. Any question you may need an answer for / business decision you want to make can be answered by querying the events.
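A minimal sketch of that idea in Python, assuming a toy in-memory log. The `Event`, `EventLog`, and `balance` names and the "deposited"/"withdrawn" event kinds are made up for illustration, not taken from any particular event store:

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    kind: str      # hypothetical event kinds: "deposited" / "withdrawn"
    account: str
    amount: int    # cents
    at: datetime


class EventLog:
    """Append-only store: events are added, never updated or deleted."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def query(self, predicate):
        return [e for e in self._events if predicate(e)]


def balance(log, account):
    # A "read model": derived entirely by folding over the stored events.
    total = 0
    for e in log.query(lambda e: e.account == account):
        if e.kind == "deposited":
            total += e.amount
        elif e.kind == "withdrawn":
            total -= e.amount
    return total


log = EventLog()
now = datetime.now(timezone.utc)
log.append(Event("deposited", "alice", 10_000, now))
log.append(Event("withdrawn", "alice", 2_500, now))
log.append(Event("deposited", "bob", 700, now))
print(balance(log, "alice"))  # 7500
```

Nothing here stores a balance; it is recomputed from the events each time, which is the part a real system would want tooling (caching, projections) to handle.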

The problem at the moment is that while event sourcing is excellent at reducing the accidental complexity of implementing business rules, there is little standard or commonly used tooling around it, so you end up with lots of accidental complexity on that end.

An example would be a database designed not as a CRUD store but to store events, manage read models, and manage the computation of projections, while still being suitable for OLTP workloads. At a minimum, it would need very strong support for using any kind of construct in materialized views (since in a sense the entire business logic is written as a "materialized view" when doing event sourcing).

Event Sourcing is a nice card to have in your hand, but it should not be a goal in itself. A yard is 3 feet. The atomic mass of Hydrogen is 1.008 and its symbol is "H". These will never change. If someone came to me and said "your 'Chemical' table is not event-sourced. We're doing event-sourcing here. You need to change it." I would tell them to get lost. Why the hell would you have an "An Element's Mass was Modified" event stored in your database? Unless you're developing for CERN or NREL or something, just don't.

On the other hand, having a bank account table with a single field for someone's money is clearly not enough. You absolutely should be tracking every transaction that has changed that account's value. Do you need to track every other possible change to that account? Like, whether they want paper or electronic mail? No, probably not.

"Event sourcing" is a way to refactor a domain model -- take a statement and break it into a sequence of statements that "add up" to the original statement.

"Add up" is key here. When you break "AccountBalance" into "Transaction", it's clear how to "add up" transactions to recreate the original account balance. But that's not your goal, necessarily! The reason why this tends to make better domain models is exactly because you have to think about "adding up" your domain models, and what that means. "Adding up" is an associative, probably-commutative, binary operation with identity. Note that that means your domain MUST have a "zero transaction". ALL of your events that you event source need a "zero event". If you cannot come up with the "zero" of an event, then you should not be breaking into events! The whole point is to be able to define the monoid over it, which requires identity.
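To make "adding up" concrete, here is a rough Python sketch of a monoid over transactions. The `Transaction` type, its `combine` method, and the `ZERO` constant are hypothetical names chosen for this example:

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Transaction:
    amount_cents: int  # positive = credit, negative = debit

    def combine(self, other):
        # Integer addition, so this is associative and commutative.
        return Transaction(self.amount_cents + other.amount_cents)


ZERO = Transaction(0)  # the "zero transaction", i.e. the identity element


def add_up(transactions):
    # Starting the fold from ZERO means an empty history is still well defined.
    return reduce(Transaction.combine, transactions, ZERO)


history = [Transaction(10_000), Transaction(-2_500), Transaction(300)]
print(add_up(history))  # Transaction(amount_cents=7800)
print(add_up([]))       # Transaction(amount_cents=0)
```

Without `ZERO`, `add_up([])` would have nothing to return, which is one practical reason the identity matters.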

So instead of taking event sourcing as an end goal, make your goal this: think about the operations that make sense on your domain, like accumulating accounts. What else can you accumulate? Can you add Employees together? Not really -- you can group them, into departments and events and meetings. Is that grouping associative and commutative? Sure -- it's just Set Union. Is there anything about Employees you can add together? Well, their salaries. In fact for data analysis, employee salaries are an important cost that you probably want to throw in a data cube. Define a Monoid over Employee salaries.
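A small Python sketch of those two operations, grouping as set union and salaries as a sum. The employee names, salary figures, and group names are invented for illustration:

```python
from functools import reduce

# Hypothetical data: employee name -> annual salary
salaries = {"ana": 90_000, "ben": 75_000, "chen": 110_000}

platform = frozenset({"ana", "ben"})   # a department
on_call = frozenset({"ben", "chen"})   # a rotation
offsite = frozenset({"chen"})          # an event


def union_all(groups):
    # Set union is associative and commutative; frozenset() is its identity.
    return reduce(frozenset.union, groups, frozenset())


def payroll(group):
    # Integer addition with identity 0 (the sum over an empty group).
    return sum(salaries[name] for name in group)


everyone = union_all([platform, on_call, offsite])
print(sorted(everyone))   # ['ana', 'ben', 'chen']
print(payroll(everyone))  # 275000
```

One thing this makes visible: the two monoids do not compose naively. `payroll(platform) + payroll(on_call)` counts "ben" twice, so salary totals have to be taken over the union rather than added group by group when groups overlap.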

What other operations make sense on your data? Close, open, start, end, group, buy, sell, move, rotate, add, multiply, concatenate, join, reverse, inverse, combine, merge, undo, redo, fill, empty, saturate, fix, break, import, export, report, validate. Are they associative, commutative, distributive, invertible? Do they have identity? Event sourcing is such a tiny part of exploring this world. And it's worth exploring.

  • I'm not arguing that event sourcing should be done for its own sake, so I don't really want to disagree with you; but that said your post doesn't perfectly resonate with me either.

    When you write a typical backend system, the desired function of the system is to interact with the external world. Without I/O the system may as well not exist.

    Input is a desire from someone that something be done or recording that something happened. Such input changes the data recorded, or appends what the system should know / has seen. This is an "event".

    All input can be framed as being an event. But "an element's mass was modified" is not an event ... it doesn't describe someone or something giving input to our system.

    The algebraic view on things you take seems to be treating the system at a different level than what I think about as event sourcing.

    Neither "an element's mass was modified" nor "sell" nor "transaction", as you mention them, is a realistic event. An event is "User U clicked a button in the web console to Sell share S at time T". Implementing the effects of that event -- computing a specific read model resulting from appending the event to the stored set of events -- may well be best done by some algebra like you suggest, but that seems like another topic.

    You seem to talk about models for computing and transforming state. I talk about I/O vs data storage.

    • > All input can be framed as being an event.

      Sure, it could be, but is it useful to do that? If I stand up and shout "The price of a Banana is $4 per bushel!", you could record my voice and upload it as a raw wave file. That's the rawest "input event" you can come up with. Or you could write down "some random dude said that bananas cost $4 around 4:30 pm and I'm not sure whether I believe him or not". That's not the "raw input", it's been transcribed and modified and annotated. Yet it's almost certainly more useful to your system, and it's kinda like event sourcing. Kinda.

      The problem with worrying about whether something is "input" or "output" or "internal" is that you can just move the dotted line anywhere around your system to change those. If you break a monolith into independent reusable building blocks, those building blocks are going to have a completely different idea of what counts as input and output. But who cares? You're not changing any fundamental truth about how the domain works. Your domain model should really be independent of worrying about what's "input" and "output". Those lines move all the time. Instead think about what operations make sense to do with your data, and then think about the mathematical properties of those operations.

      > But "an element's mass was modified" is not an event ... it doesn't describe someone or something giving input to our system.

      Sure it does. Someone gave you the input that a particular element has a particular mass. How is that not input? How else did you get that data?

      > The algebraic view on things you take seems to be treating the system at a different level than what I think about as event sourcing.

      This is my exact point. You should think about event sourcing this way, because that's the only reason it's useful: it's accidentally a source of important "domain algebra" that you otherwise might miss. But there's lots of other important "domain algebra" that you are still missing, and they don't necessarily look like event sourcing.

      > An event is "User U clicked a button in the web console to Sell share S at time T".

      But surely that's not what you're storing in your system! That would be an extreme coupling between the concept of "selling shares" and "clicking a button". Those are completely unrelated ideas! Why would you want to tightly couple them!? If that's what you think event sourcing is, sorry to be blunt, but you have very badly misunderstood it.
