There’s an old Carmack .plan about this sort of pattern, too: https://github.com/ESWAT/john-carmack-plan-archive/blob/master/by_day/johnc_plan_19981014.txt
Interesting! I like his point about the progress of time as another form of input. That way, your application becomes deterministic and very debugging-friendly.
I’ve seen something similar mentioned in the context of either Event Sourcing (ES) or Domain-driven design (DDD): The clock is an external actor to your system. Can’t find a link right now though.
Would “Command sourcing” be a fitting name for it? It looks interesting. What kind of problem would this be a good solution for?
I think “Command sourcing” is a great name; it brings “Event sourcing” to mind as both analogy and contrast. Perhaps it’s a better name than “Memory Image Pattern”.
The kinds of problems MIP is a good solution for include:
Complex business logic
Tight deadlines
Low latency or realtime requirements
Frequently changing requirements that traditionally would result in laborious database schema changes in production
Frequently changing requirements that make it hard to know what kind of data is useful
A need to handle historical data and metadata
Any combination of the above
But it’s probably not a good solution if you have any of the following:
A compute-intensive or very data-intensive application
Very interesting article!
Using a WAL log to store commands to the system is really clever. Also, doing serial command processing feels like an elegant way to sidestep a bunch of concurrency issues. Combine the two and you’ve got full ACID compliance!
However, one thing I haven’t figured out is how code deploys would work? Some runtimes support hot code upgrades out of the box (Erlang, a bunch of lisps), but what would the process look like for a runtime that needs to be restarted to reload code? Would you create a snapshot (maybe asynchronously?), restart and read the snapshot back?
In a traditional setup you’d typically have your application servers take turns to get upgraded so you always have some machine handling incoming traffic, but in this pattern you only have one machine, so while it’s upgrading the system is down I guess?
(For a runtime that can boot fast enough it might not cause significant downtime though?)
Great question! You can upgrade the command (write) interface without downtime, using two machines A and B:
A is serving the command interface with the old application logic.
You start up a new machine B with the new application logic. The application data can have a wildly different structure, but the schema for the command log is the same.
B reprocesses the log from the beginning while A continues to serve command requests and write to the log.
You decide on some future command index N where control is to switch from A to B and communicate that to both machines.
Starting at command index N, machine A stops processing the commands and instead proxies them to B.
Approximately immediately thereafter, B will have processed command N-1 from the log and will begin processing incoming commands.
For the query (read) interface it’s even easier since you can actually have several machines processing queries and do a normal rolling upgrade.
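For illustration, here is a minimal TypeScript sketch of the write-side switchover described in the steps above; the names (UpgradeRouter, proxyToB, switchIndex) are mine, not from the article:

```typescript
// Hypothetical command router on machine A during an upgrade.
type Command = { index: number; payload: unknown };

interface Handler {
  handle(cmd: Command): Promise<unknown>;
}

class UpgradeRouter {
  constructor(
    private readonly local: Handler,      // old application logic running on A
    private readonly proxyToB: Handler,   // forwards the command to machine B
    private readonly switchIndex: number, // the agreed-upon command index N
  ) {}

  // A keeps serving commands below N; from N onward it only proxies to B,
  // which by then has replayed the log up to N-1 and is ready to take over.
  async route(cmd: Command): Promise<unknown> {
    if (cmd.index < this.switchIndex) {
      return this.local.handle(cmd);
    }
    return this.proxyToB.handle(cmd);
  }
}
```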
This seems really naive. It is not trivial to implement low-latency and high-throughput write-ahead logging and checkpointing from scratch. The systems which have managed to do so efficiently are commonly known as “databases”.
Databases also try to do so while processing several write requests concurrently and remaining scalable. Indeed, this is far from trivial, and traditional RDBMSs have been battle tested and refined over several decades to achieve this.
If your application’s input rate is low enough that you can accept limited horizontal scalability and a lack of concurrency, then write-ahead logging and checkpointing really do become trivial.
Two things that struck me about this pattern after thinking about it for a while:
How would you compare this to Redux? It seems there are a lot of touch points, except that state in React/Redux is ephemeral.
Wouldn’t blocking I/O temporarily stop the processing of commands? This is something I’d worry about if I were making e.g. a bunch of calls to REST APIs. I guess async I/O is a way to handle this, but then you wouldn’t be able to maintain linearity across I/O boundaries. (I guess the part about reactors being able to emit commands sort of hinted at this?)
Question 1:
The architecture is very similar to Redux.
The reasons for using Redux for a UI vs using MIP for a server-side application might be partly different since the constraints and trade-offs are different, but a lot of the advantages are also similar.
I’ve actually found it advantageous to combine server-side MIP with a similar pattern client-side.
Share the business logic code (the “reducers”) between client and server.
When the UI initializes, get the latest state from the server.
Send UI actions as commands to the server-side MIP application, and stream accepted commands to the client to update the UI state.
Suddenly you have solved both persistence and realtime collaboration with one move.
If at any point you are dissatisfied with the server round-trip delay for UI updates, you already have the perfect architecture for solving that:
Let the client calculate the latest state under the assumption that the commands sent to the server will be accepted, but also save the state implied by the latest command the server has accepted.
Backtrack as needed on timeouts or when the server contradicts the assumption.
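A rough sketch of that optimistic-update scheme, assuming a Redux-style reducer shared between client and server (all names here are illustrative):

```typescript
// Shared, pure reducer: the same code would also run in the server-side MIP application.
type State = { items: string[] };
type Command = { id: string; type: "addItem"; text: string };

const reducer = (state: State, cmd: Command): State =>
  cmd.type === "addItem" ? { ...state, items: [...state.items, cmd.text] } : state;

class OptimisticStore {
  private pending: Command[] = [];

  constructor(private confirmed: State) {}

  // Apply locally right away, assuming the server will accept the command.
  sendOptimistically(cmd: Command, send: (c: Command) => void): void {
    this.pending.push(cmd);
    send(cmd);
  }

  // The server streams back accepted commands in log order.
  onAccepted(cmd: Command): void {
    this.confirmed = reducer(this.confirmed, cmd);
    this.pending = this.pending.filter((p) => p.id !== cmd.id);
  }

  // Backtrack: drop the rejected command; the UI state below recomputes without it.
  onRejected(cmdId: string): void {
    this.pending = this.pending.filter((p) => p.id !== cmdId);
  }

  // UI state = confirmed server state + optimistically applied pending commands.
  get uiState(): State {
    return this.pending.reduce(reducer, this.confirmed);
  }
}
```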
Question 2:
If you make a bunch of calls to REST APIs, you never wait for the replies before processing the next command.
Commands need to be deterministic.
Emitting commands from reactors is a way to make something indeterministic look deterministic:
Operations involving calling out to external services become less trivial. Since they cannot be considered deterministic, they can’t be implemented as a projector. Rather, they will have to become reactors that may or may not produce more commands that are fed back to the application.
Example:
A command is sent to the application to perform a ping against some server.
The result of the ping is not deterministic and therefore cannot be used to calculate the next application state, so doing the ping inside a projector is useless.
You do it in a reactor, and when you have the result, you emit a command back into the application that basically says “the result of the ping was X”.
This result-declaration-command can be used by projectors.
When the application is restarted, the actual ping is not performed since reactors are ignored by the reprocessor, but the perceived result of the ping, saved in a command, is the same.
Doing this asynchronously is perfectly fine; if the external service call takes longer, it just means that the result-declaration-command might arrive later in the command sequence, but determinism is still preserved.
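To make that ping example concrete, here is a small TypeScript sketch under the same assumptions (the command names and the doPing stand-in are hypothetical):

```typescript
type Command =
  | { type: "pingRequested"; host: string }
  | { type: "pingResultRecorded"; host: string; ok: boolean };

type State = { lastPingOk: Record<string, boolean> };

// Projector: pure and deterministic. It never performs the ping itself,
// it only folds the recorded result into the application state.
function project(state: State, cmd: Command): State {
  if (cmd.type === "pingResultRecorded") {
    return { lastPingOk: { ...state.lastPingOk, [cmd.host]: cmd.ok } };
  }
  return state;
}

// Stand-in for the real, nondeterministic external call.
async function doPing(host: string): Promise<boolean> {
  return true; // e.g. an HTTP health check in a real system
}

// Reactor: performs the side effect and feeds the observed result back
// into the application as a new command. Skipped entirely on replay.
async function react(cmd: Command, emit: (c: Command) => void): Promise<void> {
  if (cmd.type === "pingRequested") {
    const ok = await doPing(cmd.host);
    emit({ type: "pingResultRecorded", host: cmd.host, ok });
  }
}
```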
I believe calling external services was the only reason you’d worry about blocking I/O.
Still, let’s address that part:
Decide what kind of durability guarantees you want, and make a fair comparison between MIP and another option.
Processing commands in sequence does not mean that you need to do everything on a single thread.
For example, each of the following tasks can run on its own thread:
Assign an index number to incoming commands
Serialize incoming commands into a log buffer
Write the log buffer to file
Run projectors/reducers to create the new state
Run reactors, possibly on several threads
Send replies to clients, possibly on several threads, optionally (depending on what durability guarantees you want) after making sure the corresponding command is persisted.
Note that the only operation that is I/O-dependent here is writing to disk sequentially.
Decades of database performance tuning have had “minimize disk seek operations” as a mantra, and here we have approximately no seeks at all.
It’ll run fast.
In practice you’re unlikely to need more than one thread.
The above multi-thread model is just a way out if you ever find yourself needing more compute-heavy operations that take an amount of time within the same order of magnitude as disk writes.
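For what it’s worth, a heavily simplified single-threaded version of that loop might look like this in TypeScript (Node’s file API; the command type and projector are toy placeholders):

```typescript
import { appendFileSync } from "node:fs";

type Command = { type: "increment" };
type State = { counter: number };

// Projector: pure function from (state, command) to the next state.
const project = (s: State, c: Command): State =>
  c.type === "increment" ? { counter: s.counter + 1 } : s;

let state: State = { counter: 0 };
let nextIndex = 0;

// One command at a time: assign an index, append it to the log, update the
// state, then reply. Only the log append touches the disk, and it is strictly
// sequential, so there are essentially no seeks.
function handleCommand(cmd: Command, reply: (s: State) => void): void {
  const indexed = { index: nextIndex++, ...cmd };
  appendFileSync("commands.log", JSON.stringify(indexed) + "\n"); // a real system would also fsync
  state = project(state, cmd);
  // reactors (side effects) would be triggered here, possibly emitting new commands
  reply(state);
}
```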
Thanks for elaborating! Good point about single-threading not being an explicit requirement, as long as commands are synchronous.
This only works if none of your commands have side effects.
Commands can have side effects, but the logic that causes side effects needs to be separated from the logic that updates the application state. I call these pieces of logic “Reactors” and “Projectors”, respectively. (I should probably have clarified that Projectors must not have side effects.) With this separation it works, because the Reprocessor can rebuild the application state without causing side effects a second time.
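A minimal sketch of what that separation buys you on restart, assuming the same kind of pure project function as in the ping example above:

```typescript
// Reprocessor: rebuilds the in-memory state by running projectors only.
// Reactors are deliberately skipped, so no side effects (pings, emails,
// external calls) are repeated while replaying the command log.
function replay<S, C>(
  initial: S,
  log: Iterable<C>,
  project: (state: S, cmd: C) => S,
): S {
  let state = initial;
  for (const cmd of log) {
    state = project(state, cmd); // pure, so safe to run any number of times
  }
  return state;
}
```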
Redis uses this log storage method.
As does pretty much every database which actually guarantees the durability of committed operations.
This is a different name from what it used to have in the old Gang of Four book (I know, I know), but I can’t recall it because there were better things to expend neurons on (like the lyrics to Barbie Girl, for some reason).
It’s mostly useful to make undo/redo much easier.
As a state persistence mechanism though it’s really not that great, because it results in excessive load and launch times, as well as unbounded memory growth.
The solution to these problems is coalescing commands in memory and flattening state on storage, both of which get you to what this article is trying to avoid.
Commands don’t need to be kept in memory after processing, so you get unbounded growth of disk data but not memory use. If disk data growth is unacceptable you can coalesce the command log but indeed then you lose many of the benefits of MIP, as detailed under Command log storage and system prevalence.
The load/launch times can become an issue, but this is more easily addressed without losing the benefits of MIP. Saving snapshots of application state to disk without deleting commands will speed up load/launch; see the heading Startup time.
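A minimal sketch of that snapshotting idea (file names and helpers are assumptions, not from the article):

```typescript
import { readFileSync, writeFileSync } from "node:fs";

// Periodically persist the current state plus the index of the last command
// it reflects. Commands are never deleted, so the full history is still there.
function saveSnapshot<S>(state: S, lastIndex: number): void {
  writeFileSync("snapshot.json", JSON.stringify({ lastIndex, state }));
}

// On startup, load the snapshot and replay only the commands after it,
// instead of reprocessing the whole log from the beginning.
function loadState<S, C>(
  initial: S,
  readLogFrom: (index: number) => C[],
  project: (state: S, cmd: C) => S,
): S {
  let state = initial;
  let from = 0;
  try {
    const snap = JSON.parse(readFileSync("snapshot.json", "utf8"));
    state = snap.state as S;
    from = snap.lastIndex + 1;
  } catch {
    // no snapshot yet: replay everything
  }
  return readLogFrom(from).reduce(project, state);
}
```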