I found it kind of weird how a lot of this was like “hey, people were talking about modules 70 years ago!”. People were also talking about process-oriented (essentially service-oriented) architecture in the 50s, 60s, and 70s - e.g. the actor model, Simula 67, and later Erlang. Microservices are more recent, but microservice architecture is basically just SOA with some patterns and antipatterns defined.
There’s a great section in Armstrong’s thesis, “Philosophy”, that has a bit to say on the topic of modules vs. processes (which map, logically, to services) [0]:
The essential problem that must be solved in making a fault-tolerant software system is therefore that of fault-isolation. Different programmers will write different modules, some modules will be correct, others will have errors. We do not want the errors in one module to adversely affect the behaviour of a module which does not have any errors.
To provide fault-isolation we use the traditional operating system notion of a process. Processes provide protection domains, so that an error in one process cannot affect the operation of other processes. Different programmers write different applications which are run in different processes; errors in one application should not have a negative influence on the other applications running in the system.
His thesis is rife with excellent citations about the process abstraction and its power.
Much of this is to say that, yes, modules are great, but they do not go far enough. Modules do not allow for isolation - processes do. By all means, use modules, but understand their limitations.
Modules do not isolate your address space. You may share references across modules to data that those modules then mutate independently - this creates complex, unsynchronized state.
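A contrived sketch of that failure mode - the module and function names here are made up purely for illustration:

    // state.ts - one module exports a mutable object
    export const cart = { items: [] as string[], total: 0 };

    // pricing.ts - a second module mutates it in place
    import { cart } from "./state";
    export function applyDiscount(pct: number): void {
      cart.total *= 1 - pct; // no lock, no ownership, no copy
    }

    // checkout.ts - a third module mutates the same reference
    import { cart } from "./state";
    export function addItem(name: string, price: number): void {
      cart.items.push(name);
      cart.total += price; // interleaves with applyDiscount, so call order now matters
    }

A process boundary would force both writers to go through messages instead of a shared reference.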
Modules do not isolate failures. If I pass a value to a function and that function blows up, how far do I have to roll back before that value can be considered sane? Is the value sane? Was it modified? Is there an exception? A fault? All sorts of issues there.
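To make the rollback question concrete (hypothetical code, nobody’s real API):

    interface Order { items: string[]; total: number }

    function applyCoupon(order: Order, code: string): void {
      order.total -= 5;                // the mutation happens first...
      if (!/^[A-Z0-9]+$/.test(code)) {
        throw new Error("bad coupon"); // ...then the failure
      }
    }

    const order: Order = { items: ["book"], total: 20 };
    try {
      applyCoupon(order, "not valid!");
    } catch {
      // order.total is now 15. The caller can't tell from the exception
      // whether the mutation happened, so "is the value sane?" has no
      // cheap answer - you need conventions, or process isolation.
    }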
Joe also talks a lot about the physics of computing, how violating causality is problematic, state explosions, etc. It all fits into the philosophy.
Anyway, yes, splitting a service into two processes incurs overhead. This is factually the case. That’s also why Erlang services were in-memory - but that’s obviously not microservice architecture, which is very clearly built around communication protocols that traverse a network. No question, you will lose performance. That’s why you don’t create a microservice for, say, parsing JSON - just do that in process. Hey, go nuts, even use a module! You… can do that. Or honestly, yeah, parse JSON in another process - that might make sense if you’re dealing with untrusted input.
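A hedged sketch of what “parse JSON in another process” can look like in Node, where a pathological payload crashes or times out a disposable child instead of the main service (the timeout and buffer limits are arbitrary numbers, not recommendations):

    import { execFile } from "node:child_process";

    // Run JSON.parse in a throwaway node process; the parent only ever
    // sees output the child successfully round-tripped.
    function parseUntrusted(payload: string): Promise<unknown> {
      return new Promise((resolve, reject) => {
        const child = execFile(
          "node",
          ["-e", "let d='';process.stdin.on('data',c=>d+=c).on('end',()=>process.stdout.write(JSON.stringify(JSON.parse(d))))"],
          { timeout: 1000, maxBuffer: 1024 * 1024 }, // kill runaway parses
          (err, stdout) => (err ? reject(err) : resolve(JSON.parse(stdout)))
        );
        child.stdin?.end(payload);
      });
    }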
I found it ironic that microservices are being shown as somehow mistaken with regard to the “fallacies of distributed computing” when the actual background is in fact dealing with unreliable systems. See “Why Do Computers Stop and What Can Be Done About It?” (1985) [1]. Notably, that entire paper boils down to “if you want to improve your MTBF you need a distributed system with these specific primitives” (cough processes cough cough).
You can do that by embracing standalone processes hosted in Docker containers, or you can do that by embracing standalone modules in an application server that obey a standardized API convention, or a variety of other options
Weird emphasis on “or”, because all of these work well together. Please, by all means, use well-defined and isolated modules in your microservices; everyone will be happier for it. Also, don’t just split out random shit into services, especially if the overhead of the RPC is going to matter - again, there’s nothing “microservicy” that would make you do such a thing.
Microservice architecture, like all methodologies, has its benefits and downfalls.
To be honest, I didn’t get much out of this article. It’s basically “modules good, microservices… something?”. The only criticism seems to be that distributed systems are complicated, which is not really interesting, since “distributed systems” is a whole massive domain and “networks have latency” is about as tiny a drop of water in that ocean as one can get.

[0] https://erlang.org/download/armstrong_thesis_2003.pdf

[1] https://www.hpl.hp.com/techreports/tandem/TR-85.7.pdf
This article is useful when your boss’s boss starts asking about microservices and you need to send them something that says “microservices have tradeoffs and here is an alternative.”
If you already understand that, you’re well past needing to read this article yourself.
It’s long and easy to digest, which is great when you’re sending it to someone removed from the day-to-day “it depends” of software engineering.
I love the timing of this post, because 2023 is the year we buy out of our years-long love affair with AWS Lambda and start moving that functionality into our monolithic services.
I definitely like the whole idea of building your app as if it were a monolith and then deploying it to a bunch of lambdas in production; I just wish it were a little easier to predict certain things about the production environment from within development. The serverless model, where a lot of that stuff is handled for you, is truly a breeze to deploy and maintain over time.

In the case of Next.js, I find that building and deploying is the only real way to find out how something is going to behave in the real world. Sometimes that’s because of something I missed, but I’ve also had issues that only appeared when I deployed to Vercel and were specifically due to how they chose to deploy my API endpoints. I know this because I could only reproduce them on Vercel; nothing broke when I ran next dev or next build && next start on my local machine, since locally the app always has access to the full codebase. On Vercel, it turns out you don’t get all of that with your deployed API endpoints, so unfortunately I had a bit of a time getting it all working properly. I chose a different (much simpler) path to do what I needed to do, but it was still a kick in the pants in terms of how I thought this stuff was working versus how it actually works in the real world.
As a former Rails dev, I totally see the benefit of “The Majestic Monolith”, but I also remember dealing with some really interesting deployment scenarios due to how big those monoliths tended to get. Being able to have the same, or at least incredibly similar, developer experience without having to deal with the deployment mess has been a huge win for Next.js, Redwood.js, and other full-stack React frameworks.

But while all of the deployment tooling has been moving forward at a rapid pace, it seems like when you run the app in development, you’re not really getting the full picture of how it is going to behave in production. With Rails, I remember there being a lot more parity (at the framework level) between production and development, with a few configuration settings enabling prod-level behavior and that’s about it.

It definitely feels like Next.js would benefit from some guard-rails around the pitfalls of deploying, such as warning you when the amount of code imported within an API route exceeds the size limit for serverless functions (not sure if it does this now, but it didn’t last time I checked), and potentially simulating things like filesystem access on the lambdas your routes run on, so you know what to look out for while developing. These aren’t easy problems to solve, and I get that… but I feel like solving them would make this whole “build a monolith, deploy to serverless functions” model a lot easier for developers to wrap their heads around. Right now, since all of it is truly in its infancy, some of those problems kind of surprise you when you take these apps to production for the first time. It’s not just me, either: my teammates (experienced Next/React devs) have struggled with it even when trying out Next.js v13.
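For what it’s worth, the bundle-size warning seems approximable today as a post-build script. A rough sketch - the output directory and the size limit are assumptions on my part, not anything Next.js or Vercel document this way:

    // check-bundle-size.ts - hypothetical post-build guard-rail: warn
    // when a compiled server bundle approaches a serverless size limit.
    import { readdirSync, statSync } from "node:fs";
    import { join } from "node:path";

    const LIMIT_BYTES = 50 * 1024 * 1024; // assumed cap, check your platform

    function walk(dir: string): void {
      for (const entry of readdirSync(dir)) {
        const path = join(dir, entry);
        const stats = statSync(path);
        if (stats.isDirectory()) walk(path);
        else if (stats.size > LIMIT_BYTES * 0.8) {
          console.warn(`${path} is ${(stats.size / 1e6).toFixed(1)} MB - close to the limit`);
        }
      }
    }

    walk(".next/server"); // assumed location of compiled route bundles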
While your scenario is an interesting one, it’s not quite what we’re dealing with here. We had a team in the past (no longer with the company) build an entire HTTP RPC API with API Gateway and Lambdas. Deploying each Node.js lambda was a headache, and you ended up bundling a node_modules build with each individual lambda (this was before Layers). Each lambda reacted solely to an API Gateway endpoint. It just feels like the wrong tool for the job, considering this was the API for a web application that did many different things to a single data source. Not being able to run the entire service API locally was a bane to development, because it turned into “write code, deploy to QA, test, fix bugs, repeat.”
We have also recently moved to using a single Express.js service on ECS to move data from customer-facing applications into our production store. This used to be a .NET Core Lambda with API Gateway. I am so impressed with the throughput Express can give us, as well as with the EventEmitter class built into Node.js, that I want to start moving all of our SQS + Lambda events into this service as event handler modules.
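Roughly the shape I have in mind, assuming the AWS SDK v3 SQS client - the queue URL and event names are placeholders, and this skips retries and error handling:

    import { EventEmitter } from "node:events";
    import {
      SQSClient,
      ReceiveMessageCommand,
      DeleteMessageCommand,
    } from "@aws-sdk/client-sqs";

    const events = new EventEmitter();
    const sqs = new SQSClient({});
    const QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789/app-events";

    // A former Lambda, now just a module-level subscriber.
    events.on("order.created", (body: unknown) => {
      console.log("handling order.created", body);
    });

    async function poll(): Promise<void> {
      const { Messages } = await sqs.send(
        new ReceiveMessageCommand({ QueueUrl: QUEUE_URL, WaitTimeSeconds: 20 })
      );
      for (const msg of Messages ?? []) {
        const { type, body } = JSON.parse(msg.Body ?? "{}");
        // Caveat: emit() won't await async listeners; real code should
        // delete the message only after the handler actually finishes.
        events.emit(type, body);
        await sqs.send(
          new DeleteMessageCommand({ QueueUrl: QUEUE_URL, ReceiptHandle: msg.ReceiptHandle! })
        );
      }
      setImmediate(() => void poll()); // keep the long-poll loop going
    }

    void poll();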
Local development is way easier (from my experience and perspective) when you don’t have to deal with a bunch of remote resources in AWS, and running the entire stack locally is a goal for my team. A little lift-and-shift portability is also nice in case we ever need to leave the AWS ecosystem.
Definitely agreed. And Express works wonders; our API is still running Express at its core, and it can take a good amount of request load before we need to start scaling it out.

That said, it seems to me that the most common problems people had with microservices and serverless functions were not in the actual development part, but in the deployment and maintenance aspects - something that feels totally forgotten when people think about how to build a microservices-driven architecture. (I’m lumping “serverless functions” in with microservices because, to me, they’re essentially the same thing.) I don’t think people were necessarily wrong to develop each part of their application in a modular fashion and separate each module in the deployment process, so that if one of them causes a problem it doesn’t require the entire application to scale up. That’s a great idea; it’s just not easy to implement if you don’t have hundreds of people working in your operations department. In fact (this is something I learned recently, so I’m just gonna share it lol), those companies hire entire teams of developers whose sole job is to make it so other developers have a better time at work with their system. If your team doesn’t have that, I feel like “doing microservices from scratch” like this is just going to be really hard to maintain over time.
What I think is really neat is how frameworks and platforms have sprung up to cater to this clear gap in the hosting market. Developers want to be able to stand up the application locally and in testing so they can build fast, but they don’t want to risk breaking things or accidentally causing performance issues in other parts of the application. The latter is definitely doable with a serverless approach, but we’ve only begun to make the developer experience match the quality of the deployment experience. I’m hoping we can go a little further and make it truly equal, but for now it works in most cases.
Some of this reminds me of a blog post I co-authored a while back on “hard and soft modularity”.