1. 59
  1.  

  2. 23

    The guideline I heard was “don’t think about microservices until you have more devops people than the average startup has employees.”

    Monoliths scale pretty dang far!

    1. 9

      At my last job I often said my primary mission was to defer for the longest time possible the introduction of microservices and to defer for the longest time possible the introduction of deep learning.

      1. 6

        This lines up with my experience as well: microservice scaling is entirely organizational, not technological.

        Eventually consistent writes, stale OLAP, and designing for the inevitable failure of half your application become paradise when the alternative is that even small feature delivery grinds to a halt because coordinating and testing changes eats up most of your development teams’ cycles.

        1. 1

          I agree that having devops figured out is critical. But even in startups, when done right, it’s nice to split different things into different services.

        2. 13

          In my experience the highest velocity architecture is one monolith per team.

          1. 5

            Microservices should implement bounded contexts, which map to teams, so, yes: one microservice (monolith, service, whatever) per team is exactly right.

            1. 9

              … until that team gets reorged.

              1. 6

                This is a mood beyond words

                1. 4

                  A microservice architecture is indeed a poor fit for an organization that’s still figuring out how to hew the borders of its bounded contexts.

                2. 3

                  Something that seems to have gotten lost is that there was a reason to tack the “micro” onto the long-established term “service oriented”. Micro was supposed to be very different. A lot of talk about microservices nowadays seems to just mean “not-monolith”. I’m not generally very keen on microservices, but a lot of confusion seems to stem from mixing ideas from different approaches without understanding their strengths and weaknesses. My current advice is to 1) achieve strong modularization within your monolith (so it becomes clear which modules could be turned into stand-alone services if need be), 2) extract a few services when there is a clear benefit, and 3) forget about microservices until you have no other options.

                3. 2

                  Not only the highest velocity, but also the lowest fragility, since you essentially allocate responsibility to a team, which is unlikely to quit all at once. It also lowers communication overhead to some extent, since you reliably have team leads + PMs + managers communicating without having to pull everyone into the communication chain.

                4. 11

                  This is basically right, but I think it misses a bit of the forest for the trees. Microservices are frequently misunderstood and over-applied, but they do have real and massive value in the right setting, and it would be a shame to write them off altogether.

                  Microservices are an architectural design pattern that converts Conway’s Law from a liability to an asset. They solve organizational problems and create technical ones, and so should be few, mapping to autonomous teams in an organization, not functional components in a system.

                  1. 4

                    Conway’s Law

                    organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations.

                    https://en.wikipedia.org/wiki/Conway%27s_law

                  2. 18

                    I get where he’s coming from, but I think the really valuable thing about microservices is the people-level organizational scaling that they allow. The benefits from having small teams own well-defined (in terms of their API) pieces of functionality are pretty high. It reduces the communication required between teams, allowing each team to move at something more closely approaching “full speed”. You could, of course, do this with a monolith as well, but that requires much more of the discipline to which the author refers.

                    1. 15

                      Why do you need network boundaries to introduce this separation? Why not have different teams work on different libraries, and then have a single application that ties those libraries together?
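
                      Roughly something like this (module and function names are made up): each team owns a library with a small, stable public API, and a single deployable application wires the libraries together.

                      ```python
                      # Hypothetical sketch: one monolith built from team-owned libraries.

                      # billing/api.py -- owned by the billing team
                      def create_invoice(customer_id: str, amount_cents: int) -> dict:
                          # Internals can change freely as long as this signature stays stable.
                          return {"customer": customer_id, "amount_cents": amount_cents, "status": "open"}

                      # notifications/api.py -- owned by the notifications team
                      def send_email(recipient: str, subject: str) -> None:
                          print(f"sending {subject!r} to {recipient}")

                      # app.py -- the single application that ties the libraries together
                      def checkout(customer_id: str, email: str, amount_cents: int) -> None:
                          invoice = create_invoice(customer_id, amount_cents)
                          send_email(email, f"Invoice ({invoice['status']}): ${amount_cents / 100:.2f}")

                      checkout("c-42", "jane@example.com", 1999)
                      ```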

                      1. 4

                        Network boundaries easily allow teams to release independently, which is key to scaling the number of people working on the application.

                        1. 1

                          Network boundaries easily allow teams to release independently

                          …Are you sure?

                          Independent systems are surely easier to release independently, but that’s only because they’re independent.

                          I think the whole point of a “microservice architecture” is that it’s one system with its multiple components spread across multiple smaller interdependent systems.

                          which is key to scaling the number of people working on the application

                          What kind of scale are we talking?

                          1. 1

                            Scaling by adding more people and teams to create and maintain the application.

                            1. 2

                              Sorry, I worded that question ambiguously (although the passage I quoted already had “number of people” in it). Let me try again.

                              At what number of people writing code should an organisation switch to a microservices architecture?

                              1. 1

                                That’s a great question. There are anecdotes of teams with 100s of people making a monolith work (Etsy for a long time IIRC), so probably more than you’d think.

                                I’ve experienced a few painful symptoms of when monoliths were getting too big: individuals or teams locking large areas of code for some time because they were afraid to make changes in parallel, “big” releases taking days and requiring code freezes on the whole code base, difficulty testing and debugging problems on release.

                            2. 1

                              I think the whole point of a “microservice architecture” is that it’s one system with its multiple components spread across multiple smaller interdependent systems.

                              While this is often the reality, it misses the aspirational goal of microservices.

                              The ideal of software design is “small pieces, loosely joined”. This ideal is hard to attain. The value of microservices is that they provide guardrails to help keep sub-systems loosely joined. They aren’t sufficient by themselves, but they seem to nudge us in the right, more independent direction a lot of the time.

                          2. 2

                            Thinking about this more was interesting.

                            A [micro]service is really just an interface with a mutually agreed upon protocol. The advantage is the code is siloed off, which is significant in a political context: All changes have to occur on the table, so to speak. To me, this is the most compelling explanation for their popularity: they support the larger political context that they operate in.

                            There may be technical advantages, but I regard those as secondary. Never disregard the context under which most industrial programming is done. It is also weird that orgs have to erect barriers to prevent different teams from messing with each other’s code.

                          3. 12

                            The most common pattern of failure I’ve seen occurs in two steps:

                            a) A team owns two interacting microservices, and the interface is poor. Neither side works well with anyone but the other.

                            b) A reorg happens, and the two interacting microservices are distributed to different teams.

                            Repeat this enough times, and all microservices will eventually have crappy interfaces designed purely for the needs of their customers at one point in time.

                            To avoid this it seems to me microservices have to start out designed for multiple clients. But multiple clients usually won’t exist at the start. How do y’all avoid this? Is it by eliminating one of the pieces above, or some sort of compensatory maneuver?

                            1. 4

                              crappy interfaces designed purely for the needs of their customers…

                              I’m not sure by which calculus you would consider these “crappy”. If they are hard to extend towards new customer needs, then I would agree with you. This problem is inherent in all API design, though microservice architecture forces you to do more of it, so you could get it wrong more often.

                              Our clients tend to be auto-generated from API definition files. We can generate various language-specific clients based on these definitions and these are published out to consumers for them to pick up as part of regular updates. This makes API changes somewhat less problematic than at other organizations, though they are by no means all entirely automated.
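
                              As a toy illustration of the general idea (not our actual tooling; the definition and names below are invented), a generated client is essentially a thin wrapper derived mechanically from the definition file:

                              ```python
                              # Toy sketch of client generation from an API definition (illustrative only).
                              import json
                              from urllib import request

                              api_definition = {  # normally loaded from a checked-in definition file
                                  "service": "inventory",
                                  "base_url": "https://inventory.example.internal",
                                  "operations": {"get_item": "/items/{item_id}", "list_items": "/items"},
                              }

                              class GeneratedClient:
                                  def __init__(self, definition: dict):
                                      self._base = definition["base_url"]
                                      for name, path in definition["operations"].items():
                                          setattr(self, name, self._make_call(path))

                                  def _make_call(self, path: str):
                                      def call(**params):
                                          url = self._base + path.format(**params)
                                          with request.urlopen(url) as resp:  # real generators add auth, retries, typed models
                                              return json.load(resp)
                                      return call

                              client = GeneratedClient(api_definition)
                              # client.get_item(item_id="abc-123")  # consumers pick up regenerated clients as routine dependency updates
                              ```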

                              …at one point in time

                              This indicates to me that management has not recognized the need to keep these things up to date as time goes by. Monolith vs. microservices doesn’t really matter if management will not invest in keeping the lights on with respect to operational excellence and speed of delivery. tl;dr if you’re seeing this, you’ve got bigger problems.

                              1. 1

                                Thanks! I’d just like clarification on one point:

                                …management has not recognized the need to keep these things up to date as time goes by…

                                By “keeping up to date” you mean changing APIs in incompatible ways when necessary, and requiring clients to upgrade?


                                crappy interfaces designed purely for the needs of their customers…

                                I’m not sure by which calculus you would consider these “crappy”. If they are hard to extend towards new customer needs, then I would agree with you.

                                Yeah. Most often this is because the first version of an interface is tightly coupled to an implementation. It accidentally leaks implementation details and so on.

                                1. 1

                                  By “keeping up to date” you mean changing APIs in incompatible ways when necessary, and requiring clients to upgrade?

                                  Both, though the latter is much more frequent in my experience. More on the former below.

                                  Yeah. Most often this is because the first version of an interface is tightly coupled to an implementation. It accidentally leaks implementation details and so on.

                                  I would tend to agree with this, but I’ve found that this problem mostly solves itself if the APIs get sufficient traction with customers. As you scale the system, you find these kinds of incongruities and you either fix the underlying implementation in an API-compatible way or you introduce new APIs and sunset the old ones. All I was trying to say earlier is, if that’s not happening, then either a) the API hasn’t received sufficient customer interest and probably doesn’t require further investment, or b) management isn’t prioritizing this kind of work. The latter may be reasonable for periods of time, e.g. prioritizing delivery of major new functionality that will generate new customer interest, but can’t be sustained forever if your service is really experiencing customer-driven growth.

                                  1. 1

                                    Hasn’t the problem now just moved to the ops team, which has to grow in size in order to support the deployment of all these services, since they need to ensure that compatible versions talk to compatible versions, if that is even possible? What I’ve found most problematic with any microservices deployment is that the ops teams suffer more: new roles are needed just to coordinate all these “independent”, small teams of developers, all for the sake of reducing the burden on the programmers. One can implement pretty neat monoliths.

                                    1. 2

                                      We don’t have dedicated “ops” teams. The developers that write the code also run the service. Thus, the incentives for keeping this stuff working in the field are aligned.

                            2. 6

                              It reduces the communication required between teams, allowing each team to move at something more closely approaching “full speed”.

                              Unfortunately, this has not been my experience. Instead, I’ve experienced random parts of the system failing because someone changed something and didn’t tell our team. CI would usually give a false sense of “success” because each microservice would pass its own CI pipeline.

                              I don’t have a ton of experience with monoliths, but in a past project, I do remember it was nice just being able to call a function and not have to worry about network instability. Deploying just 1 thing, instead of N things and having to worry about service discovery was also nicer. Granted, I’m not sure how this works at super massive scale, but at small to medium scale it seems nice.

                              1. 2

                                Can you give an example where this really worked out this way for you? These are all the benefits one is supposed to get, but the reality often looks different in my experience.

                                1. 10

                                  I work at AWS, where this has worked quite well.

                                  1. 5

                                    Y’all’s entire business model is effectively shipping microservices, though, right? So that kinda makes sense.

                                    1. 20

                                      We ship services, not microservices. Microservices are the individual components that make up a full service. The service is the thing that satisfies a customer need, whereas microservices do not do so on their own. A single service can comprise anywhere from a handful of microservices to several hundred, but they all serve to power a coherent unit of customer value.

                                      1. 4

                                        Thank you for your explanation!

                                    2. 4

                                      AWS might be the only place I’ve heard of which has really, truly nailed this approach. I have always wondered - do most teams bill each other for use of their services?

                                      1. 5

                                        They do, yes. I would highly recommend that approach to others as well. Without that financial pressure, it’s way too easy to build profligate waste into your systems.

                                        1. 1

                                          I think that might be the biggest difference.

                                          I’d almost say ‘we have many products, often small, and we often use our own products’ rather than ‘we use microservices’. The latter, to me, implies that the separation stops at code - but from what I’ve read it runs through the entire business at AMZ.

                                        2. 1

                                          It’s worked well at Google too for many years but we also have a monorepo that makes it possible to update all client code when we make API changes.

                                  2. 5

                                    This reminds me of a GOTO talk by Simon Brown. He aptly explained what microservices are supposed to be versus what they eventually become. https://youtu.be/5OjqD-ow8GE?t=2403

                                    1. 3

                                      Monoliths are to microservices what fat binaries are to containers. At the end of the day the incentive structures inside companies are complicated and rarely align with best engineering practices, so I don’t see the microservice-focused world of corporate programming changing anytime soon.

                                      1. 3

                                        I think ‘services’ in any form are the wrong product of programming. The real purpose of any system is the larger computational processes (the ‘business logic’) that span multiple services and systems - frontends, middleware, databases. Currently, each of these contains duplicated and overlapping parts of the system-wide model definition. We have to keep each of these definitions in sync, mentally keep track of the correspondence, manually hand-hold any changes across the system while ensuring compatibility, etc.

                                        I would like to see a topology-independent programming model. What I mean is a model where we can define system-wide processes without tying them to the implementation topology. This definition would then be mapped to the implementation topology via additional descriptions.

                                        Consider how compilers freed our code from being tied to the CPU topology (we don’t reference register names or load and store operations in our code, but make up variable names closer to the business logic, which get mapped to the underlying topology). Similarly, we want to be free of writing business logic tied to the system topology. If I’m writing a messaging app, my highest-level code should look like room.append(message). Then separately I’d map the room object and the message object to various projections across the services and databases.
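
                                        To make that concrete, here’s a toy sketch of what I’m imagining (all names hypothetical): the business logic only ever sees the single conceptual object, while a separate mapping decides which backends each operation fans out to.

                                        ```python
                                        # Toy sketch: business logic written against one conceptual object,
                                        # with a separately defined "topology mapping" (the projections).

                                        class Room:
                                            def __init__(self, room_id, projections):
                                                self.room_id = room_id
                                                self._projections = projections  # where fragments of this object live

                                            def append(self, message):
                                                # The business-level operation: one line, no knowledge of topology.
                                                for projection in self._projections:
                                                    projection.apply(self.room_id, message)

                                        class DatabaseProjection:
                                            def apply(self, room_id, message):
                                                print(f"INSERT INTO messages(room_id, body) VALUES ({room_id!r}, {message!r})")

                                        class ClientCacheProjection:
                                            def apply(self, room_id, message):
                                                print(f"push to connected clients of room {room_id}: {message!r}")

                                        room = Room("general", [DatabaseProjection(), ClientCacheProjection()])
                                        room.append("hello world")
                                        ```

                                        Re-slicing or swapping the projections wouldn’t touch the room.append(message) line at all.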

                                        Does something like this exist?

                                        1. 2

                                          A topology-independent model was part of earlier generations of RPC, like NFS, DCOM, and CORBA. Just make all your OO methods accessible over the network and go! Unfortunately this ignores the latency of RPCs, which requires API redesign. It also prevents us from making RPC interfaces reliable by hiding state in the server that is deleted on restart or upgrade.

                                          Does it help to realise that services can still share client libraries built on interface descriptions like protobuf?

                                          1. 1

                                            If you have the choice between an object being fully local or fully remote on another machine, with method calls becoming network messages, I don’t think that is topology independent. Your topology will still factor into how you design these object APIs and where you put them. I’m aware of CORBA, DCOM and also GemStone, but I’m talking about something different.

                                            Consider the conceptual room object in the messaging app. Where does it live? A part of it is in the client, part of it is in some servers and another part of it is persisted in the database. It’s topologically fragmented but conceptually a single object. The parts of this object are not even identical replicated copies - each one overlaps with another but has more to it. Conceptually the state of the room is the union of the state of the distributed fragments.

                                            AFAIK none of the mentioned technologies can let you declare and work with a single object that is implemented in this fragmented fashion. The idea is that you’d treat this as a single object and define various processes and attributes for it. Separately you’d define how the attributes and processes can be decomposed and mapped onto the topology.

                                            1. 1

                                              Let’s say each of your fragments is one service in a microservices architecture. The advantage is each service can be developed, deployed, debugged independently in parallel by different teams.

                                              If you couple all the fragments again in a single codebase (albeit one with projections as you describe) isn’t that no longer possible? Or at best you need to be sure not to edit code outside your team’s fragment because you could break another team’s service. Separate repositories for separate teams is an obvious way to divide responsibilities.

                                              1. 1

                                                Good questions - these are open questions that would need to be resolved as part of refining the programming and deployment model. I’ll phrase the questions as “how does ownership work?” and “what is a deployable unit?” and speculate on them below.

                                                First I want to point out that even in a microservice world, the owner of a shared library may update it at one point in time but it will end up going live at different times in different services based on when they rebuild and push. The library owner may have no control over the deployment here.

                                                Secondly I think it’s a fallacy that services are ‘independent’ - a downstream or upstream service push may break a service due to interdependencies, assumptions and sleeping bugs that got activated due to a new usage pattern. Still, in principle the API and SLA encapsulate what a service provides and ownership corresponds to a ‘service’ that satisfies these.

                                                In a topo-free-model world, I imagine services will be fluid enough that they would not be a unit of ownership. For instance, there are multiple alternative service topologies that might implement the same topo-free-model. Consider two equivalent topologies, A-B-C-D and A-E-D, where the E service implements the functionality of both services B and C. Now transitioning from ABCD to AED may be done because it has better performance characteristics. Importantly, there would be no change to the topo-free-model definition of the code during this transition. Only the mapping to the topology would change. I imagine this could be rolled out by a single team that owns this subset (B-C) of the topology. Ownership of the topo-free-model itself would be distributed along the business model. I imagine a large object such as User might itself be sliced up and owned by multiple teams - depending on business function. In the end, Conway’s law will still be in effect so the org structure would mirror the topo-free-model to some degree.

                                        2. 2

                                          Monoliths can potentially be shared by multiple teams and a single development standard can be enforced. Microservices can range in varying degrees of quality and operational health. Engineering discipline is tremendously important.

                                          However, it depends a lot on organizational discipline as well.

                                          How do you handle ownership during re-orgs? How do you handle employees leaving or changing teams? Who is on the hook for future updates and maintenance? A team? A person? What happens if the service needs to be changed and the owning team doesn’t have the capacity?

                                          I worry because at a previous job we seemed to have this unwritten rule that a person “owned” a service even if they were on a team that had totally different goals. This meant that when people left, nobody was trained to take over, and code that was vital to certain projects would be left to rot. “Dying on the vine,” as they called it.

                                          1. 2

                                            What we really need are better databases.

                                            1. 3

                                              I feel that databases have already stepped up their game, but somehow people are not up to date with all the improvements. A lot of developers I meet have no clue how to optimize a database and generally treat it as a black box. A lot of companies would rather hire someone with ReactJS experience than DBA experience :)

                                              1. 2

                                                I obviously have no idea what you have in mind but I agree and am intrigued. (Even so, it’s interesting and instructive to see how the whole noSQL cycle went down.)

                                                1. 1

                                                  What I mean is that I have spent some time with PostgreSQL’s views, triggers and row-level-security to glimpse a future where a lot of business logic gets encoded in a non-imperative way very close to the data. We are not there yet, though.

                                                  It would be nice to be able to store the schema in a git repository and be able to statically check that all your views and procedures are compatible with each other. It would also be nice to have a tool to construct the best migration path from the current schema to the new one, where you only instruct it on the missing bits (that is, how the shape of the data changed).

                                                  I think that a tight type system and some good tooling might be able to combat the complexity much better than service oriented architecture that still needs a lot of attention on coordination and API stability. If a team changed their public views, they should immediately get a type error or a QuickCheck test suite should notify them that they broke something. They could share ownership and modify dependent code themselves more easily.

                                                  1. 2

                                                    This is indeed the technical platform I introduced at my last job and am using for my current project!

                                                    It would be nice to be able to store the schema in a git repository

                                                    I’m using the excellent setup pioneered (?) by the PostgREST/Subzero project:

                                                    https://github.com/subzerocloud/subzero-cli

                                                    It’s very simple actually: build up your schema with idempotent SQL scripts split up into regular text files according to your taste (you can place them in a hierarchical file structure that fits your software model). Just use \ir path/to/script.sql to run all the scripts in order from a top init.sql file. For example, from init.sql, call one script to set up db users, another to create schemas and set basic permissions on them, then call one script for each schema which in turn calls sub-scripts to set up tables, views, functions… All of this happens in regular text files, under version control. Reloading the entire db structure + seed data takes about a second, so you can iterate quickly.

                                                    Now, the great thing that subzero-cli gives you is a way to turn the resulting schema into a migration (using the Sqitch stand-alone migration tool) by automatically diffing your current schema against the last checked in schema. (This involves a little dance of starting up ephemeral Docker containers and running a diffing tool, but you don’t really notice.) So you get a standard way of deploying this to your production system using simple migrations. (Sqitch is a pretty great tool in itself.)

                                                    be able to statically check that all your views and procedures are compatible with each other

                                                    Here you’ll have to rely on automated tests, like pgTAP or anything really that you prefer. Python is very well supported as an “in-database” language by Postgres, and I’m working on writing tests with the wonderful Hypothesis library and running them directly inside Postgres to thoroughly test functions, views, etc.
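
                                                    As a rough sketch of the kind of test I mean (shown here running client-side through psycopg2 rather than in-database, against a made-up normalize_email() function):

                                                    ```python
                                                    # Sketch: property-test a Postgres function with Hypothesis.
                                                    # Assumes a hypothetical normalize_email(text) function and a reachable DSN.
                                                    import psycopg2
                                                    from hypothesis import given, strategies as st

                                                    conn = psycopg2.connect("dbname=app user=test")  # adjust for your environment
                                                    conn.autocommit = True

                                                    @given(st.emails())
                                                    def test_normalize_email_is_idempotent(email):
                                                        with conn.cursor() as cur:
                                                            cur.execute("SELECT normalize_email(%s)", (email,))
                                                            once = cur.fetchone()[0]
                                                            cur.execute("SELECT normalize_email(%s)", (once,))
                                                            twice = cur.fetchone()[0]
                                                        assert once == twice  # normalizing twice changes nothing

                                                    test_normalize_email_is_idempotent()
                                                    ```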

                                                    It would also be nice to have a tool to construct the best migration path from the current schema to the new one

                                                    Again, handled very well by subzero-cli, relying on apgdiff (apgdiff.com; yes, it’s “old”, but subzero maintain their own fork which gets small tweaks from what I’ve seen).

                                                    I obviously agree with the rest of what you wrote :) If you put PostgREST, PostGraphile, or Hasura on top of your “smart” postgres system, you can give teams quite a bit of flexibility and autonomy in structuring client-server communication for their use cases, while keeping the core logic locked down in a base schema.

                                                    1. 1

                                                      It would be nice to be able to store the schema in a git repository and be able to statically check that all your views and procedures are compatible with each other. It would also be nice to have a tool to construct the best migration path from the current schema to the new one, where you only instruct it on the missing bits (that is, how the shape of the data changed).

                                                      Unless I misunderstand you, these tools already exist, at least for MySQL, PostgreSQL, and MSSQL. The compatibility checking does need to happen by deploying the schema, but the rest is there now.

                                                      1. 1

                                                        The compatibility checking does need to happen by deploying the schema, but the rest is there now.

                                                        I am pretty sure that checking of procedure bodies only happens when you run them.

                                                        Can you share links for the tools? I am not aware of them.

                                                        1. 2

                                                          Yeah, stored procs are not statically analyzed in any of the tools I know.

                                                          SQL Server: https://docs.microsoft.com/en-us/sql/relational-databases/data-tier-applications/deploy-a-database-by-using-a-dac?view=sql-server-ver15

                                                          MySQL: https://www.skeema.io/

                                                          For Postgres I know I’ve seen one or two tools that functioned in this way but I don’t seem to have saved the link.

                                                2. 0

                                                  He nails it pretty well on running stateful applications on k8s. Most stateful applications require intimate hardware tuning to run at optimum performance. Containers these days are much better and offer better isolation guarantees than a few years ago, but you’d still be better off running them in a VM or even on a bare-metal host if possible. Microservices are highly tailored towards running stateless web applications, and a lot of the current best practices are tailored towards that; not everything needs to be a web application.