I got myself taken seriously at work the other day, so I’m writing a draft of… something (in normal times it would have been a handful of lectures or workshop sessions; in these pandemic times we’ll have to see) that I want to use to introduce developers at work to the practice of designing and running distributed systems.
In short, work has decided that they want to take the monolith they’ve run for years and cut it into services. My opinion is that we’ve never asked the majority of developers working here to do this, and they’ll have a bad time if they have to figure out how to design and run distributed systems from first principles. I’m putting together some kind of primer that (a) helps them find out what they don’t know, (b) points them to where they can learn more, and (c) contains some of the lessons we’ve learned the hard way. I want to focus more on the practice of doing these things than the theory; understanding Paxos is very nice, but usually doesn’t help us not get paged at night.
What resources do people here know of and recommend for this? I have a pile of bookmarks, but there are probably helpful texts I don’t know of.
I’d also like to know if anyone has tried a hands-on approach to these kinds of lessons, and how it went. I had an idea of having people build a small test system that I’d break in various ways (overload it with many queries, or heavy queries, or introduce network splits, or …), but I’m not sure if they’d just get lost in the system setup and we’d never get to the more interesting lessons.