One thing I would love to read more about is how to determine the cutoff between scaling horizontally and investing time/money in optimizing your software.
Oooo, that’s a good one. I have a couple of hand-wavy heuristics. I’ll think more about turning them into something more real.
I have the next 2-3 posts vaguely planned out. But, I’ll definitely be thinking about your idea @hwayne.
Doesn’t it boil down to, basically, “it’s a knack, spend 10 years doing it and you’ll get good at judgment”?
I’d definitely take a more “business”-driven approach to that problem. Optimizing your software for cost is only worth doing if the money it saves exceeds what it costs to do the optimization.
You also have to take indirect costs into account: weird optimization tricks can make code less readable, which adds a cost to future development.
On the other hand, scaling horizontally adds the cost of coordinating multiple servers: load balancing, orchestrating deployments, distributed-systems problems, etc.
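That break-even reasoning can be sketched as a back-of-envelope calculation. All the numbers and the function name here are hypothetical placeholders, not real prices:

```python
# Back-of-envelope break-even estimate: optimize only when the projected
# savings exceed the cost of doing the optimization. Figures are made up.

def optimization_worth_it(eng_hours, hourly_rate, maintenance_penalty,
                          monthly_infra_savings, horizon_months):
    """Return True if optimizing beats just paying for more servers."""
    cost = eng_hours * hourly_rate + maintenance_penalty  # direct + indirect cost
    savings = monthly_infra_savings * horizon_months      # infra spend avoided
    return savings > cost

# Example: 80 hours at $100/hr plus an estimated $2,000 readability/maintenance
# penalty, saving $1,500/month in server spend over a 12-month horizon.
print(optimization_worth_it(80, 100, 2000, 1500, 12))  # True: 18000 > 10000
```

The hard part in practice is estimating the maintenance penalty and the savings horizon, not the arithmetic.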
The business-driven approach is not always the best for society as a whole. It doesn’t take into account negative externalities like the environmental cost of running inefficient programs and runtimes.
I had a few that helped a lot:
Use the fastest components. Ex: YouTube used lighttpd over Apache.
If caching can help, use it. Try different caching strategies.
If it’s in a managed language, use a systems language and/or an alternative GC.
If the fast path is small, write the data-heavy part in an optimizable language using the best algorithms for the job. Make sure it’s cache- and disk-layout-friendly. The recent D submission is a good example.
If it’s parallelizable, rewrite the fast path in a parallel programming language or with such a library. Earlier examples include PVM, MPI, Cilk, and Chapel. The last one is designed for scale-up, scale-out, and easy expression simultaneously. Also, always prefer a simpler, lighter solution like that over something like Hadoop or Spark if possible.
Use whole-program optimizing compilers (especially profile-guided) if possible. I used SGI’s for this at one point. I’m not sure whether LLVM beats all of them or even has a profile-guided mode itself. Haven’t looked into that stuff in a while.
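For what it’s worth, LLVM does have a profile-guided mode. A minimal Clang PGO workflow looks roughly like this (file names and the sample input are illustrative):

```shell
# Minimal Clang/LLVM profile-guided optimization workflow.
# 1. Build an instrumented binary.
clang -O2 -fprofile-generate myapp.c -o myapp-instr

# 2. Run it on a representative workload to produce raw profile files.
./myapp-instr < sample_input.txt

# 3. Merge the raw profiles into a single .profdata file.
llvm-profdata merge -output=myapp.profdata default_*.profraw

# 4. Rebuild using the merged profile.
clang -O2 -fprofile-use=myapp.profdata myapp.c -o myapp
```

The gain depends heavily on how representative the profiling workload is.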
Notice that most of this doesn’t take much brains. A few take little to no time, either. They usually work, too, giving anything from a small to a vast improvement. So, they’re some of my generic options. Via metaprogramming or just a good compiler, I can also envision them all integrated into one language, with compiler switches toggling the behavior. Well, except choosing the fastest component or algorithm. Just the incidental stuff.
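The caching heuristic above is a good example of the “little to no time” point: memoization is the cheapest strategy to try first. A minimal Python sketch, with an illustrative stand-in function:

```python
from functools import lru_cache

# "If caching can help, use it": lru_cache memoizes the most recent
# results in memory, bounded by maxsize.

@lru_cache(maxsize=1024)
def expensive_lookup(key: int) -> int:
    # Stand-in for a slow computation or remote call.
    return sum(i * i for i in range(key))

expensive_lookup(10_000)                   # computed
expensive_lookup(10_000)                   # served from cache
print(expensive_lookup.cache_info().hits)  # 1
```

If the hit rate is low or inputs rarely repeat, move on to other caching strategies (or skip caching) rather than tuning this one.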
“When pursuing a vertical scaling strategy, you will eventually run up against limits. You won’t be able to add more memory, add more disk, add more “something.” When that day comes, you’ll need to find a way to scale your application horizontally to address the problem.”
I should note that the limit, last I checked, was SGI UV machines for data-intensive apps (e.g. SAP HANA) having 64 sockets of multicore Xeons, 64TB of RAM, and 500ns max latency for all-to-all communication. I could swear one had 256 sockets, but I may be misremembering. Most NUMA machines also have high-availability features (e.g. “RAS”). So, with one application per server (i.e. just the DB), many businesses might never run into a limit on these things. The main limit I saw studying NUMA machines was price: scaling up cost a fortune compared to scaling out. One can now get stuff in the low-to-mid five digits that previously cost six to seven. Future-proofing for scale-up by starting with SGI-style servers has to be more expensive, though, than starting with scale-out, even if the scale-out starts on a beefy machine.
You really should modify the article to bring up pricing. The high price of NUMA machines was literally the reason for inventing Beowulf clusters, which pushed a lot of the “spread it out on many machines” philosophy toward the mainstream. The early companies selling them always showed the price of, e.g., a 64-256-core machine from Sun/SGI/Cray vs. a cluster of 2-4-core boxes: the first was priced like a mini-mansion (or a castle, for clustered NUMAs), with the second ranging from a new car to a middle-class house. HA clustering goes back further, with VMS, NonStop, and mainframe stuff. I’m not sure whether cost pushed horizontal scaling for fault tolerance as a way to get away from them, or whether folks were just building on popular ecosystems. Probably a mix, but I have no data.
“The number of replicas, also known as “the replication factor,” allows us to survive the loss of some members of the system (usually referred to as a “cluster”).”
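As a rough illustration of the quote, the arithmetic can be sketched as follows. This assumes a simple majority-quorum scheme (as in Raft/Paxos-style systems); real systems vary in their read/write quorum rules:

```python
# Rough fault-tolerance arithmetic for a replication factor N.
# Assumes majority-quorum consensus; real systems' quorum rules vary.

def tolerable_failures(replication_factor: int) -> dict:
    n = replication_factor
    return {
        "read_any_copy": n - 1,           # data survives while one copy lives
        "majority_quorum": (n - 1) // 2,  # consensus needs a majority up
    }

print(tolerable_failures(3))  # {'read_any_copy': 2, 'majority_quorum': 1}
print(tolerable_failures(5))  # {'read_any_copy': 4, 'majority_quorum': 2}
```

Note how even jumping from 3 to 5 replicas only buys one more tolerable failure under quorum, which is why replication factors tend to stay small.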
I’ll add that each replica could experience the same failure, especially if we’re talking about attacks. That happened to me in a triple-modular-redundancy setup with a single faulty component. On top of replication, I push hardware/software diversity as far as one’s resources allow: CPUs built with different tools on different process nodes, different motherboards and UPSes, maybe optical connections if you’re worried about electrical faults, different OSes, different libraries performing the same function, different compilers, and so on. The one thing that’s the same on each node is the app you want to keep working. Even that might be several implementations written by different people, with the cluster running a mix of them. The one thing that has to be shared is the protocol for starting it all up, syncing the state, and recovering from problems. Critical layers like that should get the strongest verification the team can afford, with SQLite and FoundationDB being the exemplars in that area.
Then it’s really replicated in a fault-isolating way. It also has a lot of extra failure modes one has to test for. The good news is that several companies and/or volunteers can chip in, each working on one of the 3+ hardware/software stacks. Split the cost up. :)
Those are planned subjects for future posts. I wanted to keep this one fairly simplistic for folks who aren’t experts in the area.
That makes sense. Thanks for clarifying.