1. 21
  1.  

  2. 11

    Great critique! Reminds me a bit of

    Back to the ’70s with Serverless (2020)

    which I linked from

    Summer Blog Backlog: Distributed Systems (Kubernetes is our generation’s Multics)

    I’d say the key issue is that the cloud “abstractions” are leaky, and they don’t compose. This leads to combinatorial explosions of code and frameworks.

    1. 7

      My requirements mean that eventually I will need this project reliably in production on a server somewhere, particularly if I’m looking to do training and inference with GPUs, and even more so if I want to make Viberary a web app that people can use interactively. So it makes sense to start working with it in the cloud.

      I don’t see how that follows. This is for a side project, so there are no externally imposed requirements. The author could just as well have decided to take out one or more VPSes with a database and server processes. You can make that scale if and when it becomes necessary.

      1. 5

        Instead of whittling buildings from living trees as real builders once did, we are reduced to merely assembling purchased wood and bricks.

        1. 7

          But you can build arbitrary things out of wood and bricks.

          What I see with the cloud and modern frameworks is that you can start with a canned architecture, and get something that’s roughly similar to what you want kinda quickly.

          But then you spend 90% of your time patching over the last 10%, and you never really get what you want. Instead you pass it on to the next team, which rewrites it on some newer non-composable abstraction, from the same vendor, or a different one. The abstractions eat up any hardware improvements, so that the same program has higher latency than it did 5 years ago.

          1. 2

            But the wood you are purchasing isn’t 2x4s. It’s just tree branches that you need to combine together. They don’t fit quite right, so you need to stick some mud in between them so the wind doesn’t cut through the building.

            1. 8

              I wouldn’t even call it raw tree branches or 2x4s. Those can eventually be fashioned into the shape you want, with enough work (and, in software, work can be automated!).

              I would say the analogy is closer to trying to build a house out of Ikea furniture parts. Those parts are great, if what you want to build is what the designers intended! But if it’s not (and it’s often not), then you’re stuck with hacks. After the hacks, the system continues to work poorly.


              Steve Yegge has a couple good analogies that get at the composition problem:

              Java is like a variant of the game of Tetris in which none of the pieces can fill gaps created by the other pieces, so all you can do is pile them up endlessly.

              http://steve-yegge.blogspot.com/2007/12/codes-worst-enemy.html

              In the cloud, a common pattern I see is “adding caches” for things that shouldn’t be slow in the first place. The caches patch over some problems, and add more.

              They leave big holes in correctness and performance, which are obvious problems we should be aware of. There are also highly non-obvious problems like metastable states: https://www.usenix.org/publications/loginonline/metastable-failures-wild

              (i.e., the presence of a cache now means that restarting a stressed system does NOT cause it to recover. All cloud companies have complex software and processes to patch over this problem, implicitly or explicitly. On the thread about Twitter, I pointed out that cloud systems have people turning cranks all day long, and that SREs were (probably still are) the most numerous type of employee at Google.)
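
              To make the mechanism concrete, here’s a toy model of it (my own sketch, with made-up numbers; not taken from the linked paper or from any real system). The one assumption doing the work is that an overloaded backend’s useful throughput drops, because requests queue past the client timeout and the results never make it back into the cache:

              ```python
              # Toy model of a cache-induced metastable failure (illustrative numbers only).
              KEYSPACE = 10_000   # distinct keys clients request, roughly uniformly
              LOAD     = 2_000    # requests per tick arriving at the service
              CAPACITY = 250      # requests per tick the backend can serve in time
              TTL      = 50       # cache entries live about TTL ticks before expiring

              def tick(cached):
                  """Advance one tick; return (cached, served_misses, timed_out)."""
                  misses = LOAD * (1 - cached / KEYSPACE)   # requests the cache can't absorb
                  if misses <= CAPACITY:
                      served = misses                       # backend keeps up: every miss is
                  else:                                     # served and written back to the cache
                      # Overloaded: requests queue past the client timeout, so only a
                      # shrinking fraction of the backend's work completes in time and
                      # repopulates the cache (congestion-collapse-style goodput).
                      served = CAPACITY * (CAPACITY / misses)
                  timed_out = misses - served
                  expired = cached / TTL                    # cached entries age out
                  cached = min(KEYSPACE, max(0.0, cached + served - expired))
                  return cached, served, timed_out

              def run(label, cached, ticks=400):
                  for _ in range(ticks):
                      cached, served, timed_out = tick(cached)
                  print(f"{label:22s} cached={cached:6.0f}  served/tick={served:5.0f}  timed_out/tick={timed_out:6.0f}")

              run("steady state (warm)", cached=9_000)  # settles near a full cache, zero timeouts
              run("after restart (cold)", cached=0)     # settles into permanent overload
              ```

              Both runs see the same load on the same hardware; the only difference is whether the cache starts warm. The cold run converges to a second stable state where the backend stays saturated indefinitely, which is why “just restart it” stops working once a cache is load-bearing.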


              And Legos:

              With the right set (and number) of generically-shaped Lego pieces, you can build essentially any scene you want. At the “Downtown Disney” theme park at Disney World in Orlando, there’s a Legoland on the edge of the lagoon, and not only does it feature highly non-pathetic Lego houses and spaceships, there’s a gigantic Sea Serpent in the actual lake, head towering over you, made of something like 80 thousand generic lego blocks. It’s wonderful.

              Dumb people buy Lego sets for Camelot or Battlestars or whatever, because the sets have beautiful pictures on the front that scream: “Look what you can build!” These sets are sort of the Ikea equivalent of toys: you buy this rickety toy, and you have to put it together before you can play with it. They’ve substituted glossy fast results for real power, and they let you get away with having no imagination, or at least no innovative discipline.

              https://sites.google.com/site/steveyegge2/ancient-languages-perl?pli=1


              A major point of my posts on software architecture last year, which it took me many words to get around to, is that Unix is a language-oriented operating system, and that means it composes like a language.

              The cloud is not language-oriented, and doesn’t compose.

              And I think the “everything is text” problem is sort of a red herring (it’s a tradeoff/downside of the Unix style).

              I believe that we’re so focused on solving the “I need types for fine-grained autocomplete” problem that we’ve lost sight of the “we’re writing way too much code that works poorly” problem. One problem is local and immediate, while the other is global and systemic.

            2. 1

              We’ve been building from wood and bricks for the last few decades, but these days we’re building prefabbed kitchens and bathrooms and jamming them together until the joists buckle, and shipping that.

            3. 4

              I think every generation of programmers feels nostalgic for the “good old days” when they felt like they understood their whole stack and were working directly at a super low level. But I’m not at all convinced that it’s actually true that we once did.

              For example, I got into web dev in the early 2000s with PHP and Perl on the backend and HTML/CSS/JS (“DHTML”) on the frontend. And at the time I felt like I understood the stack, but really I understood only down to about the point of the HTTP daemon, with a little bit of knowledge of what was going on over the wire. The internals of, say, Apache and then the underlying TCP/IP stack, operating system, etc., were all just opaque to me. They were commodity pieces taken off the shelf to do what they did and I didn’t have to worry much about them.

              Today, we have a lot more commodity pieces to take off the shelf – that part, at least, is true. But I’m not convinced that it’s necessarily a bad thing, or necessarily separates us from the problems we’re working on. First of all, we have all those pieces because people ran into problems, figured out solutions that worked well enough for them, and then made those solutions available, often for free and with the ability to modify/improve. That’s a great thing and has absolutely advanced the state of our art by leaps and bounds.

              More importantly: I feel like I am closer, today, to the actual problems I’m trying to solve than ever before.

              Once upon a time, there were a lot more intermediate problems that I had to solve and that got in the way of whatever I actually wanted to do. I can think and reason about web applications and their constituent parts in ways that early-2000s me could not have conceived of. Back then I had to invent my own half-assed MVC-ish architecture because the big wave of modern backend frameworks was still just over the horizon. Back then I had to manually set up servers and install all the packages – HTTP daemon, database, cache, etc. – and wire them all together. Once upon a time I even was on a team that had to literally wire up and rack-mount our own hardware! Back then I had to do a ton of extra work that put distance between me and the actual problems I was hoping to solve, which tended to be things like features or bugfixes in a web app.

              Today, I feel like there are fewer layers interposing themselves between me and the problems I’m trying to solve, precisely because so much of that stuff has been replaced by commodified solutions/components. Which does mean I’m working at a higher level of abstraction, much of the time. And does mean I’ve had to learn new patterns and ways of talking about them and how to work with the interfaces of the commodified pieces. But it doesn’t mean that I feel I’ve lost something or that it’s no longer fun or challenging to me.

              (though I also am a long-standing believer in the idea that most people in most teams at most companies shouldn’t be trying to invent new tech from scratch in the first place – it’s usually a sign that things have gone badly wrong)

              1. 4

                I liked this comment because it describes my own background, all the way down to the timing and anecdotal illustrations. (I, too, came up with Perl, PHP, and “DHTML”; I, too, racked my own servers.) And though I think you are right that every generation of programmers is delusionally nostalgic for the prior era’s “simplicity”, I also think you might be missing some of the traps in today’s programmer culture. Specifically, I came up in a culture of open: open web & open protocols & open source & open architecture.

                It’s not a question of whether I knew my whole stack. That depended on my willingness and free time. It was a question of whether I could know the whole stack. And also whether I could contribute at any layer. And that’s what the web and LAMP and then Python & f/oss databases and so forth gave me. Since we’re of the same era, you know exactly what “closed” used to mean in that late 90s & early 00s era: Microsoft’s closed-source development ecosystem. ASP and C# developers deploying to IIS and SQL Server certainly got things done for businesses, and they didn’t struggle with much accidental complexity. But their stack was so damn proprietary, so damn unknowable, from the web server to the framework/language to the OS and so on.

                Today, we have open source everywhere in the development environment, but there has been a precipitous and worrying rise of proprietary software in the production environment, due especially to “serverless” style public cloud APIs. This should be seen for what it is: much closer to the Microsoft model of the 90s/00s than the alternative. And all the same concerns for developers apply.

                (This is one of the reasons why, even though I am an expert in AWS/GCP and a fan of each, my main use case for them is commodity compute, memory, and storage. I don’t want to rack physical servers anymore, but I still want it to feel as though I have a rack of servers sitting in a data center, running open source software I can comprehend. That I can destroy the rack and recreate it programmatically is all to the good, but I don’t want it hidden from me entirely for when my code has to run on it. For the compute & memory side, give me SSH access & a full-blown Linux userland, or give me death!)

                1. 3

                  It was a question of whether I could know the whole stack.

                  Most people couldn’t, and definitely wouldn’t. At least if we’re using a full definition of “the stack”. From the web down to silicon. Any other definition of “the stack” is just stopping a little bit before we get uncomfortable. Why not include file system implementation? Or layer 1 & 2 network protocols? Or processor microarchitecture?

                  The beef with serverless, though, I agree with. Back when I first started dealing with the cloud, it felt like we could (and maybe would, in time) arrive at some sort of abstraction of the cloud, even if just a conceptual one, that would make which cloud you’re using more of an implementation detail. We never quite got there, and now it looks like we’re further from it than we’ve ever been. And this forces you to bet not on technology but on companies, which kinda rubs me the wrong way sometimes.

              2. 4

                I’d give this article more credit if it didn’t revolve around using the cloud and BigQuery to load a ~2GB dataset.

                The VendorOps bit is a great insight, though.

                1. 2

                  I mean, if it’s a problem with 2GB, that makes the cloud look even worse, though?

                  1. 2

                    Not exactly. Looks like he’s trying to view the whole dataset as a dataframe in Jupyter. That’s opening a 2GB webpage. It’s gonna crash most browsers, even if you have loads of RAM.

                    Honestly, I kinda drifted off in the second part of the article because I genuinely didn’t understand what the author was trying to do or what exactly their complaint was. The 2GB dataset is pretty early in that second half, though.
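
                    For contrast, the boring way to poke at a dataset that size in a notebook is to never render it at all. A minimal sketch (the file name and usage are hypothetical, just to show the shape of it):

                    ```python
                    import pandas as pd

                    # Peek at the schema and a handful of rows instead of displaying everything.
                    sample = pd.read_csv("books_metadata.csv", nrows=1_000)
                    print(sample.dtypes)
                    print(sample.head())

                    # Stream the full file in chunks for anything that needs every row.
                    rows = 0
                    for chunk in pd.read_csv("books_metadata.csv", chunksize=100_000):
                        rows += len(chunk)
                    print("total rows:", rows)
                    ```

                    The dataset size itself isn’t really the problem; asking the browser to render all of it is.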

                2. 3

                  I’m confused about the graphic showing that you would want a Hadoop cluster for 1TB of data. That’s something you do on one machine, and not even that beefy a machine today.
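
                  For a sense of what “one machine” means in practice, something like the sketch below (DuckDB is my example, not something the graphic or article mentions; the path and columns are hypothetical) will happily scan hundreds of gigabytes of Parquet straight off local disk:

                  ```python
                  import duckdb

                  con = duckdb.connect()  # in-process: no cluster, nothing to deploy
                  result = con.execute("""
                      SELECT category, count(*) AS n, avg(price) AS avg_price
                      FROM read_parquet('data/*.parquet')
                      GROUP BY category
                      ORDER BY n DESC
                  """).df()
                  print(result)
                  ```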