1. 85
  1.  

  2. 17

    I’ve been telling my family for several years about the impending collapse of the gigantic house of cards of modern software based on what I’ve seen. They don’t believe me.

    The #1 thing we need to do is apprenticeship- and case-study-based training of developers. Start training developers using case studies with real programs instead of just abstract ideas like “encapsulation” and having them write from scratch without having seen what production software looks like. Literate programming books such as Pharr and Humphreys’ amazing “Physically Based Rendering” are good steps in this direction.

    1. 9

      http://www.pbr-book.org for those interested in following the referral.

      1. 7

        Could you elaborate on what you mean by “collapse”? How would you expect the experience of writing or using software to be different afterwards?

        1. 4

          I really wished my college days were spent more on building and less on learning about polymorphism and encapsulation.

          That said, the author misses the point that we are in the middle of a huge boom where entire industries are just now looking around and saying, “How can we make our business more efficient with software?” Of course software is ideally supposed to ship bug free, of course Catalina shouldn’t have bugs, but at the end of the day we live in a world where user growth and utility drives value, not code quality.

          Century-old industries are just now coming online and trying to integrate software into their business models, so they are likely going to make low-quality software that solves immediate needs. Not all companies view software as a revenue center.

          For most companies in the world software is just a way to lower costs. Not a money maker. Many of us are biased by the fact we work in tech and work on software products/projects that are money makers. Some of the companies in the post are giant tech companies.

          Imagine the total lines of code count at any of the companies mentioned. It’s easily in the hundreds of millions or even billions. Of course there are examples of bad code!

        2. 29

          I worked at large companies with user-facing products similar to what the author referenced - not Apple, but Skype, Microsoft and Uber. I worked on, or closely observed the teams behind, OSes like Xbox One and Windows 8, similar to the macOS example. I think the example is overly dramatic.

          In software - especially with BigTechCo - the goal is not to ship bug-free software. It’s to ship software that supports the business goal. Typically this means gaining new customers and reducing user churn. Ultimately, the end goal of publicly traded companies is to provide shareholder value. And Apple is damn good at this, generating $50B in profit on $240B in revenues per year.

          All the examples in this blog post are ones that won’t result in user churn. The Catalina music app having a buggy section? Big deal, it will be fixed in the next update. The Amazon checkbox issue? Same thing: it will be prioritised and fixed sometime. They are all side-features with low impact. The team might have known about it already. Or - more likely - this team did not spend budget on thorough testing, as what they were building isn’t as critical as some other features.

          The Skype app was full of smaller bugs like this, yet it dominated the market for a long time. When it failed, it was not for this. Catalina likely spent resources on making sure booting was under a given threshold and updates worked flawlessly: things that, if they go wrong, could lead to losing customers or gaining fewer new ones, and so would directly impact revenue.

          Finally, a (very incorrect) fact:

          Lack of resources? This is Apple, a company that could’ve hired anyone in the world. There are probably more people working on Music player than on the entire Spotify business. Didn’t help.

          This is plain false and very naive thinking. Teams are small at Apple and the music player team for Catalina is likely 10-15 people or less, based on my experience. While Apple could hire an army for an app like this, then they would not be the $1T company they are today. They have that valuation because they are very good at consistently generating high profits: for every $10 of revenue, they generate $2 of profit. They hire the number of people needed to make a good enough product and don’t spend money just because they have it.

          What did change is that Apple used to have a huge budget for manual testers: it was insane, compared to other companies. Due to rationalising - publicly traded company et al - the testing department is likely smaller for non-critical apps. Which puts them in line with, or slightly above, the rest of their competitors.

          I am not saying that bugs in software are great. But consumer-facing software development is more about iteration, speed and launching something good enough, than it is about perfection. It’s what makes economic sense. For other industries, like air travel or space, correctness is far more important, and it comes at the expense of speed and iteration.

          It’s all trade-offs.

          1. 12

            It’s to ship software that supports the business goal.

            This is really the fundamental miss of the author. Author doesn’t understand that (1) crap happens and (2) the level of quality required for a satisficed user is lower than he thinks.

            Teams are small at Apple and the music player team for Catalina is likely 10-15 people or less, based on my experience.

            Also, I can’t lay my hands on a citation, but I think it’s been studied that smaller teams produce better quality software (up to a point, ofc).

            1. 3

              All the examples in this blog post are ones that won’t result in user churn. The Catalina music app having a buggy section? Big deal, it will be fixed in the next update. The Amazon checkbox issue? Same thing: it will be prioritised and fixed sometime. They are all side-features with low impact. The team might have known about it already. Or - more likely - this team did not spend budget on thorough testing, as what they were building isn’t as critical as some other features.

              The inability to open the iTunes Store might be bad for sales, so they’ll probably want to fix that one. But yes, as long as the basic features are working, these bugs are fine, on some level. This is how it is.

              I think he is trying to highlight something on a more fundamental level: it should not be so easy to write these kinds of bugs. The developers should have to go out of their way to write them. But with the tools they have been given, it seems they have to work very hard to avoid writing bugs. It is like they have been given hammers that by their nature have a tendency towards hitting thumbs and sometimes manage to hit both your thumbs at the same time.

              Let’s turn it around. Suppose software had a fundamentally good and auspicious nature. Suppose also that your product owner was a tricky fellow who wanted to add some bugs in your program. He comes up with a user story: as a user, sometimes I want to have an item be selected, but not highlighted, so as to further my confusion. I think the result of this would be a commit with some kind of conditional statement, table-driven code or perhaps an extra attribute on the list items that activates the bug path. The point being that you would need to add something to make it buggy. With the tools the Catalina music app team had, they very likely did not have to add anything at all to get those bugs.
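              A minimal sketch of that idea, assuming a hypothetical list widget (the `ListView` and `ListItem` names are made up for illustration and are not from any real UI toolkit): highlighting is computed from the selection rather than stored separately, so a "selected but not highlighted" item would take extra code to produce.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ListItem:
    label: str


@dataclass
class ListView:
    """Hypothetical list widget: highlighting is derived from selection."""

    items: List[ListItem]
    selected_index: Optional[int] = None  # the single source of truth

    def select(self, index: int) -> None:
        if not 0 <= index < len(self.items):
            raise IndexError(f"no item at index {index}")
        self.selected_index = index

    def is_highlighted(self, index: int) -> bool:
        # There is no separate "highlighted" flag that could drift out of sync.
        return index == self.selected_index

    def render(self) -> List[str]:
        return [
            ("> " if self.is_highlighted(i) else "  ") + item.label
            for i, item in enumerate(self.items)
        ]


view = ListView([ListItem("Albums"), ListItem("Artists"), ListItem("Songs")])
view.select(1)
rendered = view.render()
```

              To get the tricky product owner's bug here, someone would have to add a second flag or a conditional in `is_highlighted`.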

              The instances of bugs he brings up suggest to me that the tools involved were not used for their intended purpose. They were used to create simulacrum software. The Amazon checkboxes probably get their state from a distributed system where they “forgot” to handle multiple pending state changes. They could instead have used a design where this would never be an issue at all. If it had been designed properly, they would indeed have needed to add code to get it that buggy. And the buggy list items are probably not even in lists, but merely happen to sometimes visually resemble lists. And so on.
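              One possible shape for such a design, as a sketch only: this is a guess at the class of problem, not how Amazon's checkboxes actually work, and the `Checkbox` class and its method names are hypothetical. Each change gets a sequence number, and a server response is ignored once a newer change has superseded it, so out-of-order responses cannot flip the box back.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Checkbox:
    """Hypothetical checkbox whose server round-trips may complete out of order."""

    checked: bool = False   # what the user sees: always the latest intent
    latest_seq: int = 0     # sequence number of the newest request sent
    confirmed_seq: int = 0  # newest request the server has acknowledged

    def toggle(self) -> Tuple[int, bool]:
        """Record the user's intent; return (seq, value) to send to the server."""
        self.latest_seq += 1
        self.checked = not self.checked
        return self.latest_seq, self.checked

    def on_server_ack(self, seq: int, value: bool) -> None:
        """Apply a server response unless a newer request has superseded it."""
        if seq < self.latest_seq:
            return  # stale response: a later toggle is still authoritative
        self.checked = value
        self.confirmed_seq = seq

    @property
    def pending(self) -> bool:
        return self.confirmed_seq < self.latest_seq


box = Checkbox()
first = box.toggle()        # user checks the box
second = box.toggle()       # user unchecks it before the first reply arrives
box.on_server_ack(*second)  # the replies arrive out of order
box.on_server_ack(*first)   # stale reply, ignored
```

              After both replies the box shows the user's last choice (unchecked) and nothing is pending, whichever order the replies arrive in.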

              It is not good that this sort of thing happens regularly. One example from my own experience: the Swedish Civil Contingencies Agency (MSB) has an app that alerts you to important events. I cannot count how many times it has lost its settings and has defaulted to alerting about everything that happens everywhere. I have uninstalled that app. When the war arrives, I’ll be the last one to know.

              1. 4

                Teams are small at Apple and the music player team for Catalina is likely 10-15 people or less, based on my experience.

                Yes, this accords with my experience. I would be surprised if it were that many people; the number of people who were working on the iTunes client was shockingly small, and they’d never grow the team just for Music.

                1. 3

                  Based on my experience at Apple, I’d be surprised if the Music app had an actual team. Much more likely it was a few folks from another team who were tasked with creating it as a part-time project rather than as their primary one. Or, it could’ve been 2 folks who were fairly junior and tasked with writing it with occasional assistance.

                  In my experience teams and projects were sorely understaffed and underfunded (unless it was wasteful projects like the doomed self-driving car, in which case they were showered with far too much money and people). It was just shocking to work at a company that had so much excess cash and couldn’t “afford” to add people to projects that could really use them.

                2. 2

                  All the examples in this blog post are ones that won’t result in user churn. The Catalina music app having a buggy section? Big deal, it will be fixed in the next update. The Amazon checkbox issue? Same thing: it will be prioritised and fixed sometime. They are all side-features with low impact. The team might have known about it already. Or - more likely - this team did not spend budget on thorough testing, as what they were building isn’t as critical as some other features.

                  One bug like this would not make me “churn”, but two or three would. I no longer use Chrome, nor iTunes, nor iOS, because of exactly this type of drop in quality. I no longer even bother with new Google products, because I know that they’re more likely than not to be discontinued and dropped without support. I no longer consider Windows because of all the dark patterns.

                  I am on the bleeding edge relative to less technical users, but I am also a lazy software dev, meaning I hate tinkering just to make something work. I’ll probably go with GNU for my next computer. And a year or two later, I bet so will my neighbor who just uses email and my friend who just needs to edit photos and my other friend who just writes papers.

                  1. 7

                    I no longer use Chrome, nor iTunes, nor iOS, because of exactly this type of drop in quality . . . I hate tinkering just to make something work.

                    I’ll probably go with GNU for my next computer.

                    🤨

                    1. 1

                      Not sure if you intended for that to show up as “missing Unicode glyph”, but it works.

                      You’ve got a point there.

                      Until now, I have been using macOS for the hardware support, a few niche apps for stuff like playing music, and a GNU VM (Fedora LXDE) for dev which has proven to be a low-maintenance setup all around.

                      1. 2

                        The “missing Unicode glyph” is the Face with one eyebrow raised emoji and shows up for me using Chrome on iOS and Windows.

                        1. 2

                          And FF on Android

                3. 14

                  The problem with evolutionary software development is that it, like biological evolution, results in a pile of hacks-upon-hacks that nobody really understands and that could be made grossly more efficient. Were I a bit cheekier I’d make a crack about it being a clear case against intelligent design, but that’d belabor the point. :P

                  To me, this is probably the key observation from the essay:

                  The software crisis is systemic and generational. Say, the first generation works on thing X. After X is done and becomes popular, time passes and the next generation of programmers comes and works on Y, based on X. They do not need to know, exactly, how X is built, why it was built that way, or how to write an alternative X from scratch. They are not lesser people or lazier, they just have no real need to write X2 since X already exists and allows them to solve more pressing tasks.

                  I’ve many times shot myself in the foot by claiming to myself (or others) that I knew a lower level of abstraction because I used its api, when in fact I didn’t really. Examples of this:

                  • Wrapping memory operations and believing I understood how memory pages worked
                  • Wrapping 3D rendering routines believing I understood how triangles get drawn
                  • Writing matrix inversion routines believing I understood how matrices work
                  • Using embedded code templating engines believing I understood how templating works
                  • Using a calendar date picker plugin believing I understood how dates work and correspond to days of the week

                  The striking thing to me about most of those examples is how low-level/niche they seem, and how quickly I found out that I actually had no clue what the hell was happening when I tried to do them.

                  The thing that really spooks me is that the folks that have solved these issues are leaving our industry. Hell, a lot of the work that I’ve done I can barely remember the intricacies of years later. Worse, it seems to me that we have industry pressures to:

                  • Skip documentation of any sort (just read the code!)
                  • Prefer buying (AWS) or incrementally mudballing (babel/webpack) instead of solving problems ourselves
                  • Celebrate clever developers (conference talks, books, full stop) instead of productive, reliable ones (when was the last time your name appeared on your product’s about page?)
                  • Decry old techniques as being wrongthink and needing to be cancelled, because a better approach comes along (frontend JS, C vs. Rust, etc.)
                  • Lose perspective on how “hard” business/computational problems are (when was the last time you estimated the resources/number of systems a business action should take? can you trace that through your system? if not, why not?)

                  Each one of those pressures has a real and reasonable fountainhead, but I fear that very few people are spending the time to figure out how they’re warping the fabric of our practice in such a way as to prevent long-term engineering of any meaningful sort.

                  1. 12

                    weak men … already know you won’t find time to watch it

                    My stopwatch estimates I spent a minute and thirty-eight seconds reading this blog post: roughly one sixtieth of the time of Blow’s talk. My time and my attention are far more valuable than Yet Another Frigging Talk/Podcast. Fortunately, OP implicitly recognized this and wrote text, the appropriate medium for serious thinking and concepts.

                    That said.

                    Software is a house of cards because our economic system does not reward or prize proper reinvention, and American consumers passionately reject (remember Windows 8’s rollout?) substantial change in how their world works, demonstrating a lived conservatism about how their tools work.

                    The entire stack is built on these notions of backwards compatibility teetering on adhoc processors and systems from the 70s. Then the Web is rolled in, and now we’re building on a language designed in 2 weeks, just to make the monkey dance.

                    Yet, it makes money, reliability is tolerable, and profits continue flowing.

                    To really fix the situation, you’re looking at scrapping (in the end), everything from the x86 interface on up, and butchering so many sacred cows it’d be a revolution in the religion of software dogma. Costs for the end consumer & businesses would, in the medium term, probably rise 10-50x, since no more commodity would exist for the system.

                    1. 8

                      our economic system does not reward or prize proper reinvention … Costs for the end consumer & businesses would, in the medium term, probably rise 10-50x, since no more commodity would exist for the system.

                      Not such a rousing case for “proper”. As always, when people look at legacy, they see the ugly surface and ignore the deep value below the surface.

                      Certainly the pile of hacks is nasty and I don’t like it, but to wag one’s finger about “proper” ways and claim that a working system is a failure of economics(?), makes no sense.

                      butchering so many sacred cows

                      Legacy is the opposite of a sacred cow, it actually provides value, and is detested rather than worshipped. The Proper Way is the sacred cow, a false ephemeral idol that yields vaporware and lofty claims.

                      By the way, proper engineering considers real-world constraints including time/financial budget, effort vs payoff (leverage), and, yes, existing investments.

                      Meanwhile https://urbit.org has actually done what you suggest: reimplement the entire stack. Have you tried it? Or too busy on the legacy stack? :)

                      1. 0

                        I’m familiar with the dark enlightenment’s creation, thanks.

                        The Proper Way is the sacred cow, a false ephemeral idol that yields vaporware and lofty claims.

                        oh, go away.

                        ed: That is demonstrably false; a genuine false centrism that propagates terrible ideas and prioritizes legacy in the name of value. I simply refuse to engage with that kind of ahistoricity and bad philosophy of science.

                        1. 3

                          prioritizes legacy in the name of value.

                          If value isn’t a worthwhile goal, you may excuse the passers-by on your street corner for being confused about your meaning.

                    2. 7

                      I feel like the original post (and some of the subsequent discussion here) is full of armchair quarterbacking and general smugness, even though I do sympathize with the fact that computers are ridiculously faster than 10 years ago but somehow seem slower (Dan Luu dives into that a bit if you’re interested). That said, I don’t think the problem is frameworks-upon-frameworks or leaky abstractions or that modern developers don’t have a Mel-like total understanding of every transistor on their system.

                      My view of the problem is that developers are reluctant to acknowledge that, as a complex system, software will always run in a degraded mode. If you acknowledge that things can’t be perfect, you can shift the discussion to how much you want to spend to approach perfection. Conversely, you can then think about where you want to spend your time with that budget.

                      1. 10

                        Jonathan Blow’s “Preventing the collapse of civilization” talk is a must watch.

                        Twitter newly rebuilt UI takes 7× longer to load first tweet

                        There’s no way all those websites that invested tons in A/B testing aren’t intentionally making things load so slowly. I’m guessing it’s because the wait makes it more addictive or something like it. It’s either that or incompetence, and I don’t know which is worse.

                        1. 21

                          It becomes clearer when you have 20 departments, each with their own tracking pixel. Death by 1000 cuts.

                          1. 12

                            I’ve some personal experience on what happens there: You just forget that not everyone is using a:

                            • Pretty expensive and modern phone
                            • With tons of GHz and RAM
                            • With 5G connectivity or high speed WiFi
                            • Connecting thru LAN network or geographically near your datacenter

                            And those are the kind of details you just forget, and nobody actually cares because time-to-interactive isn’t as measured as click-thru ratios.

                            The best I could do when I was working on mobile app development was to use the crappiest possible Android or iPhone around for testing. Good enough to have a fast workflow of stop application, install, open, test, and repeat. Bad enough that collapsing the memory or CPU wasn’t difficult.

                          2. 12

                            I haven’t worked with a quality focused team since ~2009, so it has nothing to do with weakness, and turning this into a moral choice that someone is making seems misplaced to me. I think it’s a capitalist choice, and yet again capitalism optimizing for nothing useful.

                            The worse is better theory winning is not some victory lap for C, but I believe just a part of the fact that consumers / clients have no other choices, and if they do the cost and effort of switching is almost an impossible hurdle. The idea of me switching to an iPhone or my wife switching to Android is almost an insurmountable set of unknown complexity.

                            1. 2

                              I don’t think the article really states it as a moral choice, but rather as an emergent property of software development as it is practiced.

                              1. 1

                                I’m sure there’s a philosophical name for this. It’s a practice that results in morally problematic results, despite that practice not being a deliberate moral choice. Sort of like how capitalism as currently practiced fills the ocean with microplastic garbage despite nobody making a choice to do that.

                                1. 5

                                  Hot take: most “morality” is just a matter of aesthetics. Billions of people would presumably rather be alive than not existing because a non-capitalist system is grossly inefficient at developing the supporting tech and markets for mass agriculture. Other people would prefer that those folks not exist if it meant prettier beachfront property, or that their favorite fish was still alive.

                                  Anyways, that’s well off-topic though I’m happy to continue the conversation in PMs. :)

                                  1. 8

                                    Just as “software development” is a pretty broad term, “capitalism” is a pretty broad term. I wouldn’t advocate eliminating capitalism any more than I would advocate eliminating software development. The “as currently practiced” is where the interesting discussion lies.

                                  2. 3

                                    There’s an economic name for it - externality - though economics is emphatically not philosophy.

                                    1. 1

                                      Sort of like how capitalism as currently practiced fills the ocean with microplastic garbage despite nobody making a choice to do that.

                                      This is a classic False Cause logical fallacy.

                                      Capitalism is not the cause of microplastic pollution. The production of microplastics and subsequent failure to safely dispose of microplastics is the cause of microplastic pollution.

                                      Microplastics produced in some centrally-planned wealth-redistribution economy would be just as harmful to the environment as microplastics produced in a Capitalist economy (although the slaves in the gulags producing those microplastics would be having less of a fun time).

                                      Further example:

                                      • Chlorofluorocarbons were produced in Capitalist economies.
                                      • Scientists discovered that chlorofluorocarbons are poking a hole in the ozone layer and giving a bunch of Australians skin cancer.
                                      • People in Capitalist economies then decided that we should not allow further use of chlorofluorocarbons.
                                      1. 3

                                        Again, the key phrase here is not “capitalism”, but “as currently practiced”. Capitalism doesn’t cause microplastics, but it doesn’t stop them either. In other words microplastics are “an emergent property of capitalism as it is practiced”. You could practice it differently and not produce microplastics, but apparently the feedback mechanism between the bad result (microplastics/bloated software) and the choices (using huge amounts of disposable plastics/using huge amounts of software abstractions) is not sufficient to produce a better result. (Of course assuming one thinks the result is bad to begin with.)

                                        1. 0

                                          Of course assuming one thinks the result is bad to begin with.

                                          That is really the heart of the matter, as far as I see it. In contemporary discourse, capitalism as a values system (versus capitalism as a set of observations about markets) does not have a peer, does not have a countervailing force.

                                          I’m sure there’s a philosophical name for this

                                          @leeg brought this up as well, but “negative externality” is in the ballpark of what you are looking for. An externality is simply some effect on a third party whose value is not accounted for within the system. Environmental pollution is a great example of a negative externality. Many current market structures do not penalize pollution at a level commensurate with the damage caused to other parties. Education is an example of a positive externality: the teachers and administrators in schools rarely achieve a monetary reward commensurate with the long-term societal and economic impact of the education they have provided.

                                          Societies attempt to counteract these externalities by some degree of magnitude (regulations and fines for pollution, tax exemptions for education), and much ink is spilled in policy debates as to whether or not the magnitudes are appropriate.

                                          Bringing back my first statement: capitalism (née economic impact) is not only a values system, but is the only one that is assumed to be shared in contemporary discourse. This results in a lot of roundabout arguments, in pursuit of other values, being made in economic terms.

                                          What people really wish to convey, what really motivates people, may be something else. However, they cannot rely on those values being shared, and resort to squishy, centrist, technocratic studies and statistics that hide their actual values, in hopes other people will at least share in the appeal to this-or-that economic indicator (GDP, CPI, measures of inequality, home ownership rates, savings rates, debt levels, trade imbalances, unemployment, et cetera). This technocratic discussion fails to resolve the actual difference in values, and causes conflict-averse people to tune it out entirely, thus accepting the status quo (“capitalism”). I lament this, despite being very centrist and technocratically-inclined myself.

                                          Rambling further would eclipse the scope of what is appropriate for a post on Lobsters, so I will chuck it your way in a DM.

                                          1. [Comment removed by author]

                                            1. 3

                                              I apparently chose an explosive analogy here, and now I’m fascinated by all the stuff that’s coming back.

                                              But let me just try again with something less loaded…how about transportation?

                                              The bad effects in the essay (wasted resources, bugs, slowness, inelegance) are a result of how we do software development. Assume for argument that most people don’t choose waste, bugs, slowness, and inelegance deliberately. Nevertheless, that’s what we get. It’s an “emergent property” of all the little choices of how we do it.

                                              Similarly, most people—I hope certainly the engineers involved—don’t choose to have the NOx pollution, two-hour commutes, suburban sprawl, unwalkable communities, and visual blight that result from how we do transportation. It just happens because of how we do it.

                                              So we’re all actively participating in making choices that cause an outcome that a lot of participants don’t like.

                                              My point was just that there are lots of things like this, not just software development. So I figure this sort of problem must have a name.

                                              (And yes, this means writing an essay about how awful the result is doesn’t do anything to fix it, because the feedback from result to cause is very weak.)

                                              1. 2

                                                So I figure this sort of problem must have a name.

                                                Engineering. Engineering is trading off short commutes for private land. Engineering is a system of cars that get every individual acting alone where they need to go, even though getting all people at the same destinations from the same origin really calls for mass transit. Engineering is families with kids making different living and thus commuting arrangements than single people. These are all tradeoffs.

                                                The ideal keyboard takes no space and has a key for everything you want to type from letters to paragraphs. Everything else is engineering. The ideal city has zero school, work, leisure, and shopping commutes for everybody. What we have instead is engineering.

                                                The ideal bus line goes to every possible destination and stops there. It also takes no time to complete a full circuit. We compromise, and instead have buses that work for some cities and really don’t for others.

                                  3. 5

                                    if we weren’t building a particular plane uninterruptedly, then after just 50 years it is already easier to develop a new one from scratch rather than trying to revive old processes and documentation. Knowledge does not automatically transfer to the next generation.

                                    For software this can be just months. It explains why we reinvent the wheel all the time. It isn’t even bad since it is economically more efficient to start from scratch.

                                    Is there any research into this? I guess for critical infrastructure some analysis is done. For example, I heard they always build the flight control centers twice and also store lots of replacement parts somewhere. Similarly for luxury cars which are built specifically to customer wishes.

                                    1. 6

                                      Surely, I agree software is too complex and built on abstractions that programmers don’t understand. However, I don’t think these types of essays provide great evidence of that. Yes there are bugs in Apple’s flagship app, and whatever that Amazon app was. How exactly does this provide evidence of anything? Without knowing the architecture, design, etc, it definitely doesn’t support this thesis.

                                      And I will admit I didn’t watch the video, I’m on mobile and not able to watch it right now. If the video contains some evidence that I missed, please let me know.