1. 6

    I would have liked to tag this article with datastructures but there is no such tag.

    1. 6

      compsci tends to be the correct goto for that stuff. :)

      1. 1

        On top of it, using CompSci means that clicking the tag will teach them data structures, algorithms, optimizations, and more!

    1. 6

      One potential limitation is that compute shaders are a relatively new feature, and are only available in GPUs made in the last five-or-so years.

      Something to watch out for with GPU-based 2D rendering in general is that it might unnecessarily make the application unusable on older hardware. I’m reminded of a sad case involving WPF about a decade ago: a Bible study application where the choice of video card actually mattered. I know about this particular case because it affected my brother when he tried to run that program on an older laptop. I find it sad that, as shown by the linked forum thread, people had to take time away from their real work to discuss video cards. And bear in mind that the application in question was basically a specialized text viewer and editor.

      1. 2

        Interesting to see that even in a forum of people that probably care a lot about ethics, you’ll see this comment:

        Intel graphics are the worst!

        I thought at least these people would care more about freedom than about performance. But I guess not.

        1. 1

          Well, it’s a Christian forum rather than one on ethics in general. Many Christians aren’t concerned much about those things due to the sovereignty of God. As in, how He exerts His will on all things to accomplish His purposes regardless of the schemes of men. From the past (eg Nineveh in Nahum or reforming Israel) to future (eg Revelation), He’s not stopped by much worse oppression and corruption than software licensing. Ex: The number of believers growing massively in China despite the Chinese regularly jailing them. Similarly, but in smaller numbers, in North Korea. His Word and Will are unstoppable.

          We’re also to be judged for how much we loved/served God and other people. Loving others is things like feeding them, equipping them, being kind, listening, etc. Obviously, bringing them to Christ as well for redemption, peace, and purpose. I’ve personally found I can’t do those things while simultaneously spending lots of time tinkering with FOSS. I’ve cut way, way back on tech to invest in both God’s Word and other people, with Him moving consistently.

          Finally, there’s the fact that believers in Christ get the Holy Spirit to transform us but we’re still human. Our human nature remains till death. Centering on Christ supernaturally transforms us in a gradual way to give more of His traits if we stay on it. Jesus says His people will bear His fruits (i.e. character). 1 John 3 has stronger words. If they don’t or are casual about it, human traits dominate with a group of them acting like any other. For instance, not caring about software freedom and griping about electronics is extremely normal. We are to call out evil, though. Grey area on graphics cards. ;)

      1. 25

        How has AGPL failed? Quoting the introduction to Google’s own policies, which are the top hit for “agpl google” on DuckDuckGo:

        WARNING: Code licensed under the GNU Affero General Public License (AGPL) MUST NOT be used at Google. The license places restrictions on software used over a network which are extremely difficult for Google to comply with.

        This seems like a resounding success of AGPL.

        Proprietary distributed systems frequently incorporate AGPL software to provide services.

        Who, and which software packages? This isn’t just about naming and shaming, but ensuring that those software authors are informed and get the chance to exercise their legal rights. Similarly, please don’t talk about “the legal world” without specific references to legal opinions or cases.

        I feel like this goes hand-in-hand with the fact that you use “open source” fourteen times and “Free Software” zero times. (The submitted title doesn’t line up with the headline of the page as currently written.) This shows an interest in the continued commercialization of software and exploitation of the commons, rather than in protecting Free Software from that exploitation.

        1. 8

          How has AGPL failed?

          I said how in the article:

          The AGPL was intended, in part, to guarantee this freedom to users, but it has failed. Proprietary distributed systems frequently incorporate AGPL software to provide services. The organizations implementing such systems believe that as long as the individual process that provides the service complies with the AGPL, the rest of the distributed system does not need to comply; and it appears that the legal world agrees.

          The purpose of the AGPL is not to stop commercial users of the software, it’s to preserve the four freedoms. It doesn’t preserve those freedoms in practice when it’s been used, so it’s a failure.

          But really AGPL doesn’t have anything to do with this. No-one claims AGPL is a license like the one I describe in the article.

          Proprietary distributed systems frequently incorporate AGPL software to provide services.

          Who, and which software packages?

          mongodb is a high-profile example, used by Amazon.

          1. 10

            I’m fairly certain that the version of mongodb that Amazon based DocumentDB off of was Apache licensed, so I don’t think that applies here. From what I’m seeing, they also explicitly don’t offer hosted instances of the AGPL licensed versions of Mongo.

            1. 5

              But really AGPL doesn’t have anything to do with this. No-one claims AGPL is a license like the one I describe in the article.

              Then your article headline is misleading, is it not?

              1. 2

                This is true. MongoDB changed their license, in response to Amazon using forks of their own software to compete with them.

                1. 1

                  That’s fair, you’re probably right. There are lots of other hosted instances of MongoDB that used the AGPL version though, so MongoDB is still the highest-profile example of this. That was the motivation for MongoDB’s move to SSPL.

                2. 2

                  I don’t see a laundry list here. I appreciate that you checked your examples beforehand and removed those which were wrong, but now there’s only one example left. The reason that I push back on this so heavily is not just because I have evidence that companies shun AGPL, but because I personally have been instructed by every employer I’ve had in the industry that AGPL code is unacceptable in their corporate environment, sometimes including AGPL developer tools! It would have been grounds for termination at three different employers, including my oldest and my most recent employments.

                  Regarding MongoDB, I have no evidence that AWS violated the terms of the AGPL, and they appear to have put effort into respecting it somewhat. It seems that MongoDB’s owners were unhappy that their own in-house SaaS offering was not competing enough with others, and they chose licensing as their way to fight. Neither of these companies is good, but neither appears to be disrespecting the AGPL.

                3. 4

                  the agpl says

                  Notwithstanding any other provision of this License, if you modify the Program, your modified version must prominently offer all users interacting with it remotely through a computer network (if your version supports such interaction) an opportunity to receive the Corresponding Source of your version by providing access to the Corresponding Source from a network server at no charge, through some standard or customary means of facilitating copying of software.

                  in other words, you cannot run your own proprietary fork of an agpl program; you have to offer users the modified sources. it says nothing about the sources of the other programs that that program communicates with, or the network infrastructure and configuration that comprises your distributed system.

                  1. 3

                    Yes. This is how we define distributed systems, in some security contexts: A distributed system consists of a patchwork network with multiple administrators, and many machines which are under the control of many different mutually-untrusting people. Running AGPL-licensed daemons on one machine under one’s control does not entitle one to any control over any other machines in the network, including control over what they execute or transmit.

                    Copyright cannot help here, so copyleft cannot help here. Moreover, this problematic layout seems to be required by typical asynchronous constructions; it’s good engineering practice to only assume partial control over a distributed system.

                    1. 1

                      So, that seems fine then? What’s the problem with that?

                      1. 1

                        the problem, as the article says, is that we have no copyleft license that applies to entire distributed systems, and it would be nice to. not so much a problem with the agpl as an explanation of why it is not the license the OP was wishing for.

                        1. 4

                          I explain more upthread, but such licenses aren’t possible in typical distributed systems. Therefore we should not be so quick to shun AGPL in the hopes that some hypothetical better license is around the corner. (Without analyzing the author too much, I note that their popular work on GitHub is either MIT-licensed or using the GH default license, instead of licenses like AGPL which are known to repel corporate interests and preserve Free Software from exploitation.)

                          1. 2

                            We have Parity as a start. Its author says:

                            “Parity notably strengthens copyleft for development tools: you can’t use a Parity-licensed tool to build closed software.”

                            Its terms are copyleft for other software you “develop, operate, or analyze” with licensed software. That reads broadly enough that anything an operator of distributed systems both owns and integrates should be open sourced.

                    1. 3

                      My cognitive dissonance is off the charts these days with Microsoft. Are they the goodies?

                      1. 24

                        They are a big company doing many different things. It saves a lot of heartache to ditch the goodie/baddie scale and consider each activity on its merits. And in this case I think they’re onto something good. I for one would sacrifice 10-50% JS speed to rule out an entire class of common security bugs.

                        1. 2

                          ditch the goodie/baddie scale and consider each activity on its merits.

                          I gradually changed to this way of thinking after I crossed 33-34; however, I struggle to communicate it to people. Is there a name for this mental model?

                          1. 3

                            People are complex, organizations change over time. Initial impressions matter a large amount, but people who are interested update their opinions on the basis of new information, while retaining history.

                            One way to do this is bayesian estimation. The two primary problems with bayesian estimation are that people are no good at keeping track of their priors and are no good at estimating new probabilities.
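
                            As a toy illustration of that kind of updating, here is a quick Rust sketch of a Bayesian revision of a belief about an organization (all numbers invented):

                                fn update(prior_good: f64, p_obs_if_good: f64, p_obs_if_bad: f64) -> f64 {
                                    // Bayes' rule: P(good | obs) = P(obs | good) * P(good) / P(obs).
                                    let evidence = p_obs_if_good * prior_good + p_obs_if_bad * (1.0 - prior_good);
                                    p_obs_if_good * prior_good / evidence
                                }

                                fn main() {
                                    let mut belief = 0.5; // start undecided about the organization
                                    // Each pair: how likely the observation is if they are good vs bad.
                                    for (p_if_good, p_if_bad) in [(0.8, 0.3), (0.2, 0.6), (0.7, 0.4)] {
                                        belief = update(belief, p_if_good, p_if_bad);
                                        println!("belief after observation: {:.2}", belief);
                                    }
                                }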

                            It is reasonable to assume that Microsoft, at any time, is doing something clever, something stupid, and a lot of things that they think will make them money in a medium-term future.

                            1. 1

                              Right tool for the job. It almost always redirects the conversation to the technical merits rather than moral policing. That does not mean moral and ethical considerations are useless; it just helps better calibrate the discussion.

                        2. 17

                          They are:

                          • Giving schools free copies of their office package in order to maintain their OS and productivity software dominance on the desktop.
                          • Lobbying public sector heavily to get locked into Azure.
                          • Using their dominant position to get the whole public sector as well as kids and students onto Teams and O365 with a recurring subscription.
                          • Cross selling One Drive heavily on their non-enterprise Windows editions.
                          • Showing ads and installing bloatware in their non-enterprise Windows editions.

                          They want nothing less than total platform dominance and they don’t care about the little people, obviously. The question is, do you consider total dependence of our administrations and of the less-well-off on them a goodie?

                          1. 4
                            • Microsoft fights to get big Pentagon contracts. GitHub can’t even drop a small ICE contract, because Microsoft doesn’t want to offend the military-industrial complex.
                            1. 3

                              Thank you.

                              1. 6

                                No problem.

                                They also care about developers and fund some important research, so it’s not all black & white. Just to be fair.

                              2. 1

                                Google, Amazon, Apple, etc. do things too; it’s not like MS is alone in their MY-PLATFORM-ONLY perspective.

                                1. 1

                                  True, but I have yet to see GAA employees write local laws.

                                  I am pretty sure their lobbyists are real busy in Brussels and Washington, though.

                              3. 8

                                I think, very approximately, corporations which sell platforms tend to act like arseholes when they have the upper hand (in terms of network effects and lock-in) and saintly when they don’t.

                                e.g. 10-14 years ago, with win32 being the juggernaut of “we already sunk cost into desktop apps on this platform” and the iPhone only just being released, Microsoft were barely even bothering to disguise their efforts to extinguish open protocols. Meanwhile Apple were pushing html5 hard as the big thing that would allow software to work beautifully on all platforms.

                                Whereas now we have Microsoft very much less dominant and Apple have a near monopoly on phones purchased by the subset of users who spend money on their phones. So Microsoft are promoting open everything, and Apple are doing things like strategically dragging their feet on html5 features, adding horrendous bugs to iOS Safari and not fixing them for months to years.

                                1. 5

                                  Nope. I think they’re doing a great job of “embrace” though.

                                  http://www.catb.org/~esr/halloween/

                                  1. 5

                                    No such thing–these companies look out for themselves, period. But you are better off with more than one company duking it out, because to compete they will (sometimes) do things that are pro-consumer or pro-developer or whatever.

                                    Microsoft on top of the world tried to keep the Web stagnant once they’d killed Netscape, and undermined the growth of Linux and GPL software every way they could. Microsoft as a bit more of an underdog likes Web standards, runs GitHub, and builds the favorite open-source dev environment of a bunch of new coders.

                                    Same with, for example, AMD and Intel. AMD got competitive and prices of high-core-count chips plummeted–great! Now with a chip shortage and a solid follow-up product, AMD is in a position to start raising prices, and they are. Intel getting its manufacturing improvements on track would probably give us more of a price war in certain segments, not less!

                                    I’m old enough to remember when AWS was exciting to smaller devs because you didn’t have to shell out upfront for servers or provision far in advance for spiky loads. 🤣 Now it’s a huge profit center for Amazon and there are plenty of justifiable complaints about pricing (notably for egress) and undermining open-source-centric companies by using their market position to compete with the product developer’s own managed offerings.

                                    Tactically, I guess I want underdogs to reach competitiveness, and for companies to see some benefit to doing good things (e.g. open sourcing things) and cost to bad ones (e.g. pushing out little companies) even when the immediate economic incentives point the wrong way. But in the long run all these companies respond to their economic situation and none of them are your friend.

                                    1. 4

                                      It doesn’t really make much sense to blame the moral character of individual actors when there’s a system of incentives and punishments driving things. If a big company were all saints and angels, they’d be outcompeted by another company that was willing to play the cutthroat game to maximum advantage. (And it is worth noting that sometimes playing the cutthroat game to maximum advantage means doing good things, and sometimes good things come out of bad actions too; but if the market+political winds shift, they’ll (eventually) shift with it or go out of business.)

                                      They’re doing what they think will make them money without getting another nasty visit from the government regulators. So is Google and Apple and Mozilla. None are good or bad per se, they’re all just navigating the same sea.

                                      1. 3

                                        I’ll add to mordae that they were using legal muscle to extract billions in patent royalties from Android. I saw one article claim that stopped. I haven’t followed it in a while. You could check to see if they still patent troll others in general. I mean, that’s clearly evil.

                                        On showing ads, the traditional logic for desktop software, games, etc is that people who pay don’t see ads. Microsoft started putting ads in paid products, including the Xbox. After gamers protested, they expanded how many were on the screen instead of reducing it.

                                        One might also count the integration of single-player games with Xbox Live. I shouldn’t have to log in to an online service to play an offline game. If I am offline, I shouldn’t lose access to my progress, items, etc. I recall having to deal with stuff like that on Xbox. The nature of online services is that they eventually get discontinued for some products and services. Then, they’ll stop working at all or not work as well. All of this is done to maximize sales of new products using Microsoft’s and their suppliers’ platforms.

                                      1. 5

                                          I always think of the security of software that massive companies are using as more of a labor problem. It’s in their nature to keep billions in profit. They’ll spare only a slice. Can they spare enough of a slice to hire the developers to fix Linux? And, since it’s Google, Android as well?

                                        Three figures helped me understand this a while back:

                                        1. The cost of Solaris 10. It was a rewrite project to modernize a commercial UNIX. Fans of the open-source projects derived from it say it paid off. This gives an idea what their own high-quality, more-secure, clean-slate option might cost.

                                        2. The cost, in size and/or funding, of groups such as PaX Team, OK Labs, and GrapheneOS. Add in academic teams who did things like add memory safety to FreeBSD kernel or Linux apps. Maybe ExpressOS. These give an idea how much it might cost to make one or more dramatic improvements to Linux, Android, etc.

                                        3. Google makes billions in revenue on Android. Others making massive fortunes on Linux or Linux-based tech include IBM, Amazon, and defense contractors.

                                        Yeah, they could do it by themselves if they wanted. That confirms the damage is an externality to them. They’ll put in some kind of effort, leave most of the job undone, and do that even when the cost is almost nothing to them.

                                        1. 3

                                          You need to think about opportunity cost. $1 invested in security is $1 not invested in something else. The security investment is competing with things that customers are demanding. How many people will buy an Android phone instead of an iOS phone if Android ships with security feature X? How does this compare with the number of people that will buy an Android phone instead of an iOS phone if Android supports 200 new emoji? Even a big company like Google has a finite number of engineers and needs to prioritise their work. New features sell products, security improvements don’t (outside of a few markets).

                                          This would probably change a lot if regulation increased liability for vulnerabilities, but it’s not clear how that would work with open source projects. Some random person accidentally introduces a vulnerability into Linux, Google packages it up in AOSP, Samsung ships a phone with Samsung’s Android version built on top of AOSP, who should be liable if an end-user’s phone is compromised?

                                          1. 1

                                            The opportunity cost-based accounting only works if there is no liability for shipping shitty/insecure software. If only we could put an end to that…

                                            1. 1

                                              You need to think about opportunity cost. $1 invested in security is $1 not invested in something else.

                                                It’s true that businesses consider this. It might even be what Google was doing. You’ll know that if Google is, like Amazon, steadily building things that might become products, keeping an eye out for what customers love, and turning those into long-term, supported products with revenue. Instead, we see a trend at Google where they consistently create excellent tech with lots of users, sometimes taking over a niche, before canceling that tech. It’s like they don’t even try to make their investment pay off in any way. Looking at their cloud vs Amazon’s, it’s clear Amazon’s customer focus is what’s causing them to pound away at Google. What we see can’t possibly be Google being clever about what customers want.

                                                On the security side, we see a demand for security. Google pushed Chrome on both performance and security via Native Client. They promote projects like BeyondCorp and Titan. Back on the demand side, we see damaging stories in the media comparing malware rates on iOS vs Android. It’s a differentiator for iOS you’d think they’d want to eliminate. Further, there’s always been a demand for a private, at least enterprise and government, variant of Android. There’s a hard-to-estimate niche for people that pay to turn off ads. That means it’s a wise move for these companies to make a more secure version with all tracking and ads turned off, priced at whatever tracking/ads bring plus some profit. Privacy-conscious customers often pay a premium for it. So, they could increase the profit margin on that. It’s telling that they haven’t even tried despite both FOSS projects and small businesses showing how low cost it would be for Google to set that up.

                                              Saying they’re handling priorities really well is too charitable after all their missed opportunities, canceled products, and wasteful projects. Best explanation is a combination of (a) they don’t care about security for many reasons (including harder-to-sell ROI) and (b) marketing ineptitude.

                                              “This would probably change a lot if regulation increased liability for vulnerabilities”

                                              I agree. I go further wanting both regulation and liability. I’ve promoted regulation since the TCSEC (pdf) and DO-178B proved corporations will respond to it. Whole ecosystems, from reusable components to tools for building them, emerged from both sets of regulations. That makes regulations the only thing that’s proven to work. From there, I just looked to how to do it better: goal-based vs prescriptive, lower costs, more automation, etc.

                                                Liability is a different issue. We’ve seen class actions in other industries achieve risk reduction. From there, it becomes some kind of game of what profitable evil they might get away with, what the lawsuits will cost, and whether one is higher than the other. We need to make sure whatever they’re getting hit for is actionable. We also need to set the fines or damages many times higher than whatever preventing the harm would have cost. Then, if that’s consistently enforced, those doing it will clearly have acted unreasonably under a reasonable, professional standard.

                                              So, a mix of regulation and liability.

                                            2. 2

                                              They probably would rather put their efforts into Fuchsia. The Linux kernel is a bit of a commodity these days, with every tech company maintaining some level of contribution and aptitude for it. The result is less control for any individual company, which we love, but they can’t stand (see browser wars).

                                              1. 2

                                                That’s a good theory. Lots of them like control and GPL elimination. It seems like it’s why router vendors go proprietary when better stuff is available under BSD. Let’s not forget long-term, lock-in opportunities. Google seems to be in that camp, looking at how the Google Apps/Play agreements with Android manufacturers work.

                                            1. 1

                                               As far as runtime goes, the author could test it against mature CLs or Schemes. That might give an idea.

                                              1. 6

                                                My favorite part of it was its security technology (pdf) that Secure64 built a DNS product on. That PDF is an advertisement but describes its mechanisms. With all the papers I’ve submitted, I imagine we could’ve seen a lot of interesting stuff if some of those designs built on Itanium’s hardware support. Esp to hardware-accelerate or further isolate specific parts of them. To get readers thinking, an example would be how Code-Pointer Integrity cleverly used segments to do an important part of its design at high speed and isolation.

                                                Another benefit of RISCy architectures is that they’re not inherently tied to how C’s abstract machine works. In theory, you have enough registers to keep a large pile of internal state for alternative designs. Most x86 alternatives weren’t as fast and cheap as x86, though. Whereas, Itanium was at least really fast with more memory addressing. There was a lot of potential for alternative paradigms to take off by targeting it so long as enterprises could buy them (esp as appliances). That Secure64 did a good job trying to do that is why I mention them in Itanium discussions.

                                                One immediate use it had was in NUMA machines. SGI’s Altix used them. Its Prism variant was the system gamers dreamed about having (Onyx2 on MIPS before that). You can scale up critical workloads without the programming headaches of using a cluster. It was just multithreading with some consideration applied to data locality.

                                                One more benefit was reliability technology. Quite a few failures in mission-critical systems come from hardware issues developers can’t see. Even in clusters, making the individual chips more reliable reduces odds of failures in them taking down the whole thing. Especially when they use the same CPU. The Itanium had improvements in this area, I think NUMA/mainframe vendors (eg Clearpath) took advantage of that, and eventually Xeons got most or all of those features. For a while, Itanium-based designs might have crashed less.

                                                Unfortunately, most potential adopters just asked, “Can I compile my existing code on my existing system to get it to go much faster with 64-bit support?” Answer was no. Then, AMD won by focusing on and accelerating legacy code on an ISA compiler writers already optimized their stuff for, too.

                                                This was Intel’s third attempt, after i432 and i960 (awesome), to get people off legacy. They lost billions of dollars in ambitious attempts. Today, people gripe that Intel architecture has all kinds of security issues. They should’ve rewritten or patched the architecture. If anything, Intel took more risks than everyone with losses so high I’d be surprised if they take any more.

                                                1. 2

                                                  Unfortunately, most potential adopters just asked, “Can I compile my existing code on my existing system to get it to go much faster with 64-bit support?” Answer was no. Then, AMD won by focusing on and accelerating legacy code on an ISA compiler writers already optimized their stuff for, too.

                                                  Similar story with HTM.

                                                  1. 1

                                                    Azul was great, too! Seemed like they were beating the averages. Then, they ditched IIRC the hardware product. I might watch that.

                                                1. 17

                                                  This article has everything: databases, rad, different visions of computing as a human field of endeavor, criticism of capitalism. I commend it to everyone.

                                                  1. 13

                                                    Criticism of capitalism is, in theory, my main thesis, but I find it difficult to convey in a way that doesn’t get me a lot of angry emails with valid complaints, because the issue is complex and I can’t fully articulate it in a few paragraphs. But it is perhaps my core theory that capitalism has fundamentally diminished the potential of computing, and I hope to express that more in the future.

                                                    1. 3

                                                      But it is perhaps my core theory that capitalism has fundamentally diminished the potential of computing, and I hope to express that more in the future

                                                      I am on a team that is making a documentary about the history of personal computing. One of the themes that has come out is how the kind of computing that went out to consumers in the early 80s on was fundamentally influenced by wider socioeconomic shifts that took place beginning in the 70s (what some call a “neoliberal turn”). These shifts included, but were not limited to, the elevation of shareholder primacy and therefore increased concentration on quarterly reports and short-termism.

                                                      These properties were antithetical to those that led to what we would say were the disproportionate advances in computing (and proto-personal computing) from the 50s, 60s, and 70s. Up until the 80s, the most influential developments in computing research relied on long-term, low-interference funding – starting with ARPA and ultimately ending with orgs like PARC and Bell Labs. The structures of government and business today, and for the past few decades, are the opposite of this and therefore constitutionally incapable of leading to huge new paradigms.

                                                      One more note about my interviews. The other related theme that has come out is that what today we call “end user programming” systems were actually the goal of a good chunk of that research community. Alan Kay in particular has said that his group wanted to make sure that personal computing “didn’t become like television” (ie, passive consumption). There were hints of the other route personal computing could have gone throughout the 80s and 90s, some of which are discussed in the article. I’d add things like Hypercard and Applescript into the mix. Both were allowed to more or less die on the vine and the reasons why seem obvious to me.

                                                      1. 1

                                                        These properties were antithetical to those that led to what we would say were the disproportionate advances in computing (and proto-personal computing) from the 50s, 60s, and 70s. Up until the 80s, the most influential developments in computing research relied on long-term, low-interference funding – starting with ARPA and ultimately ending with orgs like PARC and Bell Labs. The structures of government and business today, and for the past few decades, are the opposite of this and therefore constitutionally incapable of leading to huge new paradigms.

                                                        This is something I’ve been thinking about a while - most companies are incapable of R&D nowadays; venture capital funded startups have taken a lot of that role. But they can only R&D what they can launch rapidly and likely turn into a success story quickly (where success is a monopoly or liquidity event).

                                                      2. 3

                                                        As with so many things. But I think mass computing + networking and our profession have been instrumental in perfecting capitalism.

                                                        Given the values that were already dominating society, I think this was inevitable. This follows from my view that the way out is a society that lives by different values. I think that this links up with our regularly scheduled fights over open source licenses and commercial exploitation, because at least for some people these fights are at core about how to live and practice our craft in the world as it is, while living according to values and systems that are different from our surrounding capitalist system. In other words, how do we live without being exploited employees or exploiting others, and make the world a marginally better place?

                                                        1. 2

                                                          Complaints about the use of the word, and maybe calling you a socialist, or something?

                                                          I wouldn’t do that to you, but I do have to “mental-autocorrect” capitalism into “We may agree software developers need salary and some SaaS stuff is useful, but social-media-attention-rent-seekers gain power, which sucks, so he means that kind of ‘capitalism’”.

                                                          There should be a word, like cronyism is the right word for what some call capitalism, or a modifier like surveillance capitalism.

                                                          1. 3

                                                            But I am a socialist. The problem is this: the sincere conviction that capitalist organization of global economies has diminished human potential requires that I make particularly strong and persuasive arguments to that end. It’s not at all easy, and the inherent complexity of economics (and thus of any detailed proposal to change the economic system) is such that it’s very difficult to express these ideas in a way that won’t lead to criticism around points not addressed, or of historic socialist regimes. So is it possible to make these arguments about the history of technology without first presenting a thorough summary of the works of Varoufakis and Wolff or something? I don’t know! That’s why I write a blog about computers and not campaign speeches or something. Maybe I’m just too burned out on the comments I get on the orange website.

                                                            1. 1

                                                              Sure, I appreciate that, though it would maybe attract bad actors less if there was some thread of synopsis that you could pull on instead of “capitalism”.

                                                              I think the problem is broad terms, because they present a large attack surface, though I do realize people will also attack outside the given area.

                                                              I’m also saddened by a lot of what’s going on in ICT, but I wouldn’t attribute it blindly to “capitalism”, but I don’t have all the vocabulary and summaries, if you will, to defend that position.

                                                              One’s capitalism is anyway different from another’s, so the definitions must be laid out. Maybe all of Varoufakis isn’t needed every time?

                                                              Nor am I convinced we’ll reach anything better with socialist or other governmental interventions. An occasional good law may be passed, or money handouts that lead to goodness, but each of those will lose in the balance to detrimental handouts/malfeasance, corruption, unintended consequences, and bad laws.

                                                              Maybe some kind of libertarian-leaning world where people have a healthy dose of socialist values but are enlightened enough to practice them voluntarily?

                                                          2. 1

                                                            I would love to see the hoops you jump through to express that. Honestly. It seems so alien to my worldview that anyone making that claim (beyond the silly mindless chants of children which I’m assuming is not the case here) would be worth reading.

                                                            1. 8

                                                              I’ve made a related argument before, which I think is still reasonably strong, and I’ll repeat the core of here.

                                                              In my experience, software tends to become lower quality the more things you add to it. With extremely careful design, you can add a new thing without making it worse (‘orthogonal features’), but it’s rare that it pans out that way.

                                                              The profit motive drives substantial design flaws via two key mechanisms.

                                                              “Preventing someone from benefiting without paying for it” (usually means DRM or keeping the interesting bits behind a network RPC), and “Preventing someone from churning to another provider” (usually means keeping your data in an undocumented or even obfuscated format, in the event it’s accessible at all).

                                                              DRM is an example of “adding a new thing and lowering quality”. It usually introduces at least one bug (sony rootkit fiasco?).

                                                              Network-RPC means that when your network connection is unreliable, your software is also unreliable. Although I currently have a reliable connection, I use software that doesn’t rely on it wherever feasible.

                                                              Anti-churn (deliberately restricting how users use their own data) is why e.g. you can’t back up your data from google photos. There used to be an API, but they got rid of it after people started using it.

                                                              I’m not attempting to make the argument that a particular other system would be better. However, every human system has tradeoffs - the idea that capitalism has no negative ones seems ridiculous on the face of it.

                                                              1. 1

                                                                 Those are shitty aspects to a lot of things, and those aspects are usually driven by profits, although not always the way people think. I’ll bet dollars to donuts that all the export features Google removes are done simply because they don’t want to have to support them. There is nothing Google wants less than to talk to customers.

                                                                 But without the profit motive in the first place, none of these things would exist at all. The alternatives we’ve thought up and tried so far don’t lead to a world without DRM; they lead to a world where media is split into that which the state approves and nobody wants to copy, and that for which possession gets you a firing squad, whether you paid or not.

                                                                1. 6

                                                                  But without the profit motive in the first place, none of these things would exist at all.

                                                                  It’s nonsensical to imagine a world lacking the profit motive without having any alternative system of allocation and governance. Nothing stable could exist in such a world. Some of the alternative systems clearly can’t produce software, but we’ve hardly been building software globally for long enough to have a good idea of which ones can, what kinds they can, or how well they can do it (which is a strong argument for the status quo).

                                                                  As far as “made without the profit motive” go, sci-hub and the internet archive are both pretty neat and useful (occupying different points on the legal-in-most-jurisdictions spectrum). I quite like firefox, too.

                                                              2. 3

                                                                “Capitalism” is a big thing which makes it difficult to talk about sensibly, and it’s not clear what the alternatives are. That said, many of the important aspects of the internet were developed outside of commercial considerations:

                                                                • DARPA was the US military

                                                                • Unix was a budget sink because AT&T wasn’t allowed to go into computing, so they just shunted extra money there and let the nerds play while they made real money from the phone lines

                                                                • WWW was created at CERN by a guy with grant money

                                                                • Linux is OSS etc.

                                                                A lot of people got rich from the internet, but the internet per se wasn’t really a capitalist success story. At best, it’s about the success of the mixed economy with the government sponsoring R&D.

                                                                On the hardware side, capitalism does much better (although the transistor was another AT&T thing and NASA probably helped jumpstart integrated circuits). I think the first major breakthrough in software that you can really peg to capitalism is the post-AlphaGo AI boom, which was waiting for the GPU revolution, so it’s kind of a hardware thing at a certain level.

                                                                1. 2

                                                                  I still disagree, but man it’s nice to just discuss this sort of thing without the name-calling and/or brigading (or worse) you see on most of the toobs. This sort of thing is pretty rare.

                                                                2. 2

                                                                  Obviously not op, but observe the difference between the growth in the scale and distribution of computing power, and what has been enabled, over the last 40 years.

                                                                  Business processes have been computerized and streamlined, entertainment has been computerized, and computerized communications especially group communications like lobsters or other social media have arrived. That’s not nothing, but it’s also nothing that wasn’t imaginable at the start of that 40 year period. We haven’t expanded the computer as a bicycle of the mind - consider simply the lack of widespread use of the power of the computer in your hand to create art. I put that lack of ambition down to the need to intermediate, monetize, and control everything.

                                                                  1. 1

                                                                    Additionally, the drive to push down costs means we have much less blue-sky research and ambition; it also means that things are done to the level that they’re barely acceptable. We see that right now with the security situation: everything is about playing whack-a-mole quicker than hackers, rather than investing in either comprehensive ahead-of-time security practices or in software that is secure by construction (whatever that would look like).

                                                                3. 1

                                                                  What I used to tell them is that it’s basically a theory that says each person should be as selfish as possible: always trying to squeeze more out of others (almost always money/power), give less to others (minimize costs), and put as many problems on them as possible (externalities).

                                                                  The first directly leads to all kinds of evil, damaging behavior. There’s any number of schemes like rip-offs, overcharging, lockin, cartels, disposable over repairable, etc. These are normal, rather than the exception.

                                                                  The second does every time cost-cutting pressure forces the company to damage others. I cite examples with food, medicine, safety, false career promises, etc. They also give less to stakeholders, where fewer people get rewarded and get rewarded less for work put in. Also, you can contrast this with utilitarian companies like Publix that give employees benefits and private stock while the owners still got rich. Or companies that didn’t immediately lay off workers during recessions. An easy one most can relate to is bosses, esp executives, paid a fortune to do almost nothing for the company vs workers.

                                                                  Externalities affect us daily. They’re often a side effect of the other two. Toxic byproducts of industrial processes are a well-known one. Pervasive insecurity of computers, from data loss to crippling DDOS’s to infrastructure at risk, is almost always an externality since the damage is someone else’s problem but preventing it would be the supplier’s. You see how apathy is built in when the solution is permissively licensed, open-source, well-maintained software and they still don’t replace vulnerable software with it.

                                                                  Note: Another angle, using game of Monopoly, was how first movers or just lucky folks got an irreversible, predatory advantage over others. Arguing to break that up is a little harder, though.

                                                                  So, I used to focus on those points, illustrate alternative corporate/government models that do better, and suggest using/tweaking everything that already worked. Meanwhile, counter the abuse at the consumer level by voting with your wallet, passing sensible regulations anywhere capitalist incentives keep causing damage, and hitting them in court with damages bigger than what preventing the harm would have cost. Also, if going to court, I recommend showing how easy or inexpensive prevention would have been and asking the court to basically order it. Ask them to define a reasonable, professional standard as not harming stakeholders in as many cases as possible.

                                                                  Note: Before anyone asks, I don’t have the lists of examples anymore or just inaccessible. Side effect of damaged memory is I gotta stay using it or I lose it.

                                                              1. 3

                                                                I wonder if the team considered writing the data store backend as a library in Rust, with Ruby bindings. That way it could run in-process, without the overhead of IPC or network round-trips. Rust provides strong static typing like Haskell, so the team would still have gotten the benefits of that.
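
                                                                  To make the suggestion concrete, a minimal sketch of what that might look like: a made-up query function exported over the C ABI from a crate built as a cdylib, which Ruby could then attach to through an FFI gem.

                                                                      // Hypothetical crate compiled with crate-type = ["cdylib"]; the name
                                                                      // and signature are invented for illustration.
                                                                      use std::os::raw::c_uint;

                                                                      // Stand-in for a data-store query: count how many of the first `len`
                                                                      // entries equal `needle`. A real backend would expose richer calls.
                                                                      #[no_mangle]
                                                                      pub extern "C" fn count_matches(data: *const c_uint, len: usize, needle: c_uint) -> usize {
                                                                          if data.is_null() {
                                                                              return 0;
                                                                          }
                                                                          // SAFETY: the caller promises `data` points to `len` readable values.
                                                                          let slice = unsafe { std::slice::from_raw_parts(data, len) };
                                                                          slice.iter().filter(|&&x| x == needle).count()
                                                                      }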

                                                                1. 15

                                                                  We didn’t consider Rust, but we did, as documented, build a sidecar app at first which brought latency down low enough. I’d be excited to adopt some Rust at NoRedInk though!

                                                                  Our experience with Haskell has been great, and the learning curve is particularly shallow when all of our developers already write Elm.

                                                                  The reason to pull out to a new service (as well as the reason to avoid different technologies) often has to do with decision fatigue rather than optimal architecture: We’ve built a nice template for an HTTP service in Haskell with, for instance, nice monitoring, deployments, etc.

                                                                  Moving to a new language takes time to make all of the right decisions. The same is true for service shape. How do we monitor, set up permissions, deploy secrets, etc to our sidecar app? Do we want to build & maintain that deployment tooling? In our case: we decided it was worth the latency hit to be able to share tooling.

                                                                  1. 8

                                                                    Our fearless leader is extensively using Rust (and Zig) in the programming language he’s developing, Roc.

                                                                    1. 4

                                                                      Oh, that language looks very interesting. I think a “friendly” functional programming language that compiles to binaries would be a great addition.

                                                                      If they haven’t seen it, this paper might be relevant to their work:

                                                                      Perceus: Garbage Free Reference Counting with Reuse

                                                                    2. 4

                                                                        Although I use neither, Haskell and Rust look like a good fit since they both focus on safety with support for FP and easier concurrency. If you use Rust, your main obstacle will be the borrow checker. Data-driven design helps there. If needed, you can cheat by using refcounting in Rust on what you can’t borrow check. You still get the benefits of both its compiler and any 3rd-party code that does borrow check.

                                                                      So, Rust can be gradually learned and adopted as an accelerator for Haskell projects. If not accelerator, just a way to leverage the good things in its ecosystem (esp networking and data stores).
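
                                                                        A minimal sketch of that refcounting cheat, with throwaway types: shared, mutable state goes behind Rc<RefCell<…>>, so the checks move to runtime instead of being fought at compile time.

                                                                            use std::cell::RefCell;
                                                                            use std::rc::Rc;

                                                                            fn main() {
                                                                                // Two owners of the same mutable cache; Rc counts references at
                                                                                // runtime and RefCell enforces the borrow rules at runtime too.
                                                                                let cache: Rc<RefCell<Vec<String>>> = Rc::new(RefCell::new(Vec::new()));
                                                                                let reader = Rc::clone(&cache);

                                                                                cache.borrow_mut().push("hello".to_string());
                                                                                // Both handles see the same data, with no lifetimes to annotate.
                                                                                println!("{} entries", reader.borrow().len());
                                                                            }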

                                                                      1. 17

                                                                        Haskell, when written with care, can often be roughly as fast as rust. It’s a natively compiled language that uses llvm for optimization. With stream fusion and other high level optimizations, you can end up with tight, vectorized asm loops over primitive, unboxed types.

                                                                        It’s likely more productive to add appropriate strictness hints, awkward structure, and ugly code to make the optimizer happy than to glue two languages together via an ffi, especially since the ffi is likely to act as an optimization barrier, suppressing inlining and making alias analysis hard.

                                                                        So, while idiomatic rust is likely to be faster than idiomatic haskell, warty optimized haskell in hotspots likely to be more productive than gluing things together by an ffi.

                                                                  1. 4

                                                                    Let me chime in on this. I enjoyed your article as I always do. Minor comments first:

                                                                    re formal methods making it cheaper. I’d scratch this claim since it might give a wrong impression of the field. Maybe you’ve seen a ton of papers claiming this in your research. It could be popular these days. In most of mine (ranging four decades min.), they usually just said formal methods either reduced or (false claim here) totally eliminated defects in software. They focus on how the software will have stunning quality, maybe even perfection. If they mention cost, it’s all over the place: built barely anything with massive labor; cost a lot more to make something better; imply savings in long-term since it needed no maintenance. The lighterweight methods, like Cleanroom, sometimes said it’s cheaper due to (other claim) prevention being cheaper. They didn’t always, though.

                                                                    re common knowledge is wrong. I got bit by a lot of that. Thanks for digging into it, cutting through the nonsense, and helping us think straight!

                                                                    Now for the big stuff.

                                                                    re stuff we can agree on. That’s what I’m going to focus on. The examples you’re using all support your point that nobody can figure anything out. You’re not using examples we’ve discussed in the past that have more weight with high, practical value. I’ll also note that some of these are logical in nature: eliminating the root causes means we don’t need a study to prove their benefit except that they correctly eliminate the root causes. Unless I skimmed past it, it looks like you don’t bring up those at all. Maybe you found problems in them, too. Maybe just forgot. So, I’ll list them in case any of you find them relevant.

                                                                      Memory safety. Most studies show a massive number of bugs are memory-safety bugs. Languages that block them by default and tools that catch most of them will, if they work (prevent/catch), block a massive number of bugs. Lesser claims: many exploits (a) also use those kinds of bugs and (b) more easily use them. That lets us extrapolate the first claim to a reduction of exploits and/or their severity. An easy test would be to compare (minus web apps) both the number of memory-safety errors and code injections found in C/C++ apps vs Java, C#, and Rust. If Java or C#, count only the safe parts rather than the unsafe parts. I predict a serious drop since their root cause is logically impossible most of the time. I believe field evidence already supports this.
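
                                                                      As a tiny illustration of what blocking them by default looks like, a throwaway Rust example: out-of-bounds reads become values you must handle (or panics), and the use-after-free below simply does not compile.

                                                                          fn main() {
                                                                              let buf = vec![10, 20, 30];
                                                                              // .get() turns a would-be out-of-bounds read into an Option.
                                                                              match buf.get(7) {
                                                                                  Some(v) => println!("got {}", v),
                                                                                  None => println!("index 7 is out of bounds; no overflow read happened"),
                                                                              }
                                                                              // let first = &buf[0];
                                                                              // drop(buf);
                                                                              // println!("{}", first);
                                                                              // ^ a use-after-free like this is rejected at compile time.
                                                                          }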

                                                                    Garbage collection (temporal safety, single-threaded). The industry largely switched to languages using GC, esp Java and .NET, from C/C++. That’s because temporal errors are more difficult to prevent, were harder to find at one point (maybe still today), and might cost more to fix (action at a distance). These kinds of errors turn up even in high-quality codebases like OpenBSD. I predict a huge drop from garbage collection making them impossible or less probable. If it’s not performance-prohibitive, a GC-by-default approach reduces development time, totally eliminates the cost of fixing these, and reduces the crashes they cause. What I don’t have proof of is my impression that I used to see more app crashes before managed languages due to a combo of spatial and temporal memory errors. Might warrant a study.
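
                                                                    For contrast with the spatial case, here’s a toy temporal bug (again my own sketch): under a GC the whole class disappears because the explicit free() doesn’t exist.

                                                                    ```c
                                                                    #include <stdio.h>
                                                                    #include <stdlib.h>

                                                                    /* Temporal memory-safety bug: the memory behind `p` is freed, then read.
                                                                     * Under a garbage collector there is no free() call to misplace, so this
                                                                     * whole class of action-at-a-distance bug goes away. */
                                                                    int main(void) {
                                                                        int *p = malloc(sizeof *p);
                                                                        if (!p) return 1;
                                                                        *p = 42;
                                                                        free(p);
                                                                        printf("%d\n", *p);   /* use after free: undefined behaviour */
                                                                        return 0;
                                                                    }
                                                                    ```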

                                                                    Concurrency (temporal safety, multi-threaded). Like the errors GC’s address, concurrency errors are more difficult to prevent and can be extremely hard to debug. One Lobster was even hired for a long period of time specifically to fix one. There are both safe methods for concurrency (eg SCOOP, Rust) and tools that detect the errors (esp in Java). If the methods work, what we’d actually be studying is whether not introducing concurrency errors led to fewer QA issues than introducing and fixing concurrency errors. No brainer. :)
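
                                                                    A minimal data race, as a sketch of what the safe-concurrency methods rule out and what race detectors flag (my toy example, not taken from any study):

                                                                    ```c
                                                                    #include <pthread.h>
                                                                    #include <stdio.h>

                                                                    /* Data race: two threads increment the same counter without any
                                                                     * synchronization, so updates can be lost. Safe-concurrency designs
                                                                     * (Rust's ownership rules, SCOOP) reject this at compile time;
                                                                     * race detectors such as TSan flag it at run time. */
                                                                    static long counter = 0;

                                                                    static void *bump(void *arg) {
                                                                        (void)arg;
                                                                        for (int i = 0; i < 1000000; i++)
                                                                            counter++;            /* unsynchronized read-modify-write */
                                                                        return NULL;
                                                                    }

                                                                    int main(void) {
                                                                        pthread_t a, b;
                                                                        pthread_create(&a, NULL, bump, NULL);
                                                                        pthread_create(&b, NULL, bump, NULL);
                                                                        pthread_join(a, NULL);
                                                                        pthread_join(b, NULL);
                                                                        printf("%ld\n", counter); /* often less than 2000000 */
                                                                        return 0;
                                                                    }
                                                                    ```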

                                                                    Static analysis. They push the button, the bugs appear, and they fix the ones that are real. Every good tool has had this effect on real-world, OSS codebases. That proves using them will help you get rid of more bugs than not using them. If no false positives (eg RV-Match, Astree), your argument is basically done. If false positives (eg PVS-Studio, Coverity), then one has to study whether the cost of going through the false positives is worth it vs methods without them. I’d initially compare to no-false-positive methods. Then, maybe to methods that are alternatives to static analysis.
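
                                                                    The sort of defect those push-button tools report, sketched by me rather than copied from any tool’s output: an unchecked failure path that analyzers flag as a possible NULL dereference.

                                                                    ```c
                                                                    #include <stdio.h>

                                                                    /* fopen can return NULL, and the unchecked dereference on that failure
                                                                     * path is exactly the kind of finding analyzers like Coverity or
                                                                     * PVS-Studio report. */
                                                                    long file_size(const char *path) {
                                                                        FILE *f = fopen(path, "rb");
                                                                        fseek(f, 0, SEEK_END);   /* NULL-pointer dereference if fopen failed */
                                                                        long size = ftell(f);
                                                                        fclose(f);
                                                                        return size;
                                                                    }
                                                                    ```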

                                                                    Test generators. Path-based, combinatorial, fuzzing… many methods. Like static analysis, they each find errors. The argument that one is good is easy. If many are available (esp free), then other empirical data says each will spot things the others miss. That has a quality benefit without a study necessary, so long as the administrative and hardware cost of using them is low enough.
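
                                                                    For a flavor of how cheap the setup can be, here’s a minimal libFuzzer-style harness (a sketch of mine; parse_record is a toy stand-in parser, not a real library function): fuzzers, path-based generators, and similar tools all plug in at an entry point like this, and each tends to find inputs the others miss.

                                                                    ```c
                                                                    #include <stdint.h>
                                                                    #include <stddef.h>

                                                                    /* Toy stand-in parser with a deliberate bug so a fuzzer has something
                                                                     * to find: the attacker-controlled length field is never checked
                                                                     * against sizeof buf. */
                                                                    static int parse_record(const uint8_t *data, size_t len) {
                                                                        uint8_t buf[16];
                                                                        if (len == 0) return -1;
                                                                        size_t n = data[0];            /* attacker-controlled length field */
                                                                        for (size_t i = 0; i < n && i < len; i++)
                                                                            buf[i] = data[i];          /* can write past buf[15] */
                                                                        return (int)n;
                                                                    }

                                                                    /* Entry point that libFuzzer-compatible tools call with generated
                                                                     * inputs; a crash or sanitizer report means a bug was found. */
                                                                    int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
                                                                        parse_record(data, size);
                                                                        return 0;
                                                                    }
                                                                    ```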

                                                                    My persistent claim on Lobsters for code-level quality: use a combination of FOSS languages and tools with memory safety, GC by default, safe concurrency, static analysis, and test generation. To take the claim down, you just need empirical evidence against one or more of the above, since my claim just straightforwardly builds on them. Alternatively, show that how they interact introduces new problems that are costlier to fix than what they fixed. That is a possibility.

                                                                    Drivers. Microsoft Windows used to blue screen all the time for all kinds of users. The reports were through the roof. It was mostly drivers. They built a formal-methods tool (the Static Driver Verifier, built on SLAM) to make drivers more reliable. Now, Windows rarely blue screens from driver failures. There you go.

                                                                    Design flaws. You’ve personally shown how TLA+ mockups catch in minutes problems that alternative methods either don’t catch or take roughly forever to. Any research like that makes an argument for using those tools to catch those kinds of problems. I’ll limit the claim to that. It becomes a tiny, tiny claim compared to silver bullets most are looking for. Yet, it already allows one to save time on UI’s, distributed systems, modeling legacy issues, etc.

                                                                    Functional and immutable over imperative and mutable. In a nutshell, what happens as we transition from one to the other is that the number of potential interactions increases. Managing them is more complex. Piles of research across fields show that both complexity and interactions drive all kinds of problems. Interestingly, this is true for both human brains and machine analysis. You’ve also linked to a formal methods article where the author illustrated that mathematically for verifying functional vs imperative software. Any technique reducing both complexity and potential interactions for developers reduces their labor so long as the technique itself is understandable. That gives us root-cause analysis, cross-disciplinary evidence, and mathematical proofs that it will be easier in the long term to verify programs that are more functional and use less mutable state.

                                                                    Things we’ll be stuck with, like Internet protocols. Certain systems are set up to be hard to change or never change. We’re stuck with them. OSS, industry, and government keep deploying band-aids or extra layers to fix problems in such systems without fixing the root. When they do fix a root cause, it’s enormously expensive (see the Y2K bug). I’d argue for getting them right up-front to reduce long-term costs. I’m not sure what empirical studies have gone into detail about these systems. I just think it’s the best thing to apply this kind of thinking to.

                                                                    I’d say those are the only things I’d mention in an article like yours. They’re things that are easy to argue for using visible, real-world results.

                                                                    1. 1

                                                                      I get a 404?

                                                                      1. 1

                                                                        It was up earlier. Now a 404. The cached copy being a 404 raises some questions.

                                                                        The Wayback link works, though.

                                                                      1. 1

                                                                        I upvoted this because the lessons still haven’t been learned by many. If you read up on it, cosmic rays flipping single bits are pretty much the only thing that comes up when most people evaluate this risk.

                                                                        “The most aggressive policies (retiring a page immediately after just one error, or retiring whole rows and columns) are able to avoid up to 96% of all errors.”

                                                                        Obviously, anything that avoids 96% of all errors without a serious penalty is a rare find. Let’s use or build on it!

                                                                        I’ll chime in to add that approaches like NonStop still address issues like this well. They make the internal components like black boxes that should produce the same output on a given input. They do voting. If a component misbehaves, there are automated fixes that can get it back in shape. They also have redundancy and snapshots. Such designs knock out lots of problems people would never see coming.

                                                                        I still think we need more modern designs following the route of older ones like NonStop to architecturally block many issues, with tactics like those in this paper boosting the reliability of individual components. We might also get them cheap if we use RISC boards for the individual components. On top of that, there’s lots of diversity in ecosystems such as ARM where the boards might have different implementations, process nodes, etc. The differences might reduce the odds that two components fail at the same time.
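
                                                                        To make the voting idea concrete, here’s a toy 2-of-3 majority voter (my own sketch, not NonStop’s actual mechanism): three redundant components compute the same result and the voter masks a single faulty one.

                                                                        ```c
                                                                        #include <stdint.h>
                                                                        #include <stdbool.h>

                                                                        /* 2-of-3 voter: returns true and the agreed value if any two inputs
                                                                         * match, false if all three disagree (flag the group for repair). */
                                                                        bool vote3(uint32_t a, uint32_t b, uint32_t c, uint32_t *out) {
                                                                            if (a == b || a == c) { *out = a; return true; }
                                                                            if (b == c)           { *out = b; return true; }
                                                                            return false;   /* no majority: a component group fault */
                                                                        }
                                                                        ```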

                                                                        1. 11

                                                                          If I’m reading this right, it implements, in Go, the protocol that OTP uses for distributed Erlang (which is one of the ways you can do FFI in Erlang: write a C / whatever program that appears as one or more processes to Erlang code) and provides some nice Go interfaces that look like Erlang for the Go programmer to use? Within the Go part of the world, you can have mutable (and shared mutable) state that would violate the invariants of the Erlang abstract machine if exposed there, but it’s conveniently hidden in some Go actors and so the Erlang world only sees immutable objects.

                                                                          It looks quite neat, but you give up a lot of safety by going from Erlang to Go. If I needed single-threaded performance that badly, I think I’d probably go all of the way to C++. There seem to be a bunch of dead attempts to do this in C++, this seems to be the only actively maintained one.

                                                                          1. 7

                                                                            Rustler lets you code the performance-sensitive parts in Rust. Stay in Erlang for its safety. Stay safe, fast, and integrated when you leave it.

                                                                          1. 8

                                                                            Wow it’s crazy that Chez Scheme is from 1984 and Racket is from 1995!

                                                                            I can definitely see that maintaining a big language implementation is a huge thing to ask, especially for 5 CS professors who are doing many other things like teaching, reviewing papers, serving on committees, etc.

                                                                            1. 7

                                                                              Even better, Chez Scheme goes back to Z80 processors. Many folks said you must stick entirely with C on weak hardware. They schemed up something better. Their historical paper (pdf) was full of interesting stuff.

                                                                            1. 10

                                                                              In other words, given a specification to copy a list, our synthesis tool was able to produce a complete self-contained C program with a separate independent proof certificate, formally guaranteeing, without a shadow of a doubt, that it fully satisfies the specification, thereby removing any need to trust a fallible human developer.

                                                                              “Beware of bugs in the above code; I have only proved it correct, not tried it.” - Donald Knuth

                                                                              Someone please set me straight if I’m wrong, but it looks to me like that generated code has a recursive call in a non-tail position. Pass it a large linked list and its stack will overflow and it will crash, despite its certified correctness and its guarantee that there’s no need for it to ever be double-checked by a fallible human developer.
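
                                                                              To make that concrete, here’s a rough sketch of the two shapes of list copy (my own code, not the tool’s output): the recursive one matches the spec yet can blow the stack, the iterative one uses constant stack space, and neither handles malloc failure, which is also outside the spec.

                                                                              ```c
                                                                              #include <stdlib.h>

                                                                              struct node { int value; struct node *next; };

                                                                              /* Recursive copy: the recursive call is not in tail position, so each
                                                                               * element adds a stack frame and a long enough list overflows the stack. */
                                                                              struct node *copy_rec(const struct node *src) {
                                                                                  if (!src) return NULL;
                                                                                  struct node *n = malloc(sizeof *n);
                                                                                  n->value = src->value;
                                                                                  n->next  = copy_rec(src->next);   /* work remains after the call */
                                                                                  return n;
                                                                              }

                                                                              /* Iterative copy: constant stack space regardless of input size. */
                                                                              struct node *copy_iter(const struct node *src) {
                                                                                  struct node *head = NULL, **tail = &head;
                                                                                  for (; src; src = src->next) {
                                                                                      struct node *n = malloc(sizeof *n);
                                                                                      n->value = src->value;
                                                                                      n->next  = NULL;
                                                                                      *tail = n;
                                                                                      tail = &n->next;
                                                                                  }
                                                                                  return head;
                                                                              }
                                                                              ```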

                                                                              1. 20

                                                                                The F* project has been producing C code as the output from a verification toolchain for a while and this includes use in EverCrypt, where the formal proofs include security proofs of the protocols being implemented. The examples from this post look simpler than the demos of F* from 5-10 years ago.

                                                                                There are a few problems with any of the things in this space:

                                                                                • C is a really bad target for anything like this. C compilers all have a subtly different interpretation of the C semantics and your verified code is only correct with respect to one concrete instantiation of these semantics. Our paper from a few years ago shows the space of C semantics. Even CompCert, which is a formally verified C compiler, only guarantees correct lowering if your input C is free from undefined behaviour.
                                                                                • Getting the properties that you prove correct is really hard. For example, an earlier version of some of the EverCrypt code was proven to be free of temporal safety errors. The theorem was that no object may be accessed after it is freed. The proofs succeeded trivially: nothing was ever freed. Changing this to specify lifetimes and guarantee that everything was deallocated in a timely fashion was a lot of work.
                                                                                • To the best of my knowledge, none of the proof tools work on non-trivial examples of concurrent code. CppMem is probably the best of these and it can’t handle more than a few dozen lines of code. When you can formally verify something like this multi-producer, single-consumer queue I’ll be really excited. And if your name is Mark Batty then I might even believe you can do it.
                                                                                • Anything that the surrounding code does is an axiom. Shared-everything languages are incredibly bad for partial proofs: a memory-safety bug or a concurrency bug anywhere in the rest of your program can invalidate one of the axioms that your proof code depends on.
                                                                                • None of the tools are good enough for large programs. This means that you invariably have to mix verified and non-verified code in the same program. Most of the synthesis tools that I’ve seen generate interfaces that are really hard to use correctly. Part of this comes from targeting C. There are a load of properties that are trivial to enforce with C++ smart pointers or type-safe enums, that the C programmer has to just remember to get right. This feeds back into the compositional correctness problems: If your interfaces are hard to use, you’ll get bugs at the boundary of the proof code. In real-world uses I’ve seen, you get more of these than you fixed in the verified part.

                                                                                TL;DR: Verification and program synthesis are super interesting but they’re nowhere near ready for prime time.

                                                                                1. 3

                                                                                  Verification and program synthesis are super interesting but they’re nowhere near ready for prime time.

                                                                                  The NATS iFACTS traffic control system (released in 2011) has over half a million lines of verified “crash-proof” SPARK code.

                                                                                  1. 2

                                                                                    Is there any good info on how well it has stood up since then, how much work it was to produce and verify, etc?

                                                                                    1. 2

                                                                                      I haven’t found any cost amounts from poking around online. The Eurofighter Typhoon also uses SPARK in some of its systems. Yes, verification is hard, but there are real systems out there which do it. I think people assume it’s some unreachable goal, but with piecemeal approaches like Ada/SPARK, I think it’s a lot closer than people realize and the future is very bright.

                                                                                      I don’t think we’ll see a big bang of “Our thing is verified!” It’s going to be a slow rollout of “this tiny lib we’re using is verified,” “we have some components which are verified,” and “we use verification tools.” Something along the lines of “functional core, imperative shell,” but “verified core, unverified wrappers.”

                                                                                    2. 2

                                                                                      I’ll add the Infer tool that Facebook acquired to that. It also uses separation logic for temporal analysis of C/C++ programs. It handles massive codebases. They apply it to Java, too.

                                                                                      The difference is that these tools narrow the problem down, aim for high automation, make tradeoffs that facilitate real-world use, and have people doing non-glamorous work supporting that (eg standard library, build integration). Academics often don’t do all this given they’re just proving a concept, doing more ambitious things, or sometimes just doing what they enjoy.

                                                                                  2. 6

                                                                                    “Beware of bugs in the above code; I have only proved it correct, not tried it.” - Donald Knuth

                                                                                    He wrote this before we had theorem provers. He proved it by hand, which is a lot more error prone.

                                                                                    Doesn’t mean I agree with you, just that I don’t think that quote is really as damning as people say.

                                                                                    1. 1

                                                                                      That’s an interesting comment. I never took that quote to mean that he wasn’t confident in the correctness of his handcrafted proof. I always interpreted it as a warning that proofs of code can have non-obvious limitations: they can be completely correct and internally consistent, but if the person doing the proving didn’t think to include some aspect of the desired behavior in the specification, the code can still do unexpected things.

                                                                                      The code sample in the article is a good example. “Uses a constant amount of stack space regardless of input size” and “terminates in a well-defined way if memory allocation fails” weren’t part of the specification, so the code can bomb out even though it can be proven to perform the correct sequence of steps to copy a linked list as described by its spec.

                                                                                      Of course, I know a fraction of a fraction of a fraction as much about formal proofs of code as you do, so take this as the mad ramblings of a neophyte; it’s just what I took away from that quote when I first read it.

                                                                                  1. 19

                                                                                    Thanks for posting! The technical side of things is quite interesting, but the writing style of the author is rather pompous.

                                                                                    “What if there were a way to automatically generate C-code, complete with formal proofs of correctness, removing any need to trust fallible human developers?” .. and what about the fallible humans writing the specs?

                                                                                    Here’s a link directly to their paper (he also provides other references at the bottom): https://gopiandcode.uk/pdfs/CySuSLik-icfp21.pdf

                                                                                    1. 3

                                                                                      I’ll add that there have been many specification errors in verified programs. Mutation testing has found issues in them, too. We’ll need many eyeballs on them. Checking specs/proofs in general math is a social process, but mathematicians don’t like computer-focused math. So, in a way, we’re moving the problems from code nobody was reviewing or analyzing to an esoteric language fewer people will check.

                                                                                      It’s still great work due to the ubiquity of C, good tools for separation logic, removing attack surface in the spec-to-code gap, and pushing synthesis forward. Might as well keep putting good stuff together to see what happens.

                                                                                      Far as the general problem, I’m a bigger fan of doing stuff like this with notations like ASM’s (Abstract State Machines) that a wider audience could use and review. Or languages like ML or SPARK with C generated from them. If algorithmic, a side benefit is one can retarget static analyzers and automated provers to them to confirm some properties without really getting into formal logic.

                                                                                      1. 2

                                                                                        There are so many memory unsafety bugs. Most of these are caused by selecting memory-unsafe languages, and C is the worst offender by far.

                                                                                      1. 4

                                                                                        I like the improvements. Replayable logs are probably a good idea in systems with redundancy. Hackers might hit the redundancy part, too. But…

                                                                                        The premise feels like a strawman. Systems like Google Docs are built with a focus on features, not high reliability. Few systems in commercial or FOSS deployment use proven methods of high-reliability design. They instead use methods proven to accomplish their other goals while reducing customer-losing outages below some threshold. So, we’ll keep seeing such results.

                                                                                        Anyone really testing their reliability concepts should do it against reliability-focused deployments such as OpenVMS, NonStop, and Stratus clusters. Maybe throw in Minix 3 (drivers), SPARK Ada (deterministic components), and FoundationDB (distributed components). After architecture, there’s safe languages, static analysis, and test generators. Unlike seL4 or CompCert, these are all either commercially-developed systems or methods widely deployed in them. That proves they’re feasible for new projects.

                                                                                        Systems aiming for high-reliability today should use combinations of such techniques. Maybe even the OP technique, too, since it reduces risk. They’ll get better results. In this case, I see an opportunity right where it says “our failure paths.” Quick note: always specify, analyze, and test them as much as any other code. They’re code, too.

                                                                                        1. 24

                                                                                          This was originally submitted yesterday. I jumped at the old title and wrongly removed the story. We occasionally get off-topic posts on how to take notes and I thought this was one more. We talked about it in the chat room a bit, I re-evaluated, and encouraged the author to resubmit.

                                                                                          1. 4

                                                                                            Was wondering where it went! Saw in the moderation log it was removed.

                                                                                            I assume a lot of technology is about personal or organizational productivity; I just thought the story got the balance wrong, though.

                                                                                            1. 2

                                                                                              As usual, good job on transparency. :)

                                                                                            1. 4

                                                                                              It would probably be easier to start with classic Pascal, and a hand-built p-code interpreter.

                                                                                              Although it’d be funnier to start with a c-subset compiler written in BASIC :)

                                                                                              1. 1

                                                                                                I think this is misunderstanding the problem - the goal is to get from the simplest possible thing that can be written in assembly and verified up to something substantial. There’s no way to solve this by starting with Pascal or BASIC - those can be intermediate steps, but only by first having a tiny handwritten implementation of those. It is true that BASIC used to be handwritten assembly, so getting a tiny C compiler in 6502 BASIC might be a valid route.

                                                                                                The article touched on the idea that assembly is just an interface to the hardware and the hardware cannot be intrinsically trusted. I think the next step which it missed is that bootstrapping can start on any platform, and if it’s possible to get a Z80 (or other simple CPU) to cross-compile a minimal toolchain, that can be a valid path. As soon as that toolchain runs on a “modern” CPU its output cannot be trusted, although if the goal is to emit code that runs on a “modern” CPU the final result can’t really be trusted either. Finding the bottom turtle still doesn’t solve the ME problem.

                                                                                                1. 2

                                                                                                  “I think this is misunderstanding the problem - the goal is to get from the simplest possible thing that can be written in assembly and verified up to something substantial.”

                                                                                                  That’s close to what they did with P-code. The Pascal/P system used a combo of a Pascal compiler, a standard library, and P-code they both targeted. Each port just had to port the P-code interpreter. It was easy enough for non-compiler experts that Pascal was ported to 70+ architectures in two years per Wirth.

                                                                                                  They could’ve hand-coded the compiler and standard library in P-code for traceability if they wanted. Pascal itself was designed partly to make the compiler easy to build. Pascal-S was a variant focusing on compiler simplicity. There’s significant overlap here with the goals of bootstrapping. Let’s test that by looking at what I proposed for Pascal-inspired bootstrapping.

                                                                                                  Like Lisp taught, we work top-down with HLL code while bringing the bottom layer upward toward expressing HLL concepts. Start with a C interpreter supporting a minimal subset of the language: scalar data, expressions, structs, functions, and basic I/O. Implement it in minimalist C, followed by assembly hand-matched against it. Use this to express the first C compiler (eg TCC). You can either write a compiler from C to this language or hand-convert code to it.

                                                                                                  Like Pascal-S, the compiler can be ultra-simple in design, but easier since the target is an HLL. You don’t even write a tool for generating assembly or binary: you just use existing, simplified designs made for bootstrapping targeted to the C-like VM and/or get it from the compiler after porting the whole thing. Had I stayed in it, my plan was to cheat by making an untrusted, automated converter from TCC’s C to that VM language that was visually inspectable by non-experts. If it passed inspection, trust was reduced to an interpreter that could be someone’s weekend project. All much simpler, more democratic, and more consistent with C skills than Scheme or hex.
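
                                                                                                  To give a feel for how small that trusted bottom layer could be, here’s a toy sketch of a hand-checkable interpreter for integer expressions (mine, not the proposed minimal-C VM itself); the real thing would add variables, structs, functions, and basic I/O, but stays in the same spirit of being small enough to check against its assembly by eye.

                                                                                                  ```c
                                                                                                  #include <ctype.h>
                                                                                                  #include <stdio.h>

                                                                                                  static const char *p;                 /* cursor into the source text */

                                                                                                  static long factor(void) {            /* factor := number */
                                                                                                      long v = 0;
                                                                                                      while (isdigit((unsigned char)*p)) v = v * 10 + (*p++ - '0');
                                                                                                      return v;
                                                                                                  }

                                                                                                  static long term(void) {              /* term := factor ('*' factor)* */
                                                                                                      long v = factor();
                                                                                                      while (*p == '*') { p++; v *= factor(); }
                                                                                                      return v;
                                                                                                  }

                                                                                                  static long expr(void) {              /* expr := term ('+' term)* */
                                                                                                      long v = term();
                                                                                                      while (*p == '+') { p++; v += term(); }
                                                                                                      return v;
                                                                                                  }

                                                                                                  int main(void) {
                                                                                                      p = "2+3*4";
                                                                                                      printf("%ld\n", expr());          /* prints 14 */
                                                                                                      return 0;
                                                                                                  }
                                                                                                  ```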

                                                                                                  Far as hardware, use a verified CPU like VAMP (DLX-style) synthesized with a method like Baranov’s using FSM’s. The algorithms could be implemented on same VM approach. My method also allows diverse hardware with outputs having to match on each. Standardized computation on diverse hardware, esp pre-1999, negates most concerns about hardware backdoors. Even more if computations run on embedded MCU’s that had no transistors to spare for anything in supplier’s commercial interest, much less compiler subverters. :)

                                                                                                  Note: My VM concept itself can be implemented in hardware. Burroughs B5000 implemented a high-level ISA in hardware partly for security back in the 1960’s.

                                                                                                  I’ll tag @cadey, @sjamaan, @Sophistifunk to include them in this in case they find any of it useful.

                                                                                                  1. 2

                                                                                                    Burroughs B5000 implemented a high-level ISA in hardware partly for security back in the 1960’s.

                                                                                                    For years I’ve felt that the loss of this branch of hw is a crying shame.

                                                                                                    1. 3

                                                                                                      It’s not lost, it just slept for a while. CHERI is heavily inspired by the B5500 architecture and my goal with it was to produce a RISC version of the Burroughs architecture: you don’t need the hardware to differentiate between floating point and integer types, because it doesn’t affect security and you can do it in software, but you do need it to enforce pointer provenance and memory safety.

                                                                                                      Arm is building us a test chip that just taped out, so we’ll get to experiment with this in a high-end pipeline soon.

                                                                                                      1. 1

                                                                                                        This is excellent news.

                                                                                                      2. 1

                                                                                                        Indeed. Until maybe return-oriented programming came along, it would’ve automatically prevented most of the code injections that hackers used against sloppy code. Even pro code that occasionally overlooks something would get the same benefit.

                                                                                                        1. 2

                                                                                                          ROP is also blocked in many forms. In a CHERI system, the return address that you spill on the stack is a tagged value. If an attacker injects data, they can’t inject pointers; they need to find other tagged pointers that they can copy over a return address (spoiler). If the tagged values also provide permissions (as CHERI does but the B5500 doesn’t) then it’s even harder: they need to find an existing executable pointer (and, with Morello, one that is permitted to be used as a return address) and copy it over a spilled return address.

                                                                                                1. 1

                                                                                                  Hmm, I haven’t read this closely yet, but one big part which seems to be missing is commercial hobby projects or amateur pursuits? If I wanna use something under the BTPL in a hobby project, but I anticipate earning a little bit of money from that hobby project, am I not permitted to use the BTPL-licensed software unless I start an LLC?

                                                                                                  Like, I’m currently working on a small game. I don’t have a “small business” company, I’m not a “charitable organization”, I’m just a hobbyist. But I anticipate a commercial application; if I get the game to a place I’m comfortable with, I’ll try to put it on itch.io or steam or something for a few bucks. From my reading of the license text, I don’t think I would be allowed to use any BTPL-licensed software in that project.

                                                                                                  1. 2

                                                                                                    You’d automatically be a sole proprietor, or small-business owner, at least in the U.S. Check the laws wherever you are to see if conducting business there makes you legally a business. Stuff like permits might be considered separately from or together with that. It varies.

                                                                                                    It gets more interesting if you sell across borders or through a service hosted in another country. I’m not sure how much their laws apply to you in that case.

                                                                                                    1. 2

                                                                                                      I’m in Norway, I think our equivalent of the “sole proprietorship” is what we call an “enkeltpersonforetak”, and you have to register those, they’re not automatically “created” when you conduct business. I wonder how the BTPL would work in that case.

                                                                                                      Seems best to just have it written out explicitly in the license, right?

                                                                                                      1. 1

                                                                                                        Maybe define “business” as any entity using the software for commercial purposes. Then, specify size from there.

                                                                                                    2. 2

                                                                                                      Law is interpreted; it’s not logical like a program, and the spirit of the law is also important. The license is pretty clear in spirit: you are totally fine until you hit “big company” (which is clearly defined). Anyone dumb enough to take you to court over this license is almost certain to lose, unless you happen to be defined as a big company at the time of the lawsuit, which in your scenario you clearly are not.

                                                                                                      Law sometimes tries to be all logical, but it usually fails. Law will likely never ever pass a compiler type check, let alone a full compile.

                                                                                                      1. 1

                                                                                                        That’s a good point. I would think your case would fall under the “small business” criteria, since an individual selling a commercial license would meet them, but I’m really not sure.

                                                                                                        1. 1

                                                                                                          Maybe? But I’m having trouble seeing how “You may use the software for the benefit of your company” might apply when there is no company. IANAL though.