1. 12

    to me it is very telling when the central argument against Go is the Rob Pike quote that a design value of Go is that it should be easy for programmers without much professional experience to use Go to get professional experience. This article, like many articles about why Go is bad, hinges on a central assumption: that the reader agrees that programming for programming’s sake is the purest form of programming, and that people who code for the purpose of making a living are inherently worse at programming (or less smart), rather than people who have different, legitimate goals.

    and besides, the author’s first example is atrocious code, both in functionality and in style, as it fails to recognize when it is throwing away arguments and needlessly buffers the entire contents into memory instead of handling the input as a stream. A Go programmer would be more liable to write that program like so: https://gist.github.com/jordanorelli/36beba899fd0d837d0b38227e14e473d

    and as others have said, the things the author cites are either dated (a six year old post about dependency management) or misinformed (“idiomatic generics in Go”).

    1. 1

      misinformed (“idiomatic generics in Go”).

      What’s the currently idiomatic way to… imitate generics these days?

      1. 6

        I don’t think the question is framed particularly well. The more relevant question would be “I would naturally solve [some problem] using generics, but since Go lacks generics, what would be the idiomatic way of approaching [that problem]”. Without specific problems, it’s tough to answer. There is no idiomatic way of implementing map and reduce, because map and reduce are not problem statements, they’re solution strategies, and those solution strategies are rarely idiomatic to Go.

        The one place I’ve found the lack of generics to be problematic is in defining container types, which is especially annoying when the natural way of solving a problem would be to use sets. I do find the handling of container types to be frustrating in Go, but it has posed problems for me less frequently than might be expected; slice and map types get you what you need most of the time, and where they’re lacking, making a struct that contains both is often far simpler than making a sophisticated container type. In the cases where I’ve had to make more sophisticated container types, I’ve often found that the contained value type should not be interface{}, but rather some struct (or struct pointer) type or some meaningful interface type that I have defined. In any case, I would recommend not searching for “an idiomatic way of imitating generics in Go”, but rather searching for idiomatic ways to solve the specific problems at hand.
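        For the set case specifically, the usual pre-generics idiom can be sketched as a map with empty-struct values; the StringSet name and its methods below are hypothetical, just to illustrate the “slice and map get you most of the way” point:

        ```go
        package main

        import "fmt"

        // StringSet is a hypothetical set built on Go's built-in map type:
        // the empty struct{} value costs no memory, and the map keys are
        // the set's elements.
        type StringSet map[string]struct{}

        func (s StringSet) Add(v string) { s[v] = struct{}{} }

        func (s StringSet) Contains(v string) bool {
        	_, ok := s[v]
        	return ok
        }

        func main() {
        	seen := make(StringSet)
        	seen.Add("a")
        	seen.Add("b")
        	fmt.Println(seen.Contains("a"), seen.Contains("c")) // true false
        }
        ```

        The obvious limitation, as noted above, is that a new type (or a copy of this code) is needed per element type.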

        1. 4

          What’s the current idiomatic way to emulate infix notation in forth?

          1. 2

            Yeah, you’ve got a point there.

            What’s really frustrating here is that, while infix notation really just isn’t Forth, generics could have integrated seamlessly with Go, despite what Go’s authors said for a very long time; their excuses for not adding them are incredibly weak. Generics may be hard to design and hard to implement, but they’re fairly easy to use. My guess is that they were lazy or pressed for time, and then rationalized their choices.

            Also, I’m not sure they could even have said something like “Generics would have been nice, but our bosses said we need the language this month” publicly.

            1. 3

              There have been a whole bunch of quite detailed generics proposals from the Go maintainers that they themselves eventually rejected for one reason or the other before the current type parameters proposal was accepted. The current effort is the culmination of about ten years of active work on the issue.

              You may disagree with the reasons why those proposals were rejected, but I think any claims of laziness or post-hoc rationalisations are very hard to defend.

              1. 3

                Remember that it took years for the maintainers to even admit that Go might need generics after all. They stated reasons for not adding generics, but it would take some serious hubris for them to actually believe them.

                As for why it is taking time: a blunder of this magnitude is not easy to correct. You have backwards compatibility to deal with, existing constructs that were there only because we didn’t have true generics, kludges in the standard library that we’ll need to address if we find a better way… Now I have nothing to complain about from the moment they stopped being stupid and started admitting Go would be better with generics. (Which is an implicit admission of a monumental mistake, though I don’t expect them to publicly admit it.)

                As for why failing to add generics in the first place was incredibly stupid, especially for a team of their calibre:

                First, Benjamin Pierce’s Types and Programming Languages was first published in 2002. It’s a seminal book that’s known to pretty much every programming language enthusiast. When you design a language over 6 years after its publication, failing to learn its material is overwhelming evidence of second-order incompetence (that is, not even realising that you don’t know everything, and that you might want to get familiar with the state of the art before you go and influence the whole world with your thing). I don’t really fault the language designers for not being familiar with generics right away. I fault them for not getting the hint from Java, C#, C++, ML, Haskell… and then not learning the relevant material. Which was conveniently concentrated in a famous, very well written book.

                Second, when backward compatibility is not a problem, generics are easy. I know because I’ve done it myself, with my first serious programming language. The end users wanted a scripting language with a REPL to pilot and configure a test environment for programmable logic controllers. I gave them a statically typed one, with local type inference, a smidge of OOP syntax, and static dispatch based on the type of the first argument (similar to C++ overloaded functions). I quickly realised that I would need generics to write parts of the standard library, so I added them. The whole thing, compiler + bytecode interpreter, took me 3 months. The customers started using my thing right away, and I was still around 6 months later. I only had to correct one bug.

                Considering how easy generics actually are, I can only conclude that if Go’s initial designers had learned a bit of type theory, they would have designed a language with generics in mind. They didn’t, so I can only conclude that they were either lazy, or ignorant and unwilling to learn (another form of laziness, I guess).

                1. 3

                  Remember that it took years for the maintainers to even admit that Go might need generics after all.

                  It took a little bit less time, from 2009 (less than a month after the first public release): https://research.swtch.com/generic

                  generics are easy

                  No one ever claimed that implementing generics is too hard; just that doing it in such a way that it aligns with Go’s other design goals is not obvious. Designing a programming language is easy; JavaScript was famously done in a week and loads of people are using it every day: that doesn’t mean it got everything right.

                  a blunder of this magnitude is not easy to correct. you have backwards compatibility to deal with

                  Implementing generics but getting it wrong would be significantly harder to correct.

                  1. 2

                    It took a little bit less time, from 2009 (less than a month after the first public release):

                    That fast? That hints at a rushed release, then. Pressed for time for some reason. They tried to get away with it, but apparently the backlash quickly told them that nope, they’re not going to get off the hook that easy.

                    No one ever claimed that implementing generics is too hard; just that doing it in such a way that it aligns with Go’s other design goals is not obvious.

                    Pierce explains how to combine parametric polymorphism with subtyping. The solutions were known. The rest of Go’s type system is not so unique that it would require anything special.

                    Implementing generics but getting it wrong would be significantly harder to correct.

                    Look at ML/Haskell (the basic stuff only, don’t necessarily bother with modules or type classes) and limit yourself to local type inference to make your life easier; it’s pretty hard to get it wrong.

                    https://research.swtch.com/generic

                    That’s a false dilemma. The “box vs bloat” choice should be an implementation detail. If you expose it to the users like C++ did, you’ve already painted the language into a corner.

                    Now a proper generic system will likely appear to users as if we box everything. But the compiler has ways to bypass boxes under the hood (especially if it inlines stuff). You could also bite the bullet, instantiate your types, and swallow the code bloat. You could do a mix of boxing and template-like instantiation. Granted, the best compilation strategies won’t be easy to find, but at least users have a nice interface with no nasty surprises.
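                    As a rough illustration of that point (assuming a Go 1.18+ toolchain for the second function, which postdates this discussion): the boxed interface{} version and the type-parameter version expose a similar surface, and whether the compiler boxes or instantiates behind the latter is its own business:

                    ```go
                    package main

                    import "fmt"

                    // firstBoxed is the pre-generics approach: every element is
                    // boxed into an interface{} value, and the caller loses all
                    // static typing.
                    func firstBoxed(xs []interface{}) interface{} { return xs[0] }

                    // first does the same with a type parameter; the compiler
                    // decides how to compile it (dictionaries, stenciling, or a
                    // mix) without exposing that choice to the user.
                    func first[T any](xs []T) T { return xs[0] }

                    func main() {
                    	fmt.Println(firstBoxed([]interface{}{1, 2})) // 1
                    	fmt.Println(first([]string{"a", "b"}))       // a
                    }
                    ```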

                    1. 1

                      Implementing generics but getting it wrong would be significantly harder to correct.

                      But did they do their due diligence? What solutions did they investigate and what problems did they find that required a decade of further research?

                      JavaScript was famously done in a week and loads of people are using it every day: doesn’t mean it got everything right.

                      This reflects worse on Go than on JS; no one would have expected a toy made in ten days to be used (with few major changes) for major tasks twenty years later, whilst Go was designed from the outset as a serious systems language.

                2. 2

                  Since Go was a project created to keep a famous person interested enough to stay at Google (so they could say said famous person works there), and since it did not see any meaningful adoption at Google for years (and even now is hardly the most popular internal language), I don’t think they were pressed for time ;)

              2. 1

                interface{}

                1. 1

                  OP did make an interface{}-based attempt, so I’m not sure he was as misinformed as @scraps suggested.

            1. 6

              Note how the letter says Barinsta “may” violate such and such law. Since it’s not an outright accusation, the lawyer who wrote this doesn’t risk anything. They can always say they wrote it “just in case” or something.

              About the violation of terms of service… are they even bound by those terms of service? Maybe Barinsta users are, but the developers of the software itself?

              1. 2

                It looks like the developer was a user too, and according to that letter they got banned for life from all Facebook services.

              1. 9

                The author seems highly adversarial towards Meow hash at the start of the essay, though the amount of research work that must have gone into it is worthy of admiration.

                For people who might not know: the initial use Casey wanted fulfilled when he wrote v1 of MeowHash was basically a hash function for building maps of assets in a game engine. For a long time he has been very vocal about how it is not suitable for cryptography, because it was designed for speed, not safety. I guess the subsequent progress in the later versions made them comfortable with assuming Level 3. I see that they have now taken it down to Level 1 following this article.

                1. 18

                  The author seems highly adversarial towards Meow hash at the start of the essay

                  How so?

                  The author noted, multiple times, that Meow hash was not advertised as a strong cryptographic hash. I took the author’s tone as polite and collegial.

                  I ask because I often write documents with similar tone and caveats. I don’t want to seem adversarial!

                  1. 11

                    The article is still written in a professional manner, and backs up the author’s claims with outstanding cryptographic analysis. I sympathise with the author here: the Meow hash function claims security where none is to be found. The developers warn that their Level 3 classification might be unsound and that nobody has proven it secure, but they still upheld that claim, which is questionable at best and malicious at worst. It was reclassified after these exploits were made public, but IMO it should never have claimed to be secure without proof.

                    I don’t mean to detract from the value of the Meow hash function: 50GB/s is an absolutely outstanding hash rate for situations where cryptographic security isn’t required (such as game asset hashing), and it is certainly one of the fastest out there. Just don’t claim what you haven’t verified yourself.

                    1. 10

                      I think that, in general, humans should not be boastful about hash functions. We have yet to show that one-way functions even exist; there is a lot of hubris involved in claiming that a hash function is one-way or even slightly hard to invert.

                      1. 1

                        We have not shown that one-way functions exist? This seems… odd. Speaking of which, how is isOdd not a one-way function?

                        1. 5

                          The Wikipedia article @Corbin linked explains this. It is not one, because a one-way function is one where, for a given output, it is hard to find any input that maps to it; it doesn’t have to be the input originally provided.
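                          A tiny illustration of that definition (the function names here are made up): to invert isOdd we don’t need the original input, any preimage will do, and one is trivial to construct:

                          ```go
                          package main

                          import "fmt"

                          // isOdd maps every integer to a single bit of output.
                          func isOdd(n int) bool { return n%2 != 0 }

                          // preimage produces SOME input for a desired output, which
                          // is all an attacker needs; this is why isOdd cannot be a
                          // one-way function.
                          func preimage(want bool) int {
                          	if want {
                          		return 1
                          	}
                          	return 0
                          }

                          func main() {
                          	fmt.Println(isOdd(preimage(true)), isOdd(preimage(false))) // true false
                          }
                          ```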

                          1. 1

                            Interesting, cheers. I will let this stew in the ol’ noggin.

                      2. 2

                        [Meow hash] was designed for speed, not safety

                        My understanding is that speed and safety are basically the same thing:

                        • Either you want maximum security for given speed constraints,
                        • or you want maximum speed for a given security margin.

                        Thus, the quality of a symmetric primitive is basically measured by how many rounds it needs to thwart all known attacks (assuming the community made a concerted effort to break it, mind you), and how much time each round takes.

                        Now if we increase the rounds of MeowHash until it slows down to the speed of some modern hash like BLAKE2, BLAKE3, or SHA-3, and we notice that it has a similar security margin, then it’s probably pretty good. If however the margin is smaller, or it stays broken even with that many rounds, then it’s just crap…

                        …which may be a worthy trade-off anyway if it was never meant to be attacked in the first place of course, especially if the first rounds take little computation.

                      1. 27

                        there is a fair amount of evidence that people who have a pressing need can use OpenPGP successfully.

                        The bar is continuously set lower and lower by PGP enthusiasts.

                        1. 2

                          This is not to say that OpenPGP, its various implementations, and the applications that integrate it don’t have problems. They do. The OpenPGP working group needs to finish the cryptographic refresh and finally standardize AEAD. We need email clients that are built with security-oriented workflows for people who have powerful adversaries. And, we need popular mail clients to adopt opportunistic encryption similar to Signal for ordinary users.

                          Also, we could discuss the possibility of ditching PGP and using something else (like Age)… still over email.

                          1. 4

                            S/MIME works reasonably well, if you can get a certificate. It works very well for signing emails, which is the thing I care the most about: my bank or solicitor can easily afford to pay for a cert and set up the infrastructure to roll out signing certificates for all employees’ email so that any communication with customers is signed. It’s less good for encryption, because there’s no good way of discovering a key (and key management for decryption is exciting, especially if you want to be able to revoke keys from certain devices if you lose them).

                        1. 4

                          Apart from the name/manufacturer of the received vaccine, there is no superfluous data inside, so the QR code is not a privacy nightmare, as some have feared.

                          It’s still not ideal. For most people it doesn’t matter, but some are going to need more than 2 shots, some only need 1, and some will be exempt. And that, people, can easily give insight into one’s health:

                          • Exempt? Maybe you have some blood disease, or other condition that prevents you from being vaccinated?
                          • 1 shot? You’ve probably been infected.
                          • 3 shots? Now I can suspect your immune system is deficient in some way.

                          It would be better, I think, to have an expiration date instead, or some range of validity.

                          1. 4

                            It would be better, I think, to have an expiration date instead, or some range of validity.

                            The problem with that idea is that 1) the codes are valid in multiple jurisdictions that consider a person immunized for a different amount of time¹ and 2) those expiration dates could change with new covid developments². So such codes would have to be generated for each jurisdiction and have a short lifetime. This would in practice mean that codes would be generated on the spot.

                            This can give insights into the social behaviour of people (e.g. “Alice generated a new QR code at 21:39:02 from the WiFi of the Foo-bar. Bob has downloaded a new code from the same IP at 21:38:54”), which is IMO much more privacy-invading. Also, those 1-shot/3-shot special cases aren’t implemented (yet) in Austria anyway; those people can only use their paper documents for now.


                            ¹: Not just different EU member states, but regional lockdowns might have different requirements as well.

                            ²: It’s even worse: validity length already varies for tests depending on the circumstances: e.g. a PCR test in Austria is valid for 72h when visiting a restaurant, 48h when crossing the border to Italy, or 7 days when clocking in to work.

                            1. 1

                              Oops, good point…

                          1. 5

                            Well, the government are not joking. What happened to medical confidentiality?

                            1. 16

                              Having to prove you have a vaccination has been a requirement in all manner of situations before this - like international travel.

                              1. 8

                                I live in France, and a number of vaccines are already mandatory (for obvious public health reasons).

                                I’ve never had to present a proof of vaccination when I go to the theatre. Or a theme park. Or anywhere within my country for that matter. Even for international travel: I didn’t need to give the USA such proof when I came to see the total solar eclipse in 2019. I’ve also never had to disclose the date of my vaccines, or any information about my health.

                                What you call “all manner of situation” is actually very narrow. This certificate is something new. A precedent.

                                1. 9

                                  and a number of vaccines are already mandatory (for obvious public health reasons).

                                  This is why you’ve not been asked for proof for international travel, since it’s assumed that you’ll have received these immunisations or be unexposed through herd immunity as someone who resides in France.

                                  We’re currently in a migration period where some people are immunised and others aren’t. We’ve had this happen before– the WHO is responsible for coordinating the Carte Jaune standard (first enforced on 1 August 1935) to aid with information sharing, but they haven’t extended it to include COVID-19 immunisation yet.

                                  In a 1972 article, the NYTimes headlines “Travel Notes: Immunization Cards No Longer Needed for European Trips” regarding Smallpox immunisations.

                                  Still, even today, immigrants applying to the United States for permanent residency remain required to present evidence of vaccinations recommended by the CDC: https://www.cdc.gov/immigrantrefugeehealth/laws-regs/vaccination-immigration/revised-vaccination-immigration-faq.html#whatvaccines

                                  1. 3

                                    (Note: international travel is one use case where I believe it’s perfectly legitimate to ask for evidence of vaccination. It’s the only way a country can make sure it won’t get some public health problem on its hands, which makes it a matter of sovereignty.)

                              2. 1

                                It’s not the government that’s sharing this information. It’s you when you present that QR code. This is equivalent to your doctor printing out a piece of your medical records and handing it to you. You can do whatever the hell you want with that piece. It’s your medical history. If you want to show it to someone, you can. If you don’t want to show it to someone, you can. The government only issues the pass. Nothing more.

                                1. 2

                                  The QR code has a very important difference from a piece of paper one would merely look at: its contents are trivially recorded. A piece of paper, on the other hand, is quickly forgotten.

                                  This is equivalent to your doctor printing out a piece of your medical records and handing it to you.

                                  No, this is equivalent to me printing out a piece of my medical record and handing it to the guard at the entrance of the theatre. And I’m giving them way more than they need to know. They only need a cryptographic certificate with an expiration date, but I’m giving them when I got my shot and whether I’ve been naturally infected. I can already see insurance companies buying data from security companies.

                                  You can do whatever the hell you want with that piece. It’s your medical history.

                                  There’s a significant difference between the US and the EU here that is worth emphasising. In the US, your personal information (such as your medical history) is kind of your property. You can give it away or sell it and all sorts of things. In the EU however your personal information is a part of you, and as such is less alienable than your property. I personally align with the EU more than the US on this one, because things that describe you can be used to influence, manipulate, and in some cases persecute you.

                                  If you want to show it to someone, you can. If you don’t want to show it to someone, you can.

                                  Do I really have that choice? Can I really choose not to show my medical history if it means never showing up at the theatre or any form of crowded entertainment? Here’s another one: could you actually choose not to carry a tracking device with you at nearly all times? Can you live with the consequences of no longer owning a cell phone?

                                  1. 0

                                    If you carry a tracking device with you at all times, why do you care about sharing your vaccination status? And why should someone medically unable to be vaccinated care about your privacy when their life is at risk?

                                    As someone whose father is immunocompromised, and with a dear friend who could not receive the vaccine due to a blood disease, fuck off. People have died.

                                    1. 3

                                      fuck off. People have died.

                                      Since you’re forcing my hand, know that I received my first injection not long ago, and have my appointment for the second one. Since I have good health, I don’t mind sharing too much.

                                      What I do mind is that your father and dear friend have to share their information. Your father will likely need more than 2 injections. If it’s written down, we can suspect a compromised immune system. Your friend will be exempt. If it’s written down, we can suspect some illness. That makes them vulnerable, and I don’t want that. They may not want that.

                                      Now let’s say we do need that certificate. Because yes, I am willing to give up a sliver of liberty for the health of us all. The certificate only needs 3 things:

                                      • Information that can be linked to your ID (some number, your name…)
                                      • An expiration date.
                                      • A cryptographic certificate from the government.

                                      That’s it. People reading the QR-code can automatically know whether you’re clear or not, and they don’t need to know why.
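                                      A minimal sketch of that three-field certificate, assuming an Ed25519 key pair held by the government; every function name and payload field here is hypothetical:

                                      ```go
                                      package main

                                      import (
                                      	"crypto/ed25519"
                                      	"fmt"
                                      )

                                      // makePass signs only an identifier and an expiration date;
                                      // nothing in the payload says WHY the holder is clear.
                                      func makePass(priv ed25519.PrivateKey, id, expires string) ([]byte, []byte) {
                                      	payload := []byte("id=" + id + ";expires=" + expires)
                                      	return payload, ed25519.Sign(priv, payload)
                                      }

                                      // checkPass is all a venue's scanner needs: a pass/fail
                                      // answer against the government's public key.
                                      func checkPass(pub ed25519.PublicKey, payload, sig []byte) bool {
                                      	return ed25519.Verify(pub, payload, sig)
                                      }

                                      func main() {
                                      	pub, priv, _ := ed25519.GenerateKey(nil) // government key pair
                                      	payload, sig := makePass(priv, "12345", "2021-12-31")
                                      	fmt.Println(checkPass(pub, payload, sig))            // true
                                      	fmt.Println(checkPass(pub, []byte("tampered"), sig)) // false
                                      }
                                      ```

                                      A real deployment would also need to match the ID against the holder’s papers and compare the expiry against the clock, but neither step requires exposing any medical detail.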

                                      If you carry a tracking device with you at all times, why do you care about sharing your vaccination status?

                                      I do not carry that device by choice. The social expectation that people can call me at any time is too strong. I’m as hooked as any junkie now.

                                      1. 2

                                        I am willing to give up a sliver of liberty for the health of us all.

                                        I appreciate your willingness, your previous comments made me think you weren’t. I apologize for my hostility. I think we can agree we should strive to uphold privacy to the utmost, but not at the expense of lives.

                                        That’s it. People reading the QR-code can automatically know whether you’re clear or not, and they don’t need to know why.

                                        That’s true, and that system would be more secure. But the additional detail could provide utility that outweighs that concern.

                                        I can already see insurance companies buying data from security companies.

                                        Insurance companies already have access to your medical history in the US. Equitable health care is an ongoing struggle here. ¯\_(ツ)_/¯

                                        Edit: I removed parts about US law that could be incorrect, as IANAL.

                                        1. 5

                                          Deep breath, C-f HIP … sigh

                                          HIPAA states PHI (personal health information) cannot be viewed by anyone without a need to know that information, and information systems should never even allow unauthorized persons to view that information in the first place. Device or software that displayed PHI to a movie theatre clerk would never go to market because it would never pass HIPAA compliance.

                                          Damn it, no, this is incredibly wrong.

                                          HIPAA applies to covered entities and business associates only. Covered entities are health care providers, insurance plans, and clearinghouses/HIEs. Business associates are companies that provide services to covered entities – so if you are an independent medical coder that reads doctor notes and assigns ICD10 codes, you’re covered because you provide services to a covered entity. How do you know if you’re a business associate? You’ve signed a BAA.

                                          Movie theaters are not covered entities, and are not business associates. HIPAA has zero bearing on what they do. Your movie theater clerk could absolutely mandate you share your vaccination status – just like your doughnut vendor can ask in exchange for a free doughnut.

                                          1. 1

                                            Your movie theater clerk could absolutely mandate you share your vaccination status

                                            Yeah. As the movie theater is private property, and “unvaccinated” isn’t a protected group, they are allowed to discriminate all they want.

                                            But I admit I am surprised they’d legally be able to store and sell your medical records. It seems you’re correct, and I had incorrectly generalized my experience and knowledge dealing with other covered entities all day to non-covered entities. A classic blunder of a programmer speaking about law, whoops. I’ve cut those statements from my prior comment.

                                            I still don’t think that vaccination information would be any news to insurance companies, but I’m yet again disappointed by US privacy law.

                                            1. 2

                                              Yeah. As the movie theater is private property, and “unvaccinated” isn’t a protected group, they are allowed to discriminate all they want.

                                              It is conceivable you could make an ADA argument here – “I can’t get a COVID vaccination due to a medical condition; therefore, you need to provide a reasonable accommodation to me”. But that’s maybe a stretch, I’m not sure.

                                              But I admit I am surprised they’d legally be able to store and sell your medical records

                                              I think a lot of this comes down to training about HIPAA. If you’re in-scope for HIPAA, many places (rightfully) treat PHI as radioactive and communicate that to employees. And there’s very little risk in overstating the risk around mishandling PHI - it’s far safer to overmessage the dangers to people who work with it.

                                              Indeed, until I needed to get involved on the compliance side – after all, somebody has to quote HITRUST controls for RFPs – I overfit HIPAA as well.

                                              I’m yet again disappointed by US privacy law.

                                              If you want to feel marginally better, go read up on 42 CFR Part 2. It still only applies to covered entities but it offers real, meaningful protections to an especially vulnerable population: people seeking treatment for substance use disorder. It also makes restrictions around HIPAA data handling look trivial.

                                          2. 2

                                            But the additional detail could provide utility that outweighs that concern.

                                            Possibly. That would need to be studied and justified, I believe.

                                            Furthermore any reader of these QR codes should only return a pass/fail result, […]

                                            Actually that’s what I expect from official programs, including in France. The problem is the QR code itself: any program can read it, and it’s too easy (and therefore tempting) to write or use a program that displays (or records!) everything.

                                            HIPAA laws are some of the few here that have teeth

                                            Hmm, that’s less horrible than I thought, then. Glad to hear it.

                                            1. 1

                                              Hmm, that’s less horrible than I thought, then. Glad to hear it.

                                              As @owen points out, IANAL and these laws don’t apply in this circumstance. I still don’t think that vaccination information would be any news to insurance companies, but I’m yet again disappointed by US privacy law.

                                1. 6

                                  What if we prevented people from accessing content based on their vaccination status? What if you couldn’t read Reddit, watch Youtube, or post on Instagram without first proving that you’ve been vaccinated? I think we’d up our numbers pretty quick.

                                  This isn’t something that @bertrandom can arbitrarily decide to do, thankfully. The major corporations who own these platforms would have to make this decision, and then implement it in a way that is difficult to bypass, despite the amount of developer attention that bypass methods for these major websites would attract. Imagine a software project like youtube-dl that was constantly being updated to bypass the latest iteration of the vaccination paywall.

                                  Of course, there’s an accelerationist argument that if Youtube, Reddit, etc. block more and more people from using their platforms on ideological or quasi-ideological grounds (and willingness to provide vaccination status is definitely ideologically-linked), this would weaken the network effect of these platforms and create a larger potential userbase for alternative platforms, which would create laudable competition in the space and offer an opportunity for free-software platforms (e.g. Peertube) to gain mindshare.

                                  1. 41

                                    This is a joke.

                                    1. 7

                                      It is, in fact, not a joke. (Relax! This is a joke.)

                                      1. 7

                                        I don’t think the article is a joke, and I’m not joking. I think the author dresses up the article as a joke, but he’s really pretty serious about it.

                                        1. 7

                                          The author states twice that “this is a joke”, so I’m sure it is a joke.

                                          It’s also a massive, successful troll, as can be evidenced by this comment thread.

                                          1. 2

                                            I don’t think the author is trying with this blog post to make a serious public health policy proposal, or to seriously get the ear of people who work at Reddit or Youtube.

                                            On the other hand, according to the author’s about page, he works at Slack, which owns another proprietary communications platform much like the other ones mentioned, that many people use (often because it’s a requirement of their jobs - that’s why I use Slack myself, for instance).

                                            Slack is not a huge company, and it’s easy to imagine that a Slack employee who writes a sentence like “What if we prevented people from accessing content based on their vaccination status?” might be involved in an internal product decision about whether or not to implement something like this vaccine paywall. If not specifically for the COVID vaccine, then for some potential future public-health measure where people have intense political disagreement over the appropriate response. “What if you couldn’t chat on Slack without first proving that you’ve been vaccinated? I think we’d up our numbers pretty quick.” is as good a hypothetical as the original.

                                            Anyway, even if the author is writing this entirely as a joke, I assume that people exist who would actually like to influence the public-health-related behavior of the public at large by building personal medical information checks into the software they use frequently, and think about ways to bypass these checks before they actually get built. And one great way to avoid even the possibility of these kinds of checks is to avoid using platforms run by someone other than you to begin with.

                                            1. 8

                                              Slack is not a huge company

                                              No, it’s not. Slack is a chat application. Slack is owned by Salesforce, which is an absolute behemoth of a company.

                                              The author was using the idea of vaccination-status-based-access-control as a conceit to make a discussion of how to parse the data encoded in that QR code less bland. It’s neither a policy proposal nor any kind of political statement. It’s a joke to make more people read a blog post that would otherwise be extremely tedious. Like the author says. Repeatedly.

                                              Your assumption that “people exist who would actually like to influence…” is really only interesting in the sense of rule 34. If you can imagine something, people exist who are interested and they are probably sharing details about it on the internet.

                                        2. 5

                                          This “joke” feels more like a warning to me.

                                          1. -2

                                            It’s racism and bigotry plain and simple

                                            1. 6

                                              I think racism is a specific kind of bigotry, so I’m going to just use that as shorthand instead of “racism and bigotry” for the remainder of this comment. Please read “racism” as “racism and bigotry” and “racist” as “racist and bigoted” to the extent that you draw a distinction between racist bigotry and other bigotry for the purposes of this comment.

                                              What is racist about telling people how to parse these QR codes using a joke about access control?

                                              Please be very specific, because I’m really trying to see how it’s racist, and I just can’t. Are vaccines being denied to people on the basis of race or ethnicity? Where? How? I know there were allegations of that during the early US rollout, but even if they were true then, they are not currently true.

                                              I keep trying to come up with a way to phrase my question to make it sound less combative. But I can’t, so I’ll just explicitly say that it is a good faith question and not some rhetorical point. I have not heard of vaccines being tied to race anywhere, and I would really like to understand the association you are making.

                                              1. 3

                                                This is a joke

                                                1. 2

                                                  If you can be vaccinated and a vaccine is available to you and you’re not getting one, you should feel bad.

                                                  Again I’m not laughing

                                                  So go ahead build your wall. Exclude the 3rd world, it’s what the goal is.

                                                  1. 7

                                                    Read what you quoted.

                                                    If you can be vaccinated

                                                    and a vaccine is available to you

                                          1. 3

                                            Even though this is meant as a joke, it also serves as a nice PoC for people who will actually, unironically, want to deploy something like this.

                                            I’m not entirely sure how I feel about this. Honestly, some jokes might better remain untold.

                                            1. 6

                                              The way I understand it, it’s not a joke. It’s a warning.

                                            1. 8

                                              Today I wrote a little table printer, you know, stuff like this:

                                                   a     │           b           │
                                              ───────────┼───────────────────────┤
                                              asd        │ 4,434,341,321,312,321 │
                                              asdasdsad  │               443,434 │
                                              

                                              Before I knew what was going on I had added options to use +---+---+ instead of the unicode box-drawing characters, options to configure the box-drawing characters per-cell, a CSV output mode, and a bunch of other options I don’t need, and all sorts of methods and constants to set all of this. All I wanted was a simple table printer for a little program I wrote that’s a bit more advanced/nicer than Go’s stdlib tabwriter.

                                              I ended up removing most of it; it more than doubled the code size, and I don’t need any of these features. I suspect most people don’t. And if you do: well, use something else then.

                                              And I’m actually fairly conscious about limiting the scope of libraries and such. It’s so easy to get carried away.

                                              I wouldn’t really phrase it as “opinionated” though, but rather more along the lines of “this only solves use case X; other use cases are entirely reasonable, but not solved by this library”, although that doesn’t have such a nice ring to it. It’s fine to solve only 20% of use cases, and if you do it well then those 20% will be solved a lot better than with a much larger generic library/tool!
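                                              For comparison, here’s roughly the stdlib baseline being improved upon — a minimal sketch using Go’s text/tabwriter, reusing the cell values from the example above. tabwriter only aligns columns; borders, number grouping, and per-cell styling are exactly the things a custom printer has to add:

                                              ```go
                                              package main

                                              import (
                                              	"fmt"
                                              	"os"
                                              	"text/tabwriter"
                                              )

                                              func main() {
                                              	// minwidth 0, tabwidth 0, padding 2, pad with spaces, no flags:
                                              	// each column becomes as wide as its widest cell, plus two spaces.
                                              	w := tabwriter.NewWriter(os.Stdout, 0, 0, 2, ' ', 0)
                                              	fmt.Fprintln(w, "a\tb")
                                              	fmt.Fprintln(w, "asd\t4,434,341,321,312,321")
                                              	fmt.Fprintln(w, "asdasdsad\t443434")
                                              	w.Flush() // nothing is written until Flush computes column widths
                                              }
                                              ```

                                              Passing the tabwriter.AlignRight flag gets you right-aligned numbers, but the box-drawing borders are on you.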

                                              1. 3

                                                I ended up removing most of it; it more than doubled the code size, and I don’t need any of these features. I suspect most people don’t. And if you do: well, use something else then.

                                                How about forking your code instead? Thanks to your simple and focused design, I can more easily modify it to suit my own needs.

                                                While your more flexible version had a better chance of solving my problem, I would have a harder time adapting it if it didn’t.

                                                1. 1

                                                  This kind of goes back to the MVP philosophy. It even works with Open Source in my opinion.

                                                  In fact, it works even better. If you publish a utility or a library that works for you, and people start using it, 2 things will happen.

                                                  1. People will open Feature Requests against your code, saying “Can we have it do x?” You can then decide to take another look at that idea.
                                                  2. Less often, some hero will come along and say: “I used your code and found myself needing to do x, so I wrote the following patch to have it do x. Can you merge it into your code?”

                                                  Every time #2 happens to me, it makes my day.

                                                  PS: Linux was the original example of a small piece of opinionated software that grew this way. I would argue Ruby on Rails is a later example.

                                                2. 1

                                                  Clay Shirky called it Situated Software.

                                                  1. 1

                                                    It is an interesting issue that comes up often in code.

                                                    I was making an interface for some middleware (testing out writing something to handle incoming HTTP requests in Node). The code I initially wrote for flexibility was rather long, but as soon as I rewrote it closer to the interface most people end up using, it came out to something like five lines.

                                                    I have been looking at the CanActivate interface in angular, which is a guard on a route:

                                                    canActivate(route: ActivatedRouteSnapshot, state: RouterStateSnapshot): Observable<boolean | UrlTree> | Promise<boolean | UrlTree> | boolean | UrlTree
                                                    

                                                    Just look at those return types! This is a perfect example of something that went down a rabbit hole, but one that I think is pretty justified. You definitely want pretty much all those cases at some point.

                                                    But where it becomes a real issue is when there are edge cases where the documentation isn’t specific about what happens. Although I am failing to come up with an example just now.

                                                  1. 12

                                                    Kind of a strawman. Here’s a modern C++ raw loop:

                                                    for (const auto &element : container) {
                                                        // do stuff with element
                                                    }
                                                    

                                                    This is quite a bit lighter than the old raw loops the article is complaining about, and very widely applicable (every time the container implements iterators, which is almost always). Stuff like std::for_each and std::find_if can be implemented simply enough that we’re not even tempted to reach for <algorithm>.

                                                    Also, <algorithm> is not blameless either: the input collection is almost systematically described with two iterators, instead of the entire collection. Guess what, we rarely operate over a portion of the collection. Almost every time we say collection.begin() and collection.end() (or the more “modern”, more verbose std::begin(collection) and std::end(collection)). Compare std::find_if() to what it could have been:

                                                    // current API
                                                    auto it_found = std::find_if(std::cbegin(my_vect), std::cend(my_vect), predicate);
                                                    
                                                    // better API
                                                    auto it_found = std::find_if(my_vect, predicate);
                                                    

                                                    People would use <algorithm> more if they didn’t require such unjustified overhead.

                                                    1. 3

                                                      I think that’s the point of ranges and range versions of algorithms in C++20. Took long enough to think of pairing begin/end into a single structure! :)

                                                      1. 1

                                                        I’d say range-based loops don’t count as “raw loops”. I know the article defines a raw loop as simply a for loop, but the real problem it tries to avoid is the for(;;) form, which is prone to off-by-ones and allows complex loop conditions and fiddling with the iteration variable. A simple for-each loop has none of these issues.

                                                        1. 1

                                                          I understand, but by omitting range-for, the author posited a false dichotomy. It felt a bit like we were stuck in the pre-C++11 era (and back then <algorithm> was unusable because there was no way to put the body of the loop where it belonged).

                                                      1. 2

                                                        When will all their mirrors support https? Downloading something over http or even ftp does not feel like 2021.

                                                        1. 12

                                                          If they do this right (signed packages and so on), then https will only help with privacy. Which is important for sure, but leaking which packages you download is less horrible than leaking the contents of your conversations, or even just who you’ve been in contact with lately.

                                                          1. -1

                                                            HTTPS is more than just privacy. It also prevents JavaScript injection via ISPs, or any other MITM.

                                                            1. 21

                                                              It does that for web pages, not for packages. Packages are signed by the distro’s keys, so if anyone were to mess with your packages as you download them, your package manager would notice and prevent you from installing the package. The only real advantage to HTTPS for package distribution is that it helps conceal which packages you download (though even then, I suspect an attacker could get a pretty good idea just by seeing which server you’re downloading from and how many bytes you’re downloading).

                                                              1. 1

                                                                It does that for web pages, not for packages

                                                                Indeed, however ISOs, USB installers, etc. can still be downloaded from the web site.

                                                                Packages are signed by the distro’s keys, so if anyone were to mess with your packages as you download them, your package manager would notice and prevent you from installing the package.

                                                                Yes, I’m familiar with cryptographic signatures.

                                                                1. 9

                                                                  Indeed, however ISOs, USB installers, etc. can still be downloaded from the web site.

                                                                  Yes. The Debian website uses HTTPS, and it looks like the images are distributed over HTTPS too. I thought we were talking about distributing packages using HTTP vs HTTPS. If your only point is that the ISOs should be distributed over HTTPS then of course I agree, and the Debian project seems to as well.

                                                                  1. 0

                                                                    No, the point is that there is no need for HTTP when HTTPS is available. Regardless of traffic, all HTTP should redirect to HTTPS IMNSHO.

                                                                    1. 16

                                                                      But… why? Your argument for why HTTPS is better is that it prevents JavaScript injection and other forms of MITM. But MITM clearly isn’t a problem for package distribution. Is your argument that “HTTPS protects websites against MITM, so packages should use HTTPS (even though HTTPS doesn’t do anything to protect packages from MITM)”?

                                                                      I truly don’t understand what your reasoning is. Would you be happier if apt used a custom TCP-based transport protocol instead of HTTP?

                                                                      1. 6

                                                                        I suspect that a big reason is cost.

                                                                        Debian mirrors will be serving an absurd amount of traffic, and will probably want to serve data as close to wire speed as possible (likely 10G). Adding a layer of TLS on top means you need to spend money on a powerful CPU or accelerator kit, instead of (mostly) shipping bytes from the disk to the network card.

                                                                        Debian won’t be made of money, and sponsors won’t want to spend more than they absolutely have to.

                                                                        1. 4

                                                                          But MITM clearly isn’t a problem for package distribution.

                                                                          It is though! Package managers still accept untrusted input data and usually do some parsing on it. apt has had vulnerabilities and pacman as well.

                                                                          https://justi.cz/security/2019/01/22/apt-rce.html

                                                                          https://xn--1xa.duncano.de/pacman-CVE-2019-18182-CVE-2019-18183.html

                                                                          TLS would not prevent malicious mirrors in either of these cases, but it would prevent MITM attacks exploiting these issues.

                                                                          1. 7

                                                                            Adding TLS implementations also brings bugs, including RCEs. And Debian is using GnuTLS for apt.

                                                                            1. 1

                                                                              Indeed. It was one of the reasons for OpenBSD to create signify, so I’m delighted to see Debian’s new approach might be based on it.

                                                                              From https://www.openbsd.org/papers/bsdcan-signify.html:

                                                                              … And if not CAs, then why use TLS? It takes more code for a TLS client just to negotiate hello than in all of signify.

                                                                              The first most likely option we might consider is PGP or GPG. I hear other operating systems do so. The concerns I had using an existing tool were complexity, quality, and complexity.
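                                                                              The appeal of the signify model is that it’s essentially just a bare Ed25519 signature over the artifact — far less machinery than TLS or PGP. A toy sketch of the idea in Go (this is the underlying primitive only, not signify’s actual file format):

                                                                              ```go
                                                                              package main

                                                                              import (
                                                                              	"crypto/ed25519"
                                                                              	"crypto/rand"
                                                                              	"fmt"
                                                                              )

                                                                              func main() {
                                                                              	// The distro generates this once and ships the public key with
                                                                              	// the OS; mirrors only ever see signed artifacts, never the key.
                                                                              	pub, priv, err := ed25519.GenerateKey(rand.Reader)
                                                                              	if err != nil {
                                                                              		panic(err)
                                                                              	}

                                                                              	pkg := []byte("bytes of some package")
                                                                              	sig := ed25519.Sign(priv, pkg)

                                                                              	// A client checks the download against the pinned public key; any
                                                                              	// mirror or MITM that alters the bytes invalidates the signature.
                                                                              	fmt.Println("intact:", ed25519.Verify(pub, pkg, sig))

                                                                              	pkg[0] ^= 1 // flip one bit
                                                                              	fmt.Println("tampered:", ed25519.Verify(pub, pkg, sig))
                                                                              }
                                                                              ```

                                                                              This prints `intact: true` then `tampered: false` — which is why signing makes the transport’s integrity guarantees redundant for package contents.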

                                                                    2. 7

                                                                      @sandro originally said: “When will all their mirrors support https?” Emphasis on “mirrors”. To the best of my knowledge, “mirror” in this context does not refer to a web site, or a copy thereof, but to a package repository.

                                                                      I responded specifically in this context. I was not talking about web sites, which rely on the transport mechanism for all security. In the context I was responding to, each package is signed. Your talk of JavaScript injection and other MITM attacks is simply off topic.

                                                              2. 9

                                                                ftp.XX.debian.org are CNAMEs to servers that have agreed to host a mirror. These servers are handled by unrelated organisations, so it is not possible to provide a proper cert for them. This matches the level of trust: mirrors are not trusted with the content nor the privacy. This is not the case for deb.debian.org, which is available over HTTPS if you want (ftp.debian.org is an alias for it).

                                                                1. 2

                                                                  Offline mirrors, people without direct internet access, decades-later offline archives, people in the future, local DVD sets.

                                                                  Why “trust” silent media?

                                                                1. 7

                                                                  Is this a proposal, or have they adopted this already? Signify/minisign is so much better than multi-purpose tools like PGP/GPG. I hope they get this done ASAP.

                                                                  1. 4

                                                                    I think that’s an early specification, it even mentions “Rough Draft” at the end.

                                                                    That being said, it also covers cohabitation between the two, which would help with a smoother transition and hopefully faster implementation and adoption.

                                                                  1. 8

                                                                    Wrong title. This post is fairly interesting and well written, but it doesn’t really explain why we need build systems. Instead, it tells us what build systems do. And while I do see the author trying to push us towards widely used build systems such as CMake, he offers little justification. He mentions that most developers seem to think CMake makes them suffer, but then utterly fails to address the problem. Are we supposed to just deal with it?

                                                                    For simple build system like GNU Make the developer must specify and maintain these dependencies manually.

                                                                    Not quite true: there are tricks that allow GNU Make to keep track of dependencies automatically, thanks to the -M family of options in GCC and Clang. Kind of a pain in the butt, but it can be done.
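                                                                    The usual shape of that trick, sketched for a hypothetical src/ layout (the target names here are invented for illustration):

                                                                    ```make
                                                                    SRCS := $(wildcard src/*.c)
                                                                    OBJS := $(SRCS:.c=.o)

                                                                    # -MMD writes a .d file listing the headers each object depends on;
                                                                    # -MP adds phony targets so deleting a header doesn't break the build.
                                                                    %.o: %.c
                                                                    	$(CC) -MMD -MP -c $< -o $@

                                                                    app: $(OBJS)
                                                                    	$(CC) -o $@ $(OBJS)

                                                                    # Pull in the generated dependency fragments; -include silently
                                                                    # ignores files that don't exist yet (e.g. on the very first build).
                                                                    -include $(SRCS:.c=.d)
                                                                    ```

                                                                    After the first compile, touching any header triggers rebuilds of exactly the objects that include it, without listing those headers by hand.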

                                                                    A wildcard approach to filenames (e.g. src/*.cpp) superficially seems more straightforward as it doesn’t require the developer to list each file allowing new files to be easily added. The downside is that the build system does not have a definitive list of the source code files for a given artefact, making it harder to track dependencies and understand precisely what components are required. Wildcards also allow spurious files to be included in the build – maybe an older module that has been superseded but not removed from the source folder.

                                                                    First, tracking dependencies should be the build system’s job. It can and has been done. Second, if you have spurious files in your source tree, you should remove them. Third, if you forget to remove an obsolete module, I bet my hat you also forgot to remove it from the list of source files.

                                                                    Best practice says to list all source modules individually despite the, hopefully minor, extra workload involved when first configuring the project or adding additional modules as the project evolves.

                                                                    In my opinion, best practice is wrong. I’ll accept that current tools are limited, but we shouldn’t have to redundantly type out dependencies that are right there in the source tree.


                                                                    That’s it for the hate. Let’s talk solutions. I personally recommend taking a look at Shake, as well as the paper that explains the theory behind it (and other build systems as well). I’ve read the paper, and it has given me faith in the possibility of better, simpler build systems.

                                                                    1. 3

                                                                      We need to distinguish between build execution (ninja) and build configuration (autotools). The paper is about the execution. Most of complexity is in the configuration. (The paper is great though 👍)

                                                                      1. 2

                                                                        I have looked at Shake and its paper before, but I am curious: what would you like to see in a build system?

                                                                        I ask because I am building one. 1

                                                                        1. 4

                                                                          I’m a peculiar user. What I want (and build) is simple, opinionated software. This is the Way.

                                                                          I don’t need, nor want, my build system to cater to God knows how many environments, like CMake does. I don’t care that my dependencies are using CMake or the autotools. I don’t seek compatibility with those monstrosities. If it means I have to rewrite some big build script from scratch, so be it. Though in all honesty, I’m okay with just calling the original build script and using the artefacts directly.

                                                                          I don’t need, nor want, my build system to treat stuff like unit testing and continuous integration specially. I want it to be flexible enough that I can generate a text file with the test results, or install & launch the application on the production server.

                                                                          I want my build system to be equally useful for C, C++, Haskell, Rust, LaTeX, and pretty much anything. Just a thing that uses commands to generate missing dependencies. And even then most commands can be as simple as calling some program. They don’t have to support Bash syntax or whatever. I want multiple targets and dynamic dependencies. And most of all, I want a strong mathematical foundation behind the build system. I don’t want to have to rebuild the world “just in case”.


                                                                          Or, I want a magical build system where I just tell it where the entry point of my program is, and it fetches and builds the transitive closure of the dependencies. Which seems possible in some closed ecosystems like Rust or Go. And I want that build system to give me an easy way to run unit tests as part of the build, as well as to install my program, or at least give me installation scripts. (This is somewhat contrary to the generic build system above.)

                                                                          That said, if the generic build system can stay simple and is easy enough to use, I probably won’t need the “walled garden” version.

                                                                          1. 2

                                                                            Goodness; you know exactly what you want.

                                                                            Your comment revealed some blind spots in my current design. I am going to have to go back to the drawing board and try again.

                                                                            I think a big challenge would be to generate missing dependencies for C and C++, since files can be laid out haphazardly with no rhyme or reason. However, for most other languages, which have true module systems, that may be more possible.

                                                                            Thank you.

                                                                        2. 2

                                                                          The real reason why globbing source files is unsound, at least in the context of CMake:

                                                                          Note: We do not recommend using GLOB to collect a list of source files from your source tree: If no CMakeLists.txt file changes when a source is added or removed, then the generated build system cannot know when to ask CMake to regenerate.

                                                                          I heard the same reason is why Meson doesn’t support it.

                                                                          1. 2

                                                                            Oh, so it’s a limitation of the tool, not something we actually desire… Here’s what I think: such glob patterns would typically be useful at link time, where you want the executable (or library) to aggregate all object files. Now the list of object files depends on the list of source files, which itself depends on the result of the glob pattern.

                                                                            So to generate the program, the system would fetch the list of object files. That list depends on the list of source files, and should be regenerated whenever the list of source files changes. As for the list of source files, it changes whenever we actually add or remove a source file. And to detect that, well… we would have to generate the list anew every time and see whether it changed.

                                                                            Okay, so there is one fundamental limitation here: if the project has a great many files, glob patterns can slow the build down. In that case it might be a good idea to fix the list of source files. I would still want a script that lists all available source files, so I don’t have to add each new file by hand. But I understand the rationale better now.
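                                                                            The regenerate-and-compare strategy described above can be sketched as follows (file names are illustrative): the list file is rewritten only when the globbed set actually changes, so anything that depends on the list file rebuilds exactly when a source is added or removed.

                                                                            ```go
                                                                            package main

                                                                            import (
                                                                            	"fmt"
                                                                            	"os"
                                                                            	"path/filepath"
                                                                            	"sort"
                                                                            	"strings"
                                                                            )

                                                                            // refreshList regenerates the glob every time, but rewrites
                                                                            // listFile only when the set of matches actually changed, so the
                                                                            // file's mtime is a reliable "the source list changed" signal.
                                                                            func refreshList(pattern, listFile string) (changed bool, err error) {
                                                                            	files, err := filepath.Glob(pattern)
                                                                            	if err != nil {
                                                                            		return false, err
                                                                            	}
                                                                            	sort.Strings(files) // stable order, independent of the filesystem
                                                                            	next := strings.Join(files, "\n")
                                                                            	prev, _ := os.ReadFile(listFile) // missing file: treat as empty
                                                                            	if string(prev) == next {
                                                                            		return false, nil
                                                                            	}
                                                                            	return true, os.WriteFile(listFile, []byte(next), 0o644)
                                                                            }

                                                                            func main() {
                                                                            	os.WriteFile("a.c", nil, 0o644)
                                                                            	os.WriteFile("b.c", nil, 0o644)
                                                                            	changed, _ := refreshList("*.c", "sources.list")
                                                                            	fmt.Println(changed) // first run: the list did not exist yet
                                                                            	changed, _ = refreshList("*.c", "sources.list")
                                                                            	fmt.Println(changed) // nothing added or removed
                                                                            }
                                                                            ```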

                                                                            2. 1

                                                                              First, tracking dependencies should be the build system’s job. It can and has been done.

                                                                              see: tup

                                                                              Second, if you have spurious files in your source tree, you should remove them.

                                                                              Conditionally compiling code at the file level is one of the best ways to do it, especially if you have some kind of plugin system (or class system). It’s cleaner than ifdefing out big chunks of code IMO.

                                                                              Traditionally, the reason has been that if you want make to rebuild your code correctly when you remove a file, you have to do something like

                                                                              SRCS := $(wildcard *.c)
                                                                              
                                                                              # .SRCS.var is rewritten only when the value of $(SRCS) changes,
                                                                              # so removing a source file correctly triggers a relink.
                                                                              .%.var: FORCE
                                                                              	@echo $($*) | cmp -s - $@ || echo $($*) > $@
                                                                              
                                                                              my_executable: $(SRCS) .SRCS.var
                                                                              	$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(SRCS) $(LDLIBS)
                                                                              
                                                                              FORCE:
                                                                              

                                                                              which is a bit annoying, and definitely error-prone.

                                                                              Third, if you forget to remove an obsolete module, I’d bet my hat you also forgot to remove it from the list of source files.

                                                                              One additional reason is that it can be nice when working on something which hasn’t been checked in yet. Imagine that you are working on adding the new Foo feature, which lives in foo.c. If you then need to switch branches, git stash and git checkout will leave foo.c lying around. By specifying the sources you want explicitly, you don’t have to worry about accidentally including it.

                                                                              1. 1

                                                                                Conditionally compiling code at the file level is one of the best ways to do it, especially if you have some kind of plugin system (or class system). It’s cleaner than ifdefing out big chunks of code IMO.

                                                                                Okay, that’s a bloody good argument. Add to that the performance implication of listing every source file every time you build, and you have a fairly solid reason to maintain a static list of source files.

                                                                                Damn… I guess I stand corrected.

                                                                            1. 4

                                                                              What I got out of the article:

                                                                              • Nowadays, the ISA is quite cleanly separated from the implementation: it’s not just decoded then executed, the stream of instructions is chopped (sometimes fused!) into micro-operations, whose characteristics are quite different from any instruction set: they’re wider (most likely to make decoding easier or even trivial), and generally do less work. Then there’s a second stage where the micro operations are actually executed (out of order). It’s almost as if the processor had some internal ISA focused on implementation instead of external constraints like code density, ease of programming, or plain backward compatibility. Anyway, this separation makes the RISC vs CISC debate a bit moot, because the ISA doesn’t affect the whole processor. Internally, processors do what they gotta do.

                                                                              • On the other hand, the ISA does matter. The x86 line of instruction sets is remarkably hard to decode, to the point where decoding has become a bottleneck, forcing CPU vendors to add µop caches and other complications so that instruction decoding could be fast enough (a problem ARM is mostly free from). So the separation of ISA and implementation isn’t quite complete, and a complex, hard-to-decode ISA not only has direct costs at the decoding step, it has indirect costs, because the rest of the CPU has to compensate for slower decoding. It’s not just a CISC vs RISC thing: it’s also about legacy, and about adding instructions to an instruction set that didn’t leave room for new instructions (quite unlike RISC-V). What we may call the “CISC tax” really is a “legacy tax”.

                                                                              • One important place where the ISA does matter, though, is backward compatibility. Especially through the ’90s and early 2000s, cross-compilation was even harder than it is now (what with every compiler and environment assuming they were compiling for the local machine), and distribution media were more limited: imagine shipping 3 versions of your productivity suite on the shelves (I believe this also explains, to some extent, the Windows hegemony). Now, however, we can just download the right executable for our platform. Anyway, this backward-compatibility advantage, as well as the quality of Intel’s foundries, skewed the game to the point where it isn’t clear whether the x86 tax will really matter in the long run. We stand to learn more as Apple deploys its new ARM processors (I for one was quite surprised: I thought x86-64 would be impossible to displace when Apple ditched their PowerPC processors. But I guess the reason Apple can switch away from x86-64 is the same reason it could switch to it: they have such control over their ecosystem that they can force everyone to recompile everything).

                                                                              1. 4

                                                                                The first point is both true and misleading. It’s true that there’s little complexity in the majority of the pipeline(s) from some instructions being a compressed representation. For example, an x86 register-memory add instruction may be cracked into a load and an ALU op in the front end and has little impact on the rest of the pipeline. This, however, ignores the massive impact of microcoded instructions. Microcode is very cheap to implement if the instructions that are implemented in microcode are rarely used: when you hit a microcoded instruction, you stall the pipeline and just push in the instructions from the microcode. If you have microcoded instructions that are performance-critical, though, then there’s a huge power / complexity cost to implementing them.

                                                                                The really interesting place where the architecture leaks into the microarchitecture is the memory model. TSO requires a lot more hardware resources (longer reorder buffers, more complex cache coherency) than a weak memory model. The really interesting thing about the Apple M1 is that they implement TSO for x86 emulation, so they’re paying that cost. I believe it’s configurable via an MSR: I’d love to know what they turn off in their core when it’s disabled.
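                                                                                The hardware litmus tests behind this can’t be reproduced deterministically in a short program, but the programmer-visible contract of a strong ordering can be sketched. This hypothetical example uses Go’s atomics (which guarantee the ordering that TSO gives plain x86 stores; on a weakly-ordered machine, plain stores would need explicit barriers to get the same result):

                                                                                ```go
                                                                                package main

                                                                                import (
                                                                                	"fmt"
                                                                                	"runtime"
                                                                                	"sync/atomic"
                                                                                )

                                                                                // Message-passing idiom: under a strong ordering, observing
                                                                                // ready == 1 guarantees the earlier write to data is visible.
                                                                                // Under a weak model with plain stores, the writes could be
                                                                                // reordered and the reader could see ready == 1 but stale data.
                                                                                func main() {
                                                                                	var data, ready int32

                                                                                	go func() {
                                                                                		atomic.StoreInt32(&data, 42) // happens before the flag...
                                                                                		atomic.StoreInt32(&ready, 1) // ...so this publishes it
                                                                                	}()

                                                                                	for atomic.LoadInt32(&ready) == 0 {
                                                                                		runtime.Gosched() // yield while waiting for the writer
                                                                                	}
                                                                                	fmt.Println(atomic.LoadInt32(&data))
                                                                                }
                                                                                ```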

                                                                                The third point is missing the ecosystem cost. You can just download an AArch64 binary quite easily now, because there are AArch64 ports of gcc, clang, V8, and so on (including debugging tools and other bits of critical developer infrastructure). The value of this ecosystem is estimated to be around $2bn. Arm is able to play in this space because, between Android and iOS, they have a huge number of developers over whom to amortise the cost of this ecosystem. In theory, it’s just as easy to download a MIPS binary, but there are so many holes in the developer ecosystem for MIPS that it’s unlikely to exist. This is the big struggle for RISC-V: it’s hard to bootstrap a software ecosystem if there aren’t any devices, and it’s hard to get people to invest in the hardware if there’s no software ecosystem.

                                                                                1. 1

                                                                                  That was insightful, thanks.

                                                                                  What are TSO and MSR? Three-letter acronyms are not easy to search for… I believe I can guess from context that TSO is some kind of strong memory model where everything seems to happen in order (from the programmer’s point of view), but I’m not sure exactly. I’m also not sure exactly how weak the weaker memory models you’re referring to are. I have no idea what MSR might mean, though.

                                                                                  I did miss the fact that Apple could lean on the iOS ecosystem (and Arm in general) to bootstrap its M1 change, though. I wonder whether that’s why they could switch to Intel as well: there was a huge user base and expertise to begin with.

                                                                                  I would love to see competitive processor designs based on RISC-V. I’ve seen both praise and criticism of what this ISA may or may not enable, but without some big company putting it to use in a seriously high-performance (or low-power) environment, as a layman I can’t know for sure, and that’s bloody frustrating. I want to simplify computing at every level (and did my little contribution to that end), so I want to know: does the RISC-V ISA (or family of ISAs, really) have a comparative advantage in performance or power efficiency? Or does its simplicity have a cost?

                                                                                  1. 4

                                                                                    What are TSO and MSR?

                                                                                    TSO is Total Store Ordering. It’s the x86 strong memory model.

                                                                                    MSR is machine-specific register (sometimes model-specific register). RISC-V calls them CSRs (control and status register). Basically, registers that aren’t used as normal operands for instructions and may have side effects when read from or written to.

                                                                                    I did miss the fact that Apple could lean on the iOS ecosystem (and Arm in general) to bootstrap its M1 change, though. I wonder whether that’s why they could switch to Intel as well

                                                                                    Yes. For OS X on PowerPC, Apple was mostly maintaining the GCC port, for example. A huge number of other folks were working on the GCC, GDB, and Binutils x86 versions. Apple had to add Mach-O support and any Darwin-specific bits, but those are tiny in comparison. IBM was mostly investing in XLC back then, so was happy for Apple to do all of the GCC work, Freescale also had their own proprietary compilers for embedded, so didn’t care much. Apple was basically maintaining the PowerPC software ecosystem.

                                                                                    does the RISC-V ISA (or family of ISAs, really) have a comparative advantage in performance or power efficiency? Or does its simplicity have a cost?

                                                                                    My personal view is that RISC-V went too far in terms of simplicity and will have to back-pedal. For example, RISC-V doesn’t (didn’t?) have a broadcast i-cache invalidate. This means that if you want to avoid stale instruction fetches, every time you map a page with execute permission you need to IPI all of the cores and do a local i-cache invalidate. SPARC did this and found that, on large systems (which, back then, meant >8 cores), the process-creation cost was too high. There are a lot of things like this, where you get a small simplicity benefit, but it turns out that the extra hardware complexity in Arm leads to much better whole-system performance.

                                                                                    I think this kind of thing is going to be the big problem with RISC-V. A lot of these things are easy to fix, and if you fix them then you can get a big competitive advantage. This means that the incentive for any RISC-V vendor is to add a bunch of custom extensions and show that (their branch of) Linux runs much faster on their hardware than on any other RISC-V vendor’s hardware. This kind of fragmentation is exactly what killed MIPS: every MIPS vendor did something custom and clever. Every vendor had their own Linux / GCC fork. No one got any ecosystem benefits, and anything not from a given vendor used MIPS III as the baseline, because that’s the most that everyone supported.

                                                                                    The only way out of this path that I see is for a large player (for example, Google or Huawei) to capture enough of the market that they define the de-facto RISC-V standard. If Google defined a RISC-V Android Extension, for example, and promised to support it for AOSP and the Play Store, then they’d probably get a number of CPU vendors to adopt it. At that point, however, it’s a royalty-free ISA, but not an openly developed one.

                                                                                    1. 1

                                                                                      My personal view is that RISC-V went too far in terms of simplicity and will have to back-pedal.

                                                                                      Okay thanks. I have the feeling this would be expected from a design that originated from a teaching environment. Also, if I understand correctly, this can mostly be solved by adding relevant extensions.

                                                                                      Speaking of which, I’m not sure we should be that worried about fragmentation: we’ve had the example of Intel and AMD implementing compatible multimedia and SIMD extensions, for instance. OpenGL also had vendor-specific extensions that were eventually standardised (possibly with a slightly different API, but standardised nonetheless). The same could happen to RISC-V: if I recall correctly, they have a process for standardising extensions. That said, a large player throwing its weight behind it would definitely help (with the drawback you mentioned).

                                                                                      it’s a royalty-free ISA

                                                                                      Something that puzzles me: is it even possible for an ISA, meaning the interface of a CPU, to be anything but royalty-free? It has long been established that making stuff that’s compatible with a competitor’s is not illegal, at least in many Western countries (see third-party printer cartridges). The only way to compete with an existing CPU that has an existing software ecosystem is to implement the same ISA, so it ought to be permitted. Of course, I expect established players will try to bully newcomers with lawsuits (as Intel has done to AMD), and they may even succeed. I’m just not sure they would actually have a case.

                                                                                      1. 4

                                                                                        Okay thanks. I have the feeling this would be expected from a design that originated from a teaching environment. Also, if I understand correctly, this can mostly be solved by adding relevant extensions.

                                                                                        In theory, yes, but RISC-V’s encoding design means that there’s very little 16-bit instruction space (and it’s all used by the C extension) and most of the 32-bit encoding space is gone too. Extensions are soon going to start being pushed out into the 48-bit space. The problem is, no one wants that for their own extension and so vendors all use overlapping bits of the 32-bit encoding space.
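                                                                                        For reference, the base spec reserves the low bits of the first 16-bit parcel to encode instruction length, which is what makes each encoding space finite. A sketch of that length rule:

                                                                                        ```go
                                                                                        package main

                                                                                        import "fmt"

                                                                                        // instLen decodes the length in bytes of a RISC-V instruction
                                                                                        // from the low bits of its first 16-bit parcel, per the base
                                                                                        // ISA spec:
                                                                                        //   ..............aa   aa != 11   -> 16-bit (the C space)
                                                                                        //   ...........bbb11   bbb != 111 -> 32-bit
                                                                                        //   ..........011111              -> 48-bit
                                                                                        //   .........0111111              -> 64-bit
                                                                                        func instLen(parcel uint16) int {
                                                                                        	switch {
                                                                                        	case parcel&0b11 != 0b11:
                                                                                        		return 2
                                                                                        	case parcel&0b11111 != 0b11111:
                                                                                        		return 4
                                                                                        	case parcel&0b111111 == 0b011111:
                                                                                        		return 6
                                                                                        	case parcel&0b1111111 == 0b0111111:
                                                                                        		return 8
                                                                                        	default:
                                                                                        		return 0 // longer or reserved encodings
                                                                                        	}
                                                                                        }

                                                                                        func main() {
                                                                                        	fmt.Println(instLen(0x8082)) // c.jr ra (compressed return)
                                                                                        	fmt.Println(instLen(0x0033)) // first parcel of add x0,x0,x0
                                                                                        }
                                                                                        ```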

                                                                                        Speaking of which, I’m not sure we should be that worried about fragmentation: we’ve had the example of Intel and AMD implementing compatible multimedia and SIMD extensions for instance

                                                                                        Intel implemented things, AMD copied them. AMD’s 3DNow! was not widely deployed in software. Similarly, x86-64 was an AMD feature that Intel copied. This happened because one company gained significant market share and ecosystem buy-in, and a patent cross-licensing agreement allowed the other to implement its extensions.

                                                                                        OpenGL also had vendor-specific extensions, that were eventually standardised (possibly with a slightly different API, but standardised nonetheless)

                                                                                        A few did. The fragmentation in the OpenGL ecosystem was a big reason for game developers moving to DirectX, which defined a set of features for each level and then required everyone to implement the same thing if they wanted to claim compliance. This was possible because Microsoft had enough market share that the DirectX team was able to act as a standards group and push consensus. Google may manage to be this central clearing point, but then it’s a Google ISA.

                                                                                        Something that puzzles me: is it even possible for an ISA, meaning the interface of a CPU, to be anything but royalty free?

                                                                                        Yes. There’s some awful precedent in MIPS, where they patented the technique for implementing their LWL / LWR instructions and then managed to sue a company that provided a fast software implementation in the illegal instruction trap on a CPU that implemented only the unpatented MIPS instructions.

                                                                                        In practice, there are two ways that you make an ISA not royalty-free. The simplest is that you patent a technique that is required to implement some instructions such that any vaguely efficient implementation is covered by a patent. Intel claims to have over 200 patents on SSE / AVX of this nature.

                                                                                        The second is to trademark the name and charge for certification of compliance. This is the main way that Arm protects their ISA. They do have some patents, but it’s almost impossible to implement a fully compliant core for any non-trivial ISA without a good conformance test suite and Arm controls the only one for their ISA (and their architecture docs are released under a license that prevents using them to develop a separate one). This means that you can possibly produce a knock-off Arm core without trampling any patents, but in practice it’s likely that it would have bugs that would break compatibility for some software.

                                                                                        The internal x86 specs are some of the most valuable assets that companies like Centaur and AMD own. There’s a lot of undocumented / unspecified behaviour that comes from errata in older CPUs that software depends on; just implementing an x86 core from the Intel reference would probably not give you something that worked.

                                                                                        1. 1

                                                                                          Something that puzzles me: is it even possible for an ISA, meaning the interface of a CPU, to be anything but royalty free?

                                                                                          Some ISAs have patented instructions or patents that cover all implementations; the Cray vector register patent and the MIPS patent on LWL/LWR are two historical examples of this.

                                                                                          1. 1

                                                                                            Some ISAs have patented instructions or patents that cover all implementations

                                                                                            Oh dear. I suspected something similar, but if it’s that bad, I bet if I started to design my own CPU for fun I’d be likely to actually infringe some random patent. I wonder how many hoops RISC-V ended up jumping through just to avoid patents.

                                                                                            1. 3

                                                                                              Most of a conventional RISC ISA should be fine, e.g. load/store, logical operations, branches, etc. There are many 20+ year old designs you can copy instructions from. SIMD, bitfield, or string instructions are the main areas I would be concerned about.

                                                                                              1. 2

                                                                                                I wonder how many hoops RISC-V ended up hopping through just to avoid patents.

                                                                                                I already replied to this post, but the 2016 paper RISC-V Genealogy discusses prior art for the RISC-V instruction set. There were 6 novel instructions at the time.

                                                                                  1. 11

                                                                                    I find these kinds of articles exhausting. They usually come down to the same thing: pattern X seen in paradigm Y is bad, therefore Y is bad. The alternatives proposed usually aren’t really good either. Basically, once you’ve read one of these, you’ve read all of them.

                                                                                    For example:

                                                                                    If it looks like a candidate for a class, it goes into a class. Do I have a Customer? It goes into class Customer. Do I have a rendering context? It goes into class RenderingContext.

                                                                                    Then the “solution” presented in this article:

                                                                                    The data itself will be in form of an ADT/PoD structures, and any references between the data records will be of a form of an ID (number, uuid, or a deterministic hash). Under the hood, it typically closely resembles or actually is backed by a relational database: Vectors or HashMaps storing bulk of the data by Index or ID, some other ones for “indices” that are required for fast lookup and so on.
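                                                                                    Concretely, the quoted approach might look like this (illustrative names, not from the article): plain records that reference each other by ID, with a struct of maps and slices standing in for the relational tables and indices.

                                                                                    ```go
                                                                                    package main

                                                                                    import "fmt"

                                                                                    // Plain data records that reference each other by ID rather than
                                                                                    // by pointer, relational-database style.
                                                                                    type CustomerID uint64

                                                                                    type Customer struct {
                                                                                    	ID   CustomerID
                                                                                    	Name string
                                                                                    }

                                                                                    type Order struct {
                                                                                    	Customer CustomerID // a reference, not a pointer
                                                                                    	Total    int
                                                                                    }

                                                                                    // World holds the bulk data plus whatever indices lookups need.
                                                                                    type World struct {
                                                                                    	Customers map[CustomerID]Customer
                                                                                    	Orders    []Order
                                                                                    }

                                                                                    // totalFor is a plain function over the data, not a method on a
                                                                                    // deep object graph: it joins Orders to a Customer by ID.
                                                                                    func totalFor(w *World, id CustomerID) int {
                                                                                    	sum := 0
                                                                                    	for _, o := range w.Orders {
                                                                                    		if o.Customer == id {
                                                                                    			sum += o.Total
                                                                                    		}
                                                                                    	}
                                                                                    	return sum
                                                                                    }

                                                                                    func main() {
                                                                                    	w := &World{
                                                                                    		Customers: map[CustomerID]Customer{1: {ID: 1, Name: "Ada"}},
                                                                                    		Orders:    []Order{{Customer: 1, Total: 30}, {Customer: 1, Total: 12}},
                                                                                    	}
                                                                                    	fmt.Println(totalFor(w, 1))
                                                                                    }
                                                                                    ```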

                                                                                    Now there’s nothing wrong with plain old data structures. But eventually those data structures are going to need a bunch of associated functions. And then you pretty much have a class. In other words, the solution is basically the same as the problem.
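                                                                                    The point that a record plus its associated functions is a class in all but name can be shown directly; in Go, for instance, a method is just a function whose first argument is the receiver (Customer here is a hypothetical example type):

                                                                                    ```go
                                                                                    package main

                                                                                    import "fmt"

                                                                                    type Customer struct {
                                                                                    	First, Last string
                                                                                    }

                                                                                    // A free function over the plain record...
                                                                                    func FullName(c Customer) string {
                                                                                    	return c.First + " " + c.Last
                                                                                    }

                                                                                    // ...and the same thing spelled as a method.
                                                                                    func (c Customer) FullName() string {
                                                                                    	return FullName(c)
                                                                                    }

                                                                                    func main() {
                                                                                    	c := Customer{First: "Ada", Last: "Lovelace"}
                                                                                    	fmt.Println(FullName(c) == c.FullName())
                                                                                    	// A method expression is literally a plain function:
                                                                                    	fmt.Println(Customer.FullName(c))
                                                                                    }
                                                                                    ```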

                                                                                    The real problem isn’t unique to OOP and can just as easily occur in say functional programming languages. That is, the problem is people over-applying patterns without thinking “Hey, do we actually need this?”. Traditional OOP languages may promote this in some way, but again that’s a problem with those languages. The concept of OOP has nothing to do with any of this.

                                                                                    Random example: in any language that has support for macros, inevitably some will start abusing macros. But that doesn’t mean the entire language or its paradigm is bad and should be avoided.

                                                                                    As an aside, every time I see a quote from Dijkstra I can’t help but feel this man must have been absolutely insufferable to work with. Yes, he was very smart and made many contributions. But holy moly, his “I am right and everybody else is wrong” attitude (at least, that’s how it comes across to me) is off-putting, to say the least.

                                                                                    1. 5

                                                                                      Now there’s nothing wrong with plain old data structures. But eventually those data structures are going to need a bunch of associated functions. And then you pretty much have a class. In other words, the solution is basically the same as the problem.

                                                                                      It is the other way around. With OOP, you pretty much end up jamming what are essentially functions into data structure definitions. The concept of a class is fuzzy; it’s not a clear, well-defined starting point for a thought process.

                                                                                      Notice that you said “associated functions”. I think it’s all the OOP nonsense cornering you into that unclear language. What exactly are those? Functions that accept the type you are defining? Functions that manipulate the state of said data structure? Functions that return a reference to it?

                                                                                      If you think about these questions and find clear answers to them, you will realize that there is absolutely no reason to make functions have all sorts of tricky behaviours based on state, or even “belong to an instance”. At which point the concept of a class becomes pointless.

                                                                                      Relational algebra was developed with solid theory behind it. To my knowledge, OOP was just something thrown together “because it is a good idea”.

                                                                                      Records, as in compound types, are very useful in many fields, even outside programming languages. Hooking functions to them is just a strange idea whose motivation I have yet to discover.

                                                                                      1. 2

                                                                                        Notice that you said “associated functions”. I think it’s all the OOP nonsense cornering you into that unclear language. What exactly are those?

                                                                                        When I say “associated function” I mean that in the most basic sense: it simply does something with the data, regardless of how the code is organised, named, etc.

                                                                                        If you think about these questions and find clear answers for them, you will realize that there is absolutely no reason to make functions have all sorts of tricky behaviours based on state or even “belonging to an instance”. At which point the concept of a class becomes pointless.

                                                                                        I’m not sure what tricky behaviours have to do with anything. That just seems like you’re inventing problems to justify your arguments.

                                                                                        even “belonging to an instance”

                                                                                        Perhaps this comes as a surprise, but this exists in functional programming too. For example, if you have a String module with a to_lowercase() function, and that function only operates on strings, then that function basically “belongs to an instance”. How exactly you store that function (in the instance, elsewhere, etc) doesn’t matter; the concept is the same. Whether the data is mutable is also completely unrelated to that, as you can have OOP in a completely immutable language.
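
To make that concrete, here is a rough Go sketch of the same point (the `Message` type and both function names are invented for illustration): a plain "module-level" function and a method are the same operation, just spelled differently.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a hypothetical named string type.
type Message string

// shout is a plain function that operates on strings,
// like a function in a String module.
func shout(s string) string {
	return strings.ToUpper(s) + "!"
}

// Shout is the same operation written as a method
// "belonging to an instance" of Message.
func (m Message) Shout() string {
	return strings.ToUpper(string(m)) + "!"
}

func main() {
	fmt.Println(shout("hello"))           // HELLO!
	fmt.Println(Message("hello").Shout()) // HELLO!
}
```

Either spelling closes over the same data; where the function is stored doesn’t change the concept.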

                                                                                        Relational algebra was developed with solid theory behind it. To my knowledge, OOP was just something thrown together “because it is a good idea”.

                                                                                        I suggest you do some actual research into the origins of OOP, instead of spewing nonsense like this. It’s frankly embarrassing.

                                                                                      2. 4

                                                                                        Have you ever stumbled upon good OOP code that actually looked OOP?

                                                                                        I haven’t. The good code I’ve seen was inevitably a mix of procedural, modular, and functional code, with a heavy slant towards either procedural or functional, with maybe a couple instances of inheritance, for polymorphism’s sake (and even then, sometimes we just pass functions directly).

                                                                                        The most distinguishing characteristic I see in OOP is how it stole almost every feature from other paradigms or languages. ADT, encapsulation? Modular programming, from Modula. Generics? Parametric polymorphism from ML and Miranda. Lambdas? From every functional language ever. The only things left are inheritance, which was added in Simula to implement intrusive lists (which were needed because there were no C++-like templates), and subtype polymorphism, which is often better replaced by good old closures.

                                                                                        And guess what: inheritance is now mostly discouraged (we prefer composition). The only thing left is subtype polymorphism. OOP is an empty husk that only survives by rebranding other programming styles.

                                                                                        1. 2

                                                                                          ADT, encapsulation? Modular programming, from Modula.

                                                                                          ADTs come from Barbara Liskov’s CLU, which cites Simula’s inheritance as inspiration.

                                                                                          1. 1

                                                                                            Hmm, didn’t know, thanks. I looked Modula up as well; it seems both languages appeared around the same time.

                                                                                          2. 1

                                                                                            Have you ever stumbled upon good OOP code that actually looked OOP?

                                                                                            This depends on what one considers OOP, as opinions and interpretations differ. Have I seen good OOP? Yes. Was it the Java-like OOP that I imagine most people picture when they think of OOP? No. But just because something is OOP doesn’t mean it can’t have elements from other paradigms.

                                                                                            The most distinguishing characteristic I see in OOP is how it stole almost every feature from other paradigms or languages. ADT, encapsulation? Modular programming, from Modula. Generics? Parametric polymorphism from ML and Miranda. Lambdas? From every functional language ever

                                                                                            Ah yes: functional languages invented everything, and every other language using elements from this is “stealing” them.

                                                                                            I’m honestly not sure what point you’re trying to make. X sharing elements with Y doesn’t mean somehow X isn’t, well, X. Just as how X having flaw Y doesn’t completely invalidate X. Having such a single minded (if that’s the right term) attitude isn’t productive.

                                                                                            1. 3

                                                                                              Ah yes: functional languages invented everything, and every other language using elements from this is “stealing” them.

                                                                                              Not just functional. Modules did not come from functional languages, as far as I recall.

                                                                                              To some extent though, yes: functional languages invented a lot. Especially the statically typed ones, whose inventors realise this fundamental truth that often gets me downvoted in programming forums if I voice it: that programming is applied mathematics, and by treating it as such we can find neat ways to make programs better (shorter, clearer, or even faster). Dijkstra was right. Even Alan Kay recognises now through his STEPS project that “math wins”. (Of course, it’s very different from calculus, or most of what you were taught in high school. If anything, it’s even more rigorous and demanding, because at the end of the day, a program has to run on a dumb formal engine: the computer.)

                                                                                              I’m honestly not sure what point you’re trying to make.

                                                                                              That to many OOP proponents, “OOP” mostly means “good”, and as we learn how to program better over the decades, they shift the definition of “OOP” to match what they think is good. It takes a serious break, like data-oriented programming, to realise that there are other ways. To give but one example: back in 2007, I designed some over-complicated program in C++, with lots of stuff from the <algorithm> header so I could pretend I was using OCaml (I was an FP weenie at the time). My supervisor looked at the code (or maybe I was outlining my design to them, I don’t remember) and said “well, this is very OO and all, but maybe it’s a bit over-complicated?”

                                                                                              That’s how pervasive OOP is. Show a programmer functional patterns (with freaking folds!), they will see OOP.

                                                                                              OOP is no longer a paradigm. It devolved into a brand.

                                                                                          3. 2

                                                                                            But eventually those data structures are going to need a bunch of associated functions. And then you pretty much have a class.

                                                                                            At the risk of taking your quote out of context, I think the mindset that OOP is simply data structures with encapsulated functions is actually one of the biggest real dangers of OOP, because it hides its biggest flaw: the pervasive proliferation of (global) state.

                                                                                            Thus, I understand where you are leading your argument, but I disagree with it.

                                                                                            As an aside, every time I see a quote from Dijkstra I can’t help but feel this man must have been absolutely insufferable to work with. Yes, he was very smart and made many contributions. But holy moly, his “I am right and everybody else is wrong” attitude (at least that’s how it comes across to me) is off putting to say the least.

                                                                                            I happen to know a lot of people who directly worked (or had classes) with him, and unanimously I hear both adjectives: genius and pretentious.

                                                                                            Of course those are just others’ opinions, not mine, but I share your (and those people) feelings.

                                                                                            1. 1

                                                                                              At the risk of taking your quote out of context, I think the mindset that OOP is simply data structures with encapsulated functions is actually one of the biggest real dangers of OOP, because it hides its biggest flaw: the pervasive proliferation of (global) state.

                                                                                              I agree a lot of OOP languages/projects suffer from too much (global) mutable state. But I’m not sure if that’s necessarily due to OOP. I think this is a case of “correlation is not causation”. Perhaps a silly example: if functional languages had mutable state, I think they would have similar issues. In other words, I think the issue is mutability being “attractive”/tempting, not so much the code organisation paradigm.

                                                                                              Another example: I think if you take away the ability to assign non-constant types (basically anything but an int, float, string, etc) to constants/globals, and maybe remove inheritance, you already solve a lot of the common issues seen in OOP projects. This is basically what I’m doing with Inko (among other things, such as replacing the GC with single ownership).

                                                                                              I do think for such languages we need a better term for the paradigm. Calling it X when it mixes properties from A, B, and C is confusing. Unfortunately, I haven’t found a good alternative term.

                                                                                            2. 2

                                                                                              I view OOP as an organizing principle—when you have few types but lots of actions, then procedural is probably the way to organize the program (send the data to the action). When you have few actions but lots of types, then OOP is the way to organize the program (send the action to the data). When you have few actions and few types, it doesn’t matter. It’s the last quadrant, lots of types and lots of actions, for which there is currently no good method.

                                                                                              1. 2

                                                                                                When you have X types and Y actions, unless you have many of both, I believe your program is already mostly organised. Many types and few actions? It will end up looking OOP even if you write it in C. Few types and many actions? It will end up looking procedural even if you write it in Java.

                                                                                            1. 26

                                                                                              These are all valid criticisms of certain patterns in software engineering, but I wouldn’t really say they’re about OOP.

                                                                                              This paper goes into some of the distinctions of OOP and ADTs, but the summary is basically this:

                                                                                              • ADTs allow complex functions that operate on many data abstractions – so the Player.hits(Monster) example might be rewritten in ADT-style as hit(Player, Monster[, Weapon]).
                                                                                              • Objects, on the other hand, allow interface-based polymorphism – so you might have some kind of interface Character { position: Coordinates, hp: int, name: String }, which Player and Monster both implement.
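
A rough Go sketch of the two styles described above (all type and function names are hypothetical, not from the paper): `hit` is ADT-style, a free function over concrete types, while `describe` works against the shared interface.

```go
package main

import "fmt"

// Interface-based polymorphism: both concrete types satisfy Character.
type Character interface {
	Name() string
	HP() int
}

type Player struct {
	name string
	hp   int
}

func (p Player) Name() string { return p.name }
func (p Player) HP() int      { return p.hp }

type Monster struct {
	name string
	hp   int
}

func (m Monster) Name() string { return m.name }
func (m Monster) HP() int      { return m.hp }

// ADT-style: a free function over the concrete types; no dispatch needed,
// and it can freely inspect both data abstractions at once.
func hit(p Player, m Monster, damage int) string {
	return fmt.Sprintf("%s hits %s for %d", p.Name(), m.Name(), damage)
}

// Interface-style: one function handles any Character via dynamic dispatch.
func describe(c Character) string {
	return fmt.Sprintf("%s (%d hp)", c.Name(), c.HP())
}

func main() {
	fmt.Println(hit(Player{"Ann", 10}, Monster{"Grue", 8}, 3))
	fmt.Println(describe(Monster{"Grue", 8}))
}
```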

                                                                                              Now, interface-based polymorphism is an interesting thing to think about and criticise in its own right. It requires some kind of dynamic dispatch (or monomorphization), and hinders optimization across interface boundaries. But the critique of OOP presented in the OP is nothing to do with interfaces or polymorphism.

                                                                                              The author just dislikes using classes to hold data, but a class that doesn’t implement an interface is basically the same as an ADT. And yet one of the first recommendations in the article is to design your data structures well up-front!

                                                                                              1. 15

                                                                                                 The main problem I have with these “X is dead” type articles is that they are almost always straw man arguments set up in a way to prove a point. The other issue I have is that the definition or interpretation of OOP is so varied that I don’t think you can in good faith just say OOP as a whole is bad and be at all clear to the reader. As an industry, I actually think we need to get past these self-constructed camps of OOP vs Functional, because to me they are disingenuous and the truth, as it always does, lies in the middle.

                                                                                                 Personally, coming mainly from a Ruby/Rails environment, I use ActiveRecord classes almost exclusively to encapsulate data and abstract interaction with the database, and then move logic into a place where it really only cares about data in and data out. Is that OOP or Functional? I would argue a combination of both, and I think the power lies in the middle, not in one versus the other as most articles stipulate. But a middle-ground approach doesn’t get the clicks, I guess, so here we are.

                                                                                                1. 4

                                                                                                  the definition or interpretation of OOP is so varied that I don’t think you can in good faith just say OOP as a whole is bad and be at all clear to the reader

                                                                                                  Wholly agreed.

                                                                                                   The main problem I have with these “X is dead” type articles is that they are almost always straw man arguments set up in a way to prove a point.

                                                                                                  For a term that evokes such strong emotions, it really is poorly defined (as you observed). Are these straw man arguments, or is the author responding to a set of pro-OOP arguments which don’t represent the pro-OOP arguments with which you’re familiar?

                                                                                                  Just like these criticisms of OOP feel like straw men to you, I imagine all of the “but that’s not real OOP!” responses that follow any criticism of OOP must feel a lot like disingenuous No-True-Scotsman arguments to critics of OOP.

                                                                                                   Personally, I’m a critic, and the only way I know to navigate the “not true OOP” dodges is to ask what features distinguish OOP from other paradigms in the opinion of the OOP proponent, and then to debate whether that feature really is unique to OOP or whether it’s pervasive in other paradigms as well. Once in a while a feature will actually pass through that filter, such that we can debate its merits (e.g., inheritance).

                                                                                                  1. 4

                                                                                                    I imagine all of the “but that’s not real OOP!” responses that follow any criticism of OOP must feel a lot like disingenuous No-True-Scotsman arguments to critics of OOP.

                                                                                                     One thing I have observed about OOP is how protean it is: whenever there’s a good idea around, it absorbs it and then pretends it is an inherent part of it. Then it deflects criticism by crying “strawman”, or, if we point out the shapes and animals that are taught for real in school, by insisting that “proper” OOP is hard, while providing little to no help in how to design an actual program.

                                                                                                     Here’s what I think: in its current form, OOP won’t last, just as previous forms of OOP didn’t last. Just don’t be surprised if whatever follows ends up being called “OOP” as well.

                                                                                                2. 8

                                                                                                  The model presented for monsters and players can itself be considered an OO design that misses the overarching problem in such domains. Here’s a well-reasoned, in-depth article on why it is folly. Part five has the riveting conclusion:

                                                                                                  Of course, your point isn’t about OOP-based RPGs, but how the article fails to critique OOP.

                                                                                                  After Alan Kay coined OOP, he realized, in retrospect, that the term would have been better as message-oriented programming. Too many people fixate on objects, rather than the messages passed betwixt. Recall that the inspiration for OOP was based upon how messages pass between biological cells. Put another way, when you move your finger: messages from the brain pass to the motor neurons, neurons release a chemical (a type of message), muscles receive those chemical impulses, then muscle fibers react, and so forth. At no point does any information about the brain’s state leak into other systems; your fingers know nothing about your brain, although they can pass messages back (e.g., pain signals).

                                                                                                  (This is the main reason why get and set accessors are often frowned upon: they break encapsulation, they break modularity, they leak data between components.)

                                                                                                  Many critique OOP, but few seem to study its origins and how—through nature-inspired modularity—it allowed systems to increase in complexity by an order of magnitude over its procedural programming predecessor. There are so many critiques of OOP that don’t pick apart actual message-oriented code that beats at the heart of OOP’s origins.

                                                                                                  1. 1

                                                                                                    Many critique OOP, but few seem to study its origins and how—through nature-inspired modularity—it allowed systems to increase in complexity by an order of magnitude over its procedural programming predecessor.

                                                                                                    Of note, modularity requires neither objects nor message passing!

                                                                                                    For example, the Modula programming language was procedural. Modula came out around the same time as Smalltalk, and introduced the concept of first-class modules (with the data hiding feature that Smalltalk objects had, except at the module level instead of the object level) that practically every modern programming language has adopted today - including both OO and non-OO languages.

                                                                                                  2. 5

                                                                                                     I have to say, after reading the first few paragraphs, I skipped to ‘What to do Instead’. I am aware of many limitations of OOP and have no issue with the idea of learning something new, so: hit me with it. Then the article is like “hmm, well, datastores are nice. The end.”

                                                                                                    The irony is that I feel like I learned more from your comment than from the whole article so thanks for that. While reading the Player.hits(Monster) example I was hoping for the same example reformulated in a non-OOP way. No luck.

                                                                                                    If anyone has actual suggestions for how I could move away from OOP in a practical and achievable way within the areas of software I am active in (game prototypes, e.g. Godot or Unity, Windows desktop applications to pay the bills), I am certainly listening.

                                                                                                    1. 2

                                                                                                      If you haven’t already, I highly recommend watching Mike Acton’s 2014 talk on Data Oriented Design: https://youtu.be/rX0ItVEVjHc

                                                                                                      Rather than focusing on debunking OOP, it focuses on developing the ideal model for software development from first principles.

                                                                                                      1. 1

                                                                                                        Glad I was helpful! I’d really recommend reading the article I linked and summarised – it took me a few goes to get through it (and I had to skip a few sections), but it changed my thinking a lot.

                                                                                                      2. 3

                                                                                                        [interface-based polymorphism] requires some kind of dynamic dispatch (or monomorphization), and hinders optimization across interface boundaries

                                                                                                         You needed to do dispatch anyway, though; if you wanted to treat players and monsters homogeneously in some context and then discriminate, then you need to branch on the discriminant.

                                                                                                        Objects, on the other hand, allow interface-based polymorphism – so you might have some kind of interface […] which Player and Monster both implement

                                                                                                        Typeclasses are haskell’s answer to this; notably, while they do enable interface-based polymorphism, they do not natively admit inheritance or other (arguably—I will not touch these aspects of the present discussion) malaise aspects of OOP.

                                                                                                        1. 1

                                                                                                          You needed to do dispatch anyway, though; if you wanted to treat players and monsters homogenously in some context and then discriminate, then you need to branch on the discriminant.

                                                                                                          Yes, this is a good point. So it’s not like you’re saving any performance by doing the dispatch in ADT handling code rather than in a method polymorphism kind of way. I guess that still leaves the stylistic argument against polymorphism though.

                                                                                                        2. 2

                                                                                                          Just to emphasize your point on Cook’s paper, here is a juicy bit from the paper.

                                                                                                          Any time an object is passed as a value, or returned as a value, the object-oriented program is passing functions as values and returning functions as values. The fact that the functions are collected into records and called methods is irrelevant. As a result, the typical object-oriented program makes far more use of higher-order values than many functional programs.
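
Cook’s observation can be made almost literally in Go (a hypothetical counter example, not from the paper): an “object” can be modeled as a record of closures over shared hidden state, and passing it around is passing functions as values.

```go
package main

import "fmt"

// Counter is an "object" represented literally as a record of functions
// that close over the same hidden state.
type Counter struct {
	Inc func()
	Get func() int
}

func newCounter() Counter {
	n := 0 // the encapsulated state; only reachable through the closures
	return Counter{
		Inc: func() { n++ },
		Get: func() int { return n },
	}
}

func main() {
	c := newCounter()
	c.Inc()
	c.Inc()
	fmt.Println(c.Get()) // 2
}
```

The field names `Inc` and `Get` play the role of method names; the struct is just the record of functions Cook describes.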

                                                                                                          1. 2

                                                                                                            Now, interface-based polymorphism is an interesting thing to think about and criticise in its own right. It requires some kind of dynamic dispatch (or monomorphization), and hinders optimization across interface boundaries.

                                                                                                            After coming from java/python where essentially dynamic dispatch and methods go hand in hand I found go’s approach, which clearly differentiates between regular methods and interface methods, really opened my eyes to overuse of dynamic dispatch in designing OO apis. Extreme late binding is super cool and all… but so is static analysis and jump to definition.
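
A minimal sketch of that distinction in Go (hypothetical types): the first call is resolved statically against the concrete type, while the second goes through the interface and is dispatched dynamically.

```go
package main

import "fmt"

type Greeter interface {
	Greet() string
}

type English struct{}

func (English) Greet() string { return "hello" }

func main() {
	e := English{}
	// Static dispatch: the compiler knows the concrete type,
	// so the call can be resolved (and even inlined) at compile time.
	fmt.Println(e.Greet())

	// Dynamic dispatch: assigning to the interface erases the concrete
	// type; the call is routed through the interface's method table.
	var g Greeter = e
	fmt.Println(g.Greet())
}
```

The distinction is visible in the source: only calls through an interface value pay for (and benefit from) late binding.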

                                                                                                          1. 7

                                                                                                            I have two observations on this article.

                                                                                                            First, there is no mention of the “new” wave of federated services that popped up all over the place based on ActivityPub. I find that to be a glaring omission. Even though they didn’t get mass adoption, the number of users the Mastodon network has is impressive for a bunch of open-source projects.

                                                                                             Second, I think that throwing the baby out with the bathwater because a corporation has captured a large number of users inside a distributed network is pretty defeatist. Just because gmail has a large portion of email users doesn’t mean that as a user I can’t choose a smaller provider like tutanota or proton.

                                                                                                            1. 12
                                                                                                              1. I didn’t mention these on purpose because I don’t have any direct experience with them (personal dislike of social media), and so don’t feel qualified to talk about them. From an outsider’s perspective though, they do seem to fit my case study of XMPP/Matrix etc.

                                                                                              2. A lot of people seem to have got the impression that I hate these applications, and/or am somehow against them. I tried to make it clear in the post that I am an active user of almost all the applications I discussed, and I only want to see them succeed.

                                                                                                              1. 3

                                                                                                                I’m sorry to sound like the “ackchyually” gang, but I guess my comments were based on how your title makes a sweeping generalization without the article looking into all the options.

                                                                                                                PS. I’m working on an ActivityPub service myself, and that might colour my views. :D

                                                                                                              2. 4

                                                                                                                Quoting an entire paragraph:

                                                                                                                Whenever this topic comes up, I’m used to seeing other programmers declare that the solution is simply to make something better. I understand where this thought comes from; when you have a hammer, everything looks like a nail, and it’s comforting to think that your primary skill and passion is exactly what the problem needs. Making better tools doesn’t do anything about the backwards profit motive though, and besides, have you tried using any of the centralised alternatives lately? They’re all terrible. Quality of tools really isn’t what we’re losing on.

                                                                                                                ActivityPub might be awesome, but that is entirely beside the point the author tries to make.

                                                                                                                1. 2

                                                                                                                  How so? I’m not sure how the paragraph you’re quoting takes away from the fact that there are currently community driven federated projects and services which are popular and that the author didn’t consider.

                                                                                                                  1. 3

                                                                                                                    The article listed a few examples of where decentralization didn’t work out, even though it had the technical merits to be much better than the alternatives. Enumerating all possible examples where it did or didn’t work is out of the scope and besides the point - it’s never the technical merits that are lacking in these situations.

ActivityPub isn’t used (using the term very broadly here…) by anyone other than hypergeeks like you and me. We might find each other and form communities around our interests, but the vast majority of users are not going to form their own communities like this.

                                                                                                                    1. 1

but the vast majority of users are not going to form their own communities like this.

                                                                                                                      That’s fine, but considering this as the only metric for success is a poor choice.

                                                                                                                2. 4

                                                                                                                  Try to run your own mail server, though.

                                                                                                                  1. 3

There are options out there that let other people handle the nitty-gritty of running the server, with minimal cost and time investment. I’m running a purelymail account with multiple domains that I own.

                                                                                                                    1. 3

                                                                                                                      I did that for quite some time. Only switched to a commercial hosted provider because of general sysadmin burnout / laziness, not anything mail specific. The problem of Gmail treating my server as suspicious was easily solved by sending one outgoing email from Gmail to my server.

                                                                                                                      1. 4

                                                                                                                        You get blacklisted to hell once you put your mail server on a VPS with high enough churn in its IP neighborhood.

                                                                                                                        And there is no power on Earth (for now) that would convince GOOG or MSFT to reconsider. Their harsh policies are playing into their hands – startups buy their expensive services instead of running a stupid postfix instance.

We (the IT sector) need to agree on a way to group hosts on a network by their operator in a much finer way, so that regular leasers are not thrown in the same bag as the spammers or victims of a network breach.

                                                                                                                        1. 4

                                                                                                                          You get blacklisted to hell once you put your mail server on a VPS with high enough churn in its IP neighborhood.

                                                                                                                          Is that so? So far the main reason I see for mails not arriving is the use of Mailchimp. No hate on Mailchimp there, just experience, since many companies use their service.

Meanwhile Google is super fine, as long as you make sure SPF and DKIM/DMARC are set up correctly. Oh and of course reverse IP (PTR record) should be set up correctly, like with any server. They are even nice enough to report back why your mail was rejected and what to do about it, if you don’t do the above.

                                                                                                                          Experience is based on Mailchimp usage in multiple companies and e-mail servers in various setups (new domain, new IP, moving servers, VPS, dedicated hoster, small hosters, big hosters). Didn’t have a case so far where Google would have rejected an email, once the initial SPF/DKIM-Setup/PTR was running correctly.

                                                                                                                          The “suspicious email” stuff is usually analysis of the e-mail itself. The main causes are things like reply-to with different domain, HTML links, where it says example.com, but actually (for example for click tracking purposes) links somewhere else.

Not telling anyone they should run a mail server, just throwing in some personal experiences, because the only real-life examples where Google would reject an email were cases of SPF, DKIM or PTR being misconfigured or missing. For mail that is accepted but thrown into spam, it’s mostly reply-to and links. I have close to no experience with MSFT. I’ve only ever used a small noname VPS and a big German dedicated hosting provider to send to hotmail addresses, and it worked.
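For reference, the checks described above can be sketched as DNS records in zone-file form. This is only an illustrative fragment: example.com, the selector "mail", and the address 203.0.113.25 are placeholders, and the DKIM public key is truncated.

```
; SPF: which hosts may send mail for this domain
example.com.                  IN TXT "v=spf1 mx -all"
; DKIM: public key for the "mail" selector used to sign outgoing mail
mail._domainkey.example.com.  IN TXT "v=DKIM1; k=rsa; p=MIGfMA0GCSq..."
; DMARC: what receivers should do when SPF/DKIM checks fail
_dmarc.example.com.           IN TXT "v=DMARC1; p=quarantine; rua=mailto:postmaster@example.com"
; PTR: the sending IP must resolve back to the server's hostname
25.113.0.203.in-addr.arpa.    IN PTR mail.example.com.
```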

                                                                                                                          1. 3

                                                                                                                            Is that so?

I hosted my own postfix instance on a VPS for years (well, not since last summer or so, but I’ll eventually get back to it). I had my email bounced from hotmail’s server, and the reason given by the bounce email was that my whole IP block was being banned. It tends to resolve itself after a few days. In the meantime, I am dead to hotmail users. Google is even more fickle. I am often marked as spam, including in cases where I was replying to an email.

                                                                                                                            I don’t believe it was a misconfiguration of any kind. I did everything except DKIM, and tested that against dedicated test servers (I believe you can send an email to them, and they respond with how spammy your mail server looks).

                                                                                                                            So yes, it is very much so.

                                                                                                                            1. 2

                                                                                                                              Google is even more fickle. I am often marked as spam, including in cases where I was replying to email.

Same here. After finally setting up DKIM, these hard to diagnose and debug problems finally went away completely, AFAICT.

                                                                                                                              1. 1

                                                                                                                                Interesting. Thank you for the response!

Just curious: if you open the detail view of the email, does it say more? When I played with it, it usually did tell why it thought it was spammy.

                                                                                                                                1. 1

Just curious: if you open the detail view of the email, does it say more?

Hadn’t thought of that: when I send email to my own gmail account, it does not end up in the spam folder. I have yet to hack into other people’s spam folders. :-)

Right now I can’t run the test because I’m not using my own mail server (I’m currently using the mail service of my provider). I sent an email to myself anyway, and the headers added by Gmail say SPF and DKIM are “neutral” (that is, not configured). I will set them up once I reinstate my own Postfix instance, though.

                                                                                                                              2. 2

                                                                                                                                It’s a frequent issue with Digital Ocean, for example.

                                                                                                                      1. 14

                                                                                                                        Whenever someone wants to send me a file […] they could just send me a magnet link

Ha. Well. Other than everything in the BitTorrent world being designed for mass sharing and feeling like overkill for one-to-one “beaming”, there are two elephants in the room, neither of them having anything to do with the “piracy perception”. First, the possibility of both sides being behind awful cgNAT/NAT444, making p2p connections impossible. Second, the privacy issue: do you even want your recipients to know your home IP address? Probably not always.

                                                                                                                        1. 3

                                                                                                                          As a particular example of the Inverse DRM Problem, I don’t think that it’s possible to receive a file over an addressed switching network without telling somebody some portion of your address. (Recall that the DRM Problem is that you cannot create a secure computing enclave within somebody else’s machine and keep the inputs and outputs private from them. The Inverse DRM Problem is that you cannot exist as a single small node in a homogenous network without projecting most of your information across your neighbors.) For example, I usually recommend Magic Wormhole for transferring files, but it exposes addressing information to a trusted third-party intermediate server.

                                                                                                                          1. 4

Yeah, I’m not talking about intermediaries though, only specifically the recipients. If you want “Snowden level” security, use Tor, but for casually sending something to someone I only know online, all I need is some intermediary to just “abstract” the content away one step from me.

                                                                                                                          2. 2

                                                                                                                            Does UDP hole punching not work behind cgNAT?

                                                                                                                            edit: I know port forwarding is impossible but I can’t find anything on hole punching not working.

                                                                                                                            1. 3

I recall a networking company building an overlay network so people can communicate easily (like sending files and such to each other), no matter what machines they’re on. Apparently, a good percentage of their users have to pass through the company’s centralised servers (used as a fallback), because even hole punching didn’t work.

Besides, hole punching requires a handshake server to begin with. What we would really like is a direct peer-to-peer connection, and that’s just flat out impossible if both sides are behind NAT.

                                                                                                                              1. 5
                                                                                                                                1. 3

                                                                                                                                  Yes, that’s the one:

                                                                                                                                  This is a good time to have the awkward part of our chat: what happens when we empty our entire bag of tricks, and we still can’t get through? A lot of NAT traversal code out there gives up and declares connectivity impossible. That’s obviously not acceptable for us; Tailscale is nothing without the connectivity.

                                                                                                                                  We could use a relay that both sides can talk to unimpeded, and have it shuffle packets back and forth. But wait, isn’t that terrible?

                                                                                                                                  Sort of. It’s certainly not as good as a direct connection, but if the relay is “near enough” to the network path your direct connection would have taken, and has enough bandwidth, the impact on your connection quality isn’t huge. There will be a bit more latency, maybe less bandwidth. That’s still much better than no connection at all, which is where we were heading.

                                                                                                                          1. 20

I agree with the criticisms directed at C++, but the arguments made in favor of C are weak at best IMO. It basically boils down to C code being shorter than equivalent code in competing languages (such as Rust), and C being more powerful and giving more tools to the programmer. I disagree strongly with the second point: unless you’re trying to write obfuscated code in C (which admittedly is quite fun to do), the features that supposedly make C more “powerful” are effectively foot-guns that undermine reliable code rather than enable it. C’s largest design flaw is that it not only allows “clever” unsafe code to be written, it actively encourages it. By design, it’s easier to write unsafe C code than it is to properly handle all edge cases, and C’s design also makes it incredibly difficult to spot these abuses. In plenty of cases, C seemingly encourages the abuse of undefined behavior, because it looks cleaner than the alternative of writing actually correct code. C is a language I still honestly quite like, but it is a deeply flawed language, and we need to acknowledge its shortcomings rather than pretend they don’t exist or try to defend what isn’t defensible.

                                                                                                                            C is called portable assembly language for a reason, and I like it because of that reason.

                                                                                                                            C is a higher-level version of the PDP-11’s assembly language. C still thinks that every computer works just like the PDP-11 did, and the result is that the language really isn’t as low level as some believe it is.

                                                                                                                            1. 3

                                                                                                                              In plenty of cases, C seemingly encourages the abuse of undefined behavior, because it looks cleaner than the alternative of writing actually correct code.

In some of those cases, this is because the standard botched its priorities. Specifically, the undefined behaviour of signed integer overflow: the clean way to check for signed overflow is to perform the addition (or whatever), then check whether the result wrapped around (went negative, say). In C, that’s also the incorrect way, because signed overflow is undefined, despite the entire planet being 2’s complement.

Would defining it make optimisations harder? Perhaps, but I doubt it matters for many real-world programs. And even if it does: perhaps we should have a better for loop, with, say, an immutable index?

                                                                                                                              1. 2

                                                                                                                                In my opinion, Zig handles overflow/underflow in a much better way. In Zig, overflow is normally undefined (though it’s caught in debug/safe builds), but the programmer can explicitly use +%, -%, or *% to do operations with defined behavior on overflow, or use a built-in function like @addWithOverflow to perform addition and get a value returned back indicating whether or not overflow occurred. This allows for clean and correct checking for overflow, while also keeping the optimizations currently in place that rely on undefined behavior on overflow. All that being said, I would be curious to know how much of a performance impact said optimizations actually have on real code.

                                                                                                                                1. 2

                                                                                                                                  Having such a simple alternative would work well indeed.

I’m still sceptical about the optimisations, to be honest. One example that was given to me was code that iterates with int but compares with size_t, where the difference in width generated special cases that slowed everything down. To which I thought “wait a minute, why is the loop index a signed integer to begin with?”. To be checked.

                                                                                                                                  1. 1

                                                                                                                                    To which I thought “wait a minute, why is the loop index a signed integer to begin with?”.

                                                                                                                                    Huh. I guess compilers really are written to compensate for us dumb programmers. What a world!

                                                                                                                                2. 2

                                                                                                                                  despite the entire planet being 2’s complement

Perhaps your planet is not Earth, but the Univac 1100 / Clearpath Dorado series is 1’s complement, can still be purchased, and has a C compiler.

                                                                                                                                  1. 3

                                                                                                                                    And in my mind, it can stick with C89. Can you name one other 1’s complement machine still in active use with a C compiler? Or any sign-magnitude machines? I think specifying 2’s complement and no trap representation will bring C compilers more into alignment with reality [1].

                                                                                                                                    [1] I can’t prove it, but I suspect way over 99.9% of existing C code assumes a byte-addressable, 2’s complement machine with ASCII/UTF-8 character encoding [2].

                                                                                                                                    [2] C does not mandate the use of ASCII or UTF-8. That means that all existing C source code is not actually portable across every compiler because the character set is “implementation defined.” Hope you have your EBCDIC tables ready …

                                                                                                                                    1. 1

                                                                                                                                      Don’t forget that the execution character set can differ from the translation character set, so it’s perfectly fine to target an EBCDIC execution environment with ASCII (or even ISO646) sources.

                                                                                                                                    2. 1

                                                                                                                                      Well, I guess we’ll still have to deal with legacy code in some narrow niches. Banks will be banks.

Outside of legacy though, let’s be honest: when was the last ISA designed that didn’t use 2’s complement? My bet would be no later than 1980. Likely even earlier. Heck, the fight was already over when the 4-bit 74181 ALU came out, in the late sixties.

                                                                                                                                      1. 1

Oh yeah, I definitely keep these examples on file for when people tell me that all negative numbers are 2’s complement/all floating point is IEEE 754/all bytes are 8-bit etc., but they point to a fundamental truth: C solves a lot of problems that “C replacements” don’t even attempt to. C replacements often amount to “if we ignore a lot of things you can do in C, then my language is better”.

                                                                                                                                        1. 1

Except I’m not even sure C does solve those problems. Its approach has always been to sidestep them, by way of implementation-defined and undefined behaviour. There’s simply no way to be portable and take advantage of the peculiarities of the machine. If you want to get close to the metal, you need a compiler and programs for that particular metal.

In the meantime, the most common metal (almost to the point of hegemony) has 8-bit bytes, 2’s complement integers, and IEEE floating point numbers. Let’s address that first, and think about more exotic architectures later. Even if those exotic architectures do have their place, they’re probably exotic enough that they can’t really use your usual C code, and instead need custom code, perhaps even a custom compiler.

                                                                                                                                    3. 1

                                                                                                                                      I’ve always felt people over-react to the implementation defined behavior in the C standard.

                                                                                                                                      It’s undefined in the language spec, but in most cases (like 2’s complement overflow) it is defined by the platform and compiler. Clearly it’s better to have it defined by the standard, but it’s not necessarily a bad thing to delegate some behavior to the compiler and platform, and it’s almost never the completely arbitrary, impossible to predict behavior people make it out to be.

                                                                                                                                      It’s a pain for people trying to write code portable to every conceivable machine ever created, but let’s be realistic: most people aren’t doing that.

                                                                                                                                      1. 4

Signed overflow is not implementation-defined, it is undefined. Implementation-defined behaviour is fine: it requires that the implementer document the behaviour and deterministically do the same thing every time. Undefined behaviour allows the compiler to implement optimisations that are sound only if it assumes, as an axiom, that the behaviour cannot occur in any valid program. Some of these are completely insane: it is UB in C99 (I think they fixed this in C11) for a source file to not end with a newline character. This is because of limitations in early versions of Lex/YACC.

                                                                                                                                        1. 1

                                                                                                                                          It’s undefined in the language spec, but in most cases (like 2’s complement overflow) it is defined by the platform and compiler

It’s defined by the platform only. Compilers do treat that as “we are allowed to summon the nasal demons”. I’m not even kidding: serious vulnerabilities in the past have been caused by security checks being removed by compilers, because their interpretation of undefined behaviour meant the security check was dead code.

In the specific case of signed integer overflow, Clang’s -fsanitize=undefined does warn you about the overflow being undefined. I have tested it. Signed integer overflow is not defined by the compiler; it just doesn’t notice most of the time. Optimisers are getting better and better though, which is why you cannot, in 2021, confidently write C code that overflows signed integers, even on bog standard 2’s complement platforms. Even on freaking Intel x86-64 processors. The CPU can do it, but C will not let it.

If the standard actually moved signed overflow to “implementation-defined behaviour”, or even “implementation-defined if the platform can do it, undefined on platforms that trap or otherwise go bananas”, I would be very happy. Except that’s not what the standard says. It just says “undefined”. While the intent was most probably “behave sensibly if the platform allows it, go bananas otherwise”, that’s not what the standard actually says. And compiler writers, in the name of optimisation, interpreted “undefined” in the broadest way possible: if something is undefined because one platform can’t handle it, it’s undefined for all platforms. And you can pry the affected optimisations from their cold dead hands.

                                                                                                                                          Or you can use -fwrapv. It’s not standard. It’s not quite C. It may not be available everywhere. There’s no guarantee, if you write a library, that your users will remember to use that option when they compile it. But at least it’s there.

                                                                                                                                          It’s a pain for people trying to write code portable to every conceivable machine ever created, but let’s be realistic: most people aren’t doing that.

                                                                                                                                          I am. You won’t find a single instance of undefined behaviour there. There is one instance of implementation defined behaviour (right shift of negative integers), but I don’t believe we can find a single platform in active use that does not propagate the sign bit in this case.

                                                                                                                                      2. 1

                                                                                                                                        https://www.electronicsweekly.com/open-source-engineering/linux/the-ten-commandments-for-c-programmers-2009-04/

                                                                                                                                        Thou shalt foreswear, renounce, and abjure the vile heresy which claimeth that “All the world’s a VAX”, and have no commerce with the benighted heathens who cling to this barbarous belief, that the days of thy program may be long even though the days of thy current machine be short

                                                                                                                                        Whilst the world is not a VAX any more, neither is it an x86. Consider that your code may run on PowerPC, RISC-V, ARM, MIPS or any of the many other architectures supported by Linux. Some processors are big endian, others little. Some are 32-bit and others 64. Most are single core, but increasingly they are multi-core.

                                                                                                                                      1. 31

                                                                                                                                        Zig’s cross-compilation story is the best I’ve ever seen. It’s so good I didn’t even think it would be possible. Even if Zig-the-language never gains any traction (which would be a tragedy), Zig-the-toolchain is already fantastic and will be around for a long time.

                                                                                                                                        Go’s is good, don’t get me wrong, but Zig solves a much harder problem and does it so amazingly seamlessly.

                                                                                                                                        1. 16

                                                                                                                                          To be honest, the difficulty of cross compilation is something I have never really understood. A compiler takes source code written in some human readable formalism, and produces binary code in some machine readable formalism. That is it. It’s frankly baffling, and a testament to a decades long failure of our whole industry, that “cross compilation” is even a word: it is after all just like compilation: source code in, machine code out. We just happen to produce machine code for other systems than the one that happens to host the compiler.

                                                                                                                                          I see only two ways “cross” compilation can ever be a problem: limited access to target-specific source code, and limited access to the target platform’s specifications. In both cases, it looks to me like a case of botched dependency management: we implicitly depend on stuff that varies from platform to platform, and our tools are too primitive or too poorly designed to make those dependencies explicit so we can change them (like depending on the target platform’s headers and ABI instead of the compiler’s platform).

                                                                                                                                          I would very much like to know what went wrong there. Why is it so hard to statically link the C standard library? Why do Windows programs need VCRedists? Can’t a program just depend on its OS’s kernel? (Note: I know the security and bloat arguments in favour of dynamic linking. I just think solving dependency hell is more important.)

                                                                                                                                          1. 9

                                                                                                                                            Why is it so hard to statically link the C standard library?

                                                                                                                                            Well, because glibc… Maybe musl will save us 😅

                                                                                                                                            If you really want to go down that rabbit hole: https://stackoverflow.com/a/57478728

                                                                                                                                            1. 8

                                                                                                                                              Good grief, glibc is insane. What it does under the hood is supposed to be an implementation detail, and really should not be affected by linking strategy. Now, this business about locales may be rather tricky; maybe the standard painted them into a corner: from the look of it, a C standard library may have to depend on more than the kernel¹ to fully implement itself. And if one of those dependencies does not have a stable interface, we’re kinda screwed.

                                                                                                                                              When I write a program, I want a stable foundation to ship on. It’s okay if I have to rewrite the entire software stack to do it, as long as I have stable and reliable ways to make the pixels blink and the speaker bleep. Just don’t force me to rely on flaky dependencies.

                                                                                                                                              [1]: The kernel’s userspace interface (system calls) is very stable. The stackoverflow page you link to suggests otherwise, but I believe they’re talking about the kernel interface, which was never considered stable (resulting in drivers having to be revised every time there’s a change).

                                                                                                                                              1. 5

                                                                                                                                                It’s worth noting (since your question was about generic cross-platform cross-compilation, and you mentioned e.g. Windows) that this comment:

                                                                                                                                                The kernel’s userspace interface (system calls) is very stable.

                                                                                                                                                is only really true for Linux among the mainstream operating systems. In Solaris, Windows, macOS, and historically the BSDs (although that may’ve changed), the official, and only stable, interface to the kernel, is through C library calls. System calls are explicitly not guaranteed to be stable, and (at least on Windows and Solaris, with which I’m most familiar) absolutely are not: a Win32k or Solaris call that’s a full-on user-space library function in one release may be a syscall in the next, and two separate syscalls in the release after that. This was a major, major issue with how Go wanted to do compilation early on, because it wanted The Linux Way to be the way everywhere, when in fact, Linux is mostly the odd one out. Nowadays, Go yields to the core C libraries as appropriate.

                                                                                                                                                1. 1

                                                                                                                                                  As long as I have some stable interface, I’m good. It doesn’t really matter where the boundary is exactly.

                                                                                                                                                  Though if I’m being honest, it kinda does: for instance, we want interfaces to be small, so they’re easier to stabilise and easier to learn. So we want to find the natural boundaries between applications and the OS, and put the stable interface there. It doesn’t have to be the kernel to be honest.

                                                                                                                                            2. 3

                                                                                                                                              I agree it is overcomplicated, but it isn’t as simple as you are saying.

                                                                                                                                              One answer is that many tools want to run code during the build process, so they need both compilers and a way to distinguish between the build machine and the target machine. This does not need to be complicated, but it immediately breaks your idealized world view.

                                                                                                                                              Another answer is our dependency management tools are so poor it is not easy to setup the required libraries to link the program for the target.

                                                                                                                                              1. 1

                                                                                                                                                many tools want to run code during the build process

                                                                                                                                                Like, code we just compiled? I see two approaches to this. We could reject the concept altogether, and cleanly separate the build process itself, which happens exclusively on the source platform, from tests, which happen exclusively on the target platform. Or we could have a portable bytecode compiler and interpreter, same as Jonathan Blow does with his language. I personally like going the bytecode route, because it makes it easier to have a reference implementation you can compare to various backends.

                                                                                                                                                a way to distinguish between the build machine and target machine.

                                                                                                                                                As far as I understand, we only need a way to identify the target machine. The build machine is only relevant insofar as it must run the compiler and associated tools. Now I understand how that alone might be a problem: Microsoft is not exactly interested in running MSVC on Apple machines… Still, you get the idea.

                                                                                                                                                Another answer is our dependency management tools are so poor it is not easy to setup the required libraries to link the program for the target.

                                                                                                                                                Definitely.

                                                                                                                                                1. 1

                                                                                                                                                  Like, code we just compiled?

                                                                                                                                                  There are two common cases of this. The first is really a bug in the build system: try to compile something, run it, examine its behaviour, and use that to configure the build. This breaks even the really common cross-compilation use case of trying to build something that will run on a slightly older version of the current system. Generally, these should be rewritten as either try-compile tests or run-time configurable behaviour.

                                                                                                                                                  The more difficult case is when you have some build tools that are built as part of the compilation. The one that I’m most familiar with is LLVM’s TableGen tool. To make LLVM support cross compilation, they needed to first build this for the host, then use it to generate the files that are compiled for the target, then build it again for the target (because downstream consumers also use it). LLVM is far from the only project that generates a tool like this, but it’s one of the few that properly manages cross compilation.

                                                                                                                                                  1. 1

                                                                                                                                                    Oh, so that’s what you meant by distinguishing the build platform from the target platform: distinguishing what will be built for the host platform (because we need it to further the build process) from the final artefacts. Makes sense.

                                                                                                                                                    1. 1

                                                                                                                                                      Another example would be something like build.rs in Rust projects, though that seems less likely to cause problems. The Linux kernel build also compiles some small C utilities that it then uses during the build, so they have HOSTCC as well as CC.

                                                                                                                                              2. 2

                                                                                                                                                The concept of a self-hosted language is fading away. The last truly self-hosted language might have been Pascal.

                                                                                                                                                1. 4

                                                                                                                                                  On its surface this sounds preposterous. Can you elaborate? I know of maybe a dozen self-hosted languages since Pascal so I think I must be misunderstanding.

                                                                                                                                                  Edit: I’m guessing you mean that not only is the compiler self-hosted, but every last dependency of the compiler and runtime (outside the kernel I guess?) is also written in the language? That is a much more limited set of languages (still more than zero) but it’s not the commonly accepted meaning of self-hosted.

                                                                                                                                                  1. 3

                                                                                                                                                    The original Project Oberon kernel was written in assembly, but the newer version is written almost entirely in Oberon.

                                                                                                                                                    Some of the early Smalltalks were written almost entirely in Smalltalk, with a weird syntactic subset that had limited semantics but compatible syntax that could be compiled to machine code.

                                                                                                                                                    And of course LISP machines, where “garbage collection” means “memory management.”

                                                                                                                                                    1. 3

                                                                                                                                                      Yes the latter. Sorry, maybe the terminology I used is off.

                                                                                                                                                      1. 2

                                                                                                                                                        It’s an interesting distinction even if the terminology isn’t what I’d use. There’s a trend right now among languages to hop on an existing runtime, because rebuilding an entire ecosystem from first principles is exhausting, especially if you want to target more than one OS/architecture combo. Sometimes it’s as simple as “compile to C and benefit from the existing compilers and tools for that language”. But it seems fitting that we should have a way to describe those systems which take the harder route; I just don’t know what the word would be.

                                                                                                                                                  2. 1

                                                                                                                                                    limited access to the target platform’s specifications. […] it looks to me like a case of botched dependency management

                                                                                                                                                    This is exactly what’s going on. You need to install the target platform’s specifications in an imperative format (C headers), and it’s the only format they provide.

                                                                                                                                                    And it makes extreme assumptions about file system layout, which are all necessarily incorrect because you’re not running on that platform.

                                                                                                                                                  3. 5

                                                                                                                                                    Could you elaborate on what “the harder problem” is?

                                                                                                                                                    1. 26

                                                                                                                                                      Go can cross-compile Go programs but cgo requires an external toolchain even natively; cross compiling cgo is a pain.

                                                                                                                                                      Zig compiles Zig and C from almost any platform to almost any platform pretty seamlessly.

                                                                                                                                                      1. 4

                                                                                                                                                        Zig compiles Zig and C from almost any platform to almost any platform pretty seamlessly.

                                                                                                                                                        As I understand it, Zig doesn’t do much more than clang does out of the box. With clang + lld, you can just provide a directory containing the headers and libraries for your target with --sysroot= and specify the target with -target. Clang will then happily cross-compile anything that you throw at it. Zig just ships a few sysroots pre-populated with system headers and libraries. It’s still not clear to me that this is legal for the macOS ones, because the EULA for most of them explicitly prohibits cross compiling, though it may be fine if everything is built from the open source versions.

                                                                                                                                                        This is not the difficult bit. It’s easy if your only dependency is the C standard library but most non-trivial programs have other dependencies. There are two difficult bits:

                                                                                                                                                        • Installing the dependencies into your sysroot.
                                                                                                                                                        • Working around build systems that don’t support cross compilation and so try to compile and run things at build time, and dependencies that are themselves compiler-like tools.

                                                                                                                                                        The first is pretty easy to handle if you are targeting an OS that distributes packages as something roughly equivalent to tarballs. On FreeBSD, for example, every package is just a txz with some metadata in it. You can just extract these directly into your sysroot. RPMs are just cpio archives. I’ve no idea what .deb files are, but probably something similar. Unfortunately, you are still responsible for manually resolving dependencies. It would be great if these tools supported installing into a sysroot directly.

                                                                                                                                                        The second is really hard. For example, LLVM builds a tablegen tool that generates C++ files from a DSL. LLVM’s build system supports cross compilation and so will first build a native tablegen and then use that during the build. If you’re embedding LLVM’s cmake, you have access to this. If you have just installed LLVM in a sysroot and want to cross-build targeting it then you also need to find the host tablegen from somewhere else. The same is true of things like the Qt preprocessor and a load of other bits of tooling. This is on top of build systems that detect features by trying to compile and run something at build time - this is annoying, but at least doesn’t tend to leak into downstream dependencies. NetBSD had some quite neat infrastructure for dealing with these by running those things in QEMU user mode while still using host-native cross-compile tools for everything else.

                                                                                                                                                        1. 1

                                                                                                                                                          As I understand it, Zig doesn’t do much more than clang does out of the box. With clang + lld, you can just provide a directory containing the headers and libraries for your target with --sysroot= and specify the target with -target. Clang will then happily cross-compile anything that you throw at it. Zig just ships a few sysroots pre-populated with system headers and libraries.

                                                                                                                                                          That’s what it does but to say that it “isn’t much more than what clang does out of the box” is a little disingenuous. It’s like saying a Linux distro just “packaged up software that’s already there.” Of course that’s ultimately what it is, but there’s a reason why people use Debian and Fedora and not just Linux From Scratch everywhere. That “isn’t much more” is the first time I’ve seen it done so well.

                                                                                                                                                          1. 1

                                                                                                                                                            It solves the trivial bit of the problem: providing a sysroot that contains libc, the CSU bits, and the core headers. It doesn’t solve the difficult bit: extending the sysroot with the other headers and libraries that non-trivial programs depend on. The macOS version is a case in point. It sounds as if it is only distributing the headers from the open source Apple releases, but that means that you hit a wall as soon as you want to link against any of the proprietary libraries / frameworks that macOS ships with. At that point, the cross-compile story suddenly stops working and now you have to redo all of your build infrastructure to always do native compilation for macOS.