Threads for tumdum

  1. 5

    If anyone is interested, Handmade Hero is a “C with classes”-style project developed in exactly the same way as TigerBeetle.

    1. 5

      I haven’t watched HMH past the first few hundred episodes, but unless Casey drastically changed what he did in the first year or so, HMH is nowhere near the rigor that TigerBeetle has in its development approach… which is actually for the better, since it’s not a financial database.

      1. 5

        I’m just going by the description in that post. “allocate all the memory at a startup, and there’s zero allocation after that”, “we just cast the bytes we received from the network to a desired type”, “ aggressively minimize all dependencies”, “know exactly the system calls our system is making” and “ little abstraction between components” - all those properties apply to HMH as well.

    1. 33

      It doesn’t matter whether telemetry was added to a product in good faith or not: projects change leadership, corporations are taken over, data sets leak, or people just change their minds. Once I opt in, any change in the amount and quality of the data sent to headquarters is likely to be done without my explicit consent.

      I also believe that the gains in quality that are claimed to be achieved based on telemetry data are highly exaggerated for commercial (i.e. ad industry) reasons. Apart from minor ergonomic improvements I cannot think of much that can be achieved by collecting user data, and I say this as someone who has been maintaining an open source project for 25 years now.

      1. 8

        Apart from minor ergonomic improvements I cannot think of much that can be achieved by collecting user data

        Here is a list of things that the Go team wanted to learn from the transparent telemetry they proposed. It seems reasonable to me and hard to get any other way.

        1. 11

          It’s a reasonable list, but as Bunny351 pointed out, that list and who’s in charge of it can change. It does not exist in isolation or frozen in time.

          1. 6

            I’m not arguing for Go telemetry, just pointing out that telemetry in general can have many use cases, and that they are even well documented. When you contrast that with someone who claims to have lots of experience and at the same time fails to see any use for such telemetry, it’s hard for me not to see this as a dishonest claim made only to push some hidden agenda. It would be much better for people involved in such discussions to be honest and say what they really think out loud, instead of making such silly claims. You don’t want telemetry in Go because you don’t trust Google or big corps in general? Fine, just say so instead of making those silly pseudo-technical arguments. 🤷

            1. 10

              No need to get personal. I clearly wrote that I think the claimed advantages are exaggerated and that I do acknowledge minor gains. I also wonder what “hidden agenda” you are talking about. In fact, your strong wording makes me wonder what your agenda is…

              In my experience I have found it more fruitful to engage directly with users and co-developers to gather areas of improvement, instead of relying on questionable statistics that might be abused or may even misrepresent real use. Users should not be a resource to be mechanically harvested. If I’m unclear about the usage patterns of the tool I maintain, I can politely ask the user base, and they may or may not be interested in telling me about it.

              No, I don’t trust Google. I do trust certain communities at the moment; that may change in the future, just as communities, their governance, or their policies may change. That is my main point: trust is not something I grant eternally, and tool vendors should acknowledge and accept this.

              Regarding my honesty: I truly cannot remember any situation where adding telemetry to something I maintained would have made a genuine impact. Note that I’m talking about my personal experience. There may be minor gains, perhaps, but is it worth requiring users to pay for those gains with their data? I think not.

              1.  

                All I’ve seen of you in all the threads on the Go telemetry has been weird attacks on people rather than on their arguments. You jump down the throat of anyone who you feel hasn’t sufficiently “read the proposal”, you impute dishonest motivations to those who disagree with you, etc.

                Perhaps you could find a more constructive way to make your points?

            2. 7

              I’m a bit wary of this because I think it’s quite easy to use lists like the above to win the argument by economy: coming up with the list is a lot more time-efficient than going through it point by point and refuting each thing. But there are some points worth bringing up that apply to most of them.

              Many people have opposed opt-in, claiming it will bias the results. rsc has also been vocally anti-surveys for similar reasons. But all of these suggestions are susceptible to other sources of bias—they overcount projects which are built often (or invoke other tools often; the same objection applies). I don’t know which projects those are, but I see no reason to just assume that the number of times a thing is compiled correlates with its significance.

              Also, as other people in the thread have raised, that a thing happens (relatively) rarely doesn’t mean that its presence is unimportant. rsc has framed this as an exercise in answering important quantitative questions, but making the debate about how to answer them would be reasonable only if the quantitative answers were ends in themselves. They’re not—they’re a way of answering qualitative questions about the importance of various things to the community. I don’t think they achieve this.

              I could go deeper and attack the statistics side of this, but I think it’d be a bit redundant. The data can’t be of high quality, even if it were gathered and handled perfectly (which I don’t think it is), because it’s answering the wrong questions.

              There is yet a wider zoom level through which to view this debate, however. Go as an institution has apparently been built on the premise of not talking to one’s customers. This is typical for Google, but it shouldn’t be typical for FOSS.

              1.  

                That article would be a bit more compelling if Russ put a disclaimer stating “I have not received any pressure from my superiors at Google to collect more data from Go users.”

              2.  

                What’s to prevent a sufficiently evil change of guard from just starting to collect that info without really notifying you (or burying the notice)? Like, this seems like a legitimate concern that isn’t something that can be solved by complaining about telemetry. Just disable it and move on, or use something else.

                1.  

                  Once I opt in, any change in the amount and quality of the data sent to headquarters is likely to be done without my explicit consent.

                  That’s a fair complaint. Assuming we wanted telemetry but wanted to avoid this issue, I see a design that could fix this.

                  Most projects’ SEND_TELEMETRY setting or environment variable is a boolean – DO_NOT_SEND or SEND. In the project that we want to add telemetry to, instead make it an enum: DO_NOT_SEND, SEND_V1, SEND_V2. If the project wants to start collecting new data, it would have to convince users to change their settings to SEND_V2. Whether the project used opt-in or time-delayed opt-out for its initial request to enable telemetry, it would use the same strategy to request the upgraded telemetry.
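
                  As a rough sketch of that enum idea (my own illustration, in Rust, with made-up names rather than anything from a real project): the consent setting names the newest data set the user agreed to, and a report is only sent if its version is covered by that consent.

                  enum TelemetryConsent {
                      DoNotSend,
                      SendV1, // the initially documented counters
                      SendV2, // a later, expanded set
                  }

                  // Only send a report if the user consented to at least its version.
                  fn may_send(consent: TelemetryConsent, report_version: u8) -> bool {
                      match consent {
                          TelemetryConsent::DoNotSend => false,
                          TelemetryConsent::SendV1 => report_version <= 1,
                          TelemetryConsent::SendV2 => report_version <= 2,
                      }
                  }

                  fn main() {
                      assert!(may_send(TelemetryConsent::SendV1, 1));
                      assert!(!may_send(TelemetryConsent::SendV1, 2)); // new data needs upgraded consent
                      assert!(may_send(TelemetryConsent::SendV2, 2));
                      assert!(!may_send(TelemetryConsent::DoNotSend, 1));
                  }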

                  1.  

                    So long as the initial team stays in control and that enum is never expanded to new telemetry. What about telemetry you thought you were collecting on day 1 and it turns out there was a bug that meant you never sent anything useful? Could you fix that? What about sending the uptime but you realize you also need to know if the copy on disk is newer than your program’s uptime?

                    Much of this relies on the continued single-minded determination of the person or entity making the decisions to make them consistently and correctly.

                1.  

                  If you can drop the auto-updates and just distribute a self-contained executable, Cosmopolitan might be an interesting option.

                  1.  

                    With regard to option 2, getting your package into some package repos for a few major distributions does give you roughly this (modulo the limitations of each distro). I (at least notionally) try to target Gentoo, Debian, Ubuntu (specifically, though if you get into Debian properly then you will eventually be ported into Ubuntu), and Arch Linux when I’m not developing something tied to some language or environment’s package manager. But you might add one each of the Mac and Windows package management tools as well.

                    Edit: doing this (and also specifically structuring your program so that it is easily packageable), and providing .deb’s for direct download, means that distro maintainers for other distros will tend to be pretty willing to do the ‘last mile’ packaging, getting you fairly complete coverage.

                    1.  

                      Let’s assume I have a way of generating arch-specific static binaries

                      Is there an easy way to automate publishing the packages to:

                      1. homebrew
                      2. Ubuntu apt-get
                      3. Chocolatey (Windows)
                      1.  

                        cargo-dist might be able to do that in future.

                        1.  

                          So, usually you would work out how to package the project for all of these as part of your CI (rather than ‘just generating a static binary’), and then on a tag release automatically push the generated artifact to the relevant package manager. Eg I’ve seen people use GitHub Actions to wrap the static binary in the wrapper that Chocolatey needs and then push it to Chocolatey.

                          But the exact ‘how’ depends on the details of your whole thing. E.g. packaging Rust things for Debian is actually a lot easier than that: you typically wouldn’t compile a static binary; you only need a debian directory in the root of your Rust project with a correctly formatted copyright file and a debcargo.toml file, which are processed, compiled, and distributed automatically by the Debian infrastructure once you have registered your package with the Debian Rust packaging team. It’s similar for Gentoo, except you need a gentoo directory with an ebuild file, and distributing binary packages requires a bit more setup on your end instead of being completely automatic on the distro infrastructure side.

                          Basically, you do need to learn a bit about ‘the maintainer life’ across the major platforms you want to release on, but the upside is that you get those nice ‘native OS ergonomics’.

                      1. 14

                        You’re going to have a lot of wells to unpoison when it comes to this claim, especially given dark patterns such as not asking for consent around tracking in general. I don’t think folks argue it’s “telemetry = bad”, but given all of the mishandling of data we’ve seen, and without the source, it’s hard to just “take their word” on what they’ll collect or to trust that it won’t be abused in a future release.

                        1. 15

                          I don’t think folks argue it’s “telemetry = bad”

                          I think that this is exactly what folks argue. At least that’s how I perceived the reaction to Russ proposal to add ‘transparent telemetry’ to Go. In fact, the discussion on lobsters showed that many (if not almost all) who argued against it didn’t even read the proposal.

                          1. 15

                            That’s exactly the poisoned-well issue, though: there are enough somewhat credible accusations against Google that its actual data handling practices have in fact been incompatible with a literal reading of its published documents. And any big org has stories about higher-ups overriding carefully balanced planning from the technical teams. Oh, and nobody doubts that Google can get both code and transmitted-data obfuscation done.

                            In this model — supported by many people’s evaluation of facts (mine included) — reading a proposal indeed cannot remove the worries, so some people just skipped it.

                            1.  

                              I have to agree. And it makes me sad, as software gets progressively worse for power users and developers, since they don’t receive our telemetry. I pretty much always enable telemetry, because I care about how I use a product being counted.

                              1.  

                                since they don’t receive our telemetry

                                Why would that be the reason software is getting worse, considering that there wasn’t telemetry when it was better?

                                1.  

                                  It can be used as an excuse to remove features that only the telemetry-naysayers are using, since the telemetry shows no use of said features. Before telemetry, someone had to formulate an argument for a feature’s removal; now the burden of proof has moved to those who want to keep it.

                                  1.  

                                    That makes sense, but if anything it is an argument against telemetry rather than for it.

                              2.  

                                I think it’s true that a lot of people didn’t read the proposal, and more so the further you got from the original discussion on GitHub (I didn’t follow that one on lobste.rs, but I did see some nuance-lite posts on mastodon). But in spaces where it was more constructive, the message was much more clearly “must be opt-in.” And hey, they listened.

                            1. 6

                              The tools we use shape our solutions. They shape how we think about the problem and approach it.

                              I advocate for a task-centric approach to learning languages. Pick something you’re really interested in, find similar things, and ask what they’re written in.

                              I actually wish I could advocate for everyone to learn Rust. But the learning curve is too steep and if someone had pushed it on me a decade ago I don’t know if I would have persisted.

                              I think maybe my current framework is: learn what interests you, prototype in what’s familiar, rewrite in rust ;p

                              1. 2

                                I actually wish I could advocate for everyone to learn Rust. But the learning curve is too steep

                                I don’t know if it is too steep, but it’s definitely much less steep than a few years back. The borrow checker is much better, the errors are less cryptic, and lots of stuff is stabilized. Now is the best time to learn Rust :)

                                1.  

                                  I feel like Rust is a good language once you’ve experienced writing software in C. A lot of the “why the F does Rust care about this tiny difference” goes away once you’ve felt the pain of debugging random runtime segfaults (even with address sanitizer).

                                  But if you’re coming from node or Python or Ruby it’s a harder sell. I think writing it as your first lang would be even harder than that.

                                  The borrow checker and its errors aren’t really the barrier. The issue is getting to the point where you understand enough about the problem to model a workaround. Things like functional chaining are very powerful and nice, but thinking in type transformations is a higher-order skill to master when people are at a true beginner “why should I put data in a variable” level of learning programming.

                                  I teach 5th graders, and I wonder if it would ever be possible to teach them Rust.

                                  1.  

                                    But if you’re coming from node or Python or Ruby it’s a harder sell.

                                    I’m not so sure about that. In my experience doing bigger refactors in those languages is much harder than in rust.

                                    1. 5

                                      There are plenty of other languages with type systems that are just as helpful as Rust’s, but don’t require you to jump thru a ton of hoops to avoid the overhead of garbage collection.

                                      If you’re solving a problem that could have been written in Ruby or Python, (that is to say, one in which GC is not a bottleneck) then picking Rust over OCaml is just opting in to a bunch of needless effort for no relevant gain. (Unless of course you need a specific library which exists in Rust but not OCaml.)

                                      1.  

                                        I didn’t say anywhere that rust is always the best choice :)

                                      2.  

                                        I’m with you, I agree. But devs need to have that perspective and pain before they can appreciate the difference. Confidence in refactoring is a major win, but for someone struggling to even write a first version it’s a tougher sell.

                                        1.  

                                          Well, I agree with you. I didn’t say that Rust is the best choice for a first language - I have no idea if that would be a good choice. All I’m saying is that Rust is a great language to learn, and it’s much easier to learn it today than 5 years ago. Certainly it’s also not the only language worth learning :)

                                1. 41

                                  I am fed up with the Rust hype myself, but this is just pointless. “if written by the right people.” Yeah sure, you don’t need memory safety if you have the magical right people who never make memory errors.

                                  1. 41

                                    If you have people who have sufficient attention to detail to write memory-safe code in C, imagine what they would be able to do if you gave them tools that removed that cognitive load and let them think about algorithms and data structures.

                                    1. -4

                                      and let them think about algorithms and data structures.

                                      Ideally they would be thinking about solving the problem at hand and not ways in which they can use spare tools.

                                    2. 3

                                      Software is prone to many sorts of defects, and I find it quite plausible that there are people who could, pursuant to appropriate external factors, produce software in c with an overall defect rate not far off from what it would be if they wrote it in, say, typescript. (Typescript is quite an odd strawman here, considering that it doesn’t have—and isn’t, to my knowledge, known for having—a particularly strong or expressive type system, but we’ll roll with it.) I might even go so far as to place myself in that category.

                                      I do agree with david that, absent such external factors, there are very good reasons to not choose to write applications in c; it is rather absurd that, in a general sense, programming languages are used which are not garbage-collected and capability-safe.

                                      1. 6

                                        Typescript is quite an odd strawman here, considering that it doesn’t have—and isn’t, to my knowledge, known for having—a particularly strong or expressive type system, but we’ll roll with it.

                                        Not strong, perhaps—it’s intentionally unsound in places—but I’d argue that it’s among the most expressive of mainstream languages I’ve used. You can express things like “function that, given an object [dict/hashmap], converts all integer values into strings, leaving other values as-is”, and even do type-level string parsing - very handy for modeling the sorts of metaprogramming shenanigans common in JS libraries.

                                      2. 2

                                        Not disagreeing, but something to add here.

                                        Languages also won’t prevent bad designs, bad performance, hard, unreasonably complicated builds and deployments, bugs (security-related or other kinds), and so on. So you can find projects by “the wrong people” in every language. Sometimes this colors how a language is perceived, especially when there aren’t many widely used applications written in it, or it is simply dominated by a few applications.

                                        Another thing, when it comes for example to C and security: it might depend a lot on context and not just the language itself. First of all, yes, C has an issue with memory safety. I am not here to defend that, but I think it’s a great example of where a huge number of workarounds and mitigations have emerged in the ecosystem. For example, running a service in C written for Windows 98 has strongly different properties than OpenSSH on OpenBSD or something using sandboxing on Linux or OpenBSD, and switching libraries, Valgrind, fuzzers, etc. can make a drastic difference. While that doesn’t make C safe, it does make a difference in the real world, and as such it can be reasonable to go with a project written in C for security reasons when the other option is some hacked-together project that might claim to be production-ready, but hasn’t been audited and was programmed naively.

                                        So in the end it’s rarely black and white, and it’s very context-dependent. The article could be understood as saying that a language doesn’t define these things on its own.

                                        1. 12

                                          Languages also won’t prevent bad designs, bad performance, hard, unreasonably complicated builds and deployments, bugs (security-related or other kinds), and so on.

                                          No, languages exist exactly to do these things. I’m excluding the extreme case of someone stubbornly writing intentionally terrible code to prove the point. When developers write code with best intentions, the language matters, and it does help or hinder them.

                                          We have type systems, so that a “bad programmer” calling a wrong function will get a compilation error before the software is released, instead of shipping buggy code and end users getting random “undefined is not a function”.

                                          When languages don’t have robust features for error handling, it’s easy to fail to handle errors, either by accident (you didn’t know a function could return -1) or just by being lazy (if it’s verbose and tedious, it’s less likely to get done). When languages won’t let you ignore errors, then you won’t. Ignoring an exception — On Error Resume Next style — needs to be done actively rather than passively. Rust goes further with Result types that in most cases won’t even compile if you don’t handle them somehow.
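
                                          A small Rust illustration of that last point (my own sketch, not code from the comment): an unused Result is flagged by the compiler, so dropping an error has to be spelled out.

                                          use std::fs;

                                          fn main() {
                                              // Ignoring the Result here triggers an `unused_must_use` warning,
                                              // so silently discarding the error takes a deliberate `let _ =`:
                                              let _ = fs::remove_file("state.tmp"); // explicit "I don't care"

                                              // Handling it forces the failure path to be written down:
                                              match fs::read_to_string("config.toml") {
                                                  Ok(cfg) => println!("loaded {} bytes", cfg.len()),
                                                  Err(e) => eprintln!("could not read config: {e}"),
                                              }
                                          }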

                                          You can have for-each loops that won’t have off-by-one errors when iterating over whole collections. You can have standard libraries that provide robust fast containers, good implementations of basic algorithms, so that you don’t poorly reinvent your own. Consider how GTA was terribly slow loading for years because of a footgun in a bad standard library combined with a crappy DIYed JSON parser. This error wouldn’t happen in a language with better string functions, and would be a total non-issue in a language where it’s easy to get a JSON parser.

                                          It’s much harder to write bad Rust code than to write bad C or bad JS or bash. Good languages make “bad programmers” write better code.

                                          1.  

                                            Languages also won’t prevent bad designs, bad performance, hard, unreasonably complicated builds and deployments, bugs (security-related or other kinds), and so on.

                                            No, languages exist exactly to do these things. I’m excluding the extreme case of someone stubbornly writing intentionally terrible code to prove the point. When developers write code with best intentions, the language matters, and it does help or hinder them.

                                            No, they don’t.

                                            • Bad designs are a factor of developer experience
                                            • Performance is a factor of algorithms and implementations
                                            • Complicated builds are based on design, build tools, libraries, frameworks, also see Bazel, etc.
                                            • Deployments are pretty much the same thing. I’d agree that with builds and deployments the language can have a huge influence, which is why containers are often used as a workaround. If we look at C code there is a huge range, from a super simple single binary to absolutely horrible. That’s why I don’t think it’s defined by the language.
                                            • Bugs are to a large degree ones that can be made in every language. So many bugs have been made in C, Java, Python, Perl, PHP, Go and Rust: bad error handling, SQL injection, library misuse, etc. Sure, if your implementation has memory safety that is its own story, but it’s certainly not the only security-critical kind of bug.

                                            We have type systems, so that a “bad programmer” calling a wrong function will get a compilation error before the software is released, instead of shipping buggy code and end users getting random “undefined is not a function”.

                                            Yes. Yet there were C and C++ and people still came up with Perl, Ruby, PHP which all don’t have memory safety issues.

                                            You can have standard libraries that provide robust fast containers, good implementations of basic algorithms, so that you don’t poorly reinvent your own

                                            You can also have different implementations, replacements for standard libraries, other algorithms, because it’s not tied to the language.

                                            Consider how GTA was terribly slow loading for years because of a footgun in a bad standard library combined with a crappy DIYed JSON parser.

                                            This underlines my point though. Despite being written in a language that is usually called fast, it was slow; despite the language, it was then made fast. So it was not the language defining the software, but these other factors.

                                            It’s much harder to write bad Rust code than to write bad C or bad JS or bash. Good languages make “bad programmers” write better code.

                                            While I agree with that sentiment, I’d be curious about any studies on that. Way back when I was at university I dug through many studies on such topics: whether object-oriented programming has measurable benefits, typing, development methodologies, and so on. In reality they all seem to show far less difference (that is, no statistically significant difference), from bugs to development speed, than so many other factors. They always end up boiling down to developer experience and confidence. Unless the authors were biased.

                                            I think a great example is PHP. I think a lot of people here will have seen terrible PHP code, and it’s a language I like to hate as well. It’s incredibly easy to write absolutely horrible code in it. Yet the web would be empty without it. And even new projects are started successfully. I can’t think of anything good about it really, and I try to stay away from it as much as I can, yet even after decades of better alternatives new projects are frequently and successfully started in it. And there’s some software in PHP that a lot of people here would recommend and often write articles about. Nextcloud, for example.

                                            1. 8

                                              You’re reducing languages to the Turing Tarpit, overlooking human factors, differences in conventions, standards in their ecosystems, and varying difficulty in achieving goals in each language.

                                              The fact that it’s demonstrably possible to screw up an implementation in a language X just as much as in language Y, doesn’t mean they’re equivalent and irrelevant. Even if the extreme worst and extreme best results are possible, it still matters what the typical outcome is, and how much effort it takes. In other words, even when a full range of different outcomes is possible, the language used can be a good Bayesian prior for which outcome you can expect.

                                              Even when the same bugs can be written in different languages, they are not equally likely to be written in every language. This difference in likelihood is very important.

                                              • A mistake of passing too few arguments to a function is a real concern in some languages, and a total non-issue in others.
                                              • Strings with spaces — bane of bash and languages where code is naively glued from text. Total non-issue in most other languages.
                                              • Data races are a real thing to look out for in multi-threaded C and C++ projects, requiring skill and diligence to prevent. In Rust prevention of this particular bug requires almost no skill or diligence, because the compiler does it.
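
                                              As a rough illustration of the data-race bullet (a minimal sketch, not taken from the comment): the shared counter below has to go behind a Mutex, because unsynchronized mutable sharing across threads simply doesn’t compile in safe Rust.

                                              use std::sync::Mutex;
                                              use std::thread;

                                              fn main() {
                                                  let counter = Mutex::new(0u64);
                                                  // Scoped threads may borrow `counter`; the Mutex is what makes
                                                  // the concurrent `+= 1` legal. Remove it and this won't build.
                                                  thread::scope(|s| {
                                                      for _ in 0..4 {
                                                          s.spawn(|| {
                                                              for _ in 0..1_000 {
                                                                  *counter.lock().unwrap() += 1;
                                                              }
                                                          });
                                                      }
                                                  });
                                                  assert_eq!(*counter.lock().unwrap(), 4_000);
                                              }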

                                              You can also have different implementations, replacements for standard libraries, other algorithms, because it’s not tied to the language

                                              For the record, Rust’s standard library is optional and replaceable. But my point was not about possibilities, but about the common easy cases. In Go you can expect programs to use channels, because they’re easily available. In C they are equally possible to use in the turing-tarpit sense, but in practice much harder to get and use, and this affects how C programs are written. Majority of Go programs use channels, majority of C programs don’t. Both could use channels equally frequently, but they don’t.

                                              Despite the language that is usually called fast it’s slow. Despite the language it was made fast. So it was not the language defining the software, but these factors.

                                              There’s a spread of program speeds and program qualities, and some overlap between languages, but the medians are different. C is slow in this narrow case, but overall program speed is still influenced by the language, e.g. GTA written in pure Python wouldn’t be fast, even if it used all the best designs and all the right algorithms.

                                              And in this case the language choice was the cause of the failure. Languages aren’t just 1-dimensional “fast<>slow”, but have other aspects that affect programs written in them. In this case the other aspects of the language — its clunky standard library and cumbersome handling of dependencies — were the culprit.

                                              While I agree with that sentiment, I’d be curious about any studies on that.

                                              It’s a famously difficult problem to study, so there’s unlikely to be any convincing ones.

                                              In my personal experience: since switching to Rust I did not need to use Valgrind, except when integrating C libraries. In my C work it was my regular tool. My attempts at multi-threading in C were awful, with crashy outcomes in both macOS GCD as well as OpenMP. In Rust I wrote several complex pervasively-multithreaded libraries and services without problems. In Golang I had issues with my programs leaving temp files behind, because I just can’t be trusted to remember to write defer every time. In Rust I never had such a bug, because its temp file library has automatic drop guards, so there’s nothing for me to forget. I find pure-Rust programs easy to build and deploy. I use cargo deb, which makes a .deb file with no extra config needed, in production.
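
                                              (For illustration only, and presumably referring to the tempfile crate; a rough sketch of the drop-guard behaviour, not code from the comment:)

                                              use std::io::Write;
                                              use tempfile::NamedTempFile;

                                              fn write_scratch(data: &[u8]) -> std::io::Result<()> {
                                                  let mut tmp = NamedTempFile::new()?; // created in the OS temp dir
                                                  tmp.write_all(data)?;
                                                  // ... use the file ...
                                                  Ok(())
                                                  // `tmp` is dropped here and the file is deleted automatically;
                                                  // there is no equivalent of Go's `defer os.Remove(...)` to forget.
                                              }

                                              fn main() -> std::io::Result<()> {
                                                  write_scratch(b"scratch data")
                                              }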

                                              A bit less anecdata is https://github.com/rust-fuzz/trophy-case. While it demonstrates that Rust programs aren’t always perfect, it also shows how successful Rust is at lowering the severity of bugs. Exploitable memory issues are rare, and the majority are panics, which are technically the equivalent of thrown exceptions.

                                              BTW, another aspect that language influences is that in Rust you can instrument programs to catch overflows in unsigned arithmetic. C doesn’t distinguish between intended and unintended unsigned overflow, so you can’t get that out of the box.
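
                                              A short sketch of that distinction (again my own illustration): intended wrap-around is spelled out explicitly, while plain arithmetic that overflows panics in debug builds (and can be made to panic in release builds via `overflow-checks = true` in Cargo.toml).

                                              fn main() {
                                                  let x: u32 = u32::MAX;

                                                  // Intended overflow: explicit wrapping arithmetic.
                                                  let wrapped = x.wrapping_add(1);
                                                  assert_eq!(wrapped, 0);

                                                  // Unintended overflow: `x + 1` would panic under overflow checks;
                                                  // `checked_add` makes the failure visible as a value instead.
                                                  let checked = x.checked_add(1);
                                                  assert_eq!(checked, None);
                                              }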

                                              Speaking of PHP, it’s an example where even changes to the language have improved quality of programs written in it. People programming for “old PHP” wrote worse code than when writing for “new PHP”. PHP removed footguns like HTTP includes. It removed magic quotes and string-gluing mysql extension, encouraging use of prepared statements. It standardized autoloading of libraries, which meant more people used frameworks instead of copy-pasting their terrible code.

                                              1.  

                                                You’re reducing languages to the Turing Tarpit, overlooking human factors, differences in conventions, standards in their ecosystems, and varying difficulty in achieving goals in each language.

                                                Why not say ecosystems then? I specifically wrote that I was talking about implementations, ecosystems, etc., so I don’t feel like I am reducing anything here.

                                                The fact that it’s demonstrably possible to screw up an implementation in a language X just as much as in language Y, doesn’t mean they’re equivalent and irrelevant.

                                                I did not claim that.

                                                Even when the same bugs can be written in different languages, they are not equally likely to be written in every language. This difference in likelihood is very important.

                                                I did not claim otherwise.

                                                Both could use channels equally frequently, but they don’t.

                                                That is not a factor of the language though. And it’s certainly not defining software, which was the actual topic.

                                                GTA written in pure Python wouldn’t be fast, even if it used all the best designs and all the right algorithms.

                                                Again, I did not claim so.

                                                And in this case the language choice was the cause of the failure. Languages aren’t just 1-dimensional “fast<>slow”, but have other aspects that affect programs written in them. In this case the other aspects of the language — its clunky standard library and cumbersome handling of dependencies — were the culprit.

                                                Yes, but as you say, languages are not 1-dimensional. There are trade-offs and many factors in why languages are chosen, and the language chosen might lead to issues, but they tend not to define the software. Unless you see some stack trace, or a widget toolkit tied closely to a language, you usually can’t tell which language something is written in, unless it’s badly designed (and I don’t mean designed by some super-human programmer, I mean by an average one).

                                                In Rust you can add an inexperienced member to the team, tell them not to write unsafe, and they will be mostly harmless. They’ll write unidiomatic and sometimes inefficient code, but they won’t cause memory corruption. In C++, putting a noob on the team is a higher risk, and they could do a lot more damage.

                                                They can still make your program crash, and can still cause all sorts of other security issues, from SQL injections to the classic “personal customer data has been exposed publicly on the internet”. Yes, memory safety is a different topic; you don’t need to reiterate that over and over. I don’t think anyone seriously claims either that C++ will result in better memory safety, or that it isn’t an issue. The topic at hand is whether the choice of a language defines software. And I’d argue that in most situations it doesn’t. Yes, if you have memory safety issues you want to get rid of, switch to Rust; if you don’t want issues with deployment, move away from Python. However, like you say, those are only two dimensions and there is more to language choice than that. Otherwise there would be that one language everyone uses - at least for new projects.

                                                A bit less anecdata is https://github.com/rust-fuzz/trophy-case. While it demonstrates that Rust programs aren’t always perfect, it also shows how successful Rust is at lowering the severity of bugs. Exploitable memory issues are rare, and the majority are panics, which are technically the equivalent of thrown exceptions.

                                                Yep, memory safety issues are bad, and the consequences horrible. But think about the C software you use on a daily basis. What percentage of that software do you think is defined by memory safety issues, or by being written in C in any other way? Of all the software I can think of, it’s only OpenSSL and image (or XML, sigh) parsers for me. That’s bad enough, but it’s still not much of the total, and sadly those are even pieces that are used by many other languages. Don’t get me wrong, I absolutely hope that soon enough everyone will, for example, use a Rust implementation of TLS. For most encoders/decoders it would be great too; I’d be very happy if they were rewritten. Still, I think for most other software I use, including libraries, there are other defining factors than the language, which is exactly why they can be rewritten in Rust without too much headache. If the language defined the software, that would be a problem.

                                                1.  

                                                  Both [C and Go] could use channels equally frequently, but they don’t.

                                                  That is not a factor of the language though. And it’s certainly not defining software, which the actual topic was.

                                                  How is it not? Channels are a flagship feature of the Go language. This influences how typical Go programs are designed and implemented, and consequently it is a factor in their bugs and features.

                                                  The way I understand your argument is that one does not have to use channels in Go and can write a very C-like Go program, or may use C to write something with as much concurrency and identical behaviors as a channel-based Go program, and this disproves that programs are defined by their language. But I call this reducing languages to the Turing Tarpit, and don’t consider that relevant, because these are very unlikely scenarios. In practice, it’s not equally easy to do both. Given real-world constraints on skill and effort, Go and C programs will end up with different architectures and different capabilities typical for their language, and therefore “Written in Go” and “Written in C” will have a meaning.

                                                  Unless you see some stack trace, or a widget toolkit tied closely to a language, you usually can’t tell which language something is written in, unless it’s badly designed

                                                  There are many ways in which languages affect programs, even if just viewed externally:

                                                  • ease of installation and deployment (how they handle dependencies, runtimes or VMs). I’m likely to have an easier time running a static Go binary than a Python program.
                                                  • interoperability (portability across system versions, platforms, compatibility with use as a library, in embedded systems, in kernels, etc.). Wrong glibc.so version is not an issue for distribution of JS programs, except those having C++ dependencies.
                                                  • startup speed. if a non-trivial program starts in milliseconds, I know it’s not Java or Python.
                                                  • run-time performance. Yes, there are exceptions and bad implementations, but languages still impose limits and influence typical performance. Sass compilers switched from Ruby to C++, and JS bundlers are switching from JS to Go or Rust, because these language changes have a very visible impact on user experience.
                                                  • multi-core support. In some languages it’s hard not to be limited to single-threaded performance, or have fine-grained multi-threading without increased risk of defects.
                                                  • stability and types of bugs. Languages have different weaknesses. If I feed an invalid UTF-8 to a Rust program, I won’t be surprised if it panics. If I feed the same to JS, I’ll expect some UCS-2 mojibake back. If I feed it to C, I expect it to preserve it byte by byte, up to the first \0.

                                                  So “Written in X” does tell me which problems I’m signing up for.

                                                  1.  

                                                    How is it not? Channels are a flagship feature of the Go language. This influences how typical Go programs are designed and implemented, and consequently it is a factor in their bugs and features.

                                                    That was in response to you saying that it could be done in C, and that it isn’t done by the language community. I was responding to that. What people do with a language, in my opinion, isn’t what a language is. Take JavaScript. Then look at its history. Then look at node.js, etc. It’s not the language that defined this path, it’s the community. In a similar way, it could be that C programmers start using channels or something else as a default way to do things.

                                                    Anyway, I am not saying that channels are not an integral feature of Go and tied to the language; they’re a feature that is part of the specification. So of course they are.

                                                    ease of installation and deployment (how they handle dependencies, runtimes or VMs). I’m likely to have an easier time running a static Go binary than a Python program.

                                                    I stated the same thing!

                                                    [more points from the list “There are many ways in which languages affect programs” ]

                                                    I’d argue that this is at least partly implementation dependent, and I think that shows with things like FaaS. But I agree, that depends on the language. Again I am not trying to argue that languages aren’t different from each other. We keep coming back to this.

                                                    So “Written in X” does tell me which problems I’m signing up for.

                                                    Again something I don’t disagree with.

                                                    My point is that we see in the real world that C/C++ software is being rewritten in Rust. Often times that is done without the average user even noticing. Heck, there is even Rust in browsers where developers thought it was JavaScript. What I am arguing is that if this is the case, then a language doesn’t define a piece of software.

                                              2. 6

                                                It’s much harder to write bad Rust code than to write bad C or bad JS or bash. Good languages make “bad programmers” write better code.

                                                While I agree with that sentiment, I’d be curious about any studies on that.

                                                As reported by both Google and Microsoft, around 70% of security issues are caused by memory safety bugs. So any suitable language that prevents memory safety issues will lead to programmers writing better code (at least security-wise).

                                                1.  

                                                  Okay, thanks for the clarification. I thought you meant better code as in code quality, not as in memory safety. And yes, not having such issues is a quality of the software, but it’s still not the first thing that comes to mind when I read “code quality”.

                                            2. 4

                                              Given two projects implementing the same functionality (some sort of network service), both implemented in similar time by equally competent developers: if one was in C and the other in Rust, would you be equally willing to expose them over the network?

                                              1.  

                                                [Sorry, this is a longer response, because I don’t want to be misunderstood here]

                                                That’s an abstract/theoretical example though. In reality I will gladly use nginx, OpenSSH, etc. over what exists in Rust, but not because of the language choice, which is the whole point I am trying to make. At the same time I wouldn’t trust other software written in C.

                                                Let me ask you a question. Would you rather build your company on a real life kernel connected to the internet written in C or in Rust in the here and now?

                                                I am maybe the odd one out, but I usually take a look over the code and project that I am using. There are situations where C code is simply more trustworthy. Often in “new” languages you end up with authors building on top of things that are themselves new, so you can end up being the guinea pig. There are also a couple of signs that are usually good for a project, not because you need them yourself, but because they are a good indicator of maturity. One of them is portability. Another is not just throwing a Docker container at you because the build is too complex otherwise.

                                                And of course these things change. It might very well be that in the future I’ll use a Rust implementation of something like OpenSSH (in the sense that OpenSSH isn’t just an implementation of the protocol). But right now, in the real world, I don’t intend to. I’d be way too worried about non-memory-safety bugs.

                                                But let’s go a bit more abstract than that. Let’s talk about an abstract service and only focus on how it is built, not what it actually does. If I got to choose between an exposed service, under your premises, in memory-safe node.js/JavaScript or in non-memory-safe kcgi/C, and my concern is being hacked, I can see myself going for the latter.

                                                In your example, sure, I’d go with Rust. But the thing here is that that’s only because of the missing context.

                                                But it isn’t language-dependent. For example, for non-language factors, I am pretty certain that there is more PHP and node.js code out there with SQL injection bugs than Python or Ruby code. Does that mean I should choose the latter if I am interacting with databases? I’d also assume that in the real world, on average, node.js applications are more susceptible to DoS attacks (the non-distributed kind) than, for example, PHP, simply because of how they are typically deployed/run. That also doesn’t have much to do with the language.

                                                I am focusing on the title of the article here: “Software is not defined by the language it’s written in”. I really don’t think languages are the thing that defines software. I’ve seen surprisingly good, solid Perl code and I’ve seen horrible Java, Python and Rust code. The reason for that was never the language. Other examples I can think of are when Tor was still in C (and they switched to Rust for good reasons, pretty successfully), and there are similar projects in Java and Python, for example. But I wouldn’t use those, because even though they are written in a memory-safe language, I’d trust Tor more.

                                                I think the usual reason why a project is started in a certain language is this: a developer has a number of values, usually also found in the project. At the time the project is started, people with these values gravitate towards a language or a set of them. So when you look at a project and the chosen language, you usually see a time capsule of how languages were viewed at that time. But these views change. A project is started in C today for different reasons than 10, 20, 30 years ago. The same is true for Python and Go, and even Rust has already seen that shift. While “more secure than C/C++, yet more performant” is a constant theme, things like WebAssembly, adoption by projects, and even changes in the language are factors that make people use or not use Rust for new projects (anymore).

                                                1.  

                                                  Would you rather build your company on a real life kernel connected to the internet written in C or in Rust in the here and now?

                                                  That’s rather easy since there is no mature rust based kernel that had similar (same order of magnitude) man hours spent on it as linux or even *bsd.

                                                  1.  

                                                    That’s the very point I am trying to make. The language is not the defining factor of a piece of software. As you mention, the hours spent on a piece of software are a way bigger factor in this case.

                                                    1.  

                                                      I just said that I would prefer a kernel in Rust that had X hours of development over a kernel in C that had 10X hours of development. How is that not a defining factor? :)

                                                      A good real-world example of that is ack vs ripgrep, mentioned in other threads. Both tools solve the same problem, yet it’s no surprise that ripgrep is significantly faster. And that’s with (as far as I understand) significant parts of ack being written in C (the Perl regex engine). Imagine how much slower it would be in pure Perl. Another dimension in which these tools differ in a predictable way is distribution. I can get a statically linked binary of ripgrep with zero dependencies that runs on Linux. With ack there is no such option.

                                                      You can claim that technically you can have memory safety, performance and easy distribution in any language. That’s technically true, but in practice it’s not the case and you can predict quite well how different tools will handle those dimensions based on the implementation language they are written in.

                                                      1.  

                                                        That’s also not something I claim.

                                          1.  

                                            Half-truth. Let someone produce a really fast ripgrep replacement in TypeScript and I’ll agree.

                                            1. 6

                                              Before ripgrep there was ack. It is fast and written in Perl. It never had the same hype as ripgrep, but was always a “better” grep.

                                              1.  

                                                I used ack. I didn’t feel it was as fast as rg.

                                                1.  

                                                  And between the two, there was ag, written in… C.

                                                  1.  

                                                    Not as fast, but fast enough, and more importantly it has better defaults than plain grep.

                                                    1.  

                                                      Right; I’ve never once in my life written a program where the difference in speed between ack and ripgrep would have been noticeable by the end user.

                                                      There are plenty of correctness-based reasons to prefer Rust over Perl, but for an I/O-bound program, the speed argument is very rarely relevant.

                                                      1. 7

                                                        The difference of 8s is easily noticeable:

                                                        $ time sh -c 'rg test linux-6.2.8 | wc -l'
                                                        113187
                                                        ________________________________________________________
                                                        Executed in   13,69 secs    fish           external
                                                           usr time    1,61 secs  781,00 micros    1,61 secs
                                                           sys time    4,86 secs  218,00 micros    4,86 secs
                                                        
                                                        $ time sh -c 'ack test linux-6.2.8 | wc -l'
                                                        113429
                                                        ________________________________________________________
                                                        Executed in   21,07 secs    fish           external
                                                           usr time   12,82 secs    1,13 millis   12,81 secs
                                                           sys time    4,25 secs    0,00 millis    4,25 secs
                                                        
                                                        1.  

                                                          I dunno… I abuse ripgrep enough in enough various situations that I would probably notice if it got slower. But that also might lead to me being more careful about my workflow, instead of shoving gigabytes of data into it and saying “lol cpu go brrrr”.

                                                1. 4

                                                  I am so hyped. I work on embedded devices that run QNX. We have huge, lumbering C++ codebases. For the past couple of years, I’ve been building Rust skills and tooling. This stuff now runs on QNX. Not only will this make our systems safer and more reliable, it will also save us so, so much time as safe Rust cannot cause memory corruptions or segfault and it almost never leaks memory. All of this with the same performance as C++.

                                                  1. 1

                                                    it almost never leaks memory.

                                                    I wouldn’t try to sell rust to the team/org using that particular claim. So-called modern c++ leaks memory about as often as rust. It’s much better to focus on claims that are visible and easy to verify, like prevention of memory corruption, data races and undefined behaviour, when convincing others :)

                                                    1. 3

                                                      I say “almost never leaks memory” because technically speaking, safe Rust does not guarantee freeing all the memory. In practice, however, memory leaks in Rust are extremely rare due to lifetime management at compile time. It pretty much never happens because the compiler will complain if you mismanage the lifetimes of your objects. You automatically do it right because the language deliberately makes it hard to do it wrong.

                                                      This is in stark contrast to C++, where it is indeed true that std::unique_ptr (etc.) cleans up automatically, but the language does literally nothing to prevent you from just allocating memory willy-nilly and not freeing it.

                                                      Saying that modern C++ leaks memory just as often as Rust is like saying that if you always brake carefully and in time, you travel just as safely as if you wore a seatbelt. Which might technically be true, but it does not reflect reality because humans make mistakes.

                                                      1.  

                                                        A very easy way to leak memory in Rust is to have a cyclical reference with an Arc without having the other side as a Weak. This makes building classic Java style trees a bit cumbersome, if you need to traverse in both directions.

                                                        A good structure is to just use flat vectors for your tree, but if you come from Java, a cyclical reference count is quite a common leak to make.
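
                                                        A minimal sketch of such a cycle (hypothetical Node type, not from any particular codebase): the parent holds a strong Arc to the child and the child holds a strong Arc back to the parent, so neither count can ever reach zero.

                                                        use std::cell::RefCell;
                                                        use std::sync::Arc;

                                                        // Hypothetical tree node: both directions use strong Arc references.
                                                        // The fix would be to store the parent as a std::sync::Weak<Node>.
                                                        struct Node {
                                                            parent: RefCell<Option<Arc<Node>>>,   // leaks: should be Weak<Node>
                                                            children: RefCell<Vec<Arc<Node>>>,
                                                        }

                                                        fn main() {
                                                            let parent = Arc::new(Node {
                                                                parent: RefCell::new(None),
                                                                children: RefCell::new(Vec::new()),
                                                            });
                                                            let child = Arc::new(Node {
                                                                parent: RefCell::new(Some(Arc::clone(&parent))),
                                                                children: RefCell::new(Vec::new()),
                                                            });
                                                            parent.children.borrow_mut().push(Arc::clone(&child));
                                                            // When `parent` and `child` go out of scope, each node still has a
                                                            // strong count of 1 via the other, so the heap memory is never freed.
                                                        }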

                                                        1. 3

                                                          I don’t know what your experience is with both languages, but as far as I’m concerned allocating a pointer and losing track of it is not the typical leak that I had to deal with (more common in c++ but possible in safe rust). In fact, just last week I had to fix a leak caused by an incorrect reference count in an Arc, due to a bug in an unsafe part using MaybeUninit 🙃

                                                          In my experience, most of the time the reason for constantly increasing memory usage (in both rust and c++) is some long-lived set or map that someone forgot to remove entries from (sketched below). Since such space leaks are much more common and are equally likely in both languages, I wouldn’t try to sell rust as a leak-preventing language.

                                                          This is in big contrast to stuff like preventing use-after-free and other UB - I’ve been writing rust professionally for the last 3 years and I haven’t yet had to deal with any memory corruption or data race, which were recurring issues when I was working in c++ codebases previously.
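
                                                          A minimal sketch of that kind of space leak (hypothetical names): a long-lived map that only ever gains entries. It compiles and runs happily in safe Rust, and nothing in the language flags it.

                                                          use std::collections::HashMap;

                                                          struct SessionCache {
                                                              by_user: HashMap<u64, Vec<u8>>,
                                                          }

                                                          impl SessionCache {
                                                              fn on_login(&mut self, user_id: u64, session: Vec<u8>) {
                                                                  self.by_user.insert(user_id, session);
                                                              }
                                                              // on_logout was never written, so entries accumulate for the
                                                              // lifetime of the process - the "leak" is a missing remove(),
                                                              // not a lost pointer.
                                                          }

                                                          fn main() {
                                                              let mut cache = SessionCache { by_user: HashMap::new() };
                                                              for user_id in 0..100_000u64 {
                                                                  cache.on_login(user_id, vec![0u8; 1024]); // usage only grows
                                                              }
                                                              println!("live sessions: {}", cache.by_user.len());
                                                          }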

                                                          1.  

                                                            I don’t know what is your experience with both languages, but as far as I’m concerned allocating a pointer and losing track of it is not the typical leak that I had to deal with

                                                            Interesting. That is precisely the type of leak that I most commonly encounter in C++ and pretty much never in Rust.

                                                            In my experience, most of the time the reason for constantly increasing memory usage (both rust and c++) is some long lived set or map that someone forgot to remove entries from.

                                                            I’m talking about leaks where you lose or drop the pointer/handle to the resource. Of course there’s always the “endlessly growing map” kind of leak, but that cannot be prevented by any language, because the programmer deliberately grows the data structure. It’s an entirely different class of leak, and one that is way easier to detect because you still have a handle to the data structure.

                                                            I agree with you that the main benefit of Rust is the safety that comes from the absence of memory corruptions and data races. And yes these issues appear in pretty much every non-trivial C++ code base no matter how skilled the programmers are.

                                                    1. 41

                                                      This is incredible. I think it’s mostly large, established businesses where this is a thing. On the other end of the spectrum you have the overworked startup workers who have to work lots of overtime, and the struggling underpaid freelancers.

                                                      1. 20

                                                        I’m not convinced it’s size so much as location of the department within the company and whether that department is on a critical revenue path. I mean it’s hard to imagine this at a tiny to small (<25 headcount) company but such a company won’t really have peripheral departments as such and would somehow need to be simultaneously very dysfunctional and also successful to sustain such a situation.

                                                        The original author keeps talking about “working in tech” but the actual jobs listed as examples suggest otherwise: “software developer [at] one of the world’s most prestigious investment banks”, “data engineer for one the world’s largest telecommunications companies”, “data scientist for a large oil company”, “quant for one of the world’s most important investment banks.”

                                                        First off, these are not what I’d personally call the “tech industry”.

                                                        More importantly, these don’t sound like positions which are on a direct critical path to producing day-to-day revenue. Similarly importantly, they’re also not exactly cost centres within the company, whose productivity is typically watched like a hawk by the bean counters. Instead, they seem more like long-term strategic roles, vaguely tasked with improving revenue or profit at some point in the future. It’s difficult to measure productivity here in any meaningful way, so if leadership has little genuine interest in that aspect, departmental sub-culture can quickly get bogged down in unproductive pretend busywork.

                                                        But what do I know, I’ve been contracting for years and am perpetually busy doing actual cerebral work: research, development, debugging, informing decisions on product direction, etc.. There’s plenty of that going on if you make sure to insert yourself at the positions where money is being made, or at least where there’s a product being developed with an anticipation of direct revenue generation.

                                                        1. 12

                                                          I’ve seen very similar things at least twice in small companies (less than a hundred people in the tech department). In both cases, Scrum and Agile (which had nothing to do with the original manifesto but this is how it is nowadays) were religion, and you could see this kind of insane inefficiency all the time. But no one but a handful of employees cared about it and they all got into trouble.

                                                          From what I’ve seen, managers love this kind of structure because it gives them visibility, control and protection (“everyone does Scrum/Agile so it is the right way; if productivity is low, let’s blame employees and not management or the process”). Most employees (managers included) also have no incentive to be more productive: you do not get more money, and you get more work (and more expectations) every single time. So yes, the majority will vocally announce that a 1-hour task is really hard and will take a week. Because why would they do otherwise?

                                                          Last time I was in this situation, I managed to sidestep the problem by joining a new tiny separate team which operated independently and removed all the BS (Scrum, Agile, standups, reviews…) and in general concentrated on getting things done. It worked until a new CTO fired the lead and axed the team for political reasons but this is another story.

                                                          1. 9

                                                            It worked until a new CTO fired the lead and axed the team for political reasons but this is another story.

                                                            I’m guessing maybe it isn’t: a single abnormally productive team potentially makes many people look very bad, and whoever leads the team is therefore dangerous and threatens the position of other people in the company without even trying. I’d find it very plausible that the productivity of your team was the root cause of the political issues that eventually unfolded.

                                                            1. 4

                                                              This was 80% of the problem indeed. When I said it was another story, I meant that this kind of political game was unrelated to my previous comments on Scrum/Agile. Politics is everywhere, whether the people involved are productive or not.

                                                              1. 14

                                                                It’s not just a question of people not wanting to “look bad,” though.

                                                                As a professional manager, about 75% of my job is managing up and out to maintain and improve the legibility of my team’s work to the rest of the org. Not because I need to build a happy little empire, but because that’s how I gain evidence to use when arguing for the next round of appealing project assignments, career development, hiring, and promotions for my team.

                                                                That doesn’t mean I need to invent busywork for them, but it does mean that random, well-intentioned but poorly-aimed contributions aren’t going to net any real org-level recognition or benefit for the team, or that teammate individually. So the other 25% of my energy goes to making sure my team members understand where their work fits in that larger framework, how to gain recognition and put their time into engaging and business-critical projects, etc., etc.

                                                                …then there’s another 50% of my time that goes to writing: emails to peers and collaborators whose support we need, ticket and incident report updates, job listings, performance evaluations, notes to myself, etc. Add another 50% specifically for actually thinking ahead to where we might be in 9-18 months and laying the groundwork for staff development and/or hiring needed to have the capacity for it, as well as the design, product, and marketing buy-in so we aren’t blocked asking for go-to-market help.

                                                                Add up the above and you can totally see why middle managers are useless overhead who contribute nothing, and everyone would be better off working in a pure meritocracy without anyone “telling them what to do.”

                                                                1. 3

                                                                  omg, I’ve recently worked at a ‘unicorn’ where everyone was preoccupied with how their work would look from the outside and whether it would improve their ‘promo package’. Never before have I worked in a place so full of buzzword-driven projects that barely worked. But hey, you need one more cross-team project with dynamodb to get that staff eng promo! 🙃 < /rant>

                                                                  1. 1

                                                                    Given your work history (from your profile), have you seen an increase in engineers being willfully ignorant about how their pet project does or does not fit into the big picture of their employer?

                                                                    I ask this from having some reports who, while quite sharp, over half the time cannot be left alone to make progress without getting bogged-down in best-practices and axe-sharpening. Would be interested to hear how you’ve handled that, if you’ve encountered it.

                                                                    1. 6

                                                                      I don’t think there’s any kind of silver bullet, and obviously not everyone is motivated by pay, title, or other forms of institutional recognition.

                                                                      But over the medium-to-long term, I think the main thing is to show consistently and honestly how paying attention to those drivers gets you more of whatever it is you want from the larger org: autonomy, authority, compensation, exposure in the larger industry, etc.

                                                                      Folks who are given all the right context, flexibility, and support to find a path that balances their personal goals and interests with the larger team and just persistently don’t are actually performing poorly, no matter their technical abilities.

                                                                      Of course, not all organizations are actually true to the ethos of “do good by the team and good things will happen for you individually.” Sometimes it’s worth going to battle to improve it; other times you have to accept that a particular boss/biz unit/company is quite happy to keep making decisions based on instinct and soft influence. (What to do about the latter is one of the truly sticky + hard-to-solve problems for me in the entire field of engineering management, and IME the thing that will make me and my team flip the bozo bit hard on our upper management chain.)

                                                                      1. 1

                                                                        Would you be able to elaborate on the last paragraph about making decisions based on instinct and soft influence? Why is it a problem and what do you mean by “soft influence” in particular? Quite interested to understand more.

                                                                        1. 2

                                                                          Both points (instinct + soft influence) refer to the opposite of “data-driven” decision-making. I.e., “I know you and we think alike” so I’m inclined to support your efforts + conclusions. Or conversely, “that thing you’re saying doesn’t fit my mental model,” so even though there are processes and channels in place for us to talk about it and come to some sort of agreement, I can’t be bothered.

                                                                          It’s also described as “type 1” thinking in the Kahneman model (fast vs. slow). Not inherently wrong, but also very prone to letting bias and comfort drown out actually-critical information when you’re wrestling with hard choices.

                                                                          Being on the “supplicant” end and trying to use facts to argue against unquestioned biases is demoralizing and often pointless, which is the primary failure mode I was calling out.

                                                                          1. 3

                                                                            This is true and relevant, but it’s also key to point out why instinct-driven decisions are preferred in so many contexts.

                                                                            By comparison, data-driven decision-making is slower, much more expensive, and often (due to poor statistical rigor) no better.

                                                                            Twice in my career I have worked with someone whose instincts consistently steered the team in the right direction, and given the option that’s what I’d always prefer. Both of these people were kind and understanding to supplicants like me, and - with persistence - could be persuaded to see new perspectives.

                                                                            1. 4

                                                                              Excellent points! Claiming to be “data driven” while cherry-picking the models and signals you want is really another form of instinctive decision-making…but also, the time + energy needed to do any kind of science in the workplace can easily be more than you (individually or as a group) have to give.

                                                                              If you have collaborators (particularly in leadership roles) with a) good instincts, b) the willingness to change their mind, and c) an attitude of kindness towards those who challenge their answers, then you have found someone worth working with over the long-term. I personally have followed teammates who showed those traits between companies more than once, and aspire to at least very occasionally be that person for someone else.

                                                                            2. 1

                                                                              That’s helpful, thanks for clarifying.

                                                                          2. 0

                                                                            Thanks for the reply, that’s quite helpful and matches a lot of what’s been banging around in my head.

                                                                          3. 3

                                                                            I ask this from having some reports who, while quite sharp, over half the time cannot be left alone to make progress without getting bogged-down in best-practices and axe-sharpening.

                                                                            I think this is part of the natural life-cycle of the software developer - the majority of developers I’ve known have had an extended period where this was true, usually around 7-10 years professional experience.

                                                                            This is complicated by most of them going into management around the 12-year mark, meaning that only 2-3 years of their careers combine “experienced enough to get it done” with “able to regulate their focus to a narrow target”.

                                                                            1. 2

                                                                              I think those timelines have been compressed these days. For better or worse, many people hold senior or higher engineering roles with significantly fewer than 7-10 years experience.

                                                                              My experience suggests that what you’ve observed still happens - just with less experience behind the best-practices and axe-sharpening o_O

                                                                    2. 3

                                                                      My team explicitly doesn’t use scrum and all the other teams are asking: “How then would you ever get anything done?”

                                                                      Well… a lot better.

                                                                  1. 4

                                                                    Having to remember to write defer and knowing which function should be used to free the returned resource is a very error prone approach.

                                                                    Sadly a less error-prone solution doesn’t seem to be worked on, judging by this issue: https://github.com/ziglang/zig/issues/782

                                                                    1. 4

                                                                      That’s fair, but it’s not like other solutions are strictly better either.

                                                                      RAII-heavy projects can hide fun surprises in an object’s destructor, meaning that you still have to remember and know which object does what when it goes out of scope, and even when you manage to avoid bugs, you can end up with programs that take forever to shut down because every single object in the system wants to run its own destructor.

                                                                      I personally love defer because in Zig allocations are always explicit: if a function takes in an allocator, you know you will have to eventually free whatever it gives you back.

                                                                      Even better than defer is errdefer, which makes resource cleanup very natural. Taken from the Zig language reference:

                                                                      fn createFoo(param: i32) !Foo {
                                                                          const foo = try tryToAllocateFoo();
                                                                          // now we have allocated foo. we need to free it if the function fails.
                                                                          // but we want to return it if the function succeeds.
                                                                          errdefer deallocateFoo(foo);
                                                                      
                                                                          const tmp_buf = allocateTmpBuffer() orelse return error.OutOfMemory;
                                                                          // tmp_buf is truly a temporary resource, and we for sure want to clean it up
                                                                          // before this block leaves scope
                                                                          defer deallocateTmpBuffer(tmp_buf);
                                                                      
                                                                          if (param > 1337) return error.InvalidParam;
                                                                      
                                                                          // here the errdefer will not run since we're returning success from the function.
                                                                          // but the defer will run!
                                                                          return foo;
                                                                      }
                                                                      

                                                                      So, from my perspective, you can get right or mess up both RAII and defer resource management, so I’m not convinced either is inherently superior and it all boils down to which tool you can use best based on your programming style.

                                                                      1. 4

                                                                        RAII-heavy projects can hide fun surprises in an object’s destructor, meaning that you still have to remember and know which object does what when it goes out of scope, and even when you manage to avoid bugs,

                                                                        None of this is in any way specific to RAII. It applies equally to ‘defer heavy projects’ or any other system of releasing resources. Although, I have to admit I’m not sure what you mean by ‘heavy’ in this context. Is a non-heavy project one where not all resources are properly freed?

                                                                        you can end up with programs that take forever to shut down because every single object in the system wants to run its own destructor.

                                                                        Have to say that I’ve never seen a project where RAII was a reason for a slow shutdown. Do you have any specific example where replacing RAII with manual resource deallocation in callsites significantly improved shutdown times?

                                                                        I personally love defer because in Zig allocations are always explicit: if a function takes in an allocator, you know you will have to eventually free whatever it gives you back.

                                                                        But you don’t know at callsite which allocator was actually used. Just because you passed to the callee some allocator, it doesn’t mean that what you get in return was allocated with it.

                                                                        Even better than defer is errdefer, which makes resource cleanup very natural.

                                                                        I would say that this example is great for showing why RAII is better and (err)defer just adds complexity. With an RAII-like approach, you:

                                                                        • don’t have to know how allocation was made
                                                                        • don’t have to add code in every callsite to deallocate
                                                                        • don’t have to distinguish returning errors from ‘ok’ values

                                                                        An example: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=37007e0cb6b15bea2f46234316e66404
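
                                                                        For anyone not opening the link, a minimal sketch of the same idea in Rust (hypothetical names, not necessarily the playground code): cleanup lives in Drop, so both the error return and the success return need no extra annotation at the callsite.

                                                                        struct Foo;

                                                                        impl Drop for Foo {
                                                                            fn drop(&mut self) {
                                                                                // Runs automatically on whichever path still owns Foo when it dies.
                                                                                println!("freeing Foo");
                                                                            }
                                                                        }

                                                                        fn create_foo(param: i32) -> Result<Foo, String> {
                                                                            let foo = Foo;                  // acquired; no errdefer-style annotation needed
                                                                            let _tmp_buf = vec![0u8; 1024]; // temporary; freed on every exit path
                                                                            if param > 1337 {
                                                                                return Err("invalid param".into()); // foo and _tmp_buf are dropped here
                                                                            }
                                                                            Ok(foo) // foo is moved out to the caller; only _tmp_buf is dropped
                                                                        }

                                                                        fn main() {
                                                                            let _ok = create_foo(1);
                                                                            let _err = create_foo(9000);
                                                                        }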

                                                                        1. 3

                                                                          None of this is in any way specific to RAII. It applies equally to ‘defer heavy projects’ or any other system of releasing resources.

                                                                          Unlike defer, RAII runs code that is not explicitly written in the function. This means you have to chase down 1) the type, 2) its declaration, 3) its Drop impl, instead of just following the deferred function. For Rust in particular, it also doesn’t help that let _x = foo() drops the value at the end of the scope while let _ = foo() drops it immediately.
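
                                                                          A tiny sketch of that footgun (hypothetical Guard type):

                                                                          struct Guard(&'static str);

                                                                          impl Drop for Guard {
                                                                              fn drop(&mut self) {
                                                                                  println!("dropping: {}", self.0);
                                                                              }
                                                                          }

                                                                          fn main() {
                                                                              let _x = Guard("kept until end of scope"); // bound to _x, dropped at the end of main
                                                                              let _ = Guard("dropped immediately");      // not bound at all, dropped right here
                                                                              println!("still in main");
                                                                              // prints: "dropping: dropped immediately", "still in main",
                                                                              //         "dropping: kept until end of scope"
                                                                          }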

                                                                          I’ve never seen a project where RAII was a reason for a slow shutdown

                                                                          Anecdotally, I’ve had something similar. Async rust implements cancellation by reusing the Future’s RAII destructor (Drop). This method is synchronous and requires cleanup to block the thread properly or it risks being unsound. When tons of similar Futures on a shared resource cancel at the same time, the contention may cause them to block and the destructor (which you never call explicitly when using the Future) can block for a while. Contrast this with Zig’s async which allows asynchronous operations in defer due to it being just code insertion. This allows cleanup to be asynchronous and not block.

                                                                          But you don’t know at callsite which allocator was actually used.

                                                                          Passing around allocators is a convention in Zig. Similar to how there’s also a convention to pass the allocator on cleanup. Zig calls these Unmanaged Containers where “Managed” ones are built on-top for convenience.

                                                                          don’t have to know how allocation was made

                                                                          Zig programmers consider this a downside. Because they are more conscious about how memory is used, it’s preferable to know how to correctly/efficiently release it. When you don’t need to know, you can use the Managed containers, which hold the initial allocator, and still use defer to call their destructor to clean up/dealloc properly.

                                                                          don’t have to add code in every callsite to deallocate

                                                                          As noted before, object destructors can be error-prone and (more importantly) can do more than just free memory. It’s helpful for a good chunk of people to be able to easily trace what’s going on there, e.g. during code review when there’s no LSP.

                                                                          don’t have to distinguish returning errors from ‘ok’ values. An example

                                                                          The example uses move semantics to not run a destructor by returning the object. If you want to run arbitrary code on errdefer without having to return objects and pollute the callsite, this doesn’t work. Personally, this is also a plus for Zig.

                                                                          1. 3

                                                                            But you don’t know at callsite which allocator was actually used. Just because you passed to the callee some allocator, it doesn’t mean that what you get in return was allocated with it.

                                                                            That’s maybe how people do it in other languages; in Zig, library functions / types are expected to never make up their own allocator and to allocate only with the one provided as input.

                                                                            • don’t have to know how allocation was made
                                                                            • don’t have to distinguish returning errors from ‘ok’ values

                                                                            None of these things is considered inherently good in Zig, especially the second one: in Zig you have the opportunity of always handling OOM. Not everybody will want to do it, but it’s feasible so, if anything, the opposite is true: always having the opportunity to handle allocation failure as an error is a good thing.

                                                                            1. 5

                                                                              Thats maybe how people do it in other languages, in Zig library functions / types are expected to never make up their own allocator and only allocate with the one provided as input.

                                                                              If everyone wrote what is expected, we wouldn’t have bugs. Sadly this is not possible, and also sadly Zig fails to prevent such errors at compile time.

                                                                              always having the opportunity to handle allocation failure as an error is a good thing.

                                                                              Did I ever say anything about not handling allocation failures? All I said and showed in that example is that RAII handles exiting a function in both “ok” and “not ok” cases equally well, without requiring explicit annotations.

                                                                        2. 3

                                                                          Sadly a less error prone solution doesn’t seem to be worked on judging by this pr

                                                                          My gut feeling for the “zig way” is that the less error prone solution is to avoid resource management to begin with. You can’t avoid 100% of resource management, but my gut feel is that, with a significant design effort, it’s possible to avoid maybe 90% on average?

                                                                        1. 2

                                                                          A reminder: you often don’t need much of the OS at all if you link statically or include the libraries you absolutely need in your container.

                                                                          1. 3

                                                                            Buildah (unlike Docker) makes it trivial to start with an empty base image. This lets you put the absolute minimum that you need in the container. It also doesn’t create container layers implicitly, so you can do the pip command to build all of the things you need, then the pip command to remove the toolchain and any .o files, and not end up with a load of layers that add the temporary things. And, my personal favourite, it has a copy command that copies from another container, so you can create a container that contains the build tools, build the thing, and then create a new container and (with a single command) copy from the build container to the deployment one.

                                                                            1. 2

                                                                              Isn’t that identical to what you’d do with multi-stage docker builds? FROM ubuntu AS builder, FROM scratch, COPY --from=builder source target?

                                                                              1. 1

                                                                                Ah, you’re right. I misremembered, the missing thing in Docker is the opposite of that: copying out of a container. I don’t think that’s possible with the Docker model, where the Dockerfile is evaluated inside the new container. It is with buildah, where the equivalent is a script that runs outside. I use this a lot for building statically linked tools that I run outside of containers (especially ocaml things that scatter things everywhere during the build), so I can build in a clean environment and then throw it away at the end.

                                                                                1. 2

                                                                                  Yeah, Dockerfiles are made with the intention of the image being the artefact. To achieve this with docker you would have to actually run the container and use mounts to copy files out of it.

                                                                                  1. 2

                                                                                    Or use:

                                                                                    docker save image | tar -xO --wildcards "*.tar" | tar -xO path/to/file/you/want
                                                                                    
                                                                              2. 1

                                                                                Afair, at one of my previous jobs we used docker, tini, multi stage builds and just copied the resulting binary plus any required shared libs. It was pretty easy and resulted in minimal images.

                                                                            1. 3

                                                                               That is a misleading title. The actual post content only shows how to do basic multiplication using Rust and Metal. There is nothing about how to make an FFT faster with those technologies :(

                                                                              1. 2

                                                                                I really liked that parallel+jq solution. Really nice! I also had to do it on my own and it meant rust. Here’s a quick brain dump of what I did: https://tilde.cat/posts/analyzing-multi-gigabyte-json-files-locally-with-serde/

                                                                                 I also had to write a minimal site generator to avoid writing HTML by hand for the second time in one week; it was a busy weekend ;)

                                                                                1. 1

                                                                                  Welcome to Lobsters! Self-posts are fine but avoid putting commentary in the text field, as per the submission guidelines (seen when you go to submit):

                                                                                  When submitting a URL, the text field is optional and should only be used when additional context or explanation of the URL is needed. Commentary or opinion should be reserved for a comment, so that it can be voted on separately from the story.

                                                                                  1. 2

                                                                                    Anyway, I rewrote llama.cpp in Rust so that it’s easier for me to embed it in my projects. It was fun, and learned a lot by doing it. Happy to answer questions!

                                                                                    Seems like context/explanation to me.

                                                                                    1. 1

                                                                                       That context/explanation is not necessary to understand the ‘title’ and is available to anyone who actually opens the link.

                                                                                      1. 1

                                                                                        Check the moderation log. These editorials get deleted all the time

                                                                                      2. 2

                                                                                        Oops, sorry about that! Will keep it in mind for next time.

                                                                                        1. 14

                                                                                          One thing that always bothered me about “programmer time is expensive; processors are cheap” is that it’s used to justify slow client software, when nobody is buying cheap processors for those who have to use it.

                                                                                          (And even if they did, it would still be irresponsible and unsustainable, in my opinion.)

                                                                                          1. 14

                                                                                            My issue is that all these threads turn into “we should learn from game devs, they care about performance”. But game developers pretty obviously are not doing any better, on average, than the people they’re sneering at – the industry is infamous for forcing customers onto a hardware upgrade treadmill and for ludicrous “minimum” hardware requirements on new titles. And they don’t even get the optimization of developer time out of it, because the industry is also infamous for extended “crunch” periods.

                                                                                            1. 11

                                                                                              At the same time, console game dev is one of the few places where so much time and effort is spent on optimizing consumer facing software, often with great results.

                                                                                           This is in contrast to my experience with most (if not all?) websites, where clicking something often results in hundreds if not thousands of milliseconds of delay.

                                                                                              1. 3

                                                                                                Console is on an upgrade treadmill, same as other areas of game dev.

                                                                                                And the kinds of high-impact high-rated big-name games you’re thinking of are, to be frank, a small fraction of the games industry as a whole. And the industry as a whole does not have a great track record on performance. As I mentioned last time around, the single best-selling video game of all time (Minecraft) infamously has a large, dedicated community maintaining third-party addons and mods to make its performance more bearable on average hardware.

                                                                                                1. 1

                                                                                             I don’t know what you are talking about with that upgrade treadmill. The fact that every 8 or so years you can buy better hardware doesn’t change the fact that during a given generation you can be pretty sure that the games you buy will work well on current-gen hardware. This is not something you can expect from typical user-facing software.

                                                                                             Let’s take as an example one of the biggest and richest companies - Google. Can I expect that Maps or Sheets will work at 30fps (not to mention 60fps) without hiccups on hardware that is ~6 years old? Of course not (and to a big extent that software is way simpler than realtime rendering of complex 3d scenes).

                                                                                                  This is not the case with console games, especially with first party studios. In those cases you can be almost 100% sure that current gen hardware will run the game flawlessly.

                                                                                             This is what people are talking about when they put game dev forward as an example of a subset of the industry that cares about perf. No one is claiming that every game dev cares about perf. You seem to be strawmanning, and I think you are well aware of this, so this is my last message in this topic.

                                                                                                  1. 3

                                                                                                    This is not something you can expect from typical user facing software.

                                                                                                    Apple pretty commonly supports its hardware for as long or longer; macOS Monterey is still receiving patches and supports hardware Apple manufactured literally 10 years ago. The iPhone/iPad ecosystem similarly is known for long support cycles.

                                                                                                    That doesn’t mean every new app for those platforms is designed to stay within the capabilities of ten-year-old hardware, of course, but as I keep pointing out games go through a hardware upgrade treadmill too. It’s slower in the console world but the treadmill is still there and, if anything, is harsher – an old computer may run new software with reduced performance, but when the console world moves on they often just don’t release a version for the prior generation at all (and plenty of top titles are exclusives with deals locking them to exactly a particular console).

                                                                                                    This is what people are talking about when they put game dev as an example of subset of industry that cares about perf.

                                                                                                    Many people in this thread and the last one have been treating performance as a moral crusade. See, for example, other comments in this thread like “user time is sacred”. If console developers could require you to get a RAM or GPU or SSD add-on to run their games the way PC game developers can, they absolutely would do that without a second thought. We know this because PC game developers already do that. There’s no moral issue for them – the console devs aren’t carefully thinking about how every CPU cycle is a moral affront that steals time from the user, they’re thinking about it as a practical thing imposed on them by the hardware they’re targeting.

                                                                                                    No one is claiming that every game dev cares about perf.

                                                                                                    Well, here’s an example from the last thread where I brought up Minecraft and some other examples in response to someone who was claiming that:

                                                                                                    in a game, if you can’t keep to your frame budget (say, make 60 FPS on a modern PC, where nano/microseconds can add up) then that can lead to poor reviews, and significant loss of potential revenue

                                                                                                    This person wanted to generalize game dev not just to “cares about performance” but must care about it. Yet that’s just completely wrong. So I don’t know how you can reasonably claim I’m “strawmanning”.

                                                                                              2. 5

                                                                                           Even in games, most people aren’t focused on performance; that’s just one topic. Lots of games also don’t compete on performance metrics. Some games’ value proposition is very cool graphics, so those games are actually competing on performance. It’s one of the few areas where customers are actually showing up to pay for the more performant thing. That stuff gets noticed by the lobsters crowd, but a lot of other stuff like dialogue systems doesn’t get noticed here as much, even though game devs love that stuff. And lots of people outside of games do care about performance. Performance tends to get talked about more when it’s easy to tie performance to revenue, and games is a domain where it’s often clear how performance relates to revenue, because in games, performance is often the product.

                                                                                                1. 3

                                                                                                  Yup lol. When I hear ‘gamers are focused on performance’ I can’t help but think ‘but are they focused on user experience?’: https://arstechnica.com/gaming/2021/03/hacker-reduces-gta-online-load-times-by-over-70-percent/

                                                                                                2. 11

                                                                                                  Programmer time is expensive.
                                                                                                  CPU time is cheap.
                                                                                                  User time is sacred.

                                                                                             Many, possibly most, programs have many more users than they do developers. Saving only one second a day per user can amount to a huge benefit, but we often don’t pay attention because we can’t multiply. Likewise, while one CPU is cheap, the number of machines that need to be updated because some popular program has chosen Electron definitely is not.
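
                                                                                             For instance, with a hypothetical 1,000,000 daily users: 1 s/day × 1,000,000 users × 365 days ≈ 365,000,000 s, which is roughly 11.5 person-years of user time burned every year by that single second.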

                                                                                                  1. 3

                                                                                                    Saving only one second a day per user can amount to a huge benefit, but we often don’t pay attention because we can’t multiply

                                                                                                    Or we can multiply, but we also realize that in many fields of programming we are rarely presented with such clean “this saves one second per day for every user of the software, with no other consequences or side effects” decisions to make.

                                                                                                    Remember: programmers are a finite resource. Assigning a programmer or team of programmers to do Task A means there are fewer available to be assigned to Tasks B, C, D, E, etc. Which is why, despite everyone hating it, we spend so much time in meetings where we try to prioritize different things that need to be done and estimate how long it will take to do them. And that is just the beginning of a complex web of tradeoffs involved in designing and building and shipping software.

                                                                                                    If you want to have, say, a rule that no new feature can be added until every existing feature has been optimized to a performance level than which no greater can be achieved, then you are of course welcome to run your teams that way. I don’t think you’re going to get very far doing it, though, because of your immensely long dev time for even small things.

                                                                                                    Which means that sooner or later you will have to decide on a level of performance that is less than the theoretical ideal maximum, but still acceptable, and aim for that instead.

                                                                                                    And we both know that you and everyone else who claims to “care about performance” already did that. So really the debate is not whether people care about “performance” or value the user’s time or whatever. It’s a debate about where, on a complex multi-axis spectrum of tradeoffs, you’ve decided to settle down and be content. But that doesn’t sound as pure and noble and as satisfyingly moralizing as making absolute proclamations that you care about these things and everyone else either doesn’t or is incompetent or both.

                                                                                                    But we both know the truth, and no amount of absolutist moralizing changes it.

                                                                                                    1. 2

                                                                                                      There’s a reason for grandstanding: the incentives of the programmer (or the programmer’s company) are often misaligned with the interest of the end user. Especially when the end user is locked in this particular software, there’s network effects, or switching costs… or just how performance looks before you’ve even tried the software. So yeah, the dev gonna prioritise. The question is for whom?

                                                                                                      Or we can multiply, but we also realize that in many fields of programming we are rarely presented with such clean “this saves one second per day for every user of the software, with no other consequences or side effects” decisions to make.

                                                                                                 Correct. The actual savings tend to be probabilistic (they affect fewer users), much larger (up to a freeze or crash), and fixing those is never without consequences… though if those consequences are too scary that would indicate a low-quality code base that should probably be refactored first thing in the morning, because at this point your whole development cycle has seriously slowed down.

                                                                                                    2. 3

                                                                                                      Affirming this. My current client is willing to burn years of developer time on shaving literally one second off of an employee process because one second of employee time scaled over their nation-wide business works out to hundreds of millions in additional profit.

                                                                                                      1. 2

                                                                                                        This advice is applicable to any widely-deployed software.

                                                                                                        It does not apply to one-off scripts, except it actually does. You merely need to place the writer of the one-off script in the “user” slot.

                                                                                                        1. 2

                                                                                                          Biiiingo. There’s probably a blogpost in there somewhere, but I really just want to say thank you for highlighting what I’ve attempted (evidently poorly) to articulate elsewhere.

                                                                                                        2. 6

                                                                                                          It was also true more or less between 1990 and 2010, when powerful desktop hardware got 50% faster every year. People were just getting good at working with the implications of that when computers stopped getting trivially faster and battery-powered devices became way more important. I certainly wouldn’t call it true anymore; there’s plenty of people out there paying both developer salaries and AWS bills who will tell you how expensive processors are.

                                                                                                        1. 18

                                                                                                          Do you have any more information on the project? This is a bit light.

                                                                                                          1. 3

                                                                                                            I haven’t shared the open source project publicly yet, but I plan to later this year.

                                                                                                            This thread has some example code and a link for more info if you’re interested (some details have changed since): https://twitter.com/haxor/status/1618054900612739073

                                                                                                            And I wrote a related post about motivations here: https://www.onebigfluke.com/2022/11/the-case-for-dynamic-functional.html

                                                                                                            1. 18

                                                                                                              There is no static type system, so you don’t need to “emulate the compiler” in your head to reason about compilation errors.

                                                                                                              Similar to how dynamic languages don’t require you to “emulate the compiler” in your head, purely functional languages don’t require you to “emulate the state machine”.

                                                                                                              This is not how I think about static types. They’re a mechanism for allowing me to think less by making a subset of programs impossible. Instead of needing to think about whether s can be “hello” or 7, I know I only have to worry about s being 7 or 8. The compiler error just means I accidentally wrote a program where it is harder to think about the possible states of the program. The need to reason about the error means I had already made a mistake in reasoning about my program, which is the important thing. Fewer errors before the program is run doesn’t mean the mistakes weren’t made.

                                                                                                              I am not a zealot, I use dynamically typed languages. But I use them for problems where the degree of dynamism inherent in the problem means introducing the ceremony of program-level runtime typing is extra work, not because reading the compiler errors is extra work.

                                                                                                              This is very analogous to the benefits of functional languages you point out. By not having mutable globals the program is easier to think about, if s is 7 it is always 7.

                                                                                                              Introducing constraints to the set of possible programs makes it easier to reason about our programs.

                                                                                                              1. 4

                                                                                                                I appreciate the sentiment of your reply, and I do understand the value of static typing for certain problem domains.

                                                                                                                Regarding this:

                                                                                                                “making a subset of programs impossible”

                                                                                                                How do you know what subset becomes impossible? My claim is you have to think like the compiler to do that. That’s the problem.

                                                                                                                I agree there’s value in using types to add clarity through constraints. But there’s a cost for the programmer to do so. Many people find that cost low and it’s easy. Many others — significantly more people in my opinion — find the cost high and it’s confusing.

                                                                                                                1. 10

                                                                                                                  I really like your point about having to master several languages. I’m glad to be rid of a preprocessor, and languages like Zig and Nim are making headway on unifying compile-time and runtime programming. I disagree about the type system, though: it does add complexity, but it’s scalable and, I think, very important for larger codebases.

                                                                                                                  Ideally the “impossible subset” corresponds to what you already know is incorrect application behavior — that happens a lot of the time, for example declaring a “name” parameter as type “string” and “age” as “number”. Passing a number for the name is nonsense, and passing a string for the age probably means you haven’t parsed numeric input yet, which is a correctness and probably security problem.
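
                                                                                                                  As a concrete (if contrived) sketch of that point in C++ — the registerUser function below is made up purely for illustration — the swapped call simply never compiles, so that class of mistake cannot reach runtime:

                                                                                                                      #include <iostream>
                                                                                                                      #include <string>

                                                                                                                      // Hypothetical function: the parameter types encode what each argument means.
                                                                                                                      void registerUser(const std::string& name, int age) {
                                                                                                                          std::cout << name << " is " << age << "\n";
                                                                                                                      }

                                                                                                                      int main() {
                                                                                                                          registerUser("Ada", 36);     // fine
                                                                                                                          // registerUser(36, "Ada");  // rejected at compile time: an int is not a name,
                                                                                                                                                       // and a string literal is not an age
                                                                                                                      }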

                                                                                                                  It does get a lot more complicated than this, of course. Most of the time that seems to occur when building abstractions and utilities, like generic containers or algorithms, things that less experienced programmers don’t do often.

                                                                                                                  In my experience, dynamically-typed languages make it easier to write code, but harder to test, maintain and especially refactor it. I regularly make changes to C++ and Go code, and rely on the type system to either guide a refactoring tool, or at least to produce errors at all the places where I need to fix something.

                                                                                                                  1. 4

                                                                                                                    How do you know what subset becomes impossible? My claim is you have to think like the compiler to do that. That’s the problem.

                                                                                                                    You’re right that you have to “think like the compiler” to be able to describe the impossible programs for it to check, but everybody writing a program has an idea of what they want it to do.

                                                                                                                    If I don’t have static types and I make the same mistake, I will have to reason about the equivalent runtime error at some point.

                                                                                                                    I suppose my objection is framing it as “static typing makes it hard to understand the compiler errors.” It is “static typing makes programming harder” (with the debatably-worth-it benefit of making running the program easier). The understandability of the errors is secondary: if there is value, there’s still value even if the error was as shitty as “no.”

                                                                                                                    But there’s a cost for the programmer to do so. Many people find that cost low and it’s easy. Many others — significantly more people in my opinion — find the cost high and it’s confusing.

                                                                                                                    I think this is the same for “functionalness”. For example, often I find I’d rather set up a thread local or similar because it is easier to deal with than threading some context argument through everything.

                                                                                                                    I suppose there is a difference in the sense that being functional is not (as yet) a configurable constraint. It’s more or less on or off.

                                                                                                                    1. 3

                                                                                                                      I agree there’s value in using types to add clarity through constraints. But there’s a cost for the programmer to do so. Many people find that cost low and it’s easy. Many others — significantly more people in my opinion — find the cost high and it’s confusing.

                                                                                                                      I sometimes divide programmers into two categories: the first acknowledges that programming is a form of applied maths; the second went into programming to run away from maths.

                                                                                                                      It is very difficult for me to relate to the second category. There’s no escaping the fact that our computers ultimately run formal systems, and most of our job is to formalise unclear requirements into an absolutely precise specification (source code), which is then transformed by a formal system (the compiler) into a stream of instructions (object code) that will then be interpreted by some hardware (the CPU, GPU…) with more or less relevant limits & performance characteristics. (It’s obviously a little different if we instead use an interpreter or a JIT VM).

                                                                                                                      Dynamic type systems mostly allow scared-of-maths people to ignore the mathematical aspects of their programs for a bit longer, until of course they get some runtime error. Worse, they often mistake their should-have-been-a-type-error mistakes for logic errors, and then claim a type system would not have helped them. Because contrary to popular belief, type errors don’t always manifest as such at runtime. Especially when you take advantage of generics & sum types: they make it much easier to “define errors out of existence”, by making sure huge swaths of your data are correct by construction.
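
                                                                                                                      A minimal C++ sketch of the “correct by construction” idea (the Age type and its parse function are invented just for illustration): the only way to obtain an Age is through a successful parse, so every function that accepts an Age can stop worrying about unvalidated input.

                                                                                                                          #include <charconv>
                                                                                                                          #include <optional>
                                                                                                                          #include <string_view>

                                                                                                                          // Hypothetical correct-by-construction type: an invalid Age cannot exist.
                                                                                                                          class Age {
                                                                                                                          public:
                                                                                                                              static std::optional<Age> parse(std::string_view s) {
                                                                                                                                  int value = 0;
                                                                                                                                  auto [ptr, ec] = std::from_chars(s.data(), s.data() + s.size(), value);
                                                                                                                                  if (ec != std::errc{} || ptr != s.data() + s.size() || value < 0) {
                                                                                                                                      return std::nullopt;  // the error is handled once, at the boundary
                                                                                                                                  }
                                                                                                                                  return Age{value};
                                                                                                                              }

                                                                                                                              int value() const { return value_; }

                                                                                                                          private:
                                                                                                                              explicit Age(int v) : value_(v) {}
                                                                                                                              int value_;
                                                                                                                          };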

                                                                                                                      And the worst is, I suspect you’re right: it is quite likely most programmers are scared of maths. But I submit maths aren’t the problem. Being scared is. People need to learn.

                                                                                                                      My claim is you have to think like the compiler to do that.

                                                                                                                      My claim is that I can just run the compiler and see if it complains. This provides a much tighter feedback loop than having to actually run my code, even if I have a REPL. With a good static type system my compiler is disciplined so I don’t have to be.

                                                                                                                      1. 6

                                                                                                                        Saying that people who like dynamic types are “scared of math” is incredibly condescending and also ignorant. I teach formal verification and am writing a book on formal logic in programming, but I also like dynamic types. Lots of pure mathematics research is done with Mathematica, Python, and Magma.

                                                                                                                        I’m also disappointed but unsurprised that so many people are arguing with a guy for not making the “right choices” in a language about exploring tradeoffs. The whole point is to explore!

                                                                                                                        1. 3

                                                                                                                          Obviously people aren’t monoliths, and there will be exceptions (or significant minorities) in any classification.

                                                                                                                          Nevertheless, I have observed that:

                                                                                                                          • Many programmers have explicitly taken programming to avoid doing maths.
                                                                                                                          • Many programmers dispute that programming is applied maths, and some downvote comments saying otherwise.
                                                                                                                          • The first set is almost perfectly included in the second.

                                                                                                                          As for dynamic typing, almost systematically, arguments in favour seem to be less rigorous than arguments against. Despite SICP. So while the set of dynamic typing lovers is not nearly as strongly correlated with “maths are scary”, I do suspect a significant overlap.

                                                                                                                          While I do use Python for various reasons (available libraries, bignum arithmetic, and popularity among cryptographers (SAGE) being the main ones), dynamic typing has systematically hurt me more than it helped me, and I avoid it like the plague as soon as my programs reach non-trivial sizes.

                                                                                                                          I could just be ignorant, but despite having engaged in static/dynamic debates with articulate peers, I have yet to see any compelling argument in favour. I mean there’s the classic sound/complete dilemma, but non-crappy systems like F* or what we see in ML and Haskell very rarely stopped me from writing a program I really wanted to write. Sure, some useful programs can’t be typed. But for those, most static checking systems have escape hatches, and many programs people think can’t be typed actually can. See Rich Hickey’s transducers for instance. Throughout his talk he dismissively dared static programmers to type them, only to have a Haskell programmer actually do it.

                                                                                                                          There are of course very good arguments favouring some dynamic language at the expense of some static language, but they never survive a narrowing down to static & dynamic typing in general. The dynamic language may have a better standard library, the static language may have a crappy type system with lots of CVE inducing holes… all ancillary details that have little to do with the core debate. I mean it should be obvious to anyone that Python, Mathematica, and Magma have many advantages that have little to do with their typing discipline.


                                                                                                                          Back to what I was originally trying to respond to, I don’t understand people who feel like static typing has a high cognitive cost. Something in the way their brain works (or their education) is either missing or alien. And I’m highly sceptical of claims that some people are just wired differently. It must be cultural or come from training.

                                                                                                                          And to be honest I have an increasingly hard time considering the dynamic and static positions equal. While I reckon dynamic type systems are easier to implement and more approachable, beyond that I have no idea how they help anyone write better programs faster, and I increasingly suspect they do not.

                                                                                                                          1. 6

                                                                                                                            Even after trying to justify that you’ve had discussions with “articulate peers” and “could just be ignorant” and this is all your own observations, you immediately double back to declaring that people who prefer dynamic typing are cognitively or culturally defective. That makes it really, really hard to assume you’re having any of these arguments in good faith.

                                                                                                                            1. 1

                                                                                                                              To be honest I only recall one such articulate peer. On Reddit. He was an exception, and you’re the second one that I recall. Most of the time I see poorer arguments strongly suggesting either general or specific ignorance (most of the time they use Java or C++ as the static champion). I’m fully aware of how unsettling and discriminatory the idea is that people who strongly prefer dynamic typing would somehow be lesser. But from where I stand it doesn’t look that false.

                                                                                                                              Except for the exceptions. I’m clearly missing something, though I have yet to be told what.

                                                                                                                              Thing is, I suspect there isn’t enough space in a programming forum to satisfactorily settle that debate. I would love to have strong empirical evidence, but I have reasons to believe this would be very hard: if you use real languages there will be too many confounding variables, and if you use a toy language you’ll naturally ignore many of the things both typing disciplines enable. For now I’d settle for a strong argument (or set thereof). If someone has a link that would be much appreciated.

                                                                                                                              And no, I don’t have a strong link in favour of static typing either. This is all deeply unsatisfactory.

                                                                                                                              1. 5

                                                                                                                                There seems to be no conclusive evidence one way or the other: https://danluu.com/empirical-pl/

                                                                                                                                1. 3

                                                                                                                                  Sharing this link is the only correct response to a static/dynamic argument thread.

                                                                                                                                  1. 1

                                                                                                                                    I know of — oops I do not, I was confusing it with some other study… Thanks a ton for the link, I’ll take a look.

                                                                                                                                    Edit: from the abstract there seems to be some evidence of the absence of a big effect, which would be just as huge as evidence of an effect one way or the other.

                                                                                                                                    Edit 2: just realised this is a list of studies, not just a single study. Even better.

                                                                                                                        2. 1

                                                                                                                          How do you know what subset becomes impossible?

                                                                                                                          Well, it’s the subset of programs which decidably don’t have the desired type signature! Such programs provably aren’t going to implement the desired function.

                                                                                                                          Let me flip this all around. Suppose that you’re tasked with encoding some function as a subroutine in your code. How do you translate the function’s type to the subroutine’s parameters? Surely there’s an algorithm for it. Similarly, there are algorithms for implementing the various primitive pieces of functions, and the types of each primitive function are embeddable. So, why should we build subroutines out of anything besides well-typed fragments of code?

                                                                                                                        3. 4

                                                                                                                          Sure, but I think you’re talking past the argument. It’s a tradeoff. Here is another good post that explains the problem and gives it a good name: biformity.

                                                                                                                          https://hirrolot.github.io/posts/why-static-languages-suffer-from-complexity

                                                                                                                          People in the programming language design community strive to make their languages more expressive, with a strong type system, mainly to increase ergonomics by avoiding code duplication in final software; however, the more expressive their languages become, the more abruptly duplication penetrates the language itself.

                                                                                                                          That’s the issue that explains why separate compile-time languages arise so often in languages like C++ (mentioned in the blog post), Rust (at least 3 different kinds of compile-time metaprogramming), OCaml (many incompatible versions of compile-time metaprogramming), Haskell, etc.

                                                                                                                          Those languages are not only harder for humans to understand, but for tools as well.

                                                                                                                          1. 4

                                                                                                                            The Haskell metaprogramming system that jumps immediately to mind is Template Haskell, which makes a virtue of not introducing a distinct metaprogramming language: you use Haskell for that purpose as well as for the main program.

                                                                                                                            1. 1

                                                                                                                              Yeah the linked post mentions Template Haskell and gives it some shine, but also points out other downsides and complexity with Haskell. Again, not saying that types aren’t worth it, just that it’s a tradeoff, and that they’re different when applied to different problem domains.

                                                                                                                            2. 2

                                                                                                                              Sure, but I think you’re talking past the argument

                                                                                                                              This is probably a fair characterization.

                                                                                                                              Those languages are not only harder for humans to understand, but for tools as well

                                                                                                                              I am a bit skeptical of this. Certainly C++ is harder for a tool to understand than C say, but I would be much less certain of say Ruby vs Haskell.

                                                                                                                              Though I suppose it depends on if the tool is operating on the program source or a running instance.

                                                                                                                          2. 7

                                                                                                                            One common compelling reason is that dynamic languages like Python only require you to learn a single tool in order to use them well. […] Code that runs at compile/import time follows the same rules as code running at execution time. Instead of a separate templating system, the language supports meta-programming using the same constructs as normal execution. Module importing is built-in, so build systems aren’t necessary.

                                                                                                                            That’s exactly what Zig is doing with its “comptime” feature, using the same language, while keeping a statically typed and compiled approach.

                                                                                                                            1. 4

                                                                                                                              I’m wondering where you feel dynamic functional languages like Clojure and Elixir fall short? I’m particularly optimistic about Elixir as of late since they’re putting a lot of effort in expanding to the data analytics and machine learning space (their NX projects), as well as interactive and literate computing (Livebook and Kino). They are also trying to understand how they could make a gradual type system work. Those all feel like traits that have made Python so successful and I feel like it is a good direction to evolve the Elixir language/ecosystem.

                                                                                                                              1. 3

                                                                                                                                I think there are a lot of excellent ideas in both Clojure and Elixir!

                                                                                                                                With Clojure, the practical dependence on the JVM is one huge deal breaker for many people because of licensing concerns. BEAM is better in that regard, but shares the problem that VMs require a lot of runtime complexity, which makes them harder to debug and understand (compared to, say, the C ecosystem tools).

                                                                                                                                For the languages themselves, simple things like explicit returns are missing, which makes the languages feel difficult to wield, especially for beginners. So enumerating that type of friction would be one way to understand where the languages fall short. Try to recoup some of the language’s strangeness budget.

                                                                                                                              2. 2

                                                                                                                                I’m guessing the syntax is a pretty regular Lisp, but with newlines and indents making many of the parenthesis unnecessary?

                                                                                                                                Some things I wish Lisp syntax did better:

                                                                                                                                1. More syntactically first-class data types besides lists. Most obviously dictionaries, but classes kind of fit in there too. And lightweight structs (which get kind of modeled as dicts or tuples or objects or whatever in other languages).
                                                                                                                                2. If you have structs you need accessors. And maybe that uses the same mechanism as namespaces. Also a Lisp weak point.
                                                                                                                                3. Named and default arguments. The Lisp approaches feel like kludges. Smalltalk is kind of an ideal, but secretly just the weirdest naming convention ever. Though maybe it’s not so crazy to imagine Lisp syntax with function names blown out over the call like in Smalltalk.
                                                                                                                                1. 1

                                                                                                                                  Great suggestions thank you! The syntax is trying to avoid parentheses like that for sure. If you have more thoughts like this please send them my way!

                                                                                                                                  1. 1

                                                                                                                                    This might be an IDE / LSP implementation detail, but would it be possible to color-code the indentation levels? Similar to how editors color code matching brackets these days. I always have a period of getting used to Python where the whitespace sensitivity disorients me for a while.

                                                                                                                                    1. 2

                                                                                                                                      Most editors will show a very lightly shaded vertical line for each indentation level with Python. The same works well for this syntax too. I have seen colored indentation levels (such as https://archive.fosdem.org/2022/schedule/event/lispforeveryone/), but I think it won’t be needed because of the lack of parentheses. It’s the same reason I don’t think it’ll be necessary to use a structural editor like https://calva.io/paredit/

                                                                                                                            1. 2

                                                                                                                              This is one of the weirdest things about the C++ stdlib. The unordered map and set were added in C++03, as I recall. The ordered ones were in the first standard. I don’t think I’ve ever had a use case where I care about stable ordering. Maybe once or twice in a couple of decades of C++. Pretty much every program I’ve written in any language wants a dictionary though, and when a standard library says ‘map’ or ‘dictionary’ (or ‘set’), I assume it’s some form of hash table, unless it’s a variant that’s specialised for very small sizes. Having the default for these be a tree is just confusing.

                                                                                                                              1. 4

                                                                                                                                unordered_map is from C++11, so it’s relatively modern. What boggles my mind is that, despite being modern C++, the standard still messes up and hard-codes a particular (and slow) implementation strategy via exposing bucket iteration. Like, I can see how std::map could be a tree because, when Stepanov was designing what was to become the STL, we didn’t care about cache that much. But fixing that by including a crippled hash map is beyond me.
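
                                                                                                                                For anyone who hasn’t used it, this is the bucket interface the standard mandates — a minimal illustration, not an endorsement. Per-bucket (“local”) iterators like these really only make sense for separate chaining, which is a large part of why open addressing is effectively ruled out:

                                                                                                                                    #include <cstddef>
                                                                                                                                    #include <iostream>
                                                                                                                                    #include <string>
                                                                                                                                    #include <unordered_map>

                                                                                                                                    int main() {
                                                                                                                                        std::unordered_map<std::string, int> m{{"a", 1}, {"b", 2}, {"c", 3}};

                                                                                                                                        // The standard exposes buckets directly, with per-bucket iterators.
                                                                                                                                        for (std::size_t b = 0; b < m.bucket_count(); ++b) {
                                                                                                                                            std::cout << "bucket " << b << " (" << m.bucket_size(b) << " elements)\n";
                                                                                                                                            for (auto it = m.begin(b); it != m.end(b); ++it) {
                                                                                                                                                std::cout << "  " << it->first << " -> " << it->second << "\n";
                                                                                                                                            }
                                                                                                                                        }
                                                                                                                                    }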

                                                                                                                                where I care about stable ordering

                                                                                                                                Natural ordering is indeed rare (and often, when you have order, you want just a sorted array). However, having some stable iteration order is a much better default than non-deterministic iteration, at least if you don’t need 100% performance. So, I’d expect more or less every GC language to do what Python and JS are doing, for the default container.

                                                                                                                                1. 2

                                                                                                                                  What boggles my mind is that, despite being modern C++, the standard still messes up and hard-codes a particular (and slow) implementation strategy via exposing bucket iteration. Like, I can see how std::map could be a tree because, when Stepanov was designing what was to become the STL, we didn’t care about cache that much. But fixing that by including a crippled hash map is beyond me.

                                                                                                                                  See “Chaining vs Open Addressing” in the proposal for std::unordered_map for historical background about why they went with chaining in 2003.

                                                                                                                                  1. 2

                                                                                                                                    In hindsight the justification was terrible: “nobody has written a good implementation yet”. And later Google wrote SwissTable.

                                                                                                                                  2. 1

                                                                                                                                    unordered_map is from C++11

                                                                                                                                    It was in TR1 and merged into C++ in C++03, so it’s been around for 20 years. There are C++ programmers who started programming after that. I first used it many years before C++11 was standardised.

                                                                                                                                    What boggles my mind is that, despite being modern C++, the standard still messes up and hard-codes a particular (and slow) implementation strategy via exposing bucket iteration

                                                                                                                                    I’d have to check the spec, but I don’t believe that there’s a requirement that the size of a bucket is >1.

                                                                                                                                    Natural ordering is indeed rare (and often, when you have order, you want just a sorted array). However, having some stable iteration order is a much better default than non-deterministic iteration, at least if you don’t need 100% performance.

                                                                                                                                    After writing my original comment, I remembered the one case where I have recently wanted a stable (sorted) ordering. I haven’t ever wanted the Python behavior of insertion order and, for any case where I might, it’s easy to add in a wrapper, but impossible to remove in a wrapper. I believe the main reason that languages like Python keep this is to make JSON serialisation simpler, which isn’t a consideration in most places where C++ is a sensible choice.
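
                                                                                                                                    To illustrate the “easy to add in a wrapper” point, here is a minimal sketch (the InsertionOrderedMap name and interface are made up) of insertion-ordered iteration layered on top of std::unordered_map: entries live in a dense vector in insertion order, and the hash map only stores indexes into it (erasure omitted for brevity):

                                                                                                                                        #include <cstddef>
                                                                                                                                        #include <string>
                                                                                                                                        #include <unordered_map>
                                                                                                                                        #include <utility>
                                                                                                                                        #include <vector>

                                                                                                                                        // Hypothetical wrapper: iteration follows insertion order, lookup stays hashed.
                                                                                                                                        template <typename K, typename V>
                                                                                                                                        class InsertionOrderedMap {
                                                                                                                                        public:
                                                                                                                                            void insert_or_assign(const K& key, const V& value) {
                                                                                                                                                auto it = index_.find(key);
                                                                                                                                                if (it != index_.end()) {
                                                                                                                                                    entries_[it->second].second = value;  // existing key: overwrite in place
                                                                                                                                                } else {
                                                                                                                                                    index_.emplace(key, entries_.size());
                                                                                                                                                    entries_.emplace_back(key, value);
                                                                                                                                                }
                                                                                                                                            }

                                                                                                                                            // Entries in the order the keys were first inserted.
                                                                                                                                            const std::vector<std::pair<K, V>>& entries() const { return entries_; }

                                                                                                                                        private:
                                                                                                                                            std::vector<std::pair<K, V>> entries_;       // dense, insertion-ordered
                                                                                                                                            std::unordered_map<K, std::size_t> index_;   // key -> position in entries_
                                                                                                                                        };

                                                                                                                                        int main() {
                                                                                                                                            InsertionOrderedMap<std::string, int> m;
                                                                                                                                            m.insert_or_assign("b", 2);
                                                                                                                                            m.insert_or_assign("a", 1);
                                                                                                                                            for (const auto& [k, v] : m.entries()) {
                                                                                                                                                // Visits "b" then "a", regardless of hash order.
                                                                                                                                                (void)k; (void)v;
                                                                                                                                            }
                                                                                                                                        }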

                                                                                                                                    1. 2

                                                                                                                                      I believe the main reason that languages like Python keep this is to make JSON serialisation simpler,

                                                                                                                                      Python had a json module before dict iteration order was made stable by setting it to insertion order.

                                                                                                                                      A much bigger reason is that a change landed in Python 3.6 that shrunk the memory usage of Python dicts by making the hash buckets be indexes into a dense array of entries (1-8 bytes each, picking the smallest possible size for each dict at runtime during resizing), instead of the buckets being typedef struct { Py_ssize_t me_hash; PyObject* me_key; PyObject* me_value; }; (12 or 24 bytes depending on arch). Offhand I believe Python dicts typically have a load factor of about 0.5ish.

                                                                                                                                      dict iteration order matching insertion order fell out of this almost for free. It got added to the language documentation for Python 3.7. Arguments in favour included “it already does this iteration order as of Python 3.6” and “it’s helpful to remove one possible source of non-determinism”.

                                                                                                                                      1. 1

                                                                                                                                        It was in TR1 and merged into C++ in C++03

                                                                                                                                        Hm, I am 0.95 sure unordered map is TR1, but not C++03. According to https://en.cppreference.com/w/cpp/language/history TR1 came after C++03 standard?

                                                                                                                                        1. 2

                                                                                                                                          Ah, you’re right. I think libstdc++ exposed it only in c++03/gnu++03 modes. I was definitely using it (possibly in the experimental namespace?) long before 2011.

                                                                                                                                          1. 1

                                                                                                                                            Uhu, my “from C++11” was also wrong: 11 is when it became standard, but it was totally available before that as a part of TR1.

                                                                                                                                        2. 1

                                                                                                                                          +1 for TR1…in a previous life my gamedev buddies and I had to wrap unordered_map to account for some minor differences of some form or another between MSVC and GCC. I think that’s all in the past now, one hopes.

                                                                                                                                      2. 1

                                                                                                                                        I personally still end up defaulting to std::set and std::map unless profiling shows that it’s a bottleneck, even when I don’t care about ordering, because they Just Work for just about any reasonable type, while the coverage of std::hash is weirdly spotty. For example I fairly often want a tuple as dictionary key, but there is no standard std::hash<std::tuple> specialization (or one for std::pair, or std::array), and I don’t want to go down a rabbit hole of rolling my own hash function or pulling in boost just to use a dictionary.
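
                                                                                                                                        For example, to use a pair as a key today you have to supply the hash yourself. A minimal sketch of the boilerplate this involves (the PairHash name is made up; the mixing constant is the one popularized by boost::hash_combine):

                                                                                                                                            #include <cstddef>
                                                                                                                                            #include <functional>
                                                                                                                                            #include <string>
                                                                                                                                            #include <unordered_map>
                                                                                                                                            #include <utility>

                                                                                                                                            // Hand-rolled hasher for std::pair keys, since std::hash has no such specialization.
                                                                                                                                            struct PairHash {
                                                                                                                                                template <typename A, typename B>
                                                                                                                                                std::size_t operator()(const std::pair<A, B>& p) const {
                                                                                                                                                    std::size_t h1 = std::hash<A>{}(p.first);
                                                                                                                                                    std::size_t h2 = std::hash<B>{}(p.second);
                                                                                                                                                    // Combine the two hashes (same mixing step as boost::hash_combine).
                                                                                                                                                    return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
                                                                                                                                                }
                                                                                                                                            };

                                                                                                                                            int main() {
                                                                                                                                                std::unordered_map<std::pair<std::string, int>, double, PairHash> cache;
                                                                                                                                                cache[{"answer", 42}] = 1.0;
                                                                                                                                            }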

                                                                                                                                        1. 1

                                                                                                                                          Defaulting to map makes it harder for someone to understand the code later on. They will have to figure out that the ordering doesn’t matter and that map’s guarantees aren’t needed.

                                                                                                                                        2. 1

                                                                                                                                          Map and Set have insertion ordering in JS - it was our solution to ensuring well-defined behavior (and also consistency with the existing object/dictionary use cases). JS has been hit by Hyrum’s law numerous times over the decades and having a defined order meant that there was no leaky abstraction horror.

                                                                                                                                          Take the “always randomize iteration order” solution people often propose - does that mean that getting two iterators of the same map will enumerate in the same order? Will the enumeration order change over time? How does mutation impact ordering, how does gc impact things, etc.

                                                                                                                                          The problem with unordered_map is that it overprescribes the implementation scheme, but what it has prescribed also doesn’t provide any useful gains to end users/developers. I haven’t seen any large scale C++ projects that don’t end up using a non-std implementation of unordered maps, due to the poor unordered_map performance mandated by the spec.

                                                                                                                                          1. 1

                                                                                                                                            I’ve used unordered map a lot in C++. I often don’t have a sensible total order to use for key types, which means that map is a pain. Where performance really matters, it’s easy to drop in something like LLVM’s dense map or tsl’s robin map, but only after profiling shows me that this is necessary.

                                                                                                                                            1. 1

                                                                                                                                              Oh I agree, I don’t think I’ve ever needed an ordered map (outside of code puzzles/maybe interview style quizzes). The defined ordering in JS is specifically about removing any kind of non-determinism or leaked implementation semantics.

                                                                                                                                              A few committee members would produce examples of problems that can be made “better” or “smaller” with insertion order enumeration[1], but for implementor committee members it was entirely about ensuring no Hyrum’s law horrors.

                                                                                                                                              [1] Though those examples either have bad memory use problems or hurt the cpu/runtime performance of maps in all other cases

                                                                                                                                        1. 1

                                                                                                                                          I still have no idea why telemetry could have been useful after watching this

                                                                                                                                          1. 2

                                                                                                                                            Here is a list with detailed explanations https://research.swtch.com/telemetry-uses