1.  

    The main innovations in Rust (substructural types[0] and lifetime analysis) would be just as useful in a language with automatic memory management. I am still astonished that most programmers seem to only appreciate these innovations when they want to avoid automatic garbage collection.

    My main interest is in making common data structures and algorithms impossible to use incorrectly. In my experience, any run-of-the-mill Hindley-Milner variant suffices to implement data structures that advertise a purely functional interface. These data structures only use mutation internally to switch between different in-memory representations of the same abstract value. Thus, unsynchronized reads and writes from mutable stores are benign as long as they are atomic, which is not too much to ask for in a managed language.
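    As a small illustration of that internal-mutation pattern (the names here are my own invention, and the sketch happens to be in Rust, though the point is about managed languages generally): an object can cache a normalized representation behind an observably pure interface.

    ```rust
    use std::cell::OnceCell;

    // A value with a purely functional interface that internally mutates
    // to switch to a cheaper in-memory representation of the same data.
    struct Lazy {
        input: Vec<i32>,
        sorted: OnceCell<Vec<i32>>, // filled on first use, then reused
    }

    impl Lazy {
        fn new(input: Vec<i32>) -> Self {
            Lazy { input, sorted: OnceCell::new() }
        }

        // Observably pure: same value in, same answer out, every time.
        fn min(&self) -> Option<i32> {
            let sorted = self.sorted.get_or_init(|| {
                let mut v = self.input.clone();
                v.sort();
                v
            });
            sorted.first().copied()
        }
    }

    fn main() {
        let l = Lazy::new(vec![3, 1, 2]);
        assert_eq!(l.min(), Some(1)); // first call sorts and caches
        assert_eq!(l.min(), Some(1)); // later calls reuse the cached representation
    }
    ```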

    However, some data structures and algorithms are imperative in a way that cannot be hidden behind a purely functional interface. For example, consider a directed graph represented as an array of nodes. Each node is represented as a list of integers. Each integer is the index in the array of a forward neighbor of the original node.

    Suppose you want to implement a graph traversal algorithm, e.g., depth-first search. To make the implementation as flexible as possible for the user, you want to traverse the graph lazily, i.e., you want a stream (rather than a list) of integers, corresponding to the indices of the nodes in the order in which they are visited. The user is free to consume only a prefix of the stream if they so wish, in which case the unconsumed suffix must not be computed.

    We want graph objects to be mutable, because constructing immutable graphs is slower, both in theory (asymptotically) and in practice (in benchmarks). However, when the graph is being traversed, it should be temporarily frozen, lest the traversal produce nonsensical results.

    How do you do this without a borrow checker? Well, you can’t.
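    For contrast, here is roughly what the borrow checker buys you in this example (a sketch with invented names, not a complete library): the DFS iterator holds a shared borrow of the graph, so the compiler statically freezes the graph for exactly as long as the traversal is alive, while the stream stays lazy.

    ```rust
    // Graph as an array of nodes; each node is a list of indices of its
    // forward neighbors, as described above.
    struct Graph {
        nodes: Vec<Vec<usize>>,
    }

    // Lazy DFS: a stream of node indices in visit order.
    struct Dfs<'a> {
        graph: &'a Graph, // shared borrow: the graph is frozen while this lives
        stack: Vec<usize>,
        visited: Vec<bool>,
    }

    impl Graph {
        fn dfs(&self, start: usize) -> Dfs<'_> {
            Dfs {
                graph: self,
                stack: vec![start],
                visited: vec![false; self.nodes.len()],
            }
        }
    }

    impl<'a> Iterator for Dfs<'a> {
        type Item = usize;
        fn next(&mut self) -> Option<usize> {
            while let Some(n) = self.stack.pop() {
                if !self.visited[n] {
                    self.visited[n] = true;
                    // Work happens only on demand: an unconsumed suffix of
                    // the stream is never computed.
                    self.stack.extend(self.graph.nodes[n].iter().rev());
                    return Some(n);
                }
            }
            None
        }
    }

    fn main() {
        let mut g = Graph { nodes: vec![vec![1, 2], vec![2], vec![]] };
        // Consume only a prefix of the stream.
        let prefix: Vec<usize> = g.dfs(0).take(2).collect();
        assert_eq!(prefix, vec![0, 1]);
        // While a Dfs value is alive, mutating `g.nodes` is a compile error:
        // the iterator's shared borrow freezes the graph.
        g.nodes.push(vec![]); // fine here: the iterator has been dropped
    }
    ```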

    [0] Of course, substructural types existed long before Rust, but Rust made them popular.

    1.  

      Have you heard of http://protz.github.io/mezzo/ ? :)

      1.  

        I have not, until now. I am reading the papers right now. Thanks for the reference.

    1. 30

      To me the big deal is that Rust can credibly replace C, and offers enough benefits to make it worthwhile.

      There are many natively-compiled languages with garbage collection. They’re safer than C and easier to use than Rust, but by adding GC they’ve exited the C niche. 99% of programs may work just fine with a GC, but for the rest the only practical options were C and C++ until Rust showed up.

      There were a few esoteric systems languages or C extensions that fixed some warts of C, but leaving the C ecosystem has real costs, and I could never justify use of a “weird” language just for a small improvement. Rust offered major safety, usability and productivity improvements, and managed to break out of obscurity.

      1. 38

        Ada provided everything except ADTs and linear types, including seamless interoperability with C, 20 years before Rust. Cyclone was Rust before Rust, and it was abandoned in a state similar to Rust’s when Rust took off. Cyclone is dead, but Ada got a built-in formal verification toolkit in its latest revision—for some, that alone can be a reason to pick it over anything else for a new project.

        I have nothing against Rust, but the reason it’s popular is that it came at the right time, in the right place, from a sufficiently big-name organization. It’s one of the many languages based on those ideas that, fortunately, happened to succeed. And no, when it first got popular it wasn’t really practical. None of these points makes Rust bad. One should just always keep the bigger picture in mind, especially when it comes to heavily hyped things. You need to know the other options to decide for yourself.

        Other statically-typed languages allow whole-program type inference. While convenient during initial development, this reduces the ability of the compiler to provide useful error information when types no longer match.

        Only in languages that cannot unambiguously infer the principal type. Whether to make a tradeoff between that and support for ad hoc polymorphism or not is subjective.

        1. 14

          I saw Cyclone when it came out, but at the time I dismissed it as “it’s C, but weird”. It had the same basic syntax as C, but added lots of pointer sigils. It still had the same C preprocessor and the same stdlib.

          Now I see it had a feature set much closer to Rust’s (tagged unions, patterns, generics), but Rust “sold” them better. Rust used these features for Result, which is a simple yet powerful construct. Cyclone could have done that, but didn’t. It kept nullable pointers and added Null_Exception.
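          The Result/Option point is easy to show concretely (a minimal sketch, with invented names): nullability is an ordinary tagged union, so forgetting to handle the empty case is a compile error rather than a Null_Exception at run time.

          ```rust
          // Option<usize> instead of a nullable pointer or sentinel index:
          // the absent case is a distinct variant the caller must handle.
          fn find_index(haystack: &[i32], needle: i32) -> Option<usize> {
              haystack.iter().position(|&x| x == needle)
          }

          fn main() {
              let xs = [10, 20, 30];
              // The type forces the caller to consider both outcomes.
              match find_index(&xs, 20) {
                  Some(i) => assert_eq!(i, 1),
                  None => unreachable!(),
              }
              assert_eq!(find_index(&xs, 99), None);
              // find_index(&xs, 99) + 1  // would not compile: handle None first
          }
          ```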

          1. 10

            Ada provided everything except ADTs and linear types

            Unfortunately for this argument, ADTs, substructural types and lifetimes are more exciting than that “everything except”. Finally the stuff that is supposed to be easy in theory is actually easy in practice, like not using resources you have already cleaned up.

            Ada got a built-in formal verification toolkit in its latest revision

            How much of a usability improvement is using these tools compared to verifying things manually? What makes types attractive to many programmers is not that they are logically very powerful (they are usually not!), but rather that they give a super gigantic bang for the buck in terms of reduction of verification effort.

            1. 17

              I would personally not compare Ada and Rust directly as they don’t even remotely fulfill the same use-cases.

              Sure, there have been languages that did X, Y, Z before Rust (the project itself does not claim to have invented the parts of the language found elsewhere in the past), but the actual distinguishing factor that places Rust in an entirely different category from Ada is how accessible and enjoyable it is to use while providing those features.

              If you’re in health or aeronautics, you should probably be reaching for the serious, deep toolkit provided by Ada, and I’d probably side with you in saying those people should have been doing that for the last decade. But Ada is really not for the average engineer. It’s an amazing albeit complex language that represents not only a long history of incredible engineering but also a very real barrier to entry, one simply incomparable to Rust’s.

              If, for example, I wanted today to start writing from scratch a consumer operating system, a web browser, or a video game as a business venture, I would guarantee you Ada would not even be mentioned as an option to solve any of those problems, unless I wanted to sink my own ship by limiting myself to pick from ex-government contractors as engineers, whose salaries I’d likely be incapable of matching. Rust on the other hand actually provides a real contender to C/C++/D for people in these problem spaces, who don’t always need (or in some cases, even want) formal verification, but just a nice practical language with a systematic safety net from the memory footguns of C/C++/D. On top of that, it opens up these features, projects, and their problem spaces to many new engineers with a clear, enjoyable language free of confusing historical baggage.

              1. 6

                Have you ever used Ada? Which implementation?

                1. 15

                  I’ve never published production Ada of any sort and am definitely not an Ada regular (let alone pro) but I studied and had a fondness for Spark around the time I was reading “Type-Driven Development with Idris” and started getting interested in software proofs.

                  In my honest opinion, the way the base Ada language is written (simple, and plain-operator heavy) lends itself really well to extension languages, but it can also make it difficult at times for beginners to distinguish which class of concept is in use, whereas Rust’s syntax draws a clear and immediate distinction between blocks (the land of namespaces), types (the land of names), and values (the land of data). In terms of cognitive load, then, it feels as though these two languages are communicating at different levels. Rust communicates in the mode of raw values and their manipulation through borrows, while the lineage of Ada languages communicates at a level that, in my amateur Ada-er view, centers on expressing properties of your program (and I don’t just mean the Spark stuff, obviously). I wasn’t even born when Ada was created, so I can’t say for sure without becoming an Ada historian (not a bad idea…), but this seems like a product of Ada’s heritage (just as Rust is so obviously written to look like C++).

                  To try to clarify this ramble: in my schooling experience, many similarly young programmers are almost exclusively taught to program at an elementary level of abstract instructions with the details of those instructions removed, and then, after a couple of type-level incantations, get a series of algorithms and their explanations thrown at their face. Learning to consider their programs specifically in terms of expressing properties of those programs’ operations becomes a huge step out of that starting box (one some don’t leave long after graduation). I think something Rust’s syntax does well (if possibly by mistake) is fool the amateur user into expressing properties of their programs by accident, as part of what seems like just a routine to get to the meat of a program’s procedures. It feels to me that expressing those properties is intrinsic to speaking Ada, and thus presents a barrier intrinsic to the programmer’s understanding of their work, which, given a different popular curriculum, could probably be rendered as weak as paper to break through.

                  Excuse me if these thoughts are messy (and edited many times to improve that), but beyond the more popular issue of familiarity, they’re sort of how I view my own honest experience of feeling more quickly “at home” in moving from writing Rust to understanding Rust, compared to moving from just writing some form of Ada, and understanding the program I get.

              2. 5

                Other statically-typed languages allow whole-program type inference. While convenient during initial development, this reduces the ability of the compiler to provide useful error information when types no longer match.

                Only in languages that cannot unambiguously infer the principal type. Whether to make a tradeoff between that and support for ad hoc polymorphism or not is subjective.

                OCaml can unambiguously infer the principal type, and I still find myself writing the type of top level functions explicitly quite often. More than once have I been guided by a type error that only happened because I wrote the type of the function I was writing in advance.

                At the very least, I check that the type of my functions match my expectations, by running the type inference in the REPL. More than once have I been surprised. More than once that surprise was caused by a bug in my code. Had I not checked the type of my function, I would catch the bug only later, when using the function, and the error message would have made less sense to me.

                1.  

                  At the very least, I check that the type of my functions match my expectations, by running the type inference in the REPL

                  Why not use Merlin instead? Saves quite a bit of time.

                  That’s a tooling issue too of course. Tracking down typing surprises in OCaml is easy because the compiler outputs type annotations in a machine-readable format and there’s a tool and editor integrations that allow me to see the type of every expression in a keystroke.

                  1.  

                    Why not use Merlin instead? Saves quite a bit of time.

                    I’m a dinosaur who never took the time to learn even of Merlin’s existence. I’m kind of stuck in Emacs’ Tuareg mode. It works for me for small projects (all my OCaml projects are small).

                    That said, my recent experience with C++ and QtCreator showed me that having warnings at edit time is even more powerful than a REPL (at least as long as I don’t have to check actual values). That makes Merlin look very attractive all of a sudden. I’ll take a look, thanks.

              3. 5

                Rust can definitely credibly replace C++. I don’t really see how it can credibly replace C. It’s just such a fundamentally different way of approaching programming that it doesn’t appeal to C programmers. Why would a C programmer switch to Rust if they hadn’t already switched to C++?

                1. 41

                  I’ve been a C programmer for over a decade. I’ve tried switching to C++ a couple of times, and couldn’t stand it. I’ve switched to Rust and love it.

                  My reasons are:

                  • Robust, automatic memory management. I have the same amount of control over memory, but I don’t need goto cleanup.
                  • Fearless multi-core support: if it compiles, it’s thread-safe! rayon is much nicer than OpenMP.
                  • Slices are awesome: no array to pointer decay. Work great with substrings.
                  • Safety is not just about CVEs. I don’t need to investigate memory murder mysteries in GDB or Valgrind.
                  • Dependencies aren’t painful.
                  • Everything builds without fuss, even when supporting Windows and cross-compiling to iOS.
                  • I can add two signed numbers without UB, and checking if they overflow isn’t a party trick.
                  • I get some good parts of C++ such as type-optimized sort and hash maps, but without the baggage C++ is infamous for.
                  • Rust is much easier than C++. Iterators are so much cleaner (just a next() method). I/O is a Read/Write trait, not a hierarchy of iostream classes.
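                  Two of those bullets (overflow without UB, and slices) are easy to demonstrate in a few lines; a minimal sketch:

                  ```rust
                  fn main() {
                      // Overflow is a value you can inspect, not undefined behavior.
                      assert_eq!(i32::MAX.checked_add(1), None);      // overflow detected
                      assert_eq!(i32::MAX.wrapping_add(1), i32::MIN); // explicit wrap if wanted
                      assert_eq!(100i32.checked_add(20), Some(120));  // normal case

                      // Slices carry their length: no array-to-pointer decay,
                      // and substrings are just sub-slices of the original.
                      let s = "hello world";
                      let hello = &s[..5];
                      assert_eq!(hello, "hello");
                      assert_eq!(hello.len(), 5);
                  }
                  ```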
                  1. 5

                    I also like Rust and I agree with most of your points, but this one bit seems not entirely accurate:

                    Fearless multi-core support: if it compiles, it’s thread-safe! rayon is much nicer than OpenMP.

                    AFAIK Rust:

                    • doesn’t guarantee thread-safety — it guarantees the lack of data races, but doesn’t guarantee the lack of e.g. deadlocks;
                    • guarantees the lack of data races, but only if you didn’t write any unsafe code.
                    1. 18

                      That is correct, but this is still an incredible improvement. If I get a deadlock I’ll definitely notice it, and can dissect it in a debugger. That’s easy-peasy compared to data races.

                      Even unsafe code is subject to thread-safety checks, because “breaking” of Send/Sync guarantees needs separate opt-in. In practice I can reuse well-tested concurrency primitives (e.g. WebKit’s parking_lot) so I don’t need to write that unsafe code myself.

                      Here’s an anecdote: I wrote some single-threaded batch-processing spaghetti code. Since each item was processed separately, I decided to parallelize it. I changed iter() to par_iter() and the compiler immediately warned me that in one of my functions I’d used a 3rd-party library which used an HTTP client library which used an event loop library which stored some event loop data in a struct without synchronization. It pointed out exactly where and why the code was unsafe, and after fixing it I had an assurance the fix worked.
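                      rayon itself is a third-party crate, but the mechanism behind that warning is in the language: types that are unsafe to share across threads simply do not implement the Send/Sync marker traits. A std-only sketch of the same effect:

                      ```rust
                      use std::rc::Rc;
                      use std::sync::Arc;
                      use std::thread;

                      fn main() {
                          // Rc is not Send, so moving it into a thread is a compile
                          // error, caught like the unsynchronized struct above:
                          // let rc = Rc::new(1);
                          // thread::spawn(move || println!("{}", rc));
                          // ^ error[E0277]: `Rc<i32>` cannot be sent between threads safely

                          // Arc is Send + Sync, so the same code compiles once the
                          // sharing is actually synchronized.
                          let data = Arc::new(vec![1, 2, 3]);
                          let handle = {
                              let data = Arc::clone(&data);
                              thread::spawn(move || data.iter().sum::<i32>())
                          };
                          assert_eq!(handle.join().unwrap(), 6);
                      }
                      ```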

                      1. 5

                        I share your enthusiasm. Just wanted to prevent a common misconception from spreading.

                        Here’s an anecdote: I wrote some single threaded batch-processing spaghetti code. Since it each item was processed separately, I decided to parallelize it. I’ve changed iter() for par_iter() and the compiler immediately warned me that in one of my functions I’ve used a 3rd party library which used an HTTP client library which used an event loop library which stored some event loop data in a struct without synchronization. It pointed exactly where and why the code was unsafe, and after fixing it I had an assurance the fix worked.

                        I did not know it could do that. That’s fantastic.

                      2. 8

                        Data races in multi-threaded code are about 100x harder to debug than deadlocks in my experience, so I am happy to have an imperfect guarantee.

                        guarantees the lack of data races, but only if you didn’t write any unsafe code.

                        Rust application code generally avoids unsafe.

                        1.  

                          Data races in multi-threaded code are about 100x harder to debug than deadlocks in my experience, so I am happy to have an imperfect guarantee.

                          My comment was not a criticism of Rust. Just wanted to prevent a common misconception from spreading.

                          Rust application code generally avoids unsafe.

                          That depends on who wrote the code. And unsafe blocks can cause problems that show in places far from the unsafe code. Meanwhile, “written in Rust” is treated as a badge of quality.

                          Mind that I am a Rust enthusiast as well. I just think we shouldn’t oversell it.

                        2. 7

                          guarantees the lack of data races, but only if you didn’t write any unsafe code.

                          As long as your unsafe code is sound it still provides the guarantee. That’s the whole point, to limit the amount of code that needs to be carefully audited for correctness.

                          1.  

                            I know what the point is. But proving things about code is generally not something that programmers are used to or good at. I’m not saying that the language is bad, only that we should understand its limitations.

                      3. 10

                        This doesn’t really match our experience: a lot of organisations are investigating replacements for C, and Rust is on the table.

                        One advantage that Rust has is that it actually lands between C and C++. It’s pretty easy to move towards a more C-like programming style without having to ignore half of the language (this comes from the lack of classes, etc.).

                        Rust is much more “C with Generics” than C++ is.

                        We currently see a high interest in the embedded world, even in places that skipped adopting C++.

                        I don’t think the fundamental difference in approach is as large as you make it (sorry for the weak rebuttal, but that’s hard to quantify). But also: approaches are changing, so that’s less of a problem for us, as long as we are effective at arguing for our approach.

                        1.  

                          It’s just such a fundamentally different way of approaching programming that it doesn’t appeal to C programmers. Why would a C programmer switch to Rust if they hadn’t already switched to C++?

                          Human minds are sometimes less flexible than rocks.

                          That’s why we still have that stupid Qwerty layout: popular once for mechanical (and historical) reasons, used forever since. As soon as the mechanical problems were fixed, Sholes himself devised a better layout, which went unused. Much later, Dvorak devised another better layout, and it is barely used today. People thinking in Qwerty simply can’t bring themselves to take the time to learn the superior layout. (I know: I’m in a similar situation, though my current layout is not Qwerty.)

                          I mean, you make a good point here. And that’s precisely what makes me sad. I just hope this lack of flexibility won’t prevent C programmers from learning superior tools.

                          (By the way, I would choose C over C++ in many cases; I think C++ is crazy. But I also know ML (OCaml), a bit of Haskell, a bit of Lua… and that gives me perspective. Rust as I see it is a blend of C and ML, and though I have yet to write Rust code, the code I have read so far was very easy to understand. I believe I can pick up the language pretty much instantly. In my opinion, C programmers who only know C, awk and Bash are unreasonably specialised.)

                          1.  

                            I tried to switch to DVORAK twice. Both times I started to get pretty quick after a couple of days but I cheated: if I needed to type something I’d switch back to QWERTY, so it never stuck.

                            The same is true of Rust, incidentally. Tried it out a few times, was fun, but then if I want to get anything useful done quickly it’s just been too much of a hassle for me personally. YMMV of course. I fully intend to try to build something that’s kind of ‘C with lifetimes’, a much simpler Rust (which I think of as ‘C++ with lifetimes’ analogously), in the future. Just have to, y’know, design it. :D

                            1.  

                              I too was tempted at some point to design a “better C”. I need:

                              • Generics
                              • Algebraic data types
                              • Type classes
                              • Coroutines (for I/O and network code; I need a way out of raw poll(2))
                              • Memory safety

                              With the possible exception of lifetimes, I’d end up designing Rust, mostly.
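                              For what it’s worth, that wishlist (minus the coroutines) maps almost one-to-one onto existing Rust; a toy sketch with invented names:

                              ```rust
                              // Algebraic data type (tagged union) with a generic parameter.
                              enum Shape<T> {
                                  Circle { radius: T },
                                  Rect { w: T, h: T },
                              }

                              // "Type class": a trait, implemented for the ADT.
                              trait Area {
                                  fn area(&self) -> f64;
                              }

                              impl Area for Shape<f64> {
                                  fn area(&self) -> f64 {
                                      // Exhaustive pattern match: forgetting a variant
                                      // is a compile error, not a runtime surprise.
                                      match self {
                                          Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
                                          Shape::Rect { w, h } => w * h,
                                      }
                                  }
                              }

                              fn main() {
                                  let shapes = vec![
                                      Shape::Rect { w: 2.0, h: 3.0 },
                                      Shape::Circle { radius: 1.0 },
                                  ];
                                  let total: f64 = shapes.iter().map(|s| s.area()).sum();
                                  assert!((total - (6.0 + std::f64::consts::PI)).abs() < 1e-9);
                              }
                              ```

                              Memory safety comes for free from the language itself, so the only item on the list without a direct counterpart here is coroutines.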

                              1.  

                                I agree that you need some way of handling async code, but I don’t think coroutines are it, at least not in the async/await form. I still feel like the ‘what colour is your function?’ stuff hasn’t been solved properly. Any function with a callback (sort with a key/cmp function, filter, map, etc.) needs an async_ version that takes a callback and calls it with await. Writing twice as much code that’s trivially different by adding await in some places sucks, but I do not have any clue what the solution is. Maybe it’s syntactic. Maybe everything should be async implicitly and you let the compiler figure out when it can optimise things down to ‘raw’ calls.

                                shrug

                                Worth thinking about at least.

                        2.  

                          I agree with @milesrout. I don’t think Rust is a good replacement for C. This article goes into some of the details of why - https://drewdevault.com/2019/03/25/Rust-is-not-a-good-C-replacement.html

                          1. 16

                            Drew has some very good points. It’s a shame he ruins them with all the other ones.

                            1. 23

                              Drew has a rusty axe to grind: “Concurrency is generally a bad thing” (come on!), “Yes, Rust is more safe. I don’t really care.”

                              Here’s a rebuttal of that awful article: https://telegra.ph/Replacing-of-C-with-Rust-has-been-a-great-success-03-27 (edit: it’s a tongue-in-cheek response. Please don’t take it too seriously: the original exaggerated negatives, so the response exaggerates positives).

                              1. -4

                                Drew is right and this article you link to is just blatant fanboyism. It’s the classic example of fanboyism because it tries to respond to every point, yet some of them are patently true. Like, really? You can’t argue that Rust is more portable than C on the basis that there’s a little bit of leaky abstraction over Windows-specific stuff in its standard library. C is just demonstrably more portable.

                                It criticises C for not changing enough, but change is bad and C89 is all C ever needed in terms of standardisation for the most part. About the only useful thing added since then was stdint.h. -ftrapv exists and thus wanky nonsense about signed overflow being undefined is invalid.

                                I love this bit in particular:

                                In C I could use make, gnu make, cmake, gyp, autotools, bazel, ninja, meson, and lots more. The problem is, C programmers have conflicting opinions on which of these is the obvious right choice, and which tools are total garbage they’ll never touch.

                                In Rust I can use Cargo. It’s always there, and I won’t get funny looks for using it.

                                In C you can use whatever you like. In Rust, if you don’t like Cargo, you just don’t use Rust. That’s the position I’m in. This isn’t better.

                                1. 10

                                  I didn’t read that post as blatant fanboyism, but if someone’s positive and successful experience with Rust is fanboyism, let’s agree to disagree for now.

                                  It criticises C for not changing enough, but change is bad and C89 is all C ever needed in terms of standardisation for the most part.

                                  Change isn’t necessarily bad! With a few exceptions for libraries/applications opting into unstable features, you can compile and use the same Rust code that was originally authored in 2015. However, some of the papercuts that people faced in the elapsed time period were addressed in a backwards-compatible way.

                                  About the only useful thing added since then was stdint.h. -ftrapv exists and thus wanky nonsense about signed overflow being undefined is invalid.

                                  Defaults matter a great deal. People have spent a heroic amount of work removing causes of exploitable behavior in “in-tree” (as much as “in-tree” exists in C…) with LLVM/ASAN, and even more work out-of-tree with toolkits like CBMC, but C is still not a safe language. There’s a massive amount of upfront (and continuous!) effort needed to keep a C-based project safe, whereas Rust works for me out of the box.

                                  In C you can use whatever you like. In Rust, if you don’t like Cargo, you just don’t use Rust. That’s the position I’m in. This isn’t better.

                                  My employer has a useful phrase that I’ll borrow: “undifferentiated heavy lifting”. I view deciding which build system I should use for a project as undifferentiated heavy lifting, as Cargo covers 90-95% of the use cases I need. The remainder is either patched over using ad-hoc scripts or has an upcoming RFC addressing it. This allows me to focus on my project instead of spinning cycles wrangling build systems! That being said, I’ll be the first to admit that Cargo isn’t the perfect build system for every use case, but for my work (and increasingly, for several organizations at my employer), Cargo and Rust are an excellent replacement for C.

                                  1. 8

                                    let’s imagine I download some C from github. How do I build it?

                                    hopefully it’s ./configure && make && make install, but maybe not! Hopefully I have the dependencies, but maybe not! Hopefully if I don’t have the dependencies they are packaged for my distro, but maybe not!

                                    let’s imagine I download some rust from github. How do I build it?

                                    cargo build --release

                                    done

                                    I know which one of those I prefer, personally

                                    1.  

                                      You read the README. It says what you need to do.

                                      cargo build --release

                                      This ease-of-use encourages stuff like needing to compile 200+ Rust dependencies just to install the spotifyd AUR package. It’s a good thing for there to be a bit of friction adding new dependencies, in my opinion.

                                      1. 12

                                        So the alternative that you propose is to:

                                        1. Try to figure out which file(s) (if any) specify the dependencies to install
                                        2. Figure out what those dependencies are called on your platform, or even exist.
                                        3. Figure out what to do when they don’t exist, if you can compile them from source, how, etc
                                        4. Figure out which versions you need, because the software may not work with the latest version available on your platform
                                        5. Figure out how to install that older version without breaking whatever your system may have installed, making sure all your linker flags and what not are right, etc
                                        6. Figure out how to actually configure/install the darn thing, which at this point is something you have probably lost interest in.

                                        Honestly your argument that ease of use leads to 200+ dependencies is a weak argument. Even if all projects suffered from this, from the user’s perspective it’s still easier to just run cargo build --release and be done with it. Even if it takes 10 minutes to build, that’s probably far less time than having to do all the above steps manually.

                                        1. 5

                                          Dude everyone here has had to install C software in some environment at some point. Like we all know it’s not “just read the docs”, and you know we know. What’s the point of pretending it’s not a nightmare as a rule?

                                      2.  

                                        Sorry you got downvoted to oblivion. You make some good points, but you also tend to present trade-offs and opinions as black-and-white facts. You don’t like fanboyism, but you also speak uncritically about C89 and C build systems.

                                        For example, -ftrapv exists and indeed catches overflows at run time, but it doesn’t override the C spec, which defines signed overflow as UB. Optimizers take advantage of that, and will remove naive checks such as if (a>0 && b>0 && a+b<0), because C allows treating the overflow as impossible. It’s not “wanky nonsense”. It’s a real C gotcha that has led to exploitable buffer overflows.

                                        1.  

                                          -ftrapv exists and thus wanky nonsense about signed overflow being undefined is invalid.

                                          Nope, the existence of this opt in flag doesn’t make the complaints about signed overflow nonsensical. When I write a C library, I don’t control how it will be compiled and used, so if I want any decent amount of portability, I cannot assume -ftrapv will be used. For instance, someone else might be using -fwrapv instead, so they can check overflows more easily in their application code.

                                          In C you can use whatever you like.

                                          So can I. So can they. Now good luck integrating 5 external libraries that use, say, CMake, the autotools, and Ninja. When there’s one true way to do it, we can afford lots of simplifying assumptions that make even a non-ideal one true way much simpler than something like CMake.

                                          (By the way, it seems that in the C and C++ worlds, CMake is mostly winning, as did the autotools before, and people will look at you funny for choosing something else.)

                                          1.  

                                            I think I’m done discussing anything remotely controversial on this website. I’m going to get banned or something because people keep flagging my comments as ‘incorrect’ when they’re literally objective fact just because they can’t handle that some people don’t like Rust. It’s just sad. I thought this site was meant to be one where people could maturely discuss technical issues without fanboyism but it seems like while that’s true of most topics, when it comes to Rust it doesn’t matter where you are on the internet: the RESF is out to get you.

                                            It’s not like I’m saying ‘RUST BAD C GOOD’ or some simplistic nonsense. I’ve said elsewhere in the thread I think it’s a great alternative to C++, but it’s just so fundamentally different from C in so many ways that it doesn’t make sense to think of it as a C replacement. I’d love to see a language that’s more like ‘C with lifetimes’ than Rust which is ‘C++ with lifetimes’. Something easier to implement, more portable, but with those memory safety guarantees.

                                            1. 10

                                              I thought this site was meant to be one where people could maturely discuss technical issues

                                              It is. Maturity implies civility, in which almost every comment I read of yours is lacking, regardless of topic. Like, here, there are plenty of less abrasive ways of wording what you tried to say (“wanky nonsense” indeed). Then you assume that you are being downvoted because you hurt feelings with “objective facts” and everyone who disagreed with you is a fanboy, without considering that you could simply be wrong.

                                              Lobste.rs has plenty of mature technical discussion. This ain’t it.

                                              1.  

                                                Drew is right and this article you link to is just blatant fanboyism.

                                                That is not at all objective. You are leaning far out of the window, and people didn’t appreciate it.

                                                It’s fine to be subjective, but if you move the discussion to that field, be prepared for the response to be subjective.

                                                1.  

                                                  A lot of the design of Rust seems to be adding features to help with inherent ergonomics issues in the lifetime system; out of interest, what are some of the things Rust does (or doesn’t do) that you would change to make it more minimalistic?

                                                  I think it’s right not to view Rust as a C replacement in the general case. I kind of view it as an alternative to C++ for programmers who wanted something ‘more’ than C can provide but bounced off C++ for various reasons (complexity, pitfalls, etc.).

                                                  1.  

                                                    I’d like you to stay.

                                                    Before clicking “Post” I usually click “Preview” and read what I wrote. If you think this is a good idea, feel free to copy it :)

                                              2.  

                                                So many bad points from this post.

                                                • We can safely ignore the “features per year” comparison, since the documentation it is based on doesn’t follow the same conventions. I’ll also note that, while a Rust program written last year may look outdated (I personally don’t know Rust well enough to make such an assessment), it will still work (I’ve been told breaking changes are extremely rare).

                                                • C is not really the most portable language. Yes, C and C++ compilers, thanks to having decades of work behind them, target more devices than everything else put together. But no, those platforms do not share the same flavour of C and C++. There are simply too many implementation-defined behaviours, starting with integer sizes. Did you know that some DSPs have 32-bit chars? I worked with someone who programmed one.

                                                  I wrote a C crypto library, and went out of my way to ensure the code was very portable. And it is: embedded developers love it. There was no way, however, to ensure my code was fully portable. I right-shift negative integers (implementation-defined behaviour), and I use fixed-width integers like uint8_t (not supported on the DSP I mentioned above).

                                                • C does have a spec, but it’s an incomplete one. In addition to implementation-defined behaviour, C and C++ also have a staggering amount of undefined and unspecified behaviour. Rust has no spec, but it still tries to minimise undefined behaviour. I expect this point to go away when Rust stabilises and we get an actual spec. I’m sure the formal verification folks will want a verified compiler for Rust, like we currently have for C.

                                                • C has many implementations… and that’s actually a good point.

                                                • C has a consistent & stable ABI… and so does Rust, somewhat? OK, it’s opt-in, and it’s contrived. My point is, Rust does have an FFI which allows it to talk to the outside world. It doesn’t have to be at the top level of a program. On the other hand, I’m not sure what would be the point of a stable ABI between Rust modules. C++ at least seems to be doing fine without that.

                                                • Rust compiler flags aren’t stable… and that’s a good point. They should probably stabilise at some point. On the other hand, having one true way to manage builds and dependencies is a godsend. Whatever we’d use stable compiler flags for, we probably don’t want to depart from that.

                                                • Parallelism and concurrency are unavoidable. They’re not a bad thing; they’re the only thing that can help us cheat the speed of light, which limits single-threaded performance. The ideal modern computer likely has a high number of in-order cores, each with a small amount of memory, and an explicit (exposed to the programmer) cache hierarchy. Assuming, that is, that performance and energy consumption trump compatibility with existing C (and C++) programs. Never forget that current computers are optimised to run C and C++ programs.

                                                • Not caring about safety is stupid. Or selfish. Security vulnerabilities are often mere externalities, which you can ignore if they don’t damage your reputation to the point of affecting your bottom line. Yay capitalism. More seriously, safety is a subset of correctness, and correctness is the main point of Rust’s strong type system and borrow checker. C doesn’t just make it difficult to write safe programs, it makes it difficult to write correct programs. You wouldn’t believe how hard that is. My crypto library had to resort to Valgrind, sanitisers, and the freaking TIS interpreter to root out undefined behaviour. And I’m talking about “constant-time” code with fixed memory access patterns. It’s pathologically easy to test, yet writing tests took as long as writing the code, possibly longer. Part of the difficulty comes from C, not just the problem domain.

                                                Also, Drew DeVault mentions Go as a possible replacement for C? For some domains, sure. But the thing has a garbage collector, making it instantly unsuitable for some constrained environments (either because the machine is small, or because you need crazy performance). Such constrained environments are basically the remaining niche for C (and C++). For the rest, the only thing that keeps people hooked on C (and C++) is existing code and existing skills.

                                                1.  

                                                  But the thing has a garbage collector, making it instantly unsuitable for some constrained environments (either because the machine is small, or because you need crazy performance).

                                                  The Go garbage collector can be turned off with debug.SetGCPercent(-1) and triggered manually with runtime.GC(). It is also possible to allocate memory at the start of the program and use that.

                                                  Go has several compilers available. gc is the official Go compiler, GCC has built-in support for Go and there is also TinyGo, which targets microcontrollers and WASM: https://tinygo.org/

                                                  1.  

                                                    Can you realistically control allocations? If we have ways to make sure all allocations are either explicit or on the stack, that could work. I wonder how contrived that would be, though. The GC is on by default; that’s got to affect idiomatic code in a major way. To the point where disabling it probably means you don’t have the same language any more.

                                                    Personally, to replace C, I’d rather have a language that disables the GC by default. If I am allowed to have a GC, I strongly suspect there are better alternatives than Go. (My biggest objection being “lol no generics”. If the designers made that error, it kind of casts doubt on their ability to properly design the rest of the language, and I lose all interest instantly. Though if I were writing network code, I would also say “lol no coroutines” at anything designed after 2015 or so.)

                                              3.  

                                                I don’t think replacing C is a good usecase for Rust though. C is relatively easy to learn, read, and write to the level where you can write something simple. In Rust this is decidedly not the case. Rust is much more like a safe C++ in this respect.

                                                I’d really like to see a safe C some day.

                                                1. 5

                                                  Have a look at Cyclone mentioned earlier. It is very much a “safe C”. It has ownership and regions which look very much like Rust’s lifetimes. It has fat pointers like Rust slices. It has generics, because you can’t realistically build safe collections without them. It looks like this complexity is inherent to the problem of memory safety without a GC.

                                                  As for learning C, it’s easy to get a compiler to accept a program, but I don’t think it’s any easier to learn to write good C programs. The language may seem small, but the actual language you need to master includes lots of practices for safe memory management and playing 3D chess with an optimizer that exploits undefined behavior.

                                              1. 40

                                                The claim is simple: in a static type system, you must declare the shape of data ahead of time, but in a dynamic type system, the type can be, well, dynamic! It sounds self-evident, so much so that Rich Hickey has practically built a speaking career upon its emotional appeal. The only problem is it isn’t true.

                                                Immediate standing ovation from me.

                                                I can only assume that oft-made claim is perpetuated from a position of ignorance. Have those people actually tried doing the thing in a statically typed language that they claim a statically typed language cannot do? Here’s an approach that appears all over my Haskell projects:

                                                  req <- requireCheckJsonBody :: Handler Value
                                                  let chargeId = req ^. key "data" . key "object" . key "id" . _String
                                                

                                                I don’t know (or care) what the exact structure of the JSON coming over the network will look like. I just know it will contain this one field that I care about, and here I pull it out and read it as a string.

                                                Do I need the entire JSON string to conform to some specific protocol (more specific than JSON itself)? No. I am just parsing it as some JSON (which is represented with the Value type).

                                                Do I need to parse it into some complex data type? No. I’m just building a string. I am doing — in Haskell — exactly the kind of thing that Clojurists do, but without being smug about it.


                                                If we keep the datatype’s constructor private (that is, we don’t export it from the module that defines this type), then the only way to produce a UserId will be to go through its FromJSON parser.

                                                I’m glad I read this article even for this design tip alone. I had never thought to do it this way; I thought a “smart constructor” was always necessary, even when that seemed like overkill.

                                                1. 5
                                                    let chargeId = req ^. key "data" . key "object" . key "id" . _String
                                                  

                                                  So what does this piece of code actually do? Get the value under data->object->id as a String? _String is there to prevent name clashes with actual String? Is the magic here that the JSON payload isn’t parsed any more than it needs to be?

                                                  Stylistically, do you know why Haskell people often seem to decide to use weird operators? Are all alternatives somehow worse?

                                                  1. 8

                                                    So what does this piece of code actually do? Get the value under data->object->id as a String?

                                                    Yeah, exactly. This is how the Stripe API structures their responses. I could have picked a simpler hypothetical example, but I think even this real-world case is simple enough.

                                                    _String is there to prevent name clashes with actual String?

                                                    I believe so, yes. This is just a thing in the lens library.

                                                    Is the magic here that the JSON payload isn’t parsed any more than it needs to be?

                                                    I believe it is parsed only as much as necessary, yes. I’m not sure there’s any magic happening.

                                                    Stylistically, do you know why Haskell people often seem to decide to use weird operators? Are all alternatives somehow worse?

                                                    There are plenty of alternative syntaxes and approaches you could opt for. I happen to find this easy enough to read (and I think you do too, since you worked out exactly what it does), but that is of course subjective.

                                                    1. 3

                                                      the syntactic weirdness is mostly due to the fact that the base grammar is very simple, so you end up basically relying on making infix operators to build symbol-ful DSLs.

                                                        This is very powerful for making certain kinds of libraries, but it means that lots of Haskell looks a bit “out there” if you haven’t looked at code using a specific library before. This tends to be at its worst when doing stuff like JSON parsing (where you have very variably-shaped data).

                                                      1. 6

                                                        Although conversely, I think more typical parsing with Aeson (especially the monadic form) is usually very tidy, and easy to read even by people not familiar with Haskell. It’s much less noisy than my lens example.

                                                        Here’s an example: https://artyom.me/aeson#recordwildcards

                                                        I think you probably know this, but I am posting here mostly so that other curious onlookers don’t get the wrong idea and think that Haskell is super weird and difficult.

                                                    2. -5

                                                      Lol what - you’re defining a benefit of dynamically typed language with your example. The json in your case IS a dynamic object.

                                                      1. 7

                                                        I think you are quite confused about what we’re discussing.

                                                        The discussion is around type systems in programming languages. JSON is just a protocol. The JSON that my example parses is not a “dynamic object”. There is no such thing as a JSON object. JSON is only ever a string. Some data structure can be serialised as a JSON string. A JSON string can potentially be parsed by a programming language into some data structure.

                                                        The JSON protocol can be parsed by programming languages with dynamic type systems, e.g., Clojure, and the protocol can also be parsed by programming languages with static type systems, e.g., Haskell.

                                                        My example is taken verbatim from some Haskell systems I’ve written, so it is not “defining a benefit of dynamically typed language”.

                                                        You’re going to have to go and do a bit more reading, but luckily there is plenty of material online that explains these things. I think your comment is a good example of the kind of confusion the article’s author is trying to address.

                                                        1. 2

                                                          I read the article, and I agree somewhat with the parent commenter. It really seems that the author – and perhaps you as well – was comfortable with the idea of potentially declaring parts of the program as just handling lots of values all of a single generic/opaque/relatively-underspecified type, rather than of a variety of richer/more-specified types.

                                                          That position is not all that far from being comfortable with all values being of a single generic/opaque/relatively-underspecified type. Which is, generally, the most charitable description the really hardcore static-typing folks are willing to give to dynamically-typed languages (i.e., “in Python, all values are of type object”, and that’s only if someone is willing to step up their politeness level a bit from the usual descriptions given).

                                                          In other words, a cynical reading would be that this feels less like a triumphant declaration of “see, static types can do this!” and more an admission of “yeah, we can do it but only by making parts of our programs effectively dynamically typed”.

                                                          1. 2

                                                            I don’t know how you’ve come to this conclusion. Moreover, I don’t understand how your conclusion is related to the argument in the article.

                                                            In other words, a cynical reading would be that this feels less like a triumphant declaration of “see, static types can do this!” and more an admission of “yeah, we can do it but only by making parts of our programs effectively dynamically typed”.

                                                            What does this even mean? How did you come up with this idea? When you want to parse some arbitrary JSON into a more concrete type, you can just do that. How does parsing make a program no longer statically typed?

                                                            1. 2

                                                              What is the difference between:

                                                              1. “Everything in this part of the program is of type JSON. We don’t know what the detailed structure of a value of that type is; it might contain a huge variety of things, or not, and we have no way of being sure in advance what they will be”.
                                                              2. “Everything in this part of the program is of type object. We don’t know what the detailed structure of a value of that type is; it might contain a huge variety of things, or not, and we have no way of being sure in advance what they will be”.

                                                              The first is what the article did. The second is, well, dynamic typing.

                                                              I mean, sure, you can argue that you could parse a JSON into some type of equally-generic data structure – a hash table, say – but to usefully work with that you’d want to know things like what keys it’s likely to have, what types the values of those keys will have, and so on, and from the type declaration of JSON you receive absolutely none of that information.

                                                              In much the same way you can reflect on an object to produce some type of equally-generic data structure – a hash table, say – but to usefully work with that you’d want to know things like… hey, this is sounding familiar!

                                                              Now do you see what I mean? That’s why I said the cynical view here is the author has just introduced a dynamically-typed section into the program.

                                                              1. 2

                                                                Any program which reads some JSON and parses it will be making some assumptions about its structure.

                                                                This is true of a program written in a dynamically-typed language.

                                                                This is true of a program written in a statically-typed language.

                                                                Usually, you will want to parse a string of JSON into some detailed structure, and then use that throughout your system instead of some generic Value type. But you don’t necessarily need to do that. Nothing about writing in a statically-typed programming language forces you to do that. And no, Haskell programmers don’t generally intentionally try to make their programs worse by passing Value types, or generic Map types, or just anything encoded as a String, throughout their program. That would be stupid.

                                                                1. 3

                                                                  OK, I’ll do the long explanation.

                                                                  Many programmers whose primary or sole familiarity is with statically-typed languages assume that in dynamically-typed languages all code must be littered with runtime type checks and assertions. For example, I’ve run into many people who seem to think that all Python code is, or should be, full of:

                                                                  if isinstance(thing, some_type):
                                                                      ...
                                                                  elif isinstance(thing, some_other_type):
                                                                      ...
                                                                  

                                                                  checks in order to avoid ever accidentally performing an operation on a value of the wrong type.

                                                                  While it is true that you can parse a JSON into a data structure you can then pass around and work with, the only way to meaningfully do so is using your language’s equivalent idiom of

                                                                  if has_key(parsed_json, some_key) and isinstance(parsed_json.get(some_key), some_type):
                                                                      ...
                                                                  elif has_key(parsed_json, some_other_key) and isinstance(parsed_json.get(some_other_key), some_other_type):
                                                                      ...
                                                                  

                                                                  since you do not know from the original type declaration whether any particular key will be present nor, if it is present, what type the value of that key will have (other than some sort of suitably-generic JSONMember or equivalent).

                                                                  Which is to say: the only way to effectively work with a value of type JSON is to check it, at runtime, in the same way the stereotypical static-typing advocate thinks all dynamically-typed programmers write all their code. Thus, there is no observable difference, for such a person, between working with a value of type JSON and writing dynamically-typed code.

                                                                  Now, sure, there are languages which have idioms that make the obsessive checking for members/types etc. shorter and less tedious to write, but the programmer will still be required, at some point, either to write such code or to use a library which provides such code.

                                                                  Thus, the use of JSON as a catch-all “I don’t know what might be in there” type is not distinguishable from dynamically-typed code, and is effectively introducing a section of dynamically-typed code into the program.

                                                                  1. 2

                                                                    I still don’t get what point you’re trying to make. Sorry.

                                                                    Thus, the use of JSON as a catch-all “I don’t know what might be in there” type is not distinguishable from dynamically-typed code, and is effectively introducing a section of dynamically-typed code into the program.

                                                                    This now sounds like you’re making an argument about parsing versus validation, and misrepresenting it as static vs dynamic.

                                                                    1. 2

                                                                      This now sounds like you’re making an argument about parsing versus validation, and misrepresenting it as static vs dynamic.

                                                                      For an alternative formulation, consider that people often claim, or want to claim, that in a statically-typed language most of the information about the program’s behavior is encoded in the types. Some people clearly would like a future where all such information is encoded in the types (so that, for example, an add function would not merely have a signature of add(int, int) -> int, but a signature of add(int, int) -> sum of arguments, which could be statically verified).

                                                                      I have complicated thoughts on that – the short hot-take version is those people should read up on what happened to logical positivism – but the point here is a reminder that this article, which was meant to show a way to have nice statically-typed handling of unknown data structures, was able to do so only by dramatically reducing the information being encoded in the types.

                                                                      1. 2

                                                                        the point here is a reminder that this article, which was meant to show a way to have nice statically-typed handling of unknown data structures, was able to do so only by dramatically reducing the information being encoded in the types.

                                                                        …How else would a program know what type the program’s author intends for the arbitrary data to be parsed into? Telepathy?

                                                                        1. 1

                                                                          I think at this point it’s pretty clear that there’s nothing I can say or do that will get you to understand the point I’m trying to make, so I’m going to bow out.

                                                                2. 2

                                                                  What is the difference between:

                                                                  1.  “Everything in this part of the program is of type JSON. We don’t know what the detailed structure of a value of that type is; it might contain a huge variety of things, or not, and we have no way of being sure in advance what they will be”.
                                                                  2. “Everything in this part of the program is of type object. We don’t know what the detailed structure of a value of that type is; it might contain a huge variety of things, or not, and we have no way of being sure in advance what they will be”.

                                                                  The first is what the article did. The second is, well, dynamic typing.

                                                                  The difference is that in a statically-typed language, you can have other parts of the program where proposition 1. is not the case, but in a dynamically-typed language proposition 2. is true all the time and you can’t do anything about it. No matter what style of typing your language uses, you do have to inspect the parsed JSON at runtime to see if it has the values you expect. But in a statically-typed language, you can do this once, then transform that parsed JSON into another type that you can be sure about the contents of; and then you don’t have to care that this type originally came from JSON in any other part of your program that uses it.

                                                                  Whereas in a dynamically-typed language you have to remember at all times that one value of type Object happens to represent generic JSON and another value of type Object happens to represent a more specific piece of structured data parsed from that JSON, and if you ever forget which is which the program will just blow up at runtime because you called a function that made incorrect assumptions about the interface its arguments conformed to.

                                                                  Anyway, even introducing a “generic JSON” type already encodes more useful information than a dynamically-typed language lets you. If you have a JSON type, you might expect it to have some methods like isArray or isObject that you can call on it, and you know that you can’t call methods that pertain to completely different types, like getCenterPoint or getBankAccountRecordsFromBankAccountId. Being able to say that a value is definitely JSON, even if you don’t know anything about that JSON, at least tells you that it’s not a BankAccount or GLSLShaderHandle or any other thing in the vast universe of computing that isn’t JSON.

                                                                  1. 2

                                                                    Whereas in a dynamically-typed language you have to remember at all times that one value of type Object happens to represent generic JSON and another value of type Object happens to represent a more specific piece of structured data parsed from that JSON, and if you ever forget which is which the program will just blow up at runtime because you called a function that made incorrect assumptions about the interface its arguments conformed to.

                                                                    This is where the discussion often veers off into strawman territory, though. Because I’ve written code in both dynamically and statically typed languages (and hybrid-ish stuff like dynamically-typed languages with optional type hints), and all the things people say about inevitable imminent doom from someone passing the wrong types of things into functions are, in my experience, just things people say. They don’t correspond to what I’ve actually seen in real programming.

                                                                    That’s why in one of my comments further down I pointed out that the generic JSON approach used in the article forces the programmer to do what people seem to think all dynamically-typed language programmers do on a daily basis: write incredibly defensive and careful code with tons of runtime checks. My experience is that people who prefer and mostly only know statically-typed languages often write code this way when they’re forced to use a dynamically-typed language or sections of code that are effectively dynamically-typed due to using only very very generic types, but nobody who’s actually comfortable in dynamic typing does that.

                                                                    And the best literature review I know of on the topic struggled to find any meaningful results for impact of static versus dynamic typing on defect rates. So the horror stories of how things will blow up from someone forgetting what they were supposed to pass into a function are just that: stories, not data, let alone useful data.

                                                                    Anyway, cards on the table time here.

                                                                    My personal stance is that I prefer to write code in dynamically-typed languages, and add type hints later on as a belt-and-suspenders approach to go with meaningful tests (though I have a lot of criticism for how Python’s type-hinting and checking tools have evolved, so I don’t use them as much as I otherwise might). I’ve seen too much statically-typed code fall over and burn the instant someone pointed a fuzzer at it to have much faith in the “if it passes type checks, it’s correct” mantra. And while I do enjoy writing the occasional small thing in an ML-ish language and find some of the idioms and patterns of that language family pleasingly elegant, mostly I personally see static typing as a diminishing-returns technique, where beyond a very basic up-front pass or two, the problems that can be prevented by static typing quickly become significantly smaller and/or less likely as the effort required to use the type system to prevent them increases seemingly without bound.

                                                                    1.  

                                                                      This is where the discussion often veers off into strawman territory, though. Because I’ve written code in both dynamically and statically typed languages (and hybrid-ish stuff like dynamically-typed languages with optional type hints), and all the things people say about inevitable imminent doom from someone passing the wrong types of things into functions are, in my experience, just things people say. They don’t correspond to what I’ve actually seen in real programming.

                                                                      I disagree - passing the wrong types of things into functions is definitely a phenomenon I’ve personally seen (and debugged) in production Ruby, JavaScript, and Python systems I’ve personally worked on.

                                                                      For instance, I’ve worked on rich frontend JavaScript systems where I was tasked with figuring out why a line of code a.b.c was throwing TypeError, but only sometimes. After spending a bunch of time tracing where a ultimately came from, I might find that some function many frames away from the error in the call stack set a from the result of an xhr that wasn’t actually guaranteed to set a key b on a. That code was not conceptually related to the code where the error happened, so no one thought it was unusual that a.b wasn’t guaranteed, which is how the bug happened.

                                                                      In a statically typed language, I could convert the JSON that will eventually become a into a specifically-typed value, then pass that down through 10 function calls to where it’s needed, without worrying that I’ll find 10 frames deep that SpecificType randomly doesn’t have a necessary field, because the conversion from the generic to the specific would’ve failed at the conversion site.
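                                                                      A sketch of that conversion-site pattern in TypeScript (SpecificType and the field names mirror the hypothetical a.b.c example above, nothing more):

                                                                      ```typescript
                                                                      // Invented types mirroring the a.b.c story: the specific shape
                                                                      // that code deep in the call stack actually needs.
                                                                      interface Inner { c: number }
                                                                      interface SpecificType { b: Inner }

                                                                      // Conversion site: if the xhr response is missing `b`, this throws
                                                                      // here, right next to the network call -- not ten frames deep
                                                                      // where a.b.c is finally read.
                                                                      function toSpecificType(data: unknown): SpecificType {
                                                                        if (
                                                                          typeof data === "object" && data !== null &&
                                                                          typeof (data as { b?: unknown }).b === "object" &&
                                                                          (data as { b: unknown }).b !== null &&
                                                                          typeof (data as { b: { c?: unknown } }).b.c === "number"
                                                                        ) {
                                                                          const d = data as { b: { c: number } };
                                                                          return { b: { c: d.b.c } };
                                                                        }
                                                                        throw new Error("response does not match SpecificType: missing b.c");
                                                                      }

                                                                      // Deep in the call stack, no defensive checks are needed:
                                                                      function readC(a: SpecificType): number {
                                                                        return a.b.c; // statically guaranteed to exist
                                                                      }
                                                                      ```

                                                                      The runtime check still happens, but exactly once, at the boundary; every function downstream of it gets the guarantee for free from the type checker.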

                                                                      I am a fan of statically typed languages, and a huge reason for this is because I’ve debugged large codebases in dynamically-typed languages where I didn’t write the original code and had to figure it out by inspection. Static typing definitely makes my experience as a debugger better.

                                                                      1.  

                                                                        definitely a phenomenon I’ve personally seen (and debugged)

                                                                        Notice I didn’t say “your stories are false”.

                                                                        Nor did you refute my claims that I’ve seen statically-typed code which passed static type checks fall over and crash when fuzzed.

                                                                        We each can point to instances where our particular bogeyman has in fact happened. Can either of us generalize usefully from that, though? Could I just use my story to dismiss all static typing approaches as meaningless, because obviously they’re not catching all these huge numbers of bugs that must, by my generalization, be present in absolutely every program every person has ever written in any statically-typed language?

                                                                        The answer, of course, is no. And so, although you did write a lot of words there, you didn’t write anything that was a useful rebuttal to what I actually said.

                                                            2. 0

                                                              I appreciate your condescending tone, but really you should work on your ability to argue. The original post claims that this statement is not true:

                                                              in a static type system, you must declare the shape of data ahead of time, but in a dynamic type system, the type can be, well, dynamic!

                                                              You argue, however, that this is indeed not true because you can have “dynamic” data and “static” types, when that’s just a silly loophole. Surely you want data as an object, right? A string of characters without its meta-structure is completely useless in the context of programming.

                                                              Just because you can have a static type that doesn’t have a strict, full protocol implementation doesn’t mean that you don’t need to declare it beforehand, which renders the original statement absolutely correct - you must declare the static shape of data that matches what your type expects. The claim that types can be “loose” doesn’t invalidate this statement.

                                                              1. 3

                                                                I appreciate your condescending tone, but really you should work on your ability to argue.

                                                                I’m sorry you feel that way. I genuinely did my best to be kind, and to present some absolute truths to you that I had hoped would clear up your confusion. Unfortunately, it looks like you’ve decided to dig your heels in.

                                                                You argue however that this is indeed not true because you can have “dynamic” data and “static” types when that’s just a silly loophole.

                                                                I don’t know what you are talking about. Dynamic data? What does this mean? And what silly loophole?

                                                                In the context of this argument: the JSON being parsed is a string. It’s not static. It’s not dynamic. It’s a string.

                                                                which renders the original statement absolutely correct - you must declare static shape of data that matches what your type expects.

                                                                No, you don’t. Again, you have misunderstood me, you have misunderstood the article, and you have misunderstood some pretty basic concepts that are fundamental to constructing a cohesive argument in this debate.

                                                                The argument is whether or not — in a statically-typed programming language — the JSON string you are parsing needs to conform 1:1 to the structure of some data type you are trying to parse it into.

                                                                The answer is: No. Both statically-typed and dynamically-typed programming languages can parse arbitrary data.

                                                                1. 0

                                                                  The answer is: No. Both statically-typed and dynamically-typed programming languages can parse arbitrary data.

                                                                  That was never the topic; you can parse arbitrary data with a pen and a piece of toilet paper…

                                                                  1. 1

                                                                    That was never the topic

                                                                    Yes it was. Perhaps you should have actually read the article.

                                                                    you can parse arbitrary data with a pen and a piece of toilet paper…

                                                                    At this point, it is clear you are not even trying to add anything constructive. I suggest we leave this discussion here.

                                                                    1.  

                                                                      Oof, I guess there had to be a first failed discussion experience here on Lobsters. I’m sorry, but you are absolutely inept at discussing this. Maybe it’s better if we don’t continue this. Cheers.

                                                            3. 2

                                                              The grandparent’s code example uses way more than one type.

                                                          1. -3

                                                            This article is obviously wrong in its conclusion. To see how, first recall that while Haskell’s types don’t form a category, we can imagine a “platonic” Hask whose objects are types, whose arrows are functions, and where undefined and friends have been removed.

                                                            Now, consider that platonic Hask is but one object of Cat. From size issues, it is immediate that Cat cannot be a subcategory of Hask; that is, that Hask cannot describe all of Cat’s objects. It follows that Haskell typeclasses like Functor are not arrows in Cat, but endofunctors on Hask, and that Control.Category does not capture objects in Cat, but the internal category objects in Hask.

                                                            Finally, pick just about any 2-category, indeed say Cat, and then ask whether Hask can represent it faithfully: The answer is a clear, resounding, and obvious “no”. Going further, pick any ∞-category, say Tomb, and then ask whether Hask can even represent a portion of any object; an ∞-object is like a row of objects, one per level, but Haskell’s type system could only see one single level of types at a time. (This is not just theoretical; I have tried to embed Tomb into Haskell, Idris, and Coq, and each time I am limited by the relatively weak type system’s upper limits.)

                                                            I wonder why the author believes otherwise.

                                                            1. 16

                                                              This article is obviously wrong in its conclusion.

                                                              I think the word “obviously” is relative to the reader’s familiarity with category theory.

                                                              For the purposes of the misconception she is addressing, the author’s conclusion — to me — is obviously correct.

                                                              You appear to be refuting her argument in some different context. I’m interested to hear your argument (although it would probably be a long time before I learn the CT necessary to properly understand your argument), but switching out the context the argument was made in to refute the entire original argument makes your own argument (to me, at least) appear as an attack against a straw-man.

                                                              1. -1

                                                                My argument ought to follow readily for any ML, and we can see the scars it causes in the design of many MLs. Idris, for example, uses a hierarchy of universes to avoid universe-inconsistency paradoxes as it climbs this tower that I’m talking about. Haskell and Elm don’t bother trying to climb the tower at all. SML and OCaml have exactly one tier, adding on the module system, and strict rules governing the maps between modules and values.

                                                                I’m not removing the word “obviously”. Cat obviously contains Hask, Set, and many other common type systems as objects; the size issues around Cat are usually one of the first things mentioned about it. (Third paragraph in WP and nCat, for example.) And Cat is one of the first categories taught to neophytes, too; for example, in the recent series of programmer-oriented lectures on category theory, Programming with Categories, Cat is the second category defined, after Set.

                                                                My refutation is of the article’s title: Yes indeed, dynamic type systems are more open, simply because there are certain sorts of infinite objects that, when we represent them symbolically, still have infinite components. Haskell can represent any finite row of components with multi-parameter typeclasses but that is not sufficient for an ∞-category. By contrast, when we use dynamic type systems, especially object-based systems, our main concern is not about the representation of data, since that is pretty easy, but the representation of structures. For categories, for example, there are many different ways to give the data of a category, depending on what the category should do; we can emphasize the graph-theoretic parts, or the set-theoretic parts, or even transform the category into something like a Chu space.

                                                                Finally, if static type systems are so great, why isn’t your metatheory, the one you use for metaphysics and navigating the world, a static type system? Probably because you have some sort of open-world assumption built into the logic that you use for day-to-day reasoning, I imagine. This assumption is the “open” that we are talking about when we talk about how “open” a type system is! Just like how we want a metatheory in our daily lives that is open, we all too often want to represent this same sort of open reasoning in our programming languages, and in order to do that, we have to have ways to either subvert and ignore, or entirely remove, limited static types.

                                                                1. 5

                                                                  My argument ought to follow readily for any ML, and we can see the scars it causes in the design of many MLs. Idris, for example, uses a hierarchy of universes to avoid universe-inconsistency paradoxes as it climbs this tower that I’m talking about.

                                                                  Could you give examples of useful programs that are inexpressible in a typed way without a hierarchy of universes? Even when doing pure mathematics (which demands much stronger logical foundations than programming), most of the time I can fix a single universe and work with (a tiny part of) what lives in it.

                                                                  When programming in ML, the feature that I want the most badly is the ability to “carve out” subsets of existing types (e.g., to specify that a list must contain a given element). This would be actually useful for specifying preconditions and postconditions of algorithms (which is ultimately the point to programming, i.e., implementing algorithms). But it does not require hierarchical type universes.
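                                                                  The “carving out” idea can be partially mimicked in mainstream type systems with a smart constructor hidden behind a branded (opaque) type. A TypeScript sketch, with invented names; genuine refinement types, as in Liquid Haskell or F*, would verify the condition statically rather than at runtime:

                                                                  ```typescript
                                                                  // Carving out a subset of an existing type ("a list with at least
                                                                  // one element") with a branded type plus a smart constructor.
                                                                  type NonEmpty<T> = T[] & { readonly __brand: "NonEmpty" };

                                                                  // The only way to obtain a NonEmpty<T> is through this check.
                                                                  function asNonEmpty<T>(xs: T[]): NonEmpty<T> | null {
                                                                    return xs.length > 0 ? (xs as NonEmpty<T>) : null;
                                                                  }

                                                                  // Functions can now state the precondition in their signature:
                                                                  function head<T>(xs: NonEmpty<T>): T {
                                                                    return xs[0]; // safe: the smart constructor proved non-emptiness
                                                                  }
                                                                  ```

                                                                  The check happens once at construction, and every consumer of NonEmpty<T> inherits the postcondition through the type.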

                                                                  Yes indeed, dynamic type systems are more open, simply because there are certain sorts of infinite objects that, when we represent them symbolically, still have infinite components.

                                                                  You seem to be confusing symbols with their denotation. Symbols are finite out of necessity, but you can use them to denote infinite objects just fine, whether you use a type system or not.

                                                                  Haskell can represent any finite row of components with multi-parameter typeclasses but that is not sufficient for an ∞-category.

                                                                  The arity of a multiparameter type class has absolutely nothing to do with n-categories. But, in any case, why is Haskell supposed to represent ∞-categories in its type system? It is a general-purpose programming language, not a foundation of mathematics.

                                                                  Finally, if static type systems are so great, why isn’t your metatheory, the one you use for metaphysics and navigating the world, a static type system? Probably because you have some sort of open-world assumption built into the logic that you use for day-to-day reasoning, I imagine.

                                                                  Every nominal type definition literally brings a new type of thing into existence. What exactly is this, if not dealing with an open world?

                                                                  And, by the way, my metatheory is ML.

                                                                  1. 3

                                                                    Can any programming language usefully represent these infinite objects? Is that ever useful?

                                                                    Surely you can just build something with opaque objects within Haskell if the type system is too restrictive?

                                                                2. 9

                                                                  I wonder why the author believes otherwise.

                                                                  Probably because the author isn’t comparing Hask to all of category theory. They’re comparing it to the unitype, which cannot faithfully represent anything at all.

                                                                  1. -5

                                                                    As long as we are using “probably” to speak for the author, I think that they probably are not familiar enough with type theory to understand that there are size issues inherent to formalizing type systems.

                                                                    Please reread the original article; they do not talk about “unityping” or Bob Harper’s view on type theory of languages which don’t know the types of every value.

                                                                    1. 26

                                                                      The author is Alexis King, who is a PLT researcher, an expert in both Haskell and Racket and has discussed category theory in depth on Twitter. I’d be shocked if she didn’t understand the ramifications here and was intentionally simplifying things for her target audience.

                                                                      1. -1

                                                                        Sure, and I am just a musician. Obviously, therefore, the author is right.

                                                                        Anyway, they didn’t talk about size issues, nor did they talk about “unitype” ideas, in the article. I am not really fond of guessing what people are talking about. I am happy to throw my entire “probably” paragraph into the trash, as I do not particularly value it.

                                                                  2. 4

                                                                    I don’t know enough category theory to follow your argument precisely, but I’d argue that the category-theoretic perspective isn’t relevant in this discussion. How much of category theory you can model using Haskell’s type system is totally unrelated to how much you can model with a program written in Haskell. I guess I don’t even need to make this argument, but still: whatever code you were planning to write in Javascript can be mechanically translated by a Haskell beginner, line by line, to a Haskell program that simply uses JSON.Value everywhere.

                                                                    I believe the parts of category theory you can’t model in Haskell’s types correspond to the kinds of relationships you can’t get the type checker to enforce for you. And you go into the language knowing you can’t model everything in types, so that’s no news. What’s relevant is how much you can model, and whether that stuff helps you write code that doesn’t ruin people’s lives and puts bread on the table. As a full-time Haskeller for a long time, my opinion is that the answer is “yes”.

                                                                    I think the friction comes from the desire to view the language as some sort of deity to which you can describe your most intricate thoughts, and it will start telling you the meaning of life. For me, once I stopped treating GHC (Haskell’s flagship compiler) as such and started viewing it as a toolbox for writing ad-hoc support structures to strengthen my architecture here and there, it all fell into place.

                                                                    1. 2

                                                                      I’m going to quote some folks anonymously from IRC, as I think that they are more eloquent than I am about this. I will say, in my own words, that your post could have “Haskell” replaced with basically any other language with a type system, and the same argument would go through. This suggests that the discussion is not at all about Haskell in particular, but about any language with a type system. I would encourage you to reconsider my argument with that framing.

                                                                      (All quoted are experts in both Python and Haskell. Lightly edited for readability.)

                                                                      Maybe another way of making the point is to say that the job of a type system is to reduce the number of programs you can write, and proponents of a type system will argue that enough of the reduction comes from losing stupid/useless/broken programs that it’s worth it.

                                                                      The problem with this argument and the statement [IRC user] just made is the same, I think. It depends. Specifically, it depends on whether one is trying to use the type system as a mathematical object, or as a practical programming tool. And further, on how good your particular group of programmers is with their practical programming tools on the particular day they write your particular program. With a mathematical system, you can produce something correct and prove it; with a practical programming tool, you can produce something correct and run it.

                                                                  1. 23

                                                                    FTFY: “A plea to developers everywhere: Write Junior Code”

                                                                    Let’s get bogged down with how much simple code we write.

                                                                    God, I wish every developer would make an effort to write simple code.

                                                                    1. 7

                                                                      I don’t disagree with you at all, but Haskell does have a bit of a spiral problem with these sorts of things; often folks writing even simple Haskell programs end up using very exotic types that are abstruse to more junior devs (or even more senior devs who just haven’t looked at, say, lenses before). I have this tweet about a simple dialect of Haskell saved because I think about this often when interacting with Haskell code.

                                                                      1. 8

                                                                        Those exotic types describe complexity that is present in other languages as well. However, in other languages, you do not need the type checker’s permission to introduce complexity. Instead, you discover this complexity after the fact by debugging your program.

                                                                        It is questionable whether the Haskell approach is as wise as it is clever. At least to me, it does not seem very suitable for writing what the original post calls “junior code”. Consider some of Haskell’s main features:

                                                                        • Purity and precise types:

                                                                          • Benefit: You can use equational reasoning to understand the complexity in your code.
                                                                          • Drawback: You cannot ignore the complexity in your code, even when it does not matter to you.
                                                                        • Lazy evaluation:

                                                                          • Benefit: It is easy to write programs that manipulate conceptually large data structures, but in the end only need to inspect a tiny part of them.
                                                                          • Drawback: It is difficult to track the sequence of states resulting from running your program.
                                                                        • Higher-kinded types:

                                                                          • Benefit: It is possible to abstract not only over concrete types, such as Int or String, but also over “shapes of data types”, such as List or Tree (leaving the element type unspecified).
                                                                          • Drawback: Oftentimes, type errors will be an unintelligible mess.

                                                                        It is ultimately a subjective matter whether these are good tradeoffs.
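                                                                        The lazy-evaluation benefit and drawback above can be felt outside Haskell too; a sketch using a TypeScript generator, where only the demanded prefix of a conceptually infinite sequence is ever computed:

                                                                        ```typescript
                                                                        // A conceptually infinite structure: only the elements a consumer
                                                                        // actually demands are produced (the benefit), but *when* each one
                                                                        // runs depends on the consumer, not on this definition (the drawback).
                                                                        function* naturals(): Generator<number> {
                                                                          for (let n = 0; ; n++) {
                                                                            yield n;
                                                                          }
                                                                        }

                                                                        // Demand only a finite prefix of the infinite sequence.
                                                                        function take<T>(gen: Generator<T>, k: number): T[] {
                                                                          const out: T[] = [];
                                                                          for (const x of gen) {
                                                                            if (out.length >= k) break;
                                                                            out.push(x);
                                                                          }
                                                                          return out;
                                                                        }

                                                                        const firstFive = take(naturals(), 5); // only five numbers are ever computed
                                                                        ```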

                                                                        1. 6

                                                                          often folks writing even simple Haskell programs end up using very exotic types

                                                                          … abstruse …

                                                                          🤔

                                                                        2. 1

                                                                          Isn’t a large aspect of Java and C# that they force you to write simple code? Then they get called “blub” languages or whatever. The reality is that you should write for whoever your audience is. Explaining everything such that a six-year-old can understand it requires an inordinate amount of effort, and without picking a target audience this is what your suggestion devolves into.

                                                                          1. 6

                                                                            Isn’t a large aspect of Java and C# that they force you to write simple code?

                                                                            No. C# has had type inference, covariant and contravariant generics, opt-in dynamic typing as distinct from type inference, lambdas, value variables, reference variables, checked and unchecked arithmetic, and G–d knows what else I’m forgetting since at least the late 2000s. Java’s missing some of that (although less and less recently), but adds to it things like implicit runtime code generation, autoboxing, and a bunch of other stuff. Neither language is intrinsically simple.

                                                                            But that said, I don’t honestly know that they’re honestly much more complicated than most languages, either. They’re more complicated than Go, maybe, but I don’t even know for sure if they’re more complicated than Python. The thing is that Java projects—at least, the “enterprise” ones for which the language has become famous—go crazy with complexity, despite—and often at odds with—the underlying language. There’s nothing preventing Python from doing absolutely crazy things, for example, and people who remember pre-1.0 versions of Django might recall when it used metaclasses and what would now be importlib to make one hell of a lot of magic happen in model classes. But the community rejects that approach. The Java community, on the other hand, is happy to go crazy with XML, factories, and custom class loaders to roam way into the Necronomicon of software development. I tend to regard this as the ecosystem, rather than the language, going to the extreme.

                                                                            Haskell in practice, to me, feels like what C# or Java code taken to the extreme would look like. And there are indeed libraries like language-ext for C# or Arrow (which is for Kotlin, but same difference) that do go there, with (IMVHO) disastrous results. (Disclaimer: I work heavily on an Arrow-based code base and am productive in it, albeit in my opinion despite that comment.) This is also an ecosystem decision, and one that I think this article is rightfully and correctly railing against.

                                                                            1. 4

                                                                              There’s nothing preventing Python from doing absolutely crazy things, for example, and people who remember pre-1.0 versions of Django might recall when it used metaclasses and what would now be importlib to make one hell of a lot of magic happen in model classes. But the community rejects that approach.

                                                                              I don’t think that’s true at all. The difference is that Python has good abstractions, so if you want to do something complex under the hood, you can still expose a simple interface. In fact, Python programmers would much rather use something with a simple interface and complex internals than the other way around. That’s why they’re using Python!

                                                                              1. 3

                                                                                I’m not sure we’re disagreeing, except that I think you’re implying that Java and C# lack an ability to expose something with complex internals and a simple interface. I’m logging off tech for the weekend, but Javalin is a great example of a Java framework that’s on par with Flask in terms of both simplicity and power, and done with 100% vanilla Java. It’s just not popular. And the reason I cited early versions of Django for Python is specifically because the community felt that that tradeoff of a simple interface for complex internals went too far. (If you have not used way-pre-1.0 versions of Django, it did Rails-style implicit imports and implicit metaclasses. We are not talking about current, or even 1.0, Django here.)

                                                                                In other words, I think you’re making my point that this is about culture and ecosystem, not language in the abstract. Which is also why this article is making a plea about how to write Haskell, and not about abandoning Haskell for e.g. OCaml.

                                                                                1. 3

                                                                                  Ah right yes I see about the Django thing. I was thinking about how it uses them now. I wasn’t aware it did import magic before, that definitely sounds a bit much!

                                                                          2. 1

                                                                            I used to use juxt and comp and partial quite a bit in my Clojure code, but these days I try to avoid them. They’re clever, they’re fun, they’re succinct… but they can also make it harder for the next person who comes along if they’re not already a Clojure hotshot.

                                                                            1. 5

                                                                              That’s setting a pretty low bar, isn’t it? Partially applying functions isn’t exactly whizz-bang fancy-pants programming in a Lisp.

                                                                              1. 2

                                                                                And yet, there’s usually another way to write it that’s more clear to someone not as familiar with Lisps.

                                                                                (I’m not saying “never use these”. There are definitely times when it’s more awkward to use something else.)

                                                                                1. 3

                                                                                  Function composition is the most fundamental functional programming concept as far as modularity is concerned, and partial application is not far behind. They are not specific to Lisps. juxt is slightly more “clever,” but nonetheless provides a ton of utility, is a part of the core library, and should not be shied away from. Talking about avoiding these functions without explicit examples or clear criteria is pointless.

                                                                                  Do you disapprove of any macro usage in your Clojure code? Are transducers out? What about core.async? I’ve seen more “clever” and confusing code written using those features than with any of the functions you’ve listed. For that matter, the worst (all?) Clojure codebases tend to be agglomerations of layer after layer of “simple” map-processing functions which are impossible to grasp in the aggregate and incredibly frustrating to debug. This is evidence of a general lack of coherent system-level thinking, versus any specific features in Clojure being responsible for complex, unmaintainable code.

                                                                                  The guidelines for writing clean, simple, maintainable code are never so straightforward such that they can be stated pithily, to the chagrin of Rich Hickey true-believers everywhere. It’s a combination of figuring out what works for a given team, adopting conventions and architecture well-suited to the domain, and choosing an environment and libraries to integrate with so that you introduce as little friction as possible (and probably more that I’m forgetting, unrelated to the choice of language). But picking and choosing arbitrary functions to eschew will not get you very close to the goal of writing simple code.

                                                                                  1. 2

                                                                                    I think you’re taking this a lot farther than what I actually said.

                                                                                    1. 2

                                                                                      I’m sorry, I was trying to respond systematically to a comment I disagreed with. If you wouldn’t mind: how exactly did I take it too far?

                                                                                      1. 1

                                                                                        Well, I didn’t say “don’t use these”, I said that I “try to avoid them”. I don’t always succeed in that, and I’m happy to use them where they make sense.

                                                                                        There’s a continuum between “can’t avoid it” and “totally gratuitous” and I try to push my personal cutoff towards the left, there. When it would make the code harder to read, I don’t avoid them!

                                                                                        1. 1

                                                                                          Well, I didn’t say “don’t use these”, I said that I “try to avoid them”. I don’t always succeed in that, and I’m happy to use them where they make sense.

                                                                                          Why do you try to avoid using them? When does it make sense to use them?

                                                                          1. 3

                                                                            This phenomenon is amusing the first time you encounter it. But after running into it over and over, in multiple languages, it stops being funny and becomes a major preoccupation. It is high time we investigated its fundamental causes. Why do programming language features that individually seem sensibly designed have such unexpected interactions when put together? Perhaps there is something wrong with the process by which language features are usually designed.

                                                                            1. 2

                                                                              So what are your thoughts there?

                                                                              1. 3

                                                                                Doesn’t this behavior seem common to many/most software systems? Initially systems are conceptually simple and consistent, but over time they get ad hoc extensions that cause unexpected complexities. Those extensions cause unexpected behavior and possibly bugs.

                                                                                Programming languages seem to have stricter backwards compatibility requirements than many other software projects, so it makes sense that mistakes would accrue over time.

                                                                                1. 1

                                                                                  Backwards compatibility is only a manifestation of a more fundamental problem. General-purpose programming languages are meant to be, well, general-purpose, i.e., address a very large space of use cases that you cannot possibly hope to enumerate. Designing language features based on ad-hoc use cases is a mistake.

                                                                                2. 1

                                                                                  Design features of general-purpose programming languages based on general principles, not use cases. Make sure that your principles neither (0) contradict each other, nor (1) redundantly restate each other. This is more likely to lead to orthogonal language features.

                                                                                  By definition, use cases are concrete and specific. They are useful as guidelines for designing software that addresses concrete and specific needs, and is unlikely to be used in situations that you cannot foresee in advance. If a user comes up with a gross hack to use your software for something else, and ends up burning themselves, you can rightfully tell them “Well, that is not my problem.”

                                                                                  However, a general-purpose programming language does not fit the above description. By definition, the ways in which a general-purpose programming language can be used are meant to be limitless, but you can only imagine finitely many use cases. A general principle has the advantage that you can apply it to a situation that only arose after you stated the principle.

                                                                                3. 2

                                                                                  One would imagine that the superior way would then be to make extending the language as easy as possible. In other words, Lisp. Every Lisp user is a potential Lisp developer (wink). The extensions would compete against each other like regular libraries do, and the cream would rise to the top.

                                                                                  But the actual effect (at least the way the Lisp community currently is) seems to be that since extending the language is so easy, everybody just extends it to their own liking and no (or very rare) centralized improvements that everyone adopts happen. Nobody codes Lisp, but Lisp+extension set #5415162.

                                                                                  Or perhaps it just has too many parentheses. Pyret might show if that’s the problem.

                                                                                  1. 2

                                                                                    This just pushes the problem onto the user community. A programming language needs a vision, and a vision needs a visionary.

                                                                                  2. 1

                                                                                    The problem isn’t the features, it’s that people expect to use something as complex as a programming language without a single bit of reading.

                                                                                    Nobody expects to be able to just waltz up to a bridge building project and play around without knowing anything about engineering. Yet people think that Python should just work exactly the way they imagine it to work in their head.

                                                                                    1. 1

                                                                                      it’s that people expect to use something as complex as a programming language without a single bit of reading.

                                                                                      The problem you mention is very real too, but it is not fair to blame it only on the language users. Programming languages are designed in a way that makes it difficult to learn all their intricacies. Oftentimes, even the language designers are not aware of the consequences of their own designs. Features are conceptualized by their inventors exclusively in operational terms (i.e., “How do we desugar this into smaller imperative steps?”), and not enough thought is put into the question “What does this actually mean?”

                                                                                      Try picking arbitrary pairs (X,Y), where X is a programming language and Y is a feature in X not shared with many other languages. Enter X’s IRC channel and ask why Y was designed the way it is. Count how many times they actually give you a design rationale vs. how many times they reply as if you had asked how Y works. And, when they do give a design rationale, count how many times it looks like a post hoc rationalization of the prior design.

                                                                                      1. 3

                                                                                        The problem is that people have a shallow, surface-level understanding of two features, then when they combine them they act in a way that you can only understand if you have a deeper understanding of the features. Then they throw up their hands and say ‘WTF?’

                                                                                        ‘WTFs’ in programming languages, a ‘meme’ that really started with PHP in my opinion, made a lot more sense when it was the deeper design of individual features that was batshit crazy. Now people are just applying it to every language they don’t like. Two features interact in a way that doesn’t make sense from my perspective of shallow understanding? Must be the language that’s broken.

                                                                                        If you actually understand the features in the context of their design - which yes, might very well be syntactic sugar over a series of small imperative steps, what’s wrong with that? - then you’ll understand why they work the way they do.

                                                                                        1. 1

                                                                                          If you actually understand the features in the context of their design - which yes, might very well be syntactic sugar over a series of small imperative steps, what’s wrong with that? - then you’ll understand why they work the way they do.

                                                                                          Sure, you will understand the mechanics of how it works. But this will still give you zero insight on why the feature makes sense. It might turn out that the feature does not actually make the intended sense. Consider this answer by the ##c++ quote bot on Freenode:

                                                                                          • unyu: !perfect
                                                                                          • nolyc: The C++11 forwarding idiom (which uses a variadic template and std::forward) is not quite perfect, because the following cannot be forwarded transparently: initializer lists, 0 as a null pointer, addresses of function templates or overloaded functions, rvalue uses of definition-less static constants, and access control.

                                                                                          In other words, the feature’s behavior deviates from what its own users consider reasonable.

                                                                                          what’s wrong with that?

                                                                                          The problem is that it is ad hoc. Memorizing lots of ad hoc rules does not scale.

                                                                                          Programming is a purposeful activity. When you program, you usually want to achieve something other than just seeing what the computer might do if you give it this or that command. The meaning of a language feature is what allows you to decide whether using the feature contributes towards your actual goal.

                                                                                          1. 4

                                                                                            I’m not at all defending C++ here. It’s a perfect example of where there really is a problem. But I don’t think that the Python examples on the linked page are like this at all. They’re basic interactions of features that make perfect sense if you understand those features beyond the basic surface level.

                                                                                            Some of them (e.g. ‘yielding None’) aren’t ‘WTFs’ they’re just bugs. Bugs that have been fixed! Some of them are basic features of Python, like default arguments being evaluated at function definition time. One of them is that you need to write global a to be able to modify a global variable called a within a function! That’s a good thing! That’s not a WTF. An entirely local statement like a = 1 suddenly modifying global state because you added a new global variable would be a WTF.

                                                                                            1. 1

                                                                                              Oh, okay. You have a point there.

                                                                                  1. 6

                                                                                    The Python lambda isn’t quite the same as lambda in Lisp, but as far as I can tell it’s pretty close. Anonymous functions are cool, and Python has that coolness.

                                                                                    Scheme gets lexical scope mostly right. The way the top-level environment is handled in the REPL is slightly wrong (in spite of its name, define does not always define a new symbol), but it is not too wrong. Most importantly, it will not affect you if you are batch compiling your programs the good old-fashioned way, which is when correctness matters the most.

                                                                                    The same cannot be said about Python. Environments are mutable dictionaries just like any other. There is a little bit of syntactic sugar for manipulating those dictionaries, because otherwise things would be too weird, and that is it.

                                                                                    Unfortunately, without lexical scope, lambda is just another letter in the Greek alphabet.

                                                                                    1. 2

                                                                                      The best part about C: You can do absolutely anything in it.

                                                                                      The worst part about C: You can do absolutely anything in it, so it’s now 100% on you to make sure you didn’t do anything dumb in your 100k lines of code.

                                                                                      Spoiler alert - you probably did do something dumb, and won’t find out until the NSA/Russians/Chinese/North Koreans etc takes over your server with a 0-day.

                                                                                      1. 2

                                                                                        The best part about C: You can do absolutely anything in it.

                                                                                        This is only true if you program alone and your programs are not meant to be used by anyone else. You may not even regard [yourself as a programmer] and [yourself as a user] as two distinct roles. In this extremely implausible scenario, you may say that you can do absolutely anything.

                                                                                        As soon as there are two or more people/roles involved, inevitably one person/role will want to restrict what another person/role can do. Because, if you cannot restrict what others do, then you cannot rely on them doing useful things for you. At best, they will accidentally do things that you find useful, and you will have to take it as a gift.

                                                                                        Cooperation is agreeing to a set of restrictions that lead to a better result for everyone than just working in isolation. Note that this is precisely what modularity is: The author of a module is not free to do literally what they want to. They have to communicate with other modules through their public interfaces.

                                                                                        This is a very steep price that you pay when you use C. You cannot enforce modularity in the language. You can only hope that others will respect your module’s internals, and if they do, you will have to take it as a gift.

                                                                                      1. 3

                                                                                        C has no easy way to return multiple values; the natural modern API is Go’s approach of returning the result and the errno

                                                                                        In C, it is „ugly“ and „procedural“, but possible and quite easy – you can have an „output parameter“ (a pointer) that is filled by the function (actually a procedure). It is a common way to „return“ multiple values from a „function“. Often the return value is „misused“ for exceptions/errors and the actual return value (something useful, why you called the function) is returned through an output parameter. Just the syntactic sugar is missing. I do not like this C way much, but I must admit that it is something that works and can be used.
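                                                                                        A minimal sketch of that idiom (the function name checked_div and its error convention are made up for illustration):

                                                                                        ```c
                                                                                        #include <stddef.h>

                                                                                        /* Out-parameter idiom: the return value carries a success/error code,
                                                                                         * and the "real" result is written through a pointer argument.
                                                                                         * Returns 0 on success, -1 on failure (in which case *out is untouched). */
                                                                                        static int checked_div(int num, int den, int *out)
                                                                                        {
                                                                                            if (den == 0 || out == NULL)
                                                                                                return -1;      /* error: nothing is written to *out */
                                                                                            *out = num / den;
                                                                                            return 0;           /* success: *out now holds the quotient */
                                                                                        }
                                                                                        ```

                                                                                        At the call site, the caller declares a variable, passes its address, and branches on the returned code.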

                                                                                        1. 3

                                                                                          I would agree that “return a success/failure code and write output values via pointers instead” is a better idiom for C. When I’ve tried using apr (it’s surprisingly nice except that I could not find any documentation bar the headers) which uses this style pervasively, it’s relatively okay. Every function had identical error checking style. Downsides are that every function call takes 2 or more lines of boilerplate: declare automatic variables for the return value(s), call the function, branch on failure to a handler.

                                                                                          For historical reasons, though, none of the posix C API is written in that style and we get errno (which sucks) and in-band errors instead (which sucks more) from many (but not all) functions.
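                                                                                          For contrast, the classic POSIX shape is a sentinel return value plus the global errno (the wrapper name report_open here is hypothetical):

                                                                                          ```c
                                                                                          #include <errno.h>
                                                                                          #include <fcntl.h>
                                                                                          #include <stdio.h>
                                                                                          #include <string.h>

                                                                                          /* POSIX in-band error reporting: open(2) returns either a valid file
                                                                                           * descriptor or the sentinel -1, and the actual failure reason lives
                                                                                           * in the global errno, which the caller must remember to inspect
                                                                                           * before the next library call clobbers it. */
                                                                                          int report_open(const char *path)
                                                                                          {
                                                                                              int fd = open(path, O_RDONLY);
                                                                                              if (fd == -1) {
                                                                                                  fprintf(stderr, "open(%s): %s\n", path, strerror(errno));
                                                                                                  return -1;
                                                                                              }
                                                                                              return fd;
                                                                                          }
                                                                                          ```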

                                                                                          1. 0

                                                                                            Such inconvenience is mostly caused by the manual memory management. This is one of the main reasons why I prefer higher-level languages than C for almost everything (application code). But C still makes sense in some cases – as a low-level language, or as a „gateway language“ for linking code written in various languages – the C API can be called from probably every other language + many languages allow exporting their code/functions as a C API, so it allows linking code written in different languages together.

                                                                                            Often this is the only option. However, it would be nice to have a similar bridge for object-oriented APIs. The D language can do this with C++… Of course, we can use some protocol (e.g. D-Bus, SNMP, CORBA or WebServices) and communicate over sockets, but that is a completely different story than direct function/method calls.

                                                                                            1. 3

                                                                                              Yeah, uh, I don’t know why you are talking about this here? The discussion here is about why the posix C API kind of sucks as an ABI from Golang’s perspective.

                                                                                              1. 2

                                                                                                Such inconvenience is mostly given by the manual memory management.

                                                                                                Not really. Go has GC but its error handling is only a little less boiler-plate-y, while writing purely unsafe Rust has very little of it because its type system and the language do more of the work for you.

                                                                                            2. 1

                                                                                              In C, it is „ugly“ and „procedural“, but possible and quite easy – you can have an „output parameter“ (a pointer), that is filled by the function (actually a procedure).

                                                                                              What exactly is so ugly about this? It is simple and clean, and it lets you write the outputs of a procedure exactly where they are needed. The only “downside” seems to be that you cannot chain operations like foo.do_this().do_that().and_then_that(). But manipulating imperative data structures in this fashion seems like a pain to debug, even if you are using automatic memory management.

                                                                                            1. 22

                                                                                              The “O” part is consistently omitted by all comparisons. The most unusual thing about it is that the object system is structural rather than nominal, and there’s type inference for it. Classes are just a handy way to make many objects of the same type.

                                                                                              For example, the type of a function let f x = x#foo () is < foo : unit -> 'a; .. > -> 'a, which means “any object that provides a method foo with type unit -> anything”:

                                                                                              # let f x = x#foo () ;;
                                                                                              val f : < foo : unit -> 'a; .. > -> 'a = <fun>
                                                                                              
                                                                                              # let o = object 
                                                                                                method foo () = print_endline "foo" 
                                                                                              end ;;
                                                                                              val o : < foo : unit -> unit > = <obj>
                                                                                              
                                                                                              # f o ;;
                                                                                              foo
                                                                                              - : unit = ()
                                                                                              
                                                                                              # let o' = object
                                                                                                method foo n = Printf.printf "%d\n" n 
                                                                                              end ;;
                                                                                              val o' : < foo : int -> unit > = <obj>
                                                                                              
                                                                                              # f o' ;;
                                                                                              Line 1, characters 2-4:
                                                                                              Error: This expression has type < foo : int -> unit >
                                                                                                     but an expression was expected of type < foo : unit -> 'a; .. >
                                                                                                     Types for method foo are incompatible
                                                                                              
                                                                                              1. 17

                                                                                                I regularly recommend the OCaml object system to people. Not only is it decently nice (but underused, because other parts of the language are more popular and cover most of what you might want) but it also challenges your ideas of what “OO” is in very credible and nice ways.

                                                                                                Classes are clearly separated from object types and interfaces. Inheritance clearly relates to classes as opposed to objects. Structural typing, which exists throughout the OCaml universe, makes clear what subtyping is offering (e.g., a more restricted sort of subsumption than structural classes will give you).

                                                                                                 It’s just an alternate-universe object system which is entirely cogent, replicates most of the features you might expect, but does so with a sufficiently different spin that you can clearly see how the different pieces interact. Highly recommended!

                                                                                                1. 6

                                                                                                  OCaml’s object system reflects what a type theorist sees in an object-oriented language. You know, the kind of people who write things like TaPL’s three chapters on objects. What these people see is static structure, which, from their point of view, is fundamentally flawed, e.g., “it is wrong to make either Circle or Ellipse a subtype of the other one” and “inheritance does not entail subtyping in the presence of binary methods”.

                                                                                                  However, what a type theorist sees does not always match what a programmer sees. A conversation between a type theorist and a programmer could go this way:

                                                                                                  • Type theorist: Here, look! I have finally devised a way to adequately describe object-oriented languages!
                                                                                                  • Programmer: By “adequately describe”, do you mean “give types to”?
                                                                                                  • Type theorist: Of course. Sound types describe the logical structure of programs, in particular, your object system.
                                                                                                  • Programmer: I do not see how or why you can make such a claim. The object systems that I use are dynamic and programmable, so that I can adapt the system to my needs, instead of getting stuck at a single point in the language design space. Are you saying that your type system changes the type of every object every time I reconfigure the system?
                                                                                                  • Type theorist: No. I just describe the parts that will not change, no matter how you reconfigure the system.
                                                                                                  • Programmer: Then you are not describing much that is useful, I am afraid.
                                                                                                  • Type theorist: By the way, I have concluded that your system has the grand total of exactly one type.
                                                                                                  • Programmer: sigh

                                                                                                  And another conversation could go this way:

                                                                                                  • Programmer: (to a newbie) The internal state of an object is hidden from the rest of the program. Objects communicate by sending messages to each other.
                                                                                                  • Newbie: What is the point of this?
                                                                                                  • Programmer: This helps you compartmentalize the complexity of large software. Each object manages a relatively small part of the program’s state. Small enough to understand.
                                                                                                  • Newbie: Ah, okay. I can see why you would want to work this way.
                                                                                                  • Type theorist: Objection! You claim the state of an object is encapsulated, but your development tools are fundamentally based on the ability to inspect the internal state of every object, even when it is running.
                                                                                                  • Programmer: Right. In an object-oriented system, encapsulation is not enforced by the language, but is rather a property of how the system is designed.
                                                                                                  • Type theorist: So, in other words, encapsulation is a property of how you program the system, rather than of the system itself.
                                                                                                  • Programmer: That is not quite precise. Encapsulation is a property of the system. However, the system is more than just the language it is programmed in.
                                                                                                  1. 8

                                                                                                    So are you saying that the only real-world usable object systems are dynamically-typed ones? One of the most respected OOP teachers and language designers (Bertrand Meyer, of Eiffel and Design by Contract fame), disagrees with you on that one: https://en.wikipedia.org/wiki/Eiffel_(programming_language)

                                                                                                    1. 3

                                                                                                      In retrospect, my comment was much harsher than I wanted it to be. What I really want to see is a CLOS-like object system, embedded in a static language, preferably ML.

                                                                                                      In practice, the most popular object systems are not fully static. Java and .NET rely on runtime class metadata for lots of things. And, even if they didn’t, I doubt their designers would trust their own designs to be safe enough to remove all the runtime type checking.

                                                                                                      Eiffel would be a much better argument for your position if it were actually type safe. Pony is, to the best of my knowledge, the most widely used fully type safe object-oriented language.

                                                                                                      1. 2

                                                                                                        Hmm, Ada certainly takes a lot of inspiration from Eiffel. And I’d say Ada is probably more popular than Pony. Actually I’m not even sure if Pony is ‘OOP’ in the sense that the other mainstream languages are. AFAIK it is actor-based.

                                                                                                        1. 2

                                                                                                          Oops, you are right there.

                                                                                                          1. 2

                                                                                                            At least according to Wikipedia, Ada precedes Eiffel in time.

                                                                                                            Certainly one of the few things I know about Ada is that it supports programming by contract.

                                                                                                            1. 3

                                                                                                              But I think Ada’s contract system is inspired by Eiffel’s though?

                                                                                                              1. 5

                                                                                                                Correct, Eiffel is where design-by-contract originated. Ada added support in the 2012 spec.

                                                                                                                See the comparison here, where it refers to “preconditions and postconditions” (since “design-by-contract” is now a trademark owned by Eiffel Software).

                                                                                                                1. 1

                                                                                                                  Thanks for clearing this up for me!

                                                                                                      2. 4

                                                                                                        I feel like you’re arguing a lot for dynamic types as opposed to static types. That’s totally fine. I’m speaking pretty much entirely to the static components and trying to talk about something pretty different.

                                                                                                        There’s a different discussion, not really related at all to OO, about how the things you’re discussing here can be interpreted via types. I’m personally of the belief that most programmatic reasoning could be modeled statically, though perhaps not in any system popularly available. To that end, speaking using types can still be important.

                                                                                                        Which, to the point here, we can model a lot of these dynamic interactions using types, at least temporarily. And there remains a big difference between the way that “factories for creating objects” can inherit from one another and the way that “one object serves as a suitable stand-in for another”: the two are distinct, if sometimes related, concepts.
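
                                                                                                        OCaml happens to make exactly this separation visible; a small sketch (class names are mine, purely illustrative): inheritance relates classes, while “stand-in” is a property of object types, which a classless object can satisfy just as well.

```ocaml
(* Inheritance: [loud_greeter] reuses the implementation of [greeter]. *)
class greeter name = object
  method greet = "hello, " ^ name
end

class loud_greeter name = object
  inherit greeter name as super
  method greet = String.uppercase_ascii super#greet
end

(* Subtyping/stand-in: any object of type [< greet : string >] works,
   including one built with no class at all. *)
let anon = object method greet = "hi" end

let greet_all (gs : < greet : string > list) =
  List.iter (fun g -> print_endline g#greet) gs

let () =
  greet_all [ (new greeter "ada" :> < greet : string >);
              (new loud_greeter "ada" :> < greet : string >);
              anon ]
```

                                                                                                        The “factory” relation (inherit) and the “stand-in” relation (the coercions into < greet : string >) are declared and checked separately.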

                                                                                                        1. 1

                                                                                                          I feel like you’re arguing a lot for dynamic types as opposed to static types.

                                                                                                          More precisely, I’m arguing for the use of dynamism in object systems, even (or perhaps especially!) if they are embedded in languages with static types. Multiple static types are already useful, but they would be even more useful if the dynamic language’s unitype were one of them, in a practical way.

                                                                                                          Not everything needs to be object-oriented or dynamically typed, though. For example, basic data structures and algorithms are best done in the modular style that ML encourages.

                                                                                                          1. 2

                                                                                                            That’s a reasonable thing to ask for. I think many statically typed languages do offer dynamic types as well (in Haskell, for instance, there’s Dynamic, which is exactly the unitype you’re looking for). Unfortunately, whenever these systems exist they tend to be relegated to “tricking the type system into doing what you want” more than playing a lead role.

                                                                                                            Personally, this makes sense to me: when 80% of my work is statically typed, I tend to want to extend the benefits I receive there more than introduce new benefits from dynamism.

                                                                                                            I’d be interested to see if there were some good examples of how to encode an interesting dynamically typed idiom into something like Haskell’s Dynamic.

                                                                                                            1. 1

                                                                                                              Haskell’s Dynamic would not work very well in ML. Usually, the runtime system can only determine the concrete type of a given value. But the same concrete type could be the underlying representation of several different abstract types! Haskell gets away with having Dynamic, because it has newtype instead of real abstract data types.

                                                                                                              The unitype embedded in a multityped language should only be inhabited by dynamic objects (whose methods are queried at runtime), not by ordinary values that we expect to be mathematically well-behaved, such as integers, lists and strings.
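
                                                                                                              A hypothetical sketch of what I mean (the names and the representation are mine, not an existing library): one static type of dynamic objects whose methods are looked up by name at runtime, while integers, strings and lists remain ordinary static values.

```ocaml
(* Hypothetical sketch: a single static type [dynamic], inhabited only
   by objects whose methods are queried at runtime. Lookup can fail,
   hence the [option]. *)
type dynamic = { send : string -> dynamic option }

let window title =
  let rec self =
    { send = function
        | "show" -> print_endline ("showing " ^ title); Some self
        | _ -> None }
  in
  self

let () =
  (* Only object references like [w] inhabit the unitype; an [int]
     or a [string] never does. *)
  let w = window "main" in
  match w.send "show" with
  | Some _ -> ()
  | None -> assert false
```

                                                                                                              Mathematically well-behaved values stay in the multityped world; only the message-passing world is dynamic.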

                                                                                                              1. 2

                                                                                                                I may not fully understand, but that feels doable in both ML and Haskell.

                                                                                                                In ML, when we work dynamically we’ll be forgetting about all of the abstract types. Those have to be recovered behaviorally, by interrogating the underlying object/module/what-have-you.

                                                                                                                In Haskell, we can have something like

                                                                                                                newtype Object = Object (Map String Dynamic)

                                                                                                                possibly with a little more decoration if we’d like to statically distinguish fields and methods. You could consider something like

                                                                                                                newtype Object = Object
                                                                                                                  { fields  :: Map String Dynamic
                                                                                                                  , methods :: Map String (Object -> Dynamic)
                                                                                                                  }

                                                                                                                for a slightly more demanding object representation.

                                                                                                                1. 1

                                                                                                                  In ML, when we work dynamically we’ll be forgetting about all of the abstract types. Those have to be recovered behaviorally through inquiring the underlying object/module/what-have-you.

                                                                                                                  This is precisely what I don’t want. Why do I need to give up my existing abstract data types to work dynamically? I have worked very hard to prove that certain bad things cannot happen when you use my abstract data types. These theorems suddenly no longer hold if the universe of dynamic values encompasses the internal representations of these abstract data types.

                                                                                                                  Instead, what I’m saying is: Have a separate universe of dynamic objects, and only then embed it into the typed language as a single type. If you don’t mind the syntactic inconvenience (I do!), this can already be done to some extent, in an ugly way, in Standard ML (but not Haskell or OCaml), using the fact that exception declarations are generative. But this is a hack, and exceptions were never meant to be used this way.

                                                                                                            2. 1

                                                                                                              C# 4.0 introduced the dynamic type, and it pretty much works that way.

                                                                                                              1. 2

                                                                                                                The whole C# language is dynamic. It has downcasts, the as operator, runtime reflection for arbitrary types, etc.

                                                                                                                What I want is a language that has both a static part, with ML-like “guaranteed unbreakable” abstractions (so no reflection for them!), and a dynamic part, which is as flexible and reconfigurable as CLOS.

                                                                                                                1. 1

                                                                                                                  Having a function or operator that consumes an “obj” doesn’t make an entire language “dynamic”; by this definition literally every language is “dynamic”. I don’t love C#, but it’s important to keep the discussion honest.

                                                                                                                  If you’re looking for one that does dynamics “as good as CLOS” while also doing static types you won’t ever be happy. It’s like saying I want something that is completely rigid, but also extremely flexible. If you have access to both extremes in one environment, your types will become a dead weight and your flexibility will be useless. If you permit a compromise then you can get what you want, but rightly you don’t want a compromise.

                                                                                                                  If you start with the solution instead of describing your actual needs, you’re not going to find what you desire. What I found is that I value being able to express a wide variety of constructs statically, and with a degree of flexibility in my consumption of types. We can get close to how that might feel in F# through sum types, bidirectional type inference, operator overloading, and statically resolved type parameters. This may or may not fit your needs, but it made me happy.

                                                                                                                  1. 1

                                                                                                                    Having a function or operator that consumes an “obj” doesn’t make an entire language “dynamic”; by this definition literally every language is “dynamic”.

                                                                                                                    I can guarantee you, Standard ML is not.

                                                                                                                    If you’re looking for one that does dynamics “as good as CLOS” while also doing static types you won’t ever be happy. It’s like saying I want something that is completely rigid, but also extremely flexible.

                                                                                                                    This is a mischaracterization of what I said. Re-read the technical specifics: the dynamic universe should start completely separate from the static one, and only then should we inject the former into the latter as a single type. In other words, the dynamic universe does not need to include literally every value in the static universe. It only needs to contain object references. A value whose static type is dynamic could be a file, a window, a game entity, but not an integer, a string or a list - not even a list of dynamic objects!

                                                                                                                    If you start with the solution instead of describing your actual needs you’re not going to find what you desire.

                                                                                                                    I think I stated my needs in a pretty clear way. I want (0) static stuff to be static, (1) dynamic stuff to be dynamic, (2) static stuff not to compromise the flexibility of dynamic stuff, (3) dynamic stuff not to compromise the proven guarantees of static stuff. The way C# injects literally everything into the dynamic universe compromises the safety of static abstractions. Sadly, this is not possible to work around, except by hiding the entire .NET object system, which defeats the point of targeting .NET as a platform. Java suffers from similar issues.

                                                                                                          2. 2

                                                                                                            I am just a programmer, and know (very) little of type theory. I’m sorry, but I didn’t really understand your point(s?). Could you explain it in a way that doesn’t assume I understand the perspective of both parties (or just ELI5 / ELI-only-know-JS)?

                                                                                                            1. 7

                                                                                                              In the first conversation, the type theorist begins by being proud that he came up with a sound static type system for an object-oriented language. (Basically, OCaml, whose object system is remarkable in that it doesn’t need any runtime safety checks.) Presumably, before this, all other object-oriented languages were statically unsafe, requiring either dynamic checks to restore safety, or simply leaving open the possibility of memory corruption.

                                                                                                              The programmer’s reply is the same point that Kiczales et al. make in the introduction of The Art of the Metaobject Protocol. A completely fixed object system (e.g., those of OCaml, Eiffel and C++) cannot possibly satisfy the needs of every problem domain. So he uses object systems that allow some reconfiguration by the programmer (e.g., CLOS), to adapt the object system to the problem domain, rather than the other way around. Hence, by design, these object systems offer few static guarantees beyond memory safety.

                                                                                                              In the second conversation, the programmer talks about the benefits of encapsulating state in an object-oriented setting. The type theorist retorts that many object-oriented languages don’t really do any sort of encapsulation, because you can just up and inspect the internal state of any object anytime. (By contrast, in a modular language, like ML, you really can’t inspect the internal representation of an abstract data type. The type checker stops you if you try.) The programmer acknowledges that this is indeed how object-oriented languages work (at least the ones he uses), but then says that encapsulation is still a property of the systems he designs, even if it is not a property of the languages he uses to implement these systems.
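
                                                                                                              The ML side of that contrast can be sketched in a few lines (module and names invented for illustration): outside the module, the representation simply is not visible, and the type checker enforces it.

```ocaml
(* Sketch: [Counter.t] is abstract. Clients cannot observe or forge
   its representation, so its invariants cannot be broken. *)
module Counter : sig
  type t
  val zero : t
  val incr : t -> t
  val value : t -> int
end = struct
  type t = int
  let zero = 0
  let incr n = n + 1
  let value n = n
end

let () =
  let c = Counter.(incr (incr zero)) in
  print_int (Counter.value c)   (* prints 2 *)
  (* [c + 1] would not type-check: [t] is not exposed as [int]. *)
```
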

                                                                                                              The point of the second conversation (along with other points) is better made in this essay by Richard Gabriel about incommensurability in scientific and engineering research.

                                                                                                              1. 1

                                                                                                                Got it, thanks!

                                                                                                        2. 3

                                                                                                          How does OCaml’s structural record system compare with row types e.g. in PureScript or Elm or Ur? (Note: structural subtyping is distinct from row types.) Does OCaml avoid the soundness issues of subtyping, e.g. writing an object of two fields into a mutable cell and then reading back an object of one field (subtype), thus accidentally losing a field (but which may or may not be printed if we print the object)? This is something that row types avoid.

                                                                                                          1. 3

                                                                                                            What you describe doesn’t seem like a soundness issue. It seems like normal upcasting and virtual dispatch. Btw OCaml doesn’t have structural records, it has structural typing of objects (i.e. OOP) using a row type variable to indicate structural subtypes of an object type. E.g. here’s a general description of entity subtypes:

                                                                                                            type 'a entity_subtype = < id : string; .. > as 'a
                                                                                                            

                                                                                                            And here’s a specific entity subtype:

                                                                                                            type person = < id : string; name : string > entity_subtype
                                                                                                            

                                                                                                            The .. above is a row variable: https://v1.realworldocaml.org/v1/en/html/objects.html#idm181614624240
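
                                                                                                            A small usage sketch (the function and object here are mine, purely illustrative): thanks to the row variable, a function can accept any object that has at least an id method, with no coercion needed.

```ocaml
(* The [..] row variable lets [describe] accept any structural
   subtype of [< id : string >]. *)
let describe (e : < id : string; .. >) = "entity " ^ e#id

let () =
  let person = object
    method id = "p1"
    method name = "Ada"
  end in
  print_endline (describe person)   (* prints "entity p1" *)
```
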

                                                                                                            1. 3

                                                                                                              Thanks. My best effort to parse your post yields the following observations:

                                                                                                              type 'a entity_subtype = < id : string; .. > as 'a
                                                                                                              

                                                                                                              is OCaml’s way to write the equivalent of

                                                                                                              type EntitySubtype r = { id :: String | r }
                                                                                                              

                                                                                                              in PureScript. The r is a type variable referring to an unspecified record. Then

                                                                                                              type Person = EntitySubtype { name :: String }
                                                                                                              

                                                                                                              which, by replacing the r with { name :: String }, yields { id :: String, name :: String }.

                                                                                                              So, based on the documentation, OCaml has normal row types for its record system. It’s sometimes hard to parse OCaml discussion due to the “OO” syncretism.

                                                                                                              1. 2

                                                                                                                Right! PureScript record row types, and merging them together, has a really nice syntax. OCaml’s is comparatively more primitive and explicit. It causes a little frustration when modelling hierarchies of JavaScript types in the BuckleScript/ReasonML community. We usually recommend using modules and module types as much as possible though. They’re also super powerful, and target JavaScript classes quite well by using some BuckleScript binding constructs.

                                                                                                        1. 4

                                                                                                          The Python example immediately stands out to me as having two more alternatives:

                                                                                                          • crash (unironically, just crash, because you shouldn’t be dividing by zero)
                                                                                                          • use a typed language and specify division as only being allowed for non-zero divisors

                                                                                                          (Someone told me monads were Fourier transforms but for function execution and that somehow made them click for me. I like the article.)

                                                                                                          1. 7

                                                                                                            I think that I grok monads and Fourier transforms, but don’t see a connection. Could you elaborate?

                                                                                                            1. 3

                                                                                                              @tomjakubowski my mind jumped to convolutions after seeing Fourier transforms, and a convolution is an operation on two functions to produce a third function which serves as a translation from one method to another. I think the declaration of a convolution would map to a functor.

                                                                                                              I’m reading Wikipedia and my brain is melting from the definition of the convolution theorem, but it basically says you can transform convolution in one domain into multiplication in another domain. I guess that’s akin to binding (or using some operand to restructure a monolithic calculation into a pipeline), which is a property of monads.

                                                                                                              1. 6

                                                                                                                Theorems are about what logically follows from definitions, not about what you or anyone else “can” do. In particular, the convolution theorem asserts that, given two absolutely integrable functions f and g, the Fourier transform of their convolution ℱ(f ⋆ g) equals the ordinary product of the Fourier transforms of the original functions ℱ(f) ℱ(g). Provided we accept the usual definitions of “Fourier transform” and “convolution”, the convolution theorem is simply true. It would still be true even if nobody had been smart enough to prove it.


                                                                                                                Convolutions are not “translations”. The easiest way to think of convolutions is as follows. Suppose you have two independent random variables X and Y that could be either 0, 1, 2, 3, with some given probabilities. Then X + Y could be any of 0, 1, 2, 3, 4, 5, 6. But, what is the probability that X + Y equals, say, 4? We need to take into account every way in which X + Y = 4 could happen, namely:

                                                                                                                • X = 1 and Y = 3
                                                                                                                • X = 2 and Y = 2
                                                                                                                • X = 3 and Y = 1

                                                                                                                What are their respective probabilities? Since X and Y are independent,

                                                                                                                • Pr[X = 1 and Y = 3] = Pr[X = 1] Pr[Y = 3]
                                                                                                                • Pr[X = 2 and Y = 2] = Pr[X = 2] Pr[Y = 2]
                                                                                                                • Pr[X = 3 and Y = 1] = Pr[X = 3] Pr[Y = 1]

                                                                                                                Since the three cases are mutually exclusive,

                                                                                                                • Pr[X + Y = 4] = Pr[X = 1] Pr[Y = 3] + Pr[X = 2] Pr[Y = 2] + Pr[X = 3] Pr[Y = 1]

                                                                                                                One convenient way to write this is as follows:

                                                                                                                • Pr[X + Y = 4] = sum_{i + j = 4} Pr[X = i] Pr[Y = j]

                                                                                                                We had a nice finite sum, because our space of possibilities is finite. When X and Y may take arbitrary real numbers as values, the sum becomes an integral.
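
                                                                                                                The finite-sum case is easy to sketch directly in code. A minimal Python version, with made-up probabilities for X and Y taking values 0–3:

                                                                                                                ```python
                                                                                                                # Hypothetical example distributions: px[i] = Pr[X = i], py[j] = Pr[Y = j].
                                                                                                                px = [0.1, 0.2, 0.3, 0.4]
                                                                                                                py = [0.4, 0.3, 0.2, 0.1]

                                                                                                                def convolve(p, q):
                                                                                                                    """Distribution of X + Y: r[k] = sum over i + j = k of p[i] * q[j]."""
                                                                                                                    r = [0.0] * (len(p) + len(q) - 1)
                                                                                                                    for i, pi in enumerate(p):
                                                                                                                        for j, qj in enumerate(q):
                                                                                                                            r[i + j] += pi * qj
                                                                                                                    return r

                                                                                                                pxy = convolve(px, py)
                                                                                                                # pxy[4] is exactly the three-term sum worked out above:
                                                                                                                # Pr[X = 1] Pr[Y = 3] + Pr[X = 2] Pr[Y = 2] + Pr[X = 3] Pr[Y = 1]
                                                                                                                ```

                                                                                                                With these particular numbers, pxy[4] comes out to 0.2 * 0.1 + 0.3 * 0.2 + 0.4 * 0.3, and the whole output still sums to 1 because px and py each do.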


                                                                                                                Finally, none of this has any relationship to monads or monadic bind whatsoever.

                                                                                                              2. 2

                                                                                                                Honestly I’m sorry I ever brought it up. I too understand both, and I cannot remember why that off-handed explanation made sense to me; I’ve been trying to remember all day.

                                                                                                                1. 1

                                                                                                                  There is no need to be sorry. This just means that you have to be careful with other people’s claims. If you do not understand why something has to be true, it is healthy to doubt it, at least a little. The stronger a claim is, the more you should doubt it. And the claim “This complicated thing becomes super simple if you just use this trick or analogy!” is super duper strong.

                                                                                                                  1. 1

                                                                                                                    It’s not like that at all. It’s not a stupid trick (type theorists hate her!), it was an off-the-cuff comment that triggered my synapses to wire up correctly and turned my knowledge to understanding.

                                                                                                                    1. 1

                                                                                                                      Ah, okay.

                                                                                                              3. 3

                                                                                                                @WilhelmVonWeiner thanks for replying! Those are great points you raised, and I talked a little bit about them in my blog post, but to respond to you directly:

                                                                                                                1. I would agree with you that crashing is the appropriate solution for most OLTP workloads (esp. those where the runtime is accessible by the developer). Worst case scenario, you lose one record, but you get a crisp error handled by the runtime. But for OLAP workloads it may be a bit different. In my last job I was handling 10^8 / 10^9 records for an OLAP job, and the tool was a monolith. Crashing may mean halting job execution or losing the context of runtime state. You can’t crash on one “0” input if you have many records to process. I think crashing would also be inappropriate because the execution path differs based on the current job context (sometimes a 0 is okay, sometimes not). It was also an on-premise tool, so high-latency customer feedback loops were the only way I could resolve bugs, as opposed to SSHing into a server.

                                                                                                                2. Compile-time guarantees like static typing are really nice, and I’m not terribly familiar with them at the moment (I’m working through a Haskell book so I should have more knowledge by the end of the year), but I don’t see how that translates to correctness guarantees at runtime. What if you wanted to break out a monolith into services and each portion communicated to another through a protocol (e.g. stdin/stdout/stderr)?

                                                                                                                Yeah, you can think of monads as a convolution! And thank you for the like :)

                                                                                                                1. 4

                                                                                                                  Regarding 2.: The general idea is that when you parse data from the outside world (like stdin) you don’t just validate whether the input is correct; you also transform the data into your type-safe data model. That’s how the compile-time guarantees also hold at runtime. You would normally use some kind of parser for this, like aeson for JSON or ReadP for general text processing.
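
                                                                                                                  The thread is Haskell-focused, but the “parse at the boundary into a typed model” idea can be sketched in any language. A minimal Python version, where the `User` model and its fields are made up for illustration:

                                                                                                                  ```python
                                                                                                                  import json
                                                                                                                  from dataclasses import dataclass

                                                                                                                  @dataclass(frozen=True)
                                                                                                                  class User:
                                                                                                                      name: str
                                                                                                                      age: int

                                                                                                                  def parse_user(raw: str) -> User:
                                                                                                                      """Parse untrusted input into the typed model, rejecting bad data at the boundary."""
                                                                                                                      obj = json.loads(raw)
                                                                                                                      name, age = obj["name"], obj["age"]
                                                                                                                      if not isinstance(name, str) or not isinstance(age, int) or age < 0:
                                                                                                                          raise ValueError("not a valid user")
                                                                                                                      # From here on, every consumer of User can rely on these invariants:
                                                                                                                      # the guarantee established once at parse time holds for the rest of the run.
                                                                                                                      return User(name=name, age=age)
                                                                                                                  ```

                                                                                                                  The point is that the check happens exactly once, at the edge; everything downstream works with `User`, not with raw JSON.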

                                                                                                                  1. 1

                                                                                                                    Got it. I’m guessing since Haskell makes it easy to define types or typeclasses, there shouldn’t really be a situation where an external type could not map to a Haskell type, or if there is, there’s a catch-all bytes type acting as a container of bytes?

                                                                                                                    IIRC I encountered some weird issues with various binary encodings where I found it difficult to map to any data type in Python without being forced to apply an encoding (and then having to strip out that encoding during export). For example, reading PostgreSQL bytea into a binary column in another database using a custom process, or capturing and rendering Java log4j stdin within a parent Python process. If Haskell was flexible enough to design parsers for arbitrary bytestreams from various protocols that would be a major selling point to data engineers.

                                                                                                                    1. 2

                                                                                                                      Sure, you can leave parts of your data in a more general data structure. Of course you can’t interact with these parts as easily, but if you just want to pass them through that’s ok. E.g. in aeson there is the Value type which is just a sum type over all possible JSON elements. This is still perfectly safe, because if you want to work with these values you always have to consider every possible case of the sum type. And of course there is stuff like ByteArray.

                                                                                                              1. 7

                                                                                                                The definition of what a monad is in your post is wrong. A monad is a type constructor, e.g., List, in a language where List is not a type, but List[int] is, together with three functions

                                                                                                                map : (A -> B) -> (List[A] -> List[B])
                                                                                                                pure : A -> List[A]
                                                                                                                join : List[List[A]] -> List[A]
                                                                                                                

                                                                                                                such that

                                                                                                                1. Given

                                                                                                                  • f : A -> B
                                                                                                                  • g : B -> C
                                                                                                                  • xs : List[A]

                                                                                                                  the following equality holds:

                                                                                                                  • map (g . f) xs = map g (map f xs)

                                                                                                                  where g . f denotes function composition:

                                                                                                                  • (g . f) x = g (f x)
                                                                                                                2. Given

                                                                                                                  • xs : List[A]

                                                                                                                  the following equalities hold:

                                                                                                                  • xs = join (pure xs)
                                                                                                                  • xs = join (map pure xs)
                                                                                                                3. Given

                                                                                                                  • xs : List[List[List[A]]]

                                                                                                                  the following equality holds:

                                                                                                                  • join (join xs) = join (map join xs)

                                                                                                                Of course, List is just an example. (Exercise: Figure out what pure and join should be for the List monad.) To implement another monad, say, Option, you have to replace all occurrences of List with Option in the type signatures of map, pure and join. That is,

                                                                                                                map : (A -> B) -> (Option[A] -> Option[B])
                                                                                                                pure : A -> Option[A]
                                                                                                                join : Option[Option[A]] -> Option[A]
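
                                                                                                                For concreteness, here is a Python sketch of map, pure, and join for the List monad, together with spot-checks of equalities 1–3 above (fair warning: this also gives away the exercise):

                                                                                                                ```python
                                                                                                                def lmap(f, xs):
                                                                                                                    # map : (A -> B) -> (List[A] -> List[B])
                                                                                                                    return [f(x) for x in xs]

                                                                                                                def pure(x):
                                                                                                                    # pure : A -> List[A]
                                                                                                                    return [x]

                                                                                                                def join(xss):
                                                                                                                    # join : List[List[A]] -> List[A]
                                                                                                                    return [x for xs in xss for x in xs]

                                                                                                                # Spot-check the laws on sample values.
                                                                                                                f = lambda x: x + 1
                                                                                                                g = lambda x: 2 * x
                                                                                                                xs = [1, 2, 3]
                                                                                                                assert lmap(lambda x: g(f(x)), xs) == lmap(g, lmap(f, xs))  # law 1
                                                                                                                assert join(pure(xs)) == xs and join(lmap(pure, xs)) == xs  # law 2
                                                                                                                xsss = [[[1], [2]], [], [[3, 4]]]
                                                                                                                assert join(join(xsss)) == join(lmap(join, xsss))           # law 3
                                                                                                                ```

                                                                                                                These are of course only checks on particular inputs, not proofs that the laws hold for all inputs.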
                                                                                                                
                                                                                                                1. 4

                                                                                                                  Does it count as ‘boring Haskell’ if it starts by enabling 39 language extensions?

                                                                                                                  1. 3

                                                                                                                    Did you look at the list? It’s almost entirely syntax sugar extensions.

                                                                                                                    1. 1

                                                                                                                      GADTs, ExistentialQuantification, MultiParamTypeClasses, PolyKinds, RankNTypes, ScopedTypeVariables, TypeFamilies, and DataKinds are hardly syntactic sugar.

                                                                                                                      I think enabling stuff like ViewPatterns is fine. That is syntactic sugar. RankNTypes is not.

                                                                                                                    2. 1

                                                                                                                      They start with 39 extensions in their proposed standard library, by my hand count.

                                                                                                                      “Boring” seems like a carefully-chosen word to distract from the obvious commercialization being attempted by the author.

                                                                                                                      1. 4

                                                                                                                        Distract? We’re trying to get paid to move more companies into using Haskell. That’s my (I’m Chris) job and the whole thesis of the article is that there is a highly productive subset of Haskell that is extremely useful in software projects operating under commercial constraints.

                                                                                                                        The first line is:

                                                                                                                        Goal: how to get Haskell into your organization, and how to make your organization more productive and profitable with better engineering.

                                                                                                                        Your reply isn’t constructive and casts aspersions by claiming the explicit point of the article is somehow an act of subterfuge. We just want people to start using better tools. For us programmers at FP Complete the reasons for that are selfish but straightforward: we’re programmers who want to use the tools we like because they make our work less tedious.

                                                                                                                        I want to get paid to write software in nice programming languages. I want to create more jobs where people get to get paid to write code using nice programming languages.

                                                                                                                        1. 1

                                                                                                                          The ‘highly productive subset of Haskell that is extremely useful in software projects operating under commercial constraints’ does not involve starting with 39 language extensions and a huge pile of extremely complex type system nonsense that results in awful error messages. If people want to use a language with cryptic type errors and high performance they should use C++.

                                                                                                                          Haskell’s highly productive subset is Haskell with at most about 3 language extensions, all of which are totally optional, along with a set of well-built libraries. I’d avoid typeclasses entirely, and even if you don’t go that far, certainly I’d avoid GADTs, lenses of any kind, and anything to do with the word ‘monad’. No MTL or anything of that nature.

                                                                                                                        2. 3

                                                                                                                          I’m confused, is commercialization a good thing here or a bad thing?

                                                                                                                          1. 1

                                                                                                                            39 extensions in their proposed standard library

                                                                                                                            That’s what I meant, whoops.

                                                                                                                            1. 3

                                                                                                                              A lot of language extensions are innocuous and useful. It’s not our fault those extensions haven’t been folded into the main language. What’s your point?

                                                                                                                              1. 1

                                                                                                                                The extensions haven’t been folded into the main language because they’re completely unnecessary. You do not need GADTs to write Haskell. Advanced type system trickery is exactly the sort of unnecessary crap that a ‘Boring Haskell’ movement should be not using.

                                                                                                                                1. 1

                                                                                                                                  Many of those extensions are actually less well-thought-out than Haskellers think.

                                                                                                                                  It is widely acknowledged that, as a general rule, orphan instances are bad. One problem with them is that they allow third-party modules to introduce distinctions between types that were originally meant to be treated as isomorphic. (At least assuming you are a well-meaning Haskeller. If you wanted to allow third parties to subvert the meaning of your code, you would use Lisp or Smalltalk, not Haskell.)

                                                                                                                                  Then GADTs and type families are dangerous for exactly the same reason.

                                                                                                                          1. 4

                                                                                                                            I wouldn’t be surprised if the boring Haskell subset turned out to be a language pretty close to SML, but pure.

                                                                                                                            1. 6

                                                                                                                              and, sadly, lazy.

                                                                                                                              1. 7

                                                                                                                                I get upset when people talk about laziness strictly as a bad thing, while for me it’s almost purely a good thing. In my 3 years of full-time commercial Haskell development, I’ve used laziness countless times to simplify my code greatly, and never have I been bitten by it performance-wise. I have spent zero seconds debugging strictness issues.

                                                                                                                                1. 7

                                                                                                                                  Fair enough, I was unnecessarily dismissive. I personally find laziness hard to reason about, and I think it requires heroic efforts from the compiler to get good performance. OCaml gets you pretty fast programs with very few optimizations in comparison. But indeed, there’s a lot of elegance in lazy functions.

                                                                                                                                  1. 5

                                                                                                                                    I do sometimes talk about laziness as a bad thing but I have a specific argument in mind when I do this: eager languages are more expressive in that they can express lazy structures whereas lazy languages cannot express eager structures. There is a fundamental duality between data versus co-data in computer science: the former is about inductively building bigger structures out of smaller ones whereas the latter is about (co-inductively) building smaller things out of (infinitely) big things.

                                                                                                                                    In Haskell, one works with co-data and pretends that it is data. For instance: a natural number by its very definition cannot be infinite: any inhabitant of the type ℕ must be a natural number you can write down; it should be a finite thing made up from gradually smaller things. Consider the following definition

                                                                                                                                    ∞ : ℕ
                                                                                                                                    ∞ = ∞ + 1
                                                                                                                                    

                                                                                                                                    In both Haskell and SML you can write something like this but an attempt to evaluate it should get the program immediately stuck in an infinite loop. In Haskell, however, this is not the case because it is treated as co-data due to laziness. So ℕ really behaves like co-ℕ, the dual of ℕ. The situation is the same for lists: you can never write the type of lists in Haskell, only co-lists (i.e., streams) (of course here, I am hand-waving over the extremely important question of what one means by equality; by a suitable definition of equality you can look at them as the same thing).

                                                                                                                                    There exists a fundamentally co-inductive type in almost every language: the function type. It is built into the language and its inhabitants are inherently co-data rather than data. Using function types you can express many other co-data: for instance the type of lists over some type A in Haskell can be represented by a type like ℕ → A + Unit in SML; the idea is that this is like a list you can force into existence as you observe it which is the meaning Haskell assigns to ordinary lists.
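
                                                                                                                                    That last encoding can be sketched in Python, where a co-list over A is a function from an index to an element-or-None, with None playing the role of Unit (all names here are made up for illustration):

                                                                                                                                    ```python
                                                                                                                                    def from_list(xs):
                                                                                                                                        # Embed an ordinary finite list as a co-list: ℕ -> A + Unit.
                                                                                                                                        return lambda n: xs[n] if n < len(xs) else None

                                                                                                                                    def naturals():
                                                                                                                                        # An "infinite list" is perfectly representable: it exists only
                                                                                                                                        # through the observations you force out of it.
                                                                                                                                        return lambda n: n

                                                                                                                                    def take(n, colist):
                                                                                                                                        # Observe at most the first n elements; nothing past them is computed.
                                                                                                                                        out = []
                                                                                                                                        for i in range(n):
                                                                                                                                            x = colist(i)
                                                                                                                                            if x is None:
                                                                                                                                                break
                                                                                                                                            out.append(x)
                                                                                                                                        return out
                                                                                                                                    ```

                                                                                                                                    take(5, naturals()) forces only five observations out of an “infinite” structure, which is roughly the behavior Haskell gives ordinary lists for free via laziness.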

                                                                                                                                    Agda for instance allows the user to express their precise expectations from a type as it supports co-inductive types so in that sense it is superior to both. It also supports co-pattern matching: the dual of pattern matching, an extremely convenient tool for dealing with co-data. If we are to choose between SML and Haskell, though, there is a sense in which SML is superior.

                                                                                                                                    Check this article out if you are interested in reading more: https://existentialtype.wordpress.com/2011/04/24/the-real-point-of-laziness/.

                                                                                                                                    1. 3

                                                                                                                                      What are the strictness issues that are hard to debug? All cases I can think of (forgetting that rectypes isn’t the default, or forgetting to thunk something) blow up fast, usually at compile time, and are fairly straightforward. What am I missing?

                                                                                                                                  2. 3

                                                                                                                                    In addition to purity, the most obvious differences that come to mind are:

                                                                                                                                    • laziness (I like it, like enobayram)
                                                                                                                                    • typeclasses vs. modules
                                                                                                                                    • higher-kinded types (although sounds like modules allow for an approximation?)

                                                                                                                                    I’d maybe also toss in generics as a great feature that doesn’t get as much press, although I think you need DeriveGeneric to really leverage the full power there.

                                                                                                                                    Taken as a whole, my impression is that these lead to a pretty different programming style compared to ML-descended languages, but I don’t have any experience with ML-family languages so can’t really say.

                                                                                                                                    1. 15

                                                                                                                                      Having used both ML and Haskell, I can guarantee that the programming styles they encourage are very different.

                                                                                                                                      ML encourages you to use types to describe states. If you have a long computation, you define a data type for each intermediate state, and a function for each computation step. To ensure that steps may only be chained in ways that make sense, you use abstract data types. The hallmark of good ML programming is designing abstract data types in such a way that third parties may not create invalid values.

                                                                                                                                      Haskell encourages you to use types to describe computations. If you identify a pattern that several computations follow, you formalize the pattern as a higher-order function parameterized by the varying parts, which may be either explicit function arguments or implicit type class dictionaries. To ensure that your patterns are reasonably universal, you demand that they satisfy laws that can be stated as equations. The hallmark of good Haskell programming is identifying the most generally useful patterns that computations follow.
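
                                                                                                                                      The ML style described above can be sketched even outside ML. In this Python sketch (the pipeline, its states, and its steps are all made up for illustration), each intermediate state gets its own type, so steps can only be chained in an order that makes sense:

                                                                                                                                      ```python
                                                                                                                                      from dataclasses import dataclass

                                                                                                                                      # One type per intermediate state of the computation.
                                                                                                                                      @dataclass(frozen=True)
                                                                                                                                      class Raw:
                                                                                                                                          text: str

                                                                                                                                      @dataclass(frozen=True)
                                                                                                                                      class Validated:
                                                                                                                                          text: str

                                                                                                                                      @dataclass(frozen=True)
                                                                                                                                      class Stored:
                                                                                                                                          record_id: int

                                                                                                                                      # One function per step; the types dictate the only sensible chaining order:
                                                                                                                                      # Raw -> Validated -> Stored.
                                                                                                                                      def validate(r: Raw) -> Validated:
                                                                                                                                          if not r.text:
                                                                                                                                              raise ValueError("empty input")
                                                                                                                                          return Validated(r.text)

                                                                                                                                      def store(v: Validated) -> Stored:
                                                                                                                                          return Stored(record_id=hash(v.text) % 1000)

                                                                                                                                      result = store(validate(Raw("hello")))
                                                                                                                                      ```

                                                                                                                                      Python won’t statically reject store(Raw("hello")) the way ML would, and it lacks ML’s abstract data types (nothing stops a third party from constructing a Validated directly), but the shape of the discipline is the same.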

                                                                                                                                  1. 1

                                                                                                                                    Dynamically scoped variables are uniscoped. You have a global mutable variable of type Stack<Foo>, and always use the Foo at the top of the stack. (You need to remember to pop the stack at the right times. But this can be automated with a little bit of macro support.)

                                                                                                                                    Hence dynamic scoping is just a special case of static (lexical) scoping.
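
                                                                                                                                    A minimal Python sketch of that encoding, with a context manager standing in for the macro support mentioned above (the Foo here is just a string, and all names are made up):

                                                                                                                                    ```python
                                                                                                                                    from contextlib import contextmanager

                                                                                                                                    # The global mutable stack: the "current" Foo is whatever is on top.
                                                                                                                                    _foo_stack = ["default"]

                                                                                                                                    def current_foo():
                                                                                                                                        return _foo_stack[-1]

                                                                                                                                    @contextmanager
                                                                                                                                    def with_foo(value):
                                                                                                                                        # Push on entry, pop on exit -- the automated bookkeeping that
                                                                                                                                        # a little macro support would otherwise give you.
                                                                                                                                        _foo_stack.append(value)
                                                                                                                                        try:
                                                                                                                                            yield
                                                                                                                                        finally:
                                                                                                                                            _foo_stack.pop()

                                                                                                                                    def report():
                                                                                                                                        # Sees the dynamically innermost binding, not a lexical one.
                                                                                                                                        return current_foo()
                                                                                                                                    ```

                                                                                                                                    Inside with_foo("inner"), report() sees "inner"; once the block exits, the binding is popped and report() sees "default" again, exactly the dynamic-scoping discipline built out of ordinary lexically scoped pieces.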

                                                                                                                                    1. 2

                                                                                                                                      And you can implement static scoping using a macro for function definition (which basically does the same thing a closure would) in a dynamically scoped language!

                                                                                                                                      I believe it’s the other way around though. Consider: static typing can be macro-rewritten to dynamic typing, but dynamic typing must use runtime assertions to emulate static typing. In the same way, dynamic scoping can be macro-rewritten to static scoping, but static scoping must use runtime stuff (e.g. a global mutable stack) to emulate dynamic scoping.

                                                                                                                                      1. 2

                                                                                                                                        And you can implement static scoping using a macro for function definition (which basically does the same thing a closure would) in a dynamically scoped language!

                                                                                                                                        Can you? Even ignoring the fact that functions are not the only things to which variable scopes are tied, you would need (0) a gensym [this much is okay], (1) to pray nobody subverts what you are doing, say, by reverse-engineering the gensym and predicting what names it is going to use [this is the scary part].

                                                                                                                                        Moreover, even this much isn’t perfect. There is no general procedure to statically determine whether a dynamically scoped variable is guaranteed to be set in the current lexical environment (the classical argument: write a program that finds a counterexample to the Collatz conjecture, and only then calls a function that uses a variable that hasn’t been set), so an AOT compiler cannot fail with a “variable not in scope” error.

                                                                                                                                        In the same way, dynamic scoping can be macro-rewritten to static scoping, but static scoping must use runtime stuff (e.g. a global mutable stack) to emulate dynamic scoping.

                                                                                                                                        Dynamic scoping itself is “runtime stuff”, so no new “runtime stuff” is used to implement it in terms of static scoping.

                                                                                                                                        I stand by my claim that dynamic scoping is a special case of static scoping. (Just to be clear: This is a purely descriptive claim. I’m not saying “dynamic scoping is bad”. Emacs, a program I use and will continue to use everyday, is constructive proof that dynamic scoping is super useful.)

                                                                                                                                        1. 2

                                                                                                                                          I think we’re on about different things: I’m saying you can use a dynamically-scoped language as if it were statically-scoped, as in, you can write a fresh program in a dynamically scoped language and pretend it’s statically scoped with the help of macros; you’re saying you can’t mechanically rewrite a program written in a dynamically-scoped way into a statically-scoped language (which is true). I just didn’t write what I was trying to say well enough.

                                                                                                                                          The gensym thing isn’t much of a problem: you can transform every un-gensym’d identifier in a way that won’t ever clash with a gensym’d identifier, in the same way that a language might mangle its identifiers to be C-compatible.
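A tiny sketch of that scheme (Python for illustration; `gensym` and `mangle` are invented names): generated names get a reserved character that mangled user identifiers can never reproduce, so clashes are impossible by construction rather than by luck.

```python
import itertools

_counter = itertools.count()

def gensym(hint="g"):
    # Generated names start with a single '%', which is not a legal
    # identifier character in most languages.
    return f"%{hint}{next(_counter)}"

def mangle(user_name):
    # Escape any '%' in user-supplied names, so no mangled name can
    # collide with (or spoof) a gensym'd one -- the same idea as
    # mangling identifiers to be C-compatible.
    return user_name.replace("%", "%%")
```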

                                                                                                                                          Dynamic scoping itself is “runtime stuff”

                                                                                                                                          I’ll concede that, and you are correct overall, which is nice given the symmetry between static/dynamic scoping and typing (which was your point in the first place). As it happens I’m currently a little caught up in making dynamic scope pretend it’s static, so I’m happy to be shown wrong.

                                                                                                                                          1. 2

                                                                                                                                            I think we’re on about different things: I’m saying you can use a dynamically-scoped language as if it were statically-scoped, as in, you can write a fresh program in a dynamically scoped language and pretend it’s statically scoped with the help of macros

                                                                                                                                            Ah, yes, there we agree, of course.

                                                                                                                                            The gensym thing isn’t much of a problem: you can transform every un-gensym’d identifier in a way that won’t ever clash with a gensym’d identifier, in the same way that a language might mangle its identifiers to be C-compatible.

                                                                                                                                            The problem is malicious users who will purposefully clash with the gensym’d identifiers. (With a carefully designed gensym, it seems unlikely that an identifier clash will happen by accident.) Otherwise, there isn’t much to worry about.

                                                                                                                                    1. 2

                                                                                                                                      I don’t know much about Nim, but I’m pretty tired of the implication that scripting languages can’t be compiled. Nearly every well-known scripting language has a compiler; stop perpetuating 1990s stereotypes.

                                                                                                                                      1. 5

                                                                                                                                        It’s funny how you say “1990s stereotypes” when PHP didn’t have a compiler until 2010.

                                                                                                                                        1. 7

                                                                                                                                          While a fair criticism, 2010 was nearly a decade ago. There are certainly many of us who still think of the nineties as merely a decade or so ago.

                                                                                                                                          1. 3

One’s perception doesn’t matter: 2010 is 9 years ago (9 and a half if we count from February 2010, the release date of HipHop for PHP), and invoking “1990s stereotypes” to support an incorrect and overblown statement is not a valid move.

                                                                                                                                            1. 2

Perceptions definitely matter when it comes to language and communication, because you must communicate in a way that meets the listener’s expectations. A speaker who addresses the listener’s perceptions will be far more effective. “90s stereotypes”, for example, may not have been meant to be taken so literally. Now, as a listener you can say, “It didn’t work for me, I took it literally”, and that’s a fair criticism; however, you’re not the only listener, and not every listener will receive it the way you did.

                                                                                                                                          2. 4

                                                                                                                                            Ruby was a tree-walking interpreter until 2009 (based on this page).

                                                                                                                                            Python is said to be compiled but has, and will continue to have, a “Global Interpreter Lock”.

                                                                                                                                            @technomancy is getting way too overwrought over the imprecise but useful phrase, “compiled language”. Wikipedia: “The term is somewhat vague.” There is a useful separation here. Until we come up with a better name for it, let’s be kind to usage of “compiled language”.

                                                                                                                                            1. 10

                                                                                                                                              Python is said to be compiled but has, and will continue to have, a “Global Interpreter Lock”.

                                                                                                                                              These things are not in opposition to each other. OCaml has a native code compiler, and its runtime system has a GIL (a misnomer: a better name is “global runtime lock”).

                                                                                                                                              1. 1

                                                                                                                                                Interesting!

                                                                                                                                              2. 4

                                                                                                                                                imprecise but useful phrase

                                                                                                                                                What do you find this phrase to be useful for?

                                                                                                                                                You can see in the comments below that the author’s intent was not to describe programs that compile to machine language, but actually to describe programs that can be distributed as single-file executables. (which is also true of Racket, Lua, Forth, and many other languages)

                                                                                                                                                So it seems that even by declaring “compiled” to mean “compiled to machine language” we haven’t actually achieved clear communication.

                                                                                                                                                1. 2

I’m actually rather interested in how you get single-file executables with Lua. Is there a tool that makes it easy, or is it something out of the way?

                                                                                                                                                  EDIT: I know how to get single file executables with Love2d, and I vaguely recall that it can be done with Lua outside of Love, but it’s certainly not an automated/well-known thing.

                                                                                                                                                  1. 4

                                                                                                                                                    Many people simply embed Lua in their C programs (that was the original use case it was designed for) but if you’re not doing that you can streamline it using LuaStatic: https://github.com/ers35/luastatic/

                                                                                                                                                  2. 2

                                                                                                                                                    You’re pointing out a second way in which it is imprecise. I’m pointing out that for me – and for a large number of people who don’t know all the things you do – it was useful since it perfectly communicated what the author meant.

                                                                                                                                                    1. 3

                                                                                                                                                      Oh, interesting, so you mean to say that when you read “compiled language” you took it to mean “produces a single-file executable”?

                                                                                                                                                      This is honestly surprising to me because most of the time I see “compiled language” being misused, it’s put in contrast to “interpreted language”, but in fact many interpreted languages have this property of being able to create executables.

                                                                                                                                                      It’s just a tangled mess of confusion.

                                                                                                                                                      1. 5

                                                                                                                                                        Indeed it is. I didn’t mean that I understood “produces a single-file executable.” I meant that he’s pointing towards a divide between two classes of languages, and I understood the rough outline of what languages he was including in both classes.

                                                                                                                                                        Edit: I can’t define “compiled language” and now I see it has nothing to do with compilation. But I know a “compiled language” when I see it. Most of the time :)

                                                                                                                                                        1. 3

                                                                                                                                                          Perhaps a good way to put it is “degree of runtime support that it requires”. Clearly, normal usage of both Nim and Python requires a runtime system to do things for you (e.g. garbage collection). But Nim’s runtime system does less for you than Python’s runtime system does, and gets away with it mostly because Nim can do many of those things at compile time.

                                                                                                                                                          1. 4

Even if a language retrofitted a compiler 20 years ago, it’s hard to move away from the initial programming UX of an interpreted language. Compilation time is a priority, and the experience is designed to keep people from being aware there’s a compiler. With a focus on UX, I think all your examples in this thread have a clear categorization: Perl, Python, Ruby, Racket, Lua, Node and Forth are all interpreted.

                                                                                                                                                            1. 4

                                                                                                                                                              I would frame it differently; I’d say if a language implementation has tooling that makes you manually compile your source into a binary as a separate step before you can even run it, that’s simply bad usability.

In the 90s, you had to choose between “I can write efficient code in this language” (basically C, Pascal, or maaaaybe Java) and “this language has good-to-decent usability” (nearly everything else), but these days I would like to think that dichotomy is dated and misguided. Modern compiled languages like Rust and Go are clearly far from being “interpreted languages”, but they provide you with a single command to compile and run the code in one fell swoop.

                                                                                                                                                              1. 4

                                                                                                                                                                I’m super sympathetic to this framing. Recent languages are really showing me how ill-posed the divide is between categories like static and dynamic, or compiled and interpreted.

                                                                                                                                                                But it feels a bit off-topic to this thread. When reading what somebody else writes my focus tends to be on understanding what they are trying to say. And this thread dominating the page is a huge distraction IMO.

                                                                                                                                                                I also quibble with your repeated invocation of “the 90s”. This is a recent advance, like in the 2010s. So I think even your distraction is distractingly phrased :)

                                                                                                                                                2. 3

Are you not barking up the wrong tree? I don’t see a line where the author implies anything close to it.

                                                                                                                                                  1. 0

I was referring to the title of the post; as if “scripting ease in a compiled language” is not something provided by basically every scripting language in existence already.

                                                                                                                                                    1. 3

Specifically, most scripting languages make it nontrivial to package an executable and move it around the filesystem without a shebang line. On Linux this isn’t a huge issue, but it’s convenient not to have to deal with it on Windows.

                                                                                                                                                      1. 1

                                                                                                                                                        OK, but that has next to nothing to do with whether there’s a compiler or not. I think what you’re talking about is actually “emits native code” so you should like … say what you mean, instead of the other stuff.

                                                                                                                                                        1. 1

Fair enough of a point, I suppose. Many people use “compiled” vs “interpreted” to imply runtime properties rather than parsing/compilation properties, even though that isn’t exactly the proper definition.

                                                                                                                                                          I’ll try to be more precise in the future, but I would like a term for “emits native code” that is less of a mouthful.

                                                                                                                                                  2. 3

                                                                                                                                                    Scripting languages can be compiled, but, out of Python, Ruby, Tcl, Perl and Bash, most of them are by default written in such a way that they require code to follow a certain file structure, and if you write a program that is bigger than a single file, you end up having to lug the files around. I know that Tcl has star deploys, and I think that’s what Python wheels are for. Lua code can be wrapped into a single executable, but it’s something that isn’t baked into the standard Lua toolset.

                                                                                                                                                    1. 3

I think it would be helpful to a lot of people if you could give examples for Node, Ruby, Python, … Maybe you’re just referring to the wrong usage of the word compiler?

                                                                                                                                                      EDIT: typo

                                                                                                                                                      1. 1

For Python there is Nuitka, as far as I know.

                                                                                                                                                        1. -1

Node, Ruby, and Python are all typically compiled. With Ruby and Node it first goes to bytecode, and then usually the hotspots are further JIT-compiled to machine code as an optimization pass; I don’t know as much about Python, but it definitely supports compiling to bytecode in the reference implementation.

                                                                                                                                                          1. 7

                                                                                                                                                            When talking about “compiled languages”, people typically mean “AOT compiled to machine code”, producing a stand-alone binary file. Python’s official implementation, CPython, interprets bytecode and PyPy has a JIT compiler. V8 (the JS engine in Node) compiles JavaScript to a bytecode and then both interprets and JIT compiles that. Ruby has a similar story. The special thing about Nim is that it has the same ease of use as a “scripting language” but has the benefits of being AOT compiled with a small runtime.

                                                                                                                                                            1. 1

                                                                                                                                                              The special thing about Nim is that it has the same ease of use as a “scripting language” but has the benefits of being AOT compiled with a small runtime.

                                                                                                                                                              The idea that only scripting languages care about ease of use is just plain outdated.

                                                                                                                                                              In the 1990s it used to be that you could get away with having bad usability if you made up for it with speed, but that is simply not true any more; the bar has been raised across the board for everyone.

                                                                                                                                                        2. 3

Some languages like Dart [1] have first-class support for both interpreting and compiling. I don’t think it’s fair to say that “some random person has a GitHub repo that does this” counts as the same thing.

                                                                                                                                                          1. https://github.com/dart-lang/sdk
                                                                                                                                                          1. 0

                                                                                                                                                            That’s the whole point; Dart has a compiler, and any language that doesn’t is very unlikely to be taken seriously.

                                                                                                                                                            1. 1

                                                                                                                                                              The point is:

                                                                                                                                                              Nearly every well-known scripting language has a compiler

That may be true, but nearly every well-known scripting language doesn’t have an official compiler

                                                                                                                                                              1. -2

                                                                                                                                                                Also false; Ruby’s official implementation has had a compiler (YARV) since the 1.9 days; Node.js has used the v8 JIT compiler since the beginning, (not to mention TraceMonkey and its descendants) and python has been compiling to .pyc files for longer than I’ve been a programmer.

                                                                                                                                                                According to this, Lua has had a compiler since the very beginning: https://www.lua.org/history.html I don’t know much about Perl, but this page claims that “Perl has always had a compiler”: https://metacpan.org/pod/perlcompile

                                                                                                                                                                The only exception I can think of is BASIC, and that’s just because it’s too old of a language for any of its numerous compilers to qualify as official. (edit: though I think Microsoft QuickBasic had a compiler in the 1980s or early 90s)

                                                                                                                                                                1. 6

QuickBasic compiled to native code; QBasic was an interpreter; GWBasic compiled to a tokenized form that just made interpretation easier (keywords like IF were replaced with short binary codes)

                                                                                                                                                                  1. -1

                                                                                                                                                                    YARV is an interpreter, not a compiler…

                                                                                                                                                                    https://en.wikipedia.org/wiki/YARV

and .pyc is not an executable file. Do you really not know the difference between machine code and bytecode?

                                                                                                                                                                    https://en.wikipedia.org/wiki/Bytecode

                                                                                                                                                                    1. 2

Do you really not know the difference between machine code and bytecode?

                                                                                                                                                                      No.

                                                                                                                                                                      You seem to be under the mistaken assumption that something can’t be a compiler unless the compilation output is machine code I guess?

                                                                                                                                                                      YARV contains both a compiler (from ruby source -> bytecode) and an interpreter for the bytecode that it compiles, just like Python does for .pyc files.
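For what it's worth, that compiler/interpreter split is easy to observe in CPython itself with a short snippet: `compile()` is the compiler half (source text to a bytecode-carrying code object, the same form cached in `.pyc` files) and `exec()` is the interpreter half.

```python
import dis

# Compiler half: source text -> code object containing bytecode.
code = compile("x = 1 + 2", "<demo>", "exec")
print(type(code).__name__)    # 'code'
print(len(code.co_code) > 0)  # True: real bytecode bytes

# What the interpreter half will execute, disassembled.
dis.dis(code)

# Interpreter half: run the compiled bytecode.
ns = {}
exec(code, ns)
print(ns["x"])  # 3
```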

                                                                                                                                                          1. 39

                                                                                                                                                            An interesting parallel, which works for the author, but doesn’t hold universally. Go and Ruby are fundamentally different.

                                                                                                                                                            Unlike Rust where the compiler constantly shouts “fuck you” even though you are trying to do your best to serve their majesty and the rules they dictate, Ruby never gets in your way.

This sentence is bad in many ways. First of all: it is toxic. Maybe error messages weren’t as useful when the author tried Rust as they are nowadays, but I doubt that. Failure to understand WHY things fail should be a priority for any developer. And having things actually fail loudly as early as possible should be considered a huge benefit. And then saying “Ruby never gets in your way” is plainly wrong. I would let “never gets in my way” slide, of course.

                                                                                                                                                            1. 23

                                                                                                                                                              “Ruby never gets in your way”

                                                                                                                                                              Ruby gets in the way, but later on. Whether it’s the rare exception that must be tracked down or the large refactor that demands an understanding of the whole system, Ruby is no less in the way than Rust or Java; it gets in the way of getting things done, but in a different manner and at different times.

                                                                                                                                                              1. 8

                                                                                                                                                                Ruby gets in the way, but later on.

                                                                                                                                                                Very much this. I’m sure it’s possible in other languages but I’ve never seen people deliberately create ungreppable methods in any other idiom, such as

                                                                                                                                                                %i(foo bar baz).each do |sym|
                                                                                                                                                                  define_method("prefix_#{sym}_suffix") do
                                                                                                                                                                    # method body
                                                                                                                                                                  end
                                                                                                                                                                end
                                                                                                                                                                
                                                                                                                                                                1. 7

                                                                                                                                                                  C++ preprocessor/template dark magic can get horrible quickly. It’s not just that people actively subvert auto-complete or grep. It’s also that these people never document these types of things, and they often do it when there’s an easier, much less bad way to accomplish what they’re trying to do.

                                                                                                                                                                  1. 3

                                                                                                                                                                    I generally make strong recommendations to the teams I’m on to keep things greppable, so this is more about best practices than the language itself being significantly flawed. (I understand the argument that foot guns shouldn’t be there to begin with.) define_method can be useful if there is a large number of things you want to iterate over and make methods for. Having said that, I basically never use define_method, myself.

                                                                                                                                                                    1. 4

                                                                                                                                                                      Common Lisp - but the debugger system will let you look the method up, and likely the method arises from a macro call-site which you can find from the method invocation etc.

                                                                                                                                                                      1. 2

                                                                                                                                                                        I would say the difference is that Common Lisp developers typically try to avoid doing crazy macro stuff if they can avoid it. I suspect most Ruby devs are the same, but there seems to be a vocal minority who love using metaprogramming as much as possible.

                                                                                                                                                                        1. 4

                                                                                                                                                                          There’s a similar minority in the CL community. Metaprogramming pushes happy buttons for a bunch of people.

                                                                                                                                                                          That doesn’t make it any more supportable, mind you.

                                                                                                                                                                          1. 2

                                                                                                                                                                            Back in the day it seemed mostly relegated to high end dev shops. I heard multiple stories about these folks creating metaprogramming monstrosities. Sure it worked, was BDD-tested, etc, but unless you were already familiar with it, the code was pretty hard to touch.

                                                                                                                                                                            1. 2

                                                                                                                                                                              I’m not sure I buy the implicit argument here that all code must be maximally accessible.

                                                                                                                                                                              Metaprogramming is just build-time code execution that is able to impact code at runtime. It is a skill you can learn like any other.

                                                                                                                                                                              1. 3

                                                                                                                                                                                Sure, but it requires additional thought to read, and makes things much less discoverable by tools (such as grep/ack/ag/etc, as mentioned earlier).

                                                                                                                                                                                Accessibility is a virtue when the code is going to be read by people other than the original author.

                                                                                                                                                                                1. 2

                                                                                                                                                                                  I’m not a big fan of it in general, but metaprogramming done well makes code easier to read and understand.

                                                                                                                                                                                  For example, autowrap is used to create Common Lisp bindings to C and C++ libraries. All of the work is done with a “c-include” macro, which does exactly what it sounds like, and is much easier to read and understand than bindings written by hand. There’s a real life example in the GDAL bindings I’m making. A single macro call generates a few thousand lines of CFFI glue code and definitions.

                                                                                                                                                                                  Poor discoverability might depend on the implementation. Bindings from autowrap are tab completable in the REPL and everything.

                                                                                                                                                                        2. 2

                                                                                                                                                                          Dynamic programming? It’s possible in Python too. Useful for generating unit test methods.
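                                                                                                                                                                          The same pattern is idiomatic in Ruby with `define_method`. A hedged sketch (class and case names hypothetical, no test framework assumed):

```ruby
# Sketch: generate one test method per case at class-definition time.
class SquareTests
  CASES = { zero: [0, 0], two: [2, 4], neg: [-3, 9] }

  CASES.each do |name, (input, expected)|
    # Defines test_square_zero, test_square_two, test_square_neg.
    define_method("test_square_#{name}") do
      input * input == expected
    end
  end
end

SquareTests.new.test_square_two  # => true
```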

                                                                                                                                                                          1. 2

                                                                                                                                                                            Dynamic programming is something quite different. You’re looking for the term “metaprogramming”.

                                                                                                                                                                            1. 1

                                                                                                                                                                              thanks for catching

                                                                                                                                                                          2. 2

                                                                                                                                                                            Oh it’s perfectly doable with PHP’s __call :) https://www.php.net/manual/en/language.oop5.overloading.php#object.call
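                                                                                                                                                                            Ruby’s analogue of PHP’s `__call` is `method_missing`, which intercepts calls to undefined methods and is equally ungreppable. A minimal sketch (class name and prefix hypothetical):

```ruby
# Sketch: calls to undefined methods starting with "prefix_"
# are intercepted at runtime, so the method names never appear
# anywhere in the source.
class Proxy
  def method_missing(name, *args)
    return super unless name.to_s.start_with?("prefix_")
    "handled #{name}"
  end

  # Keep respond_to? consistent with the methods we fake.
  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("prefix_") || super
  end
end

Proxy.new.prefix_foo_suffix  # => "handled prefix_foo_suffix"
```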

                                                                                                                                                                            1. 2

                                                                                                                                                                              I’ve switched opinions on this so many times I’ve lost count.

                                                                                                                                                                              However, I remain convinced that:

                                                                                                                                                                              • metaprogramming is an extremely sharp tool, which can end up hurting you later
                                                                                                                                                                              • sharp tools aren’t inherently bad

                                                                                                                                                                              Programming will always have ways to shoot yourself in the foot. The best we can do is make those ways explicit, contained, and able to be reasoned about in a sane way.

                                                                                                                                                                              1. 2

                                                                                                                                                                                I think we can do better. We can eliminate footguns where it makes sense. I have some examples, though not everyone will agree they are footguns. Python omitted the switch statement. Go made switch cases not fall through by default. Go also doesn’t have ?: syntax. Further, Go only allows booleans in conditionals, so you can’t do something like:

                                                                                                                                                                                n1 := 10
                                                                                                                                                                                if n1 {
                                                                                                                                                                                   println("something")
                                                                                                                                                                                }
                                                                                                                                                                                

                                                                                                                                                                                At first this was annoying, but it makes sense. By only allowing this:

                                                                                                                                                                                n1 := 10
                                                                                                                                                                                if n1 != 0 {
                                                                                                                                                                                   println("something")
                                                                                                                                                                                }
                                                                                                                                                                                

                                                                                                                                                                                You don’t have to think about or answer the question “what fails the condition?” Ruby treats only nil and false as false; JavaScript treats anything falsy as false. You just avoid that problem altogether at the cost of a little terseness.
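                                                                                                                                                                                The Ruby rule is easy to demonstrate concretely:

```ruby
# Sketch: in Ruby only nil and false are falsy; 0 and "" are
# truthy (unlike JavaScript, where both would fail a condition).
results = [0, "", nil, false].map { |v| v ? :truthy : :falsy }
results  # => [:truthy, :truthy, :falsy, :falsy]
```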

                                                                                                                                                                          3. 9

                                                                                                                                                                            I also find it weird because as a n00b Rust developer, I find the compiler absolutely lovely. It feels like my pal, cheering me on and giving helpful suggestions. (I come from a Java and Javascript world, where compiler messages can be pretty… terse.)

                                                                                                                                                                            Yeah, Rust can be difficult to write, but the compiler’s error messages sure make it pleasant.

                                                                                                                                                                            1. 8

                                                                                                                                                                              “This A -> B looks incorrect, did you mean A -> C?” So you change it, and then: “This A -> C looks incorrect, did you mean A -> B?”

                                                                                                                                                                              Not debating the point you’re trying to make, but the author sounds like many people who were only starting with Rust. I can 100% feel the sentiment that’s made in the article, and I’m a huge Rust fan and would love to use it more, but right now I’m at a stage where giving up in frustration sometimes happens. And this has nothing to do with when it was. You call it toxic; I can wholeheartedly agree with “been there, seen that”.

                                                                                                                                                                              1. 3

                                                                                                                                                                                This “try {A, B, C} forever” type of errors happens very often with Rust noobs (to clarify: that’s a learning curve, not a dig at someone being a novice) who try to do something that is impossible to do, or impossible to prove to the borrow checker. Unfortunately, the compiler is incapable of understanding that it’s not a problem with a particular line of code, but the whole approach. For example, use of references as if they were pointers digs a deep hole the compiler won’t get you out of (it’ll try to annotate lifetimes instead of telling you to stop using references and use owned types), e.g. don’t write a linked list, use VecDeque.

                                                                                                                                                                                The “toxic” bit was about the rather harsh framing of the issue. Although, I don’t blame anyone for not liking Rust’s strictness. It has its uses, just like Go’s and Ruby’s more lax approaches.

                                                                                                                                                                                1. 2

                                                                                                                                                                                  Congratulations on drafting an even more smug answer to the thread.

                                                                                                                                                                                  I think you’re missing the point and you also don’t need to explain the exact issue at hand, I was just citing an example of a clearly not impossible thing (the one where I encountered this the last time was simply about getting some form of text into a function, and of course it was a String vs slice problem but it wasn’t obvious at the time).

                                                                                                                                                                                  And yes, I do think that I prefer a screen full of old Clojure’s unhelpful stack trace than rustc telling me something completely wrong while trying to be helpful. At least then I’m not led on the wrong path because I usually trust my compiler.

                                                                                                                                                                                  1. 4

                                                                                                                                                                                    I don’t see how @kornel was being smug here. He’s saying that if a beginner tries something that the borrow checker considers impossible, it will cycle between several incorrect errors instead of raising the actual issue. Is it the word “noobs”?

                                                                                                                                                                                2. 3

                                                                                                                                                                                  but right now I’m at a stage where giving up in frustration sometimes happens

                                                                                                                                                                                  It’s interesting that so many people are running into this. I learned the rules of the borrow checker, knew moves from C++ already, and that was pretty much it. Sure, I sometimes run into problems with the borrow checker (less than when I started writing Rust), but multiple immutable xor single mutable is easy to appease. Lifetime issues can be more difficult, but typically reasoning about the lifetimes solves them pretty quickly.

                                                                                                                                                                                  I wonder if it has to do with different problem domains that people are trying to tackle (I mostly work on machine learning and natural language processing, have not done much web backend programming with Rust), or the languages that they come from – I used C++ for many years prior, so perhaps I had to reason about ownership anyway (compared to e.g. someone who comes from GC’ed languages)?

                                                                                                                                                                                  I am not saying that Rust’s learning curve is not steep or that I am some genius that understands Rust better (most definitely not). But more that it would be interesting to investigate which problem domains or programming language backgrounds make it easier/harder to pick up Rust.

                                                                                                                                                                                  1. 5

                                                                                                                                                                                    I think the experience of Rust coming from C++ is vastly different than coming from Ruby or Python. Those latter languages shield the programmer from a number of things that need to be thought about with a systems-level language. Confronting those things for the first time is fraught.

                                                                                                                                                                                    1. 2

                                                                                                                                                                                      Users of GC’ed languages sometimes have to reason about ownership and lifetimes too. However, they are not punished anywhere as badly as C++ users for failing to do it correctly. They just trap the error, log it, do their best to recover from it, and move on.

                                                                                                                                                                                      It seems disheartening to explicitly write code whose sole purpose is to recover from errors that you will inevitably make, though.

                                                                                                                                                                                      1. 2

                                                                                                                                                                                        Indeed, it’s the unforgiving strictness of Rust ownership that gets people by surprise, even if they know it at a conceptual level.

                                                                                                                                                                                        The second problem is that with GC design patterns can reference any data from anywhere, e.g. it’s natural to keep mutual parent-child relationships between objects. The borrow checker wants data to be structured as a tree (or DAG), and it comes as a shock that more complex relationships need to be managed explicitly (with refcounting, arenas, etc.)

                                                                                                                                                                                      2. 2

                                                                                                                                                                                        No idea. But I only started doing C++ after Rust and I haven’t had any of these problems with move semantics. But I wouldn’t claim I’d never created a bug because of it :P Rust was in learning in my spare time, for C++ I had a whole team to review my code and help me out.

                                                                                                                                                                                        Also, I’m not saying I ran into this constantly, just that it happened several times. And I think my code tends to attract these kinds of problems when it’s not “5% preparing data into a proper data model, then 95% working with it” but working right at that surface… like when I wrote my IRC bot - it’s 90% string handling and a little bit of networking.

                                                                                                                                                                                    2. 4

                                                                                                                                                                                      I’ve seen somebody saying something along the lines of “C is harder than Go because Go’s compiler will scream at me if I have an unused import and C’s doesn’t”. I can’t say I understand this mentality.

                                                                                                                                                                                      1. -2

                                                                                                                                                                                        Unlike Rust where the compiler constantly shouts “fuck you” even though you are trying to do your best to serve their majesty and the rules they dictate, Ruby never gets in your way.

                                                                                                                                                                                        I wonder if comments like these come from the kind of people who were throwing a hissy fit as a kid when their teacher told them to do their homework.

                                                                                                                                                                                        1. 16

                                                                                                                                                                                          While the original article is needlessly hostile, so is this response. For good or for bad, programming is a gratification-driven activity. It is not hard to see why more people find it gratifying to write a program that runs, rather than a program that type checks.

                                                                                                                                                                                          Besides, on a purely intuitive level, type errors are not necessarily the easiest way to understand why a program is flawed. Errors, just like anything else, are best understood with examples. Watching your program fail on a concrete input provides that example.

                                                                                                                                                                                          (Admittedly, fishing for counterexamples in a huge state space is not an approach that scales very well, but what use is scaling if your target audience does not want to try the alternative you suggest even in the tiniest cases?)

                                                                                                                                                                                          To illustrate the power of modeling concurrency with types, one has to give concrete examples of idiomatic programs written in non-typeful concurrent languages that contain subtle bugs that would have been caught by Rust’s type checker.
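                                                                                                                                                                                          A sketch of such a program in Ruby (hypothetical counter; under MRI’s GVL the race is hard to trigger in practice, but the pattern is still unsynchronized, and Rust’s type checker rejects the equivalent sharing without a Mutex or atomics):

```ruby
# Sketch: an idiomatic-looking shared counter with a data race.
# counter += 1 is a read-modify-write, not an atomic operation,
# so concurrent increments can be lost.
counter = 0
threads = 4.times.map do
  Thread.new { 10_000.times { counter += 1 } }
end
threads.each(&:join)
counter  # at most 40_000; may be less if increments interleave
```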

                                                                                                                                                                                      1. 2

                                                                                                                                                                                        The game was fun. However, if this is how concurrent programmers have to think to earn their living, then it must be truly maddening.

                                                                                                                                                                                        It is regrettable that the authors of the game conflate concurrency with parallelism.

                                                                                                                                                                                        1. 1

                                                                                                                                                                                          Why this jab against concurrent programming? The world is concurrent. Threads only virtualize CPUs.

                                                                                                                                                                                          Concurrency without parallelism is easier because the interleaving is more coarse. Compare the “Expand” function in the game.

                                                                                                                                                                                          1. 1

                                                                                                                                                                                            It was not meant as a jab at all. What I meant to say is just that I greatly respect those who have the courage to write concurrent programs, and the brainpower to get them right. I have neither.

                                                                                                                                                                                            I don’t think the lack of parallelism makes concurrency any easier, at least not if you care about any invariants that you might want to establish on your own.

                                                                                                                                                                                        1. 15

Oooh, dimensional analysis is a really deeply interesting problem. There are lots of surprising challenges here. Just a few:

• Two incompatible quantities can have the same units. Each vector component of torque has units kg·(m/s)², which are also the units of energy. But you can’t meaningfully add the x component of a torque vector to the kinetic energy!
• Two compatible quantities can have different units. This can happen if you need to convert between different systems. In SI, capacitance is measured in farads, which have units A²·s⁴/(kg·m²). In CGS, though, capacitance is measured in… cm.
• Two dimensionless quantities can be incompatible. Can you add a molar ratio to radians?
• There’s a difference between point and interval units. Kelvin and “degrees Celsius” are identical interval units but different point units.
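One way to make the first bullet concrete in code: track the *kind* of quantity, not just its unit signature, by giving each kind its own type. A minimal Python sketch; the class names are hypothetical, not from any particular units library.

```python
# Sketch (hypothetical types): torque and energy both reduce to kg*m^2/s^2,
# but as distinct types they cannot be mixed, even accidentally.
from dataclasses import dataclass

@dataclass(frozen=True)
class Energy:
    joules: float
    def __add__(self, other: "Energy") -> "Energy":
        return Energy(self.joules + other.joules)

@dataclass(frozen=True)
class Torque:
    newton_meters: float
    def __add__(self, other: "Torque") -> "Torque":
        return Torque(self.newton_meters + other.newton_meters)

e = Energy(10.0) + Energy(5.0)   # fine: same kind of quantity
t = Torque(2.0) + Torque(3.0)    # fine
# Energy(1.0) + Torque(1.0)      # rejected: same units, different kinds
```

A static type checker flags the commented-out line; at runtime it fails too, since `Torque` has no `joules` field.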

                                                                                                                                                                                          Some more resources on this topic:

                                                                                                                                                                                          • Frink is a programming language designed around doing unit conversions.
                                                                                                                                                                                          • Before he died, Bill Kent drafted out a book on representing measurements. Even the first draft is extremely good.
                                                                                                                                                                                          • The Frink unit file has tons of technical and historical information on units. Like how we actually have two different versions of “frequency” and Why Lumens Are Stupid.
                                                                                                                                                                                          1. 3

                                                                                                                                                                                            Some thoughts

• I’m always dissatisfied with dimensionless quantities. That’s one of the bigger frustrations in the use of unit systems in scientific programming.
                                                                                                                                                                                            • Conversion between systems is handled at least some of the time by standardization and conversion on creation of a unit. This seems reasonable for an important subset of users.
                                                                                                                                                                                            • The difference between temperature types is a big deal. “Dimensional quantities” end up taking the form of a kind of vector space, but then their differences are a different sort of algebraic object (a torsor) which supports a different set of operations. This, honestly, should also be the case with energies.

The compatibility of torque and energy is a good example there. Despite them having the same unit signature, they’re not compatible for algebraic reasons (energy is a torsor again). This is partly alleviated by noting that torque could also be considered a joule-per-radian, and realizing it’s dissatisfying, again, for dimensionless units to participate only as scalars.
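The torsor idea can be sketched directly: points and differences get distinct types, so the operations that make sense (point − point → difference, point + difference → point) exist, and point + point simply isn’t provided. All names below are hypothetical.

```python
# Sketch of a torsor-style temperature API: absolute temperatures (points)
# and temperature differences (intervals) are different types.
from dataclasses import dataclass

@dataclass(frozen=True)
class TempDelta:                 # an interval, in kelvins (== Celsius degrees)
    kelvins: float
    def __add__(self, other: "TempDelta") -> "TempDelta":
        return TempDelta(self.kelvins + other.kelvins)

@dataclass(frozen=True)
class Temp:                      # a point on the absolute scale
    kelvins: float
    def __sub__(self, other: "Temp") -> TempDelta:
        return TempDelta(self.kelvins - other.kelvins)
    def shifted(self, d: TempDelta) -> "Temp":
        return Temp(self.kelvins + d.kelvins)

boiling = Temp(373.15)
freezing = Temp(273.15)
span = boiling - freezing                   # a TempDelta of 100 K
warm = freezing.shifted(TempDelta(20.0))    # a Temp of 293.15 K
# boiling + freezing                        # deliberately undefined
```

The same shape would apply to energy if one took the “energy is a torsor” view seriously: only energy *differences* would be addable to an energy point.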

                                                                                                                                                                                            1. 2

                                                                                                                                                                                              Two incompatible quantities can have the same units.

                                                                                                                                                                                              Two compatible quantities can have different units.

                                                                                                                                                                                              This suggests that you want dimensional analysis, rather than “units of measure” analysis.

                                                                                                                                                                                              When I took physics in college, it always annoyed me that people said things like “voltage”, rather than “electric potential”, or “amperage”, rather than “current”, etc. It suggests insufficiently abstract thinking.

                                                                                                                                                                                              Can you add a molar ratio to radians?

                                                                                                                                                                                              Dimensions are a matter of what distinctions you choose to make. To make radians not dimensionless, you need to accept that “lengths of straight line segments” and “lengths of circle arcs” are fundamentally different things.

If your physical problems are such that the geometric constructions in them can be carried out the Ancient Greek way, the distinction between “constructible straight line distance” and “constructible angle” will be as clear as daylight.
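The “dimensions are a matter of what distinctions you choose to make” point can be sketched with exponent vectors: add an explicit angle axis (a deliberate modeling choice, not standard SI) and radians stop being dimensionless, so radians + molar ratio is rejected. A hypothetical sketch, not any real library’s API.

```python
# Dimensional analysis as exponent vectors over (m, kg, s, rad, mol).
# Giving angle its own axis makes radians non-dimensionless.
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    value: float
    dims: tuple  # exponents for (m, kg, s, rad, mol)

    def __add__(self, other: "Quantity") -> "Quantity":
        if self.dims != other.dims:
            raise TypeError(f"dimension mismatch: {self.dims} vs {other.dims}")
        return Quantity(self.value + other.value, self.dims)

    def __mul__(self, other: "Quantity") -> "Quantity":
        return Quantity(self.value * other.value,
                        tuple(a + b for a, b in zip(self.dims, other.dims)))

RAD   = Quantity(1.0, (0, 0, 0, 1, 0))   # angle is its own dimension here
RATIO = Quantity(1.0, (0, 0, 0, 0, 0))   # a molar ratio: truly dimensionless

angle = Quantity(2.0, (0, 0, 0, 1, 0))
try:
    angle + RATIO        # rejected: radians are no longer dimensionless
except TypeError:
    pass
```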

                                                                                                                                                                                            1. 18

                                                                                                                                                                                              I believe that the “single entry, single exit” coding style has not been helpful for a very long time, because other factors in how we design programs have changed.

                                                                                                                                                                                              Unless it happens that you still draw flowcharts, anyway. The first exhortation for “single entry, single exit” I can find is Dijkstra’s notes on structured programming, where he makes it clear that the reason for requiring that is so that the flowchart description of a subroutine has a single entry point and a single exit point, so its effect on the whole program state can be understood by understanding that subroutine’s single collection of preconditions and postconditions. Page 19 of the PDF linked above:

These flowcharts share the property that they have a single entry at the top and a single exit at the bottom: as indicated by the dotted block they can again be interpreted (by disregarding what is inside the dotted lines) as a single action in a sequential computation. To be a little bit more precise: we are dealing with a great number of possible computations, primarily decomposed into the same time-succession of subactions, and it is only on closer inspection – i.e., by looking inside the dotted block – that it is revealed that over the collection of possible computations such a subaction may take one of an enumerated set of distinguished forms.

                                                                                                                                                                                              These days we do not try to understand the behaviour of a whole program by composing the behaviour of each expression. We break our program down into independent modules, objects and functions so that we only need to understand the “inside” of the one we’re working on and the “outside” of the rest of them (the “dotted lines” from Dijkstra’s discussion), and we have type systems, tests and contracts to support understanding and automated verification of the preconditions and postconditions of the parts of our code.

                                                                                                                                                                                              In other words, we’re getting the benefits Dijkstra wanted through other routes, and not getting them from having a single entry point and a single exit point.

                                                                                                                                                                                              1. 7

                                                                                                                                                                                                An interesting variant history is described in https://softwareengineering.stackexchange.com/a/118793/4025

The description there is that instead of “single exit” talking about where the function exits from, it actually talks about a function only having a single point it returns to, namely the place where it was called from. This makes a lot more sense, and is clearly a good practice. I’ve heard this description from other places too, but unfortunately I don’t have any better references.

                                                                                                                                                                                                1. 3

Yes, very interesting. It makes sense that “single exit” means “don’t jump to surprising places when you’re done” rather than “don’t leave the subroutine from arbitrary places in its flow”. From the Structured Programming perspective, both support the main goal: you can treat the subroutine as a box that behaves in a single, well-defined way, and a program as a sequence of such boxes.

                                                                                                                                                                                                2. 4

                                                                                                                                                                                                  The first exhortation for “single entry, single exit” I can find is Dijkstra’s notes on structured programming

                                                                                                                                                                                                  Also Tom Duff’s “Reading Code From Top to Bottom” says this:

                                                                                                                                                                                                  During the Structured Programming Era, programmers were often taught a style very much like the old version. The language they were trained in was normally Pascal, which only allowed a single return from a procedure, more or less mandating that the return value appear in a variable. Furthermore, teachers, influenced by Bohm and Jacopini (Flow Diagrams, Turing Machines and Languages with only two formation rules, Comm. ACM 9#5, 1966, pp 366-371), often advocated reifying control structure into Boolean variables as a way of assuring that programs had reducible flowgraphs.
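The style Duff describes – a single return, the result kept in a variable, control reified into a Boolean – can be mimicked for contrast. An illustrative sketch, not code from the article:

```python
# Structured-Programming-era style: one exit, result in a variable,
# control flow reified into a Boolean flag (as the quote describes).
def find_index_structured(xs, target):
    found = False
    result = -1
    i = 0
    while i < len(xs) and not found:
        if xs[i] == target:
            result = i
            found = True
        i += 1
    return result  # the single exit

# The same function with early returns: the flag and the result
# variable both disappear.
def find_index(xs, target):
    for i, x in enumerate(xs):
        if x == target:
            return i
    return -1
```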

                                                                                                                                                                                                  1. 1

                                                                                                                                                                                                    These days we do not try to understand the behaviour of a whole program by composing the behaviour of each expression. We break our program down into independent modules, objects and functions so that we only need to understand the “inside” of the one we’re working on and the “outside” of the rest of them (the “dotted lines” from Dijkstra’s discussion), and we have type systems, tests and contracts to support understanding and automated verification of the preconditions and postconditions of the parts of our code.

                                                                                                                                                                                                    Maybe I’m missing something, but it seems to me that Dijkstra’s methodology supports analyzing one program component at a time, treating all others as black boxes, so long as:

                                                                                                                                                                                                    • Program components are hierarchically organized, with lower-level components never calling into higher-level ones.
                                                                                                                                                                                                    • Relevant properties of lower-level components (not necessarily the whole analysis) are available when analyzing higher-level components.

                                                                                                                                                                                                    In particular, although Dijkstra’s predicate transformer semantics only supports sequencing, selection and repetition, it can be used to analyze programs with non-recursive subroutines. However, it cannot handle first-class subroutines, because such subroutines are always potentially recursive.
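For reference, the predicate transformer rules in question – the standard weakest-precondition definitions for sequencing and selection (notation roughly follows Dijkstra’s):

```latex
\begin{align*}
wp(\mathtt{skip},\, Q) &= Q \\
wp(x := e,\, Q) &= Q[x := e] \\
wp(S_1;\, S_2,\, Q) &= wp(S_1,\, wp(S_2,\, Q)) \\
wp(\mathbf{if}\ b\ \mathbf{then}\ S_1\ \mathbf{else}\ S_2,\, Q)
  &= (b \Rightarrow wp(S_1,\, Q)) \land (\lnot b \Rightarrow wp(S_2,\, Q))
\end{align*}
```

Repetition is handled via a loop invariant rather than a closed-form rule, and a recursive call would need the very predicate transformer being defined, which is why potentially recursive (hence first-class) subroutines fall outside the method.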

                                                                                                                                                                                                    In other words, we’re getting the benefits Dijkstra wanted through other routes, and not getting them from having a single entry point and a single exit point.

Dijkstra’s ultimate goal was to prove things about programs, which few of us do. So it is not clear to me that we are “getting the benefits he wanted”. That being said, having a single entry point and a single exit point merely happens to be a requirement for using the mathematical tools he preferred. In particular, there is nothing intrinsically wrong with loops that have two or more exit points, but they are awkward to express using ALGOL-style while loops.
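A small sketch of that last point: the direct version of a search loop exits from two places (found vs. exhausted), while squeezing it into a single-exit, ALGOL-style while loop forces an auxiliary Boolean and a post-loop test. Illustrative only.

```python
# Two exit points, written directly: each exit site knows why it left.
def first_negative(xs):
    for x in xs:
        if x < 0:
            return x         # exit 1: found
    return None              # exit 2: exhausted

# The same logic forced through a single-exit while loop: the reason
# for leaving must be reconstructed from a flag after the loop.
def first_negative_single_exit(xs):
    i, found = 0, False
    while i < len(xs) and not found:
        found = xs[i] < 0
        i += 1
    return xs[i - 1] if found else None
```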