Threads for cadit_in_piscinam

  1. 23

    Is a language good because it has many features? My current thesis is that adding features to languages can open up new ways to encode entire classes of bugs, but adding features cannot remove buggy possibilities.

    1. 23

      If you have a foot-gun in your arsenal and you add a new safe-gun, sure, technically that’s just one more way you can shoot yourself in the foot, but that’s missing the point of having a safe-gun.

      Many features can be used as less bug-prone alternatives to old constructs. E.g., a match expression instead of a switch statement, where you could forget the assignment or forget a break and get unintentional fall-through. In the same way, features like unique_ptr in C++ can help reduce bugs compared to using bare pointers.

      1. 12

        Another thing worth mentioning is that PHP has also grown some good linters that keep you away from the unsafe footguns. I believe it’s gotten really good over the years.

        1. 7

          Just to fill this out:

          Psalm

          PHPStan

          EA Inspections Extended

          Sonar

          I actually run all of these. Obviously no linter is perfect and you can still have bugs but if you’re passing all of these with strict types enabled, you’re not writing the bad amateur code that got PHP its reputation from the “bad old days”. PHP’s not perfect but it’s no more ridiculous than, say, JavaScript, which curiously doesn’t suffer from the same street cred problems.

          1. 6

            …JavaScript, which curiously doesn’t suffer from the same street cred problems.

            I see what you’re saying, but JS actually does kinda have serious street cred problems. I mean, there are a ton of people who basically view JS programmers as second-class or less “talented”. And JS as a language is constantly mocked. I think the difference is that JS just happens to be the built-in language for the most widely deployed application delivery mechanism of all time: the web browser.

        2. 1

          It’s not as if match replaced switch; and why did it have default fallthrough to begin with, whilst match doesn’t?

          1. 2

            It’s probably just taken verbatim from C. It’s funny because PHP seems to have taken some things from Perl, which curiously does not have this flaw (it does allow a fallthrough with the next keyword, so you get the best of both worlds).

            1. 1

              Switch has been in PHP since at least version 3.0 which is from the 1990s. Match doesn’t replace switch in the language but it can replace switch in your own code, making it better.

          2. 15

            I disagree. People saying this usually have C++ on their mind, but I’d say C++ is an unusual exception in a class of its own. Every other language I’ve seen evolving has got substantially better over time: Java, C#, PHP, JS, Rust. Apart from Rust, these are old languages that kept adding features for decades and still haven’t jumped the shark.

            PHP has actually completely removed many of its worst footguns like magic quotes or include over HTTP, and established patterns/frameworks that keep people away from the bad parts. They haven’t removed issues like inconsistent naming of functions, because frankly that’s a cosmetic issue that doesn’t get in the way of writing software. It’s very objectionable to people who don’t use PHP. PHP users have higher-priority higher-impact wishes for the language, and PHP keeps addressing these.

            1. 2

              removed many of its worst footguns

              or the infamous mysql API (that was replaced by mysqli)

              edit: Also I like that the OOP vs functional interfaces keep existing. My old code just runs and I get the choice between OOP and functional stuff (and I can switch as I like)

              1. 1

                I liked the original mysql api. It was the easiest to use with proper documentation back then. A footgun is a good analogy. A gun can be used in a perfectly safe manner. Of course if you eyeball the barrel or have no regard for basic safety rules about it being loaded or where it is pointed at any time, then yeah, things are going to go south sooner or later.

                Likewise, the old functional mysql api was perfectly usable and I never felt any worry about being hacked through sql injection. If you are going to pass numbers as string parameters or rely on things like auto-escape, then just like in the gun example, things are not going to end well. But let’s all be honest, at that point it is expected to be hacked.

                1. 1

                  I haven’t been around the PHP community in any serious capacity for probably 17 years now, but “with proper documentation” was a double-edged sword. The main php.net website was a fantastic documentation reference, except for the part where lots of people posted really terrible solutions to problems on the same page as the official documentation. As I grew as a developer, I learned where a lot of the footguns were, but starting out the easy path was to just grab the solution in the comments on the page and use it, with all of the accompanying downfalls.

                  1. 1

                    Already back in the day, it baffled me that the site even had comments, let alone people relying on them. I would never blindly trust anything in the comments.

            2. 8

              There is only one way of modifying a language that works in practice: add new features. As one of my colleagues likes to say, you can’t take piss out of a swimming pool. Once a feature is in a language, you can’t remove it without breaking things. You can, however, follow this sequence:

              1. Add new feature.
              2. Recommend against using old feature.
              3. Refactor your codebase to avoid the old feature.
              4. Add static analysis checks to CI that you aren’t using the old feature.
              5. Provide compiler options to make use of the old features a hard error.

              At this point, the old feature technically exists in the language, but not in your codebase and not in new code. I’ve seen this sequence (1-4, at least) used a lot in C++, where unsafe things from C++98 were gradually refactored into modern C++ (C++11 and later), things like the C++ Core Guidelines were written to recommend against the older idioms, then integrated into static analysers and used in CI, so the old usages gradually fade.

              If you manage to get to step 5, then you can completely ignore the fact that the language still has the old warts.
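
              A minimal sketch of steps 2 and 5, with Python standing in for whichever language you actually use. Both old_escape and new_escape are hypothetical names, not any real API: the old function keeps working but warns, and a warning filter then promotes that warning to a hard error.

```python
import warnings


def new_escape(value: str) -> str:
    """Hypothetical replacement (step 1: add the new feature)."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


def old_escape(value: str) -> str:
    """Hypothetical legacy helper (step 2: still works, but recommends against itself)."""
    warnings.warn(
        "old_escape() is deprecated; use new_escape() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return value.replace("'", "\\'")


def old_feature_is_hard_error() -> bool:
    """Step 5 in miniature: turn the deprecation warning into an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        try:
            old_escape("it's")
        except DeprecationWarning:
            return True
    return False
```

              Running the test suite with that filter switched on is roughly what step 4 looks like in CI; Python's -W error::DeprecationWarning command-line option applies the same filter to a whole process.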

              1. 6

                I thought I was going crazy. Needed validation as no one would state the obvious.

                None of these features is a game changer for PHP. And even less so is all the composer and laravel craze that pretty much boils down to a silly explosion of javaesque boilerplate code.

                Heck, even the introduction of a new object model back in PHP 5 had marginal impact on the language at best.

                PHP’s killer features were:

                • Place script in location to deploy and map a URL to it
                • Out-of-the-box support for MySQL. Easy-to-use alternatives were paid back then, and connecting to MySQL or PostgreSQL was a PITA in most languages.
                • A robust template engine. It still is among the best and most intuitive to use out there, although alternatives exist for every language.
                • Affordable availability on shared hosting with proper performance. This blew the options out of the water, with alternatives costing up to three orders of magnitude more for a minimum setup.

                These things are not killer features anymore. Writing a simple webapp with a Sinatra-like framework is easier than setting up PHP. The whole drop-file-to-deploy model only made sense in the days of expensive shared servers. It is counterproductive in the $3 vps era.

                I would prefer if the language would:

                1. Ship a robust production grade http server to use with the language instead of the whole mess it requires to be used via third party web servers

                2. Even better: drop the whole http request and response as default input/output. It makes no sense nowadays. It is just a cute relic from past decades, which is more a source of trouble than a nicety.

                1. 1

                  Place script in location to deploy and map a URL to it

                  Which was possible for years before PHP via CGI and is no longer possible for PHP in many setups. PHP != mod_php

                  1. 6

                    Which was possible for years before PHP via CGI

                    mod_php did this better than CGI did at the time.

                    1. From what I remember from trying out this stuff at the time, the .htaccess boilerplate for mod_cgi was more hassle and harder to understand.
                    2. CGI got a rep for being slow. fork/exec on every request costs a little, starting a new Perl interpreter or whatever on every request cost a lot. (and CGI in C was a productivity disaster)
                    3. PHP had features like parsing query strings and form bodies for you right out of the box. No need to even write import cgi.

                    Overall the barrier to entry to start getting something interactive happening in PHP was much lower.

                    From what I remember the documentation you could find online was much more tutorial shaped for PHP than what you could find online for CGI.

                    PHP != mod_php

                    Sure now, but pm is discussing the past. PHP == mod_php was de facto true during the period of time in which PHP’s ubiquity was skyrocketing. Where pm above describes what PHP’s killer features “were”, this is the time period they are describing.

                    1. 4

                      mod_php did this better than CGI did at the time.

                      It also did it much worse. With CGI, the web server would fork, setuid to the owner of the public_html directory, and then execve the script. This had some overhead. In contrast, mod_php would run the PHP interpreter in-process. This meant that it had read access to all of the files that the web server had access to. If you had database passwords in your PHP scripts, then you’d better make sure that you trust all of the other users on the system, because they can write a PHP script that reads files from your ~/public_html and sends them to the requesting client. A lot of PHP scripts had vulnerabilities that let them dump the contents of any file that the PHP interpreter could read and this became any file the web server could read when deployed with mod_php. I recall one system I was using being compromised because the web server could read the shadow password file, someone was able to dump it, and then they were able to do an offline attack (back then, passwords were hashed with MD5 and an MD5 rainbow table for a particular salt was something that was plausible to generate) and find the root password. They then had root access on the system.

                      This is part of where the PHP hate came from: ‘PHP is fast’ was the claim, and the small print was ‘as long as you don’t want any security’.

                      1. 1

                        This is completely irrelevant to the onboarding experience.

                        Either way, empirically, people didn’t actually care all that much about the fact that their php webhosts were getting broken into.

                        1. 1

                          This is completely irrelevant to the onboarding experience.

                          It mattered for the people who had their database credentials stolen because mod_php gave everyone else on their shared host read access to the file containing them. You’re right that it didn’t seem to harm PHP adoption though.

                    2. 2

                      Not to the same extent at all. CGI would spawn a process on the operating system per request. It was practically impossible to keep safe. PHP took the request lifecycle out of the developer’s hands, and did so with a huge performance gain compared to CGI. While in theory you could do “the same” with CGI, in practice it was just not viable. When PHP4 arrived, CGI was already in a downward spiral, with most hosting providers disabling access to it. Meanwhile, Microsoft and Sun Microsystems followed PHP’s philosophy by offering ASP and JSP, which had their own share of popularity.

                      PHP is, by and large, mod_php and nowadays fpm. The manual introductory tutorial even assumes such usage. Had they packaged it early on as a regular programming language, with its primary default interpreter hooked up to standard streams, it might have been forgotten today. Although personally I think they should have made that switch long ago.

                  2. 4

                    “Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary.”

                    https://schemers.org/Documents/Standards/R5RS/HTML/

                    1. 1

                      I think it’s the same principle as with source code: you want as little as possible while keeping things readable and correct

                    1. 1

                      The FRP capabilities mentioned look impressive. Interesting also that Whiley is used for teaching – have students been receptive to learning and working with a niche language?

                      I remember finding the Whiley development blog a while back and appreciating how it doesn’t shy away from the uncertainty in language design.

                      1. 2

                        Hey,

                        Well, given the contexts in which we’ve used Whiley (i.e. for teaching about specification and formal methods) then it makes sense to students. A while back we had a 2nd year course with 200 odd students simply called “software correctness” which covered a whole bunch of stuff. It worked great at that level, though the course was eventually cancelled for lots of exciting reasons. They do get frustrated when things don’t work properly, or have weird behaviour. We also use it now in a 3rd year safety critical systems course, where we had been using Alloy before. Students struggled with Alloy to be honest, and Whiley is more like a programming language they know.

                      1. 5

                        Cool project! The documentation looks quite thorough. I’d be interested to see some longer source examples as well.

                        I wonder how the VM handles memory-management? I didn’t see any garbage collection code when I peeked at the source. I’m curious how this language tackles the problem, given Rust’s strictness around memory.

                        1. 3

                          Looks like it uses reference counting exclusively:

                          https://rune-rs.github.io/rune/variables.html

                          1. 2

                            FWIW I stumbled across this paper in my GC research, but didn’t read the whole thing (I’m using C++, not Rust):

                            It seems consistent with my early experience, as I’m finding you need a little bit of casting, but everything else can be type safe. This is how I think of GC:

                            • from the program’s point of view (the “mutator”), the heap is a heterogeneous graph (it has types, arrays, etc. and pointers are edges)
                            • from the GC’s point of view, the heap is a homogeneous graph (or at least more homogeneous, since you only care about record/array sizes, and positions of pointers)

                            Although there are also significant practical differences between GCs for statically-typed languages and dynamically-typed, even though the algorithms are the same.

                            It looks like Rune is dynamically typed, so maybe the experience in this paper doesn’t totally apply to it.
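
                            A rough sketch of the second bullet, in Python rather than C++ or Rust. Cell, mark and the toy heap are made up for illustration and have nothing to do with Rune or with the Immix collector in the paper: the collector sees only a size and a list of pointer slots per object, never the object's type.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """The GC's view of a heap object: a size and the slots that hold pointers."""
    size: int
    pointers: list = field(default_factory=list)  # indices of other cells in the heap


def mark(heap: list, roots: list) -> set:
    """Mark phase: follow pointer slots from the roots, ignoring object types."""
    reachable = set()
    stack = list(roots)
    while stack:
        idx = stack.pop()
        if idx in reachable:
            continue  # already visited, which also makes cycles safe
        reachable.add(idx)
        stack.extend(heap[idx].pointers)
    return reachable


# Toy heap: cells 0 -> 1 -> 2 -> 0 form a cycle; cell 3 is garbage.
heap = [Cell(16, [1]), Cell(8, [2]), Cell(24, [0]), Cell(32, [])]
live = mark(heap, roots=[0])
reclaimable_bytes = sum(c.size for i, c in enumerate(heap) if i not in live)
```

                            Whether a cell was a record, an array or a closure in the program never enters into it, which is the sense in which the graph looks homogeneous to the collector.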


                            Rust as a Language for High Performance GC Implementation

                            https://scholar.google.com/scholar?cluster=7217598857552682372&hl=en&as_sdt=0,5&sciodt=0,5

                            http://users.cecs.anu.edu.au/~steveb/pubs/papers/rust-ismm-2016.pdf

                            We describe our experience implementing an Immix garbage collector in Rust and C. We discuss the benefits of Rust, the obstacles encountered, and how we overcame them. We show that our Immix implementation has almost identical performance on micro benchmarks, compared to its implementation in C, and outperforms the popular BDW collector on the gcbench micro benchmark. We find that Rust’s safety features do not create significant barriers to implementing a high performance collector. Though memory managers are usually considered low-level, our high performance implementation relies on very little unsafe code, with the vast majority of the implementation benefiting from Rust’s safety. We see our experience as a compelling proof-of-concept of Rust as an implementation language for high performance garbage collection.

                          1. 1

                            I think that people only hate a technology when:

                            a) it’s frustrating to use, and

                            b) they’re stuck with it (there’s no better alternative)

                            The latter only really happens when a technology is widespread in a given domain. It follows that OOP is hated because of its ubiquity, not in spite of it.

                            A failed technology isn’t one that nobody likes; it’s one that nobody remembers.

                            1. 1

                              I’ve tried to implement a type system based on this algorithm. Things get tricky when you throw recursive objects into the mix, and I was never able to come up with a good workaround. That being said, I think the ideas have potential, especially when it comes to retrofitting dynamic language with type checking.

                              1. 2

                                Very nice! Is the process of queuing a print and posting it to the customer automated, or is the “end point” a mail to you and you print & parcel it up when you have time?

                                1. 3

                                  The process is pretty much completely automated! Once you submit a design and payment, the backend automatically generates the finalized design and sends the order to the print service. The poster then gets sent directly to the provided shipping address when it’s completed. I make the system wait for me to confirm each order, but that’s just a matter of pressing a button.

                                  If the site gets more interest I might do a write-up on the implementation – it was pretty interesting to design.

                                  1. 2

                                    I would be interested to read that write-up :-)

                                    1. 2

                                      I would also be interested to read that writeup!

                                  1. 4

                                    I like the idea but the contrast between the unfilled* pieces and the board is too low - they all end up looking like there’s only one player from a distance.

                                    * filled with the white square colour.

                                    1. 2

                                      Hmm that’s a fair point. I just took a look at the samples I’ve ordered: they look like you’d expect at a normal viewing distance, but become more abstract when viewed from across the room like you said. I’ll look into whether there’s a way to set the piece background color.

                                    1. 5

                                      I take notes with this little page I made: https://averyn.net/paper.html

                                      It’s got a title field and a content textbox. When you change the title of the note the page title changes. You then save the note by saving the entire page. I like it because it lets me take notes in the browser, while keeping my data offline.

                                      1. 1

                                        Wrapping up my custom-chess-poster-generator site checkmateposters.com. Now I have to learn how to market it! Any tips?

                                        1. 2

                                          Trying my hand at graphic design making some chess posters. Tomorrow I’ll make a few more for some classic games and see if I can sell ’em!

                                          1. 3

                                            Method syntactic sugar: these two forms are equivalent:

                                            cat('urls.txt').grep('https://').print()
                                            print(grep(cat('urls.txt'), 'https://'))
                                            

                                            I love this feature. Seems like it would provide great support for both point-free style and OOP.
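
                                            A rough Python sketch of how this kind of sugar can be emulated; it is not how the language in question implements it. Pipe, grep and count are hypothetical helpers, and a plain list stands in for cat('urls.txt') so that no file is needed.

```python
def grep(lines, needle):
    """Keep only the lines that contain needle."""
    return [line for line in lines if needle in line]


def count(lines):
    """Number of lines."""
    return len(lines)


class Pipe:
    """Wraps a value so that pipe.f(args) means f(value, args)."""

    def __init__(self, value, namespace):
        self._value = value
        self._namespace = namespace

    def __getattr__(self, name):
        # Only reached for names that are not real attributes of Pipe.
        try:
            func = self._namespace[name]
        except KeyError:
            raise AttributeError(name) from None

        def call(*args):
            return Pipe(func(self._value, *args), self._namespace)

        return call

    def unwrap(self):
        return self._value


FUNCS = {"grep": grep, "count": count}
urls = ["https://a.example", "http://b.example", "https://c.example"]

method_style = Pipe(urls, FUNCS).grep("https://").count().unwrap()
function_style = count(grep(urls, "https://"))
```

                                            Both spellings call the same plain functions, so nothing has to be defined twice to support the two styles.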

                                            1. 4

                                              Nim also has this feature.

                                            1. 3

                                              I’m learning F# now, I’m practicing with Exercism exercises (they provide tests and exercise description). One exercise is a simple Forth (subset). After skimming this article (only superficially, as I had a pretty long day) I found some aspects similar to my solution.

                                              I’m sharing it in case anybody is interested. Feedback is welcome, but keep in mind that it is practice, not meant to be fast/efficient/optimal/user friendly. In fact, it has no shell/REPL.

                                              https://gist.github.com/kodfodrasz/d9a8054d6d5d86ff5a2687d51150f02d

                                              1. 2

                                                I don’t know anything about F#, so I can’t give much feedback, but it looks neat! I can see some similarities to the Pointless code. Thanks for sharing!

                                                1. 1

                                                  I’ve been doing exercism for Elixir and I am really looking forward to the Forth problem. It seems like it’ll be fun and I really want to see what the community solutions look like.

                                                  1. 1

                                                    Where’s the return stack, or the dictionary? The immediate time macros? Memory?

                                                    It’s very cool, but it’s just a stack computer. Forth at a minimum is a two stack language.

                                                    1. 1

                                                      a simple Forth (subset)

                                                      1. 1

                                                        But it’s only technically a subset of Forth. It’s a stack machine that processes generic stack machine instructions, with a forth-inspired word definition mechanism. Forth is : forth stacks blocks words ; (little in-joke there). It doesn’t even have an address/return stack!

                                                  1. 2

                                                    (just a total side-note on the site: You might not want to add the <header> to the <main> body, because otherwise it gets rendered in the (Firefox) reader view.

                                                    1. 1

                                                      Good call, thanks!

                                                      1. 1

                                                        )

                                                      1. 1

                                                        Awesome talk! I’d imagine the next step towards safety in the examples given would be to use something like smart constructors:

                                                        data RGB = RGB (Double, Double, Double) deriving Show
                                                        
                                                        rgb (r, g, b) | r < 0 || r > 1 = error "r out of range"
                                                                      | g < 0 || g > 1 = error "g out of range"
                                                                      | b < 0 || b > 1 = error "b out of range"
                                                                      | otherwise      = RGB (r, g, b)
                                                        

                                                        You could then write tests against this implementation, but I wonder how effective that would be? It seems like as soon as you think of an edge case to test against, you can just add it as an assertion and have it covered. I’d imagine that when writing stateless code, most unit tests can be made redundant by incorporating equivalent assertions throughout the program (though redundancy could be beneficial).
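
                                                        For comparison, a loose Python translation of the smart constructor above (an assumption for illustration, not code from the talk). The range check lives in the constructor, so an out-of-range RGB cannot be built, and the check doubles as the edge-case test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RGB:
    """Colour with each channel constrained to the range 0..1."""
    r: float
    g: float
    b: float

    def __post_init__(self):
        # Runs on every construction, so no invalid RGB value can exist.
        for name in ("r", "g", "b"):
            value = getattr(self, name)
            if not 0 <= value <= 1:
                raise ValueError(f"{name} out of range")


teal = RGB(0.0, 0.5, 0.5)
```

                                                        A unit test for "red must not exceed 1" would only restate the check in __post_init__, which is the redundancy described above.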

                                                        1. 1

                                                          PEP 617 “New PEG parser for CPython” linked in the article was a really good read, with a nice comparison of real-world parser capabilities and limitations. Could be a good resource for those looking to develop parsers for new programming languages. Also, the use of position-based memoization to accommodate left-recursion is a neat idea that I hadn’t seen before.

                                                          1. 2

                                                            I also enjoyed this extensive explanation from Python creator and PEG parser implementor Guido van Rossum himself, which can be found in this video: https://youtu.be/QppWTvh7_sI

                                                            It’s also just a fun video on language parsers in general

                                                          1. 12

                                                            This is interesting, and I think I agree with many arguments when it comes to the reasons java, OCaml, Haskell, Go, etc. haven’t replaced C. However the author cites rust only briefly (and C++ not at all?) and doesn’t really convince me why rust isn’t the replacement he awaits: almost all you can do in C, you can do in rust (using “unsafe”, but that’s what rust is designed for after all), or in C++; you still have a higher-level, safer (by default; you can still write unsafe code where needed), unmanaged language that can speak the C ABI. Some projects have started replacing their C code with rust incrementally (librsvg, I think? And of course firefox) because rust can speak the C ABI and use foreign memory like a good citizen of the systems world. Even C compilers are written in C++ these days.

                                                            To me that’s more “some were meant for no-gc, C ABI speaking, unsafe-able languages” than “some were meant for C”. :-)

                                                            1. 17

                                                              Besides Rust, I think Zig, Nim, and D are strong contenders. Nothing against Rust, of course, but I’m not convinced it’s the best C replacement for every use case. It’s good to have options!

                                                              Nonetheless, I imagine C will linger on for decades to come, just due to network effects and economics. Legacy codebases, especially low-level ones, often receive little maintenance effort relative to usage, and C code is incredibly widespread.

                                                              1. 15

                                                                I love Rust, but I think Zig and D (in the ‘better C’ mode and hopefully their new borrow checker) are closer to the simplicity and low-level functionality of C. Rust is a much nicer C++, with simpler (hah!) semantics and more room to improve. C++ is, unfortunately, a Frankenstein monster of a language that requires a 2000 page manual just to describe all the hidden weird things objects are doing behind your back. Every time I have to re-learn move semantics for a tiny project, I want to throw up.

                                                                1. 3

                                                                  i was also wondering, while reading the article, how well ada would fit the author’s use case (i’m not at all familiar with the language, i’ve just heard it praised as a safe low-level language)

                                                                  1. 1

                                                                    The lot of them! It was kind of a large gap between C and Python/Perl/Ruby/Java.

                                                                    1. 12

                                                                      Maybe I’m the archetype of a C-programmer not going for Rust. I appreciate Rust and as a Mathematician, I like the idea of hard guarantees that aren’t a given in C. However, Rust annoys me for three main reasons, and these are deal-breakers for me:

                                                                      • Compile time: This is not merely the language’s fault, and has more to do with how LLVM is used, but there doesn’t seem to be much push to improve the situation, either. It annoys me as a developer, but it also really annoys me as a Gentoo user when a Firefox compilation takes longer and longer with each subsequent Rust release. Golang is a shining example for how you can actually improve compilation times over C. Admittedly, Rust has more static analysis, but damn is it slow to compile! I like efficiency, who doesn’t? Rust really drops the ball there.
                                                                      • Standard library/external libraries: By trying to please everyone and not mandating certain solutions, one is constantly referred to this or that library on GitHub that is “usually used” and “recommended”. In other cases, there are two competing implementations. Sure, Rust is a young language, but for that reason alone I would never choose it to build anything serious on top of it, as one needs to be able to rely on interfaces. The Rust developers should stop trying to please everybody and come up with standard interfaces that also get shipped with the standard install.
                                                                      • Cargo/Package management: This point is really close to the one before it: Cargo is an interesting system, but really ends up becoming a “monosolution” for Rust setups. Call me old-fashioned, but I like package managers (especially Gentoo’s) and Cargo just works around it. When installing a Rust package, you end up having to be connected to the internet and often end up downloading dozens of small crates from some shady GitHub repos. I won’t make the comparison with node.js, given Cargo can be “tagged” to a certain version, but I could imagine a similar scenario to leftpad in the future. Rust really needs a better standard library so you don’t have to pull in so much stuff from other people.

                                                                      To put it shortly: What I like about C is its simplicity and self-reliance. You don’t need Cargo to babysit it, you don’t need dozens of external crates to do basic stuff and it doesn’t get in the way of the package manager. I actually like Rust’s ownership system, but hate almost anything around it.

                                                                      1. 16

                                                                        C doesn’t even have a hash table. It needs more external libraries to do basic stuff, not less.

                                                                        1. 5

                                                                          See, I feel the exact opposite when it comes to Cargo vs. system’s package manager: managing versions of libraries using your system’s package manager is a royal pain in the ass or outright impossible when you have multiple projects requiring different versions of a library. In my experience with C and C++, you’ll end up using CMake or Meson to build exactly the same functionality that Cargo deploys for you, at a much higher cost than just adding one line in a configuration file.

                                                                          In fact, my biggest gripe with C and C++ is that they still depend on a 3-step build system (preprocessing, compiling, linking) each of which requires you to specify the location of a group of files. I get why having header files was attractive in the 1970s when you counted your computer’s memory in KBs, but it makes writing and maintaining code such a pain in the ass when compared with a modern module system.

                                                                          The funniest bit is I used to consider all these things as ‘easy to deal with’ when all I did was write C/C++ code 15 years ago. Nowadays, having to switch from Go, Rust or any other language to C for a small project makes me want to cry because I know I’ll spend about 20% of the time managing bullshit that has nothing to do with the code I care about.

                                                                          1. 9

                                                                            Build systems in C give a feeling of craftsmanship. It takes skill to write a Makefile that correctly supports parallel builds, exact dependencies, interruptions, cleanups, etc. And so much work goes into making it work across platforms and compilers.

                                                                            And then Cargo just makes it pointless. It’s like you were the king’s best messenger, trained to ride the fastest stallions, and Cargo’s like “thanks, but we’ve got e-mail”.

                                                                            1. 2

                                                                              LOL, I guess it’s a matter of age. When I first started programming, I’d love all that stuff. I can’t count how many libraries and tools I re-implemented or extended because they didn’t do something exactly the way I wanted it. Or the nights I spent configuring my Linux machine to work just right. Or the CPU time I spent re-encoding all of my MP3 collection to VBR because it’d save 5% of storage.

                                                                              Now, I learned Cargo for a tiny project and I keep swearing every time I have to start a Python virtualenv because it’s just not easy enough, goddammit!

                                                                          2. 2

                                                                            This is a fair criticism of C; personally, I would love to see a set of commonly used data structures added to the C standard library. However, currently in the C world you either write your own or use something like glib, neither of these cases require the equivalent of Cargo.

                                                                            1. 4

                                                                              However, currently in the C world you either write your own or use something like glib, neither of these cases require the equivalent of Cargo.

                                                                              Neither does using the Rust standard library, which also has a hash table implementation (and many other useful data structures). You can just use std and compile your project with rustc.
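
A minimal sketch of that claim. The file name `wordcount.rs` and the function `count_words` are made up for illustration. It depends only on `std::collections::HashMap` and should build with a plain `rustc wordcount.rs`, with no Cargo and no crates involved:

```rust
use std::collections::HashMap;

/// Count how often each whitespace-separated word occurs in `text`.
/// (Hypothetical helper, only here to exercise std's hash table.)
fn count_words(text: &str) -> HashMap<&str, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // `entry` inserts 0 the first time a word is seen, then we bump it.
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = count_words("the quick fox jumps over the lazy dog the end");
    println!("'the' appears {} times", counts["the"]);
}
```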

                                                                              1. 1

                                                                                We’re talking about dependencies in general, not just hash tables. FRIGN’s point is that the Rust standard library is lacking, so you end up needing crates.

                                                                                1. 3

                                                                                  But you and FRIGN are complaining about the Rust standard library compared to C. The Rust standard library is much more comprehensive than the C standard library or the C standard library + glib. So, the whole point seems to be void if C is the point of comparison.

                                                                                  If you are comparing to the Java standard library, sure!

                                                                                  1. 1

                                                                                    But you and FRIGN are complaining about the Rust standard library compared to C.

                                                                                    Not really. The point being made is that a typical Rust application has to download a bunch of stuff from GitHub (crates), whereas a typical C application does not.

                                                                                    1. 8

                                                                                      That’s just because it’s convenient and most people don’t really care that it happens. But it’s not inherent to the tooling:

                                                                                      $ git clone -b ag/vendor-example https://github.com/BurntSushi/ripgrep
                                                                                      $ cd ripgrep
                                                                                      $ cargo build --release
                                                                                      

                                                                                      Other than the initial clone (obviously), nothing should be talking to GitHub or crates.io. You can even do cargo build --release --offline if you’re paranoid.

                                                                                      I set that up in about 3 minutes. All I did was run cargo vendor, set up a .cargo/config to tell it to use the vendor directory, commit everything to a branch and push it. Easy peasy. If this were something a lot of people really cared about, you’d see this kind of setup more frequently. But people don’t really care as far as I can tell.

                                                                                      whereas a typical C application does not

                                                                                      When was the last time you built a GNU C application? Last time I tried to build GNU grep, its build tooling downloaded a whole bunch of extra goop.

                                                                                      1. -2

                                                                                        Nice strawman, I said a typical C application, not a typical GNU C application.

                                                                                        1. 5

                                                                                          TIL that a GNU C application is not a “typical” C application. Lol.

                                                                                          1. -1

                                                                                            None of the C code I’ve worked on was written by GNU, and most of the C code out in the real world wasn’t written by GNU either. I find it frankly bizarre that you are seriously trying to suggest that GNU’s practices are somehow representative of all projects written in C.

                                                                                            1. 3

                                                                                              You said “a typical C application.” Now you’re saying “representative” and “what I’ve worked on.”

                                                                                              If the implementation of coreutils for one of the most popular operating systems in history doesn’t constitute what’s “typical,” then I don’t know what does.

                                                                                              Talk about bizarre.

                                                                                              Moreover, you didn’t even bother to respond to the substance of my response, which was to point out that the tooling supports exactly what you want. People just don’t care. Instead, you’ve decided to double down on your own imprecise statement and have continued to shift the goal posts.

                                                                                              1. 0

                                                                                                and most of the C code out in the real world wasn’t written by GNU either.

                                                                                                ^ typical

                                                                                                I don’t have the time or patience to debate semantics though.

                                                                                                As for your other point, see FRIGN’s comment for my response. (It doesn’t matter what’s possible when the reality is random crates get pulled from github repos)

                                                                            2. 1

                                                                              C doesn’t even have a hash table.

                                                                              Why do you say “even”? There are many hash table implementations in C, with different compromises. It would be untoward if any of them made its way into the base language. There are other things missing in C which are arguably more fundamental (to me) before hash tables. It is only fair if all of these things are kept out of the language, lest the people whose favorite feature has not been included feel alienated by the changes.

                                                                            3. 15

                                                                              but there doesn’t seem to be much push to improve the situation

                                                                              Definitely not true. There are people working on this and there has been quite a bit of progress:

                                                                              $ git clone https://github.com/BurntSushi/ripgrep
                                                                              $ cd ripgrep
                                                                              $ git checkout 0.4.0
                                                                              $ time cargo +1.12.0 build --release
                                                                              
                                                                              real    1:04.05
                                                                              user    1:51.42
                                                                              sys     2.282
                                                                              maxmem  360 MB
                                                                              faults  736
                                                                              $ time cargo +1.43.1 build --release
                                                                              
                                                                              real    19.065
                                                                              user    2:34.51
                                                                              sys     3.101
                                                                              maxmem  740 MB
                                                                              faults  0
                                                                              

                                                                              That’s 30% of what it once was a few years ago. Pretty big improvement from my perspective. The compilation time improvements come from all around too. Whether it’s improving the efficiency of parallelism or micro-optimizing rustc itself: here, here, here, here, here, here or here.

                                                                              People care.

                                                                              The Rust developers should stop trying to please everybody and come up with standard interfaces that also get shipped with the standard install.

                                                                              That’s one of std’s primary objectives. It has tons of interfaces in it.

                                                                              This criticism is just so weird, given that your alternative is C. I mean, if you want the C experience of “simplicity and self-reliance,” then std alone is probably pretty close to sufficient. And if you want the full POSIX experience, bring in libc and code like it’s C. (Or maybe use a safe interface that somebody else has thoughtfully designed.)

                                                                              When installing a Rust package, you end up having to be connected to the internet

                                                                              You do not, at least, no more than you are with a normal Linux distro package manager. This was a hard requirement. Debian for example requires the ability to use Cargo without connecting to the Internet.

                                                                              and often end up downloading dozens of small crates from some shady GitHub repos.

                                                                              Yup, the way the crates.io model works means the burden of doing due diligence is placed on each person developing a Rust project. But if you’re fine with the spartan nature of C’s standard library, then you should be just fine using a pretty small set of well established crates that aren’t shady. Happy to see counter examples though!

                                                                              but I could imagine a similar scenario to leftpad in the future.

                                                                              The leftpad disaster was specifically caused by someone removing their package from the repository. You can’t do that with crates.io. You can “yank” crates, but they remain available. Yanking a crate just prevents new dependents from being published.

                                                                              Rust really needs a better standard library so you don’t have to pull in so much stuff from other people.

                                                                              … like C? o_0

                                                                              1. 10

                                                                                This point is really close to the one before it: Cargo is an interesting system, but really ends up becoming a “monosolution” for Rust setups. Call me old-fashioned, but I like package managers (especially Gentoo’s) and Cargo just works around it.

                                                                                C has just been in the luxurious position that its package managers have been the default system package managers. Most Linux package managers are effectively C package managers. Of course, over time packages for other languages have been added, but they have mostly been second-class citizens.

                                                                                It is logical that Cargo works around those package managers. Most of them are a mismatch for Rust/Go/node.js packages, because they are centered around distributing C libraries, headers, and binaries.

                                                                                but I could imagine a similar scenario to leftpad in the future.

                                                                                Rust et al. certainly have a much higher risk, since anyone can upload anything to crates.io. However, I think it is also an illusion that distribution maintainers are actually vetting code. In many cases maintainers will just bump versions and update hashes. Of course, there is some gatekeeping in that distributions usually only provide packages from better-known projects.

                                                                                Rust really needs a better standard library so you don’t have to pull in so much stuff from other people.

                                                                                You mean a large standard library like… C?

                                                                                1. 3

                                                                                  C has just been in the luxurious position that its package managers have been the default system package managers.

                                                                                  This just isn’t true, if you look at how packages are built for Debian for example you will find that languages such as Python and Perl are just as well supported as C. No, the system package managers are for the most part language agnostic.

                                                                                  1. 5

                                                                                    This just isn’t true, if you look at how packages are built for Debian for example you will find that languages such as Python and Perl are just as well supported as C.

                                                                                    Most distributions only have a small subset of popular packages and usually only a limited number of versions (if multiple at all).

                                                                                    The fact that most Python development happens in virtual environments with pip-installed packages, even on personal machines, shows that most package managers and package sets are severely lacking for Python development.

                                                                                    s/Python/most non-C languages/

                                                                                    No, the system package managers are for the most part language agnostic.

                                                                                    Well, if you define language agnostic as “can dump files in a global namespace”, because that’s typically enough for C libraries, sure. However, that does not work for many other languages, for various reasons, such as: no guaranteed ABI stability (so, any change down the chain of dependencies needs to trigger builds of all dependents, but there is no automated way to detect this, because packages are built in isolation), no strong tradition of ABI stability (various downstream users need different versions of a package), etc.

                                                                                    1. 5

                                                                                      No, most development happens in virtualenv because python packaging is so broken that if you install a package you cannot reliably uninstall it.

                                                                                      If we didn’t have a package manager for each language then the packages maintained by the OS would be more comprehensive, by necessity. Basically having a different packaging system for each programming language was a mistake in my view. I have some hope that Nix will remedy the situation somewhat.

                                                                                      edit: it’s also difficult to reply to your comments if you substantially edit them by adding entirely new sections after posting…

                                                                                      1. 5

                                                                                        No, most development happens in virtualenv because python packaging is so broken that if you install a package you cannot reliably uninstall it.

                                                                                        I have no idea what you mean here. Can’t you use dpkg/APT or rpm/DNF to uninstall a Python package?

                                                                                        If we didn’t have a package manager for each language then the packages maintained by the OS would be more comprehensive, by necessity.

                                                                                        We are going in circles. Why do you think languages have package managers? Technical reasons: the distribution package managers are too limited to handle what languages need. Social/political reasons: having the distributions as gatekeepers slows down the evolution of language ecosystems.

                                                                                        I have some hope that Nix will remedy the situation somewhat.

                                                                                        Nix (and Guix) can handle this, because it is powerful enough to implement the necessary language-specific packaging logic. In fact, Nix’ buildRustCrate is more or less an implementation of Cargo in Nix + shell script. It does not use Cargo. Moreover, Nix can handle a lot of the concerns that I mentioned upthread: it can easily handle multiple different versions of a package and ABI-instability. E.g. if in Nix the derivation of say the Rust compiler is updated, all packages of which Rust is a transitive dependency are rebuilt.

                                                                                        As I said, traditional package managers are built for a C world. Not a Rust, Python, Go, or whatever world.

                                                                                2. 4

                                                                                  The first problem is a technical one, unless Rust is doing things such that it can’t be compiled efficiently, but the latter two are cultural ones which point up differences in what language designers and implementers are expected to provide then versus now: In short, Rust tries to provide the total system, everything you need to build random Rust code you find online, whereas C doesn’t and never did. Rust is therefore in with JS, as you mention, but also Perl, Python, Ruby, and even Common Lisp now that Quicklisp and ASDF exist.

                                                                                  I was going to “blame” Perl and CPAN for this notion that language implementations should come with package management, but apparently CPAN was made in imitation of CTAN, the Comprehensive TeX Archive Network, so I guess this goes back even further. However, the blame isn’t with the language implementers at all: Packaging stuff is one of those things which has been re-invented so many times it’s bound to be re-invented a few more, simply because nobody can decide on a single way of doing it. Therefore, since language implementers can’t rely on OS package repos to have a rich selection of up-to-date library versions, and rightly balk at the idea of making n different OS-specific packages for each version of each library, it’s only natural each language would reinvent that wheel. It makes even more sense when you consider people using old LTS OS releases, which won’t get newer library versions at this point, and consider longstanding practice from the days before OSes tended to have package management at all.

                                                                                  1. 7

                                                                                    Therefore, since language implementers can’t rely on OS package repos to have a rich selection of up-to-date library versions, and rightly balk at the idea of making n different OS-specific packages for each version of each library, it’s only natural each language would reinvent that wheel.

                                                                                    This is right on the mark.

                                                                                    Sorry if this is a bit of a tangent, but I think it is not just a failing of package sets – from the distributor’s perspective it is impossible to package every Rust crate and Rust crate version manually – but especially of package managers themselves. There is nothing that prevents a powerful package management system from generating package sets from Cargo.lock files. But most package managers were not built for generating package definitions programmatically, and most package managers do not allow installing multiple package versions in parallel (e.g. ndarray 0.11.0 and ndarray 0.12.0).

                                                                                    Nix shows that this is definitely feasible, e.g. the combo of crate2nix and buildRustCrate can create Nix derivations for every dependency in a Cargo.lock file. It does not use cargo at all and compiles every crate into a separate Nix store path. As a result, Rust crates are not really different from any other package provided through nixpkgs.

                                                                                    I am not that familiar with Guix, but I bet it could do the same.

                                                                                  2. 3

                                                                                    Build times are important, and you’re right, Rust takes a while to compile. Given the choice between waiting for rustc to finish and spending a lot longer debugging a C program after the fact, I choose the former. Or, better yet, use D and get the best of both worlds.

                                                                                    1. 3

                                                                                      rust can be damn fast to compile, most rust library authors just exercise fairly poor taste in my opinion and tend not to care how bad their build times get. sled.rs compiles in 6 seconds on my laptop, and most other embedded databases take a LOT longer (usually minutes), despite sled being Rust and them being C or C++.

                                                                                      rust is a complex tool, and as such, you need to exercise judgement (which, admittedly, is rare, but that’s no different from anything else). you can avoid unnecessary genericism, proc macros, and trivial dependencies to get compile times that are extremely zippy.
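
One common way to cut down on "unnecessary genericism" is to keep the generic part of a function as a thin shim and push the real work into a non-generic inner function, so the body is compiled once instead of once per caller type. This is only a sketch of that general pattern, not something taken from sled; `extension_len` is a made-up example function, and whether it helps a given crate is something to measure.

```rust
use std::path::Path;

/// Generic entry point: callers may pass `&str`, `String`, `PathBuf`, ...
/// Only this one-line shim is monomorphized per caller type.
fn extension_len<P: AsRef<Path>>(path: P) -> usize {
    // Non-generic body: compiled a single time, no matter how many
    // different `P` types the callers use.
    fn inner(path: &Path) -> usize {
        path.extension().map_or(0, |ext| ext.len())
    }
    inner(path.as_ref())
}

fn main() {
    println!("{}", extension_len("notes.txt"));
    println!("{}", extension_len(String::from("archive.tar.gz")));
    println!("{}", extension_len(Path::new("Makefile")));
}
```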

                                                                                      1. 2

                                                                                        Thanks for your insights! I’ll keep it in mind the next time I try out Rust.

                                                                                        1. 1

                                                                                          Feel free to reach out if you hit any friction, I’m happy to point folks in the direction they want to go with Rust :)

                                                                                    2. 5

                                                                                      almost all you can do in C, you can do in rust

                                                                                      As a bit of anecdata, I immediately recognized the snippet with the ELF header from the article, because I used one of the same tricks (just memcpying a repr(C) struct) for writing ELF files in Rust a couple of months ago.
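
For readers who have not seen the trick, here is a rough sketch of what "memcpying a repr(C) struct" can look like. It is not the code from the article or from the commenter's project: `Ident`, `sample_ident` and `ident_bytes` are hypothetical names, and the struct is a simplified stand-in for the 16-byte ELF identification block. Every field is a byte, so there is no padding to worry about.

```rust
use std::mem::size_of;

/// Simplified, hypothetical stand-in for the ELF identification bytes.
/// `repr(C)` fixes the field order; all-`u8` fields mean no padding.
#[repr(C)]
#[derive(Clone, Copy)]
struct Ident {
    magic: [u8; 4],
    class: u8,
    data: u8,
    version: u8,
    os_abi: u8,
    abi_version: u8,
    padding: [u8; 7],
}

/// Example values: 64-bit, little-endian, version 1.
fn sample_ident() -> Ident {
    Ident {
        magic: [0x7f, b'E', b'L', b'F'],
        class: 2,
        data: 1,
        version: 1,
        os_abi: 0,
        abi_version: 0,
        padding: [0; 7],
    }
}

/// Copy the struct's in-memory representation into a byte array.
fn ident_bytes(ident: &Ident) -> [u8; size_of::<Ident>()] {
    let mut out = [0u8; size_of::<Ident>()];
    // SAFETY: `Ident` is `repr(C)` and consists only of `u8` fields, so it
    // has no padding and every byte is initialized; `out` has exactly
    // `size_of::<Ident>()` bytes and does not overlap `ident`.
    unsafe {
        std::ptr::copy_nonoverlapping(
            ident as *const Ident as *const u8,
            out.as_mut_ptr(),
            size_of::<Ident>(),
        );
    }
    out
}

fn main() {
    let bytes = ident_bytes(&sample_ident());
    println!("{:02x?}", bytes);
}
```

With multi-byte fields the same approach also bakes in the host's endianness and any padding, so a real writer would need to account for both.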

                                                                                      1. 2

                                                                                        I thought about using rust to implement a byte-code compiler / vm for a gc’d language project, but I assumed that this would require too much fighting to escape rust’s ownership restrictions. Do you have any insight into how well suited rust is for vm implementation? I haven’t used the language much but I’d love to pick it up if I thought I could make it work for my needs.

                                                                                        (I see that there’s a python interpreter written in rust, but I’m having trouble locating its gc implementation)

                                                                                        1. 5

                                                                                          I honestly don’t know, I have never written an interpreter. You probably can fall back on unsafe for some things anyway, and still benefit from the move semantics, sum types, syntax, and friendly error messages. I’m doing a bit of exploring symbolic computations with rust and there is also some design space exploration to be done there.

                                                                                          1. 2

                                                                                            IMO, Rust is just as good as C and C++ for projects like this, if not better thanks to pattern matching and a focus on safety (which goes far beyond the borrow checker). Don’t be afraid to use raw pointers and NonNull pointers when they are appropriate.
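
To make the NonNull suggestion a bit more concrete, here is a small sketch under my own assumptions, not a real collector: a hypothetical `Heap` that threads every allocated `Object` onto an intrusive list through `NonNull` pointers, the way a mark-and-sweep VM often tracks its objects. All names are invented, and the only "collection" is freeing everything when the heap is dropped.

```rust
use std::ptr::NonNull;

/// A heap object in a hypothetical VM, linked into an "all objects" list.
struct Object {
    value: i64,
    next: Option<NonNull<Object>>,
}

/// Owns every object it allocates; frees them all on drop.
struct Heap {
    head: Option<NonNull<Object>>,
    len: usize,
}

impl Heap {
    fn new() -> Self {
        Heap { head: None, len: 0 }
    }

    /// Allocate an object and push it onto the front of the list.
    fn alloc(&mut self, value: i64) -> NonNull<Object> {
        let boxed = Box::new(Object { value, next: self.head });
        // `Box::leak` hands back a `&mut Object`, which is never null.
        let ptr = NonNull::from(Box::leak(boxed));
        self.head = Some(ptr);
        self.len += 1;
        ptr
    }

    /// Walk the list, as a sweep phase would.
    fn sum(&self) -> i64 {
        let mut total = 0;
        let mut cur = self.head;
        while let Some(ptr) = cur {
            // SAFETY: every pointer in the list came from `alloc` and is
            // only freed in `Drop`, so it is valid while `self` is alive.
            let obj = unsafe { ptr.as_ref() };
            total += obj.value;
            cur = obj.next;
        }
        total
    }
}

impl Drop for Heap {
    fn drop(&mut self) {
        let mut cur = self.head.take();
        while let Some(ptr) = cur {
            // SAFETY: each object was leaked from a `Box` in `alloc` and is
            // reclaimed exactly once here.
            let boxed = unsafe { Box::from_raw(ptr.as_ptr()) };
            cur = boxed.next;
        }
    }
}

fn main() {
    let mut heap = Heap::new();
    heap.alloc(1);
    heap.alloc(2);
    heap.alloc(39);
    println!("{} objects, sum {}", heap.len, heap.sum());
}
```

Pointers returned by `alloc` dangle once the `Heap` is dropped; keeping that invariant is exactly the part the borrow checker no longer does for you.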

                                                                                            1. 2

                                                                                              Also, just saw this GC for Rust on the front page: https://github.com/zesterer/broom/blob/master/README.md

                                                                                              Looks like it’s designed specifically for writing dynamic languages in Rust.

                                                                                              1. 1

                                                                                                Oh cool! This looks super useful

                                                                                            2. 2

                                                                                              I wrote a ST80 VM in rust just to play around; the result was beautifully simple and didn’t require any unsafe code, though it doesn’t currently have any optimization at all. The result was still reasonably snappy, but I suspect that a big part of that is that the code I was running on it was designed in, well, 1980.

                                                                                              1. 2

                                                                                                I recently did a simple Lisp interpreter in Rust. I eventually decided to redo it in C because of shared mutation and garbage collection.

                                                                                                1. 2

                                                                                                  Rust works well for VM implementation. For GC, you may want to look at https://github.com/zesterer/broom.

                                                                                              1. 2

                                                                                                I wish that I understood monads in Haskell well enough to appreciate this article, but I’ve had so much trouble trying to grasp how monadic IO and state work under the hood that I’ve never gotten around to learning about the different ways that they can be used.

                                                                                                In particular I remember reading these paragraphs from two different articles:

                                                                                                From Unraveling the mystery of the IO monad

                                                                                                “When we teach beginners about Haskell, one of the things we handwave away is how the IO monad works. Yes, it’s a monad, and yes, it does IO, but it’s not something you can implement in Haskell itself, giving it a somewhat magical quality.”

                                                                                                And from Pure IO monad and Try Haskell

                                                                                                “As Haskellers worth their salt know, the IO monad is not special … I’d recommend Haskell intermediates (perhaps not newbies) to implement their own IO monad as a free monad, or as an mtl transformer, partly for the geeky fun of it, and partly for the insights.”

                                                                                                These inconsistent descriptions of how much magic these monads involve are, for me, a far bigger source of confusion than notation.

                                                                                                1. 4

                                                                                                  Here’s a short explanation of what’s going on under the hood. In short, it’s a problem of representing interactions using functions.

                                                                                                  You could think that under every “print”, etc. there’s a function that takes a world and gives a world where something’s been printed to your screen. Now you can chain these functions to do input/output.

                                                                                                  But what if you write a function of a type such as World → (World, World)? There’s a trick to prevent this. Let’s say that you must return a function World → World, and the result you return determines which interaction actually happens when you run the program. Even then, you can still create “speculated” images of the world.

                                                                                                  So there’s still a problem. What if you do this:

                                                                                                  let (x, w1) = read (print "input?" world)
                                                                                                  in print ("hello " ++ x) world
                                                                                                  

                                                                                                  To be clear, this program produces its output from a value that can never actually be produced, because it is the result of speculation: the read happens in a world that is then thrown away.

                                                                                                  It’s possible to limit the production of results so that you can only read results that you actually chained into the interaction. This is achieved by receiving the result in a function, e.g. you get the result by passing in a continuation of type input → World. Now the system can prepend the interaction to the result, and you can no longer use values that end up being pure speculation.

                                                                                                  “Under the hood” the value you returned is deconstructed and steps are taken to produce the interaction that’s represented.
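The continuation idea above can be sketched in a few lines. This is only a minimal illustration, not how GHC implements IO: the names `Interaction`, `greet`, `runPure` and `runIO` are made up for this example. The point it tries to show is that the line that was read is only in scope inside the continuation, so the speculative program above cannot even be written.

```haskell
-- Hypothetical names throughout; a sketch of "get the result by passing in a function".
data Interaction
  = Print String Interaction          -- print a line, then continue
  | ReadLine (String -> Interaction)  -- the input exists only inside the continuation
  | Done

-- The "input?" / "hello" program, written so that x cannot escape its interaction.
greet :: Interaction
greet = Print "input?" (ReadLine (\x -> Print ("hello " ++ x) Done))

-- A pure interpreter: consume a list of input lines, collect the printed lines.
runPure :: [String] -> Interaction -> [String]
runPure _      Done         = []
runPure ins    (Print s k)  = s : runPure ins k
runPure (i:is) (ReadLine k) = runPure is (k i)
runPure []     (ReadLine _) = []      -- out of input: stop here

-- The same description deconstructed into real effects.
runIO :: Interaction -> IO ()
runIO Done         = pure ()
runIO (Print s k)  = putStrLn s >> runIO k
runIO (ReadLine k) = getLine >>= runIO . k
```

With these definitions, `runPure ["bob"] greet` evaluates to `["input?","hello bob"]`, and `runIO greet` performs the same interaction on the terminal.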

                                                                                                  If you need more details, I can provide them, but eventually they become implementation-dependent. The first article you quoted is probably slightly wrong because it conflates things. You can roll your own IO monad inside Haskell.

                                                                                                  {-# LANGUAGE GADTs #-}
                                                                                                  
                                                                                                  module MY_IO where
                                                                                                  
                                                                                                  data MyIo a where
                                                                                                      M_print :: String -> MyIo ()
                                                                                                      M_getLine :: MyIo String
                                                                                                      M_bind :: MyIo a -> (a -> MyIo b) -> MyIo b
                                                                                                  
                                                                                                  m_hello :: MyIo ()
                                                                                                  m_hello = M_getLine `M_bind` step2
                                                                                                      where step2 name = M_print ("Hello " ++ name)
                                                                                                  
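                                                                                                  -- Interpret the description using the real IO monad.
                                                                                                  m_interpret :: MyIo a -> IO a
                                                                                                  m_interpret (M_print s) = putStrLn s  -- putStrLn rather than print, so no quotes are shown
                                                                                                  m_interpret M_getLine = getLine
                                                                                                  m_interpret (M_bind x f) = m_interpret x >>= m_interpret . f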
                                                                                                  

                                                                                                  You can also consider how it works if you replace “data MyIo” by “class MyIo”.
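One possible reading of the “class MyIo” remark is the type-class (sometimes called tagless final) style, where the operations become class methods and each interpreter becomes an instance. The sketch below assumes that reading; `MonadMyIo`, `mPrint`, `mGetLine`, `hello` and `Script` are names invented for this example.

```haskell
-- Hypothetical class: the operations of MyIo as methods instead of constructors.
class Monad m => MonadMyIo m where
  mPrint   :: String -> m ()
  mGetLine :: m String

-- Interpretation 1: the real IO monad.
instance MonadMyIo IO where
  mPrint   = putStrLn
  mGetLine = getLine

-- Interpretation 2: a pure script over pending input lines.
-- Nothing means the script ran out of input.
newtype Script a = Script { runScript :: [String] -> Maybe (a, [String], [String]) }

instance Functor Script where
  fmap f (Script g) = Script $ \ins ->
    fmap (\(a, rest, out) -> (f a, rest, out)) (g ins)

instance Applicative Script where
  pure a = Script $ \ins -> Just (a, ins, [])
  Script f <*> Script g = Script $ \ins -> do
    (h, ins1, out1) <- f ins
    (a, ins2, out2) <- g ins1
    Just (h a, ins2, out1 ++ out2)

instance Monad Script where
  Script g >>= f = Script $ \ins -> do
    (a, ins1, out1) <- g ins
    (b, ins2, out2) <- runScript (f a) ins1
    Just (b, ins2, out1 ++ out2)

instance MonadMyIo Script where
  mPrint s = Script $ \ins -> Just ((), ins, [s])
  mGetLine = Script $ \ins -> case ins of
    []     -> Nothing
    (x:xs) -> Just (x, xs, [])

-- One program, usable with either interpretation.
hello :: MonadMyIo m => m ()
hello = mGetLine >>= \name -> mPrint ("Hello " ++ name)
```

Here `hello :: IO ()` talks to the terminal, while `runScript hello ["foo", "bar"]` evaluates to `Just ((),["bar"],["Hello foo"])` without performing any IO.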

                                                                                                  You might think that it’s not a real IO monad because “it’s been implemented with IO”. Well, you don’t need to implement it with IO.

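                                                                                                  -- Pure interpreter: takes the pending input lines and returns either
                                                                                                  -- Left output-so-far (ran out of input) or Right (result, unread input, output).
                                                                                                  l_interpret :: MyIo a -> [String]
                                                                                                      -> Either [String] (a, [String], [String])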
                                                                                                  l_interpret (M_print s) x      = Right ((), x, [s])
                                                                                                  l_interpret (M_getLine) (x:xs) = Right (x, xs, []) 
                                                                                                  l_interpret (M_getLine) []     = Left []
                                                                                                  l_interpret (M_bind x f) xs = case l_interpret x xs of
                                                                                                      Left ys1 -> Left ys1
                                                                                                      Right (z,xs1,ys1) -> case l_interpret (f z) xs1 of
                                                                                                          Left ys2 -> Left (ys1 ++ ys2)
                                                                                                          Right (q,xs2,ys2) -> Right (q, xs2, ys1 ++ ys2)
                                                                                                  

                                                                                                  For example, here it’s interpreted as abstract interactions over lists of input and output lines. Now you can examine “m_hello” as potential interactions:

                                                                                                  *MY_IO> l_interpret m_hello []
                                                                                                  Left []
                                                                                                  *MY_IO> l_interpret m_hello ["foo"]
                                                                                                  Right ((),[],["Hello foo"])
                                                                                                  *MY_IO> l_interpret m_hello ["foo", "bar"]
                                                                                                  Right ((),["bar"],["Hello foo"])
                                                                                                  

                                                                                                  You can also treat it with continuations if you don’t like that the previous interpreter recomputes everything when you add an input.
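A rough sketch of what a continuation-based treatment could look like, assuming the same `MyIo` type as above (repeated here so the block stands alone). `Step`, `step`, `start` and `feed` are hypothetical names. The interpreter suspends at each read and hands back the rest of the program, so supplying one more input resumes from that point instead of starting over.

```haskell
{-# LANGUAGE GADTs #-}

data MyIo a where
  M_print   :: String -> MyIo ()
  M_getLine :: MyIo String
  M_bind    :: MyIo a -> (a -> MyIo b) -> MyIo b

m_hello :: MyIo ()
m_hello = M_getLine `M_bind` (\name -> M_print ("Hello " ++ name))

-- A suspended interaction (hypothetical type for this sketch).
data Step a
  = Finished a
  | Output String (Step a)        -- one line of output, then the rest
  | NeedInput (String -> Step a)  -- paused until an input line arrives

-- Continuation-passing interpreter: k says what to do with the result.
step :: MyIo a -> (a -> Step r) -> Step r
step (M_print s)  k = Output s (k ())
step M_getLine    k = NeedInput k
step (M_bind x f) k = step x (\a -> step (f a) k)

start :: MyIo a -> Step a
start m = step m Finished

-- Feed some input lines; return the output produced and the suspended remainder.
feed :: [String] -> Step a -> ([String], Step a)
feed ins    (Output s rest) = let (out, r) = feed ins rest in (s : out, r)
feed (i:is) (NeedInput k)   = feed is (k i)
feed _      st              = ([], st)  -- finished, or waiting with no input left
```

For instance, `feed [] (start m_hello)` returns no output and a paused `NeedInput`; feeding `["foo"]` to that paused step yields `["Hello foo"]` and a `Finished ()` step, without re-running anything that came before.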

                                                                                                  1. 2

                                                                                                    Most people tell me to just use them in Haskell in a variety of situations to understand them rather than worrying about what’s underneath. I still searched for a specific comment on internals that enlightened me quite a bit. I didn’t find it.

                                                                                                    I’ll still share the search results since they had many interesting comments on the topic. One or more might be helpful. One even has an implementation of Maybe in the C language.

                                                                                                  1. 2

                                                                                                    I liked the examples in this article, and agree with its assessment. I wrote a bit of code that gets at the “what might have been” example:

                                                                                                    // Pass DOM nodes through unchanged; turn anything else into a text node.
                                                                                                    function convertValue(value) {
                                                                                                      if (value instanceof Node) {
                                                                                                        return value;
                                                                                                      }
                                                                                                      return document.createTextNode(String(value));
                                                                                                    }

                                                                                                    // Create an element and append the converted children in order.
                                                                                                    function makeNode(tag, children) {
                                                                                                      const node = document.createElement(tag);
                                                                                                      children.map(convertValue).forEach(child => node.appendChild(child));
                                                                                                      return node;
                                                                                                    }

                                                                                                    // H(tag, attrs) returns a function that takes the children and builds the element.
                                                                                                    function H(tag, attrs = {}) {
                                                                                                      return (...children) => {
                                                                                                        const node = makeNode(tag, children);
                                                                                                        Object.keys(attrs).forEach(key => node.setAttribute(key, attrs[key]));
                                                                                                        return node;
                                                                                                      };
                                                                                                    }

                                                                                                    // example usage

                                                                                                    let f = () => alert("foo");
                                                                                                    let button = H("button", {class: "foo bar", data: "1", onclick: "f()"});
                                                                                                    document.body.appendChild(button("Hello ", H("em")("there")));
                                                                                                    

                                                                                                    It works pretty nicely! (though solutions like lit-html would be far more robust)