Threads for kbd

  1. 3

    building the compiler itself used to require 9.6GB of RAM, while now it takes 2.8GB

    Wow, how can a compiler for a language like Zig use that much memory?

    1. 6

      One reason would be that the entire project is a giant compilation unit.

      1. 3

        Is it? Then the size of the project would be limited by available memory or CPU architecture. Doesn’t the Zig compiler support separate compilation?

        1. 1

          What’s wrong with whole program compilation? http://mlton.org/References.attachments/060916-mlton.pdf

          1. 8

            That it requires 9.6GB of RAM?

            1. 7

              Optimizing compilers and linkers use a lot of RAM. It is generally accepted that developers would prefer compile time be shorter, and they will happily trade RAM for that - it is much easier to double the RAM in your machine than it is to double the CPU performance.

              This post does say “hey, we’ve made compilation use less ram”, which I’m going to guess means some particular section was using a data structure that had the trade-off skewed, or was configured with the wrong trade-off.

              Sure you can point to old compilers that used less ram, but they produced worse code, took longer, or both.

              There are plenty of reasons to bash Zig, but this just isn’t one of them.

              1. 8

                I think the real tradeoff is that C compilers used to operate a line at a time, more or less. That warps the language and requires things like forward declarations and the preprocessor.

                (The preprocessor also enables separate compilation – parallelization by processes and incremental builds, which is nice.)

                But no sane language would make those language concessions now, including Zig.


                Still, I would like to read a blog post about why self-hosted Zig requires 2.8 GB of RAM. I’m not saying it is too much, but I think it would be instructive.

                Especially after the talk about data-oriented programming and Zig’s tokenizing / parsing / AST (which I found useful).

                I thought the Zig compiler was around 100K lines of code, not 1M lines of code. So very naively speaking, that would be 28 KB of memory per line, which is ~1000x blowup on the input size.

                The code representation doesn’t require that much blowup – it’s 10x at most. So what are the expensive algorithms for the other 100x? Type checking, executing comptime, register allocation, etc.?

                That would be a very interesting analysis

                1. 9

                  I was curious too, so I learned how to use massif (turned out to be near-trivial) and collected this data:

                  https://i.imgur.com/pAUASx4.png

                  Appears to be mostly LLVM. So another interesting data point would be how much memory is used when Zig builds itself without LLVM involved.

                  Zig’s non-LLVM x86 backend is not capable of building Zig yet, but I can offer a data point on building the behavior tests, which total about 31,000 lines: peak RSS of 90 MiB

                  Another data point would be using that contributor’s branch that improves the C backend mentioned in the post – I’m actually able to use it to translate the Zig self-hosted compiler into C code. Peak RSS: 459 MiB. Visualization: https://i.imgur.com/ww23lx3.png. So here it looks like the culprit is, again, buffering the whole .c file output before writing it. I would expect the upcoming x86 backend to have an even better memory profile than this, since it does not buffer everything in memory.

                  1. 4

                    Ah thanks, well this makes me realize that I read the title / first two paragraphs and assumed that pure Zig code was taking 2.8 GB :-/ Judging by other comments, I was probably not the only one who thought that …

                    This makes more sense – from what little I know about LLVM, there are lots of hard algorithms that are quadratic or exponential in time or space, and heuristics to “give up” when passes use too many resources. I’d definitely expect the hot-patching Zig compiler (a very cool idea) to use less memory, since the goal is to generate code fast, not fast code.


                    I also sympathize with the 2 years of “under the hood” work – it’s a similar story for https://www.oilshell.org right now !

                    1. 4

                      Yeah it’s a bit awkward to disambiguate between these things:

                      • frontend in C++, backend in LLVM (“old bootstrap compiler”)
                      • frontend in Zig, backend in LLVM (“self-hosted compiler”)
                      • frontend in Zig, backend in Zig (“fully self-hosted compiler”)
                      • frontend in C, backend in C, outputs C (“new bootstrap compiler”)

                      All of these are relevant codebases at this point in time. What would you call them?

                      1. 3

                        I don’t have that strong an opinion, but I would say “self-hosted front end” makes sense – i.e. “Zig’s front end is self-hosted” – but I wouldn’t yet say “the Zig compiler is self-hosted”

                        I don’t think of the last one as a “bootstrap” compiler. The term “bootstrapping” is overloaded, but I think of that as the first one only – the thing you wrote before you had Zig to write things in!

                        If the purpose of the last one is to run on architectures without LLVM, then maybe “generated compatible compiler” or “generated compiler in C” ?

                        That said, I probably don’t understand the system enough to suggest good names. I can see why the last one would be used for “bootstrapping” a new platform. (“Turning up” instead ?)

                        stage{0,1,2,3} might be OK too – what I do is link all terms to the “glossary”

                        https://www.oilshell.org/cross-ref.html

                        Although that doesn’t necessarily help people writing on other sites! What I really do is explain things over and over again, while trying to converge on stable terms with explicit definitions … but the terms change as the code changes, and the project’s strategy changes, so I understand the problem :)


                        edit: I think the first one “bootstraps the Zig language” and the second one “bootstraps a new platform with Zig”, which are related but different things. I think having 2 different words for those could reduce confusion, but you would probably have to invent something (which is work, but IMO fun)

                        1. 1

                          frontend in C, backend in C, outputs C (“new bootstrap compiler”)

                          Interesting, didn’t realize y’all were doing a new bootstrap compiler too. Just wondering why it doesn’t make sense to have the fully self-hosted compiler compile to C for bootstrapping?

                          1. 2

                            I’m fine with doing that for a little while but it does not actually solve the bootstrapping problem. Bootstrapping means starting with source, not an output file (even if that output file happens to have a .c extension).

                            1. 1

                              IIRC, that’s the plan

                      2. 2

                        Still I would like to read a blog post about why self-hosted Zig requires 2.8 GB of RAM ….

                        That gets right to the point, thanks. I’ve read that the compiler is 200 kLOC, but a 500x blowup is still an awful lot; assuming that the backend is still LLVM, we can assume that the frontend was responsible for the additional 6.8 GB, which makes it even more mysterious.

                      3. 3

                        it is much easier to double the ram in your machine that it is to double the cpu performance.

                        But doubling RAM is only half of the story: you also need adequate memory bandwidth. You can see the effect of this in how poorly (non-Pro) Threadrippers scale for C++ compilation.

                        1. 1

                          Yes, but it’s still easier to double the RAM than to double the CPU runtime performance - as the last decade has shown, the bulk of the big compute increases come from increasing the number of cores, which in general does not seem to benefit compilation of individual translation units.

                          Obviously you would ideally have something that is both faster and uses less RAM, but that’s true of anything with a trade-off: we’d rather there not be one :)

                          So I would rather something use more ram and get things done faster than skimp on memory out of some arbitrary misplaced fear of using the available system resources.

                          A lot of my work in JSC was directly trading memory for page load time, and people seem to have found that to be the correct trade-off.

                          1. 1

                            TU? plt?

                            1. 2

                              Updated the comment to remove the abbreviations, sorry.

                              TU = translation unit, i.e. the piece of code a compiler is “translating” to assembly or what have you; for example, a single .c file (including all the included headers, etc.) would be a TU

                              PLT: super ambiguous here, sorry; here it is page load time, but contextually you could reasonably have thought programming language theory, though that would make the sentence even more confusing :)

                              1. 1

                                Translation Unit

                                Programming Language Theory

                                1. 1

                                  Wrong in the latter! :)

                                  Plt here is page load time :)

                                  1. 1

                                    Aha! Thank you for the correction

                                    1. 2

                                      “Correction” is a strong word - I used PLT in a conversation about programming languages while not referring to programming languages; what could go wrong? :D

                          2. 2

                            Optimizing compilers and linkers use a lot of ram

                            I started building compilers three decades ago and have a formal education in the field, and there are a lot of compilers still developed today which use less RAM for the same source code size, are faster, and don’t generate worse code. Therefore it’s a fair question what the Zig compiler does differently.

                            1. 2

                              there are a lot of compilers still developed today which use less RAM for the same source code size, are faster and don’t generate worse code.

                              Now I’m curious! Which compilers were you thinking of here?

                              1. 1

                                E.g. all C and C++ compilers I worked with; my favorite is still GCC 4.8.

                                1. 1

                                  Interesting…given that apparently most of Zig’s memory usage turned out to be LLVM [0], I’d be really surprised if clang used less RAM and ran faster for the same size C++ code, but I have very little firsthand experience with either clang or sizable C++ code bases, so maybe that’s just my misimpression of clang and C++!

                                  edit: just realized it’s also possible that clang isn’t one of the C++ compilers you’ve worked with.

                                  [0] https://lobste.rs/s/csax21/zig_is_self_hosted_now_what_s_next#c_bspzkq

                                  1. 3

                                    To provide some figures, I built my most recent LeanQt release (https://github.com/rochus-keller/LeanQt) with different compilers on x86 and x86_64.

                                    Here are the cloc results:

                                    LeanQt-2022-10-21/core $ cloc .
                                         491 text files.
                                         491 unique files.                                          
                                           3 files ignored.
                                    
                                    http://cloc.sourceforge.net v 1.60  T=3.81 s (128.1 files/s, 76031.5 lines/s)
                                    -------------------------------------------------------------------------------
                                    Language                     files          blank        comment           code
                                    -------------------------------------------------------------------------------
                                    C++                            209          32448          64890          98328
                                    C/C++ Header                   265           9203          11711          68493
                                    Objective C++                   10            387            484           1961
                                    IDL                              1             12              0            963
                                    Assembly                         3            106             67            682
                                    -------------------------------------------------------------------------------
                                    SUM:                           488          42156          77152         170427
                                    -------------------------------------------------------------------------------
                                    

                                    Here is the result of gcc version 4.8.2, Target: i686-linux-gnu

                                    /usr/bin/time -v ./lua build.lua .. -P HAVE_CORE_ALL
                                    	Maximum resident set size (kbytes): 119912
                                    

                                    Here is the result of gcc version 5.4.0, Target: x86_64-linux-gnu

                                    /usr/bin/time -v ./lua build.lua .. -P HAVE_CORE_ALL
                                    	Maximum resident set size (kbytes): 286628
                                    

                                    Here is the result of Apple LLVM version 7.3.0 (clang-703.0.31), Target: x86_64-apple-darwin15.6.0

                                    /usr/bin/time -l ./lua build.lua ../LeanQt -P HAVE_CORE_ALL
                                     117989376  maximum resident set size
                                    

                                    Conclusion: a C++ code base comparable in size to the Zig compiler requires less than 300 MB of RAM with all tested GCC and Clang versions; Clang on x86_64 even requires less than half the RAM of GCC on x86_64 (118 MB vs. 287 MB).

                                    1. 2

                                      Wow, not at all what I would have guessed - thank you so much for sharing!

                                    2. 1

                                      I’m also using Clang/LLVM; the most recent version on my M1 Mac is 13.x, though my favorite version is still 4.x.

                                2. 1

                                  Sure, I would guess that there are places where they have chosen clearer/more easily understood architecture and algorithms because modern hardware changes the trade offs.

                                  Because even individual TUs in clang can easily hit gigs of RAM at a time, and LTO modes easily make it insane.

                                  You could argue “lazy devs aren’t doing what we had to do in the past” but that would simply mean you should turn around and say “why bother making CPUs faster, or making systems with more ram”.

                                  I don’t have three decades of compiler experience unless you count my masters or my work on the Gyro patch for Rotor, which you could argue is “compiler work”, but as none of that was real production-level code I wouldn’t count it in this context. So let’s say I have somewhere in the 10-15 year range of production experience shipping to consumers and developers, and I think that my experience is sufficient here.

                                  1. 1

                                    unless you consider my masters …

                                    That was not the point. The point was that I have come across a lot of compilers and languages, even in a time when 100 MB was an incredible amount of RAM, and that I cannot explain why a language like Zig, which is not the most complex language there is (certainly less complex than C++), requires ten to a hundred times more memory than e.g. the C++ compilers I have ever dealt with. Even if we can make educated guesses, they are still guesses.

                                    1. 1

                                      That’s literally the first part of my answer:

                                      Sure, I would guess that there are places where they have chosen clearer/more easily understood architecture and algorithms because modern hardware changes the trade offs.

                                      Increasing the available resources (RAM, CPU time) is an enabler - you no longer have to worry about every single byte, or every single cycle, in order to make a robust and usable compiler. Presumably if people start making large-scale projects in Zig, the trade-offs between simplicity and ease of development vs. performance will change; I assume the 9.6 -> 2.8 GB reduction was the result of something along those lines.

                              2. 2

                                It now takes 2.8GB of RAM so I’m not sure what your point is?

                                1. 4

                                  That brings us back to my original question: Doesn’t the Zig compiler support separate compilation? 2.8 GB is still too much if you e.g. want to compile on an x86 Linux machine.

                                  1. 1

                                    Not that this is the only answer, but you can easily cross compile with Zig. So you need not compile on an x86 machine.

                                    1. 1

                                      Separate compilation doesn’t get you a whole lot - if you have the standard 1:1 mapping of TU to process, you are now using more memory concurrently than a single process would.

                                      The reality is that a lot of compile-time performance is achieved by trading off against memory use (that’s how TU-per-process works), and a lot of the more advanced optimization algorithms for runtime have very large working sets - which you can carefully reduce the size of, but frequently at the cost of compile time again.

                                      For many compiler devs the implementation complexity required for fast compilation in a 32bit address space is not worth it on its own, let alone the opportunity cost of doing that instead of something else.

                                      1. 1

                                        Separate compilation doesn’t get you a whole lot

                                        Is this the confirmation that Zig doesn’t support separate compilation?

                                        the implementation complexity required for fast compilation in a 32bit address space is not worth it on its own

                                        I can easily compile the Linaro ARM GCC, which is even bigger than the 200 kLOC of Zig, on an x86 Linux machine, and it doesn’t use more than a few hundred MB of RAM to work. Zig is supposed to be a “better C”, isn’t it?

                                        1. 2

                                          is this confirmation…

                                          No idea, I don’t like Zig as I disagree with a bunch of their core language design decisions, so haven’t investigated any implementation details :)

                                          I can compile…

                                          Again, no idea about the Zig compilation, but all 32-bit compilers explicitly drop a bunch of compile-time performance optimizations due to address space constraints, the extreme case being compilers from a few decades ago that did essentially line-by-line compilation because anything more burned too much RAM - it’s why [Obj-]C[++] all require forward decls (though in C++, templates also don’t help :))

                                          1. 3

                                            Here is some x86 and x86_64 data with compiler versions between 2013 and 2019: https://lobste.rs/s/csax21/zig_is_self_hosted_now_what_s_next#c_2lc2dk. Much less than the available memory is used, even on 64 bit systems.

                                            1. 2

                                              Thanks for all that work!

                                              My opinion is very much that, as a more recent language, Zig took the IMO reasonable approach of using standard data types and algorithms, rather than the custom everything that LLVM, clang, GCC, etc. have.

                                              Because of your work I’m now curious just how much memory is saved in LLVM+clang (I know next to nothing about the GCC code base) by the large amount of effort spent keeping memory use down (LLVM originated in an era with much less RAM, clang somewhat less so; GCC was literally decades earlier, so it presumably also does fairly complicated stuff to keep size down).

                                              But the big difference is that, in earlier times, a lot of core architectural decisions in the older compilers resulted in much more complicated data types than I suspect the Zig implementation has. There are numerous different versions of the common core data types in LLVM specifically to keep memory use down; other things, like how the IR, ASTs, etc. get kept around, make the types themselves obnoxious but also impact the architecture, as removing info from some types means you need to have fast ways to get that info again.

                                              Why would a new language take on that complexity - especially if it’s trying to be welcoming to new devs? (Imagine if you were introduced to C or C++ by some absurd macro & template monstrosity instead of clean, easy-to-comprehend code.)

                                2. 2

                                  The presentation you linked mentions compiling a 100k-line program with less than 1 GB of RAM. If Zig is about 150 kLOC (as I think I saw elsewhere in the thread), then I think it lends credence to Rochus’s point that 9 GB was pretty heavy.

                                  Of course, the presentation is from 2006 and the MLton authors were probably more concerned with 32 bit builds so they might have made more efforts to keep build sizes low. (Which is to say it was different because things were different. Woo-hoo!)

                          1. 28

                            Exciting times.

                            I’ve been sneaking it in at work to replace internal tools that have 1.5 second startup delay and 200+ MB on runtime dependencies with fast, static little zig exes that I can cross-compile to every platform used in the workplace.

                            1. 16

                              I find your story more flattering than any big tech company deciding to adopt Zig. Thank you very much for sharing!

                              1. 8

                                My impression of Zig and you all who are behind it has been that you care about these use cases at least as much as enabling big complex industrial applications, and not only in words but in action. :)

                                I actually started out with Rust, which I thought would be more easily accepted. I work in the public sector and tech choices are a bit conservative, but Rust has the power of hype in addition to its nice qualities, and has some interest from techy people in the workplace.

                                But then the easiest way to cross-compile my initial Rust program seemed to be to use Zig, and I didn’t really want to depend on both of them!

                                1. 7

                                  Seems like Go would be a natural choice. Far more popular than Zig and cross-compiles everywhere. Why Zig?

                                  1. 19

                                    Not OP, but I can’t stand programming in Go. Everything feels painful for no reason. Error handling, scoping rules, hostile CLIs, testing, tooling, etc.

                                    My greatest hope for Zig is that I can use it to replace Go, not just to replace C.

                                    @kristoff what’s your take on that? Given that Zig has higher-level constructs like async/await built-in, with the support of higher-level APIs, are there reasons programming in Zig can’t be as convenient as programming in higher-level languages like Go?

                                    1. 16

                                      I’m not going to argue with that, but if you’re my report and you’re building company infrastructure in some esoteric language like Zig, for which it will be impossible to find team members to maintain said infrastructure after you leave, we’re going to have a serious talk about the clash between the company’s priorities and your priorities.

                                      OP said “sneaking in at work”. When working in a team, you use tooling that the team agrees to use and support.

                                      1. 11

                                        Oh, can’t disagree there. I’m just hoping that someday I can replace my boring corporate Go with boring corporate Zig.

                                        1. 7

                                          Two half-baked thoughts on this:

                                          1. small, well-scoped utilities should not be hard for some future engineer to come up to speed on, especially in a language with an ever-growing pool of documentation. if OP was “sneaking in” some Brainfuck, that’s one thing. Zig? that’s not a horribly unsafe bet - it’s a squiggly brace language that looks and feels reasonably familiar, with the bonus of memory management thrown in

                                          2. orgs that adhere religiously to “you use tooling that the team agrees to use and support” tend to rarely iterate on that list, which can make growth and learning hard. keeping engineers happy often entails a bit of letting them spread their wings and try/learn/do new things. this seems like a relatively lower-risk way to allow for that. mind you, if OP were “sneaking in” whole database engines or even Zig into hot-path app code without broader discussion, that’s a whole other problem, but in sidecar utility scripts? not much worse than writing a Bash script (which can often end up write-only anyway) IMO

                                          1. 9

                                            Pretty much this, in my case.

                                            The “sneaking” part was not entirely serious.

                                            I have used it before at work to implement external functions for Db2, which has a C API, which is very easy to use with Zig: import the C headers, write your code, add a C ABI wrapper on top. Using it just as “a better C” in that case.

                                            And while we mostly use “boring” old languages, there are some other things here and there. It’s not entirely rigid, especially not outside of the main projects.

                                            1. 3

                                              (1) assumes that there is no cost to adding an additional tool chain simply because it’s for a small, self-contained utility, which I’d hope people understand is simply not true

                                              (2) you’re not wrong about tooling conservatism, but that exists because the assumption in (1) is false - adding new tools has a real cost. The goal of a project is not to help you learn new things; that’s largely a happy coincidence. More to the point, you’re artificially limiting who can fix things later, especially if it’s a small, out-of-the-way tool - once you’re gone, if any issues arise, any bug fix first requires learning a new tool chain not used elsewhere.

                                            2. 2

                                              At least in my own domain (stuff interacting with other stuff on internet) I could say the same thing about Go, or most languages that aren’t Java/JS/C#/PHP/Python/Ruby. Maybe we will get to live in the 90’s forever :)

                                              1. 2

                                                I am not a Zig user, but a Go user, yet I disagree about the team part.

                                                In my experience that’s not really true, and my assumption here is that this is because it’s not just fewer people looking for a job using language X, but also fewer companies for these developers to choose from.

                                                 More than that, I’d argue that the programming language might not be the main factor. As in, that’s something you can learn if it’s interesting.

                                                Of course all of that depends on a lot of other context as well. The domain of the field that you’ll actually work on, the team, its mentality, frameworks being used, alignment of values within the profession and potentially ones outside as well.

                                                 I also would assume that using Zig, for example, might make it a lot easier to find a fitting candidate when compared to, let’s say, Java, where you might get a very low percentage of applications where the candidates actually fit. Especially when looking for a less junior position. Simply because that’s what everyone learns in school.

                                                So I think having a hard time finding (good) devs using Zig or other smaller languages (I think esoteric means something else for programming languages) is not a given.

                                                1. 1

                                                  That’s completely right, and also a bit sad.

                                                2. 4

                                                  I don’t think that Zig can be a Go replacement for everyone, but if you are comfortable knowing what lies behind the Go runtime, it can be. I can totally see myself replacing all of my usage of Go once the Zig ecosystem becomes mature enough (which, even optimistically, is going to take a while, Go has a very good ecosystem IMO, especially when it comes to web stuff).

                                                  Zig has some nice quality of life improvements over Go (try, sane defer, unions, enums, optionals, …), which can be enough for me to want to switch, but I also had an interest in learning lower level programming. If you really don’t want to learn anything about that, I don’t think Zig can really be a comfortable replacement, as it doesn’t have 100% fool-proof guard rails to protect you from lower level programming issues.

                                                  1. 1

                                                    How does Zig’s defer work differently than Go’s?

                                                    1. 5

                                                      In Go, deferred function calls inside loops will execute at the end of the function rather than the end of the scope.
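A minimal Go sketch of what that means in practice (hypothetical demo, not from the thread): the deferred calls inside the loop pile up and only run when the whole function returns, in LIFO order, which is exactly what bites you when the loop body opens files or takes locks.

```go
package main

import "fmt"

// demo records the order in which statements execute. Because defer is
// function-scoped in Go, the deferred closures inside the loop all run
// when demo returns (in LIFO order), not at the end of each iteration.
// The named return value lets the deferred closures still mutate it.
func demo() (order []string) {
	for i := 0; i < 3; i++ {
		defer func(i int) {
			order = append(order, fmt.Sprintf("deferred %d", i))
		}(i)
		order = append(order, fmt.Sprintf("iteration %d", i))
	}
	order = append(order, "loop done")
	return // the three deferred appends happen here, after the loop
}

func main() {
	for _, line := range demo() {
		fmt.Println(line)
	}
}
```

With Zig's block-scoped defer, each iteration's deferred statement would instead run at the end of that iteration.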

                                                      1. 4

                                                        Oh I didn’t realize Zig had block scoped defer. I assumed they were like Go. Awesome! Yeah that’s a huge pain with Go.

                                                  2. 3

                                                    I “agree to disagree” on many of the listed issues, but one of them sincerely piqued my interest. Coming from Go and now Rust (and before C, C++, and others), I am actually honestly interested in Zig (as another tool in my toolbox), and tried dabbling in it a few times. However (apart from waiting for better docs), one thing I’m still super confused by and how I should approach it, is in fact error handling in Zig. Specifically, that Zig seems to be missing errors with “rich context”. I see that the issue is still open, so I assume there’s still hope something will be done in this area, but I keep wondering, is this considered not a pain point by Zig users? Is there some established, non-painful way of passing error context up the call stack? What do experienced Zig devs do in this area when writing non-trivial apps?

                                                    1. 4

                                                      I see that the issue is still open, so I assume there’s still hope something will be done in this area

You are right, no final decision has been made yet, but you will find that not everybody thinks that errors with payloads are a good idea. They clearly are a good idea from an ergonomics perspective, but they also have some other downsides, and I’m personally in the camp that thinks not having them is the better choice overall (for Zig).

                                                      I made a post about this in the Zig subreddit a while ago: https://old.reddit.com/r/Zig/comments/wqnd04/my_reasoning_for_why_zig_errors_shouldnt_have_a/

                                                      You will also find that not everybody agrees with my take :^)

                                                      1. 2

Cool post, big thanks!!! It gives me an understandable rationale, especially one that makes sense in the context of Zig’s ideals: squeezing out performance (in this case, allocations, but also potentially useless operations) wherever possible, in simple ways. I’ll need to keep the diagnostics idea in mind for my next time with Zig, then, and see what I think about it after trying. Other than that, my main takeaway after reading it is that I was reminded of a feeling I got some time ago: that errors & logging are a big, important, yet still not well understood nor “solved” area of our craft :/

                                                        1. 1

                                                          I used zig-clap recently, which has diagnostics that you can enable and then extract when doing normal Zig error handling. I think that’s an okay compromise. And easier than all those libraries that help you deal with the mess of composing different error types and whatnot.

                                                    2. 2

                                                      Out of curiosity, what issues have you encountered when it comes to scoping rules in Go?

                                                      1. 6

                                                        There are gotchas/flaws like this: https://github.com/golang/go/discussions/56010

I feel like I run into shadowing issues, and then there are cases where you’re assigning to err a bunch of times and then you want to reorder things, so you have to toggle := vs =, or maybe you resort to err2, err3, etc. In Zig all that error-handling boilerplate is gone, and operations become order-independent because you just try.
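To make the := vs = shuffle concrete, here’s a hedged sketch (the parsePair helper is hypothetical, not from the thread): the first call declares err with :=, later calls must reuse it with =, so reordering the statements means editing the operators too.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsePair shows the boilerplate pattern: err is declared once with :=
// and then reused with =. Swap the two Atoi calls around and you have
// to toggle the operators as well (or accidentally shadow err in a
// nested scope).
func parsePair(a, b string) (int, int, error) {
	x, err := strconv.Atoi(a) // first use declares err with :=
	if err != nil {
		return 0, 0, err
	}
	var y int
	y, err = strconv.Atoi(b) // later uses must assign with =
	if err != nil {
		return 0, 0, err
	}
	return x, y, nil
}

func main() {
	x, y, err := parsePair("6", "7")
	fmt.Println(x, y, err)
}
```

The Zig equivalent would just try each parse in turn, with no shared err variable to juggle.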

                                                        And don’t get me started on the fact that Go doesn’t even verify that you handle errors, you need to rely on golangci-lint for extra checks the language should do…

                                                        Edit: also as Andrew points out, Go doesn’t block-scope things when it should: https://lobste.rs/s/csax21/zig_is_self_hosted_now_what_s_next#c_g4xnfw

                                                        Edit: ohh yeah part of what I meant by “scoping” was also “visibility” rules. It’s so dumb that changing the visibility (public/private) of an identifier also makes you change its name (lowercase vs uppercase initial letter).

                                                        1. 2

                                                          It’s so dumb that changing the visibility (public/private) of an identifier also makes you change its name (lowercase vs uppercase initial letter).

                                                          Especially since some people write code in their native language(s) (like at my job), and not all writing systems even have this distinction.

                                                    3. 4

                                                      I’ve had better results (and more fun) with Rust and Zig in my personal projects, so Go didn’t really cross my mind.

                                                      If it was already in use in this workplace, or if there had been interest from coworkers in it, I might agree that it would be a “natural choice”.

                                                      Edit: I think it’s also easier to call Zig code from the other languages we use (and vice versa), than to call Go code. That might come in handy too.

                                                    4. 1

                                                      Just curious what platforms you’re targeting? x86 vs. ARM, or different OSes, etc.?

                                                      1. 1

                                                        The platforms are those used by people in the organisation, currently Linux, Windows, and macOS. Mostly x86, some ARM.

                                                  3. 2

                                                    Ah, the golang niche! Good to see :D

                                                    1. 2

                                                      Do the other devs all know Zig? If not, seems like a large downside is that tools formerly editable by anyone become black boxes?

                                                      To be clear, I hate unnecessarily bloated tools too. I’m just considering the full picture.

                                                      1. 2

                                                        They don’t/didn’t, but it’s a quick one to pick up when you already know C and some other ones.

                                                        I didn’t know the previous language before I started contributing to these tools either.

                                                        It was pretty easy when someone else had already done the foundation, and I think/hope I am providing a solid foundation for others as well.

                                                      2. 1

                                                        What’s the delay from? Are these java tools?

                                                        1. 2

                                                          It’s node/TypeScript. It used to be worse. :)

                                                      1. 23

                                                        I’ve just tried Helix and I’m absolutely blown away by how good this is. I needed to rebind the hjkl to a sensible colemak equivalent like I always do (I wish editors would work on keys not characters. Some games know how to do that), but after that… everything is there, and it just works. Opened a Rust file, all the LSP support, all the sensible keybindings and checks… like in every modern IDE, you may say – and you’ll be right. But this is lightweight, starts instantly and has all the nice moving and editing capabilities – and with a kakoune-like grammar, which I find superior to vim’s, but kakoune was never quite there for me with all the IDE-like niceties.

                                                        I can’t wait to try this at actual work tomorrow and see where it itches me. But my first impression is a 9/10.

                                                        1. 8

                                                          I wish editors would work on keys not characters

As a Dvorak user, I wouldn’t be happy about this. When I think cw for “change word”, I’m definitely not thinking j,. hjkl are an exception in that there’s no word they symbolize. I’ve just gotten used to them, and with jk being more common for me and still next to one another on Dvorak, it’s tolerable.

                                                          games

This is where it actually is problematic, where the keys are purely buttons and rarely are symbolic of anything else. Since I don’t change my key labels, it’s a quick, one-time setup to just press the matching key before I start a game, and accessibility has been pushed pretty hard in recent times, so any game published in earnest is configurable.

                                                          1. 2

                                                            Hmm, good point. I remap hjkl to hnei, but I keep all the others and remember them because of the meanings (like word-change) that make them easy to remember.

                                                            j-k on colemak becomes y-n on my keyboard, which is not very handy (and upside-down, logically). I still taught myself to use it, whenever I’m on vim that doesn’t have controls remapped.

                                                            1. 2

                                                              and upside-down, logically

                                                              Indeed. One of the things I like about dvorak vs colemak is that many of the keys are also just fine for programming. eg. jk are right next to each other, hl are still logically left and right, dash/underscore is on the home row, etc.

                                                              1. 2

I really undervalued the home-row dash/underscore until I briefly tried an alternative layout. It’s such a common key for programming, and it normally gets placed in the most obtuse places – like the number row on QWERTY.

                                                          2. 3

                                                            I wish editors would work on keys not characters.

Sadly, terminal-based apps need to work with the input events that the terminal sends them, and the terminal only sends characters. Even GUI apps can wind up in this trap if they’re based on or portable to a terminal UI, like GVim for example.

                                                            1. 1

I guess i3 isn’t a terminal app, but I think when it’s initialized it remaps everything if the layout is different. Someone correct me if I’m wrong here. … But maybe an editor could do the same? Just an idea.

                                                              1. 4

As a concrete example, when you press a key in your browser, the KeyboardEvent that gets fired has a key property that describes the letter typed, if any, as well as a code property that describes the physical location on the keyboard, plus properties like shiftKey, ctrlKey, altKey, and metaKey that reflect whether that modifier was held down. So when I type a double-quote on my Dvorak keyboard layout, the key property is ", but the code property is “KeyQ” (because Dvorak puts the double-quote where QWERTY puts Q) and the shiftKey property is true.

                                                                An app like i3 will look at the binding specified in the config file, consult the X11 keyboard layout extension to find out what code corresponds to that key, and then when a keyboard event arrives it checks the code rather than the key, and therefore works no matter what changes have occurred to your keyboard layout in the mean time.

                                                                Meanwhile, when I type a double-quote in my terminal, the terminal receives byte 0x22, ASCII ". That’s it. No key code, just the key pressed. Even if there were a way to obtain the keycode of a key event, there’s no way to consult the keyboard layout configuration, so it wouldn’t be possible to transparently remap bindings.

                                                            2. 2

                                                              I wish editors would work on keys not characters. Some games know how to do that

I think in general this should be avoided, because it makes documentation difficult. When character-based, you just document that ctrl+a does something, and no matter the layout or keyboard configuration, anybody who has these two keys will be able to do just that. If it were key-based, an AZERTY keyboard user would have to know that this actually means pressing ctrl+q. This gets much worse with non-alphanumeric symbols.

                                                              With dvorak or colemak it might be ok since the user generally knows the american qwerty layout, but that’s definitely not the case for other layouts.

As stated in other comments, games are indeed one of the only cases where I can see an argument being made for key-based bindings.

                                                            1. 1

                                                              Ooh it’s public!

                                                              Edit: looks like it’s in Rust now aww

                                                              1. 8

                                                                looks like it’s in Rust now aww

                                                                Look at the last entry in the FAQ. The compiler has always been written in Rust, the runtime is written in Zig. To see some of the Zig source code, look in crates/compiler/builtins/bitcode.

                                                                1. 2

                                                                  would be interesting to read the story behind the rust rewrite

                                                                  1. 8

                                                                    As far as I know, the compiler has always been written in Rust! You can write platforms in whatever language you like, though.

                                                                    1. 1

oh, interesting; when they first announced the language I assumed it started life as a straight-up fork of the Haskell-based Elm compiler

                                                                      1. 5

Always Rust, though parts of it were translated very directly from the Elm compiler, so in some sections the code may be quite similar.

                                                                1. 4

                                                                  I use Hammerspoon to customize my Mac app switching in a lot of similar ways. I mapped “hyper” key to caps lock using Karabiner Elements. Hyper+B is browser, Hyper+T is text editor, Hyper+S is my shell. Hyper+0 is “previous window” instead of previous app. I have fuzzy finders built with Hammerspoon that let me fuzzy find among all open windows or open windows per app. Hyper+N moves the current window to the next monitor, Hyper+1-9 sizes windows in various ways (1/2s, 1/3s, etc.). Hammerspoon is super useful!

                                                                  If anyone cares, here’s my config: https://github.com/kbd/setup/blob/master/HOME/.hammerspoon/init.fnl (in Fennel lisp :)

                                                                  1. 1

                                                                    I had no idea this is possible.

                                                                    fennel = require("fennel")
                                                                    table.insert(package.loaders or package.searchers, fennel.searcher)
                                                                    fennel.dofile("init.fnl") -- exports into global namespace
                                                                    

                                                                    I wonder if Python would also be possible.

                                                                    1. 2

                                                                      Fennel is great! But it specifically integrates with Lua, Python wouldn’t work the same way.

                                                                  1. 1

Yet another program that re-implements shell scripts.

                                                                    If I were to use one of these kinds of tools I’d probably use Just, but I’ve never seen a reason to.

                                                                    1. 3

On the contrary, z-run, just like just (which I’ve looked at), doesn’t “re-implement shell scripts”; instead, these tools build upon shell scripts and cover some corner use cases that shell scripting doesn’t solve easily. For example:

• (in the case of just) dependencies between shell scripts;

• (both z-run and just) easily mixing multiple scripting languages inside the same source file; (you can get this with here-documents in plain shell, but it doesn’t work for complex pipelines, nor does it play nicely when you also need stdin;)

                                                                      • (both z-run and just) modularity – it’s quite hard to implement modular shell scripts; functions and sourcing are two solutions, but given the shell’s dynamic scoping when it comes to (environment) variables (unless one uses local) you quickly get into trouble; (also some “sanity” flags like set -e don’t apply to functions automatically;) thus my own pattern is having one large script file with case "${command}" in ( name-that-otherwise-would-be-a-function-name ) ... ;; esac, and instead of calling functions I call "${0}" some-sub-command ..., which makes sure the “function” doesn’t taint the environment of the calling script;

• (in the case of z-run) remote execution via SSH; even the simple ssh user@host rm '/some path that might contain a space' breaks due to the lack of extra quoting;

• (in the case of z-run and make) dynamic generation of scripts (both as code and as source); in shell we might have source <( some command ), but at least with z-run it’s easier, and the output of some command is cached;

That being said, I’m a long-time Linux user, writing a lot of shell scripts (bash) both personally and professionally; thus z-run is not something I came up with in my first year of using Linux. For a long time I actually used shell scripts like these ones, where I tried to modularize them; however, that is much more tedious than using z-run, at least for development and operations.

                                                                    1. 1

                                                                      It is also much fuller-featured than commonmark, with support for definition lists, footnotes, tables, several new kinds of inline formatting (insert, delete, highlight, superscript, subscript), math, smart punctuation, attributes that can be applied to any element, and generic containers for block-level, inline-level, and raw content.

                                                                      Why does everyone hate underlines?

                                                                      1. 6

well, in the web context, “Links should look like links, and nothing else should”, and more often than not links are underlined. And apart from that, using underlines for emphasis or other typographical purposes is discouraged by most authorities. Using up an ASCII punctuation character for underlining would be a mistake imo.

                                                                      1. 7

I took “Zig has no function colors” to mean at compile time. You don’t have to define two versions of a function just to use them in sync vs async contexts. Obviously those have different representations at run time, and if you do meta-programming you’re going to need to know how to call them differently based on their runtime representation. And obviously, if you use an async-only construct in a function (like suspend), that function only makes sense in an async context.

                                                                        So, he found compiler bugs and some more things the compiler should figure out at compile time, but what else does this post teach us?

                                                                        1. 6

                                                                          I think it teaches there needs to be better clarity of messaging.

First of all, the “color” of functions is a vague concept based on everyone’s reading of a blog post that complained about a few different aspects of async, from issues of implementations that can’t await in a sync context (a real major limitation) to not liking particular syntax (just one person’s opinion, a trade-off at most).

Second, Zig’s “colorblind” solution sounded like it solved all of it (for whatever “it” was). Zig gets brought up whenever another language discusses its async implementation, with the implication that everyone else does it wrong, since Zig apparently could do without “color”.

                                                                          So now we have ambiguity of what “color” really is, multiplied by the gap between what Zig was assumed to solve vs what it actually solves.

                                                                          1. 4

                                                                            there needs to be better clarity of messaging.

                                                                            The original article(s) describing Zig’s “colorblindness” were already reasonably lengthy. In reading them some time ago, it was immediately clear to me that they were referring to the fact that programmers don’t have to write the same function twice because the compiler specializes automatically. If they had gone into excruciating detail about how that worked, calling conventions, etc, then somebody else would have complained that the articles were too long. You can’t please everyone, especially folks who are determined to find faults and complain.

                                                                            Zig is getting brought up whenever another language discusses their async implementation

It seems especially unfair to find fault with the Zig community for imprecise feedback from compiler/language-implementation non-experts in other communities. The experts in those communities are the ones responsible for communicating accurately with their own members.

                                                                        1. 1

                                                                          Interesting it seems that part of Neovim’s goal is to “Lua all the things!” I support this.

                                                                          1. 1

                                                                            I find it somewhat annoying the number of plugins now that are “exactly like this other popular and well-established thing, only this one is Lua!”

                                                                            1. 3

                                                                              And what’s wrong with that? If you can make a good plugin faster without otherwise changing it, why not do so?

                                                                          1. 22

                                                                            hi, direnv author here :)

                                                                            1. 7

                                                                              Don’t have a question but love the tool! Thank you!

                                                                              1. 3

                                                                                thanks! :)

                                                                              2. 3

                                                                                Hi, thanks for direnv! It’s been life-changing. Thanks to direnv, my virtualenvs are automatically activated/deactivated per-directory, and project config and secrets are all in environment variables and loaded by direnv, making it easy to have parity between dev and production.

                                                                                Since you’re here, wanted to ask: you think unload hooks can ever be a thing? There’s a longstanding ticket for the feature: https://github.com/direnv/direnv/issues/129

                                                                                1. 3

                                                                                  Since you’re here, wanted to ask: you think unload hooks can ever be a thing? There’s a longstanding ticket for the feature: https://github.com/direnv/direnv/issues/129

There are too many edge cases. For example, what happens if the user just closes the shell? direnv is just a hook that is executed before every prompt and wouldn’t know about it. Or what happens if two shells are opened on the same project? How does direnv keep track?

Users that ask for this generally want to start/stop services on demand for a project. I think the best approach for that is to actually open a second shell and start a process manager by hand. Then you get to see what services are running, you can start/stop them, …

                                                                                  1. 1

“Not guaranteed to be called in case of a closed terminal” are semantics I would welcome. I just want to hook into where it says “direnv: unloading”. My idea was to style my terminal (e.g. the terminal tab) by directory, and hooking into direnv’s activation seemed the perfect place to set/unset it. I can do something on directory enter with a plugin, but not on directory leave without an unload hook.

                                                                                2. 1

                                                                                  I freaking love direnv. Thank you so much!

                                                                                1. 1

                                                                                  I wish I had time to participate more in the thread, but my entire system setup is in my repo: https://github.com/kbd/setup

                                                                                  Here’s my bin dir: https://github.com/kbd/setup/tree/master/HOME/bin

                                                                                  My aliases: https://github.com/kbd/setup/blob/master/HOME/bin/shell/aliases.sh

                                                                                  Worth noting things like ‘autopager’, ‘fzr’, ‘t’ (touch, but creates intermediate directories), ‘mcd’, ‘install-manual’, ‘kw’ (opens any program in a new split in kitty, used constantly). Also see my extensive git aliases that make heavy use of fzf. A lot of people are mentioning prompts… I wrote my own in Zig so it’s fast and cross-shell, see kbd/prompt.

                                                                                  1. 26

                                                                                    People love arguing against using ligatures in code. I don’t know why. People who enjoy ligatures in code are going to continue using them. If you don’t like them don’t use them. Fonts let you control whether to enable them on a case-by-case basis, so I enable ligatures (I use Victor Mono) in my text editor (vscode) but disable them in my terminal (kitty).

                                                                                    I’ve been using ligatures in my editor for years and have never once found myself in the author’s prophesied “swamp of despair”. My code just looks nicer with things like nice mathematical symbols (≤ instead of <=) and it hasn’t once caused me a problem.

                                                                                    Article is from 2019 btw.

                                                                                    1. 15

                                                                                      You wrote:

                                                                                      If you don’t like them don’t use them.

                                                                                      But you’re responding to an article which says:

                                                                                      “What do you mean, it’s not a matter of taste? I like using ligatures when I code.” Great! In so many ways, I don’t care what you do in private. Although I predict you will eventually burn yourself on this hot mess, my main concern is typography that faces other human beings. So if you’re preparing your code for others to read—whether on screen or on paper—skip the ligatures.

                                                                                      And I really don’t have a choice but to read code with ligatures from time to time. Sometimes it’s on a blog post with ligatures, sometimes it’s in a screen shot of someone’s code they sent while asking for help. I wish people would follow his advice.

                                                                                      1. 5

His “main concern” is in a footnote: basically, he doesn’t want to have to read other people’s code with ligatures, which is your main point too. That’s fair! For many purposes (GitHub, Stack Overflow, a textbook; maybe not an algorithms book) I even agree with you. I’ve never had a coworker complain while screen-sharing that they had trouble reading code, but it’s trivial to disable ligatures temporarily if they were to ask.

But if someone wants to post code on their site in a font with ligatures, I’d go “ooh, nice, that code looks pretty” and maybe “what font is that?” Despite protests, arguments, and vague predictions of doom, it does come down to taste and circumstances. My main concern is acting like it’s a matter of right and wrong.


I’ve already given reasons in the thread why ligatures can actually increase readability. Check out the Victor Mono homepage for some code examples: https://rubjo.github.io/victor-mono/ I just noticed that triple equals (JS) is represented by three bars. I don’t usually write JavaScript, but with triple equals so visually distinct from double equals thanks to the ligature, I imagine it would be harder to mix them up. On the other hand, I dislike the dull-colored italics for comments that the author uses in the code samples… Again, it’s primarily about taste.

                                                                                        Of course, I’d be interested in any real-life examples of the author’s once-in-10-to-15-year occurrence prognostications of despair.

                                                                                        1. 9

                                                                                          Luckily because taste is a factor, my userContent.css contains

                                                                                          code, kbd, pre, samp {
                                                                                             font-variant-ligatures: none !important;
                                                                                          }
                                                                                          
                                                                                          1. 1

                                                                                            That’s awesome, thanks for sharing. I’ve been meaning to set up a userContent.css! (What stopped me is that it seems difficult to source control across computers because the profile folder name changes between machines.)

                                                                                            1. 1

                                                                                              I usually just update a snippet every once in a while. It’s not something where the world ends if I’m missing it. What I do want to see is NixOS having better control over building and managing Mozilla profiles.

                                                                                        2. 3

                                                                                          Sometimes it’s on a blog post with ligatures,

I’ve never seen bad ligatures in a blog post code block. Usually the kind of people who want ligatures in their blog are sensitive enough to know when to turn them off. There are plenty of other ways to mess up code blocks; colored syntax highlighting is a very common one. You don’t see anyone arguing against highlighting, do you?

                                                                                          sometimes it’s in a screen shot of someone’s code they sent while asking for help.

                                                                                          So? My eyes burn when I help someone not using dark mode, it’s never stopped me from helping them.

                                                                                          1. 2

                                                                                            And I really don’t have a choice but to read code with ligatures from time to time. Sometimes it’s on a blog post with ligatures, sometimes it’s in a screen shot of someone’s code they sent while asking for help. I wish people would follow his advice.

Sincere question: do you really struggle if there are ligatures? Does seeing them a few times in those scenarios really cause a hassle?

                                                                                            What about the inverse? What about people who struggle with “=” vs “==” vs “===”, for example? There are people who find “≡” preferable. Perhaps they wish people would not follow the author’s advice?

Anyway, I have ligatures on, and I’ve never had a problem in the scenarios you’ve mentioned, and more importantly, I’ve never had a problem when pairing. I also regularly and explicitly check that (a) the font size is okay and (b) the ligatures are okay.

                                                                                          2. 7

                                                                                            I agree. I like the clean look; the ligatures make common symbols easier to recognize. I just started doing some HTML work yesterday, and was happily surprised to see that my current font has ligatures for HTML comments, which make them very recognizable even when the editor isn’t syntax-coloring them.

                                                                                            1. 6

                                                                                              the ligatures make common symbols easier to recognize

Totally. I want to piggyback on this to point out that ligatures don’t just “change symbols into other symbols” (like the OP is arguing w.r.t. things like ≠). Ligatures can actually make your code clearer even without substituting symbols. For example, when Victor Mono sees things like || or &&, it replaces them with a bolder ligature version so they stand out in code, but they’re still the same two symbols.

                                                                                              1. 2

Why is that called a ligature? Isn’t it just a font change?

                                                                                                  1. 1

                                                                                                    this explains my confusion

                                                                                                  2. 1

                                                                                                    At a technical level, fonts have several tables in them which define ligatures. The application code doesn’t even get involved, this all happens within the font rendering library (which on many platforms is part of the OS). That mechanism is used here, so that’s what it’s called.

                                                                                                    1. 1

                                                                                                      Thanks. I guess OP’s complaints are about ligatures in the layman’s sense.

                                                                                                1. 1

                                                                                                  If the symbols are easier to recognize, then why not just use the Unicode symbols for them? Ask your language to adopt Unicode support.

                                                                                                2. 3

                                                                                                  I’ve been using ligatures in my editor for years and have never once found myself in the author’s prophesied “swamp of despair”. My code just looks nicer with things like nice mathematical symbols (≤ instead of <=) and it hasn’t once caused me a problem.

                                                                                                  Likewise. Has anyone burnt themselves on “this hot mess”?

                                                                                                  I also usually have the opposite experience of what the author, and some commenters here, are saying. People usually ask me how to get the ligatures since they look nice and help with understanding by making it quick and obvious to parse things like logical operators.

                                                                                                  1. 3

It’s like they’re afraid they’re going to be forced to use ligatures against their will.

                                                                                                    1. 1

                                                                                                      The ≤ case is a selling point for me because I can never remember if it’s spelt <= or =< and the font disambiguates that for me.

                                                                                                      Mind you, it’s the year 2022 now, and digraphs like >= or != should be regarded as crutches just like trigraphs are. It’s sad that we have to misuse font features instead.

                                                                                                      1. 1

                                                                                                        because I can never remember if it’s spelt <= or =< and the font disambiguates that for me.

                                                                                                        Ha. That’s a really interesting point. I never mistake <= because it’s spelled like it’s read (“less than or equal to”), but I could never get straight ~= or =~ for using Perl regular expressions.

                                                                                                        digraphs like >= or != should be regarded as crutches just like trigraphs are. It’s sad that we have to misuse font features instead.

Disagree on that. It’s great that I can see “if err ≠ nil” instead of “if err != nil” (the ≠ actually looks much nicer as a ligature in my programming font than the bare Unicode character does) without having to cut and paste Unicode characters (as if using ≠ over != were even allowed). The fact is, there are only so many keys on the keyboard. Programming-specific ligatures seem like a perfectly natural use case; I don’t see them as a misuse. Further, many Unicode “characters” are composed of multiple code points, so Unicode does “ligatures” internally as well. Typing in Asian languages is often accomplished by combining multiple keystrokes into one character. Text isn’t simple.
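To illustrate the multi-code-point point, here’s a quick Python sketch (not tied to any particular font or editor) showing two different code-point sequences rendering as the same visible character:

```python
# Two ways to spell the same visible character 'é':
import unicodedata

composed = "\u00e9"     # 'é' as a single code point
decomposed = "e\u0301"  # 'e' followed by a combining acute accent

print(len(composed))    # 1
print(len(decomposed))  # 2 -- same glyph on screen, different lengths
print(composed == decomposed)  # False until normalized
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

So the “one glyph, several code points” situation is already baked into text itself.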

                                                                                                    1. 4

                                                                                                      Since this is git-compatible, anyone using this to work on a git repository shared with git users?

                                                                                                      1. 12

                                                                                                        I’m the author, so maybe you’re not asking me, but I’ve used it for the project itself for over a year, as well as for various pull requests for other GitHub projects. I doubt anyone else has used it for anything serious, but I’d be very happy to hear if anyone has.

                                                                                                      1. 15

                                                                                                        I personally think it’s sheer madness to create a new language with a C compiler already built into it… but I have to admit it seems to be working out decently for Zig in practice.

                                                                                                        1. 21

                                                                                                          Even if Zig-the-Language never makes it big, Zig-the-Toolchain is already providing an immense amount of value.

                                                                                                          1. 14

                                                                                                            Zig-the-Language is also better at using C libraries than C is.

                                                                                                            1. 3

                                                                                                              It’s safer, but it’s unclear if total developer productivity is higher than in C, depending on how you measure it.

                                                                                                            2. 13

                                                                                                              Seriously, it’s the cross-compile toolchain that should have existed for decades.

                                                                                                            3. 1

                                                                                                              It solves a problem of getting from here to there. We have a big C world. Trying to replace it results in xkcd standards. Embracing it, however, lowers the barrier to entry. When the tooling is as good and easy to use and build as Zig’s, that barrier to entry gets low enough to give the language a fighting chance.

                                                                                                            1. 4

                                                                                                              Been thinking about standardizing on asdf+direnv. Could anyone offer a quick comparison?

                                                                                                              It sounds like Nix can also build your containers for you based on your project definition?

                                                                                                              1. 6

asdf works fine for pinning runtimes until you have system libraries, etc. that extensions link against which aren’t versioned with asdf. Then you’re back in the same boat as with brew, etc., where upgrading might break existing workdirs.

                                                                                                                1. 3

                                                                                                                  It sounds like Nix can also build your containers for you based on your project definition?

Yep, basically just something like this. There are lots of assumptions baked in, including that you “want” to containerize the hello-world program GNU ships, but it’s an example:

                                                                                                                  $ cat default.nix
{ pkgs ? import <nixpkgs> { system = "x86_64-linux"; } }:
                                                                                                                  pkgs.dockerTools.buildImage {
                                                                                                                    name = "containername";
                                                                                                                    config = {
                                                                                                                      cmd = [ "${pkgs.hello}/bin/hello" ];
                                                                                                                    };
                                                                                                                  }
                                                                                                                  $ nix-build default.nix
                                                                                                                  # lengthy output omitted
                                                                                                                  $ docker load < result  
                                                                                                                  259994eca12e: Loading layer [==================================================>]  34.04MB/34.04MB
                                                                                                                  Loaded image: containername:zvrzzl5vlbjdbjz8wmy8w4dv905zra1j
                                                                                                                  $ docker run containername:zvrzzl5vlbjdbjz8wmy8w4dv905zra1j     
                                                                                                                  Hello, world!
                                                                                                                  

There are caveats to using the docker builds (you can’t build on macOS) and you’ll need to learn the Nix programming language at some point, but it’s a rather droll affair IME once you get that it’s all just data and functions. And before you ask why the image is so big: the short answer is that everything hello declares as a dependency is included, which for some reason pulls in jq/pigz/jshon/perl/moreutils etc. It’s all basically lifted straight out of the Nix store verbatim.

                                                                                                                  1. 1

                                                                                                                    everything that hello defined it depends on is included, which includes jq/pigz/jshon/perl/moreutils etc… for some reason

                                                                                                                    I recognise this list. These are the dependencies used in the shell scripts which build the Docker image. They shouldn’t be included in the image itself.

                                                                                                                    1. 2

                                                                                                                      They won’t be included in the image if unused.

                                                                                                                      1. 2

                                                                                                                        Have I just been building docker images wrong then this whole time?

                                                                                                                        1. 2

Yup. Nix is a fantastic way to build docker images. For example https://gitlab.com/kevincox/dontsayit-api/-/blob/46cbc50038dfd3d76fee2e458a4503c646b8ff2c/default.nix#L23-35 (an older project, but a good example because it has more than just a single binary) creates an image with:

                                                                                                                          563528481rvhc5kxwipjmg6rqrl95mdx-glibc-2.33-56
                                                                                                                          7hq7ls1nqdn0ksy059y49vnfn6m9p8hm-dontsayit-api
                                                                                                                          qabnj48kj88r1zkz17hcfzzw3z8k5rmv-words.csv
                                                                                                                          qbdsd82q5fyr0v31cvfxda0n0h7jh03g-libunistring-0.9.10
                                                                                                                          scz4zbxirykss3hh5iahgl39wk9wpaps-libidn2-2.3.2
                                                                                                                          

Of course, if I used musl libc then glibc and its dependencies would go away automatically.

What’s better is that if you use buildLayeredImage, each of these is a separate layer, so rebuilding (say) the word list or the binary doesn’t require rebuilding the other layers. (This is actually better than docker itself, because docker only supports linear layering: you would have to decide whether the word list or the binary is the top layer, and rebuilding the lower one would force a rebuild of the higher one.)

                                                                                                                  2. 2

                                                                                                                    It sounds like Nix can also build your containers for you based on your project definition?

                                                                                                                    There is also https://nixery.dev/ which allows you to build a container with the necessary tools as easy as just properly naming them. For example:

                                                                                                                    docker run -ti nixery.dev/shell/git/htop bash
                                                                                                                    

                                                                                                                    Will bring you in a container that has a shell, git, and htop.

                                                                                                                  1. 2

It almost feels like the Python was written in a deliberately slow way. Like:

                                                                                                                    sum([w * e for (w,e) in zip(a,x)])
                                                                                                                    

                                                                                                                    Simply removing square brackets to avoid creation of intermediate lists in memory would speed this line up very significantly.

                                                                                                                    And that’s not even touching on the subject of numpy: most people actually using Python for statistics would use it for such a task, because why not? So it’s not really “LispE vs. Python”, it’s “LispE vs. deliberately unoptimized pure-Python code”.
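Concretely, the change I mean is just dropping the brackets so `sum` consumes a lazy generator instead of a fully materialized list (a sketch with made-up data; `a` and `x` stand in for the article’s vectors):

```python
# Toy inputs standing in for the article's weight/value vectors.
a = [0.1, 0.2, 0.3]
x = [1.0, 2.0, 3.0]

# List comprehension: builds the whole intermediate list in memory first.
total_list = sum([w * e for (w, e) in zip(a, x)])

# Generator expression: same brackets removed; values are produced
# lazily, one at a time, and never stored as a list.
total_gen = sum(w * e for (w, e) in zip(a, x))

assert total_list == total_gen
```

Same result, but the generator version never allocates the intermediate list, which matters once the vectors get long.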

                                                                                                                    1. 4

                                                                                                                      For your second point, the problem with punting to a library is that the same (or similar) libraries can be used from Lisp, making the comparison pointless. I’m not sure about LispE, but Common Lisp at least has bindings for a lot of math libraries (BLAS, GSL, LAPACK, etc.).

On the other hand, using one of those seems like overkill for the examples in the article. And if a language is so slow that any meaningful work needs to pull in third-party libraries like NumPy, then that’s worth knowing, too.

                                                                                                                      1. 2

                                                                                                                        Simply removing square brackets to avoid creation of intermediate lists in memory would speed this line up very significantly.

                                                                                                                        Didn’t seem to have much effect when I tried it out. If anything the code became slightly slower, 63ish ms vs 59ish on my computer. Why? No idea. The variance on either was +/- 3 ms anyway though, just from eyeballing it, so it’s a little difficult to know whether that difference was real.

                                                                                                                        1. 1

That’s a good observation, thanks for going through the trouble :-). It is true that on small lists generator expressions can be slower than list comprehensions. It was just something that jumped out at me from a cursory look.

                                                                                                                          1. 2

Yeah, list comprehensions are faster up to a certain size, because of the machinery needed to run generators. Unfortunately, I think that crossover size is pretty large :)
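If anyone wants to check the crossover on their own machine, here’s a rough timeit sketch (the input size and repeat count are arbitrary, and the numbers will vary by machine and interpreter version):

```python
# Compare a list comprehension vs. a generator expression inside sum().
import timeit

setup = "a = list(range(1000)); x = list(range(1000))"

t_list = timeit.timeit("sum([w * e for w, e in zip(a, x)])",
                       setup=setup, number=1000)
t_gen = timeit.timeit("sum(w * e for w, e in zip(a, x))",
                      setup=setup, number=1000)

print(f"list comp: {t_list:.4f}s  genexpr: {t_gen:.4f}s")
```

Try varying the range size to see where (or whether) the generator version pulls ahead.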

                                                                                                                        2. 2

                                                                                                                          And that’s not even touching on the subject of numpy: most people actually using Python for statistics would use it for such a task, because why not? So it’s not really “LispE vs. Python”, it’s “LispE vs. deliberately unoptimized pure-Python code”.

                                                                                                                          This seems like a pretty incurious reading of the article. Mine is that it’s quite explicit on exactly this question: the answer is, this algorithm was chosen as an exemplar of a sufficiently computationally complex yet still-understandable benchmark case for comparing the two languages as general compute languages. Not as a demonstration of why you should use this lisp to do your data engineering. The fact that the domain of this algorithm, in the real world, tends to revolve around some other highly optimized library is completely neither here nor there.

                                                                                                                          1. 2

Thank you, you are absolutely right.

                                                                                                                          1. 4

                                                                                                                            This one seems to have more expressive selector syntax than htmlq.

                                                                                                                            Edit: Also pup’s “ json{}” output formatter means that if pup isn’t sufficiently expressive for what I need, I can probably pipe its output to jq and get what I want.

                                                                                                                            1. 12

                                                                                                                              htmlq uses Servo’s CSS selector engine underneath, so I think everything on pup’s readme should work with htmlq. For example:

                                                                                                                              curl -s https://news.ycombinator.com/ | htmlq 'table table tr:nth-last-of-type(n+2) td.title a'
                                                                                                                              

                                                                                                                              If you do this instead you can just get the links themselves rather than the entire element:

                                                                                                                              curl -s https://news.ycombinator.com/ | htmlq 'table table tr:nth-last-of-type(n+2) td.title a' --attribute href
                                                                                                                              

Full disclosure: I wrote htmlq, and I’m envious of pup’s README.

                                                                                                                              1. 1

                                                                                                                                Hah, nice! Thank you.

                                                                                                                          1. 4

                                                                                                                            Is anyone using natively-compiled Clojure for anything? Clojure is seeming more attractive lately, given:

                                                                                                                            1. there’s a natively-compiled option
                                                                                                                            2. ClojureScript seems like the best alternative to writing JavaScript/TypeScript
                                                                                                                            3. new tools like Babashka have made scripting with Clojure possible
                                                                                                                            4. Rich Hickey and the language are now being supported by Nubank
                                                                                                                            1. 2

                                                                                                                              Looks like natively compiled Clojure is mostly being used for developer tooling like clj-kondo. Calva makes heavy use of it to provide editor intelligence for VS Code. I’ve used Babashka for some internal tooling and scripting, and it’s definitely a great way to do automation.

                                                                                                                            1. 2

                                                                                                                              Firefox’s dark mode reverses it so now it’s bright :(

                                                                                                                              1. 1

                                                                                                                                Emailed the author to hopefully get them to include https://crystal-lang.org/