1. 14

    No more strict separation of evaluation and build phases: Generating Nix data structures from build artefacts (“IFD”) should be supported first-class and not incur significant performance cost.

    I don’t know much about Nix (although I’m about to learn with Oil), but this sounds like the same problem Bazel has, which I mentioned here:

    https://lobste.rs/s/virbxa/papers_i_love_gg

    and linked in this post:

    http://www.oilshell.org/blog/2021/04/build-ci-comments.html#two-problems-with-bazel-and-gg-again

    i.e. if you have to compute all dependencies beforehand, it both makes things slow and clashes with the way that certain packages want to be built. You have a lot of work to do before you can start to parallelize your build.

    Nix also very much has the “rewrite the entire upstream build system” problem that Bazel has.

    It sounds like we still need:

    1. A more dynamic notion of dependencies like gg
    2. A middle ground between Docker-like layers and /nix/store composability

    related: https://lobste.rs/s/65xymz/will_nix_overtake_docker#c_tfjfif

    1. 6

      Thanks for linking the gg paper! It’s quite interesting how most concepts in there directly map to concepts that already exist in Nix today, for example:

      gg models computation as a graph of “thunks,” which represent individual steps of computation. Thunks describe the execution of a specific binary on a set of inputs, producing one or more outputs. Thunks are entirely self-contained; any dependencies of the invoked binary must also be explicitly named as dependencies.

      That’s the concept behind Nix derivations.

      Either a primitive value, representing a concrete set of bytes stored into the object store, or

      Fixed-output derivation.

      As the output of a named thunk (referred to using a hash of the serialized thunk object), which must be first executed to produce its output(s).

      That’s a standard derivation.

      You can encode, for instance, “Link(Compile(Preprocess(foo.cc)))” as a sequence of three thunks

      This is where it gets interesting. Nothing in Nix’s model prevents us from doing things at this abstraction level, but it is far more common to write a single derivation that is more like Wrap(EntireBuildSystemOfTheThing). We would like to elevate Nix one level up.

      A gg thunk can return either one or more primitive values, or it can return a new thunk, with its own set of inputs.

      In Nix this is known as import from derivation (IFD). It refers to any time the build graph is extended with things that first need to be computed as the result of derivations. This process is currently a major pain in Nix, as you cannot “chunk” these parts of the evaluation if you’re doing something like computing an entire repository’s build graph for a CI system.
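
      A contrived sketch of what triggers IFD (assuming nixpkgs is on NIX_PATH; the derivation is made up):

      nix-instantiate --eval -E '
        with import <nixpkgs> {};
        # readFile on a derivation output forces a *build* during evaluation (IFD)
        builtins.readFile (runCommand "gen" {} "echo hello > $out")'

      Everything after that readFile has to wait for an actual build to finish, which is exactly why these parts of the evaluation are so hard to chunk.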

      Nix also very much has the “rewrite the entire upstream build system” problem that Bazel has.

      I find this statement confusing. Nix rather does the opposite right now - most build systems are wrapped fully in derivations. The one thing that often causes a lot of overhead (e.g. through code generation) is that the majority of build systems either don’t provide Nix with usable hashes of dependencies, or they make it really hard to “inject” “pre-built” artefacts.

      A middle ground between Docker-like layers and /nix/store composability

      Can you expand on this?

      1. 2

        Nix also very much has the “rewrite the entire upstream build system” problem that Bazel has.

        I find this statement confusing.

        I believe GP meant “rewrite the upstream package management system”, which is mostly true for Nix; unless a particular language’s package manager cooperates heavily with Nix and exposes its guts, you can’t reuse it to express your Nix derivations. That’s why we have projects like cabal2nix, node2nix, etc. that try to bridge the gap.

        1. 4

          The majority of those tools just pin hashes, because the upstream package managers don’t provide appropriate lockfiles, but it’s still the upstream build system that is being invoked.

          A notable exception is Rust packaging with buildRustCrate, which reimplements a huge chunk of cargo, because cargo is impossible to deal with from the perspective of packaging individual crates (it doesn’t support injecting these pre-built artefacts in any way).

          On the other end of the spectrum is something like naersk, which uses lockfiles natively.

        2.  

          Yeah gg and Bazel are both very functional, parallel, and fine-grained. Actually the fine-grained-ness of Bazel is nice at times but it also takes a large amount of time and memory to handle all that file-level metadata.

          It’s also worth looking at Llama, which is an open source experiment inspired by gg: https://blog.nelhage.com/post/building-llvm-in-90s/

          (by the author of the gg post)

          There were a few more blog posts and comments on lobste.rs about llama.


          I think gg’s main contribution is that it’s fast. However it’s also not a production system like Bazel or Nix; it’s research.

          Though it is open source: https://github.com/StanfordSNR/gg

          What stood out to me as being somewhat unreasonable and not-fit-for-production is the idea of “model substitution”, which seems like just a fancy way of saying we have to reimplement a stub of every tool that runs, like GCC.

          https://github.com/StanfordSNR/gg/blob/master/src/models/gcc.cc

          The stub is supposed to find all the dependencies. This seems like a lot of tech debt to me, and limits real world applicability. However gg is fast and that’s interesting! And I do think the notion of dynamic dependencies is important, and it seems like Bazel and Nix suffer there. (FWIW I feel Nix is a pioneering system with many of the right ideas, though it’s not surprising that after >15 years there is opportunity for a lot of improvement. Just because the computing world has grown so much, etc.)


          By “rewrite the whole build system” I basically mean stuff like PyPI, and I have some particular experience with R packages (in Bazel, not Nix).

          As for the middle ground, I’m experimenting with that now :) As far as I understand, writing Nix package definitions can be tricky because nothing is ever in a standard place – you always have to do ./configure --prefix. Although I just looked at a few definitions, and I guess that’s hidden? But the actual configure command underneath can never be stock.

          In my experience this is slightly annoying for C code, but when you have say R and Python code which rely on native libraries, it starts to get really tricky and hard to debug. My experience is more with Bazel but I’ve seen some people not liking “100 KLOC” of Nix. I’d prefer package definitions that are simpler with less room for things to go wrong.

          For me a key issue is that Nix was developed in a world before Linux containers and associated mechanisms like OverlayFS, and I think with that new flexibility you might choose something different than the /nix/store/$hash mechanism.

          From a philosophical level, Bazel and Nix both have a very strong model of the world, and they get huge benefits from that. But if you don’t fit in the model, then it tends to fall off a cliff and you struggle to write solid build config / package defs.

          Also I don’t remember the details, but I remember being excited about Nix Flakes, because I thought Nix had some of those guarantees all along but apparently doesn’t.


          Anyway it sounds like a very cool project, and I look forward to future blog posts. BTW it might also be worth checking out the Starlark language from Bazel. It started out as literally Python, but evolved into a language that can be evaluated in parallel quite quickly. This partially mitigates the “two stage” latency of evaluating dependencies and then evaluating the graph. I didn’t see any mention of parallel Nix evaluation but it seems to make sense, especially for a functional language? Starlark is not as functional but the semantics are such that there’s a lot of parallelism.

          1.  

            BTW it might also be worth checking out the Starlark language from Bazel. It started out as literally Python, but evolved into a language that can be evaluated in parallel quite quickly.

            Oh, I’m aware of Starlark - my previous full-time position was in SRE at Google :)

            I didn’t see any mention of parallel Nix evaluation but it seems to make sense, especially for a functional language

            Yeah, we want to get to a parallel evaluation model for sure. In fact the lack of that model is a huge problem for current Nix evaluation, as any use of IFD will force a build process and block the rest of the language evaluation completely until it is done. It makes a lot more sense to mark the thunk as being evaluated and continue on a different part of the graph in the meantime.

        3.  

          Oh god. If Nix is anything like Bazel I am staying the hell away from it. Is this actually true?

          1.  

            If you zoom out a lot there’s some conceptual similarities, but in practice they’re pretty much completely different systems with a completely different UX.

        1. 67

          Nobody knows how to correctly install and package Python apps.

          That’s a relief. I thought I was the only one.

          1. 8

            Maybe poetry and PyOxidizer will have a baby and we’ll all be saved.

            One can hope. One can dream.

            1.  

              After switching to poetry, I’ve never really had any issues.

              pip3 install --user poetry    # install poetry itself, outside any project venv
              git clone...                  # (project URL elided)
              cd project
              poetry install                # resolve and install deps into a project virtualenv
              poetry run python -m project  # run the app inside that virtualenv
              

              You can put the whole install sequence in a Docker container, push it from your CI/CD to ECR/GitLab or whatever registry you use, and just include both the manual steps and the docker command in your readme. Everyone on your team can use it. If you find an issue, you can add that gotcha to the docs.
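
              For example, a minimal sketch of such a container (base image and names are illustrative, not a recommendation):

              cat > Dockerfile <<'EOF'
              FROM python:3.10-slim
              RUN pip install poetry
              WORKDIR /app
              COPY . .
              RUN poetry install
              CMD ["poetry", "run", "python", "-m", "project"]
              EOF
              docker build -t project .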

              Python is fine for system programming so long as you write some useful unit tests and force pycodestyle. You lose the type safety of Go and Rust, yes, but I’ve found it’s way faster to write. Of course if you need something that’s super high performance, Go or Rust should be what you look towards (or JVM–Kotlin/Java/Scala if you don’t care about startup time or memory footprints). And of course, it depends on what talent pools you can hire from. Use the right tool for the right job.

              1.  

                I’ve switched to poetry over the last several months. It’s the sanest that installing Python dependencies has felt in quite a few years. So far I prefer to export it to requirements.txt for deployment. But it feels like about 95% of the right answer.

                It does seem that without some diligence, I could be signing up for some npm-style “let’s just lock in all of our vulnerabilities several versions ago” and that gives me a little bit of heartburn. From that vantage point, it would be better, IMO, to use distro packages that would at least organically get patched. I feel like the answer is to “just” write something to update my poetry packages the same way I have a process to keep my distro packages patched, but it’s a little rotten to have one more thing to do.

                Of course, “poetry and PyOxidizer having a baby” would not save any of this. That form of packaging and static linking might even make it harder to audit for the failure mode I’m worrying about here.

            2. 5

              I’d make an exception to this point: “…unless you’re already a Python shop.” I did this at $job and it’s going okay because it’s just in the monorepo where everyone has a Python toolchain set up. No installation required (thank god).

              1. 4

                I think the same goes for running Python web apps. I had a conversation with somebody here… and we both agreed it took us YEARS to really figure out how to run a Python web app. Compared to PHP where there is a good division of labor between hosting and app authoring.

                The first app I wrote was CGI in Python on shared hosting, and that actually worked. So that’s why I like Unix – because it’s simple and works. But it is limited because I wasn’t using any libraries, etc. And SSL at that time was a problem.

                Then I moved from shared hosting to a VPS. I think I started using mod_python, which is the equivalent of mod_php – a shared library within Apache.

                Then I used a CherryPy server and WSGI (mod_python predates WSGI). I think it was behind Apache.

                Then I moved to gunicorn behind nginx, and I still use that now.

                But at the beginning of this year, I made another small Python web app with Flask. I managed to configure it on shared hosting with FastCGI, so Python is just like PHP now!!! (Although I wouldn’t do this for big apps, just personal apps).

                So I went full circle … while all the time I think PHP stayed roughly the same :) I just wanted to run a simple app and not mess with this stuff.

                There were a lot of genuine improvements, like gunicorn is better than CherryPy, nginx is easier to config than Apache, and FastCGI is better than CGI and mod_python … but it was a lot of catching up with PHP IMO. Also FastCGI is still barely supported.

                1.  

                  nginx, uWSGI, supervisord. Pretty simple to set up for Flask or Django. A good shared hosting provider for Python is OpalStack, made by the people who created Webfaction (which, unfortunately, got gobbled up by GoDaddy).

                  I cover the deployment options and reasoning in my popular blog post, “Build a web app fast: Python, JavaScript & HTML resources”. Post was originally written in 2012 but updated over the years, including just this month. See especially the recommended stack section at the end, starting at “Conclusion: pick a stack”, if you want to ctrl+f for that section. You can also take a peek at how OpalStack describes their Python + uWSGI + nginx shared hosting setup here. See also my notes on the under the hood configuration for nginx, uWSGI, and supervisord in this presentation, covered in the 5-6 sections starting from this link.

                  You’re right that there are a lot of options for running a Python web app. But nginx, uWSGI, supervisord is a solid option that is easy to configure, high performance, open source, UNIXy, and rock solid. For dependency management in Python 3.x you can stick with pip and venv, remotely configured on your server via SSH.
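
                  For instance, a first-time server setup can be as small as this (paths hypothetical):

                  python3 -m venv /srv/myapp/venv
                  /srv/myapp/venv/bin/pip install -r requirements.txt
                  # uWSGI can live in the same venv, started under supervisord
                  /srv/myapp/venv/bin/uwsgi --ini /srv/myapp/uwsgi.ini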

                  My companies have been using this stack in production at the scale of hundreds of thousands of requests per second and billions of requests per month – spanning SaaS web apps and HTTP API services – for years now. It just works.

                  1.  

                    I’m curious, now that systemd is available in almost all Linux distributions by default, why are you still using supervisord? To me it feels like it is redundant. I’m very interested.

                    1.  

                      I think systemd can probably handle the supervisord use cases. The main benefit of supervisord is that it runs as whatever $USER you want without esoteric configuration, and it’s super clear it’s not for configuring system services (since that’s systemd’s job). So when you run supervisorctl and list on a given node, you know you are listing “my custom apps (like uwsgi or tornado services)”, not all the system-wide services as well as my custom app’s ones. Also this distinction used to matter more when systemd was less standard across distros.
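
                      Concretely, the separation is the difference between these two commands:

                      supervisorctl status                  # only the apps you registered, running as your $USER
                      systemctl list-units --type=service   # your services mixed in with every system unit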

                    2.  

                      In addition to @jstoja’s question about systemd vs supervisord, I’d be very curious to hear what’s behind your preference for nginx and uWSGI as opposed to caddy and, say, gunicorn. I kind of want caddy to be the right answer because, IME, it makes certificates much harder to screw up than nginx does.

                      Have you chosen nginx over caddy because of some gotcha I’m going to soon learn about very unhappily?

                      1.  

                        Simple answer: age/stability. nginx and uWSGI have been running fine for a decade+ and keep getting incrementally better. We handle HTTPS with acme.sh or certbot, which integrate fine with nginx.

                        1.  

                          That’s a super-good point. I’m going to need to finish the legwork to see whether I’m willing to bet on caddy/gunicorn being as reliable as nginx/uWSGI. I really love how terse the Caddy config is for the happy path. Here’s all it is for a service that manages its own certs using LetsEncrypt, serves up static files with compression, and reverse proxies two backend things. The “hard to get wrong” aspect of this is appealing. Unless, of course, that’s hiding something that’s going to wake me at 3AM :)
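
                          To illustrate, the config is roughly this shape (sketched; hostnames and ports made up):

                          cat > Caddyfile <<'EOF'
                          example.com {
                              encode gzip
                              root * /srv/site
                              file_server
                              reverse_proxy /api/* localhost:8001
                              reverse_proxy /hooks/* localhost:8002
                          }
                          EOF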

                  2. 3

                    Why is Python’s packaging story so much worse than Ruby’s? Is it just that dependencies aren’t specified declaratively in Python, but in code (i.e. setup.py), so you need to run code to determine them?

                    1. 7

                      I dunno; if it were me I’d treat Ruby exactly the same as Python. (Source: worked at Heroku for several years and having the heroku CLI written in Ruby was a big headache once the company expanded to hosting more than just Rails apps.)

                      1.  

                        I agree. I give Perl the same handling, too. While Python might be able to claim a couple of hellish innovations in this area, it’s far from alone here. It might simply be more attractive to people looking to bang out a nice command line interface quickly.

                      2. 6

                        I think a lot of it is mutable global state like PYTHONPATH, which becomes sys.path. The OS, the package managers, and the package authors often fight over it, which leads to unexpected consequences.
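
                        You can watch the mutation happen (directory made up):

                        python3 -c 'import sys; print(sys.path[:3])'
                        PYTHONPATH=/opt/vendored python3 -c 'import sys; print(sys.path[:3])'
                        # every OS package, wrapper script, and app that exports PYTHONPATH
                        # is changing this one global for all the Python processes it spawns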

                        It’s basically a lack of coordination… it kinda has to be solved in the core, or everybody else is left patching up their local problems, without thinking about the big picture.

                        Some other reasons off the top of my head:

                        • Python’s import mechanism is very dynamic, and also inefficient. So the language design kind of works against the grain of easy distribution, although it’s a tradeoff.
                        • There’s a tendency to pile more code and “solutions” on top rather than redoing things from first principles. That is understandable because Python has a lot of users. But there is definitely a big mess with distutils + setuptools + pip + virtualenv, plus a few other things.
                        • Package managers are supposed to solve versioning issues, and then you have the tricky issue of the version of the package manager itself. So in some sense you have to get a few things right in the design from the beginning!
                        1. 5

                          Ruby’s packaging story is pretty bad, too.

                          1.  

                            In what way?

                            1.  

                              I don’t know, it’s been a long time since I’ve written any Ruby. All I know is that we’re migrating the Alloy website from Jekyll to Hugo because nobody could get Jekyll working locally, and a lot of those issues were dependency related.

                          2.  

                            Gemfile and gemspec are both just Ruby DSLs and can contain arbitrary code, so that’s not much different.

                            One thing is that PyPI routinely distributes binary blobs, called “wheels”, that can be built in arbitrarily complex ways, whereas rubygems always builds from source.

                            1. 5

                              Not true. Ruby has always been able to package and distribute precompiled native extensions; it’s just that it wasn’t the norm in a lot of popular gems, including nokogiri. Which, by the way, ships precompiled binaries now, taking a couple of seconds where it used to take 15 minutes, and now there’s an actual toolchain for targeting multi-arch packaging, and the community is catching up.

                              1.  

                                Hmm, that’s very unfortunate. I haven’t run into any problems with gems yet, but if this grows in popularity the situation could easily get as bad as pypi.

                              2.  

                                Thanks for the explanation, so what is the fundamental unfixable issue behind Python’s packaging woes?

                                1.  

                                  I could be wrong but AFAICT it doesn’t seem to be the case that the Ruby crowd has solved deployment and packaging once and for all.

                              1. 2

                                I just run pkg install some-python-package-here using my OS’s package manager. ;-P

                                It’s usually pretty straightforward to add Python projects to our ports/package repos.

                                1.  

                                  Speaking from experience, that works great up until it doesn’t. I have “fond” memories of an ex-coworker who developed purely on Mac (while the rest of the company at the time was a Linux shop), aggressively using docker and virtualenv to handle dependencies. It always worked great on his computer! Sigh. Lovely guy, but his code still wastes my time to this day.

                                  1.  

                                    I guess I’m too spoiled by BSD where everything’s interconnected and unified. The ports tree (and the package repo that is built off of it) is a beauty to work with.

                                    1.  

                                      I mean, we used Ubuntu, which is pretty interconnected and unified. (At the time; they’re working on destroying that with snap.) It just often didn’t have quiiiiiite what we, or at least some of us, wanted and so people reached for pip.

                                      1.  

                                        Yeah. With the ports tree and the base OS, we have full control over every single aspect of the system. With most Linux distros, you’re at the whim of the distro. With BSD, I have full reign. :-)

                                        1.  

                                          But it could still be the case that application X requires Python 3.1 when application Y requires Python 3.9, right? Or X requires version 1.3 of library Z which is not backwards compatible with Z 1.0, required by Y?

                                          1.  

                                            The Debian/Ubuntu packaging system handles multiple versions without any hassle. That’s one thing I like about it.

                                            1.  

                                              Does it? Would love to read more about this if you have any pointers!

                                              1.  

                                                I guess the main usability thing to read about is the alternatives system.
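
                                                e.g., for a name Debian does manage this way:

                                                update-alternatives --display editor   # every candidate, plus the current pick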

                                            2.  

                                              The ports tree handles multiple versions of Python fine. In fact, on my laptop, here’s the output of: pkg info | grep python:

                                              py37-asn1crypto-1.4.0          ASN.1 library with a focus on performance and a pythonic API
                                              py37-py-1.9.0                  Library with cross-python path, ini-parsing, io, code, log facilities
                                              py37-python-docs-theme-2018.2  Sphinx theme for the CPython docs and related projects
                                              py37-python-mimeparse-1.6.0    Basic functions for handling mime-types in Python
                                              py37-requests-toolbelt-0.9.1   Utility belt for advanced users of python-requests
                                              py38-dnspython-1.16.0          DNS toolkit for Python
                                              python27-2.7.18_1              Interpreted object-oriented programming language
                                              python35-3.5.10                Interpreted object-oriented programming language
                                              python36-3.6.15_1              Interpreted object-oriented programming language
                                              python37-3.7.12_1              Interpreted object-oriented programming language
                                              python38-3.8.12_1              Interpreted object-oriented programming language
                                              
                                        2.  

                                          I’m as happy to be smug as the next BSD user but it isn’t justified in this case. Installing Python packages works for Python programs installed from packages but:

                                          • They don’t work well in combination with things not in packages, so if you need to use pip to install some things you may end up with conflicts.
                                          • The versions in the package repo may or may not be the ones needed by the thing you want to install that isn’t in packages, and may conflict with the ones it needs.
                                          • The Python thing may depend on one of the packages that depends on Linux-specific behaviour. The most common of these is that signals sent to the process are delivered to the first thread in the process.

                                          In my experience, there’s a good chance that a Python program will run on the computer of the author. There’s a moderately large chance that it will run on the same OS and version as the author. Beyond that, who knows.

                                    2.  

                                      Fwiw, I’ve had good luck using PyInstaller to create standalone binaries. I’ve even been able to build them for Mac in CircleCI.

                                      1.  

                                        It can feel a bit like overkill at times, but I’ve had good luck with https://www.pantsbuild.org/ to manage python projects.

                                      1.  

                                        I had read this last year, seems like great work. Somehow it’s not been merged?

                                        https://github.com/containers/image/pull/902

                                        https://github.com/containers/tar-diff

                                        1. 1

                                          This continues to be really fun to see develop.

                                          Regarding the parentheses-as-expressions: does this make control blocks like if, while, etc. regular commands, semantically, of the form if EXPRESSION BLOCK, where the expression just follows the normal expression-mode rules instead of explicitly requiring parentheses as a grammar rule?

                                          I wonder if you could push that further? It might lead you into a remarkably elegant Lispy or Haskellesque territory, and give you fully programmable control flow for free. Very possible this is already the case and I’m not seeing it yet.

                                          I imagine you’ve caught this, but:

                                          const myexpr = ^[size > 10]  # unevaluated expression
                                          const myblock = ^(echo $name)  # unevaluated block
                                          when (myexpr, myblock)
                                          

                                          sort of asks to be rewritten as:

                                          const myexpr = ^(size > 10)  # unevaluated expression
                                          const myblock = ^{echo $name}  # unevaluated block
                                          when (myexpr, myblock)
                                          

                                          You’re on the verge of ^ as quotation, there, which would be incredible.

                                          I imagine keeping all of this compatible-ish with Bash is always a constraining factor, so if these are made impossible by that I’d be super curious as to why.

                                          1. 2

                                            Yes great points! I’m glad someone is paying attention :)

                                            For the first question, if and while are special keywords basically because they are in shell, and I implemented them a long time ago. But I posted this same question on Zulip today under this post!

                                            https://oilshell.zulipchat.com/#narrow/stream/202008-oil-documentation/topic/Winter.20Blog.20Backlog.3A.20Recent.20Progress

                                            So I am wondering if it is confusing that the syntax is the same, but they mean different things. But in practice I’m not sure if there are any real consequences, other than the fact that you can’t redefine if as a function. And I don’t really want that. Testing and feedback is appreciated!


                                            And yes there is an inconsistency in the punctuation. I documented that here:

                                            https://www.oilshell.org/release/0.9.5/doc/warts.html#two-different-syntaxes-for-block-and-arglist-literals (some old syntax here; just corrected at HEAD)

                                            But the basic reason is that we already solve the problem of parsing $(echo hi) and the closing paren, which is actually quite difficult because parens are used in so many places, and can be unbalanced in case statements. (The AOSA chapter on bash which I refer to talks about this problem.)

                                            Parsing $[echo x] and ${echo hi} is actually even more difficult, because neither of them are operator characters. Unquoted [] can be globs or the test builtin, and unquoted {} can be brace expansion or brace groups.

                                            So I avoided those, and made ^(echo hi) consistent with $(echo hi). (And we also have @(echo hi) for split command sub.) The syntax is basically consistent with shell. If you think about it, shell also has $(echo hi) for command sub and { echo hi; } for blocks.


                                            Originally I did specify that command sub was $[echo hi], because [] was supposed to be for “words” and () was supposed to be for expressions. But that collided with reality!

                                            So it is a wart, and a documented one, but I’m actually OK with it because I think those unevaluated expressions will be very rare in real code. They are probably only going to be used for framework and library code (which I also hope will be rare!)

                                            I did spend a lot of effort trying to make all the sigils and punctuation consistent, and wrote a whole doc about it!

                                            https://www.oilshell.org/release/0.9.5/doc/syntax-feelings.html (feedback welcome)

                                            Let me know if you see any other inconsistencies :)

                                            1. 3
                                              • (): is a command

                                              • $(): $ means string / scalar, () is a command

                                              • ^(): ^ means quoted, () is a block

                                              • @(): @ means array / splice, () is a command

                                              • %(): % means unquoted, () is an array literal (sequence of words)

                                              • {}: brace expansion or brace groups

                                              • ${}: $ means string / scalar, {} is variable substitution (its own insanity)

                                              • ^{}: ^ means quoted, {} is … nothing? It was an expression.

                                              • @{}: is a syntax error

                                              • %{}: is … nothing?

                                              • []: is a list (sequence in the docs)

                                              • $[]: is … nothing?

                                              • ^[]: ^ means quoted, [] is an expression

                                              Even with oil and the docs both open, I don’t feel confident with my above listing.

                                              Is there a table somewhere? This feels like Perl-level inconsistency.

                                              1. 1

                                                The appendix is out of date, I just pushed a change [1], although I’m still confused where you got a lot of those. For example () isn’t a command; it’s a type arg list to a command as shown in the blog post.

                                                The best overview at the moment is the tour:

                                                https://www.oilshell.org/release/latest/doc/oil-language-tour.html

                                                Here’s a short way to think about it. Let’s start with the following axioms from shell:

                                                • $var and ${myvar}
                                                • $(command sub)
                                                • most other things about shell are discouraged/deprecated in Oil, particularly $(( )) and bash’s [[ and ((
                                                • { echo 'brace group'; } is { echo 'brace group' } in Oil

                                                Then in Oil, you have command vs. expression mode:

                                                https://www.oilshell.org/release/latest/doc/command-vs-expression-mode.html

                                                Expression mode is like Python / JavaScript! Generally this code has very few sigils and “sigil pairs”. It looks like

                                                const x = f('mystr', [], myvar)
                                                
                                                • {} [] () mean what they do in Python / JS – dict, list, parenthesized expression
                                                • we also have %(foo bar) which is identical to ['foo', 'bar']. It’s like Perl’s qw//. Words are unquoted.

                                                Then in command mode:

                                                • Word splitting is explicit rather than implicit. So we need:
                                                  • Explicit splicing of array variables with @myarray
                                                  • Split command sub with @(command sub)
                                                  • These are analogous to the two above
                                                • We also have expression sub, which is $[1 + 2].
                                                • You can pass { echo hi } as a trailing block arg, which is consistent with shell’s brace groups.

                                                Everything else should be quite rare. The inconsistent unevaluated expressions are a wart, but as mentioned very rare. Also, that part is less implemented (though everything in the Tour is verified to work).

                                                Does that make sense?

                                                %{} and @{} don’t have any obvious meaning I see. I guess you could argue that @{myarray} should mean splicing, but it’s superfluous considering the meaning of $myvar vs ${myvar} (which is purely syntactic).

                                                [1] https://github.com/oilshell/oil/commit/7adffce97ba8619f9fb866b1d185c1f0d6cd45b3

                                                1. 1

                                                  Does that make sense?

                                                  Not really, no. That’s why I asked if there was a table.

                                                  I’m still confused where you got a lot of those.

                                                  oil -n -c '…' mostly.

                                                  1. 1

                                                    This table is updated (and will be published with the next release).

                                                    https://github.com/oilshell/oil/blob/master/doc/syntax-feelings.md

                                                    You have to take into account the valid contexts (command, expression or both), and what’s inside (either commands, expressions, or words).

                                                    1.  

                                                      Thanks mate; I gave it a look and it’s a lot clearer!

                                                      Am I blind, though, or is it missing the normal ol’ $variable / ${variable}?

                                                2. 1

                                                  Another thing to note is that % does not mean hash, like it does in Perl. $ and @ do mean string and array respectively, so this could be confusing.

                                                  Oil doesn’t need any sigil for hash, because expression mode is different than command mode. The sigils are used mainly in command mode.

                                            1. 2

                                              You just introduced me to uftrace, thanks!

                                              1. 2

                                                Cool, yeah it came in handy this release as a substitute for gdb / CLion! Somehow I’m unable to productively use GDB’s interface for many bugs …

                                                I was using CLion a lot earlier this year but switched PCs and didn’t install it again. Surprisingly, uftrace is a nice CLI substitute in some cases.

                                                  Originally I had used it for performance reasons, but it’s also good for debugging. I meant to write a post tagged #praise about it, which I mentioned here: http://www.oilshell.org/blog/2020/01/blog-roadmap.html

                                                  Simply counting function calls can be great for profiling real systems, particularly recursive descent parsers. And uftrace is really good at that.
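
                                                  e.g. (binary name made up; assumes it was built with -pg or -finstrument-functions):

                                                  uftrace record ./myparser input.txt
                                                  uftrace report --sort=call   # rank functions by how often they were called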

                                                  I also haven’t written properly about re2c, although I keep mentioning it, and it’s been great for ~4 years now.

                                              1. 8

                                                They rightly point out the limited sharing of layers in Docker. Nix has better composability via the separate /nix/store paths, though the problem is then that the package definitions become very unwieldy and non-standard.

                                                You pay a tax on every package. And since Nix has better reproducibility, you have more packages to build.

                                                 You also have more problems with the model shear between N different language package managers that want to own their slice of the world.


                                                I was thinking that what we need is simply an HTTP proxy for reproducibility without rewriting all your package builds and building them from source. Has anyone done anything like that?

                                                That is, tools like apt should all be able to use an HTTP proxy to fetch data. So if you do an apt install in something like a Dockerfile, it the proxy should be able to record what it retrieved, and you can archive it somewhere. (where to archive it is another issue; I don’t like depending on free services from Docker.)
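
                                                 Sketch of the idea (proxy address hypothetical; something like apt-cacher-ng or squid would play the recorder):

                                                 apt-get -o Acquire::http::Proxy="http://localhost:3142" update
                                                 apt-get -o Acquire::http::Proxy="http://localhost:3142" install -y html-xml-utils
                                                 # the proxy cache now holds exactly the .debs this build fetched,
                                                 # and archiving it lets you replay the same install later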

                                                Then again I remember this other project which adds a “lockfile” concept to Debian (an expanded transitive dependency list). So maybe it would suffice if Debian and pip had that?

                                                https://news.ycombinator.com/item?id=28840128

                                                And I still wonder if a nicer system could take advantage of the fact that OverlayFS is in the kernel, and just bypass Docker …

                                                1. 2

                                                  Things you do in apt, including install, get logged to /var/log/apt/history.log, with entries that look like this:

                                                   Start-Date: 2021-11-10 15:49:09
                                                   Commandline: apt install html-xml-utils
                                                   Requested-By: dsr (1000)
                                                   Install: html-xml-utils:amd64 (7.7-1.1)
                                                   End-Date: 2021-11-10 15:49:13

                                                   So you could just log that via syslog somewhere useful to you, or copy off the log.

                                                1. 12

                                                     This is indeed depressing … My last upgrade was from Ubuntu 16 to 18, not to the current 20 release, because 18 appears to have less Snap BS on it. Not sure what I’m going to do in a few years :-(

                                                     On another note, I recently ran Windows XP on modern hardware and it absolutely flies. As far as desktop apps go, it does basically everything that modern computers do. We used to make fun of Microsoft software for being “bloated”, but the Linux world is 1000x worse now. The Microsoft calculator app was never 152 MB :-(

                                                  In fact the entire Windows XP installation is under 128 MB! Amazing!!! And it runs in 32 MB of RAM.


                                                  I wonder if Linux needs something like COM – stable shared library interfaces. I would prefer something more like IPC than shared libraries (Unix style), but performance is always a concern. Although you did have “DLL hell” back then too, which is what Snap and such are trying to avoid.

                                                  But I wonder if “DLL hell” was really people NOT using COM, which should let people know if the interfaces were changed? Or just using it poorly, i.e. changing the semantics of the interface, rather than creating new interfaces when breakage occurs.

                                                  Mozilla had XPCOM more than a decade ago but abandoned it for some reason. I think they abandoned true cross language interoperability and went for just JS / C++ interop like WebIDL in Chrome (as far as I understand).

                                                  1. 17

                                                       I recently ran Windows XP on modern hardware and it absolutely flies

                                                       Yeah. Software that was written by people using HDDs runs really well on SSDs.

                                                    I consider it a systemic tragedy that programmers tend to use very fast computers when we’re actually the one group that has the most capability to change things to make slow ones useful.

                                                    1. 3

                                                      Yeah I asserted on Hacker News that the Apple M1 would probably have the effect of making the entire web slower.

                                                      If you assume that web developers are more likely to buy newer Apple laptops sooner than their audience, and spend more on them, that seems inevitable :-(

                                                      I think SSDs were a big jump in hardware performance but CPU and memory are issues as well. Today’s apps use so much memory that people running with 8 GB of RAM can experience slowdowns, let alone older computers with 2GB or 1GB (like at the library, or what many low income people use, etc.)

                                                      1. 1

                                                           Fwiw that’s been true of every new AMD and Intel CPU too, so there’s no particular reason to single Apple out. Other than their currently being in front. Obviously I do believe you are correct.

                                                        Today’s apps use so much memory that people running with 8 GB of RAM can experience slowdowns,

                                                           This is one area where I hoped at one point that web browsers might help a bit, because of per-tab memory limits.

                                                           Also CI environments and serverless (née Platform as a Service) environments tend to charge by the 128 MB-second of RAM, which tempts people to try to fit things into smallish boxes.

                                                        1. 4

                                                             I think this is unique because it’s the first time there’s a pretty big differential between Apple and the rest of the industry. And because web developers are more likely to use Apple machines, and the audience is more likely to use Windows.

                                                          When Apple was using Intel chips, top of the line Windows laptops had the same CPUs or faster. Now if you’re a Windows user, AFAIK you can’t get a laptop as fast as the Macs that everyone is buying right now.

                                                          Obviously this is not Apple’s “fault”; they’re just making faster computers. And I haven’t quantified this, but I still think it’s interesting :)

                                                          1. 1

                                                            Ah. You have a good point. Yes, I’ll concede this!

                                                            1. 1

                                                              I don’t find this argument compelling.

                                                              First off, most usage is via mobiles, and any company worth its salt (i.e. striving to make money) will take this into consideration. Mobile clients are generally not as fast as desktop ones.

                                                              Secondly, the M1 line is about a year old. Has there really been a critical mass of web technology developed and deployed during that time, that is acceptable to run on an M1 but not on others, to materially tilt the scale of performance in the wild?

                                                              And lastly, this paints an incredibly bleak “us vs. them” picture, wherein Intel, one of the world’s largest companies and one that has been in the forefront of processor development for decades will never[1] catch up with Apple when it comes to performance. Or that Apple, seeing a gaping hole in the market, won’t move in with more affordable machines using the M1 chips to capture that.

                                                              [1] well, in the medium term

                                                              1. 2

                                                                I’m not claiming this is a permanent state of affairs! Surely the CPU market will eventually change, but that’s how it is now.

                                                                The first claim about company incentives is empirically false… If mobile app speed mattered more than functionality or time to market, then you wouldn’t see apps that are 2x or 10x too slow in the wild, yet you see them all the time.

                                                                Funny story is that at Google, which traditionally had very fast web apps and then fell off a cliff ~2010 or so, people knew this was a problem. There were some proposals to slow down the company network once a week to respect our mobile users. (The internal network was insanely fast, both throughput and latency wise). This never happened while I was there. Seems like a good idea but there was no will to do it.

                                                                The truth is that 99% of changes to web apps never get tested on a mobile network or mobile device. It slows down development too much. Why do you think there is the mobile app simulator in desktop Chrome? Because that’s how people test their changes :)

                                                                Once there’s a slowdown, it’s fairly hard to back out after a few changes are piled on top. So in this respect Google is like every other company that does web dev. There is no magic. It’s just a bunch of people writing JavaScript on top of a huge stack, and they are incentivized to get their jobs done.

                                                                   If they did test on mobile, employees generally had the latest phones because they were given out as gifts every year (for a while). And testing on the phone doesn’t solve the problem of testing on an insanely fast network.

                                                                1. 1

                                                                  Thanks for clarifying and expanding. My faith in the free market and competition has taken a dent.

                                                      2. 13

                                                        Actually, ironically the web solves this problem with process-based concurrency and stable protocols and interchange formats.

                                                        So if I have a web calculator app, a web spreadsheet, and a web mail app, I don’t ship the GUI with the app.

                                                        Instead I emit protocols and languages that cause the GUI to be displayed by the browser.

                                                       In some sense you do move some of the bloat to the browser, but it’s linear bloat and not multiplicative bloat like with Linux desktop apps.

                                                        You are forced to do feature detection (with JS) rather than version detection, but that’s good! That is, the flaky model of solving version constraints in a package manager is part of what leads to DLL hell.

                                                        Also, it’s much easier to sandbox such a web app than one that makes a lot of direct calls to the OS.


                                                        So the web is more Unix-y and avoids a lot of the problems that these Linux desktop apps have. Although I guess we recreated a similar problem again on top of it with JS package managers :-/

                                                        1. 2

                                                         But so many things are just not possible with the web. Good luck having a DaVinci Resolve or SolidWorks (the real thing, not something with 1/100th of the features, format and hardware support) editing things that require 32+ GB of RAM to work semi-comfortably, in a web page.

                                                          1. 1

                                                            Definitely true, I’m just pointing out an underexamined benefit of the web architecture for certain (simple) apps. There are plenty of problems with the web for that use case too, in particular that it’s most natural to write everything in JavaScript!

                                                        2. 11

                                                          Well, after a fashion, we do already have COM in Linux, via Wine.

                                                       And by sheer headcount, most of the applications I run (via Proton, née Wine) are all using COM and the Win32 APIs.

                                                          If you want a solid API to bang against on Linux–at least for gaming–use Microsoft APIs. :3

                                                          1. 9

                                                            Yeah I think I saw a pithy tweet about that recently!!!

                                                            I can’t find it but this is similar: https://twitter.com/badsectoracula/status/1181574065817038850

                                                            “the most stable ABI on Linux is Wine”

                                                            I’m not really familiar with the gaming world but it seems like this is a common thing: https://news.ycombinator.com/item?id=22922774

                                                            So yeah Linux is bad at API design and stability so we have converged on something that is known to work, which is Win32 :-/

                                                          2. 4

                                                            [T]he entire Windows XP installation is under 128 MB…[a]nd it runs in 32 MB of RAM.

                                                         That doesn’t sound like a full install. Officially it required 64 MB of RAM but disabled things like wallpaper below 128 MB; unofficially, if you want to run any software, the real number is much higher. Nonetheless the point remains valid.

                                                            I wonder if “DLL hell” was really people NOT using COM…[o]r just using it poorly

                                                            AFAICT COM is a bit of a red herring. COM allows for C++ style objects to be expressed with a stable ABI. But a lot of Windows is built on a C ABI where objects are not required. The key part is following the rules of either approach to ensure the ABI remains stable. Unfortunately this creates a situation where one person anywhere who makes a serious mistake can cause misery everywhere - it requires developers to be perfect. Frankly though, the vast majority of the time, ABI stability was achieved by just following the rules, and the rules are not that hard to follow.

                                                            I’ve ranted a bit about the lack of ABI stability on Linux libraries before, and agree with the original author’s point about “militant position on free software.” Just like the above, this position doesn’t need to be held universally - if it’s held by any single maintainer of a widely used library, the result is an unstable ABI.

                                                            1. 1

                                                              It might have been 256 MB disk and 64 MB RAM, not sure… But yeah I was surprised when setting the VirtualBox resources how low it was. You can’t run Ubuntu like that anymore.

                                                              1. 1

                                                                Right, because COM doesn’t make you add an IHober2, it’s your own internal discipline.

                                                              2. 3

                                                           On another note, I recently ran Windows XP on modern hardware and it absolutely flies.

                                                                Well, at least until you block a UI thread in Explorer…

                                                                1. 3

                                                                  True, but that happens to me on a daily basis in Ubuntu :-/ Especially with external drives

                                                                  Ubuntu used to be better but has gotten worse. Ditto OS X. I have had to reboot both OSes because of instability, just like Windows back in the day. 10 years ago they were both more stable IME.

                                                                  1. 2

                                                                    You’d also be missing out on a lot of security stuff too. XP was a superfund site of malware back in the day before Microsoft started cleaning stuff up in SP2 and did radical refactoring in Vista.

                                                                    1. 1

                                                                      Sure, I’m not saying we should literally use Windows XP :) I’m just saying it could be a good reference point for efficient desktop software. We’re at least 10x off from that now, and 1000x in some cases. We can be more secure than XP and Vista too :)

                                                              1. 14

                                                                It sounds like the correct title is “GNU cut considered harmful”, since the author points out that other implementations do the right thing.

                                                                That said, this is quite tricky with UNIX in general. The locale that a program should use defines the character set that it should use for input and output. This is propagated by an environment variable and on *NIX systems (unlike Windows, for example) the environment of a program cannot be modified after it starts. This means that you can start your music player in a BIG5 locale, then run a command that monitors its output in a UTF-8 locale and there’s no standard mechanism for telling the music player that it should switch its output to UTF-8.

                                                                On macOS, Apple largely ignores the UNIX locale for precisely this reason (well, technically, NeXT did 30 years ago and Apple simply inherited that choice): The current locale is part of user defaults which has a mechanism to notify applications that a default has changed. AppKit hooks a handler for this notification that checks if the changed default is the locale and, if so, switches the locale of the application. Most command-line tools on Darwin don’t use this mechanism, unfortunately, so you can easily end up with a mismatched locale between command-line and GUI tools. This is also why Apple added the _l-suffixed variants of all of the locale-aware C-standard library functions (most of which were standardised in POSIX2008): it allows you to use C libraries with an explicit locale_t that is picked from the locale in user defaults, rather than whatever happens to be in the environment.
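
                                                                A small illustration of that constraint (Python here, but the mechanism is the same for any process): the locale is derived from environment variables that are frozen into the process at startup, and nothing can push a change into a running program.

                                                                import locale, os

                                                                print(os.environ.get('LANG'))          # e.g. 'en_US.UTF-8', inherited from the parent
                                                                locale.setlocale(locale.LC_ALL, '')    # adopt whatever the environment says
                                                                print(locale.getlocale())              # fixed for the lifetime of the process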

                                                                1. 7

                                                                  Unix and C are defined that way, but I’d argue the environment variable is an incoherent design. It may have worked in the 80’s, but it no longer works in a networked world.

                                                                  The locale is metadata, and metadata describes data. It should generally be shipped along with the data, on the file system or in-band, as HTTP and other protocols do. It doesn’t belong in an environment var that the program reads!

                                                                  You can have files on your disk with different encodings! You get them from the network and store them on disk. The same instance of a program can read two different files in different encodings. grep accepts multiple files :)

                                                                  Oil is UTF-8 only and I believe OpenBSD’s shell made the same choice. Good to know OS X has leaned in that direction too. If you need to process data in another encoding you can convert it with iconv or something first.

                                                                  Although I think we need to add options for LANG=C, so it’s basically C or UTF-8.

                                                                1. 3

                                                                  Aren’t tuple spaces implementable as just a relational database? Insert to put, delete to delete?

                                                                  1. 8

                                                                    The blocking behavior of take() is also important, and I’ve never seen that in a relational DB. As mentioned it’s more of a distribution and concurrency primitive than a data storage and querying primitive. I could imagine the query part approaching the power of SQL, but I don’t think any systems did so.
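
                                                                    A minimal sketch of what I mean in Python (my own names, not any particular tuple-space API): take() blocks on a condition variable until a matching tuple arrives, and removes it atomically under the lock.

                                                                    import threading

                                                                    class TupleSpace:
                                                                        def __init__(self):
                                                                            self._tuples = []
                                                                            self._cond = threading.Condition()

                                                                        def put(self, t):
                                                                            with self._cond:
                                                                                self._tuples.append(t)
                                                                                self._cond.notify_all()     # wake anyone blocked in take()

                                                                        def take(self, pattern):
                                                                            # pattern: a tuple of the same length, where None matches anything
                                                                            def matches(t):
                                                                                return len(t) == len(pattern) and all(
                                                                                    p is None or p == v for p, v in zip(pattern, t))
                                                                            with self._cond:
                                                                                while True:
                                                                                    for i, t in enumerate(self._tuples):
                                                                                        if matches(t):
                                                                                            return self._tuples.pop(i)   # removed: no other reader sees it
                                                                                    self._cond.wait()                    # block until the next put()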

                                                                    1. 2

                                                                      Oh…now that’s an interesting feature.

                                                                      1. 2

                                                                        It’s a good question though, and makes me think that relational databases (and even sqlite) should support some blocking primitive.

                                                                        Although on second thought, I also don’t think you can SELECT + DELETE atomically, where the DELETE depends on the result of the SELECT? Like you would have a LIMIT 1 query and then want to both retrieve that one tuple and delete it atomically.

                                                                        You might be able to simulate it by retrying with exponential backoff on an empty query. Another thing you might want is a blocking PUT/INSERT to simulate the backpressure we were talking about on the other thread. Although the one system I worked with that resembled a tuple space didn’t have this.

                                                                        1. 3

                                                                          In a database you’d SELECT first, then DELETE the row by its primary key, but wrap those in a transaction for atomicity.

                                                                          But my preconception of tuple spaces is that a tuple often represents a small unit of work, like one computation, in which case a database transaction would add many orders of magnitude overhead.
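
                                                                          Here’s a sketch of that with Python’s sqlite3 (table name and schema are made up):

                                                                          import sqlite3

                                                                          conn = sqlite3.connect(':memory:')
                                                                          with conn:
                                                                              conn.execute('CREATE TABLE tuples (id INTEGER PRIMARY KEY, payload TEXT)')
                                                                              conn.execute("INSERT INTO tuples (payload) VALUES ('work-item')")

                                                                          with conn:  # one transaction: the SELECT and DELETE commit (or roll back) together
                                                                              row = conn.execute('SELECT id, payload FROM tuples LIMIT 1').fetchone()
                                                                              if row is not None:
                                                                                  conn.execute('DELETE FROM tuples WHERE id = ?', (row[0],))

                                                                          print(row)  # (1, 'work-item'): fetched and removed atomically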

                                                                          1. 2

                                                                            Well, Postgres’s DELETE … RETURNING extension would let you return the value of a deleted row, I believe. The blocking bit would be the magical bit, on both.

                                                                      2. 4

                                                                        Tuple Spaces are more about coordination and communication than data storage. It’s not just putting and deleting, it’s combinations of instructions and them being atomic.

                                                                        Matching and retrieving a tuple from the store, for example, deletes it so that no other process monitoring the tuple store can read it. It’s more akin to a different kind of message bus/queue than a relational database, and you’d face overhead if you implemented tuple spaces in terms of transactions on a table.

                                                                        1. 1

                                                                          This isn’t it but it solves the same problem, I think this is good for now:

                                                                          http://blog.ezyang.com/2020/10/idiomatic-algebraic-data-types-in-python-with-dataclasses-and-union/

                                                                          I will post it as a top level story since I haven’t seen it before :)

                                                                            1. 2

                                                                              I like this recipe for very high utilization, scalable systems:

                                                                              • heterogeneous threads connected by queues, so that the queue sizes let you know what is underprovisioned
                                                                              • backpressure between stages by limiting queue sizes
                                                                              • load shedding of non-critical requests (analytics, etc.). Dropping work makes the system more reliable under stress!

                                                                              Go was supposed to encourage designs like this, but I have heard that there is very limited use of channels (fixed-size queues) in a lot of Go code.
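
                                                                              A rough Python sketch of the recipe (stage names and queue sizes invented): bounded queues give backpressure, and put_nowait gives load shedding for the non-critical stage.

                                                                              import queue

                                                                              requests = queue.Queue(maxsize=100)    # bounded: a full queue blocks the producer
                                                                              analytics = queue.Queue(maxsize=10)    # non-critical stage

                                                                              def handle(req):
                                                                                  pass                               # stand-in for real work

                                                                              def frontend(req):
                                                                                  requests.put(req)                  # backpressure: blocks while workers lag
                                                                                  try:
                                                                                      analytics.put_nowait(req)      # load shedding: drop rather than block
                                                                                  except queue.Full:
                                                                                      pass

                                                                              def worker():
                                                                                  while True:
                                                                                      handle(requests.get())         # watch qsize() to see what's underprovisioned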

                                                                              1. 2

                                                                                Blocking bounded queues are indeed a simple way to implement backpressure (if you can actually block producers), but they introduce deadlock risks. “Fail-fast” load shedding tends to be more robust than blocking for this reason. Networking-inspired approaches like CoDel can work well for this.

                                                                              1. 4

                                                                                This is a great talk, and I saw a lot of the issues firsthand while hacking on the CPython interpreter for Oil (e.g. http://www.oilshell.org/blog/2018/11/15.html). The surprisingly complex semantics of “a + b” is one example I remember from the talk (which I watched a few years ago).

                                                                                At the beginning of the Oil project I thought I would reuse some of the CPython interpreter for Oil, but that’s no longer the case. That would be essentially importing a lot of undocumented semantics from a specific implementation, which I don’t want. Ironically, writing interpreters in C leaves a lot of room for bugs to hide, but writing interpreters in Python is less prone to that.

                                                                                Python is deceptively clean on the outside :) It has a nice regular syntax (with few of the oddities of, say, Perl or Ruby), but the semantics are full of corner cases. I think this is somewhat inherent in all languages, e.g. Rust too. Reading the source code is often the only way to understand the corner cases.

                                                                                The “specs” often approach the size of the source code, i.e. they essentially need to cover every if statement in the interpreter. That’s why I think executable specs have a lot to recommend them, in contrast to docs like POSIX or the C++ standard.

                                                                                1. 1

                                                                                  Any new language should be designed (i.e. specified) “semantics-first”. Broken syntax is much easier to repair than broken semantics.

                                                                                  1. 5

                                                                                    Sure, let us know how that works out :)

                                                                                1. 1

                                                                                  Hm I don’t understand the motivation for doing it without recursion? Isn’t that code a lot shorter?

                                                                                  It’s an interesting topic, but I found the blog post too dense with code and tests, without a clear explanation of the algorithm, at least not that I could see.

                                                                                  1. 1

                                                                                    I think, at least for an imperative language, the code will be shorter:

                                                                                    #[test]
                                                                                    fn print_all_sets() {
                                                                                        let n = 3;
                                                                                    
                                                                                        // `Gen` comes from the post: each g.gen(1) is a binary choice, and
                                                                                        // done() advances to the next combination until all are produced.
                                                                                        let mut g = Gen::new();
                                                                                        while !g.done() {
                                                                                            let s: Vec<bool> = (0..n).map(|_| g.gen(1) == 1).collect();
                                                                                            eprintln!("{:?}", s)
                                                                                        }
                                                                                    }
                                                                                    
                                                                                    #[test]
                                                                                    fn print_all_sets_recursive() {
                                                                                        let n = 3;
                                                                                        let mut acc = Vec::new();
                                                                                        go(&mut acc, n);
                                                                                    
                                                                                        // Classic recursive enumeration: try both values for the current
                                                                                        // element, recurse on the rest, then backtrack.
                                                                                        fn go(acc: &mut Vec<bool>, n: usize) {
                                                                                            if n == 0 {
                                                                                                eprintln!("{:?}", acc);
                                                                                            } else {
                                                                                                acc.push(true);
                                                                                                go(acc, n - 1);
                                                                                                acc.pop();
                                                                                    
                                                                                                acc.push(false);
                                                                                                go(acc, n - 1);
                                                                                                acc.pop();
                                                                                            }
                                                                                        }
                                                                                    }
                                                                                    

                                                                                    But for me, the main benefit is the directness. I need to think about how to write a recursive function which enumerates all the segments. The imperative version I can just type out.

                                                                                    Regarding the presentation, yeah, I know. At least for me this is still a new idea, so the genre is “presenting new material” rather than “teaching an established topic”, so the post assumes a fair amount of background and careful reading. Should’ve explicitly mentioned this at the start though.

                                                                                    1. 2

                                                                                      Hm so I think the problem you are solving is not necessarily imperative vs. recursive, but abstracting the (combinatorial) iteration behind an API? Basically flattening it? The two examples don’t seem equivalent because the first uses a Gen object and the second doesn’t.

                                                                                      The reason I say that is because I think you can do this with Python generators and yield from recursively. I was going to try to write a demo today but I didn’t have time.
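
                                                                                      Roughly this sketch: the recursion is unchanged, but yield from flattens it into an iterator the consumer can pull from.

                                                                                      def subsets(n, acc=()):
                                                                                          if n == 0:
                                                                                              yield acc                                  # one complete assignment
                                                                                          else:
                                                                                              yield from subsets(n - 1, acc + (True,))   # recurse, pass results through
                                                                                              yield from subsets(n - 1, acc + (False,))

                                                                                      for s in subsets(3):
                                                                                          print(list(s))                                 # same 8 rows as the Rust version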

                                                                                      I think this is also related to “push vs. pull parsers”. In most language parsers you use the stack (recursive descent), which is analogous to the recursive case here. In other parsers, like the event-driven nginx and node.js HTTP parsers, the parser is “inverted” into a state machine, and doesn’t use the stack. So I think that is analogous to what you’re doing, although I didn’t read all the code.

                                                                                      Notably Go uses the stackful, “client pulls” style parser because it always starts a goroutine, which is basically like the Python generator.

                                                                                      I think Rust has the opposite style of iteration, which is maybe why it’s harder to express. If you were willing to start a thread in Rust to generate permutations (which would probably be fine in tests), then you could just use the recursive style and hide it behind an API, no?

                                                                                      1. 2

                                                                                        OK I transcribed the code in this comment and I think it illustrates what I’m getting at:

                                                                                        https://github.com/oilshell/blog-code/blob/master/push-pull/powerset.py

                                                                                        Both styles of code output the same thing, and both are recursive. I think the issue is more how to hide them behind an API, i.e. does the producer or the consumer “own the thread of control”?

                                                                                        Whether you can abstract it behind an API without threads depends on the language. Rust, Ruby, and JS favor the push style, while Python and Go favor the pull style.

                                                                                        Bob Nystrom calls it “internal vs external iterators”: https://journal.stuffwithstuff.com/2013/01/13/iteration-inside-and-out/

                                                                                        Also related: https://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

                                                                                        It’s not too clear from this example, but for more complicated algorithms I think the pull style is nicer. The push style is more prone to bugs IMO, and for that reason I would avoid it in test code. (I think a lot of the blog post is testing the correctness of test code, which IMO is a bit of a smell.)

                                                                                        That reminds me of this good talk on changing the node.js parser from hand-written nginx style push to a pull style with code generation:

                                                                                        https://lobste.rs/s/76akkn/llhttp_http_1_1_parser_for_node_js_by_fedor

                                                                                        my comment mentioning it: https://lobste.rs/s/rzhxyk/plain_text_protocols#c_gnp4fm

                                                                                        Also reminds me of the discussion we had about whether Pratt Parsing and Shunting Yard are the same algorithm :) https://matklad.github.io/2020/04/15/from-pratt-to-dijkstra.html

                                                                                        I guess I find it annoying that you have to “choose” and would like to just code all the algorithms one way, and have some “compiler” choose the right one. That is exactly what llparse does, and also re2c can generate both push and pull lexers, although one is more mature / favored. Some other parsing tools do it too.

                                                                                        PUSH STYLE
                                                                                        
                                                                                        [True, True, True]
                                                                                        [True, True, False]
                                                                                        [True, False, True]
                                                                                        [True, False, False]
                                                                                        [False, True, True]
                                                                                        [False, True, False]
                                                                                        [False, False, True]
                                                                                        [False, False, False]
                                                                                        
                                                                                        PULL STYLE
                                                                                        [True, True, True]
                                                                                        [True, True, False]
                                                                                        [True, False, True]
                                                                                        [True, False, False]
                                                                                        [False, True, True]
                                                                                        [False, True, False]
                                                                                        [False, False, True]
                                                                                        [False, False, False]
                                                                                        

                                                                                        edit: I neglected to mention the possibility of just generating the entire test matrix up front, i.e. putting it in a big vector. That is perfectly good for this testing use case, no need for any “concurrency” or coroutines. So you can write the generation in whatever style is easiest (no state machines) and then just hide it behind a vector.

                                                                                        1. 1

                                                                                          Not sure – I’d say both your versions are variations of print_all_sets_recursive. The way to see it is that all recursive versions use O(N) stack space, while the gen version uses O(1) stack space. Here’s the Python translation of gen: https://gist.github.com/matklad/77dd480b7b6e7d5eef93074b63b07391

                                                                                          I think if you mechanically replace recursion with iteration, you’ll get some specialization of the Gen trick.
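
                                                                                          To illustrate (my reconstruction of the idea, not the code from the post): record the choices made during a run, then replay them with the last non-maxed choice incremented, like a mixed-radix odometer.

                                                                                          class Gen:
                                                                                              def __init__(self):
                                                                                                  self.v = []        # (choice, max) pairs from the previous run
                                                                                                  self.i = 0         # read position within the current run
                                                                                                  self.started = False

                                                                                              def done(self):
                                                                                                  if not self.started:
                                                                                                      self.started = True
                                                                                                      return False
                                                                                                  while self.v and self.v[-1][0] == self.v[-1][1]:
                                                                                                      self.v.pop()                   # drop choices already at their max
                                                                                                  if not self.v:
                                                                                                      return True                    # odometer overflowed: all runs done
                                                                                                  choice, hi = self.v[-1]
                                                                                                  self.v[-1] = (choice + 1, hi)      # bump the last incrementable choice
                                                                                                  self.i = 0
                                                                                                  return False

                                                                                              def gen(self, hi):
                                                                                                  if self.i < len(self.v):
                                                                                                      choice = self.v[self.i][0]     # replay a recorded choice
                                                                                                  else:
                                                                                                      self.v.append((0, hi))         # a fresh choice defaults to 0
                                                                                                      choice = 0
                                                                                                  self.i += 1
                                                                                                  return choice

                                                                                          g = Gen()
                                                                                          while not g.done():
                                                                                              print([g.gen(1) == 1 for _ in range(3)])   # prints all 8 length-3 subsets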

                                                                                          1. 1

                                                                                            OK but what I’m saying is: Why use the Gen style at all?

                                                                                            You can just write the “obvious” thing and materialize the entire output into a big vector. Then run your tests.

                                                                                            Or start a thread / coroutine if you really want it to be incremental.

                                                                                            I’m not convinced the Gen code is the “right” way to do it iteratively; it looks like it has a computational complexity problem with range(len(self.v)) in done(). In any case I find the Python a lot clearer than the Rust :) IMO Rust obscures the algorithm.


                                                                                            I actually thought about this type of problem a bit because shell has this construct:

                                                                                            $ echo _{a,b{c,d}}_{e,f}_
                                                                                            _a_e_ _a_f_ _bc_e_ _bc_f_ _bd_e_ _bd_f_
                                                                                            

                                                                                            It requires two instances of recursion to evaluate – one for the cross product and one for the “nesting”. So I was wondering how to do it without recursion. But I don’t think there’s any real problem with doing it with recursion, even in production code. You can overflow the stack, but that happens in every interpreted language.
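
                                                                                            To make the two recursions concrete, a Python sketch (the word representation is mine): a word is a list of parts, and each part is either a literal string or a list of alternative words.

                                                                                            def eval_word(parts):
                                                                                                results = ['']
                                                                                                for part in parts:        # recursion #1: the cross product across parts
                                                                                                    if isinstance(part, str):
                                                                                                        alts = [part]
                                                                                                    else:                 # recursion #2: each alternative is itself a word
                                                                                                        alts = [s for w in part for s in eval_word(w)]
                                                                                                    results = [r + a for r in results for a in alts]
                                                                                                return results

                                                                                            # _{a,b{c,d}}_{e,f}_
                                                                                            word = ['_', [['a'], ['b', [['c'], ['d']]]], '_', [['e'], ['f']], '_']
                                                                                            print(' '.join(eval_word(word)))   # _a_e_ _a_f_ _bc_e_ _bc_f_ _bd_e_ _bd_f_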

                                                                                    1. 15

                                                                                      Seeing the examples use floats for currency made my eye twitch uncomfortably. None of the code I’ve written for financial institutions did that. Is that really done in this space?

                                                                                      1. 37

                                                                                        I used to work at a firm that did asset valuation for bonds, and we used floats :).

                                                                                        It’s generally fine to use floats when it comes to asset valuation, since the goal is to provide an estimate for the price of a security. Our job was to estimate the value of a bond, discounting its price based on 1) the chance that the issuer will go bust or 2) the chance that the issuer will pay off their debt early.

                                                                                        My understanding is that floats are never used in places where a bank is dealing with someone’s physical assets (it would be a disaster to miscalculate the money deposited into someone’s account due to rounding errors). Since our firm was not dealing with money directly, but instead selling the output of statistical models, floats were acceptable.

                                                                                        1. 9

                                                                                          That makes absolute sense to me. Thanks for sharing the difference. We were dealing with transactions (and things like pro-rated fees, etc.) so even for things where it made sense to track some fraction of a cent, it was “millicents” and integer arithmetic. I wasn’t thinking in terms of model output.

                                                                                          1. 4

                                                                                            it would be a disaster to miscalculate the money deposited into someone’s account due to rounding errors

                                                                                            IME the really really hard thing is that summing floats gives different answers depending on the order you do it in. And summation operations appear everywhere.

                                                                                          2. 7

                                                                                            @jtm gave you a bit more detail; the original post offers this in the Other notes section:

                                                                                            One of things that tends to boggle programmer brains is while most software dealing with money uses multiple-precision numbers to make sure the pennies are accurate, financial modelling uses floats instead. This is because clients generally do not ring up about pennies.

                                                                                            1. 6

                                                                                              Ah I missed this, but yes – exactly this.

                                                                                              This is because clients generally do not ring up about pennies.

                                                                                              An amusing bit about my old firm: oftentimes, when a bond is about to mature (i.e. the issuer is about to pay off all of their debt on time), the value of a bond is obvious, since there is a near-zero chance of the issuer defaulting. These bonds would still get run through all the models, and accrue error. We would often get calls from clients asking “why is this bond priced at 100.001 when it’s clearly 100?” So sometimes we did get rung up about pennies :).

                                                                                              1. 2

                                                                                                If that was there when I read it, I overlooked it because my eye was twitching so hard.

                                                                                                1. 2

                                                                                                  It’s completely possible they added the Other notes section later! Just wanted to share since it addressed your question directly.

                                                                                              2. 3

                                                                                                I never wrote financial code, but I also never understood the desire to avoid floats / doubles. They should have all the precision you need.

                                                                                                Decimal is a display issue, not a calculation issue. I think the problem is when you take your display value (a string) and then feed it back into a calculation – then you have lost something.

                                                                                                It’s like the issue with storing time zones in your database vs. UTC, or storing escaped HTML in the database (BAD), etc.

                                                                                                Basically, if you do all the math “right”, with full precision, then you should be less than a penny off at the end. I don’t see any situation where that matters.

                                                                                                Although on the other side, the issue is that “programmers make mistakes and codebases are inconsistent”, and probably decimal can ameliorate that to some extent.

                                                                                                I also get that it’s exact vs. inexact if you advertise a 0.1% interest rate, but I’d say “meh” if it’s a penny. It’s sort of like the issue where computer scientists use bank account balances as an example of atomic transactions, whereas in real life banks are inconsistent all the time!

                                                                                                1. 11

                                                                                                  I also never understood the desire to avoid floats / doubles.

                                                                                                  Addition isn’t associative, so the answers you get from summations are less predictable than you would like.
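
                                                                                                  For example, with IEEE-754 doubles:

                                                                                                  a = (0.1 + 0.2) + 0.3
                                                                                                  b = 0.1 + (0.2 + 0.3)
                                                                                                  print(a)         # 0.6000000000000001
                                                                                                  print(b)         # 0.6
                                                                                                  print(a == b)    # False: same addends, different order, different sum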

                                                                                                  1. 7

                                                                                                    I think in practice the issue may actually be that floats can be too precise. Financial calculations are done under specific rules for e.g. rounding, and the “correct” result after multiple operations may actually be less mathematically accurate than if you’d just used 64-bit floats, but the auditors aren’t going to care about that.

                                                                                                    1. 4

                                                                                                      It’s not just that, it’s that the regulations are usually written to require that they be accurate to a certain number of decimal digits. Both the decimal and binary representations have finite precision and so will be wrong, but they’ll be differently wrong. Whether the binary floating-point representation is ‘too precise’ is less important than the fact that it will not give the answer that the regulators require.
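
                                                                                                      A concrete Python example of “differently wrong”: the double nearest to 2.675 is slightly below it, so the two representations round half-up to different answers.

                                                                                                      from decimal import Decimal, ROUND_HALF_UP

                                                                                                      print(round(2.675, 2))    # 2.67: the stored double is really 2.67499999999999982...
                                                                                                      print(Decimal('2.675').quantize(Decimal('0.01'), rounding=ROUND_HALF_UP))    # 2.68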

                                                                                                    2. 4

                                                                                                      like @lann and @david_chisnall mentioned, it’s not about being precise, it’s about getting the answer expected by the accountants and bookkeepers and finance people. Back when they were doing it all on paper, they built certain rules for handling pennies, and you have to do it the same way if you want to be taken seriously in the finance/banking/accounting industries. Back then they couldn’t cut a physical penny in half, so they built rules to be fair about it. Those rules stuck around and are still here today and are sometimes codified into law[0]

                                                                                                      As for “meh, it’s a penny”: they generally don’t care much about anything smaller than a penny, but they absolutely care about pennies. I regularly see million-dollar transactions held up from posting because the balancing was off by 1 penny. They then spend the time it takes to track down the penny difference and fix it.

                                                                                                      0: PDF paper about euro rounding

                                                                                                      1. 1

                                                                                                        How do you store 1/3 with full precision?

                                                                                                        1. 1

                                                                                                          Not with a decimal type either :)

                                                                                                          1. 1

                                                                                                            sorry I misread your post

                                                                                                    1. 4

                                                                                                      There’s a lot of great info in here, but I feel Rust isn’t a great language for this kind of thing: there’s a lot of rusty boilerplate clogging up the parts that are actually about how terminals work :( I know enough Rust to read it anyway, but it’s definitely something that’s in the way, and it hampers my understanding of the important concepts.

                                                                                                      1. 2

                                                                                                        I think it just depends on what you’re used to. Given how much of this stuff is low-level system and library calls, it makes sense to pick something that has good support for FFI and reasonable wrappers around those facilities, both of which are true of Rust.

                                                                                                        1. 4

                                                                                                          I would have preferred C, so there are no wrappers :) Not necessarily for production code but for pedagogy.

                                                                                                          1. 5

                                                                                                            I think it’s hard to write something like this, where the goal seems to be to inform and teach, without it becoming copy-paste production code in a variety of systems. In that sense it seems preferable to write the examples in a way that they can be used directly for new software, which is unlikely to be in C in 2021.

                                                                                                            1. 1

                                                                                                              This is a good point.

                                                                                                      1. 7

                                                                                                        Personally I regard malloc() as a fundamentally broken API, with the operating systems not providing a reliable alternative.

                                                                                                        One of my pet hates is all the introductory text telling everybody to check the return value for NULL… resulting in literally gigabytes of broken useless untested code doing ever more arcane and intricate steps to try and handle malloc returning null.

                                                                                                        Most of it is demonstrably broken… and easy to spot… if an out-of-memory handler uses printf… guess what: it’s b0rked. Printf uses malloc. Doh!

                                                                                                        I always wrap malloc() to check for null and abort(). Invoke the Great Garbage Collector in the Sky.

                                                                                                        Thereafter I rely on it being non-null.

                                                                                                        These days I also allocate small or zero swap partitions… by the time you’re swapping heavily, your program is not dead… just unusable. Actually worse than that: your program has made the entire system unusable. So the sooner the OOM killer wakes and does its thing, the better.

                                                                                                        1. 13

                                                                                                          One of my pet hates is all the introductory text telling everybody to check the return value for NULL…

                                                                                                          It’s an extremely important thing to do in embedded systems, many of which are incredibly RAM-constrained (I own at least one board with only 16KB of RAM) and in older “retro” OSs (including MacOS pre-X, and IIRC Windows pre-XP) that don’t have advanced virtual memory.

                                                                                                          Most of it is demonstrably broken… and easy to spot… if an out of memory handler uses printf…. guess what. it’s b0rked. Printf uses malloc. Doh!

                                                                                                          You young fellers didn’t cut your teeth on out-of-memory errors the way I did :) Here’s how you do this: On startup you allocate a block big enough to handle your error-recovery requirements, say 16KB. Sometimes it was called the “rainy day fund.” When allocation fails, the first thing you do is free that block. Now you have some RAM available while you unwind your call chain and report the error.

                                                                                                          In your event loop (or some equivalent) if your emergency block is null you try to reallocate it. Until then you operate in emergency low-memory mode where you disable any command that might use a lot of RAM. (You can also check the heap free space for other clues you’re running low.)
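
                                                                                                          A toy sketch of the shape in Python (only an illustration: Python’s MemoryError is not the same beast as malloc returning NULL, and the names are mine):

                                                                                                          EMERGENCY_SIZE = 16 * 1024

                                                                                                          emergency = bytearray(EMERGENCY_SIZE)     # the rainy day fund, allocated at startup

                                                                                                          def report_error_and_unwind(err):
                                                                                                              print('out of memory:', err)          # stand-in for real recovery

                                                                                                          def on_allocation_failure(err):
                                                                                                              global emergency
                                                                                                              emergency = None                      # free the fund so recovery can allocate
                                                                                                              report_error_and_unwind(err)

                                                                                                          def event_loop_tick():
                                                                                                              global emergency
                                                                                                              if emergency is None:                 # low-memory mode: try to refill the fund
                                                                                                                  try:
                                                                                                                      emergency = bytearray(EMERGENCY_SIZE)
                                                                                                                  except MemoryError:
                                                                                                                      pass                          # still rainy: keep big commands disabled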

                                                                                                          This behavior was baked into old classic-Mac app frameworks like MacApp and PowerPlant. If you didn’t use those frameworks (most apps didn’t), then you damn well rolled your own equivalent. Otherwise your testers or end users would be reporting lots and lots of crashes when memory ran low.

                                                                                                          I never coded for Windows, DOS or AmigaOS, but I bet they had very similar band-aids.

                                                                                                          I always wrap malloc() to check for null and abort(). Invoke the Great Garbage Collector in the Sky.

                                                                                                          That works fine in some use cases like a CLI tool, or a server that can get restarted if it crashes. It’s not acceptable in an interactive application; it makes the users quite upset.

                                                                                                          Back in the late 80s I once came back from vacation to find myself hung in effigy on our team whiteboard, because the artists using my app kept losing their work when they ran into a particular memory crasher I’d introduced just before leaving.

                                                                                                          (Oh, it’s not OK in a library either, because it defeats the calling code’s memory management. If I’m using a library and find that it aborts when something recoverable goes wrong, it’s a sign to stop using it.)

                                                                                                          1. 9

                                                                                                            You young fellers didn’t cut your teeth on out-of-memory errors the way I did :) Here’s how you do this: On startup you allocate a block big enough to handle your error-recovery requirements, say 16KB. Sometimes it was called the “rainy day fund.” When allocation fails, the first thing you do is free that block. Now you have some RAM available while you unwind your call chain and report the error.

                                                                                                            I have been bitten by doing exactly this on a modern OS. If the OS performs overcommit then your rainy-day fund may not actually be accessible when you exhaust memory. You need to make sure that you pre-fault it (writing random data over it should work, if the OS does memory deduplication then writing non-random data may still trigger CoW faults that can fail if memory is exhausted) or you’ll discover that there aren’t actually pages there. I actually hit this with a reservation in the BSS section of my binary: in out-of-memory conditions, my reservation was just full of CoW views of the canonical zero page, so accessing it triggered an abort.

                                                                                                            Similarly, on modern platforms, just because malloc failed doesn’t mean that exactly the same malloc call won’t succeed immediately afterwards, because another process may have exited or returned memory. This is really bad on Linux in a VM because the memory balloon driver often doesn’t return memory fast enough and so you’ll get processes crashing because they ran out of memory, but if you rerun them immediately afterwards their allocations will succeed.

                                                                                                            Back in the late 80s I once came back from vacation to find myself hung in effigy on our team whiteboard, because the artists using my app kept losing their work when they ran into a particular memory crasher I’d introduced just before leaving.

                                                                                                            I think you learned the wrong lesson from this. A formally verified app will never crash from any situation within its reasoning framework, but anything short of that cannot guarantee that it will never crash, especially when running on a system with many other processes that are out of its control (or on hardware that may fail). The right thing to do is not to try to guarantee that you never crash but instead to try to guarantee that the user never loses data (or, at least, doesn’t lose very much data) in the case of a crash. Even if your app is 100% bug free, it’s running on a kernel that’s millions of lines of C code, using RAM provided by the lowest bidder, so the host system will crash sometimes no matter what you do.

                                                                                                            Apple embraced this philosophy first with iOS and then with macOS. Well-behaved apps opt into a mechanism called ‘sudden termination’. They tell the kernel that they’ve checkpointed all state that the user cares about between run-loop iterations, and if the system is low on RAM it can just kill -9 some of them. The WindowServer process takes ownership of the crashed process’s windows and keeps presenting their old contents; when the process restarts it reclaims these windows and draws in them again. This has the advantage that when a Mac app crashes, you rarely lose more than a second or two of data. It doesn’t happen very often, but it doesn’t really bother me now when an app crashes on my Mac: it’s a 2-3 second interruption and then I continue from where I was.

                                                                                                            There’s a broader lesson to be learned here, which OTP made explicit in the Erlang ecosystem and Google wrote a lot about 20 years ago when they started getting higher reliability than IBM mainframes on much cheaper hardware: the higher the level at which you can handle failure, the more resilient your system will be overall. If every malloc caller has to check for failure, then one caller in one library getting it wrong crashes your program. If you compartmentalise libraries and expect them to crash, then your program can be resilient even if no one checks malloc. If your window system and kernel expect apps to crash and provide recovery paths that don’t lose user data, your platform is more resilient to data loss than if you required all apps to be written to a standard that they never crash. If you build your distributed system expecting individual components to fail, then it will be much more reliable (and vastly cheaper) than if you try to ensure that they never fail.

                                                                                                            1. 3

                                                                                                              Ever since I first read about it, I have always thought “crash only software” is the only way to make things reliable!

                                                                                                              1. 2

                                                                                                                I generally sympathize with the “crash-only” philosophy, but an issue with that approach is that sometimes a graceful shutdown path can significantly speed up recovery. (Of course, a counterargument is that not having a graceful shutdown path forces you to optimize recovery for all cases, and that in an emergency where recovery time is critical your app likely already crashed anyway.)

                                                                                                                1. 1

                                                                                                                  One of the original papers benchmarked a graceful shutdown vs a crash and fsck for a journaled file system (ext3? ext4? can’t remember) and found crash and fsck was faster!

                                                                                                                  The actual use case for a graceful shutdown is for things like de-registering from base stations.

                                                                                                                  But I would argue that such “shutdown activity” should be “business as usual”, with the only difference being that new requests for activity get rejected with “piss off, I’m shutting down”, and once it’s done: crash!

                                                                                                                  1. 3

                                                                                                                    Since you brought up filesystems, there is a lesson to be learned from ZFS: “crash and no fsck” is fastest – try to use atomic/transactional/CoW magic to make sure that any crash basically is graceful, since there’s nothing to corrupt.

                                                                                                                  2. 1

                                                                                                                    The idea with most ‘crash-only’ systems (assuming I’m understanding the term correctly - I don’t think I’ve heard it before) is that your shutdown path isn’t special. You have a small amount of uncommitted data at any given time, but anything that actually needs to persist is always persisted. For example, you use an append-only file format that you periodically garbage collect by writing the compacted version to a new file and then doing an atomic rename. You may choose to do the compaction on a graceful shutdown, but you’re also doing it periodically. This has the added advantage that your shutdown code path isn’t anything special: everything that you’re doing on the shutdown path, you’re doing periodically. Your code is effectively doing a graceful shutdown every so often, so that it’s always in a recoverable state.

                                                                                                                    The core mindset is ‘assume that things can fail at any time’. This is vital for building a scalable distributed system because once you have a million computers the probability of one of them breaking is pretty high. Modern software increasingly looks like a distributed system and so ends up needing the same kind of mindset. Isolate whatever you can, assume it will fail at any given time.
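
                                                                                                                    A minimal Python sketch of that pattern (file names are mine): appends are fsynced, and compaction writes a new file and atomically renames it over the old one, so a crash at any point leaves either the old or the new state.

                                                                                                                    import json, os

                                                                                                                    LOG = 'state.log'

                                                                                                                    def append_record(rec):
                                                                                                                        with open(LOG, 'a') as f:
                                                                                                                            f.write(json.dumps(rec) + '\n')
                                                                                                                            f.flush()
                                                                                                                            os.fsync(f.fileno())       # persist before acknowledging

                                                                                                                    def compact(state):
                                                                                                                        tmp = LOG + '.tmp'
                                                                                                                        with open(tmp, 'w') as f:
                                                                                                                            f.write(json.dumps(state) + '\n')
                                                                                                                            f.flush()
                                                                                                                            os.fsync(f.fileno())
                                                                                                                        os.replace(tmp, LOG)           # atomic rename: old or new, never half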

                                                                                                                    1. 1

                                                                                                                      Some background:

                                                                                                                      https://www.usenix.org/conference/hotos-ix/crash-only-software

                                                                                                                      https://lwn.net/Articles/191059/

                                                                                                                      https://brooker.co.za/blog/2012/01/22/crash-only.html

                                                                                                                      I think the full “crash-only” philosophy really requires infrastructure support in a runtime or VM, because sometimes it’s just not acceptable to bring the whole system down. There was some work on “micro-reboot” prototypes of the JVM (and I guess .NET AppDomains were supposed to implement a similar model), but so far AFAIK BEAM/Erlang is the only widely used runtime that implements the “micro-reboot” model.

                                                                                                                      1. 1

                                                                                                                        You make a great point that the sort of recovery-optimizing cleanup one might do in a graceful shutdown path can instead be done periodically in the background. During Win7 there was an org-wide push to reduce system shutdown latency, and I remember doing some work to implement precisely this approach in the Windows Search Engine.

                                                                                                                2. 7

                                                                                                                  It’s an extremely important thing to do in embedded systems…

                                                                                                                  That’s my day job.

                                                                                                                  The way I set things up is this…. in order from best to worst…

                                                                                                                  • If I can do allocation sizing at compile time… I will.
                                                                                                                  • Statically allocate most stuff for worst case so blowing the RAM budget will fail at link time.
                                                                                                                  • A “prelink” allocation step (very much like C++’s collect2) that precisely allocates arrays based on what is going into the link and hence will fail at link time if budget is blown. (Useful for multiple products built from same codebase)
                                                                                                                  • Where allocations are run time configuration dependent… Get the configuration validator to fail before you can even configure the device.
                                                                                                                  • Where that is not possible, fail and die miserably at startup time… So at least you know that configuration doesn’t work before the device goes off to do its job somewhere.
                                                                                                                  • Otherwise record error data (using preallocated resources) and reset… aka. Big Garbage Collector in the Sky. (aka. Regain full service as rapidly as possible)
                                                                                                                  • Soak test the hell out of it and record high water marks.
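
                                                                                                                  A minimal sketch of the first two bullets (the names and the budget figure are hypothetical): everything is statically sized, and a C11 _Static_assert turns a blown RAM budget into a build failure rather than a runtime one.

                                                                                                                  ```c
                                                                                                                  /* Minimal sketch: size everything at compile time and make the build,
                                                                                                                   * not the device, fail when the RAM budget is exceeded. */
                                                                                                                  #include <stdint.h>

                                                                                                                  #define MAX_CLIENTS  8                   /* sized at compile time */
                                                                                                                  #define RX_QUEUE_LEN 64
                                                                                                                  #define RAM_BUDGET   (16u * 1024u)       /* hypothetical budget */

                                                                                                                  typedef struct { uint8_t buf[256]; uint16_t len; } rx_slot_t;

                                                                                                                  static rx_slot_t rx_queue[RX_QUEUE_LEN]; /* worst case, no malloc */
                                                                                                                  static uint32_t  client_state[MAX_CLIENTS];

                                                                                                                  /* Fail the build, not the device, if the budget is blown. */
                                                                                                                  _Static_assert(sizeof(rx_queue) + sizeof(client_state) <= RAM_BUDGET,
                                                                                                                                 "static RAM budget exceeded");
                                                                                                                  ```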

                                                                                                                  I still find, despite all this, that colleagues occasionally write desperate, untested, and untestable attempts at handling OOM conditions.

                                                                                                                  And every bloody time I have reviewed it…. the unwinding code is provably buggy as heck.

                                                                                                                  The key thing is nobody wants a device that is limping along in a degraded low memory mode. They want full service back again asap.

                                                                                                                  1. 1

                                                                                                                    Sounds like “fun”! I’ve got a side project making alternate firmware for a MIDI controller (Novation’s LaunchPad Pro) where I’ve been making a lot of use of static allocation and C++ constexpr … it’s been interesting to see how much I can do at compile/link time. IIRC I’ve been able to avoid all calls to malloc so far.

                                                                                                                    and every bloody time I have reviewed it…. the unwinding code is provably buggy as heck

                                                                                                                    Yeah, this was always a miserable experience developing for the classic Mac OS. QE would keep finding new ways to run out of memory on different code paths, and filing new crashers. But crashing wasn’t an option in a GUI app.

                                                                                                                    1. 3

                                                                                                                      The trick is back pressure.

                                                                                                                      Unwinding is nearly always a bad choice of architecture.

                                                                                                                      To massively oversimplify a typical device… it’s a pipeline from input events, through interacting with (and perhaps modifying) internal state, to output.

                                                                                                                      If something on the output side of that pipeline runs out of resource…. attempting to unwind (especially in a multithreaded real time system) is a nightmare beyond belief.

                                                                                                                      The trick is to either spec the pipeline so downstream always has more capacity / bandwidth / priority than upstream, OR have a mechanism to sniff whether the output queue is getting near full and throttle the flow of input events by some means. (Possibly recursively.)

                                                                                                                      By throttle I mean things like ye olde xon/xoff flow control, blocking, dropping packets, etc…

                                                                                                                      The important principle is to do this as soon as you can, before you have wasted CPU cycles or resources or … on an event that is going to be dropped or blocked anyway.
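
                                                                                                                      A minimal sketch of that input-side throttle in C, assuming a single queue-depth counter and hypothetical high/low water marks:

                                                                                                                      ```c
                                                                                                                      /* Sketch: check downstream occupancy *before* accepting an input
                                                                                                                       * event, and assert flow control early instead of unwinding later. */
                                                                                                                      #include <stdbool.h>
                                                                                                                      #include <stddef.h>

                                                                                                                      #define QUEUE_CAP  128
                                                                                                                      #define HIGH_WATER (QUEUE_CAP * 3 / 4)  /* start throttling here */
                                                                                                                      #define LOW_WATER  (QUEUE_CAP / 4)      /* resume here */

                                                                                                                      static size_t queue_depth;  /* maintained by the output side of the pipeline */
                                                                                                                      static bool   throttled;

                                                                                                                      /* Called before enqueuing each input event. */
                                                                                                                      bool accept_input_event(void)
                                                                                                                      {
                                                                                                                          if (!throttled && queue_depth >= HIGH_WATER)
                                                                                                                              throttled = true;   /* e.g. send XOFF, block, or drop */
                                                                                                                          else if (throttled && queue_depth <= LOW_WATER)
                                                                                                                              throttled = false;  /* e.g. send XON */
                                                                                                                          return !throttled;      /* reject early, before any work is wasted */
                                                                                                                      }
                                                                                                                      ```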

                                                                                                                      1. 1

                                                                                                                        Yeah, I’ve used backpressure in networking code, and I can see it would be important in a small constrained device processing data.

                                                                                                                      2. 1

                                                                                                                        Unrelated to malloc….

                                                                                                                        I was watching this bloke https://www.youtube.com/watch?v=ihe9zV07Lgk

                                                                                                                        His Akai MPK mini was giving me the “buy me’s”….

                                                                                                                        Doing some research other folk are recommending the LaunchPad….

                                                                                                                        What’s your opinion? I find it very interesting that you can reprogram the Novation… Does it come with docs and an SDK and the like?

                                                                                                                        1. 1

                                                                                                                          So, I have the previous-generation LP Pro. They partially open-sourced it in 2015 or so. (I don’t believe the current model is supported.) Instead of releasing the stock firmware, they have a GitHub repo with a minimal framework you can build on. Much of it is still a binary blob. The README describes their reasoning.

                                                                                                                          I found it easy to get started with — there’s even a sort of emulator that lets you run it on your computer for easier debugging.

                                                                                                                          But you really are starting from scratch. There are empty functions you fill in to handle pad presses and incoming MIDI, and some functions to call to light up pads and send MIDI. So even recreating what the stock firmware does takes significant work. But I’m having fun with it.

                                                                                                                    2. 4

                                                                                                                      IIRC Windows pre-XP

                                                                                                                      It’s probably still important to this day. Windows has never supported memory overcommit: every committed page must be backed by physical RAM or the pagefile, so you cannot allocate more memory than the two combined can back. This is why the pagefile tends to be at least as large as the amount of physical memory installed.

                                                                                                                      1. 2

                                                                                                                        One of Butler Lampson’s design principles: “Leave yourself a place to stand.”

                                                                                                                      2. 3

                                                                                                                        These days I also allocate small or zero swap partitions….

                                                                                                                        Have you read this? https://lobste.rs/s/rgv1sv/defence_swap_common_misconceptions_2018

                                                                                                                        It made me reconsider swap, anyway.

                                                                                                                        1. 1

                                                                                                                          No I hadn’t…. it’s a pretty good description.

                                                                                                                          Didn’t learn much I didn’t know beyond the definition of the swappiness tunable….

                                                                                                                          and that something exists in cgroups…. and that it does something with memory pressure.

                                                                                                                          I have been meaning to dig into the cgroup stuff for a while.

                                                                                                                          But yes, the crux of the matter is apps need to be able to sniff memory pressure and have mechanisms to react.

                                                                                                                          For some tasks eg. a big build, the reaction may be… “Hey OS, just park me until this is all over, desperately swapping to give me a cycle isn’t helping anybody!”

                                                                                                                        2. 3

                                                                                                                          I literally woke up this morning thinking about this: 😩

                                                                                                                          guess what. it’s b0rked. Printf uses malloc. Doh!

                                                                                                                          I have not looked up the source code of [any implementation of] printf, but I can’t think of a reason printf would need to call malloc. It’s just scanning the format string, doing some numeric conversions that can use fixed size buffers, and writing to stdout. Given that printf-like functions can be a bottleneck (like when doing lots of logging) I’d think they’d try to avoid heap allocation.

                                                                                                                          1. 2

                                                                                                                            It’s an edge case, a bad idea, and a misfeature, but glibc allows registering callbacks for custom conversion specifiers for printf.
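
                                                                                                                            For reference, a small example of that hook; it mirrors the “widget” example in the glibc manual (register_printf_function is GNU-specific, and newer glibc prefers register_printf_specifier):

                                                                                                                            ```c
                                                                                                                            /* Register %W as a custom printf conversion for a "widget" struct. */
                                                                                                                            #include <printf.h>
                                                                                                                            #include <stdio.h>

                                                                                                                            struct widget { int id; };

                                                                                                                            static int print_widget(FILE *stream, const struct printf_info *info,
                                                                                                                                                    const void *const *args)
                                                                                                                            {
                                                                                                                                const struct widget *w = *(const struct widget *const *)args[0];
                                                                                                                                return fprintf(stream, "widget#%d", w->id);
                                                                                                                            }

                                                                                                                            static int widget_arginfo(const struct printf_info *info, size_t n,
                                                                                                                                                      int *argtypes)
                                                                                                                            {
                                                                                                                                if (n > 0)
                                                                                                                                    argtypes[0] = PA_POINTER;  /* argument is passed as a pointer */
                                                                                                                                return 1;
                                                                                                                            }

                                                                                                                            int main(void)
                                                                                                                            {
                                                                                                                                struct widget w = { 42 };
                                                                                                                                register_printf_function('W', print_widget, widget_arginfo);
                                                                                                                                printf("got %W\n", &w);  /* prints: got widget#42 */
                                                                                                                                return 0;
                                                                                                                            }
                                                                                                                            ```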

                                                                                                                            1. 2

                                                                                                                              For localisation, printf provides qualifiers that allow you to reference arguments by their position (e.g. %1$s). C’s stdarg does not allow you to get variadic parameter n directly, so to support this printf may need to do a two-pass scan of the format string: first it collects the positional references, then it gathers the arguments into an indexed data structure. That indexed data structure needs to be dynamically allocated.
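
                                                                                                                              For example, the POSIX %n$ qualifiers let a translated format string reorder the same argument list:

                                                                                                                              ```c
                                                                                                                              /* POSIX positional parameters: one argument list, two orderings. */
                                                                                                                              #include <stdio.h>

                                                                                                                              int main(void)
                                                                                                                              {
                                                                                                                                  printf("%1$s %2$s\n", "hello", "world");  /* hello world */
                                                                                                                                  printf("%2$s %1$s\n", "hello", "world");  /* world hello */
                                                                                                                                  return 0;
                                                                                                                              }
                                                                                                                              ```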

                                                                                                                              It’s quite common in printf implementations to use alloca or a variable-length array in an inner block of the printf to dynamically allocate the data structure on the stack, but it’s simpler to use malloc and free. It’s also common in more optimised printf implementations to implement this as a fall-back mode, where you just do va_arg until you encounter a positional qualifier, then punt to a slow-path implementation with the string, a copy of the va_list, and the current position (once you’ve seen a positional specifier, you may discover that you need to collect some of the earlier entries in the va_list that you’ve already discarded. Fortunately, they’re still in the argframe).

                                                                                                                              To make this even more fun, printf is locale-aware. It will call a bunch of character-set conversion functions, localeconv, and so on. If the locale is lazily initialised then this may also trigger allocation. Oh, and as @dbremmer points out, GNU-compatible printf implementations can register extensions. FreeBSD libc actually has two different printf implementations: a fast one, and one that’s called if you have ever called register_printf_function.

                                                                                                                              To put printf in perspective: GNUstep has a function called GSFormat, which is a copy of a printf implementation, extended to support %@ (which is a fairly trivial change, since %@ is just %s with a tiny bit of extra logic in front). I was able to compile all of GNUstep except for this function with clang, about a year before clang could compile all of GNUstep. The printf implementation stressed the compiler more than the whole of the rest of the codebase.

                                                                                                                              Java’s printf is even more exciting. The first time it’s called, it gives pretty much complete coverage of the JVM spec. It invokes the class loader to load locales (Solaris libc does this as well - each locale is a separate .so that is dlopened to get the locale, most other systems just store locales as tables), generates enough temporary objects that it triggers garbage collection, does dispatch via interfaces, interface-to-interface casts, and dispatches at least one of every Java bytecode.

                                                                                                                              Whatever language you’re using, there’s a good chance that printf or the local equivalent is one of the most complex functions that you will ever look at.

                                                                                                                              1. 1

                                                                                                                                Sigh … It’s always the most obscure 10% of the feature set that causes 90% of the complexity, isn’t it?

                                                                                                                              2. 1

                                                                                                                                Yeah, I thought the same thing. printf is a little interpreter and it doesn’t need to malloc. sprintf doesn’t either; that’s why snprintf has the mode where it returns the number of bytes you need to allocate.
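
                                                                                                                                Presumably that’s the snprintf(NULL, 0, …) idiom; a sketch of measure-then-allocate with a made-up format string:

                                                                                                                                ```c
                                                                                                                                /* snprintf with a NULL buffer only measures: it returns the number
                                                                                                                                 * of bytes the formatted string would need, writing nothing. */
                                                                                                                                #include <stdio.h>
                                                                                                                                #include <stdlib.h>

                                                                                                                                char *format_alloc(int value)
                                                                                                                                {
                                                                                                                                    int n = snprintf(NULL, 0, "value=%d", value);  /* measure only */
                                                                                                                                    if (n < 0)
                                                                                                                                        return NULL;
                                                                                                                                    char *buf = malloc((size_t)n + 1);
                                                                                                                                    if (buf != NULL)
                                                                                                                                        snprintf(buf, (size_t)n + 1, "value=%d", value);
                                                                                                                                    return buf;
                                                                                                                                }
                                                                                                                                ```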

                                                                                                                                1. 1

                                                                                                                                  I’ve only just skimmed the printf code in glibc. All it does is call vfprintf with stdout as the file. It does appear that vfprintf allocates some work buffers to do its thing, but I did not dive too deeply because the code is hard to read (lots of C preprocessor abuse) and I just don’t have the time right now.

                                                                                                                                  1. 1

                                                                                                                                    So I thought too, until I set a breakpoint….

                                                                                                                                    I don’t think we were using standard GNU libc at that point, so your mileage may vary.

                                                                                                                                    I have the vaguest of memories that it was printing 64-bit ints (on a 32-bit target) that triggered it.

                                                                                                                                  2. 2

                                                                                                                                    These days I also allocate small or zero swap partitions…. by the time you’re swapping heavily… your program is not dead… just unusable

                                                                                                                                    Swap still has its uses. I’ve got a server which can evict most of the application because ~50% of its memory is lower JIT tiers that will never be hit after the first minute of runtime. Without swap, I’d have to run two servers. With swap, 1 GB of it is used, and I can run two copies of the server without the OOM killer kicking in. 60% of swap is marked used, but almost never touched.

                                                                                                                                    1. 3

                                                                                                                                      Yes. Also, the more unused anonymous pages you can swap out, the more RAM you have for pagecache to accelerate I/O.

                                                                                                                                  1. 8

                                                                                                                                    FWIW I gave up on this post because it displays nothing in my browser without JS …

                                                                                                                                    1. 1

                                                                                                                                      Yeah I couldn’t grab it into Wallabag.

                                                                                                                                      1. 1

                                                                                                                                        I have JS enabled and it’s still a black screen for me…

                                                                                                                                      1. 5

                                                                                                                                        I’ve been working with the understanding that NPM package dependency hierarchies are deep because the browser API environment did not provide much in the way of a standard library. Therefore tiny packages are useful enough to distribute and get versioned instead of ignored, and large packages will bring many others with them. The platform is low, so the stack goes high. Then packages get versioned at high frequency, often cascading through dependency trees.

                                                                                                                                        But that’s just how it went, because despite its limitations, the JavaScript environment is the web, and the web is where it’s at. Location, location, location.

                                                                                                                                        A way out of the mess is for the language to level up so we don’t feel the need for Babel, and for the most-wanted third-party things to be added as standards, like web components. That has been happening, but the mess of packages is still there and growing. We need to wean ourselves off of huge dependencies that bring the kitchen sink with them and move onto standards, so that our dependency graphs become shallow and livable.

                                                                                                                                        1. 5

                                                                                                                                          This is a long argument, but I think programmers should avoid “pyramid-shaped” and hard-coded dependency graphs.

                                                                                                                                          Package managers like npm and Cargo encourage this style of programming (“pile some more stuff on top”) with their notion of transitive dependencies. I also had a lot of experience with this at Google – people would always ask why building a low-level web server linked in code maintained by the maps team, or things like that. Or why changing an application header unexpectedly caused you to need to rebuild the world.

                                                                                                                                          It was always a transitive dependency problem. (Incidentally, that’s the genesis of the “include what you use” tool that was on the front page yesterday.)

                                                                                                                                          What results in smaller and more stable software is programming to interfaces rather than implementations, in the style of dependency inversion. Unfortunately, it seems like no package manager works that way.

                                                                                                                                          sqlite and Lua are two examples of pure libraries that “invert” all their dependencies. They make them all optional and have well-defined and documented interfaces.

                                                                                                                                          They don’t depend on a huge pyramid of libraries – you can build them with just a C compiler.
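
                                                                                                                                          A sketch of that inverted style in C; the names are hypothetical, but the shape matches Lua’s lua_Alloc hook: the library declares the interface it needs, and the host supplies the implementation.

                                                                                                                                          ```c
                                                                                                                                          /* The library's only contract with the outside world is an interface
                                                                                                                                           * the host must fill in; the library itself never calls malloc. */
                                                                                                                                          #include <stddef.h>
                                                                                                                                          #include <stdlib.h>

                                                                                                                                          typedef void *(*alloc_fn)(void *ud, void *ptr, size_t osize, size_t nsize);

                                                                                                                                          typedef struct {
                                                                                                                                              alloc_fn alloc;     /* host-provided allocator */
                                                                                                                                              void    *userdata;  /* opaque host state passed back on each call */
                                                                                                                                          } library_ctx;

                                                                                                                                          /* The host wires in whatever allocator it likes: */
                                                                                                                                          static void *host_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
                                                                                                                                          {
                                                                                                                                              (void)ud; (void)osize;
                                                                                                                                              if (nsize == 0) { free(ptr); return NULL; }
                                                                                                                                              return realloc(ptr, nsize);
                                                                                                                                          }

                                                                                                                                          static library_ctx ctx = { host_alloc, NULL };
                                                                                                                                          ```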