Threads for chriswarbo

    1. 1

      It’s so handy to spin up a “proper” database as a single, temporary, in-memory process like this. I also like choosing appropriate tools for the job (from a well-stocked toolbox): I’ve not used DuckDB, but will keep it in mind for columnar data; I’ve also found clickhouse-local to be useful, e.g. for joining and querying a bunch of standalone files.

      Shell pipelines are also underappreciated; the common sentiment of avoiding Bash scripts in favour of e.g. Python makes sense for scripts which have a bunch of logic and parsing, but Bash is much nicer for piping large amounts of data.

      1. 2

        clickhouse-local

        I wasn’t aware of this variant and am now looking into it - thanks.

        I would guess that it fits into a similar niche as duckdb - i.e. a local OLAP DB.

        Bash here was only meant to be an enabling technology for allowing people to run the examples! But yes I love it for ad-hoc stuff :)

    2. 44

      You already have your solution, but you haven’t tried it. NixOS.

      1. 8

        Note that NixOS isn’t quite completely deterministic, due to the activation script, which is a giant pile of shell commands.

        (It may be mostly deterministic, but as a user you can add basically whatever you want to it and break this.)

        1. 10

          Nix is only as deterministic as its builders and compilers, that’s true. It gives you the scaffolding you need, but you still have to ensure you follow the rules. If you use a builder that creates random data somewhere then you still won’t have a deterministic output. That’s true of pretty much anything that tries to solve this problem, though. They are working with tools that make it a little too easy to violate the rules unexpectedly.

          1. 7

            Yeah I was actually surprised by this, from some brief experience with a shell.nix

            It seems like it is extremely easy to write Nix files “wrong” and break the properties of Nix

            Whereas with Bazel, the build breaks if you do something wrong. It helps you write a correct configuration

            Similar issue with Make – Make provides you zero help determining dependencies, so basically all Makefiles have subtle bugs. To be fair Ninja also doesn’t strictly enforce things, but it’s easy to read and debug, and has a few tools like a dep graph exporter.

            1. 7

              from some brief experience with a shell.nix

              nix-shell is a bit of a lukewarm compromise between Nix’s ability to supply arbitrary dependencies and the guarantees of the build sandbox. It’s not really trying to be hermetic, just ~humane.

            2. 2

              I would like to hear more about your shell.nix experience. It has some issues, for sure, but I have found the platform to be extremely reliable overall.

        2. 2

          Is that really relevant? A determined user could flash their BIOS with garbage and break anything they want. Does it matter that you can go out of your way to break your own NixOS system? The point is that as a user you’re not supposed to mess with your system manually, but you’re supposed to make changes by changing your configuration.nix. This is not the case for most other distros where you make changes to files under /etc side by side with files maintained by the distro packages.

          1. 3

            The point is that I don’t think anyone is checking. IDK how many options in nixpkgs break this. I don’t think we need to stop the user from purposely breaking it but we should make “official” flags not break it and make it easy for users to not break this.

      2. 6

        I did try Nix (briefly).

        However, I do think there is a place for classical distributions, as there is a place for distributions like Nix, Guix, and even something like GoboLinux.

        1. 24

          Pretty much every concern you share in that post is solved by NixOS. Some points (with a small configuration sketch after the list):

          • You don’t modify files in /etc, you use the NixOS configuration to do so
          • Changes are idempotent
          • Changes are reversible, there’s a history of config revisions called generations, or you can just remove the config and apply to roll forward
          • System configuration is reproducible and deterministic
          • Services are systemd units declared through the configuration
          • Packages reference their direct dependencies allowing for multiple versions of the same lib/package/etc
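
          For a flavour of what the first and fifth points look like in practice, here’s a hedged sketch of a configuration.nix fragment (the service name, file contents and package are made up):

          { pkgs, ... }: {
            # Instead of editing /etc by hand, declare the file; rebuilds are idempotent
            environment.etc."foo.conf".text = "foo = 1";

            # A service is just a systemd unit declared in the configuration
            systemd.services.my-job = {
              wantedBy = [ "multi-user.target" ];
              serviceConfig.ExecStart = "${pkgs.hello}/bin/hello";
            };
          }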

          NixOS isn’t perfect by any means, but it is a solution you seem to be looking for. Guix probably as well, but it’s a smaller and more limited distro.

          I took the multiple comments that “NixOS may do this but I’ll have to research” to mean you haven’t tried it. And this entire blog post screams to me that NixOS is a solution for you.

          1. 5

            You don’t modify files in /etc, you use the NixOS configuration to do so

            In a way, that’s second-system syndrome. SUSE Linux was bashed for doing something like that with YaST for a long time…

            In the case of NixOS, I found that the new language (not the syntax, the semantics) changed too fast for my liking…

            System configuration is reproducible and deterministic

            With a few caveats: Software might still get updated (I suppose you can lock that down to specific revisions, but who does?). From a bugfix/security perspective this may be desirable, but it’s not entirely reproducible, and updates can always introduce new bugs or incompatibilities, too.

            1. 6

              All packages are locked in Nix, so “With a few caveats: Software might still get updated” is not a problem there.

              1. 1

                https://www.software.ac.uk/blog/2017-10-05-reproducible-environments-nix offers “download a specific release tarball of buildInput specs” to improve matters. Other than that, buildInputs = [ bash ] may or may not point at the same version. https://github.com/NixOS/nixpkgs/issues/93327 discusses this for 3 years and there doesn’t seem to be a resolution.

                1. 10

                  You’ll get the same version with the same Nixpkgs, every time. buildInputs = [ bash ] will always point at the same bash package for some given Nixpkgs. Package versioning is a different issue.

                  1. 1

                    I count package versioning as part of the system configuration: Sure, your package names are always the same, but the content might differ wildly.

                    1. 12

                      The content will be the same every time with the same system configuration, assuming you’ve pinned your Nixpkgs version.

                    2. 5

                      I think you’re misunderstanding the way nixpkgs works with versions and what the versions actually mean there. Check it out in practice. Unless you update the channels/flakes or whatever you use between runs, nothing will change - there’s no explicit pinning required.

                    3. 4

                      The way I manage this is by using several different versions of nixpkgs: a “known good” version for each package I want to lock (so for agda I would have nixpkgsAgda), the regular nixpkgs, which is pinned to a stable release, and nixpkgsLatest, which is pinned to master.

                      Most of my packages are on stable nixpkgs. Every now and then when I run nix flake update, it pulls in new versions of software. Things pinned to stable have never broken so far. Things pinned to latest are updated to their latest versions, and things pinned to specific packages never change.
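
                      Roughly, the flake inputs end up looking something like this (the branch name and revision are only placeholders, not the exact ones I use):

                      {
                        inputs = {
                          nixpkgs.url       = "github:NixOS/nixpkgs/nixos-<release>";        # stable release branch
                          nixpkgsLatest.url = "github:NixOS/nixpkgs/master";                 # moves on nix flake update
                          nixpkgsAgda.url   = "github:NixOS/nixpkgs/<known-good-commit>";    # frozen, only used for agda
                        };

                        outputs = { self, nixpkgs, nixpkgsLatest, nixpkgsAgda, ... }: {
                          # each package imports whichever nixpkgs input it needs
                        };
                      }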

                      While it does involve pulling in several versions of nixpkgs, I build a lot of software from source anyway so this doesn’t matter to me very much. I do hope that nixpkgs somehow fixes the growing tarball size in the future…

                2. 5

                  buildInputs = [ bash ] may or may not point at the same version

                  That’s a snippet of code written in the Nix expression language. bash is not a keyword in that language, it’s just some arbitrary variable name (indeed, so is buildInputs). Just like any other language, if that bash variable is defined/computed to be some pure, constant value, then it will always evaluate to that same thing. If it’s instead defined/computed to be some impure, under-determined value then it may vary between evaluations. Here’s an example of the former:

                  with rec {
                    inherit (nixpkgs) bash;
                    nixpkgs-src = fetchTarball {
                      sha256 = "10wn0l08j9lgqcw8177nh2ljrnxdrpri7bp0g7nvrsn9rkawvlbf";
                      url = "https://github.com/nixos/nixpkgs/archive/4ecab3273592f27479a583fb6d975d4aba3486fe.tar.gz";
                    };
                  
                    nixpkgs = import nixpkgs-src { config = {}; overlays = []; system = "aarch64-linux"; };
                  };
                  { buildInputs = [ bash ]; }
                  

                  This evaluates to { buildInputs = [ «derivation /nix/store/6z1cb92fmxq2svrq3i68wxjmd6vvf904-bash-5.2-p15.drv» ]; } and (as far as I’m aware) always will do; even if github.com disappears, it may still live on if that sha256 appears in a cache!

                  Here’s an example of an impure value, which varies depending on the OS and CPU architecture, on the presence/contents of a NIX_PATH environment variable, the presence/contents of a ~/.config/nixpkgs/overlays.nix file, etc.

                  with { inherit (import <nixpkgs> {}) bash; };
                  { buildInputs = [ bash ]; }
                  

                  I recommend sticking to the former ;)

                  PS: I’ve avoided the word “version”, since that concept doesn’t really exist in Nix. It’s more precise to focus on the .drv file’s path, since that includes a hash (e.g. 6z1cb92fmxq2svrq3i68wxjmd6vvf904) which is the root of a Merkle tree that precisely defines that derivation and all of its transitive dependencies; whereas “version” usually refers to an optional, semi-numerical suffix on the file name (e.g. 5.2-p15) which (a) is easy to spoof, and (b) gives no indication of the chosen dependencies, compiler flags, etc. which may have a large effect on the resulting behaviour (which is ultimately what we care about).

            2. 4

              In case of NixOS, I found that new language (not the syntax, the semantics) changed too fast for my liking…

              Do you happen to have an example of such a semantic change at hand? Curious, as nix is often regarded as rather conservative, trying to keep backwards compatibility as much as possible.

        2. 7

          Part of the reason Nix/Guix take their particular approach is specifically because the global mutable state of a traditional FHS style distribution is nigh-impossible to manipulate in a deterministic way; not just because of the manipulation by the package manager, but because there is no delineation between mutations by the package manager and those by the user. A great example is /etc/passwd, which contains both locally created users as well as “service accounts” the distro maintainers make to separate out some unprivileged services.

          1. 1

            Relevant writing from Lennart Poettering on state in Linux distributions and how to accomplish a factory reset mechanism.

        3. 1

          There is a place for “classical” (read: legacy) distributions, yes, in the same way there is a place for collecting stamps.

      3. 3

        I want to switch to NixOS so badly, but I also want SELinux support.

      4. 1

        It needs a lot of work to enforce 1) determinism/reproducibility, 2) non-hysteresis (when you stretch an object and it doesn’t return /exactly/ back… like hidden side effects in Ansible), 3) ease of use, & 4) acceptance testing.

    3. 6

      I agree, especially regarding the “monolithic” aspect; although thankfully that seems to be improving. My main problem with flakes is the way they move information out-of-band (e.g. into “registries”, “lock files”, etc.), which seems opposite to my own experience (avoiding out-of-band stuff like NIX_PATH, channels, etc. in favour of plain old .nix files).

      If people like that approach that’s fine; but it’s annoying that some otherwise-useful functionality requires that flakes are enabled. That either requires debugging the nix.conf on colleagues’ machines, VMs, CI setups, EC2 images, etc. or else adding the --extra-experimental-features flag to every invocation and script (then having to explain in code review that it’s simultaneously (a) needed for the thing we’re doing, whilst (b) safe for production because it’s unrelated to the thing we’re doing…)

      The main ones I’ve encountered are nix bundle and nix run; but I also like the idea of the evaluator cache. I think the latter is overly restricted at the moment though; e.g. it blocks filesystem access, when it would be nicer to incorporate the hash of any accessed files in the validation.

    4. 6

      I’m happy to see a unified way to do Nix patterns, even if it’s flawed. In other projects you have so many different cats being skinned to accomplish tasks that it is hard to follow the code. I wish there was less infighting tho, as this could be a community-fracturing feature. Instead of getting fixes & proper development, merge requests & discussions often seem to be met with philosophical opposition & grandstanding, even when they could improve the very (sometimes legitimate) situations being stood against.

      Flawed or not, flakes are better than not-flakes & it has been helping many projects & teams.

      1. 5

        it has been helping many projects & teams

        That sounds true, from what I’ve heard online.

        Flawed or not, flakes are better than not-flakes

        This absolutely does not follow from the above. You (and others) might find them useful; I (and others) might consider them a backwards step. Those are opinions; neither is definitively correct.

        I wish there was less infighting tho as this could be a community-fracturing feature.

        If you wish there was less infighting, then please don’t inflame discussions. For example, I disagree with the article (e.g. I think flakes should be disentangled from other Nix features before cutting a 1.x version; ideally it would live in a separate tool, so those who prefer “pure” Nix can avoid it); however, I think the article is perfectly reasonable, and wouldn’t mind too much if its approach were actually taken (it wouldn’t affect me much directly, other than the continued bloat and churn; indirectly, I’m happy as long as orthogonal functionality doesn’t depend on flakes). In contrast, you seem to be entirely disregarding those who don’t want flakes; which I don’t find reasonable.

        1. 2

          “Pure” Nix has its own set of issues: managing versions + interoperating with foreign Nix code in a standardized way + etc. At least flakes are trying to solve some of them. Split it up & call the parts different names if that helps UX features cross the line, but by “flakes are better than not-flakes” I mean I’m in favor of adjusting base Nix to have better usability & flakes seem to be the only option going in that direction.

          1. 1

            Content-addressing is where I see the most potential, e.g. for decoupling dependencies from their particular definitions; for a global binary cache (e.g. via IPFS, magnet/torrents, etc.); for more seamless integration with git blobs, RDF/JSONLD URIs, etc.

            As for standardising entrypoints/datastructures, that appears to be more suited to linters; rather than making fundamental changes to the tooling, complicating the conceptual model, inventing new syntax (foo:bar/baz#quux.foobar), etc.

        2. 1

          I (and others) might consider them a backwards step

          I haven’t really seen what Nix without flakes would look like and what the path forward of that would be. All the commands and files that don’t use flakes feel relatively awkward…

          1. 1

            I haven’t really seen what Nix without flakes would look like… All the commands and files that don’t use flakes feel relatively awkward…

            In my last job, every project was built like this:

            $ nix-build
            

            Optionally you can give it a --show-trace flag to help debugging, but I prefer to set that globally in my nix.conf.

            That’s literally all you need; everything else can be automated, reproducibly, using Nix expressions. In particular, these are useful:

            • Default values for function arguments: use these to define all your build’s dependencies; including paths to any source code, compilers, libraries, external repos like Nixpkgs, etc. Such default values will be used by nix-build, but are easily overridable, e.g. if we import this project from another.

            • fetchGit/fetchTarball: use these to (a) specify (and cryptographically verify) precisely which files you’re depending on, and (b) move as much boilerplate, policy choices, dependency versions, etc. out of individual projects and into a central place. That way, individual projects can get most of what they need from a single dependency, like { helpers ? import (fetchGit { ... }) }: ... (using the default value of a function argument, as above)

            • import-from-derivation: this lets a Nix expression import/read the output of a derivation; since derivations can run arbitrary commands (with network access, if they’re fixed-output), we can write Nix functions that call out to external tools, like dependency solvers. Note that nixpkgs doesn’t allow code that uses import-from-derivation, since it causes builds to happen during evaluation: that makes sense for a large collection like Nixpkgs (don’t want Haskell dependency solvers slowing us down if we only want some Python packages), but it’s fine for standalone projects.

            There’s currently work on recursive Nix, which would be a nice complement to import-from-derivation (I used to write recursive Nix derivations, but the sandboxing in Nix 2.x made that less reliable).
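
            To make the first two bullets above concrete, here’s a rough sketch of a default.nix in that style (the URL, hash and project name are placeholders, not real values): everything nix-build needs has a default, and a project importing this file can override those defaults.

            # default.nix
            { nixpkgs ? import (fetchTarball {
                url    = "https://github.com/NixOS/nixpkgs/archive/<some-revision>.tar.gz";
                sha256 = "<hash reported by Nix on first use>";
              }) { config = {}; overlays = []; }
            , src ? ./.
            }:
            nixpkgs.stdenv.mkDerivation {
              name = "my-project";
              inherit src;
              buildInputs = [ nixpkgs.bash ];
            }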

            Likewise, the following should be avoided:

            • Environment variables: these are impure, and require external management. Prefer to keep everything in Nix expressions, e.g. using function arguments (and perhaps putting a .override function in the result).
            • <nixpkgs>: this sort of path is taken from the NIX_PATH env var. Avoid env vars: use a function argument, defaulting to fetchTarball with a specific Nixpkgs revision.
            • Channels: again, these are impure and require external management. Prefer pure Nix expressions using fetchGit/fetchTarball.
            • builtins.currentTime: this is a hacky way to invalidate cached derivations. Instead, use builtins.hashFile or builtins.hashString on whatever’s actually invalidating the output. In particular, putting hashes in the name of a fixed-output derivation will cause it to be rebuilt automatically (even if we forget to update the derivation’s output hash).

            My problem with flakes is that it goes against this latter advice, by moving more things out of pure Nix expressions, and into commandline arguments, or externally-managed “registries”, or blessed locations like “lock files”, etc. I also don’t like the increasing reliance on centralised, single-points-of-failure; e.g. instead of making it easier to depend on github:foo/bar it’s better to move further to content-addressing (e.g. the ongoing work on the Nix store layer to interoperate better with IPFS, git, dat, hyperdrive, etc.)

            1. 1

              Thanks for the lengthy reply but understanding most of this is going to take me several hours of painstakingly tracing and trying out nix commands.

              For some reason when I started doing Nix I gravitated directly to flakes and climbing out of that well seems like it would be a lot of effort for an ‘outdated’ way of doing things.

              Also, my interface with nix at the moment is not so much nix build (which is nice when it works, though) as it is nix develop, which turns out to be an absurd pain to make work for any non-trivial project that has its own dependencies (python, cargo, node); then you have to use dream2nix or something, which is more or less undocumented, and well… I gave up and I’m now back to using system global tools for these three, and I’m much happier and much more productive for it.

    5. 9

      I can only speak about NixOS, since I’ve not used any of the others. For this part:

      Meanwhile, NixOS, an immutable distribution, empowers the user and allows them to change many aspects of their system.

      The justification given is that we can mix-and-match the various different configurations and packages available, track the latest unstable git commits, etc. which is certainly true. However, we sometimes want to avoid the pre-provided stuff and just dump random files somewhere in the root folders. NixOS certainly allows that; the only difference from a system like Ubuntu is how we do it. Mutable distros can do it imperatively, e.g.

      sudo cp ~/my-script /usr/bin
      echo "foo" | sudo tee /etc/foo.conf
      

      Whilst on NixOS we do it declaratively, by describing these files in our config, then (atomically) switching to the new system, e.g.

      environment.etc."foo.conf".text = "foo";
      environment.systemPackages = [
        # This relies on ~/my-script remaining in place, so future rebuilds can find it. That's
        # not required when doing `cp`, and is probably a bit silly. Instead, this sort of thing
        # would usually use fetchurl, fetchGit, etc. to get the contents from a stable URL.
        (pkgs.writeScriptBin "my-script" (builtins.readFile ~/my-script))
      ];
      

      The above comparison is actually a bit misleading, since it’s bad practice to make such imperative changes on mutable distros. For example, on Ubuntu it would be better to do something like the following, which is certainly more work than sudo cp!

      mkdir -p my-package/etc
      mkdir -p my-package/usr/bin
      mkdir my-package/DEBIAN
      echo "foo" > my-package/etc/foo.conf
      cp my-script my-package/usr/bin
      {
        echo 'Package: my-package'
        echo 'Version: 1.0'
        echo 'Architecture: all'
        echo 'Maintainer: Me <me@example.com>'
        echo 'Description: My system changes.'
        echo ' Includes foo.conf and my-script'
      } > my-package/DEBIAN/control
      dpkg-deb --build --root-owner-group my-package
      dpkg -i *.deb
      
      1. 2

        Thank you for one of the most succinct, useful, and technical comments I’ve encountered on here in a while. Please, folks, more comments like this and less other stuff.

    6. 7

      I wish Nix/NixOS configuration examples were more discoverable. The search is great (despite some glaring issues), but many times I’ve had the feeling of reinventing the wheel to get work done. Maybe this stuff is obvious to Haskell programmers, but for us mortals it’s not easy to see how to combine mkShell with poetry2nix, why nixpkgs uses with everywhere to save a few characters when it’s clearly evil, which of the many ways to declare an environment variable is idiomatic, why tools are spread around builtins, pkgs, and pkgs.lib, why nixpkgs has about fourteen different Bash shebangs, and a bunch of other things. Fortunately most of the community are really helpful.

      1. 2

        why nixpkgs uses with everywhere to save a few characters when it’s clearly evil

        with is certainly not evil; IMHO it’s the most sensible way to define variables (e.g. in about a decade of using Nix, I’ve never found a use for let/in): https://github.com/NixOS/nix/issues/1361#issuecomment-390420439

        1. 1

          in about a decade of using Nix, I’ve never found a use for let/in

          Could you share some expressions you’ve written? I’m very curious what it ends up looking like when one forgoes let entirely.

          1. 1

            Most of the Nix I’ve written (outside of private, work repos) is in the following three repos; although they’re pretty old so probably quite non-idiomatic (they go back to Nix 1.x, pre-date the overlays mechanism of Nixpkgs, pre-date many builtins like fetchGit, etc. and have been refactored a few times since around 2014):

            I use Nix to build the static site chriswarbo.net (e.g. see the static/nix folder; although some pages also contain embedded code)

            There are a few pages on that site about Nix/NixOS, at http://www.chriswarbo.net/projects/nixos

            A couple of public projects that rely on Nix are the nix-eval Haskell package and the asv-nix plugin for Airspeed Velocity

            Note: Many of the links above go to self-hosted, static HTML generated via git2html; to see their code click on View repository, then a branch (e.g. master) then a commit (e.g. HEAD). Only their HEADs are rendered, to save on disk space. Here’s a full list of my repos and a mirror on GitHub.

            1. 1

              … huh. I’ve never seen that approach, and I’m not sure it’s even equivalent to what I was on about. I meant how with pkgs and the like introduce a huge number of variables into the scope implicitly, making it a brute-force manual job to figure out where any one variable which just pops up without being explicitly declared came from.

              1. 1

                That’s what the inherit keyword is for, e.g.

                with rec {
                  inherit (builtins) toJSON;
                  inherit (pkgs) jq runCommand;
                 
                  nixpkgs = fetchTarball { ... };
                  pkgs = import nixpkgs { config = {}; overlays = []; };
                
                  data = toJSON [ 1 2 3 ];
                };
                runCommand "foo.json"
                  {
                    inherit data;
                    buildInputs = [ jq ];
                  }
                  ''echo "$data" | jq 'map(select(. > 2))' > "$out"''
                
                1. 2

                  Thanks for this! It’s given me something to think about, though I think your usage isn’t as extreme as I thought. with rec { inherit (builtins) toJSON; }; is not meaningfully different to me from let inherit (builtins) toJSON; in, and in general I dislike the context pollution and inability to know the provenance of any given binding when using with builtins; with lib; … at the start of a large expression.

      2. 1

        Yes, for quick throwaway development, it’s very annoying to get out of the gate with Nix. In those cases I really prefer having a global python/rust installed on my system and be able to hack immediately.

        1. 1

          A big problem with that approach is that as soon as you work on more than a single project you run into all sorts of conflicts because projects depend on different versions of Python/Rust/packages. Not to mention that you can seriously break Ubuntu and other non-NixOS distros if you replace the system Python with a different version.

          1. 1

            Pyenv and Rustup I think both offer project-specific versions of everything and that has worked perfectly for me so far. Much better than anything Nix has been able to provide so far (despite sinking lots of weekends into it).

            1. 1

              I don’t understand - I thought both of those install into isolated environments, and not into the global system?

    7. 1

      Could something like this be accomplished in Zig via comptime?

      1. 1

        In theory, you could do it with any static language if you write a verification-condition generator to integrate with Why3 or Boogie. They do the proving. A language with metaprogramming might need an initial pass that does compile-time evaluation. Metaprogramming and dynamic languages are difficult in general. Worst case, you can use subsets and/or annotations to aid the analyses.

        1. 2

          That reminds me of the different approaches to handling declarative dependencies in Nix (in my case that’s mostly Haskell libraries with version bounds):

          • One approach is to have our Nix function (e.g. buildHaskellPackage) implement a constraint solver, which reads in version bounds from each version of each dependency, and picks a mutually-compatible set of dependencies to build.
          • A more practical approach is to just shell-out to an existing solver (cabal in Haskell’s case) and parse its output.

          Whether such build-time analysis is performed “within” the language via macros, or externally as a separate step of the build process, the same solvers can be called and the end result is the same (for checkers like this, there’s also nothing to parse: if a solution is found then we throw it away and carry on to the next step, if not we exit with an error message).

          I used to dislike the thought of performing I/O at compile-time, but I’m seeing more and more compelling use-cases: shelling out to competent solvers is one; “type providers” like F# are another (where types can be generated from some external source of truth, like a database schema; ensuring out-of-date code fails to build). One I’ve used recently was baking data into a binary, where a macro read it from a file (aborting on any error), parsed it, built a datastructure with efficient lookups, and wrote that into the generated AST to be compiled. This reduced the overhead at runtime (this command was called many times), and removed the need for handling parse errors, permission denied, etc.
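
          In Haskell, for instance, a rough Template Haskell sketch of that last idea might look like this (the file name, format and resulting data structure are purely illustrative):

          {-# LANGUAGE TemplateHaskell #-}
          import Language.Haskell.TH        (runIO)
          import Language.Haskell.TH.Syntax (lift)

          -- Read and parse a file at compile time; any failure (missing file,
          -- unparseable line) aborts the build instead of surfacing at runtime.
          bakedIn :: [Int]
          bakedIn = $(do
            txt <- runIO (readFile "data.txt")      -- compile-time I/O
            lift (map read (lines txt) :: [Int]))   -- parsed values spliced in as a literal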

          1. 2

            Yeah, integration with external tools can help in all kinds of ways. The simplest is static analysis to find code-level issues the compiler can’t find. I really like your idea of baking the data into a binary. It’s like the old idea of precomputing what you can, mixed with synthesis of efficient data structures. That’s pretty awesome.

            Actually, I’ve been collecting, and occasionally posting, stuff like that for formal methods. Two examples were letting someone specify a data structure in a functional way, or modify/improve loops. Then, an external tool does a pass to make an equivalent, high-performance, imperative implementation of either, and dumps it out as code. Loop and data structure examples. Imperative/HOL’s technique, if generalizable, could probably be applied to languages such as Haskell and Rust.

    8. 2

      I see it uses SMT solvers to check the annotations. I know that Z3 is quite impressive, but am interested in how scalable this would be. Does the language design ensure these checks are kept ‘local’ (i.e. adding N definitions/calls adds O(N) time to the build), or can the constraints interact over a longer distance (potentially causing quadratic or exponential time increase)? I’d also like to know if there are common failure-modes, e.g. where some common code pattern can’t be solved without extra annotations.

      For example, the classic “hello world” for dependent types is to define Vector n t (lists containing n elements of type t), and the function append : Vector n t -> Vector m t -> Vector (plus n m) t whose output length is the sum of the input lengths. Woe betide anyone who writes plus m n instead, as they’d have to prove that plus commutes!

      1. 3

        With SMT based verification, each property is proved separately and assumed true when proving others. So there’s no problem with locality. The problems are very different vs. dependent types. SMT is undecidable, so unless the tool is very careful about using decidable logic (like Liquid Haskell is), you’ll very often see that the solver just times out with no solution, which is pretty much undebuggable. It’s difficult to prove anything through loops (you have to write loop invariants), etc.

        1. 2

          In this case, SMT isn’t undecidable: a friend pointed out the write!(solver,"(set-logic QF_UFBV)\n").unwrap(); line to me, which means “quantifier free with uninterpreted functions and bitvectors”. It’s decidable, just super hard.

          1. 2

            Yeah, if you can get away with quantifier-free, everything is much better. But you can’t do much with C code without quantifiers. zz probably doesn’t support verifying loops.

            1. 2

              I agree, unless ZZ has its own induction system for loops, using SMT only to prove the inductive hypothesis. It’s a lot of work though.

      2. 2

        Does the language design ensure these checks are kept ‘local’ (i.e. adding N definitions/calls adds O(N) time to the build), or can the constraints interact over a longer distance (potentially causing quadratic or exponential time increase)?

        I’m also interested in this in terms of API stability. If interactions happen at a longer distance it can be difficult to know when you are making a breaking change.

      3. 2

        Z3 “knows” addition commutes, so that’s no problem. The usual trouble with dependent types is that you define addition yourself, so it is hard for Z3 to see that your addition is in fact addition.

        1. 1

          Yes, I gave that example for dependent types (the type requiring a different sort of branching than the computation) since I’m more familiar with them.

          The general question was about the existence of pathological or frustrating cases in this SMT approach (either requiring much more time, or extra annotations) arising from very common assertion patterns, or where the/a “obvious” approach turns out to be an anti-pattern.

    9. 3

      Tiling WMs never appealed to me for one simple, shallow reason: I like seeing nice decorations on my windows. Having windows overlap doesn’t bother me. Indeed, I occasionally arrange them that way on purpose. I have 11 virtual desktops (KDE), and I map my numpad keys to them, so every desktop is a single keystroke away (modifier keys are not even needed). I can have 30+ windows, many of those having several tabs (web browser, terminal), and everything’s within reach via numpad, Alt-Tab and Ctrl-Tab. This setup has worked very well for me for many years. (Not to be construed as me saying that nobody should use tiling WMs.)

      1. 3

        My (tiling) window borders are a single pixel wide; just enough to highlight the focused window in blue, with unfocused in grey. I also don’t see my desktop background much, since windows are arranged to cover the whole screen (usually just one fullscreen window at a time, in my case).

    10. 2

      How does a tiling wm work for me? It’s my understanding that I have to set it up “my way”, according to each task in my daily flow. What if I don’t have a flow?

      My usual gnome flow on my desktop is just meta key to find an app and launch it, alt tabbing my way to an open one, and meta+arrow to move my current app to this or that corner. When I hook my windows laptop from work, I might also need to move my thing to another window, ctrl-meta-arrow does it. I usually open terminator where I split it up with term panes with ctrl-e or ctrl-o and then with arrows, as by default I just split it in two panes. I open IntelliJ, and there I sometimes want to fiddle with its panes with a mouse, same with Firefox or Chrome dev tools. Occasionally mail app is open, and a comms like slack or teams.

      Sometimes I need to adjust the width of the window manually, with the mouse, because I’m really interested in those logs, or that image. Or open an unusual app like calc or excel or whatnot.

      Luckily, gnome and apps remember where they’ve been last, usually.

      What would a tiling wm bring me? How would my flow look like? Do I have to setup config for each window?

      What benefits would I see? Would it bring me discipline? How much work to set up initially and how much work later on? What happens when I open a new app?

      I can’t find this information online. Does anybody have a video or article or presentation at hand to share?

      1. 3

        How does a tiling wm work for me? It’s my understanding that I have to set it up “my way”, according to each task in my daily flow. What if I don’t have a flow?

        If you don’t have a particular flow, you can use whatever the tiling window manager does by default. Usually new programs will open full screen. If there is already a window open, it will split the screen either vertically or horizontally. And so on. Some tiling WMs let you configure how this splitting happens.

        What would a tiling wm bring me? How would my flow look like? Do I have to setup config for each window?

        A tiling WM would probably help you spend less time moving windows around. In your case, if you need to look at lots of terminals, or have an IDE + terminal + browser open all at once, you can do that in a way that uses all the screen real estate without having to manually drag and resize windows.

        I don’t know what your flow would look like. Personally I usually have an Emacs window and a terminal or browser open side by side on one workspace. This lets me look at my code and documentation at once easily. On another I will have my chat apps, usually Slack and FB Messenger open side by side, maybe Hangouts too. That way I can see all new messages at a glance. I find this is more useful when plugged into a larger monitor where I have more screen real estate than on my laptop.

        You don’t have to set up a config at the beginning. You can use it purely interactively and if you find yourself using certain layouts all the time, your WM probably gives you a way of setting up layouts and restoring them or binding them to keybindings.

        What benefits would I see? Would it bring me discipline? How much work to set up initially and how much work later on? What happens when I open a new app?

        You will spend less time manually moving windows around, and once you find layouts you like, you can save and restore them. I don’t know if it would bring you discipline (I don’t know what you mean by that). You should be able to get up and running without doing any configuration.

        A few days ago there was a post here about Regolith which is an Ubuntu spin with a beginner friendly tiling setup that you can use to see if you like tiling. If you use macOS I recommend Moom.

        1. 1

          Thanks. I might try it sometime, but I’m not convinced. I don’t hear a “killer app” for me.

          It may be “the dishwasher effect”. Before I had one (grew up without, then went on living without), I never thought it was a big deal. Once I’d bought one, I would never go back.

          But from my safe “defaults only on everything” perspective, I don’t see the incentive.

          1. 2

            That’s totally fine. If you are fine using the defaults on whatever system you have and that works for you, then you don’t need to be bothered. Tiling window managers (more specifically, the automated window management they allow) are attractive to me, but they don’t have to be for you.

      2. 3

        I’ve seen a couple of comments talking about configuration paralysis, but I haven’t really experienced that. I just have everything full-screen all the time; I only switch to a “proper” tiling arrangement occasionally, e.g. if I’ve opened the GIMP; then switch back. Note that this isn’t per-app or configuration-dependent, it just rearranges whatever windows happen to be open on the current desktop.

        In general, I quickly stopped caring about the size and shape of windows. The only per-app configuration I have is which virtual desktop my auto-start programs appear on, which is the same as I had on floating WMs.

      3. 2

        What would a tiling wm bring me? How would my flow look like?

          Nothing. You would need to arrange every window at your desired size and move them across the workspaces, spending a lot of time learning new keystrokes. And this every time you need two different applications visible at the same time.

        1. 2

          Not certain if this is a disgruntled or a troll reply :)

          1. 2

            Sorry, no trolling intended. It’s my opinion, based on three years of using i3. Sure, I may be wrong.

    11. 4

      There are some accurate observations in this article, and it seems that the author has gained a real appreciation for strong, static type systems. That’s great!

      There are also a number of inaccuracies and basic misunderstandings here. Most of the “small nits” section is… frustrating to read.

      Haskell has no null definition of type Char

      I don’t understand what the author wants. A String is a sequence of Chars; that sequence can be empty. What would an empty Char even be? And why isn’t Maybe Char satisfactory for their use-case?

      Int and Integer are two different types

      Yes, they are. Integer is a dynamically-resized integer equivalent to Python’s integer type. Int is a fixed-size integer of at least the range [- 2^29, 2^29 - 1], similar to C’s long. If that range is exceeded, the behaviour is undefined.

      ghci> maxBound :: Int
      9223372036854775807
      ghci> maxBound :: Integer
      
      <interactive>:3:1: error:
          • No instance for (Bounded Integer)
              arising from a use of ‘maxBound’
      

      Haskell can raise errors partway through evaluating an error message.

      I’m not sure what they mean, but it seems the author is just discovering lazy evaluation. In Haskell, expressions are not evaluated until their results are needed. In map (+1) [2,3,undefined], the thing that “needs” the result is the print function which GHCi implicitly calls on the result of every line.

      print will print the first thing of the list, (+1) 2. This must be evaluated first, and then it’s printed. Then print waits for the next thing in the list, (+1) 3, which also needs to be evaluated before being printed. Finally print needs (+1) undefined to be evaluated, which causes the program to stop.
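
      Concretely, GHCi shows something like this (the exact exception text varies between GHC versions):

      ghci> map (+1) [2,3,undefined]
      [3,4,*** Exception: Prelude.undefined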

      1. 2

        Haskell can raise errors partway through evaluating an error message.

        I’m not sure what they mean, but it seems the author is just discovering lazy evaluation.

        I have to admit, I hit this occasionally and I’ve been programming in Haskell for years. I don’t like unhelpful error messages, so if I find myself writing a partial function I’ll at least add an explicit branch for the unwanted case (rather than leaving it to a generic “unmatched pattern” error), and I’ll put in a call to error "..." with something useful and pertinent about what happened. Sometimes it might be useful to append (some representation/summary of) the inputs to the error message, to make it clear what went wrong (e.g. "foo/bar" isn't an absolute path). Sometimes those inputs can raise their own errors (e.g. if we branched on whether the first character is /, there would still be an error lurking in something like List.intercalate "/" ["foo", undefined]).
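
        A contrived sketch of that last situation (the function and its error message are made up for illustration):

        -- Branching on the first character doesn't force the whole string,
        -- but building the error message does; an error lurking later in the
        -- input then replaces the helpful message we wanted to show.
        requireAbsolute :: String -> String
        requireAbsolute path = case path of
          ('/':_) -> path
          _       -> error (show path ++ " isn't an absolute path")

        -- requireAbsolute ("foo" ++ undefined) dies with Prelude.undefined
        -- while rendering the message, not with the message above.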

        As a good rule of thumb: when writing error handling (even if it’s just reporting, rather than recovery), be extra cautious and don’t necessarily assume that your program’s invariants still hold (after all, something had to cause the error!).

    12. 3

      It is always super weird to see someone quote me. Really cool! But also super weird. I’m not fully adjusted to that yet ;)

      If you find Haskell’s type system interesting, I’d also recommend checking out an ML! The module system in ML is really cool, powerful, and unique. I’m shocked more languages haven’t adopted similar module systems.

      Also, a couple of quick pedantic nitpicks:

      I was reading through some of Hillel Wayne’s blog posts, and one of them discussed the Curry-Howard correspondence, which somehow proves a 1:1 association between aspects of a mathematical proof and aspects of a type system. You can convert a proof into a type, if your type system supports it. I think Haskell’s type system respects this. I don’t think Python’s type system does.

      I don’t think I explained the CHC very well in that. All sound static type systems are “proofs”, where what you are proving is “this is well-typed”. This is true for both Haskell and Python (mypy). But different languages can encode different consequences of being well-typed, like if you can express the type of sorted lists, you can say “I’ve proven this is well-typed ⇒ this sort function correctly sorts the list.” Haskell’s type system is more powerful than most other commercial languages, but it’s not powerful enough to encode arbitrary theorems. For that you need dependent typing a la Idris or Coq.

      Also proving type-safety for arbitrary theorems is really, really hard. Tradeoffs!

      Python has hypothesis, but having used it a tiny bit for an open-source contribution, I don’t think it’s the same without the Haskell type system.

      Gotta defend hypothesis a bit here! It sounds like you were using quickcheck in the context of learning from a book, while you were using hypothesis for a real-world project. In that context quickcheck is gonna seem more elegant. But Hypothesis is world-class and a lot of more modern PBT libraries follow its example. It’s a lot better at generating inputs like “two dictionaries where the values of the first dict are a superset of the keys in the second.”

      1. 2

        OCaml seems really cool, and I’ve heard it used in industry; is it the most popular ML for practitioners? I also know of ReasonML used by Facebook to create Facebook Messenger. But I think there’s still so much to learn about Haskell, and between going broad and learning about type systems, I’d want to go deep into Haskell and become intermediate, or even advanced if I’m fortunate enough to have the time and resources to do so!

        Hmm, I didn’t know that Idris or Coq would be powerful enough to encode arbitrary theorems. I think if I were to study more into type theory (an interesting subject), I’d try to learn those!

        Yeah, I think Corbin mentioned that Hedgehog was closer to hypothesis than QuickCheck was; I think I was too quick in dismissing hypothesis. I do like the notion of using generators over types, and the integrated shrinking (when I wrote this I assumed that QuickCheck had integrated shrinking; I think that’s actually a Hedgehog or maybe hypothesis innovation). I definitely like your blog post point about creating integration tests from a combination of hypothesis and pycontracts, I’ll give that a shot for my next project :D

        1. 2

          QuickCheck does shrinking. The Arbitrary typeclass has a shrink :: a -> [a] method which should return a list of smaller versions of its argument.

          If you don’t want to create an instance of Arbitrary, you can use the function forAllShrink. Its type is a more polymorphic version of (RandomSeed -> a) -> (a -> [a]) -> (a -> Bool) -> Bool: the first argument is a generating function, the second is a shrinking function and the third is the property to check.
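
          For example (a small sketch; the property itself is just illustrative):

          import Test.QuickCheck

          -- Explicit generator and shrinker, no Arbitrary instance needed;
          -- `shrink` here is the stock shrinker for lists of Ints.
          prop_revRev :: Property
          prop_revRev =
            forAllShrink (listOf (choose (0, 100 :: Int))) shrink
              (\xs -> reverse (reverse xs) == xs)

          main :: IO ()
          main = quickCheck prop_revRev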

    13. 19

      As somebody who first learned Python, and then Haskell, and then left both, this review reminds me of many of my own experiences.

      I was reading through some of Hillel Wayne’s blog posts, and one of them discussed the Curry-Howard correspondence, which somehow proves a 1:1 association between aspects of a mathematical proof and aspects of a type system. You can convert a proof into a type, if your type system supports it. I think Haskell’s type system respects this. I don’t think Python’s type system does.

      Yes, but undefined happens to be a value of every type, in Haskell; this corresponds to being able to prove any statement, including the false ones. So while it is indeed possible to turn proofs into programs, it’s not possible to turn Python or Haskell programs into useful valid proofs.

      (Somebody may pedantically note that every Python or Haskell program gives a proof, just not of the fact that we might hope that they prove; they usually have Python-shaped or Haskell-shaped holes in them.)

      I don’t think you can do anything like [fmap] with Python or any other practitioner’s language that I’m familiar with.

      I think that this is part of the roots of the Haskeller memetic complex. Because Haskell has the lightweight syntax of an ML, and its prelude is rich in algebraic tools, suddenly there is a blindness as to the fact that the tools have been there all along. For example, in Python 3:

      >>> plusOne = lambda x: x + 1
      >>> lmap = lambda f: lambda xs: [f(x) for x in xs]
      >>> nada = object()
      >>> mmap = lambda f: lambda x: nada if x is nada else f(x)
      >>> dot = lambda f: lambda g: lambda x: f(g(x))
      >>> zf = dot(lmap)(mmap)(plusOne)
      >>> zf([1, 2, 3])
      [2, 3, 4]
      >>> zf([1, nada, 3])
      [2, <object object at 0x7ff65c5db0d0>, 4]
      

      Even the base error type, bottom or ⊥, is defined as an evaluation that never completes successfully, either because a computation failed, or because there’s an infinite loop. … I don’t think Python’s BaseException has a built-in notion of infinite loops evaluating to an error condition. This appears super powerful to me, because the big error type introduced by going over network connections are network timeouts, which appear to most programs as infinitely long computations.

      There are many opinions on this, but I am of the opinion that the sort of incompleteness generated by an unresponsive network peer, known as FLP impossibility or perhaps FLP “incompleteness” of network requests, is not the same as the incompleteness that Turing, Gödel, Tarski, Lawvere, and others showed via diagonalization. They are connected, in that the FLP result implies that a Turing machine which simulates many computers on a network may encounter a Turing-complete subproblem which prevents the simulated network peers from responding.

      In Alice-and-Bob framing, suppose Alice is waiting for Bob to reply over the network. The FLP impossibility of Alice having to wait forever for Bob’s reply is connected to the Turing undecidability of whether a simulation of Alice and Bob will hang forever while simulating Bob’s preparation of their reply.

      Haskell has Test.QuickCheck, a property-based testing framework that generates values to check a declared property. Python has hypothesis, but having used it a tiny bit for an open-source contribution, I don’t think it’s the same without the Haskell type system.

      I think that this is another memetic root; specifically, the root is type-driven dispatch. To compare apples to apples, a better analogue of QuickCheck in Python might be PayCheck, while Hedgehog is closer to Hypothesis in Haskell. As soon as the one-dimensional comparison becomes two-dimensional, the flexibility of Hypothesis, and crucially the ability to select arbitrary strategies for generating test cases, far outweighs the convenience of simple type-checking. This makes sense, in a certain way; if Haskell’s type system were sound in the first place, then QuickCheck would be little more than a totality checker. Hypothesis’ own author recommends Hedgehog over QuickCheck, and in my language, Monte, I implemented both a classic unit-test runner and also a Hypothesis-style property test runner.

      Ultimately, I don’t think [indentation] is a big issue. Haskell has code auto-formatters like haskell-formatter, much like Python’s black. Use it and move on. Bikeshed about something else, you know?

      Since you didn’t mention it, and none of your code samples seem to use it, and to give an example of something to bikeshed over, it’s worth pointing out that Haskell supports curly-brace-oriented syntax as well as indentation. This is a recurring topic in programming language design: We want code to be readable, and that leads to caring about whitespace.

      1. 9

        This makes sense, in a certain way; if Haskell’s type system were sound in the first place, then QuickCheck would be little more than a totality checker.

        I don’t think that follows unless you’re using incredibly specific types. There are many total functions of type Ord a => [a] -> [a], but not all of them would pass the tests of a sorting function.

        1. 4

          This is a fair observation. QuickCheck test selection indeed is driven not just by types, but by typeclasses and associated behaviors, and I was remiss to dismiss it so glibly.

          The main desire is that one might express algebraic laws, and ultimately all of these testing tools are only approximations to actual equational reasoning. That doesn’t preclude a type system from giving useful annotations.

      2. 4

        Wow, amazing comment! I understand about half of this. I wish I had more than one upvote to give you.

        I am really curious as to whether Python’s NoneType would map to Haskell’s bottom in that regards. Would that be the Python-shaped hole you’re talking about?

        I did see a tutorial by David Beazley on implementing a lambda calculus in Python. I didn’t think it was all that practical tbh, and I think he said the same in the video: https://www.youtube.com/watch?v=pkCLMl0e_0k

        Could you give a brief summary of how FLP impossibility and Turing incompleteness are different? I know of “Turing completeness” and that it’s a thing, but my knowledge ends there. I haven’t heard of FLP impossibility before. Bookmarked that website.

        Huh, the author of hypothesis says QuickCheck doesn’t have integrated shrinking. From his statement on generators rather than type classes, I understand your point on how Hedgehog and Hypothesis are more similar.

        Hmm, there weren’t a whole lot of examples of curly-brace oriented syntax. Most examples used indentation. The Haskell wiki does mention it though: https://en.wikibooks.org/wiki/Haskell/Indentation#Explicit_characters_in_place_of_indentation

        (I do care about whitespace and clean code; I read through “Clean Code” my second year out of college, but in my experience as a practitioner so far I’d rather use a tool and automated process to have a consistent way of doing things with multiple people, in order to spend political capital elsewhere.)

        1. 10

          (Phew this got long, time to turn it into a blog post ;) )

          I am really curious as to whether Python’s NoneType would map to Haskell’s bottom in that regards.

          It’s important to distinguish between the empty type and the unit type. In (type) theory there are no values of the empty type, and there is one value of the unit type. In a language without _|_, like Agda, we might define them something like this:

          data Empty : Type
          
          data Unit : Type where
              unit : Unit
          

          This says that Empty is a Type, that Unit is a Type, that unit is a Unit. Since we’ve given no constructors (or “introduction forms”) for Empty, there’s no way to make one. Likewise we can write function types like Int -> Empty which have no values (since there’s no way for any function to return a value of type Empty), we can write tuple types like Pair Int Empty that have no values, etc. We can also write function types like Empty -> Int which do have values (e.g. const 42), but which can never be called (since there are no Empty values to give as an argument). Incidentally, the fact that we can have types with no values is one of the reasons we can’t have a value with type forall a. a (in Haskell) or (a : Type) -> a (in Agda): they claim to return a value of any type, but we might ask for an Empty.

          The unit type is trivial, literally. Since there’s only one value of this type (unit), whenever we know that a value will have type Unit we can infer that the value must be unit, and hence (a) if it’s a return value, we can always optimise the function to simply unit and (b) if it’s an argument, it has no effect on the return value (there’s nothing to branch on). In this sense, the unit type contains 0 bits of information (compared to Bool which contains 1 bit).

          As an aside, Either is called a “sum type” because it contains all the values from its first argument plus those of its second: Either Unit Unit is equivalent to Bool (with values Left unit and Right unit). Pairs are called “product types” since they contain every combination of their first and second arguments, which is the number of values in each multiplied together. In this sense Empty acts like zero: the type Either Empty a doesn’t contain any Left values, so it’s equivalent to a, just like 0 + a = a for numbers. Likewise Pair Empty a contains no values, since there’s nothing to put in the first position, just like 0 * a = 0 for numbers. Either Unit a acts like 1 + a, since we have all of the values of a (wrapped in Right) plus the extra value Left unit; note that this is also the same as Maybe a, where Right acts like Just and Left unit acts like Nothing. Pair Unit a is the same as a since each a value can only be paired with one thing, and we know that thing is always unit; this is just like 1 * a = a with numbers.
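
          In Haskell notation, a couple of those equivalences can be spelled out directly (a small sketch):

          -- Either () () has exactly two values, so it carries the same
          -- information as Bool ...
          toBool :: Either () () -> Bool
          toBool (Left  ()) = False
          toBool (Right ()) = True

          -- ... and Either () a has the same shape as Maybe a
          toMaybe :: Either () a -> Maybe a
          toMaybe (Left  ()) = Nothing
          toMaybe (Right x)  = Just x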

          Back to your question: things get confusing when we try to apply this to “real” languages, especially since a lot of the terminology is overloaded. In Haskell the type Unit is written () and the value unit is also written ().

          In Haskell the type Empty is called Void, but that is not the same as void in languages like Java! In particular, remember that we cannot have a function which returns Empty, so void being Empty would mean we could not implement methods like public static void putStrLn(String s). Such methods certainly do exist, since they’re all over the place in Java, but what happens if we call them? They will run, and return a value back to us (hence it can’t be Empty). What is the value we get back? We know, without even running the method, that we’ll get back null. That’s just like the Unit type (where we know the value will be unit without having to run the code). Hence Java’s void acts like Unit, and null acts like unit.

          If we follow a similar line of reasoning in Python, we find that None acts like unit (or Java’s null) and hence NoneType is like Unit (or Java’s void). AFAIK Python doesn’t have anything which acts like Empty (Haskell’s Void) since, lacking any values, Empty is only useful for type-level calculations (of the sort which get erased during compilation), which Python doesn’t tend to do.

          That just leaves “bottom”. There are a couple of ways to think about it: we can be “fast and loose” where we ignore bottom as a nuisance, which mostly works since bottom isn’t a “proper” value: in particular, we can’t branch on it; hence any time we do a calculation involving bottoms which results in a “proper” value, we know that the bottoms were irrelevant to the result, and hence can be ignored. Any time a bottom is relevant, we have to abandon our nice, logical purity in any case, since catching them requires IO, so why bother complicating our pure logic by trying to include them?

          Alternatively we can treat bottom as an extra value of every type, in a similar way to null inhabiting every type in Java. From this perspective Haskell’s Void type does contain a value (bottom), and the () type contains two values (() and bottom). In this sense we might think of Java’s void corresponding to Haskell’s Void, but I find this line of thinking difficult to justify. In particular, we can branch on null in Java (in fact we should be doing such “null checks” all the time!), whereas (pure) Haskell doesn’t even let us tell whether we have () or bottom.
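          As a tiny illustration of that last point (the names are just illustrative, and undefined is one way of writing bottom in Haskell):

          -- With bottom, () "has" two values: () and undefined
          bottomUnit :: ()
          bottomUnit = undefined
          
          -- Pure code can't tell them apart without blowing up: check ()
          -- returns True, but check bottomUnit raises an error when the
          -- argument is forced, instead of ever producing a Bool
          check :: () -> Bool
          check () = True
          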

          As a final thought, whenever comparing dynamic and static languages, it’s important to not conflate different concepts which just-so-happen to have similar names. In particular I find it useful to think of dynamically typed languages as being “uni-typed”: that is, every value has the same type, which we might write in Haskell as a sum like data DynamicValue = S String | I Int | L List | O Object | E Exception | .... Python is actually even simpler than this, since “everything is an object” (not quite as much as in Smalltalk, but certainly more than e.g. Java), so Python values are more like a pair of a class (which itself is a Python value) and a collection of instance properties.

          This is important because “dynamic types”, like in Python, aren’t directly comparable to “static types” like those of Haskell; they’re more comparable to “tags” (e.g. constructors). In particular, think about a Python function branching based on its argument’s “dynamic type”; this is the same as a Haskell function branching on its argument’s constructor. What does this tell us about “bottom” in Python? From this perspective, there isn’t one (at least not reified; an infinite loop might be comparable, but it’s not something we can e.g. assign to a variable). Python’s None is just a normal value like any other, in the same way that Haskell’s Nothing is a normal value (we might sometimes use it to stand for an error, but that’s not inherent to the system); likewise Python’s exceptions are also normal values (we can assign them to variables, return them from functions, etc.); the idea of “throwing and catching” (or raise/except for Python) is actually a perfectly normal method of control flow (in fact, a limited form of delimited continuation), and this is orthogonal to error representation and handling.
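          A rough sketch of that “uni-typed” view in Haskell (the constructor names, and the object/exception representations, are just made up for illustration):

          -- One type of "dynamic values", each carrying its tag (constructor)
          data DynamicValue
            = S String
            | I Int
            | L [DynamicValue]
            | O [(String, DynamicValue)]  -- an object as a property map
            | E String DynamicValue       -- an "exception": message plus payload
            | None                        -- Python's None, just another value
          
          -- Branching on a "dynamic type" is just branching on the constructor;
          -- exception values are as inspectable as anything else
          describe :: DynamicValue -> String
          describe (S _)   = "a string"
          describe (I _)   = "an int"
          describe (L _)   = "a list"
          describe (O _)   = "an object"
          describe (E m _) = "an exception saying " ++ m
          describe None    = "None"
          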

          This makes raising an exception in Python very different to triggering a bottom in Haskell, since Haskell provides no way to branch on a bottom (or whether we even have one). In Python we can raise a value (with the exception “tag”) as an alternative to using return, and we can catch those values, inspect their tag and value, etc. to determine what our overall return value will be, with none of this being visible by the caller. To do anything like that in Haskell we need some “proper” data to inspect and branch on, hence why I say None, exceptions, etc. are just normal values (even though they’re used to represent errors, and we treat them specially because of that; but the same could be said about Either String values in Haskell, for example). Consider that in Haskell the only way to even check if a bottom exists is to use IO, but the idea behind IO is that it’s like State RealWorld, i.e. every IO a value is a pair of (RealWorld, a), so conceptually we never “actually” see a bottom; it’s more like triggering a bottom changes the RealWorld, and we’re branching on that.

          1. 5

            Thank you for the clarification on None vs bottom! I think it definitely helps me understand how much more different Python is from Haskell.

            I like the idea of saying data DynamicValue = S String | I Int | L List | O Object | E Exception for dynamic languages, and I think from a type theory perspective I finally understand what people say when they say “In {Ruby, Python}, everything is an object”.

            I hate try/catch logic in Python. It works, but I much rather prefer the functorial structure espoused by Haskell.

            I agree on the description on IO! The book does mention IO is technically without effects, since it is merely a description of what IO actions should be taking place instead of imperatively executing IO.

            1. 4

              I like the idea of saying data DynamicValue = S String | I Int | L List | O Object | E Exception for dynamic languages

              That’s a good intuitive starting point, but it’s not the whole story. The “unitype framework” is a way of understanding dynamic types from a static typing perspective, but we can also start from a “dynamic typing perspective” and see where it gets us. A dynamically-typed language is one where you can manipulate type definitions at runtime, and test for type membership at runtime. So saying DynamicValue = S String | I Int ... will always be incomplete, because we might not know something’s type even after creating it!

              An intuition I like here is that dynamic types are boolean predicates: x is of type T iff T(x). This means we can define the type of, say, all functions with format as an optional keyword, or the type of all iterables that aren’t strings, or the type of all types that appear as return values in functions.

              (The downside of this is it gets hellishly confusing very quickly, and allows for a lot of spooky action at a distance. Also, there’s no guarantee that checking type membership is total. That’s one reason even highly dynamic languages shy away from going deep in dynamic types.)
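              A minimal sketch of the “types are predicates” idea, using a made-up value type (nothing here corresponds to a real library):

              -- A tiny universe of values to test predicates against
              data Value = VStr String | VInt Int | VList [Value]
              
              -- A "dynamic type" is just a predicate on values
              type DynamicType = Value -> Bool
              
              isInt :: DynamicType
              isInt (VInt _) = True
              isInt _        = False
              
              -- e.g. "iterables that aren't strings"
              nonStringIterable :: DynamicType
              nonStringIterable (VList _) = True
              nonStringIterable _         = False
              
              -- Membership testing, like Python's isinstance(x, T)
              memberOf :: Value -> DynamicType -> Bool
              memberOf x t = t x
              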

              I think from a type theory perspective I finally understand what people say when they say “In {Ruby, Python}, everything is an object”.

              Everything can also be an object in a static system, too. My rule of thumb for what “everything is an object” means: are object methods also objects? Is the call stack an object? If you are using message passing, are the messages objects? Can you send messages to other messages?

              1. 1

                RE: “Everything is an object”, I usually put it in quotes since the meaning is usually context-dependent. A colleague once said that to me regarding Java, and I replied that Java classes aren’t objects yet Python classes are; and Python’s if branches aren’t objects yet Smalltalk’s are (at least, the closures are); and so on.

                RE: Unitypes, etc. yes the static/dynamic distinction can be built up from the dynamic side too (I tend not to think that way since fixed points and things trip me up). The important thing is that people aren’t talking past each other, e.g. it’s fine to argue about whether “foo can/can’t do bar”, but not so useful when each person is using “foo” and “bar” to mean different things ;)

          2. 4

            The overloaded word “void” across languages is confusing. In Rust and TypeScript (and others?) it’s called “never”, which I think makes it really clear: values of this type never actually exist. And “this function returns never” is like a pun: “this function never returns” and “this function’s return type is ‘never’”.

          3. 3

            This was wonderfully clear. Thank you.

    14. 27

      Interesting post. The main issue of Nix is its onboarding curve and lack of simple-to-grok documentation.

      There’s a few things in this post worth responding to:

      Now, you may ask, how do you get that hash? Try and build the package with an obviously false hash and use the correct one from the output of the build command! That seems safe!

      Nix has the prefetch-* commands that can do this for you and output either the hash, or a full Nix expression that refers to that thing.

      I could avoid this by making each dependency its own Nix package, but that’s not a productive use of my time.

      It depends. My personal view recently has been that Nix should adopt a more Bazel-like model, in which the Nix language is also used for describing the actual software builds rather than just wrapping external package managers.

      I have implemented this for Go (buildGo.nix / docs) and Common Lisp (buildLisp.nix), and with the Go one specifically external dependencies can be traversed and automatically transformed into a Nix data structure.

      For example, here’s the buildGo.nix packaging of golang.org/x/net (from here):

      { pkgs, ... }:
      
      pkgs.buildGo.external {
        path = "golang.org/x/net";
        src = builtins.fetchGit {
          url = "https://go.googlesource.com/net";
          rev = "c0dbc17a35534bf2e581d7a942408dc936316da4";
        };
      
        deps = with pkgs.third_party; [
          gopkgs."golang.org".x.text.secure.bidirule.gopkg
          gopkgs."golang.org".x.text.unicode.bidi.gopkg
          gopkgs."golang.org".x.text.unicode.norm.gopkg
        ];
      }
      

      This makes every subpackage available as an individual Nix derivation, which also means that those builds are cached across different software using those dependencies.

      this is at least 200 if not more packages needed for my relatively simple CRUD app that has creative choices in technology

      For most mainstream languages generators have been written to wrap 3rd-party package managers automatically. For some languages (e.g. Python), the nixpkgs tree actually contains derivations for all packages already so it’s just a matter of dragging them in.

      Oh, even better, the build directory isn’t writable.

      This isn’t true for Nix in general. The build directory is explicitly writable and output installation (into /nix/store) usually (in Nix’s standard environment) happens as one of the last steps of a build.

      It might be that this was a case of some language-specific tooling implementing such a restriction, in which case there’s probably also a flag for toggling it.

      You know, the things that handle STATE, like FILES on the DISK. That’s STATE. GLOBALLY MUTABLE STATE.

      The conceptual boundary is drawn differently here. In some sense, we look at the artefacts of realised derivations (i.e. completed “builds”) as a cache. The hashes you see in the /nix/store reference the inputs, not the outputs.

      Also, nothing written to the store is mutated afterwards; any given store does change over time, but only by appending.

      As a side effect of making its package management system usable by normal users, it exposes the package manager database to corruption by any user mistake, curl2bash or malicious program on the system.

      I’m not sure what is meant by this.

      Edit 1: Ah, on the above this tweet adds some background (I think):

      It doesn’t matter how PURE your definitions are; because the second some goddamn shell script does anything involving open(), you lose. The functional purity of your build is gone.

      By default, Nix sandboxes builds which means that this is not a problem. Only explicitly declared dependencies are visible to a builder, and only the build directory and output path are writeable. Users can enable various footguns, such as opting out of sandboxing or whitelisting certain paths for passthrough.

      1. 6

        By default, Nix sandboxes builds which means that this is not a problem.

        Only on Linux, unfortunately. The author seems to be on Mac, which is probably why they didn’t know about the sandboxing.

        1. 3

          It seems like sandboxing is available on Mac (thanks puck for pointing this out), but for users running Nix in single-user mode (which OP might be doing) there is currently some extra hoop-jumping required to make it work correctly.

          1. 1

            I was thinking of this line from https://nixos.org/nix/manual/:

            In addition, on Linux, builds run in private PID, mount, network, IPC and UTS namespaces to isolate them from other processes in the system (except that fixed-output derivations do not run in private network namespace to ensure they can access the network).

            It looks like on Mac it’s just a chroot to hide store paths, but you can still curl install.sh | bash in your build. I didn’t know it even had that much sandboxing on Mac though, so thanks for pointing it out.

      2. 4

        You know, the things that handle STATE, like FILES on the DISK. That’s STATE. GLOBALLY MUTABLE STATE.

        The conceptual boundary is drawn differently here. In some sense, we look at the artefacts of realised derivations (i.e. completed “builds”) as a cache. The hashes you see in the /nix/store reference the inputs, not the outputs.

        Also, nothing written to the store is mutated afterwards; any given store does change over time, but only by appending.

        I really like this aspect of Nix: it’s like all packages exist in some platonic library of babel, and we copy a few of them into our /nix/store cache as they’re needed. This style of reasoning also fits with the language’s laziness, the design of nixpkgs (one huge set, whose contents are computed on-demand) and common patterns like taking fixed points to allow overrides (e.g. all of the function arguments called self).

        A similar idea applies to content-addressable storage like IPFS, which I’m still waiting to be usable with Nix :(

      3. 2

        Nix should adopt a more Bazel-like model, in which the Nix language is also used for describing the actual software builds rather than just wrapping external package managers.

        Would that involve “recursive Nix” to allow builders to use Nix themselves, in order to build sub-components?

        1. 3

          Recursive Nix is not necessary. For some languages this has already been done. E.g. the buildRustCrate function reimplements (most of) Cargo in Nix and does not use Cargo at all. This is in contrast to buildRustPackage, which relies on Cargo to do the builds.

          You can convert a Cargo.lock file to a Nix expression with e.g. crate2nix and build crates using buildRustCrate. This has the same benefits as Nix has for other derivations: each compiled crate gets its own store path, so builds are incremental, and crate dependencies with the same version/features can be shared between derivations.

        2. 2

          No, I’m not using recursive Nix for these. In my opinion (this might be controversial with some people) recursive Nix is a workaround for performance flaws of the current evaluator and I’d rather address those than add the massive amount of complexity required by recursive Nix.

          What’s potentially more important (especially for slow compilers like GHC or rustc) is content-addressed store paths, which allow for early build cutoff if two differing inputs (e.g. changes in comments or minor refactorings) yield the same artefact. Work is already underway towards that.

      4. 2

        Can you please edit documentation somewhere to note the existence of the prefetch commands and how to use them?

        Does that buildGo.nix thing support Go modules?

        1. 7

          Can you please edit documentation somewhere to note the existence of the prefetch commands and how to use them?

          nix-prefetch-url is part of Nix itself and is documented here; nix-prefetch-git etc. come from another package in nixpkgs, and I don’t think there are any docs for them right now.

          Nix has several large documentation issues and this being undocumented is a symptom of them. The two most important ones that I see are that the docs are written in an obscure format (DocBook) that is not conducive to a smooth writing flow and that the docs are an entirely separate tree in the nixpkgs repo, which means that it’s completely unclear where documentation for a given thing should go.

          The community disagrees on this to various degrees and there is an in-progress RFC (see here) to determine a different format, but that is only the first step in what is presumably going to be a long and slow improvement process.

          Does that buildGo.nix thing support Go modules?

          I’ve never used (and probably won’t use) Go modules, but I believe Go programs/libraries written with them have the same directory layout (i.e. are inspectable via go/build) which means they’re supported by buildGo.external.

          If your question is whether there’s a generator for translating the Go module definition files to Nix expressions, the answer is currently no (though there’s nothing preventing one from being written).

          1. 1

            Is there a way to get a hash of a file without making it available over HTTP?

            1. 6

              Yep!

              /tmp $ nix-store --add some-file 
              /nix/store/kwg265k8xn9lind6ix9ic22mc5hag78h-some-file
              

              For local files, you can also just refer to them by their local path (either absolute or relative) and Nix will copy them into the store as appropriate when the expression is evaluated, for example:

              { pkgs ? import <nixpkgs> {} }:
              
              pkgs.runCommand "example" {} ''
                # Compute the SHA256 hash of the file "some-file" relative to where
                # this expression is located.
                ${pkgs.openssl}/bin/openssl dgst -sha256 ${./some-file} > $out
              ''
              

              Edit: Oh also, in case the question is “Can I get this hash without adding the file to the store?” - yes, the nix-hash utility (documented here) does that (and supports various different output formats for the hashes).

      5. 1

        For example, here’s the buildGo.nix packaging of golang.org/x/net (from here):

        Proxy error (the link, obviously).

        Edit: Back up!

        1. 2

          Hah, sorry about that - I’m running that web UI on a preemptible GCP instance and usually nobody manages to catch an instance cycling moment :-)

      6. 1

        Oh, even better, the build directory isn’t writable.

        This isn’t true for Nix in general. The build directory is explicitly writable and output installation (into /nix/store) usually (in Nix’s standard environment) happens as one of the last steps of a build.

        It might be that this was a case of some language-specific tooling implementing such a restriction, in which case there’s probably also a flag for toggling it.

        It’s most likely caused by the derivation either trying to build inside a store path, e.g. cd "${src}" && build, or inside a copy of a store path (which preserves the read-only flags), e.g. cp -a "${src}" src && cd src && build. We can see if that’s the case by looking at the build script in the failing .drv file: they’re plain text files, although they’re horrible to read without a pretty-printer like pretty-derivation. This is probably quicker than trying to get hold of and inspecting the failing derivation in nix repl, since it may be buried a few layers deep in dependencies.

        I actually make this mistake a lot when writing build scripts; I usually solve it by putting chmod +w -R after the copy. If someone else has written/generated the build script it may be harder to override; although in that case it would presumably break for every input, so I’d guess the author might be calling it wrong (AKA poor documentation, which is unfortunately common with Nix :( )

        It might be a symptom of the Koolaid, but I find this a feature rather than a bug: Nix keeps each download pristine, and forces my fiddling to be done on a copy; although the need to chmod afterwards is admittedly annoying.

    15. 4

      I’ve found that the Nixpkgs approach to Haskell packages (choosing a subset of Hackage packages, picking one version of each which mostly work together) can often require a lot of overriding. Even worse, the error messages only appear after (re)compiling a bunch of dependencies, making each iteration painfully slow.

      IOHK’s Haskell infrastructure takes the opposite approach: running cabal or stack to solve the dependencies, then fetching and building whichever versions were chosen (unless they’re already in /nix/store, of course). I’ve found this works quite well: it requires a larger up-front build time, but is more reliable so doesn’t need hand-holding.

      1. 1

        I haven’t seen this before. Cool. Starred it and scheduled for a read on the commute.

    16. 1

      I’ve been using XMPP for years; I’ve not heard of any of these clients other than Gajim, so it’s nice to see development continuing.

      Back in the day I used XMPP for microblogging, via the XMPP gateway on identi.ca. I also used gateways to Yahoo messenger and MSN messenger to talk to friends, before everyone switched to Facebook/whatever. It would be interesting to know if there are any stable gateways to such protocols running (bonus points if they don’t require signing up to Facebook/whatever).

      After identi.ca turned off their XMPP gateway, I wrote a simple Python script to do the same thing on my own Web site (minus the social aspect, of course). It was nice to throw random thoughts into a Pidgin window and have them archived on my site. I added a stripped-down version of this to a repo of small but useful Python programs for learners (in the ‘Microblogger’ directory).

      I also made a pub/sub library, and a couple of XMPP service discovery clients as a way to learn GTK+ and Qt (I can’t find their code at the moment). Probably wildly out-of-date with current best practices now ;)

    17. 4

      This is a case of improper data modeling, but the static type system is not at fault—it has simply been misused.

      The static type system is never at fault; it behaves just as the programmer tells it to. But this is kind of handwaving over the very point this article attempts to address. This particular case of “improper data modeling” would never be a problem on dynamically-typed systems.

      Bad analogy time: it is pretty much like advocating the use of anabolic steroids, because they make you so much stronger, but when the undesired side effects kick in, you blame the hormonal system for not keeping things in balance.

      1. 9

        Bad analogy time: it is pretty much like advocating the use of anabolic steroids, because they make you so much stronger, but when the undesired side effects kick in, you blame the hormonal system for not keeping things in balance.

        To me it feels like that’s exactly what proponents of dynamic typing often do: “I can write all code super fast”, and then when people point out the issues that arise when it’s accidentally misused (by another programmer, or the same programmer in the future), it’s always “you should’ve used more hammock time to think about your problem real hard” or “you should’ve written more tests to make sure it properly handles invalid inputs”.

        1. 5

          You are not wrong, and this just shows that the debate around type systems is still too immature. There is certainly a component of dynamism that programmers crave in every computer system, and it usually lives outside the language environment, at the operating system level. Dynamically typed languages claim to offer that dynamism inside their own environment, but most programs don’t take advantage of it.

          There is no known argument on either side that would definitively bury its contender. Programmers sometimes seem too afraid of some kind of Tower of Babel effect that would ruin the progress of Computer Science, and I believe the whole debate around static and dynamic type systems is just a reflection of that.

      2. 2

        This particular case of “improper data modeling” would never be a problem on dynamically-typed systems.

        I think this is addressed in the appendix about structural vs nominal typing. In particular, very dynamic languages like Python and Smalltalk still allow us to do such “improper data modelling”, e.g. by defining/using a bunch of classes which are inappropriate for the data. Even if we stick to dumb maps/arrays, we can still hit essentially the same issues once we get a few functions deep (e.g. if we’ve extracted something from our data into a list, and it turns out we need a map, which brings up questions about whether there’ll be duplicates and how to handle them).

        Alternatively, given the examples referenced by the author (in the linked “parse, don’t validate” post) it’s reasonable to consider all data modelling in dynamically-typed systems to be improper. This sounds a bit inflammatory, but it’s based on a core principle of the way dynamically-typed languages frame things: they avoid type errors in principle by forcing all code to “work” for all values, and shoehorning most of those branches into a sub-set of “error” or “exceptional” values. In practice this doesn’t prevent developers having to handle type errors; they just get handled with branching like any other value (with no compiler to guide us!). Likewise all dynamic code can model all data “properly”, but lots of code will model lots of data by producing error/exceptional values; that’s arguably “proper” since, after all, everything in a dynamic system might be an error/exception at any time.

        Side note: when comparing static and dynamic languages, it’s important to remember that using “exceptions” for errors is purely a convention; from a language/technology standpoint, they’re just normal values like anything else. We can assign exceptions to variables, make lists of exceptions, return exceptions from functions, etc. it’s just quite uncommon to see. Likewise “throwing” and “catching” is just a control-flow mechanism for passing around values; it doesn’t have anything to do with exception values or error handling, except by convention. I notice that running raise 42 in Python gives me TypeError: exceptions must derive from BaseException, which doesn’t seem very dynamic/Pythonic/duck-typical; yet even this “error” is just another value I can assign to a variable and carry on computing with!
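        As a small illustration of that convention (the function names here are just examples), the “error” branch of an Either is data like any other:

        import Data.Either (partitionEithers)
        
        -- The "error" branch is an ordinary value: we can store it, pass it
        -- around and inspect it like any other data
        safeDiv :: Int -> Int -> Either String Int
        safeDiv _ 0 = Left "division by zero"
        safeDiv x y = Right (x `div` y)
        
        -- Collect the "errors" and successes from a batch of calls
        results :: ([String], [Int])
        results = partitionEithers [safeDiv 10 2, safeDiv 1 0, safeDiv 9 3]
        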

        1. 1

          The point I was trying to make is that, in the example mentioned in the article, the reason why the type description was inaccurate at first has only to do with the fact that the programmer must provide the checker information about UserId‘s subtype. On a dynamically-typed system, as long as the JSON type supports Eq, FromJSON and ToJSON, you are fine, and having to accurately determine UserId‘s subtype would never be a problem.

            So I do understand the appeal of static typing for building units, but not for systems, especially distributed ones; this is why I believe the article is myopic, though dynamic language advocates do a terrible job of defending themselves. Having to process JSON payloads is the least of your problems if you are dealing with distributed systems; how to accurately type-check across independent snippets of code in a geographically-distributed network over which you have no control is a much more interesting problem.

          1. 2

            On a dynamically-typed system, as long as the JSON type supports Eq, FromJSON and ToJSON, you are fine, and having to accurately determine UserId‘s subtype would never be a problem.

            That’s not true. At some point your dynamically typed system will make an assumption about the type of value (the UserId in this case) that you’re applying some function to.

            1. 1

              For practical purposes, it is true. The system internals do need to resolve the dependency on that interface, either with fancy resolution mechanisms or by attempting to call the function in a shoot-from-the-hip fashion. But it is not common among dynamic language implementations for the programmer to need to specify the type, so it is not a problem.

    18. 1

      Mine’s at http://chriswarbo.net

      It’s been through a few iterations:

      • I started with blogspot, as a quick way to get things up
      • A few years later I wanted to self-host, so I got a domain, an EC2 instance and copied the content into a CMS
      • After a few more years I was fed up with maintaining the CMS’s permissions, updates, PHP updates, etc. so I turned it all into Markdown and used a static site generator (Hakyll)
      • I eventually found myself working around Hakyll too much, so I replaced it with Pandoc and a Makefile
      • When my Makefile became unwieldy I switched to using Nix

      Since I mostly write programming things, I made a couple of Pandoc “filters” for executing arbitrary programs during rendering. This is useful for keeping things “honest”, e.g. checking that the code in a blog post actually runs, and splicing the real output into the document. Running things via Nix has meant that old posts are still reproducible, and I don’t need to care whether something’s installed globally on my system (e.g. I wrote a bunch of posts about PHP, which still execute properly in their Nix sandbox, despite me not having/using PHP for 5 years).
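      Roughly, such a filter looks something like this (a simplified sketch rather than my real scripts; it assumes pandoc 2.x’s Text-based AST and uses a made-up “run” class to mark executable blocks):

      {-# LANGUAGE OverloadedStrings #-}
      import Text.Pandoc.JSON
      import System.Process (readProcess)
      import qualified Data.Text as T
      
      -- Replace each code block tagged "run" with the output of executing it
      runBlock :: Block -> IO Block
      runBlock (CodeBlock attr@(_, classes, _) code)
        | "run" `elem` classes = do
            out <- readProcess "sh" ["-c", T.unpack code] ""
            pure (CodeBlock attr (T.pack out))
      runBlock block = pure block
      
      main :: IO ()
      main = toJSONFilter runBlock
      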

      I’m currently updating those scripts (and hence the site) to use the latest Pandoc (from 1.x to 2.x).

      I’ve been told that LightSail would be a cheaper option than EC2, whilst still giving me a machine with SSH access for the few other things I use it for (unlike, e.g. S3). I still haven’t gotten around to switching it over.

      I am interested in hosting things using IPFS or similar, but their mutability options aren’t great (e.g. IPNS) and I found that the default Go implementation would eat all of my RAM :(

    19. 2

      (I don’t use proprietary GitHub things like “bots” and “pull requests”, so I’ll just talk about the “conventional commits” stuff in general)

      I like the idea of standardising “unimportant” things like commit message format, since it’s one less thing I have to think about right now, but may help me out in the future. Tooling is the best way to standardise these things, since following the standard becomes the path of least resistance (compared to, e.g. forcibly skipping git hooks).

      One thing I don’t see from clicking around the links is how I’d run this “commitizen” stuff from Magit; the use of an interactive CLI (rather than, say, stdio, env vars or arguments) makes me think it would be tricky.

      Incidentally, I was thinking earlier today how I’m terrible at keeping changelogs and if there’s a simple way to enforce it. This would do the job nicely!

    20. 1

      Obligatory plug for ASV as a benchmark runner, tracker and regression-finder; and my Nix plugin which generalises it to non-Python projects :)