1. 3

    I agree with the article that e2e tests can be such a drag on productivity that the time and effort they waste can easily cause more bugs to be pushed to production. I still have a warm spot for them, because a flexible e2e test setup can be a very valuable thing to have, not only for testing but also for experimentation. But I think the main lesson here is to avoid e2e tests unless you’re willing to put in the continuous effort to keep them fast, reliable, and easy to diagnose.

    BTW, I’m willing to believe that at the 1k-engineer scale it may well be impossible to maintain those properties; I’ve never seen e2e at that scale myself. At some point one probably needs to see the overall system as a collection of interacting but independent subsystems, so e2e might still be maintainable for some of those subsystems.

    1. 3

      There are also some other nasty side-effects of this solution, namely that the HashMap.toList would produce a potentially different ordering on every run of the program.

      Isn’t that part of the deal with hash-based data structures? That ordering is not something you should rely on, as it’s non-deterministic between runs?
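
      If you do need a stable order, the usual fix is an order-preserving structure rather than a hash map. A minimal sketch (my example, not from the article) using Data.Map from containers, whose toList is always in ascending key order and therefore identical on every run:

```haskell
import qualified Data.Map.Strict as M

-- Unlike a hash map, Data.Map.toList is deterministic: it yields
-- entries in ascending key order, regardless of insertion order.
main :: IO ()
main = do
  let a = M.fromList [(3, "c"), (1, "a"), (2, "b")]
      b = M.fromList [(1, "a"), (2, "b"), (3, "c")]
  print (M.toList a)                -- [(1,"a"),(2,"b"),(3,"c")]
  print (M.toList a == M.toList b)  -- True: same order either way
```

      The trade-off is O(log n) operations instead of amortized O(1), but the ordering becomes part of the contract rather than an accident of hashing.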

      In other news, next up on Lobsters: “Haskell considered harmful” /s

      1. 2

        Doesn’t have to be. Python dictionaries iterate in insertion order while using SipHash to prevent hash-flooding attacks. They do that by maintaining two structures: one array holding the entries in insertion order, and an index that maps the hashed key to a position in that array. Lookups take two hops, but the order stays stable across runs.
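
        The two-structure scheme can be sketched like this (a hedged toy version; OrderedMap and all names are mine, and the index maps keys rather than hash slots, unlike CPython’s actual implementation):

```haskell
import qualified Data.Map.Strict as M

-- Toy insertion-ordered map: an entries list kept in insertion
-- order, plus an index from key to position in that list.
-- toList just reads the entries, so iteration order is stable.
data OrderedMap k v = OrderedMap
  { omEntries :: [(k, v)]      -- insertion order
  , omIndex   :: M.Map k Int   -- key -> position in omEntries
  }

empty :: OrderedMap k v
empty = OrderedMap [] M.empty

insert :: Ord k => k -> v -> OrderedMap k v -> OrderedMap k v
insert k v (OrderedMap es ix) = case M.lookup k ix of
  -- Existing key: overwrite in place, position unchanged.
  Just i  -> OrderedMap [ if j == i then (k, v) else e
                        | (j, e) <- zip [0 ..] es ] ix
  -- New key: append (O(n) here, a real version would use an array).
  Nothing -> OrderedMap (es ++ [(k, v)]) (M.insert k (length es) ix)

toList :: OrderedMap k v -> [(k, v)]
toList = omEntries

main :: IO ()
main = print (toList (insert "b" 2 (insert "a" (1 :: Int) empty)))
  -- [("a",1),("b",2)], in insertion order, every run
```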

        This is a really nice property when you are working with, for example, JSON config. If you load and dump it, the config values don’t get shuffled around and the git diff will only show the relevant changes.

        1. 1

          Being unspecified doesn’t necessarily mean it can vary between runs. That would make it impure, at least unless you have a type system where you can express that order independence as a quotient type. This subtle impurity would wreak havoc on an abstraction like Workflow.

        1. 1

          Before the author told us that she ended up quitting her job anyway, I already knew that had to happen. You can’t recover from burnout by merely adjusting your circumstances back to normal. It’s not surprising that she ultimately had to take months of completely free time to recover.

          I wish it were more common and acceptable to take sabbaticals in our industry. I don’t understand why one has to resign in order to get that necessary break. Months or even years of productivity and joy of life get wasted while the employee reaches the point where taking action becomes a matter of life and death, and the company ends up losing an experienced member who will take years to replace.

          1. 17

            Very insightful. I do find type level programming in Haskell (and, to a lesser extent, Rust) to be a confusing nightmare. Nonetheless, Rust could not exist without traits (i.e. without bounded polymorphism). The Sync and Send marker traits (combined with borrow checking) are the basis of thread safety.

            I think Zig takes an interesting approach here with its compile-time programming (i.e. type-level programming with the same syntax and semantics as normal programming), but it suffers from the same typing issues as C++ templates: types are only checked at use, after monomorphization. Rust’s bounded polymorphism can be, and is, type-checked before monomorphization, so you know whether there are type errors in general. In Zig (and C++), you only know whether there are type errors with a particular type, and only after using a function (template) on that type.

            I think there’s room for an approach that’s more like Zig’s, but with sound polymorphic typing, using dependent type theory. Coq, Agda, and Idris include type classes (aka implicits, bounded polymorphism), but it doesn’t seem like type classes should be necessary in a dependently typed language. In particular, it doesn’t seem like they should provide any increase in expressiveness, though perhaps they reduce verbosity.

            1. 5

              Fwiw, even in Haskell you only really need one extension to obviate type classes in terms of “expressiveness,” namely RankNTypes. See https://www.haskellforall.com/2012/05/scrap-your-type-classes.html

              …though it doesn’t solve the verbosity issues. But I suspect that a language with better support for records might make this a pretty good solution (I have a side project where I am working on such a language).

              1. 2

                RankNTypes is my top pick for something to add to Haskell. However, for common cases, type classes have the advantage of decidable inference.

                1. 3

                  Note that in the context of replacing type classes, the usual decidability problem with inference doesn’t really come up, because either way the higher rank types only show up in type definitions. E.g.

                  class Functor f where
                      fmap :: (a -> b) -> f a -> f b


                  data Functor' f = Functor'
                      { fmap' :: forall a b. (a -> b) -> f a -> f b
                      }

                  In the latter case, the problems with inference don’t come up, because the higher-rank quantifier is “hidden” behind the data constructor, so normal HM type inference can look at a call to fmap' and correctly infer that its argument needs to be a Functor' f, which it can treat opaquely, not worrying about the quantifier.

                  You can often make typechecking advanced features like this easier by “cheating” and either hiding them behind a nominal type or bundling them with other features as a special case.

                  (N.B. I should say that for just Functor' you only need Rank2Types, which actually is decidable anyway – but I don’t think GHC decides it in practice, so it’s kind of a nitpick.)

                  Of course, this is all about type inference, whereas type classes are really more about inferring the values; as I said, this approach doesn’t solve the verbosity issues.
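
                  To make the dictionary-passing style concrete, here’s a self-contained sketch; maybeFunctor is my name for an explicit “instance”, which is just an ordinary value you pass around by hand:

```haskell
{-# LANGUAGE RankNTypes #-}

-- The record-of-functions version of Functor, as above.
data Functor' f = Functor'
    { fmap' :: forall a b. (a -> b) -> f a -> f b
    }

-- An explicit "instance" is just an ordinary value.
maybeFunctor :: Functor' Maybe
maybeFunctor = Functor'
    { fmap' = \f m -> case m of
        Nothing -> Nothing
        Just x  -> Just (f x)
    }

main :: IO ()
main = print (fmap' maybeFunctor (+ 1) (Just 41))  -- Just 42
```

                  HM inference handles the call site fine, exactly because the forall is tucked away inside the Functor' record.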

              2. 5

                Type classes aren’t just about verbosity; global coherence is a very important aspect. Any decision on whether to use a type class vs. explicit dictionary passing needs to consider the implications of global coherence. I think Type Classes vs. the World is a must-watch in order to engage in productive discussion about type classes.
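
                For a hedged illustration of what coherence buys (my toy example, not from the talk): with explicit dictionaries, nothing stops two call sites from passing incompatible orderings for the same data, silently breaking the structure’s invariant. A type class guarantees one Ord dictionary per type program-wide:

```haskell
import Data.List (insertBy)
import Data.Ord (Down (..), comparing)

-- A toy "set": a list kept sorted according to an explicitly
-- passed comparison dictionary. The sortedness invariant only
-- holds if every caller passes the same dictionary.
type Cmp a = a -> a -> Ordering

insertWith :: Cmp a -> a -> [a] -> [a]
insertWith = insertBy

-- Membership test that relies on sortedness to stop early.
memberWith :: Cmp a -> a -> [a] -> Bool
memberWith cmp x = go
  where
    go []       = False
    go (y : ys) = case cmp x y of
      LT -> False   -- past the point where x would be
      EQ -> True
      GT -> go ys

main :: IO ()
main = do
  let s = foldl (flip (insertWith compare)) [] [3, 1, 2 :: Int]
  print (memberWith compare 2 s)           -- True
  print (memberWith (comparing Down) 2 s)  -- False: wrong dictionary, silently wrong answer
```

                With a real Data.Set and an Ord constraint, the compiler rules this mismatch out, because there is only one Ord Int in the whole program.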

              1. 4

                So, git rebase advocates try to keep the canonical history clean, while the fossil author suggests that it’s the tooling’s job to expose a clean view of the history and the VCS’s job to never lose any of it. I think this argument has a lot of merit, but IMO what the author is missing here is the number of tools interacting with your VCS. By lying and simplifying your history, you force a canonical view onto any consumer of your repository; an honest history instead suffers from inconsistent interpretations by different tools.

                You can actually approach the problem from the other end. Let’s keep the commit hierarchy simple, but retain any information we want in the commit messages of rebased or cherry-picked commits. So, whenever we cherry-pick a commit, let’s take note of the original time stamp and/or the name of the branch where the commit came from, in a way that keeps or collapses multiple layers of some metadata. If we do this systematically, then we can teach our tooling to process that metadata consistently and recover the original history when needed.

                IMO, retaining that metadata in the commits and hoping that the tooling agrees isn’t very different from retaining it as a core component of the VCS and hoping that all the tooling agrees on what is canonical. The difference is that git, by itself, enforces only the minimum metadata it needs to work, without really stopping you from retaining anything extra, whereas fossil forces you to deal with that extra information whether you like it or not. The upside is that you have a much better chance of getting an ecosystem of tools to agree, since the shape of the metadata is standardized.

                My personal preference is to work with systems that impose the minimum amount of complexity on top of the bare minimum they need themselves to operate. I find it easier to combine such systems to build exactly what I need.

                1. 67

                  I love rebase.

                  There are a couple things rebase enables that are really powerful which are, unfortunately, not possible in fossil.

                  The first is a clean history.

                  My commit history as I create it has no value to anybody else. I “finally got this bit working”, I go “This is close but I’m going to try a totally different approach now,” and I leave my computer for the day. All of these are valuable to me, but have no place in the long lived history of my source code. Why?

                  A simple misconception. Commit history is not supposed to be how I think, but how the committed software evolved.

                  I commit then run tests. Should I be committing the failed results and then committing the successful ones? Should I be cluttering my history with commits like “fix tests” since I commit all over? Or should I be producing nice, small, specific commits for specific features or specific points in my software’s progression towards its current form?

                  Bisect means nothing if I have many small commits where I repeatedly broke and unbroke a feature. Bisect means a lot when I have a specific commit that makes a set of changes, or when I have a specific commit that fixes a different bug. It means nothing when I have to try “Hey, does this pass our CI as it stands?” (welcome to big-corp coding).

                  So point by point:

                  1. Yes! Rebase is dangerous! Don’t blindly use this command, know what it is you want at the end.
                  2. Cleaning history so it becomes about the software and not about your brain is a new and useful feature.

                  2.1) Nope, the history is all still there; just because you don’t know where the work started doesn’t mean you don’t know where the software gained it.

                  2.2) You can merge this way in git too. You can diff two commits in git too. And then you can rebase because again, it’s not about my brain but about the software.

                  3. Siloed development? “Hey can you check this branch and it’s my branch I might clobber it later” is very different from “It’s my code and you can’t see it until it’s all done.” Master/trunk can’t be rebased. Everything else is fair game.
                  4. So what? Do you really want my commit to show when the work was done at 2 PM instead of 10 AM?
                  5. Who cares how a line of code came together, so long as the reason for it to exist (commit message) and a clean story for how it fit into the previously-existing project both exist?
                  6. How your brain works is not so valuable it must be imprinted on your commit history.

                  6.1) They were thinking “blargh.” Obviously. That’s why it’s an intermediate commit.

                  6.2) Nothing wrong with small check-ins in a linear progression, rebased into complete commits that add things in small and appropriate ways.

                  6.3) “blargh” “aargh” “fix the thing” “wtf is with java” “dude. stop” “I AM A ZYGON.” I’d rather a nice commit shaped by a rebase into being a useful object because….

                  6.4) Cherry picks also work better with single commits that are useful, instead of five commits all that need to go together to bring a single feature across branches. Also, notably, commits with terrible useless messages. See 6.3.

                  6.5) You want to back out just the testing fix? Or the whole feature while reconsidering how it fits in the existing code. Again, rebased commits for that clean history make this easier.

                  7. Sure. Rebasing for a merge, maybe a cherry-pick in fossil’s model is actually better. Won’t argue with the local SCM semantics for performing a nice linear merge.
                  8. Dishonest only if you think SCM is about the developer’s brain, and not the software.

                  Really, I worry the author has had too much Enterprise coding experience, where all your work becomes a single commit fitting the Jira formatting rule, a multi-dozen-line commit, because that way the pre-CI checks can pass. I understand being in such a system and thinking rebase is to blame. Maybe your org should trust developers a little more, and spend more time saying “don’t say blargh” instead of “all one commit.”

                  1. 27

                    The best analogy of this I’ve come up with is your private work is like a lab book (meticulous, forensic) and the public/merged branches are the thesis (edited, diversions gone, to the point).

                    1. 4

                      As I started reading your comment, I was convinced you had not read the article, but I see you did, since you addressed the points individually. The author does negate your first claim and shows the evidence. I think you mean that some things are not achievable the same way they are in git. A cleaner history is achievable in fossil in a way that is a superset of git, as the article explains thoroughly.

                      I am a git user and have never used fossil. Git works fine for me and has proven to be a reliable and snappy VCS from day one. I don’t have any interest in moving to fossil, nor am I a member of the group of people who advocate for changes in git, especially not to git’s principles. It works for me, and the parts of it I dislike or would have built differently are acceptable choices by people who offered an immensely useful tool to the world. That isn’t to say that valid criticism doesn’t exist or that there aren’t things that could be solved better. I think this article very strongly argues that rebase is just a hacky workflow whose results could be achieved by resorting to better-designed functionality. The author did this masterfully, but on the other hand there is nothing wrong with having a workflow in muscle memory and using it, even if said workflow relies on glitches or rough shortcuts.

                      1. Do you not mean the opposite? The way I see it, it doesn’t make sense to call a tool ‘dishonest’; one could call it potentially confusing. It does what it does; how is that possibly dishonest?

                      Regardless of personal opinions, the article was so clear, explaining things so well and in such detail, that it was a joy to read. This is the mind of a great engineer at work, in a way we don’t see so often these days.

                      1. 3

                        rebase is just a hacky workflow whose results could be achieved by resorting to better-designed functionality

                        This is argued with examples and suggestions that do not share the same assumptions. There may be a case for rebase being a hacky way to go about making changes to past/private commits, but it was not made in this post. Rather, the case was made that any manipulation of past commits is technically and socially wrong.

                        I understand fossil allows overlaying new information on past commits; however, there comes a time for messing with the actual commits, and not much is lost when you change a parent commit.

                      2. 3

                        Maybe I’m missing something, but it seems like the author is specifically talking about git rebase and not git rebase --interactive (at least for the majority of the article). Many of their points are valid for the former, but this response seems to speak almost exclusively to the latter.

                        That being said, I don’t think Fossil supports rewriting history in any form, so quite a few of your responses are critiques of Fossil, but not really of the article. Similar to you, I’ll try to go through all the points and show what I think the original author was getting at. On a side note, I personally don’t think that rebasing all commits so the master branch is flat is very helpful, but some people seem to like it. In any case, that specific use is what I’ll speak to, because it seems to be what the article is talking about.

                        1. Everyone seems to agree on this, no sense speaking more about it.
                        2. Raw git rebase is more an alternative to merging in prod than it is cleaning up the commits. Commit cleanup is often useful, while blindly rebasing on prod rather than merging it in isn’t always the best option.
                          1. Your argument is saying “some data was lost, but everything is still there”. I have to agree with the original author on this one - a rebase drops the parent commit where the branch first came from, so all the history is not still there (or is purposefully misrepresented). Also see my response to #4.
                          2. I think the point they were making is that the claimed benefit from rebasing (“rebasing provides better feature branch diffs”) can be easily achieved by other means - in this case, merging the parent branch back in to the feature branch. On a related note, there are very subtle, but potentially fairly dangerous, differences when you look at the diff from the HEAD to the feature branch without merging in prod, so either rebasing or merging in prod are 2 ways to solve this. That is what the graphics and table show.
                        3. While I tend to view personal branches as potentially rewritten at any time, have you ever tried to base your branch on someone else’s when they’re using a rebase-based workflow? It’s a nightmare. Trying to get your changes to re-apply on top of their rebased changes often causes conflicts which are very hard to recover from.
                        4. The issue is not “when was work done”, but “what was the order the work was done in”. Using a rebase workflow, you could easily end up with commits later in the history which were much earlier chronologically. This is extremely confusing if you’re trying to track down what actually happened.
                        5. I’m not sure what you’re getting at here - fossil seems to allow amending commit messages to fix information or a mistake at a later date; you can’t do that in git without rebasing… and once something is in prod, that really shouldn’t happen. There have been many times when I’ve wanted to go back and add more information to a commit message (or fix a typo) after it was merged in.
                        6. For these, I tend to agree with your response - this seems to be one of the only places where the Fossil article is speaking about an interactive rebase and I think they really miss the point.
                        7. Not much to respond to here.
                        8. From the original article, “Rebasing is an anti-pattern. It is dishonest. It deliberately omits historical information. It causes problems for collaboration. And it has no offsetting benefits.” I agree that rebasing is often an anti-pattern, but I’m purposefully excluding the modification of local commits to get a more useful history. Rewriting local history can definitely have benefits though, so I don’t think they’re completely right.

                        I often wish Git’s UI were clearer: the multiple uses of “rebase” seem similar to the many things “checkout” can do. To make the distinction clearer in my head, I personally view git rebase --interactive as a git rewrite-history command. While it may share some of the internals of rebase, it has quite a different goal from the plain “rebase” action.

                        I hope this helps shed some light on their opinions, even if you may not agree with all of it.

                        TL;DR: there should be a distinction made between rebasing to keep a flat merge history and rewriting feature-branch commits to make them more useful. The first can cause quite a bit of confusion and lead to a more misleading history, while the second can be a very valuable tool.

                        1. 1

                          Interactive rebase should give you only those abilities available from the commandline, just with a nicer interface.

                          I’m OK with fossil commits being append-only. That doesn’t bother me, I love the OpenCrux database which offers an immutable base. A similar thing for commits is an excellent idea.

                          But so is modifying the stream of commits to match when commits hit mainline. And so is merging or splitting commits. And so is ordering the work not in how a spread-out team might complete it, with multiple parallel useless checkins a day, but with what matters long term: In what order did this work introduce regressions to the codebase.

                          1. 1

                            In general I agree with you - I like modifying the stream of commits, but I really only like doing it before they hit main… and I don’t like forcing main to be a straight line without merges. I was primarily trying to point out that most of their arguments focus on rebasing commits to maintain a straight line on main, and not on rewriting history for the sake of a clearer commit log. I think there is very little value to the former (most times), and plenty of value for the latter.

                            Again, I really dislike how “rebase” has often been taken to mean “rewriting history” in git, because in a DAG, a rebase is a specific operation. It’s unclear which people are talking about during this conversation and I think some of the wires may have been crossed.

                        2. 2

                          I generally agree that git rebase is fine as long as one knows exactly what one is doing and uses the tool to make deliberate changes that improve the state of the project, but

                          So what? Do you really want my commit to show when the work was done at 2 PM instead of 10 AM?

                          2PM vs. 10AM is unlikely to matter, but it often matters whether it was yesterday or Thursday 2 weeks ago, which is before we had the meeting about X,Y,Z. I don’t go about memorizing the commit timestamps in my repositories, but I still find them useful occasionally. I wish we’d all be more careful about avoiding argument from lack of imagination in our debates.

                          1. 5

                            but it often matters whether it was yesterday or Thursday 2 weeks ago, which is before we had the meeting about X,Y,Z.

                            Not long term, which is where SCM exists.

                            Long term those distinctions turn into a very thin slice of time, and people forget about the discussions that happened outside the commit history. Thus all that remains is a commit message in a line.

                        1. 14

                          What’s going on here? How did this get to the top of lobste.rs with 26 upvotes? I’m happy for the OP that they could get their system to work, but as far as I can tell, the story here is “package manager used to manage packages.” We have been doing that for decades. Is there any way the community can get a lever to push back on thin stories like this one?

                          1. 25

                            Would it change your opinion if the article mentioned that the nix shell being used here is entirely disposable and this process leaves no mark in your OS setup? Also that even if this required some obscure versions of common system dependencies you could drop into such a shell without worrying about version conflicts or messing up your conventional package manager?

                            I agree that the article is thin on content, but I don’t think you can write this story off as “package manager used to manage packages.” I think nix shell is very magical in the package-management world.

                            1. 6

                              I could do that with docker too, and it would not leave a trace either.

                              1. 17

                                Yes, but then you’d be inside a container, so you’d have to deal with the complexities of that, like mounting drives, routing network traffic etc. With nix shell, you’re not really isolated, you’re just inside a shell session that has the necessary environment variables that provide just the packages you’ve asked for.

                                Aside from the isolation, the nix shell is also much more composable. It can drop you into a shell that simultaneously has a strange Java, Python, and Erlang environment, all compiled with your personal fork of GCC, and you’d just have to specify your GCC as an override for that to happen.

                                1. 4

                                  I get that, but I would have to go through the learning curve of nix-shell, while I already know docker, since I need it for my job anyway. I am just saying that there are more ways to achieve what the article is talking about. It is fine that the author is happy with their choice of tools, but it is very unremarkable given the title and how many upvotes the article got.

                                  1. 5

                                    Why not learn nix and then use it at work as well :) Nix knows how to package up a nix-defined environment into a docker container and produce very small images, and you don’t even need docker itself to do that. That’s what we do at work. I’m happy because as far as I’m concerned Nix is all there is and the DevOps folks are also happy because they get their docker images.

                                    1. 3

                                      I work in a humongous company where the tools and things are less free to choose from atm, so even if I learned nix, it would be a very tough sell..

                                2. 3

                                  As someone who hasn’t used Docker, it would be nice to see what that looks like. I’m curious how the two approaches compare.

                                  1. 6

                                    I think the key takeaway is that with Docker, you’re actually running a container with a full-blown OS inside. I have a bias against it, which is basically just my opinion, so take it with a grain of salt.

                                    I think that once the answer to the problem of I need to run some specific version of X becomes let’s just virtualize a whole computer and OS because dependency handling is broken anyway, we, as a profession, simply gave up. It is side-stepping the problem.

                                    Now, the approach with Nix is much more elegant. You have fully reproducible dependency graphs, and with nix-shell you can drop yourself in an environment that is suitable for whatever you need to run regardless of dependency conflicts. It is quite neat, and those shells are disposable. You’re not running in a container, you’re not virtualizing the OS, you’re just loading a different dependency graph in your context.

                                    See, I don’t use Nix at all because I don’t have these needs, but I played with it and was impressed. I dislike our current approach of just run a container; it feels clunky to me. I think Docker has its place, especially in DevOps, but using it to solve I need to run Python 2.x but it conflicts with my Python 3.x install is not the way I’d like to see our ecosystem going.

                                    In the end, from a very high-level, almost stratospheric point of view, both the docker and nix-shell workflows come down to the developer typing some commands in the terminal and having what they need running. So from the mechanical standpoint of needing to run something, they both solve the problem. I just don’t like that the evergreen “just run a container” is now the preferred solution.

                                    Just be aware that this is an opinion from someone heavily biased against containers. You should play with both of them and decide for yourself.

                                    1. 3

                                      This comment is a very good description of why I’ve never tried Docker (and – full disclosure – use Nix for things like this).

                                      But what I’m really asking – although I didn’t make this explicit – is a comparison of the ergonomics. The original post shows the shell.nix file that does this (although as I point out in another comment, there’s a shell one-liner that gets you the same thing). Is there an equivalent Dockerfile?

                                      I was surprised to see Docker brought up at all because my (uninformed) assumption is that making a Docker image would be prohibitively slow or difficult for a one-off like this. I assumed it would be clunky to start a VM just to run a single script with a couple dependencies. But the fact that that was offered as an alternative to nix-shell makes me think that I’m wrong, and that Docker might be appropriate for more ad-hoc things than I expected, which makes me curious what that looks like. It points out a gap in my understanding that I’d like to fill… with as little exertion of effort as possible. :)

                                      1. 4

                                        But the fact that that was offered as an alternative to nix-shell makes me think that I’m wrong, and that Docker might be appropriate for more ad-hoc things than I expected, which makes me curious what that looks like. It points out a gap in my understanding that I’d like to fill… with as little exertion of effort as possible. :)

                                        I think containers is a perfectly capable solution to this. The closest thing you can use would probably be toolbox.


                                          It would allow you to provide a standardized environment that is decoupled from the deployment itself (if that makes sense). It also mounts $HOME.

                                        1. 3

                                          I use Nix, but also have experience with Toolbox.

                                          I would recommend that most people use Toolbox over nix-shell. With toolbox you can create one-off containers in literally seconds (it’s two commands). After entering the container you can just dnf install whatever you need. Your home directory gets mounted, so you do not have to juggle volumes, etc. If you need to create the same environment more often, you can write a Dockerfile and build your toolbox containers with podman. The upstream containers that Fedora provides are also just built from Dockerfiles.

                                          The post shows a simple use case, but if you want to do something less trivial, it often entails learning Nix the language and nixpkgs (and all its functions, idioms, etc.). And the Nix learning curve is steep (though much gentler if you are familiar with functional programming). This makes the toolbox approach orders of magnitude easier for most people: you basically need to know toolbox create and toolbox enter, and you can use all the knowledge you already have.

                                          However, a very large shortcoming of toolbox/Dockerfiles/etc. is reproducibility. Sure, you can pass around an image and someone else will have the same environment. But Nix allows you to pin all dependencies plus the derivations (e.g. as a git SHA). You can give someone your Nix flake and they will have exactly the same dependency graph and build environment guaranteed.
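                                          To make that concrete, here is a minimal flake sketch; the branch name and package attributes are illustrative assumptions, not taken from the article:

                                          ```nix
                                          {
                                            # The input URL selects a nixpkgs branch; the generated flake.lock
                                            # pins the exact revision, so everyone resolves the same graph.
                                            inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-21.05";

                                            outputs = { self, nixpkgs }:
                                              let pkgs = nixpkgs.legacyPackages.x86_64-linux;
                                              in {
                                                # `nix develop` drops you into a shell with exactly these tools.
                                                devShell.x86_64-linux = pkgs.mkShell {
                                                  buildInputs = [ pkgs.python39 pkgs.postgresql ];
                                                };
                                              };
                                          }
                                          ```

                                          Committing flake.lock next to this file is what provides the guarantee: another machine evaluates the same nixpkgs revision, not whatever its channel happens to point at.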

                                          Another difference is that once you know Nix, it is immensely powerful for defining packages. Nix is a Turing-complete functional language, so nixpkgs can provide a lot of powerful abstractions. I dread every time I have to create or modify an RPM spec file, because it is so primitive compared to writing a Nix derivation.

                                          tl;dr: most people will want to use something like Toolbox; it is familiar and provides many of the same benefits as e.g. nix-shell (isolated, throw-away environments, with your home directory available). However, if you want strong reproducibility across systems and a more powerful packaging/configuration language, learning Nix is worth it.

                                        2. 3

                                          A cool aspect of Docker is that it has a gazillion images already built and available for it. So depending on what you need, you’ll find a ready-made image you can put to good use with a single command. If there are no images that fill your exact need, then you’ll probably find an image that is close enough and can be customised. You don’t need to create images from scratch. You can remix what is already available. In terms of ergonomics, it is friendly and easy to use (for these simple cases).

                                          So, nixpkgs has a steeper learning curve than Dockerfiles, and it might be simpler to just run Docker. What I don’t like is what happens inside Docker, and how the solution to seemingly simple problems involves running a whole OS.

                                          I’m aware that you can have containers without an OS inside, as described in this thread, but that is not something I often see people using in the wild.

                                        3. 1

                                          Nit-pick: AFAIK one doesn’t really need Alpine or any other distro inside the container. It’s “merely” for convenience. AFAICT it’s entirely possible to e.g. run a Go application in a container without any distro. See e.g. https://www.cloudbees.com/blog/building-minimal-docker-containers-for-go-applications
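                                          For illustration, the multi-stage pattern from the linked post looks roughly like this (the image tag and paths are assumptions, not taken from the post):

                                          ```dockerfile
                                          # Build stage: a full Go toolchain image.
                                          FROM golang:1.17 AS build
                                          WORKDIR /src
                                          COPY . .
                                          # CGO_ENABLED=0 yields a statically linked binary with no libc dependency.
                                          RUN CGO_ENABLED=0 go build -o /app .

                                          # Final stage: "scratch" is an empty filesystem: no shell, no distro.
                                          FROM scratch
                                          COPY --from=build /app /app
                                          ENTRYPOINT ["/app"]
                                          ```

                                          The shipped image contains exactly one file, which is why the distro inside the container really is just a convenience.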

                                    2. 3

                                      Let’s assume nix shell is actual magic — like sorcerer level, wave my hand and airplanes become dragons (or vice versa) magic — well, this article just demonstrated that immense power by pulling a coin out of a deeply uncomfortable kid’s ear while pulling on her nose.

                                      I can’t speak for the previous comment’s author, but those extra details, or indeed any meat on the bones, would definitely help justify this article’s otherwise nonsensical ranking.

                                      1. 2

                                        Yeah, I agree with your assessment. This article could just as well have the title “macOS is so fragile, I consider this simple thing to be an issue”. The trouble with demonstrating nix shell’s power is that for all the common cases, you have a variety of ad-hoc solutions. And the truly complex cases appear contrived out of context (see my other comment, which you may or may not consider to be turning airplanes into dragons).

                                    3. 19

                                      nix is not the first thing most devs would think of when faced with that particular problem, so it’s interesting to see reasons to add it to your toolbox.

                                      1. 9

                                        Good, as it is not supposed to be the first thing. Learning a fringe system with a new syntax just to do something trivial is not supposed to be the first thing at all.

                                      2. 4

                                         I also find it baffling that this story has more upvotes than the excellent and original code visualization article currently also very high. Probably some Nix upvote ring pushing this.

                                        1. 12

                                          Or folks just like Nix I guess? 🤷

                                          1. 11

                                            Nix is cool and people like it.

                                            1. 5

                                              I didn’t think this article was amazing, but I found it more interesting than the code visualization one, which lost me at the first, “From this picture, you can immediately see that X,” and I had to search around the picture for longer than it would have taken me to construct a find command to find the X it was talking about.

                                              This article, at least, caused me to say, “Oh, that’s kind of neat, wouldn’t have thought of using that.”

                                            2. 6

                                              This article is useless. It is way simpler (and the Python way) to just create a 2.7 virtualenv and run “pip install psycopg2 graphviz”. No need to write a Nix file, and then write a blog post to convince yourself you didn’t waste your time!

                                              Considering all nix posts get upvoted regardless of content, it’s about time we have a “nix” tag added to the site.

                                              1. 14

                                                This article is not useless just because you don’t see its value.

                                                I work mainly with Ruby and have to deal with old projects. There are multiple instances where the Ruby way (using a Ruby version manager) did not work because it was unable to install an old Ruby version or gem on my new development machine. Using a nix-shell did the job every time.

                                                just create a 2.7 virtualenv and run “pip install psycopg2 graphviz”

                                                What do you do if this fails due to some obscure dependency problem?

                                                1. 4

                                                  What do you do if this fails due to some obscure dependency problem?

                                                  Arguably you solve it by pinning dependency versions in the pip install invocation or requirements.txt, as any Python developer not already using Nix would do.
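                                                  Concretely, that pinning is just exact versions in a requirements.txt (the version numbers below are illustrative):

                                                  ```
                                                  psycopg2==2.8.6
                                                  graphviz==0.16
                                                  ```

                                                  That pins the Python packages, though — as the replies note — not the interpreter or the system libraries underneath them.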

                                                  This article is not useless just because you don’t see its value.

                                                  No, but it is fairly useless because it doesn’t do anything to establish that value, except to the choir.

                                                  1. 2

                                                    In my experience there will come a point where your dependencies fail due to mismatched OpenSSL or glibc versions and so on. No amount of pinning dependencies will protect you against that. The only way out is to update your dependencies and the version of your language, but that would just detract from your goal of getting an old project to run, or is straight up impossible.

                                                    Enter Nix: You pin the entire environment in which your program will run. In addition you don’t pollute your development machine with different versions of libraries.

                                                    1. 3

                                                      Arguably that’s just shifting the burden of effort based on a value judgement. If your goal is to get an old project to run while emphasizing the value of incurring zero effort in updating it, then obviously Nix is a solution for you and you’ll instead put the effort into pinning its entire runtime environment. If, however, your value to emphasize is getting the project to run then it may well be a more fruitful choice to put the effort into updating the project.

                                                      The article doesn’t talk about any of the hairier details you’re speaking to; it just shows someone taking a slightly out-of-date Python project and not wanting to put any personal effort into updating it… but updating it by writing an (in this case relatively trivial) Python 3 version and making that publicly available to others would arguably be the “better” solution, at least in terms of the value of contributing back to the community whose work you’re using.

                                                      But ultimately my argument isn’t with the idea that Nix is a good solution to a specific problem, it’s that this particular article doesn’t really make that point and certainly doesn’t convincingly demonstrate the value of adding another complex bit of tooling to the toolkit. All the points you’ve raised would certainly help make that argument, but they’re sadly not present in this particular article.

                                                  2. 1

                                                    Just out of curiosity: I’m also dealing with ancient Ruby versions and use Nix at work, but I couldn’t figure out how to get old enough versions. Is there something that helps with that?

                                                      1. 1

                                                        Thank you, very helpful!

                                                        1. 1

                                                          Do note this method will get you a ruby linked to dependencies from the same checkout. In many cases this is what you want.

                                                          If instead you want an older ruby but linked to newer libraries (eg, OpenSSL) there’s a few extra steps, but this is a great jumping off point to finding derivations to fork.

                                                          1. 1

                                                            Do note this method will get you a ruby linked to dependencies from the same checkout. In many cases this is what you want.

                                                            Plus glibc, OpenSSL and other dependencies with many known vulnerabilities. This is fine for local stuff, but definitely not something you’d want to do for anything that is publicly visible.

                                                            Also, note that mixing different nixpkgs versions does not work when an application uses OpenGL, Vulkan, or any GPU-related drivers/libraries. The graphics stack is global state in Nix/NixOS and mixing software with different glibc versions quickly goes awry.

                                                      2. 2

                                                        This comment mentions having done something similar with older versions by checking out an older version of the nixpkgs repo that had the version of the language that they needed.

                                                        1. 2

                                                          Like others already said, you can just pin nixpkgs. Sometimes there is more work involved. For example, this is the current shell.nix for a Ruby on Rails project that wasn’t touched for 5 years. I’m in the process of setting up a reproducible development environment to get development going again. As you can see, I have to jump through hoops to get Nokogiri to play nicely.
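                                                          For readers who can’t open the linked file, the general shape is something like this sketch; the nixpkgs tarball and attribute names are hypothetical and depend on which old revision still carries the Ruby you need:

                                                          ```nix
                                                          let
                                                            # Pin a nixpkgs checkout old enough to still package the Ruby you need.
                                                            pkgs = import (builtins.fetchTarball
                                                              "https://github.com/NixOS/nixpkgs/archive/nixos-18.09.tar.gz") { };
                                                          in
                                                          pkgs.mkShell {
                                                            # Native libraries that gems such as Nokogiri compile against.
                                                            buildInputs = [ pkgs.ruby_2_4 pkgs.bundler pkgs.libxml2 pkgs.libxslt pkgs.zlib ];
                                                          }
                                                          ```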

                                                          There is also a German blog post with shell.nix examples in case you need inspiration.

                                                      3. 4

                                                        this example, perhaps. I recently contributed to a Python 2 code base, and running it locally was very difficult due to C library dependencies. The best I could do at the time was a Dockerfile (which I contributed with my changes) to encapsulate the environment. However, even with the container approach, fetching dependencies is still just as nebulous as “just apt install xyz”. Changes to the base image, to an ambiently available dependency, or the distro simply turning off package-manager services for unsupported versions will break the container build. In the Nix case, the user is forced to spell out completely what the code needs; combine that with flakes and I have a lockfile not only for my Python dependencies, but effectively for the entire shell environment.

                                                        More concretely, at work, the powers that be wanted to deploy Python to an old armv7 SoC running on a device. Some of the Python code requires C dependencies like OpenSSL, the protobuf runtime, and other things, and it was hard to cross-compile these for the target. Yes, for development it works as you describe: you just use a venv, pip install (pipenv, poetry, or whatever), and everything is peachy. Then comes deployment:

                                                        1. First you need a cross-compiled Python interpreter, which involves first building the interpreter for your host triple, then rebuilding the same source for the target triple while telling the build process where the host-triple build is. This also ignores that some important parts of the interpreter, like ctypes, may not build.
                                                        2. Learn every environment variable you need to expose to setup.py, or to the n-teenth build/packaging solution used by the Python project you want to deploy, and hope it generates a wheel. We will conveniently ignore how every C-dependent package may use cmake, or make, or meson, etc., etc…
                                                        3. Make the wheels available to the image you actually ship.

                                                        I was able to crap out a proof of concept in a small Nix expression that made a shell running the Python interpreter I wanted, with the Python dependencies needed, on both the host and the target, and I didn’t even have to think. Nixpkgs even gives you cross-compiling capabilities.

                                                        1. 1

                                                          Your suggested plan is two years out of date, because CPython 2.7 is officially past its end of life and Python 2 packages are generally no longer supported by upstream developers. This is the power of Nix: Old software continues to be available, as if bitrot were extremely delayed.

                                                          1. 3

                                                            CPython 2.7 is available in Debian stable (even testing and sid!), CentOS, and RHEL. Even on macOS it is still the default Python that ships with the system. I don’t know why you think it is no longer available in any distro other than Nix.

                                                      1. 16

                                                        The author hit the nail on the head in his own piece. Most of these job offers are not for webSITE building, they’re for webAPP building, and as a consequence the framework they’re written in matters when hiring new people.

                                                        1. 3

                                                          Has anyone tried to define the difference between the two? It always seems like there’s an implicit assumption that it’s easy and obvious to tell what defines a site vs. app.

                                                          1. 4

                                                            For me the differentiating factor is the degree of interactivity. A web site is primarily concerned with displaying largely static information, and potentially sending some data back to a server with standard HTML forms. A web app, on the other hand, often has a high degree of interactivity, possibly communicates with a REST API, database, or other collaborating service, and works more like a traditional application that just so happens to run inside a web browser.

                                                            1. 3

                                                              This definition is a little circular, though.

                                                              • You implement websites with HTML forms and webapps with React
                                                              • How do I know if it’s a webapp or a website?
                                                              • Well, websites use HTML forms and webapps use REST APIs etc.

                                                              I know I’m exaggerating a little :) But I think it’s important to keep implementation out of the definition if these terms are to be useful in this discussion. Because then what do I call applications that are built with technologies like Hotwire? They aspire to handle fairly complex interactivity with very minimal JavaScript.

                                                        1. 3

                                                          What’s funny is that this is how a lot of old C projects used to work, they would use a Common Lisp program that would generate C code in the manner mentioned here. IIRC Chip Morningstar used to do this when he worked at LucasArts developing video games.

                                                          1. 1

                                                            The metalang99 readme actually rejects code generators:

                                                            The main reason is that native macros allow you to interleave macro invocations with ordinary code. Code generators, on the other hand, force you to either “program in comments” or in DSL-specific files. Can you imagine using Datatype99 in this way? You would have to put nearly all code manipulating sum types into separate files or feeding all your sources to a code generator, thus losing IDE support and development convenience.

                                                            1. 1

                                                              And that’s exactly why I think C++’s template metaprogramming is something truly special. The template sublanguage is pretty much a convoluted untyped lambda calculus, and it lets you syntactically generate code that gets type-checked after full instantiation. Still, the host language (the template sublanguage) and the object language (the remaining C-like language, forget about OOP) are very tightly integrated. You can call templated functions in the middle of non-templated functions, and the template arguments get deduced from the surrounding typing context, resulting in code generation followed by type checking and further template instantiations if necessary. And then there’s constexpr, which lets you run object-language code while evaluating the host language! This interleaving of type checking, deduction, interpretation, and code generation is not possible with external programs generating C code. And you get all of it with full IDE support.
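                                                              A minimal sketch of that interleaving (all names here are made up for illustration): a constexpr function is ordinary object-language code, yet template instantiation evaluates it during type checking and feeds the result back into a generated type.

                                                              ```cpp
                                                              #include <array>
                                                              #include <iostream>
                                                              #include <type_traits>

                                                              // Object-language code: an ordinary function, also callable at runtime.
                                                              constexpr int half(int n) { return n / 2; }

                                                              // Host-language code: the template machinery *runs* half() at compile
                                                              // time and uses the result to generate a distinct array type.
                                                              template <int N>
                                                              using HalfArray = std::array<int, half(N)>;

                                                              int main() {
                                                                  // Generation happened during type checking: HalfArray<10> is
                                                                  // std::array<int, 5>, verified before the program ever runs.
                                                                  static_assert(std::is_same_v<HalfArray<10>, std::array<int, 5>>);
                                                                  std::cout << "HalfArray<10> holds " << HalfArray<10>{}.size()
                                                                            << " elements\n";
                                                                  return 0;
                                                              }
                                                              ```

                                                              An external C code generator could emit the array type, but it could not ask the type checker to evaluate half() mid-deduction, which is the point being made above.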

                                                              Don’t get me wrong: all of the above said, C++ has so many rough edges and historical accidents that I’ve long since moved on to Haskell, but my heart still itches for a modern language with all the modern tooling that has the same interplay C++ does: an untyped host language driving a simply typed, non-garbage-collected, runtime-free object language. I’ve had my eyes on terralang for a few years now, but it seems a little experimental (and I don’t have the time to become a contributor and fix issues as I encounter them 😔).

                                                              1. 1

                                                                There are two big problems with C++ templates:

                                                                First, the type language and the expression language are completely distinct. I can write a == b but I can’t write decltype(a) == decltype(b), I have to instead write std::is_same_v<decltype(a), decltype(b)>. This makes anything that needs to move between the expression and type languages very clunky. This is slowly being improved, with things like constexpr / constinit and the reflection TS that let you stay in the expression language for much longer.

                                                                The second problem is that (until concepts) it lacked any concept of a metatype. All types that you manipulate in the type language are instances of the type typename. This meant that your only option for type checking was at instantiation. You can’t write a template and type check it. This is largely addressed by concepts, except that concept syntax makes template syntax look pleasant in comparison.

                                                                The C++ code that I write makes very heavy use of templates but it always makes me long for a cleaner language.

                                                          1. 2

                                                            Company: SlyceData

                                                            Company site: https://www.slycedata.com/

                                                            Position(s): Haskell engineer

                                                            Location: Fully remote, anywhere in the world, but must be able to collaborate with team members during US Eastern Time business hours


                                                            Haskell engineer, full-time, competitive compensation + equity options

                                                            SlyceData’s technology accelerates and greatly simplifies the investment research process allowing investment managers to be more efficient and competitive in today’s global marketplace. We solve the two bottlenecks of investment research: the initial ingestion of new datasets, and the ongoing work required to query these datasets correctly. This means researchers can simply request the data they want and receive it rapidly and correctly, without worrying about the data adjustments, enhancements, time-alignment, corporate actions, or other corrections required within their daily workflow.

                                                              The greatest challenge in the financial data industry today is the transformation from raw disparate vendor tables into an easily accessible database supporting multiple vendors, instrument types and asset classes. To solve this problem, we created a flexible query engine that can handle the highly idiosyncratic aspects of vendor data and user preferences. Designing technology to automate this process requires uniting two areas of expertise: functional programming expertise to create an intelligent query engine, and financial data expertise to embed the business logic into the queries. We are building an engine that dynamically interprets a user’s data requests, generates optimized queries in real time, and applies the correct math/logic to deliver ready-to-analyze data without any pre-processing requirements.


                                                            • Collaborate in the design, implementation, deployment, and maintenance of business-critical software
                                                            • Optimize the performance of our data analytics DSL and implement new language features
                                                            • Design and implement data models, runtime DB queries, migrations and backend application logic
                                                            • Capture and analyze system logs and performance metrics from production environments to diagnose and solve issues
                                                              • Work with the customer support team in responding to issues and answering client questions

                                                            Tech stack: Haskell, PostgreSQL

                                                            Contact: Please send a cover letter and resume to jobs@slycedata.com

                                                            1. 10

                                                                I can’t imagine how this won’t go on forever, as long as the worst that can happen to a patent troll is some wasted time. The law needs to define the term “patent troll”, and you should be able to sue someone on that claim; if they’re found guilty, the consequences should ensure they won’t go anywhere near the patent office for the rest of their lives.

                                                              1. 16

                                                                … or, for as long as intellectual property - beyond the right to be identified as the creator of a work - continues.

                                                                To quote Jefferson,

                                                                If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it.

                                                                Its peculiar character, too, is that no one possesses the less, because every other possesses the whole of it.

                                                                He who receives an idea from me, receives instruction himself without lessening mine; as he who lights his taper at mine, receives light without darkening me.

                                                                That ideas should freely spread from one to another over the globe, for the moral and mutual instruction of man, and improvement of his condition, seems to have been peculiarly and benevolently designed by nature, when she made them, like fire, expansible over all space, without lessening their density in any point, and like the air in which we breathe, move, and have our physical being, incapable of confinement or exclusive appropriation.

                                                                Inventions then cannot, in nature, be a subject of property.

                                                              1. 2

                                                                  I don’t get it: what flexibility do dynamic types offer?

                                                                1. 7

                                                                  One issue: Type Checking vs. Metaprogramming; ML vs. Lisp. These features are at odds: dynamically typed languages let you write shorter programs because you can “compress” the code via metaprogramming. All other things being equal, shorter programs are more correct.

                                                                    That post is from 2016, but it’s come up a lot in Oil since then. Oil is now statically typed (with MyPy) for speed, not correctness. (Some color on that.) Static types are great for predictable speed (the kind of speed I want).

                                                                  The same issue comes up in Rust vs. Zig – Rust favors static typing while Zig favors metaprogramming (even though both languages have some of both). Comment on that, and the whole thread is good.

                                                                  It’s a tradeoff; some problem domains favor one or the other. In distributed systems, you deal with data from the network a lot. The more you are interacting with “the world” vs operating on your own models, the more of a tax static typing becomes.

                                                                  The Rust vs. Zig example is also about interacting with the world (pins on an embedded chip), not models you constructed in your head. My motto is that when models and reality collide, reality wins :) The map isn’t the territory. Static typing is a map that often helps (or doesn’t help) reason about dynamic (runtime) behavior – that’s it. The runtime behavior is what you care about, but sometimes people forget this and let the tail wag the dog.

                                                                  Thread on that: https://old.reddit.com/r/ProgrammingLanguages/comments/nqm6rf/on_the_merits_of_low_hanging_fruit/h0cqvuy/

                                                                  where I reference: https://twitter.com/jeanqasaur/status/1394804946818650115

                                                                  edit: I was thinking about writing a post about other examples of the “world”:

                                                                  • R: “the world” you don’t control is messy data. You have to clean it and transform it BEFORE it fits some model. The entire ingestion and cleaning process often exceeds the analysis in terms of code volume by 10x. So that’s why the dynamic language “won”. Though speed is a huge problem here – Julia is a dynamically typed language with a novel compilation model that addresses this problem.
                                                                  • PHP: “the world” is basically people and policies. It’s not a surprise that Wikipedia was written in PHP. The whole thing is a miracle of coordination, incentives, etc. and requires a ton of evolved software to solve. You can’t model this up front. Ditto with all the communities that evolved around forked/customized PHP forum software.
                                                                  • Shell: “the world” is the Unix kernel, and everything that lives on the other side of it (disks, networks, devices). Trying to make this fit a particular type system is folly; the kernel already has its own model. You can probably find really long-winded static wrappers for kernel APIs that don’t actually make your program any better. (Often what I do is write typed application-specific wrappers, i.e. “role interfaces”, and that approach works better IME.)

                                                                  But I will probably leave it at this comment since it’s better to actually build Oil than write blog posts about static vs. dynamic typing :)

                                                                  1. 3

                                                                    Can you elaborate more on how PHP as a technology was instrumental in Wikipedia’s development? This is the first I’ve heard of this.

                                                                    1. 2

                                                                      Hm I don’t have anything all that concrete, but it’s just an observation from years (decades) of using social software written in PHP.

                                                                      There are multiple related arguments about language design and apps, and I probably can’t tease them apart in this space, but it is basically that R, PHP, and shell are more like “end user programming” or “non-programmer programming” – and moreover programmers actually end up using a lot of this software!

                                                                      It is a huge amount of irreducible domain logic. It’s better if the people who actually understand the domain write the code, and that is borne out by what we actually use in the real world (social software like Wikipedia with lots of arguments, policies, and moderation; R code written by statisticians rather than translated by programmers; distros full of shell, etc.)

                                                                      UIs are like that (as opposed to systems software), and it seems clear to me that social software is like that. One example is all the recent kerfuffle around lobste.rs moderation. There are lots of rules you have to encode in software, and if you do it wrong, the community can fail.

                                                                      I think a long time ago I was probably influenced by ideas from Joel Spolsky and Clay Shirky. I googled and found these links, although I don’t know if they are making precisely the same argument.




                                                                      My claim is that this kind of software is necessarily evolved, because at the beginning you don’t even have the people yet. Code in dynamic languages evolves more gracefully, and moreover it’s accessible to domain experts who iteratively respond to “the world”.

                                                                      I think the creator of Elm did a talk on this, but I don’t think he ever “shipped” his ideas, while PHP programmers are constantly shipping because they edit on the server :) (this is somewhat tongue-in-cheek)

                                                                      Here is a good “contrarian” view about dynamic languages from a PL researcher, just rewatched this:


                                                                      Very nice presentation by a HHVM creator that makes a good argument for PHP:



                                                                      1. 1

                                                                        To answer the wikipedia question more directly, I guess my claim is that the “default” is for communities to fail, and Wikipedia has not failed for nearly 20 years straight. That’s a huge feat and it requires a ton of software – nearly all of which is cobbled together in PHP on demand, as far as I can tell (e.g. all the bots that implement content policies, which are kind of semi-automated, etc.).

                                                                        If you’re trying to model this up front, I think it’s missing the point. Maybe someone will prove me wrong, but so far I don’t see many counterexamples. Maybe StackOverflow is one because that’s a huge site built around logic in C#. I’m not really sure of the details though; I think it’s more of a “site” developed by a company than a self-organizing “community”.

                                                                    2. 3

                                                                      For any type system, there are correct programs rejected by the type system.

                                                                      1. 3

                                                                        On the other hand, dynamically typed programming is actually just programming with a single type and a lot of partial functions… In practice, this means you can get back most of that metaprogramming in a typed language, by selectively dropping types in favor of runtime checks, but the reverse isn’t necessarily possible in a “less typed” language. Put differently, in practical statically typed programming, there’s always a way to tell the type system “yeah, that’s not gonna happen, just error out if it somehow does”, which allows you to still express that correct but rejected program.

                                                                    1. 6

                                                                      I happen to be working on a project that uses postgres in all the ways mentioned in this article, and I think that was a great decision that kept the system simpler and more coherent. I just want to mention a few caveats I’ve encountered though:

                                                                      • Keeping the job locked while you’re working on it is great, but it means you’re dedicating an entire connection for this purpose and you don’t have so many of them in Postgres
                                                                      • Sometimes the worker dies a weird death and Postgres takes a long time to notice that the connection is dead, so the job can stay locked longer than you might expect
                                                                      • Be careful about what else you run over the connection holding that lock; the unintentional locks you might acquire can cause deadlocks
                                                                      • Be careful with foreign key references to and from the row that stays locked; modifying rows that refer to, or are referred to by, other rows has non-trivial locking behaviour across the involved tables, so make sure you understand what’s going on
                                                                      • Be careful with these locked jobs in combination with migrations. Lock management in migrations is always tricky, but these long running locks tend to reveal unlikely corner cases
                                                                      • If you’re going to use Postgres for pub-sub, design for periods of deafness. You might think you’re listening to a channel, but maybe the connection’s been dead for a while, so you need to take measures to detect that, and when you do, you need to look around to figure out what you might have missed in that period.
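For concreteness, here’s a sketch of the claim-and-hold shape most of these caveats come from (table and column names are invented; a real schema will differ). The row lock taken by `FOR UPDATE SKIP LOCKED` lives until the transaction ends, which is exactly why the connection stays dedicated and why a worker dying weirdly matters:

```python
# Hypothetical job-claiming transaction. The row lock is held until
# COMMIT/ROLLBACK, so this connection is pinned for the job's lifetime;
# if the worker dies, the job is released once Postgres notices.
CLAIM_JOB_SQL = """
BEGIN;
SELECT id, payload
  FROM jobs                    -- assumed table name
 WHERE done = false
 ORDER BY enqueued_at
   FOR UPDATE SKIP LOCKED      -- other workers skip rows we hold
 LIMIT 1;
-- ... the worker runs the job over this same connection ...
UPDATE jobs SET done = true WHERE id = %(id)s;
COMMIT;
"""
```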

                                                                      Despite the above caveats, I still love using Postgres in this way.

                                                                      1. 2

                                                                        Detailed technical comments based on experience like yours are why I love lobste.rs! Thank you.

                                                                      1. 3

                                                                        I don’t think this goal is attainable for a programmer, especially one that’s doing a good job working on a non-cookie-cutter project. Over time, a good programmer learns the business domain almost as well as the users they’re serving, and they simultaneously learn the ins and outs of the existing code base and how the components map to the business concepts. The longer you work on a project, the more your brain will morph into a machine that’s streamlined to solve problems in your domain by changing the existing system. That kind of domain-brain-system adaptation is impossible to transfer between people, so a company loses a huge chunk of its soul (and competitive advantage) when a dedicated and experienced developer leaves.

                                                                        1. 2

                                                                          This is true! But also not. There’s lots of stuff that isn’t domain knowledge that is nonetheless necessary to work on a project. For example, I’ve seen an undocumented release process used to protect a fiefdom. A mature engineer would at least document the process, or preferably automate it.

                                                                        1. 13

                                                                          In my experience, one way this goes so horribly wrong is through your innocent debug lines unconsciously becoming information somebody else relies on via automation.

                                                                          In startup settings, something that commonly happens is that in the beginning nobody cares about your product, much less its logs, so you hack away, trying to push the important features out to land some big customers. Then one day, in the middle of your start-up chaos, the product manager tells you that the IT department of one of your important customers has been scavenging your logs for alerts, and that they’d like you to add this and that bit of information to some lines. It comes to you as a tiny low-priority task, one that couldn’t possibly justify the arduous job of redesigning your whole logging story. That’s one way your unstructured log dumps slowly become a loosely defined interface with nebulous compatibility concerns. It was never designed properly and it will never be redesigned…

                                                                          1. 34

                                                                            Total let down

                                                                            $ curl https://get.tool.sh | sudo bash
                                                                            curl: (6) Could not resolve host: get.tool.sh
                                                                            1. 18

                                                                              Good, good, so the echo "curl: (6) Could not resolve host: get.tool.sh" line is serving its purpose 😈

                                                                              1. 8

                                                                                You brave, brave soul.

                                                                                1. 4

                                                                                  They were probably running it on someone else’s machine they’d already breached ;-P

                                                                              1. 35

                                                                                Litestream author here. Michael Lynch did a great job describing Litestream in the post but I’m happy to answer any questions as well.

                                                                                1. 3

                                                                                  First of all, thank you for authoring Litestream! I was wondering how this immediate replication works with AWS S3. If we want every single transaction to end up in the replica, we’d need to save every incremental change, but isn’t that against the object model of S3? I.e. AFAIK you can’t append to your S3 files, so does Litestream create a new S3 object for each row update? Wouldn’t that be slow/expensive? Or does it only upload the WAL from checkpoint to checkpoint, in which case, how can we be sure that the workflow in the post will work? (i.e. make a change in heroku, stop the application, run it somewhere else and see that your change is there now)

                                                                                  1. 9

                                                                                    Litestream works as asynchronous replication, similar to Postgres replication. It reads new pages off the WAL and copies them to a replica (either another disk or S3). The S3 replica has a configurable interval for uploading these batches of WAL pages. By default it’s every 10 seconds, although you can reasonably configure it down to lower thresholds, such as every 1 second.

                                                                                    It all works as kind of a hack around S3’s cost structure. S3 doesn’t charge bandwidth for uploads—it only charges $0.000005 per POST request. If you were to have a constant stream of writes against your database and Litestream uploaded every 10s then it would cost you about $1.30/month in upload request charges. That’s assuming a constant stream of writes 24/7 though. Most applications will have periods of inactivity so your costs will likely be lower as in the OP’s case.
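The $1.30 figure checks out with quick arithmetic (the per-request price is the approximate one quoted above, not authoritative):

```python
# One WAL batch upload every 10 seconds, nonstop, for a 30-day month.
PRICE_PER_REQUEST = 0.000005               # USD per S3 PUT/POST (approximate)
uploads_per_month = 30 * 24 * 60 * 60 / 10 # 259,200 uploads
cost = uploads_per_month * PRICE_PER_REQUEST
print(f"${cost:.2f}/month")                # about $1.30
```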

                                                                                    1. 2

                                                                                      Thanks for the nice answer! That seems like a really nice trade-off. Would it at all be possible for the application to know when a transaction has been successfully replicated? I’m imagining a situation where the effects of a particular transaction need to be guaranteed to have persisted before a follow-up action is taken.

                                                                                      1. 3

                                                                                        Yes, I put together an example of using Litestream as a library the other day and it allows you to force a flush to confirm that it’s been replicated before returning to a client:


                                                                                        It currently only works with Go applications although I’d like to make the functionality available with the regular litestream command so any language can do it—maybe using unix sockets or something. I haven’t quite decided yet.

                                                                                1. 4

                                                                                  This is a good subject for an article, and looks like good info, but the opening sentence makes me think it’s going to be bad. There’s some truth there but it lacks a lot of subtlety – slow and fast are relative to the application, etc. IMO a better opening sentence would be something like: “I hit this performance wall in my Python code, tried to speed it up with C, and was surprised”. What happened here? Why is this slow? etc.

                                                                                  FWIW the way I think of it is that Python is 10-50x slower than native code, usually closer to 10x. So it’s about 1 order of magnitude. You have around 9 orders of magnitude to play on a machine; a distributed system might make that 11 to 14. Lots of software is 1 to 3 orders of magnitude too slow, regardless of language.

                                                                                  Also, the most performance intensive Python app I ended up optimizing in C++ was mainly for reasons of memory, not speed. You get about an order of magnitude increase there too.

                                                                                  1. 10

                                                                                    Your estimate of 10x seems WAY too low in my experience. It obviously depends on the use case; I/O bound programs will obviously be much closer because you’re not actually waiting for the language, but number crunching code tends to be closer to 100x than 10x.

                                                                                    I just did a quick test, a loop with 100000000 function calls in C and Python. The C loop ran in 0.2 seconds; the Python program in 17.2 seconds. That’s an 86x difference. (Yes, the C code was calling a function from another TU, using a call instruction. The compiler didn’t optimize away the loop.)

                                                                                    I also implemented the naive recursive Fibonacci function in both C and Python. The C version calculated fib(40) in 0.3 seconds. The Python version calculated fib(40) in 42.5 seconds. That’s a 142x difference.

                                                                                    I implemented the function to do 1 + 2 + 3 + ... + n (basically factorial but for addition) in C and Python, using the obvious iterative method. C did the numbers up to 1000000000 in 0.41 seconds. Python did it in one minute 38 seconds. That’s a 245x difference.
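For anyone who wants to reproduce that last measurement, the Python side is just the obvious loop (n is scaled down here so it finishes quickly):

```python
import time

def sum_to(n):
    # Each iteration pays for interpreter dispatch and boxed-integer
    # arithmetic, which is where the large gap against C comes from.
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

n = 10_000_000  # the measurement above used 1_000_000_000
start = time.perf_counter()
result = sum_to(n)
elapsed = time.perf_counter() - start
assert result == n * (n + 1) // 2  # closed form as a sanity check
print(f"{elapsed:.2f}s")
```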

                                                                                    Don’t get me wrong, Python is a fine language. I use it a lot. It’s fast enough for most things, and tools like numpy let you do number crunching fairly quickly in Python (by doing the number crunching in C instead of in Python). But Python is ABSOLUTELY a slow language, and depending on what you’re doing, rewriting your code from Python to C, C++ or Rust is likely to make your code hundreds of times faster. I have personally experienced, many times, that my Python code is analyzing some large dataset in hours while C++ would’ve done it in seconds or minutes.

                                                                                    You have around 9 orders of magnitude to play on a machine

                                                                                    This is very often false. Games often spend many milliseconds on physics simulation; one order of magnitude is the difference between 60 FPS and 6 FPS. Non-game physics simulations can often take minutes; you don’t want to slow that down by a couple of orders of magnitude. Analyzing giant datasets can take hours; you really don’t want to slow that down by a few orders of magnitude.

                                                                                    1. 5

                                                                                      Sure, I said 10 - 50x, but you can say 10 - 100x or 10 - 200x if you want. I measured exactly the fib use case at over 100x here:


                                                                                      Those are microbenchmarks though. IME 10-50x puts you more in the “centroid” of the curve. You can come up with examples on the other side too.

                                                                                      I’d say it’s closer to 10x for the use cases that people actually use Python for. People don’t use it to write 60 fps games, because it is too slow for that in general.

                                                                                      But this is all besides the point… If the post had included the subtleties that you replied with, then I wouldn’t quibble. My point is that making blanket statements without subtlety distracts from the main point of the article, which is good.

                                                                                      1. 5

                                                                                        But my point is that the subtleties aren’t required, because (C)Python just is a slow language. It doesn’t have to be qualified. Its math operations are slow, its function calls are slow, its control flow constructs are slow, its variable lookups are slow, it’s just slow by almost any metric compared to JITs and native code. If the article had started with a value judgement, like “Python is too slow”, I would agree with you, but “Python is slow” seems extremely defensible as a blanket statement.

                                                                                        1. 5

                                                                                          Well I’d say it’s not a useful statement. OK let’s concede it’s slow for a minute – now what? Do I stop using it?

                                                                                          A more helpful statement is to say what it’s slow relative to, and what it’s slow for.

                                                                                          To flip it around, R is generally 10x slower than Python (it’s pretty easy to find some shockingly slow code in the wild; it can be optimized if you have the right knowledge). It’s still the best tool for many jobs, and I use it over Python, even though I know Python better. The language just has a better way of expressing certain problems.

                                                                                      2. 4

                                                                                        There’s actually another dimension of slowness that I see people often forget about when making comparisons like this. Due to the GIL, you’re essentially limited to a single core when your Python code is running, but a C/Rust/Go/Haskell program can use all the available cores within a single process. This means that in addition to the x10-x100 speed up you get from using those languages, you have another x10-x100 room for vertical scaling within a single process, for a combined x100-x10000. Of course, you can run multiple Python processes on the same hardware, or run them across a cluster of single core instances, but you’re not in a single process anymore. Which means it’s much harder to share memory and you have new inter-process architectural challenges which might limit what you can do.

                                                                                        For example, if I write my web application backend in Haskell, I can expect to vertically scale it quite a lot and, depending on the use case, I might even decide to stick with a single process model permanently, where I know that all the requests will arrive at the same process, so I can take advantage of the available memory for caching and I can keep ephemeral state local to the process, greatly simplifying my programming model. Single-process concurrency is much simpler than distributed concurrency after all. If I wrote that backend in Python, I would have to design for multi-process from the start, because SQLAlchemy would very soon bottleneck on the GIL while generating my SQL queries…
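A minimal sketch of the process-per-core escape hatch mentioned above (the workload and the numbers are invented for illustration). Threads wouldn’t help here, because only one of them can execute Python bytecode at a time; processes sidestep the GIL at the cost of separate memory spaces:

```python
from concurrent.futures import ProcessPoolExecutor

def count_primes(bounds):
    """CPU-bound toy workload: count primes in [lo, hi) by trial division."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    # Split the range into chunks and farm them out, one process per core.
    chunks = [(i, i + 5_000) for i in range(0, 20_000, 5_000)]
    with ProcessPoolExecutor() as pool:
        total = sum(pool.map(count_primes, chunks))
    # Same answer as a serial run, but computed across cores;
    # pure-Python threads can't do that under the GIL.
    assert total == count_primes((0, 20_000))
    print(total)
```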

                                                                                    1. 21

                                                                                      I forgot: there’s no reason to make a language in the 2000s without named arguments as default.

                                                                                      1. 12

                                                                                        I don’t want to disagree, but I want to share my experience. Before Rust, my primary language was Python, which is built around named arguments. I also used a lot of Kotlin, which has named arguments as well (although not as enshrined as in Python). I would expect that I’d miss named arguments in Rust, but this just doesn’t happen. In my day-to-day programming, I feel like I personally just never need either overloading or named parameters.

                                                                                        1. 5

                                                                                          I come from Swift, and I also (again, surprisingly to me) feel this way.

                                                                                          1. 1

                                                                                            Well, Swift has its own weird parameter naming that it inherited from ObjC.

                                                                                            1. 2

                                                                                              It’s definitely weird. But once you get used to writing function signatures and calls like

                                                                                              func set<K, V>(key: K, to value: V) {}
                                                                                              table.set(key: "foo", to: "bar")

                                                                                              …it starts to feel pretty natural. My expectation was that the transition to

                                                                                              fn set<K, V>(&mut self, key: K, val: V) {}
                                                                                              table.set("foo", "bar");

                                                                                              …would feel really unnatural. But somehow (in my experience) other aspects of Rust’s design come together to mean that I don’t miss the added clarity of those parameter labels.

                                                                                              1. 2

                                                                                                I find the Swift approach appealing in theory, but it’s hard to do well in practice. Maybe it’s just me, but I can never wrap my head around the conventions. I can read this 100 times and still struggle with my own functions: https://swift.org/documentation/api-design-guidelines/#parameter-names

                                                                                                For example, why isn’t your example func setKey<K, V>(_ key: K, to value: V) ?

                                                                                                When we get to the final bullet point in that guideline “label all other arguments”, how should we label them? So that the call site reads like a sentence? That doesn’t seem possible. So just name them what they represent? Is this right: func resizeBox(_ box: Box, x: Int, y: Int, z: Int)? Then the call site is way less cool than your example: resizeBox(box, x: 1, y: 2, z: 3).

                                                                                                1. 2

                                                                                                  For example, why isn’t your example func setKey<K, V>(_ key: K, to value: V)?

                                                                                                  After reading the API guidelines again, I think you’re right, it should have been that :)

                                                                                                  So just name them what they represent? Is this right: func resizeBox(_ box: Box, x: Int, y: Int, z: Int)?

                                                                                                  Yes and yes, as far as I know.

                                                                                          2. 1

                                                                                            I agree that, in practice, I don’t often miss either feature. But it does happen sometimes.

                                                                                            Every once in a while I do miss overloading/default-args. Sometimes you have a function that has (more than one!) very common, sane, default values and it sucks to make NxN differently named functions that all call the same “true” function just because you have N parameters that would like a nice default.

                                                                                            Then there’s the obvious case for named parameters where you have multiple parameters of the same type, but different meaning, such as x, y, z coordinates, or- way worse- spherical coordinates: theta and phi, since mathematicians and physicists already confuse each other on which dimension theta and phi actually represent.

                                                                                            It’s not terrible to not have them, but it seems like it would be nice to do it the Kotlin way. Default params are always a code smell to me, but just because it smells doesn’t mean it’s always wrong and I do use them in rare circumstances.

                                                                                          3. 6

                                                                                            I’d like to disagree, I really think functions should be as simple as possible, so that you can easily write higher order functions that wrap/transform other functions without worrying about how the side-channels like argument names will be affected. I really like the Haskellism where your function arguments are unnamed, but you recover the ergonomics of named arguments by defining a new record just for a single function, along with potentially multiple ways of constructing that record with defaults.

                                                                                            I’m not saying this pattern would work for every language, I just want to point out that named arguments aren’t necessarily a good thing.
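For what it’s worth, the record-of-options idea translates outside Haskell too. Here’s a hypothetical Python rendition (all names invented): the function itself stays plainly positional, and the named-argument ergonomics live on the record:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class RenderOpts:
    # The "named arguments", with their defaults, live on the record
    # rather than in the function signature.
    width: int = 80
    indent: int = 4

def render(text, opts):
    # Plain positional function: wrapping or composing it never has to
    # forward keyword arguments through.
    pad = " " * opts.indent
    return "\n".join(pad + line[: opts.width] for line in text.splitlines())

# Callers recover named-argument ergonomics through the record:
narrow = replace(RenderOpts(), width=40)
print(render("hello\nworld", narrow))
```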

                                                                                            1. 4

                                                                                              I forgot: there’s no reason to make a language in the 2000s without named arguments as default.

                                                                                              Named arguments make renaming parameters a breaking change; this is why C# didn’t support them until version 4. If I ever design a language, I’ll add named arguments after the standard library is finalized.

                                                                                              1. 12

                                                                                                Swift has abi stable named parameters, it just defaults to the local parameter name being the same as the public api name

                                                                                                1. 10

                                                                                                  Yeah, I think Swift nailed it. Its overloading/default args aren’t even a special calling convention or kwargs object. They are merely part of function’s name cleverly split into pieces. init(withFoo:f andBar:b) is something like init_withFoo_andBar_(f, b) under the hood.

                                                                                            1. 1

                                                                                              I think the point about mathematical programming languages is missing something: is it Haskell or Mathematica?

                                                                                              Take the example given for &&:

                                                                                              True  && x     = x
                                                                                              x     && True  = x
                                                                                              False && False = False

                                                                                              In the context of a haskell-like language, this is syntactic sugar for something like this:

                                                                                              (&&) x y = case (x, y) of
                                                                                                           (True,  b)     -> b
                                                                                                           (b,     True)  -> b
                                                                                                           (False, False) -> False

                                                                                              On the other hand in a mathematica-like language, every one of those initial rules is a pattern which the evaluation engine may match, at its own discretion. It doesn’t desugar to anything.

                                                                                              In a certain sense, this is an even stronger way to abstract over evaluation (in the manner discussed by TFA), but it is also accompanied by a loss of determinism.
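                                                                                              To make the loss-of-determinism point concrete, here is a small sketch (using a hypothetical &&& operator so it doesn’t clash with the Prelude’s &&). Haskell commits to trying the clauses top to bottom, so matching the first clause already forces the left argument; a rewrite engine that is free to try the x && True rule first could rewrite the call purely syntactically, without ever evaluating the left argument:

                                                                                              ```haskell
                                                                                              import Control.Exception (SomeException, evaluate, try)

                                                                                              -- The article's rules, transcribed directly; clauses are tried top to bottom.
                                                                                              (&&&) :: Bool -> Bool -> Bool
                                                                                              True  &&& x     = x
                                                                                              x     &&& True  = x
                                                                                              False &&& False = False

                                                                                              main :: IO ()
                                                                                              main = do
                                                                                                -- Matching the first clause forces the left argument, so this diverges
                                                                                                -- (throws), even though the second rule matches the call syntactically.
                                                                                                r <- try (evaluate (undefined &&& True)) :: IO (Either SomeException Bool)
                                                                                                case r of
                                                                                                  Left  _ -> putStrLn "left argument was forced; the call diverged"
                                                                                                  Right b -> print b
                                                                                              ```

                                                                                              Under a Mathematica-style engine the second rule could fire without touching the left argument, so the same set of rules no longer pins down a unique evaluation order — which is exactly the determinism the desugared case expression buys you.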

                                                                                              1. 1

                                                                                                To go in another direction, a C program is a precise mathematical formula whose meaning is expressed in terms of the C abstract machine.

                                                                                                I don’t see a sharp distinction between math and non-math in programming languages. Rather, I see different notational conveniences, some of which are easier for humans to reason about.

                                                                                                1. 1

                                                                                                  When I picture the engineering crew updating the LCARS widgets on Star Trek, I imagine them entering instructions in some mathematical/declarative language. It needs to be the ultimate language for transforming an idea into a program by simply writing the idea down, as closely as possible to how it goes in your head. This includes concise notation.

                                                                                                  1. 9

                                                                                                    It needs to be the ultimate language to transform idea to program by simply writing the idea down as closely as possible to how it goes in your head.

                                                                                                    The trouble is, the idea in my head usually starts out being short-sighted and inconsistent until I spend days trying to squeeze it through the filter that’s a precise specification, which is basically any programming language.

                                                                                                    1. 1

                                                                                                      It does, but when editing code, the existing formulation and language make a huge difference to how quickly you can get there; some edits take minutes to make while others take weeks.

                                                                                                      While this is primarily a function of the existing formulation, the availability of language features (e.g. exhaustiveness checking) makes a big difference too.

                                                                                                    2. 6

                                                                                                      That’s one vision of the future of software.

                                                                                                      Another is the need for a “software archeologist” as in Vinge’s A Deepness in the Sky, who spelunks through the untold depths of accumulated cruft to find the routine needed.

                                                                                                      I know which future I believe is most plausible.

                                                                                                      1. 2

                                                                                                        That part of Deepness is among my favorite predictions in any sci-fi I’ve read.

                                                                                                        1. 1

                                                                                                          As someone who just yesterday spent around an hour discussing with coauthors what exactly can be extracted from a paper pointed out by reviewers (a lot, but not what we need)… I see no reason to regard these predictions as contradictory.

                                                                                                          I expect some checks to demand that you clarify your thoughts before they become executable, but I also expect automatic checks to be applied to the mathematical languages used for preparing maths articles…