Threads for karlicoss

    1. 1

      My computers have accumulated old docker images, which are taking up quite a bit of disk space, but I don’t want to delete all of them in bulk (i.e. docker system prune), since some of them have jobs running against them (they use temporary containers rather than a permanently running one). So I implemented a python script that parses the output of docker images --format=json and compares it against an allowlist of image names defined in the script itself. As an added benefit I can answer questions like “what is this mariadb:10.5 image, and what is it used by?”.
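
      A minimal sketch of what such a script could look like (the JSON field names and the allowlist entries below are assumptions, not the actual script):

      #!/usr/bin/env python3
      # Sketch: list local docker images that are not on an allowlist.
      # Assumes `docker images --format=json` prints one JSON object per line
      # with fields like "Repository", "Tag" and "Size" (newer Docker versions).
      import json
      import subprocess

      ALLOWLIST = {
          "mariadb:10.5",       # hypothetical examples of images to keep
          "python:3.11-slim",
      }

      def main() -> None:
          out = subprocess.run(
              ["docker", "images", "--format=json"],
              check=True, capture_output=True, text=True,
          ).stdout
          for line in out.splitlines():
              img = json.loads(line)
              name = f"{img['Repository']}:{img['Tag']}"
              if name not in ALLOWLIST:
                  print(f"candidate for removal: {name} ({img['Size']})")

      if __name__ == "__main__":
          main()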

      As a result I found out that my docker version is a bit old on some machines, so I added a pyinfra deploy to remove old docker and install the new version.

      Then after updating, it seems the docker-compose command is unavailable – it’s now docker compose – so I had to update my drontabs. After a simple change and dron apply it was all good :)

    2. 4

      I’m really starting to like Zulip. Raph is using it for Xi, Piet and all his other projects that I like to lurk in. I’ve really come to like the threading approach, since I can’t follow the chat all day long. Following up on topics I am interested in is so much easier than on Discord or Slack. Arguably it makes free-form discussion a bit trickier, but for open source projects it might be my go-to in the future. Really curious what other people’s experiences are with Zulip.

      1. 6

        I am a huuuuuge fan of Zulip. I used it extensively in the Rust community, and in one company of ten-ish people, and it was markedly better for these use cases than slack, discord, irc or gitter.

      2. 1

        I like Zulip’s threaded design, but, honestly, I find Slack not so bad as long as your team makes heavy use of threads.

      3. 1

        I’ve started using Zulip for my own community (Building a Memex), and I’m enjoying it so far. Topics/threads make it feel like a knowledge database, not just discussions.

    3. 6

      I’ve been dreaming of making a “semantic history” browser add-on for ages, but never started.

      (As in something that would collect all the RDFa/microdata/microformats/JSON-LD/… objects on web pages you’ve seen, and give you an interface to browse not just pages, but these various objects like Organization, Person, Article, etc.)
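
      For the JSON-LD part at least, the extraction itself is straightforward; a rough sketch (assuming beautifulsoup4 is available – RDFa, microdata and microformats would each need their own parsers):

      import json
      from bs4 import BeautifulSoup

      def extract_jsonld(html):
          """Collect all JSON-LD objects embedded in a page."""
          soup = BeautifulSoup(html, "html.parser")
          objects = []
          for script in soup.find_all("script", type="application/ld+json"):
              try:
                  data = json.loads(script.string or "")
              except json.JSONDecodeError:
                  continue
              objects.extend(data if isinstance(data, list) else [data])
          return objects

      # e.g. [o for o in extract_jsonld(page_html) if o.get("@type") == "Person"]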

      1. 1

        I wonder, even if you haven’t started it, do you have any prior art to share? Would be very interesting! Also, have you seen https://www.geoffreylitt.com/wildcard ?

    4. 1

      I keep almost everything in org-mode. I publish most of it here https://beepb00p.xyz/exobrain, and here are the source files https://github.com/karlicoss/exobrain

      • some pages are a bit messy (literal piles of links). It’s still useful, because I can instantly search over everything I saved in the past (and yes, I find useful info there). In addition, sometimes I skim through links, re-prioritize and explore the top priority ones – this helps to manage it to an extent.

      • some are more organized, usually I do it for ‘project pages’, or when I am sharing some best practices, or preparing to publish in my blog

    5. 1

      I might be a curmudgeon here, but I don’t think this project will live for long. The power of emacs comes from the ability to modify everything using a simple language - lisp. Adding javascript just increases complexity (ok, performance might be better) and adds an inconsistent language to the mix.

      1. 4

        I’ll be very interested to see how far it goes. If there is one language that could be used in a project like this and survive, Javascript would be it.

      2. 3

        Legitimately curious what would prevent a lisp implemented on top of a typescript / javascript interpreter from interoperating with existing elisp.

        1. 3

          I tried writing a simple elisp-to-js transpiler some time ago, it’s doable and mostly straightforward (macros are tricky, but doable). I lost interest after I’d gotten the basics working (functions, variables, simple data types, macros). Maybe I should go back to that little project, it was fun.
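
          A toy sketch of the idea – handling only nested calls, string/number atoms and one operator; real elisp (macros, special forms, proper lexing and name mangling) is of course far more involved:

          import re

          TOKEN = re.compile(r'\(|\)|[^\s()]+')   # strings with spaces would need a real lexer

          def parse(src):
              """Read a single s-expression into nested Python lists."""
              tokens = TOKEN.findall(src)

              def read(i):
                  if tokens[i] == '(':
                      items, i = [], i + 1
                      while tokens[i] != ')':
                          item, i = read(i)
                          items.append(item)
                      return items, i + 1
                  return tokens[i], i + 1

              return read(0)[0]

          def emit(expr):
              """Emit JavaScript for calls, atoms and the + operator."""
              if isinstance(expr, list):
                  head, *args = expr
                  if head == '+':
                      return '(' + ' + '.join(emit(a) for a in args) + ')'
                  # naive name mangling for hyphenated elisp names
                  return head.replace('-', '_') + '(' + ', '.join(emit(a) for a in args) + ')'
              return expr

          print(emit(parse('(message (concat "x=" (+ 1 2)))')))
          # -> message(concat("x=", (1 + 2)))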

          1. 1

            Sounds fun and like it could have practical/research benefit to this project!

        2. 1

          Same, it’s something I’ve been wondering for a long time. I genuinely don’t understand why, as an example, I can’t use python’s regex functions (which I already know) to manipulate strings. Apart from performance concerns, of course, but often there aren’t any.

          For me, the power of emacs usually comes from

          • advice system
          • scoping rules: sucks that lexical scope isn’t the default, but it’s nice when you can easily override any global variable over the course of a single function call
          • integration with the runtime, e.g. REPL, evaluating arbitrary form in the file, being able to jump to any function source, etc.

          I don’t see why this all can’t be achieved in other languages (but I’m happy to be convinced!).

    6. 18

      People who stress over code style, linting rules, or other minutia are insane weirdos

      Having an automatic clang-format filter on git check-in solves so many of these pointless arguments.
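
      One way to wire that up is a pre-commit hook; a rough sketch of such a hook (hypothetical script, assuming clang-format is on the PATH):

      #!/usr/bin/env python3
      # Sketch of a .git/hooks/pre-commit script: format staged C/C++ files.
      import subprocess
      import sys

      staged = subprocess.run(
          ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
          check=True, capture_output=True, text=True,
      ).stdout.splitlines()

      cxx = [f for f in staged if f.endswith((".c", ".cc", ".cpp", ".h", ".hpp"))]
      if cxx:
          subprocess.run(["clang-format", "-i", *cxx], check=True)
          subprocess.run(["git", "add", *cxx], check=True)  # re-stage formatted files
      sys.exit(0)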

      1. 20

        Oh, the insane weirdos will find things to bicker about even if you pass every linter rule there is.

        I wish I were kidding but you’ll find, for example, that if you do something like this:

        buffer.append("\r\n");
        buffer_len += 2;
        

        in a single place, as part of a completely non-essential function in a 2,000-line program, they’ll complain about the magic number on the second line. Rewrite it like this:

        buffer.append("\r\n");
        buffer_len += len("\r\n");
        

        yep, you’ve got it, they’ll complain about the magic string on both lines, plus what happens if someone changes the first one and forgets to update the second one (“yes I know the spec says lines are always terminated with \r\n but it’s not good programming practice!”)

        No problem, you say – ironically, they’re kinda right this time, so okay:

        buffer.append(CR_LF);
        buffer_len += len(CR_LF);
        

        …but the meaning of CR_LF is very opaque, it doesn’t emphasize that this is meant as a line terminator and it may not be obvious to people who come from a Unix background, rather than Windows (oh yeah, totally, I always wondered what those weird ^M signs were in my files, yeah, it’s not obvious at all). Perhaps we should call it CR_LF_TERMINATE_LINE instead?

        Sure,

        buffer.append(CR_LF_TERMINATE_LINE);
        buffer_len += len(CR_LF_TERMINATE_LINE);
        

        Okay, you say to yourself, clearly this is the end of the line? Nope, they’ll get back to you with something like:

        “I think it’s okay in terms of variable names but the code as it is right now is inelegant to the point where it’s unmaintainable. It’s bad programming practice to repeat yourself (links to DRY entry on Wikipedia) – you should refactor this into a single, general function that appends a user-defined sequence to the original buffer and modifies its length accordingly.”

        When you do that, the problem, of course, will be that the function is too general now, and what you’ll really want is to add a separate buffer_terminate_line function that calls the more general “append” function with "\r\n". Of course, you’ll now have two functions, both called from just one place: one is called by the other function, the other one is called in your code above. You got a longer, probably buggier program, but it’s so much more readable now. Assuming you named the functions correctly, that is ;-).

        The example above is fictional for obvious reasons but I ran into it regularly throughout my career.

        It’s a little better now that I’m a contractor – I no longer fuss about these things and will happily bill companies that don’t keep their “well ackshually” developers in check, unless they want something that’s literally wrong.

        Edit: some of the comments around here are… like, so blissful, like y’all think these are just things that you read about on The Daily WTF but surely nobody encounters this all the time, right? You have no idea how lucky you are and I have no nice words to say to you, your happiness fills me with dread and the fact that there are programmers who don’t experience these things on a weekly basis is so foreign to me I just hate you all right now and I’m going to go to bed and hate you all while I’m waiting to fall sleep! Did I say how much I hate you, lobste.rs? Ugh!

        Okay, no, seriously, these things are just so common that, with every year that passes, I find myself spending more and more time wondering why I still put up with it instead of, I dunno, anything?

        1. 4
          buffer.append("\r\n");
          buffer_len += 2;
          

          Dumb question but, ah, is there some reason why buffer_len isn’t a field on buffer which is updated for you by the .append() call? ;)

          weirdos will find things to bicker about even if you pass every linter rule there is

          The inverse of this is finding weird people who go to elaborate lengths to defeat a linter rule that is pointing out a genuine bug in software, rather than just fixing it.

          (This is not an accusation towards you, just that sentence reminded me of it.)

          1. 7

            Dumb question but, ah, is there some reason why buffer_len isn’t a field on buffer which is updated for you by the .append() call? ;)

            The fictional code example above is written in a fictional language that doesn’t allow integer fields for objects called “buffer” :-P

            The inverse of this is finding weird people who go to elaborate lengths to defeat a linter rule that is pointing out a genuine bug in software, rather than just fixing it.

            Oooh, I have a collection of stories about this one, too :(. Way back when I was working for $megacorp, I was more or less the chief intern herder in the worst internship program you’ve ever seen. They’d hire a bunch of interns over the summer and the only plan about what to have them do was “something easy because we don’t have anyone to supervise them”. I was in one of the few teams that wasn’t run by morons and I had some idle cycles to spare once in a while, so I’d try to help the poor souls who ended up there over the summer.

            One of the things they regularly ended up doing was running Coverity scans, fixing the trivial bugs and reporting the more complex ones (super easy to quantify, right?). This would roughly go as follows:

            1. Ian Intern would report a bug to Grumpy Senior

            2. Grumpy Senior would close it without even looking at it because he had no idea who Ian Intern was, probably an intern

            3. I’d reopen the bug, ask them to at least confirm it before closing it.

            4. Grumpy Senior would close the bug immediately saying there’s nothing to confirm, it’s just a linter warning

            5. Ian Intern would reopen the bug with a crash dump

            6. Grumpy Senior would close the bug as unreproducible, Ian Intern probably did something wrong

            7. I’d reopen the bug because it’s a fuckin’ crash and we can’t tell our users they’re holding it wrong (this is usually where I started bringing out the tiny guns, i.e. Cc-ing managers in bug reports).

            8. Grumpy Senior would reluctantly look at it but it’s definitely not that, that’s just a linter warning.

            Now comes the beautiful part:

            1. Three days later, Grumpy Senior would push a bullshit fix and close the bug.

            2. That same day, they’d push the actual fix, with a commit message saying “Misc. style and Coverity warnings fixes”

            And so on – multiplied by about 40 Grumpy Seniors all over the organisation, and about six months’ worth of bad code because Coverity fixes took a lot of work so they only ran the scan once every six months :-D.

            1. 1

              The fictional code example above is…

              Heh. ♥️

              the worst internship program

              Oh the poor little darlings. The way the org behaved to them sounds toxic. Good on you for trying to improve it. (But alas maybe Sisyphean.)

      2. 7

        found the weirdo! (I’m only poking fun at you)

        I have 100% grown to appreciate “deciding once” with automatic formatters, and resisting urges to “re-decide” as much as possible. I have ALSO realized that it can be a HUGE pain to “decide once” after you’ve already scaled to many teams + lots of code. You want to do that up front, early-on, and stick with it.

        1. 4

          :)

          The big advantage of having settled on a particular clang-format is that it’s not someone else’s choices that annoy, but just the arbitrary decisions of the formatter & that seems to defang the drive to nit-pick over formatting choices.

          I agree that it’s a huge pain to enforce a format late into a project though.

          1. 3

            My biggest gripe with many automatic formatting tools is that they pointlessly frob with line breaks: clang-format will enthusiastically add or remove them by default, leading to code that I can describe as nothing short of butt-ugly by any standard. Luckily you can disable this with ColumnLimit: 0, but it’s a horrible default that too many projects use IMO. Splitting an 81-column string over two lines is silly, and forcibly putting all the arguments for XCreateWindow() on 1 or 2 lines “just because there is space” is even worse.

            1. 2

              Personally, I find working with a column limit set to around 120 nice and comfy, so I set clang-format to that & if it occasionally does something a bit odd I decide not to care about it. I save more time by never having to hand format code than I do obsessing over the odd corner case where it doesn’t make the “perfect” aesthetic choice.

            2. 1

              yep, or messing with vertically aligned code

          2. 2

            It’s not even that, I think. Not that the automation makes it so you’re not mad about the choices, but rather that you’re not forced to manually reformat a fuckload of code to please some arbitrary whinging program as the final step of every change.

            1. 6

              I often just write stuff like if x==y{return 42} now and just let gofmt deal with all of that. A small but measurable productivity boost.

              1. 1

                Exactly. Code formatters let you get on with the actual job of writing code.

                I set up emacs to invoke the formatter on every enter, or with a key-binding when I want to invoke it directly. Why waste time inserting spaces / carriage returns when the formatter does a perfectly good job?

      3. 1

        Tried that in the last place I worked and people just started complaining about the formatter configuration ahaha

        1. 1

          That probably depends on the team. We have a clang-format style that isn’t my preferred style but it’s consistent and I’d much rather use a consistent style that is not quite my ideal than argue about style. Because everything is consistently formatted, if I cared enough, I could apply a different style, modify the code, and then format it back. I actually supervised a student building a tool designed for exactly that workflow (which he presented at EuroLLVM some years back; CELS: Code Editing in Local Style). I wish GitHub would integrate something like that into their code view. Don’t like the style of the project? Render it in a different style in code reviews and when you browse their code.

        2. 1

          I’m kind of okay with that in theory as long as the result is (possibly rejected) patches to the repo containing the formatter config.

          It might be helpful if the formatter config looks scary so people are hesitant to frob it unnecessarily.

    7. 9

      The article misses mention of git reflog – rebase with no fear :)

      1. 2

        So very important. I’d been using and hating git for at least a couple of years before I learned about git reflog. Discovering that everything in git has an undo button dramatically changed my perspective on it. About the only ways you can really lose data with git are by damaging the .git directory via non-git commands (or system failure) or by failing to commit things (i.e. in both cases by not using git). The only way git can lose data is if you generate so much junk that it runs a git gc in between your dropping something from your branches and remembering to hit undo.

    8. 9

      Usually python, just because I know it well, and can get results within a predictable timeframe. My personal rule of thumb is when I need something like a hashmap or array – I give up on bash that instant.

      However with time I’ve also gotten more comfortable with bash, so often it’s okay to mix and match. E.g. say, you want to find out the amount of free memory

      $ cat /proc/meminfo  | grep MemFree: 
      MemFree:        23727768 kB
      

      Right, how do we pick out the number? Normally you’d use cut, or awk:

      $ cat /proc/meminfo  | grep MemFree: | awk '{print $1}'
       23727768
      

      , but what if you forgot, or need something more elaborate? Well, why not use python?

      $ cat /proc/meminfo  | grep MemFree: | python3 -c 'print(input().split()[1])' 
      23727768
      

      Not as concise as awk, but you can type it quicker than the time you’d spend googling how to use awk.

      Note that you also can use multiline input if you press enter after the opening quote, so even if you need imports etc, it doesn’t have to look horrible. Also if you have some sort of vi mode or a hotkey to edit the command in editor (Ctrl-X Ctrl-E in bash), it helps a lot for messing with long shell commands.

      I also tried using xonsh a few times, a shell combining python & bash syntax. Cool idea, but I tend to forget how to use it, so never got into the habit.

      1. 2

        Not as concise as awk, but you can type it quicker than the time you’d spend googling how to use awk.

        Ah, someone who shares my shame.

        1. 0

          Err, am I weird in that awk isn’t that hard to use?

          $ awk '/MemFree/ {print $2}' < /proc/meminfo

          Two less fork()/exec()’s and does the same thing as all that python. Why break out the combine when the hedge trimmer will do to cut the grass.

          I’m not gonna lie, this falls under learning to use your tools. If you always reach for python/scripting languages for these simple tasks, I’m going to argue your general unix knowledge is too low.

          Also that second cat | grep | awk has a bug with print $1 versus $2 so not sure the gp actually ran that shell.

          1. 3

            Err, am I weird in that awk isn’t that hard to use?

            Probably not.

            I’m not gonna lie, this falls under learning to use your tools.

            I would disagree on a technicality: if you don’t know it, it isn’t your tool.

            If you always reach for python/scripting languages for these simple tasks, I’m going to argue your general unix knowledge is too low.

            This I do agree with. I can’t say that it is difficult to use because I never took the time to really learn awk. Instead, I just try to pick up what I need to do a particular task. To a large extent, my relationship with awk is governed by apathy. It is an exceedingly practical tool and I just don’t really care. I love those little transcendental moments with software where you feel like you know something more about the world. awk doesn’t do that for me so I haven’t really given it the time it deserves.

            That said, my comment about shame comes from responses like:

            … awk isn’t that hard to use…

            … [learn] to use your tools.

            … your general unix knowledge is too low.

            Missing a little bit of context from your comment, but things can be read this way and it doesn’t feel so good. My comment isn’t about how difficult awk actually is, but how people assume that you should just know these things and if you don’t you are deficient.

            To be clear, I don’t think that there is any malice on your part.

    9. 19

      I don’t do fully offline programming often, but I’m a big fan of offline tools, because they are usually much faster to use even when you do have internet. For offline docs there is devdocs.io and zeal. Also often you can install the docs from the package manager, or along with the dev toolchain depending on the programming language, and then you can set a keyword in the browser to use it as a search engine. E.g. in my firefox I have

      py:  file:///usr/share/doc/python3/html/search.html?q=%s
      

      , so when I type, say, py re.sub, I get instant results.

      It’s certainly hard to replace googling + stackoverflow, but I often find myself searching for the same things I’ve already figured out before in other projects. For that it might be useful to set up a local code search (e.g. via Ripgrep). I’m describing my own code search setup here.

      I personally think it’s a shame that in many languages (e.g. Python, the one I’m working most with) it’s not a common practice to package tests and documentation alongside the code, this would really help with offline workflows.

      1. 3

        Wow, I had no idea there was a search page for local html python docs.

        1. 2

          You can also use pydoc, which ships with python by default

          1. 2

            Or help(re.sub) from the repl, which shows you the docstring of that function

        2. 1

          I don’t think there is. In Firefox you can set shortcuts for frequently used searches, this one just points to a local file.

    10. 4

      This is cool and all but it’s a bit of a downer that strictly worse mainstream languages are adopting features from better languages and implementing them in a strictly worse way. E.g. in OCaml you would get an exhaustiveness error like:

      Warning 8: this pattern-matching is not exhaustive.
      Here is an example of a case that is not matched:
      Scheduled
      

      Instead of the inscrutable:

      error: Argument 1 to "assert_never" has incompatible type "Literal[OrderStatus.Scheduled]";
      expected "NoReturn"
      

      (Also, you wouldn’t need to remember to put an else: assert_never(...) at the end of the code.)
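
      For reference, the mypy-side pattern that produces the error message quoted above looks roughly like this (a sketch; the enum and handler are hypothetical stand-ins for the article’s example):

      import enum
      from typing import NoReturn

      class OrderStatus(enum.Enum):
          Ready = "ready"
          Scheduled = "scheduled"

      def assert_never(value: NoReturn) -> NoReturn:
          raise AssertionError(f"Unhandled value: {value!r}")

      def handle(status: OrderStatus) -> str:
          if status is OrderStatus.Ready:
              return "ship it"
          else:
              # Scheduled is not handled above, so mypy narrows `status` to
              # Literal[OrderStatus.Scheduled] here and rejects this call.
              assert_never(status)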

      1. 6

        In OCaml pattern matching is a core feature, here it’s being emulated by clever users. There’s a proposal to add pattern matching to Python 3.10, in which case presumably Mypy would be able to do exhaustiveness checking.

        (Also, calling Python “strictly worse” is a bit of a strong statement. Strictly worse how?)

        1. 2

          Yes, eventually–if and when the proposal is accepted, then ships in Python, then gets implemented and ships in Mypy.

          Strictly worse how?

          • Performance
          • Type safety
          • Language design (including modules), e.g. the recently-added controversial walrus operator in Python is something that automatically and naturally occurs in OCaml due to its original design

          But since this is a programming-language-oriented thread, it’s reasonable to assume that we’re comparing mostly language design.

          1. 7

            On the other hand, Python runs natively on Windows, so it’s strictly superior to OCaml as a language.

            (F# is another story…)

            1. 5

              Difficult to tell whether this is a serious assertion, but on the off chance:

              • OCaml has no fewer than three ports to Windows. If this seems like a disadvantage, remember that many people actually recommend Anaconda as an alternative to the official Python installer on Windows.
              • The most well-known port, OCaml for Windows, provides recent compiler versions and almost all opam (equivalent of PyPI) packages.
              • Opam is planning full Windows support out of the box in an upcoming release.
              1. 6

                It was tongue-in-cheek up until I read your response, which made it suddenly an excellent argument about why “strictly superior” is nonsense. All three of your ports need “a Unix-like build environment”. Package manager support is only “planned” in the next version, much as pattern matching is “planned” for 3.10. Comparing it to Anaconda, which is recommended to scientists and data scientists who aren’t programmers, is flat wrong. The only people who could think that OCaml Windows support is as good as Python Windows support, which doesn’t require you to install cygwin, are OCaml partisans.

                Admitting that OCaml has worse Windows support than Python is the easiest thing in the world. If you’re not willing to even do that, why should I trust you about any of your other claims?

                1. 2

                  Comparing it to Anaconda, which is recommended to scientists and data scientists who aren’t programmers, is flat wrong.

                  Let’s actually look past the marketing blurb and check how Anaconda describes itself:

                  ’Anaconda Individual Edition is a free, easy-to-install package manager, environment manager, and Python distribution with a collection of 1,500+ open source packages with free community support.

                  ’Anaconda Commercial Edition is the world’s most popular open-source package distribution and management experience, optimized for commercial use and compliance with our Terms of Service.

                  ’Anaconda Team Edition is our latest generation repository for all things Anaconda. With support for all major operating systems, the repository serves as your central conda, PyPI, and CRAN packaging resource for desktop users, development clusters, CI/CD systems, and production containers.

                  ‘Anaconda Enterprise is an enterprise-ready, secure, and scalable data science platform…’

                  (From https://docs.anaconda.com/ )

                  So interestingly, they position only their enterprise edition as a data science platform, and the others as general-purpose Python distributions.

                  Now we can make the argument that it’s still used mainly by data scientists. But that just proves my point! Data scientists, like most other computer users, are mostly using Windows. So why would they prefer Anaconda, which is (if we ignore the Enterprise edition) just a Python distro? Because it has an easy Windows installer which loads you up with commonly-used libraries! I.e. specific distribution for a specific need, just like the OCaml Windows ports.

                  Admitting that OCaml has worse Windows support than Python is the easiest thing in the world. If you’re not willing to even do that, why should I trust you about any of your other claims?

                  That’s a non sequitur. I was responding to your claim that:

                  Python runs natively on Windows

                  …with the implication that OCaml does not. You’re a smart guy, you probably already know that it does and has for a long time. I was just expanding on that and saying that the surrounding tooling is also catching up.

                  Also, since when is Windows (a non-free operating system) support a measure of the quality of a programming language? Is this a portability argument in disguise? Are you suggesting in turn that OCaml, which has been ported and released for Apple Silicon, is better than languages which have not, and which due to resource constraints, won’t be for a while? Seems like a nonsensical argument.

                  1. 2

                    And let’s take at the OCaml for Windows page:

                    opam-repository-mingw provides an opam repository for Windows - and an experimental build of opam for Windows. It is work in progress, but it already works well enough to install packages with complex dependencies (like core_kernel) and packages with external dependencies (e.g lablgtk).

                    The repository is forked from the standard version. It contains Windows specific patches and build instructions. Especially build related tools (ocamlfind, ocamlbuild, omake, oasis, opam and OCaml itself) were modified, so that most unix-centric build instructions will also work with the native Windows/OCaml toolchain and I can sync the repository with the main repo from time to time (and without too much hassle).

                    I don’t know about you, but this doesn’t stir confidence in me that I’ll be up and confidently running OCaml in Windows. Which brings up the question again, what does it mean to be “strictly superior”?

                    Also, since when is Windows (a non-free operating system) support a measure of the quality of a programming language?

                    What does Windows being non-free have to do with being strictly superior?

                    Is this a portability argument in disguise

                    Since nobody in this thread has defined what “strictly superior” means, why can’t it be?

                    1. 1

                      What does Windows being non-free have to do with being strictly superior?

                      Non-free operating systems don’t have anything to do with language ‘superiority’ at all.

                      nobody in this thread has defined what “strictly superior” means

                      ‘Strictly superior’ is a subjective, and therefore, opinionated argument. So to each their own, but to me:

                      • Statically typed (meaning, no typechecking at runtime)
                      • Pattern matching
                      • Exhaustivity checking built in
                      • Every object/method/value is not a key/value in a bunch of dicts which can be runtime-patched
                      • No nulls
                      • Immutable values
                      • Expression-oriented syntax
        2. 2

          Fwiw, Java supports that kind of exhaustiveness checking now too, added in JDK 13 as part of switch expressions. Seems to be becoming a more common feature even outside of functional programming.

          1. 1

            That’s the thing–all features from functional programming (starting from garbage collection in the ’50s) eventually find their way into ‘mainstream’ programming languages, all the while FP is called ‘academic’ and ‘impractical’ every step of the way 😂

            1. 1

              You can say the same about OO languages. After all, Simula and Smalltalk were certainly not widely popular. C++ first appeared in 1985, which was a good 25 years after Algol-60 was released. And FP languages have gotten plenty wrong. Haskell’s laziness has been criticized for years, and I’d be happier if I didn’t have to deal with Scala implicits in my code again.

      2. 3

        “Not exhaustive” error messages when dealing with non-trivial types are amazing. I sit there and look at the example trying to figure it out, and when I do, half the time it’s a bug that would have taken a lot of time to find.

      3. 2

        It’s a much better value proposition to add something like this that provides, let’s be conservative, half the utility, to an existing codebase than it is to switch languages (and ecosystems, tools, and so forth) in order to get the full utility.

        At some point the definitions of “better” and “worse” need to be weighted based on how much work is actually accomplished with a given tool / language.

        1. 2

          Hopefully also weighted by a fair accounting of the amount of bugs prevented by the language.

          1. 2

            Don’t listen to him! It’s a trap! It’s totally not fair to allow OCaml and friends to put the bugs prevented by the language on their side of the ledger because those are several infinite classes of bugs.

      4. 2

        My counterpoint is that OCaml only provides this feature through having a purpose-built ADT for it.

        Mypy (and typescript) generalize these concepts by allowing arbitrary unions of types. You can have “string or bytes” instead of “Either String Bytes” (with the wrapping required at call sites)

        From a usability perspective OCaml’s stuff is way more restrictive and verbose

        1. 1

          I guess it’s possible with a suitable type system, e.g. in Scala you can have an implicit that coerces the value into Either, or in C++ it could work via an implicit constructor. But yeah, it was somewhat surprising to find myself (as a big fan of types!) enjoying mypy and other gradual typing systems more than ‘stricter’/‘static’ ones.

        2. 1

          Arbitrary unions of types turn out to be a bad idea as soon as you need to deal with generics. Difficult to work with A | B if A and B are generics.

    11. 6

      This article is great!

      I just want to say to anyone thinking of adapting something similar: do not use assert in production code. It’s both slow and will be ignored/removed when running python with the -O (optimize) flag. Instead, just raise the exception. If you’re doing exactly what’s being done here, it’s fine as it should only ever be called as part of the ‘compile-time’ type checking, but otherwise assert should be avoided outside of tests.
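
      A minimal illustration of the point (hypothetical check): the assert silently disappears under python -O, while the explicit raise does not.

      def set_name_with_assert(name):
          assert name, "name must be non-empty"   # stripped when run with -O
          return name

      def set_name_with_raise(name):
          if not name:
              raise ValueError("name must be non-empty")  # always checked
          return name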

      1. 7

        In my understanding, pretty much the only thing the -O flag does is remove asserts, so IMO that makes it basically useless.

        And I can’t really come up with a situation where having asserts would have a performance impact, unless you’ve got some contrived example riddled with asserts and doing nothing else apart from some CPU arithmetic. Of course it’s also possible that the assert conditions themselves are slow, but anyway, replacing them with exceptions in that case isn’t gonna improve the performance.

      2. 2

        That’s the scenario right now, indeed.

        1. 2

          I reworded it to be a bit more clear, but yes, this scenario is perfectly fine. I wanted to point out some gotchas with using assert to people who may not fully understand them.

      3. 2

        Good point! I updated the article to address this issue:

        https://hakibenita.com/python-mypy-exhaustive-checking#updates

        1. 1

          Cool! Great article and I look forward to reading more from you!

    12. 2

      Very nice writeup, especially the trick with NoReturn!

      I also find type narrowing extremely useful for error handling: you can use Union[Exception, T] as the result type, and ‘pattern match’ with isinstance, which works at runtime and is checkable by mypy. In addition, the covariant Union type lets you use it with generators and consume them without extra boilerplate. I write more about it here
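
      A small sketch of what that looks like in practice (hypothetical parse function, not taken from the linked post):

      from typing import Union

      def parse_port(s: str) -> Union[Exception, int]:
          try:
              port = int(s)
          except ValueError as e:
              return e
          if not (0 < port < 65536):
              return ValueError(f"port out of range: {port}")
          return port

      res = parse_port("8080")
      if isinstance(res, Exception):
          print(f"failed: {res}")  # mypy narrows res to Exception here
      else:
          print(res + 1)           # ...and to int here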

      1. 2

        So go(lang) in Python? 😉

        1. 4

          go returns a Tuple[Exception, T]

          1. 2

            Yep. I’m describing the Go approach here

            1. 1

              Thanks for the article karlicoss. We were just discussing this pattern the other day and we ended up reaching similar conclusions. The only difference in our case was that we wanted to return T along with the exception. This makes the “pattern” of the return type a bit more messy. We need to keep exploring.

      2. 1

        You mean returning an Exception instead of raising it?

        1. 1

          ah yes, sorry! Too late to edit now :(

    13. 3

      Nice, I’ve got something similar. It’s unfortunate git branch doesn’t support this via some flag. I also find it useful to set alias branch = branch -vv, then it also displays the relation to origin and the commit message header next to the branch name.

    14. 3

      Some of my personal ‘intermediate vim’ favorites:

      • J to collapse consecutive lines
      • block select (Ctrl-v) + edit – even if you’re not a vim user, I highly recommend getting used to it in your text editor
      • pipe command’s output to vim with command | vim -. For me it’s often faster to process the output in vim than remembering some awk/sed/cut syntax
      • :term to split the screen and run a terminal. It’s actually pretty decent, both in terms of performance and behaving ‘as you expect’. You can also switch to ‘normal mode’ in the terminal with the awkward ‘Ctrl-\ Ctrl-N’ (e.g. then you can copy some stuff)
      • set up vi mode in your shell/interpreter (e.g. ipython). At least, for me it’s easier to just use the same bindings to edit commands + there is usually a key to open the command in vim and run after saving, useful for multi line commands
    15. 26

      If you use squash, you don’t have to literally squash your whole changeset into a single commit. You can use squash in conjunction with the interactive rebase to maintain meaningful history without spammy commits like ‘fix typo’, ‘fix compile error’, etc.

      E.g. you wrote a feature, committed, then wrote a test, committed. You end up with

      -1 feature (changes to src/code.py)
       0 test    (changes to tests/test.py) (HEAD)
      

      Then, say, you run a linter/code formatting tool/whatever – you end up with changes to both files. What’s the right thing to do?

      Personally, I’d use interactive staging (add -p) to stage the changes to src/code.py and tests/test.py separately and commit separately too:

      -3 feature (changes to src/code.py)
      -2 test    (changes to tests/test.py)
      -1 fix src/code.py
       0 fix tests/test.py (HEAD)
      

      , then I’d use interactive rebase to reorder as -3, -1, -2, 0 and squash together -3 + -1; -2 + 0, which results in a nice atomic history:

      -1 feature (changes to src/code.py)
       0 test    (changes to tests/test.py) (HEAD)
      

      Another neat rebase trick you could use is to reorder the test commit and feature commit:

      -1 test    (changes to tests/test.py)
       0 feature (changes to src/code.py) (HEAD)
      

      , and checkout HEAD~1 and run the test to ensure it actually fails without the feature commit.

      1. 3

        I used to do it just as you describe, and then I discovered git commit --fixup <SHA>. With your example this becomes:

        git add -p src/code.py
        git commit --fixup <SHA of -3>
        git add -p tests/test.py
        git commit --fixup <SHA of -2>
        git rebase --autosquash -i <SHA of -3>^
        

        This saves me having to put the commits to squash in the right order in the interactive rebase.

        1. 1

          You may enjoy using git-absorb which automatically finds the SHA hash for --fixup commits.

          1. 1

            Seems interesting, but the description doesn’t make it clear how it works so it seems a bit too “magic”. What does it do if you have multiple commits touching the same file(s)?

            1. 1

              It doesn’t just look at the file, but actual changes (since you first git add [-p] all the changes you want to absorb). I believe it just picks the most recent commit touching the same change set. I haven’t needed to use it when the commit I’m wanting to fixup is multiple commits back on the same change set (which I believe would conflict and need a manual rebase anyways).

    16. 9

      Great news! I am eager to try this!

      Turn on -XLinearTypes, and the first thing you will notice, probably, is that the error messages are typically unhelpful: you will get typing errors saying that you promised to use a variable linearly, but didn’t. How hasn’t it been used linearly? Well, it’s for you to puzzle out. And while what went wrong is sometimes egregiously obvious, it can often be tricky to figure the mistake out.

      So, basically, GHC just got its own “Syntax error” a la OCaml… just a bit more specialized :p.

      1. 11

        Maybe it’s just me, but to me OCaml’s errors are terse and unhelpful and GHC’s errors are… verbose and unhelpful. ;)

        There are interesting papers that show working ways to improve both, but I wonder why none of those improvements are in the mainline compilers.

        1. 2

          Good error reporting is easiest if it’s built into the compiler front end from the start. If a new algorithm comes along to improve the error information it’s almost never going to be a simple job to drop it into an existing compiler.

          You need type information & parse information from code that’s potentially incorrect in both spaces, so any error algorithm usually has to be tightly integrated into both parts of the compiler front end. That tight integration usually means that improving compiler errors is a significant amount of work.

          1. 3

            It varies. What puzzles me is that a lot of the time ready-to-use, mergeable patches take much longer to merge than they should.

            Like this talk: https://ocaml.org/meetings/ocaml/2014/chargueraud-slides.pdf

            1. 1

              Do you also have a link for a patch for the improved error messages?

              A lot of work has been going on to move OCaml to a new parser and improve error messages. Even though there is a lot still needed to be done, latest releases have started improving a lot. Maybe we can still extract some useful bits from that effort and try again

              1. 2

                Turns out it was even made into a pull request that isn’t merged yet: https://github.com/ocaml/ocaml/pull/102

                1. 1

                  Thanks. It is quite an informative PR actually, and explains why the change is not there yet. One can infer why it is easier to add informative messages in new languages and compilers, but quite hard to retrofit them into seasoned ones.

      2. 7

        Would you be kind enough to give me an ELI5 about what linear types are and what you can do with them?

        1. 29

          In logic, normal implication like A implies B means whenever you have A, you can derive B. You have tautologies like “A implies (A and A)” meaning you can always infinitely duplicate your premises.

          Linear implication is a different notion where deriving B from A “consumes” A. So “A linearly implies B” is a rule that exchanges A for B. It’s not a tautology that “A linearly implies (A and A).”

          The classic example is you can’t say that “$3 implies a cup of coffee” but “$3 linearly implies a cup of coffee” makes sense. So it’s a logical form that reasons about resources that can be consumed and exchanged.

          Same in functional programming. A linear function from type A to type B is one that consumes an A value and produces a B value. If you use it once with an A value then you can’t use it again with the same A value.

          This is nice for some performance guarantees, but also for encoding safety properties like “a stream can only be read once” etc.

        2. 5

          It can be used to model protocols with type signatures. The following is in theory what you should be able to do.

          data ConsoleInput
              = Input String ConsoleOutput
              | ExitInput
          
          data ConsoleOutput
              = PrintLines ([String] ⊸ Console)
              & PrintLastLines ([String] ⊸ ())
          
          greet :: ConsoleOutput ⊸ ()
          greet console
              = let PrintLines f = console
                in step2 (f ["name?"])
          
          step2 :: ConsoleInput ⊸ ()
          step2 ExitInput = ()
          step2 (Input input console)
              = let PrintLastLines f = console
                in f ["hello " ++ input]
          

          If you combine it with continuation passing style, you get classical linear logic and it’s a bit more convenient to use.

          If you model user interfaces with types, they should be quite useful.

          I’m also examining and studying them: http://boxbase.org/entries/2020/jun/15/linear-continuations/

        3. 1

          Wikipedia gives a reasonable overview. The closest analogy would be something like move semantics – for example, ownership in Rust can be considered a manifestation of linear types.

          1. 6

            Rust ownership is about affine types, not linear types. They are similar but differ in the details. A shitty way of understanding it is affine types mimic ref counting and prevent you from having a ref count < 0. Linear types are more a way of acting like RAII in that you might create a resource but just “know” that someone later on in the chain does the cleanup.

            Which I’m sure sounds similar, but affine types allow for things like resource leaks, whereas linear types should guarantee overall behavior to prevent them.

            This all assumes my understanding and explanation is apt. I’m avoiding a ton of math and I’m sure the shitty analogy doesn’t hold up, but behaviorally this is how I have it in my brain.

            1. 2
              1. 2

                I’m personally of the stance that the 2020 linear ghc stuff is more <= 1 usage, and kinda misses out on a lot of really fun expressivity that can fall out of making full classical linear logic first class. But that’s a long discussion in its own right, and I’ve yet to make the time to figure out the right educational exposition on that front

                1. 1

                  it definitely seems more limited in scope/ambition compared to the effort ongoing for dependent types, for better or worse. Can’t say I know much about what first class linear logic would look like, but perhaps now there will be more discussion about such things.

                  1. 2

                    The really amazing thing about full linear logic is it’s really sortah a rich way to just do mathematical modelling where everything has a really nice duality. The whole thing about linearity isn’t the crown jewel (though wonderfully useful for many applications), it’s that you get a fully symmetric bag of dualities for every type / thing you can model.

                    The paper that really made it click for me was Mike Shulman’s Linear Logic for Constructive Mathematics paper. It’s just a fun meaty read even at a conceptual level. There’s a lot of other work by him and other folks that, taken together, just points to it being a nice setting for formal modelling and perhaps foundations of category theory style tools too!

              2. 1

                Not sure I can agree that Uniqueness types are the same as Linear types. Care to explain? They’re similar, sure, but not the same thing, and your… screenshot of a powerpoint? isn’t very illustrative of whatever point you’re trying to make here.

                And from my experience with Idris, I’m not sure I’d call what Rust has Uniqueness types.

                1. 1

                  They are different rows in the matrix because they are different, of course.

                  it’s from this presentation about progress on linear ghc a little over a year ago https://lobste.rs/s/lc20e3/linear_types_are_merged_ghc#c_2xp2dx skip to 56:00

                  What is meant by Uniqueness types here is “i can guarantee that this function gets the unique ptr to a piece of memory” https://i.imgur.com/oJpN4eN.png

      3. 2

        Am I the only one thinking this is not how you ship language features?

        If the compiler can’t even report errors correctly, the feature shouldn’t ship.

        1. 15

          If the compiler can’t even report errors correctly, the feature shouldn’t ship.

          It’s more that this is an opt-in feature with crappy error reporting for now, using language design features not in use in most programming languages. It’s going to have rough edges. If we required everything to be perfect we’d never have anything improved. Linear types like this also might not have a great way to demonstrate errors, or the domain is new, so why not ship the feature for use and figure out what kind of error reporting you want based on feedback.

          1. 13

            Many people do not realize that haskell is a research language and GHC is one of the main compilers for it. This is an experimental feature in a research language. If it works out well, then it will be standardized.

            1. 15

            You have an experimental feature, advertised as such and disabled by default. It’s not like it will become widely used anytime soon. I don’t get the outraged comments, to be honest.

            2. 14

              This is utterly unprofessional

              I disagree. While good diagnostic messages are important in the long term, I think you can permit yourself to have less useful messages in the short term. Of course once shipped, you should dedicate time and resources to improving it.

              I can’t even remotely imagine how people thought this would be a good idea to ship if it isn’t even clear whether good error reporting is even technically possible.

              I think the reason is simple: to get feedback on the feature, how useful it is (regardless of the error messages), etc. Based on that you can then decide how to move on, what to focus on, etc.

            3. 4

              This is utterly unprofessional – my private hobby project has higher standards than this.

              Your private hobby project is almost 30 years old, and is used in production in any number of areas as well I take it?

              As goalieca notes, GHC/Haskell is a research language, it’s got a ton of extensions that you can opt in for, but accept that things can have rough edges largely because it’s near the bleeding edge of language design. This is one of them.

              There is literally no point in having this feature if it turns out that nothing can be done about error reporting.

              Of course there is a point, it is to explore Linear Types in spite of the error reporting. Advice from early “hair shirt” adopters will likely drive the error reporting or work towards a better system. I’m just pointing out that error reporting may not yet be even thought of as a concept of how Linear type systems are used.

              I’m not saying it isn’t possible, I’m saying it might not even be worth considering yet. Sometimes you have to implement things before you can fully know how to do it right. Same reason why Haskell will eventually move from a lazy language to a strict language. Only after you get decades into a design might you find out it was a dead end. But until you try it, nobody will ever know. In those conditions explain how you expect to learn the tacit knowledge to be “professional”? Someone somewhere has to do the work, this is how it happens.

              If you ever wondered why software is in such an utter state of disrepair, it’s because of shit like this.

              To be blunt, your axiomatic thinking is not helping matters, this is what research IS. It is ugly, incomplete, and the only way to move forward. This is the exploration of computer science concepts never implemented anywhere in an existing language. Of course it won’t be pretty, it took 4 years to get to “it compiles and we can use it, let’s make it a POC/MVP and get feedback from people that see value from it”.

        2. 5

          Other people have sort-of said it, but not clearly enough I think. This is not a language feature being added. It is a feature-flagged experimental feature of a particular compiler. Most such compiler extensions never make it into real Haskell, and the ones that do take years after they are added to a compiler to make it to a language spec.

          1. 4

            for all practical purposes isn’t “real Haskell” defined by what ghc implements these days?

            1. 2

              Yes, all the other implementations are dead. They still work, but they won’t run most modern Haskell code, which usually uses a bunch of GHC extensions.

            2. 1

              You might say “isn’t it not popular to write standards-compliant Haskell these days?” and you’d be right. Of course it’s often trendy to write nonstandard C (using, say, GNU extensions) or nonstandard HTML/JavaScript. However, ignoring the standard being trendy doesn’t mean the standard doesn’t exist, or even that it isn’t useful. I always make sure my Haskell is Haskell2010, and I try to avoid dependencies that use egregious extensions.

          2. 2

            Honestly curious: are there any other Haskell compilers out there? Are they used in production?

            Also, what is a definition of a true Haskell? I always thought it’s what’s in GHC.

            1. 5

              There’s a Haskell which runs on the JVM - Frege. But it makes no attempt to be compatible with the version of Haskell that GHC implements, for good reasons. Hugs is a Haskell interpreter (very out of date now, but still works fine for learning about Haskell.) There are a bunch of other Haskell compilers, mostly research works that are no longer in development - jhc, nhc98 etc etc.

              But GHC is the dominant Haskell compiler by far. I don’t think there are any others in active development, apart from Frege, which isn’t interested in being compatible with GHC.

              (“True Haskell” is the Haskell defined in the Haskell Report, but real world Haskell is the Haskell defined by what GHC + your choice of extensions accepts.)

            2. 2

              There are other compilers and interpreters. None of them is anywhere near as popular as GHC, and usually when one does something interesting GHC consumes the interesting parts.

              There is definitely a standard, though: https://www.haskell.org/onlinereport/haskell2010/

              The whole reason language extensions are called “extensions” and require a magic pragma to turn on is that they are not features of the core language (Haskell) but experimental features of the compiler in question.

          3. 1

            In short, GHC Haskell is a language designed by survival-of-the-fittest.

        3. 3

          Overly terse error messages are bad, but they are better than wrong error messages. Some things are much harder to give helpful error messages for than others.

          I wish people spent more time improving error reporting, at least in cases where the way to do it is well understood. There is no reason for, say, a TOML or JSON parser to just say “Syntax error”. But YAML parsers are pretty much doomed to give unhelpful errors just because the language syntax is ambiguous by design.

          And then some errors are only helpful because we know what they mean. Consider a simple example:

          Prelude> 42 + "hello world"
          
          <interactive>:1:1: error:
              • No instance for (Num [Char]) arising from a use of ‘+’
              • In the expression: 42 + "hello world"
                In an equation for ‘it’: it = 42 + "hello world"
          

          How helpful is it to a person not yet familiar with type classes? Well, it just isn’t. It’s not helping the reader to learn anything about type classes either.

          1. 1

            I’ve seen some good suggestions on r/haskell for improving the wording of these errors.

        4. 2

          The error they’re talking about is a kind of type error they’ve not worked with. It’s produced if you forget to construct or use a structure. I’m guessing it’s technically “proper”, but the produced error message may be difficult to interpret.

          They’ve ensured it’s a feature you can entirely ignore if you want to. Not everybody’s convinced they need this.

          I otherwise dunno what they’re doing and I’m scratching my head at the message. Something like “Oh cool you’ve done this… … … So where are the types?”

        5. 2

          So you never got a C++ template error in the good olden days? Seriously though, it just got merged. It’s not released or “shipped” in any means.

          1. 0

            So you never got a C++ template error in the good olden days?

            No, because I looked at the language, figured out that the people involved completely lost their fucking mind, and moved on.

            Seriously though, it just got merged. It’s not released or “shipped” in any means.

            They took 4 years to arrive at the current state, which I’ll approximate at roughly 10% done (impl unfinished, spec has unresolved questions, documentation doesn’t really seem to exist, IDE support not even on the radar).

            So if you assume that there will be a Haskell version in the next 36 years, then this thing is going to end up in some Haskell release sooner or later.

            1. 2

              So if you assume that there will be a Haskell version in the next 36 years, then this thing is going to end up in some Haskell release sooner or later.

              Could you elaborate on this? If practical users of linear types will only use them if they have good error messages, and early testers want to work out the kinks now, what’s wrong with having a half-baked linear types feature with no error messages permanently enshrined in GHC 8.12?

    17. 1

      What happens if

      1. one file is modified on multiple offline devices
      2. and these devices get online
      1. 2

        Then you have a conflict. The conflict version is local to each machine and has to be fixed locally.

        So don’t do that. It’s not magic.

        1. 1

          Thanks. Does that mean you have to periodically check the website to ensure there are no (unintended) conflicts?

          1. 1

            By the website you mean Syncthing web interface (it runs locally)?

            But it’s actually not displaying the conflicts anywhere (unfortunately), so I’ve got an external script that checks for conflicted files every hour and shows a notification if there are any. Perhaps I should contribute and make it the default behaviour.
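
            Such a script can be quite small; a sketch (the folder path is an assumption; Syncthing puts “.sync-conflict-” in the names of conflict copies):

            from pathlib import Path

            SYNC_ROOT = Path.home() / "sync"   # assumption: your Syncthing folder

            conflicts = [p for p in SYNC_ROOT.rglob("*") if ".sync-conflict-" in p.name]
            for p in conflicts:
                print(f"conflict: {p}")        # or hook this up to a notification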

            1. 1

              Yes, by “website” I meant the Syncthing web interface.

              Not having conflict notifications is pretty much a bummer for me.

    18. 2

      I love Syncthing, and use it across multiple devices, including Android phones, to sync most of my files.

      For Android, I recommend the (weirdly named) Syncthing-Fork; it has some nice power saving features and sync conditions (e.g. so you can only sync select folders when you’re roaming). The only thing that annoys me is that Android OS doesn’t support symlinks, so I have to maintain separate Syncthing directories for Photos/Pictures/Downloads, etc.

      For sync problems: I have a script that’s doing ‘heartbeats’ every hour, merely by writing a timestamp into a special file (e.g. /syncthing/.heartbeat/$DEVICE). This is running on every device (so each device has its own heartbeat file). In addition, each device is checking for other devices’ heartbeats – if they fall behind for (say) a day, it shows an error. So far that’s been mostly overly paranoid, and on all occasions the problems weren’t with Syncthing itself, but with misconfiguration (e.g. firewall issues, wrong sync conditions on the phone, etc). But I feel like this kind of functionality could be helpful in Syncthing by default, maybe I should contribute it.
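
      A minimal sketch of the heartbeat idea (hypothetical paths, device name and threshold; the actual script presumably differs):

      import time
      from pathlib import Path

      HEARTBEAT_DIR = Path("/syncthing/.heartbeat")
      DEVICE = "laptop"             # assumption: this device's name
      STALE_AFTER = 24 * 3600       # a day, in seconds

      def beat():
          HEARTBEAT_DIR.mkdir(parents=True, exist_ok=True)
          (HEARTBEAT_DIR / DEVICE).write_text(str(int(time.time())))

      def stale_devices():
          now = time.time()
          return [
              f.name for f in HEARTBEAT_DIR.iterdir()
              if f.name != DEVICE and now - int(f.read_text()) > STALE_AFTER
          ]

      if __name__ == "__main__":
          beat()
          for device in stale_devices():
              print(f"WARNING: no heartbeat from {device} for over a day")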

      1. 2

        I feel like this kind of functionality could be helpful in Syncthing by default, maybe I should contribute it.

        Even if it doesn’t get merged, publishing your script somewhere publicly visible could prove useful to someone!

    19. 3

      That looks lovely! I’m a big hater of yaml-based configuration and reinventing the wheel, when things like that can be done in a proper programming language. I don’t have to manage many servers, but I like the idea of having reproducible state of the system. Like NixOs, but NixOs/Nix seems like a lot of work, whereas I’m 99% happy with simply using apt, pip and few ad-hoc commands.

      I was about to cleanup my current setup scripts (done with a bunch of scripts + Ansible), so I think I’ll give this tool a try. Do you think it makes sense for my usecase (personal desktop/laptop)? Are you using it for that purpose?

      1. 4

        I do think it makes sense - I do exactly the same :) So far I’ve used pyinfra for both ad-hoc/local box setup and also in production managing medium size (100’s of nodes) Elasticsearch clusters, amongst other things. An example is my (very WIP) MacBootstrap deploy: https://github.com/Fizzadar/MacBootstrap.

    20. 2

      heh, my idea for a better browser history a few years ago was extracting semantic markup (schema.org etc. — attempt to unify common vocabularies) and being able to have a history of not just pages, but e.g. organizations, people, events etc. mentioned on all the pages. Not sure if that would actually be useful though :D

      1. 1

        That would be really cool, like a proper Memex! But also I’d imagine very hard.

        1. 2

          this sort of entity extraction and classification is actually pretty straightforward with more or less modern nltk stacks - I did almost exactly this for work a few years ago, pulling people and companies from news articles.

          you’re going to want to search for:

          • Named Entity Recognition (NER)
          • Stanford NLTK

          check out this random tutorial I found (unaffiliated, just hate being given search targets without contextual examples)
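
          A minimal sketch of that pipeline with plain NLTK (assumes the punkt, averaged_perceptron_tagger, maxent_ne_chunker and words models have been downloaded via nltk.download()):

          import nltk

          def named_entities(text):
              tokens = nltk.word_tokenize(text)
              tagged = nltk.pos_tag(tokens)
              for subtree in nltk.ne_chunk(tagged):
                  if hasattr(subtree, "label"):  # chunks are Trees, plain words are tuples
                      name = " ".join(word for word, _ in subtree.leaves())
                      yield subtree.label(), name

          for label, name in named_entities("Tim Berners-Lee founded the W3C in Geneva."):
              print(label, name)   # e.g. PERSON Tim Berners-Lee, ORGANIZATION W3C, GPE Geneva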