Threads for ehamberg

  1. 8

    This is because fork+exec is still a slow path on POWER. It is similar on ARM, MIPS, and many other RISC architectures.

    I’m curious, why?

    1. 2

      It’s slow everywhere. fork+exec is a dumb UNIXism.

      1. 7

        Can you elaborate on why it’s dumb? It’s worked (for various values of worked) for many decades.

        1. 20

          This paper from 2019 explains it well. (But there is some good critique in the comments.)

          A fork() in the road

          Abstract:

          The received wisdom suggests that Unix’s unusual combination of fork() and exec() for process creation was an inspired design. In this paper, we argue that fork was a clever hack for machines and programs of the 1970s that has long outlived its usefulness and is now a liability. We catalog the ways in which fork is a terrible abstraction for the modern programmer to use, describe how it compromises OS implementations, and propose alternatives.

          1. 17

            Long story short, it works quite well in simple cases: no threads, no DLLs, no containers, no signal handlers, etc.

            Then when you try to involve any of the above in the process it suddenly becomes very inadequate and you don’t have many other tools for sensibly dealing with the complexity involved. Processes in the 2020’s involve a lot more state than they did in the 1970’s, and managing that state with fork+exec takes a lot of fragile and flaky work.

            1. 11

              Yes, worked.

              It is dumb because it turns process creation into duplicating a process and replacing it, which is way more complex (and thus slow).

              The only reason we still do it is the classic “we’ve always done it this way”. I expect sanity will impose itself in the end, but it will obviously not be a smooth change.

              One way or another, UNIX (including its clones) is going to be replaced by a multiserver, microkernel architecture anyway. Whatever the replacement is, it won’t rely on fork+exec for spawning executables.

              1. 4

                We don’t do it this way just because we’ve always done it this way. We do it this way because we’ve tried other ways and they are error prone. See vfork and posix_spawn for past unix approaches to move away from the status quo. Windows provides CreateProcessA, which is simpler but basically suffers from the same usability issues as posix_spawn.
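
                For reference, this is roughly what the posix_spawn path looks like: a minimal sketch under plain POSIX, not code from any program discussed here; /bin/echo and the argument vector are placeholders, and error handling is abbreviated.

                /* Sketch: spawn /bin/echo via posix_spawn instead of fork+exec. */
                #include <spawn.h>
                #include <stdio.h>
                #include <stdlib.h>
                #include <sys/wait.h>

                extern char **environ;

                int main(void)
                {
                    pid_t pid;
                    char *argv[] = { "echo", "hello from the child", NULL };

                    /* posix_spawn folds the fork+exec sequence into one call; the two NULLs
                     * are the file actions and attributes that cover the usual
                     * between-fork-and-exec tweaks (dup2, signal resets, and so on). */
                    int err = posix_spawn(&pid, "/bin/echo", NULL, NULL, argv, environ);
                    if (err != 0) {
                        fprintf(stderr, "posix_spawn failed: %d\n", err);
                        return EXIT_FAILURE;
                    }

                    int status;
                    waitpid(pid, &status, 0);
                    return WIFEXITED(status) ? WEXITSTATUS(status) : EXIT_FAILURE;
                }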

                I’d argue that we already have the replacement for the conventional approach. It is … fork + exec, just not in sequence. This can be seen in some programs that have large working sets and/or strong priv sep needs. The program essentially forks a process off very early that can serve as the main program and then makes requests to the original process (which has a minuscule footprint) to launch other processes. The fork is cheaper and there are no system integration challenges, such as running on a new OS version that is subject to new IPC restrictions/trade-offs (as often happens with microkernel systems).
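
                For concreteness, a rough sketch of that early-fork launcher pattern. Everything here (the pipe protocol, the helper names, the use of /bin/sh -c) is invented for illustration, and the roles are flipped relative to the description above: the helper is the forked child rather than the original parent. Real implementations are far more careful about errors and reaping.

                #include <stdio.h>
                #include <stdlib.h>
                #include <string.h>
                #include <unistd.h>
                #include <sys/wait.h>

                /* The launcher keeps a minuscule footprint: it is forked before the main
                 * program grows a heap or starts threads, so its own forks stay cheap. */
                static void launcher_loop(int cmd_fd)
                {
                    char buf[4096];
                    FILE *cmds = fdopen(cmd_fd, "r");
                    while (cmds && fgets(buf, sizeof buf, cmds)) {
                        buf[strcspn(buf, "\n")] = '\0';
                        pid_t pid = fork();
                        if (pid == 0) {
                            execl("/bin/sh", "sh", "-c", buf, (char *)NULL);
                            _exit(127);
                        }
                        if (pid > 0)
                            waitpid(pid, NULL, 0);
                    }
                    _exit(0);
                }

                int main(void)
                {
                    int pipefd[2];
                    if (pipe(pipefd) != 0)
                        return EXIT_FAILURE;

                    if (fork() == 0) {               /* fork the helper very early */
                        close(pipefd[1]);
                        launcher_loop(pipefd[0]);
                    }
                    close(pipefd[0]);

                    /* ...the rest of the program can now grow large, start threads, etc.
                     * When it needs a subprocess, it just asks the helper: */
                    dprintf(pipefd[1], "echo launched via the helper\n");
                    close(pipefd[1]);
                    wait(NULL);                      /* reap the helper once it sees EOF */
                    return 0;
                }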

                1. 6

                  fork and vfork have similar footguns. fork is pretty simple in a single-threaded program (you can do basically anything in the fork context) but that is really safe only if you control all of the code in your program, otherwise you have to assume that any library that you link may create threads. In a multithreaded program you may call only async-signal-safe functions between fork and execve and, in particular, may not call anything that might acquire a lock. If another thread has entered malloc and is holding a slow-path lock at the moment of the fork, the child can still call malloc and have it work on the fast paths that don’t require a lock, right up until a particular combination of heap state and timing makes it hit the slow path and try to acquire the lock owned by the other thread (which is not duplicated into the child), at which point it deadlocks. This is insanely hard to debug.

                  In contrast, with vfork, the only footgun is that any memory that’s allocated in the child but not freed is leaked. You can malloc memory in a vfork context, you just must free it before execve. The simplest thing to do is use RAII containers to allocate memory before the vfork call and then not do any allocation in the vfork context. This has the benefit that it also works with fork, it’s just slower. It’s much easier to debug than the fork failure mode because the memory is leaked on all execution paths and will show up with valgrind or whatever you use for checking leaks.
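
                  A minimal sketch of that discipline, assuming plain POSIX: everything the child needs is built before the vfork, and the child does nothing but execve or _exit (spawning /bin/ls is just a placeholder).

                  #include <stdlib.h>
                  #include <unistd.h>
                  #include <sys/wait.h>

                  extern char **environ;

                  static pid_t spawn_ls(void)
                  {
                      /* Build argv before vfork so the child never has to allocate. */
                      char *argv[] = { "ls", "-l", NULL };

                      pid_t pid = vfork();
                      if (pid == 0) {
                          /* Child borrows the parent's address space until execve/_exit,
                           * so it must not malloc, take locks, or return from this frame. */
                          execve("/bin/ls", argv, environ);
                          _exit(127);           /* only reached if execve failed */
                      }
                      return pid;               /* parent resumes once the child has execed */
                  }

                  int main(void)
                  {
                      pid_t pid = spawn_ls();
                      if (pid < 0)
                          return EXIT_FAILURE;
                      waitpid(pid, NULL, 0);
                      return 0;
                  }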

                  The problem with CreateProcess as an explicit call is that you need to do some stuff to the created process. On Windows, all of the system calls that modify the process environment take a HANDLE to the process, and modifying your own process is a special case. On *NIX, all of them assume the current process. For example, mmap modifies your own process; you’d need an mmap_other (or whatever) that took a process descriptor as an argument to allow you to modify the virtual address mapping of other processes. The same applies to setuid, setgid, and so on. This would be quite an invasive set of changes to POSIX. vfork is a simple hack that lets you use any system call that changes process state (except the memory mappings), without needing to duplicate all of them.

                  1. 2

                    fork is pretty simple in a single-threaded program (you can do basically anything in the fork context) but that is really safe only if you control all of the code in your program, otherwise you have to assume that any library that you link may create threads

                    Oh look, someone else who experienced that pain.

                    1. 3

                      Honestly, I think you could do the world a favour by making fork return an error if it is called in a multithreaded program. The set of constraints on fork in a multithreaded program is more restrictive than vfork’s, and it’s slower. With vfork, you may allocate but you must deallocate everything before execve if you don’t want memory leaks. With fork you may not allocate or call any other function unless it is explicitly marked as async-signal-safe. Even having fork silently call vfork if used in a multithreaded program would probably be an improvement. You’ll leak memory in the parent rather than deadlocking in the child.

                    2. 1

                      I certainly didn’t mean to say fork is good. I just meant that based on real world bugs it seems to be the easiest for people to understand and avoid said bugs. As much as alternatives can help avoid known pitfalls they also introduce their own pitfalls. The result is that even people who know of newer approaches tend to choose fork because they’ve also been burned by its replacements. The early pre-fork approach avoids the threading and memory mapping issues as much as possible. You do need to be careful that dependencies aren’t initialized in a way that can launch threads, open files, etc., but in practice that has proven easier (especially for larger teams with heterogeneous knowledge sets) than tracking the rough edges of each process spawning approach and always choosing the most appropriate one.

                      1. 1

                        I certainly didn’t mean to say fork is good. I just meant that based on real world bugs it seems to be the easiest for people to understand and avoid said bugs

                        Even with that caveat, I still disagree. With vfork, if I call malloc then I will deterministically leak memory if I forget to call free, so the easiest solution is to avoid calling malloc at all in the vfork block. If I get it wrong, then valgrind will tell me. With fork, if I call malloc then I will nondeterministically deadlock in some situations. I know not to call malloc in a fork context, but if I get it wrong then it’s almost impossible to debug via any mechanism other than reading the code and realising that this is what happened.

                        Worse, if I wrote code that called fork in a single-threaded program, it could call malloc and then if someone else comes along later and adds a background thread then they have introduced a nondeterministic bug into some code that they’ve never looked at, let alone modified.

                        The only way that I can use fork safely is to apply a stricter set of restrictions to myself than I would for vfork. In practice, I usually want my code to work with vfork or fork, and so I end up with the union of the restrictions (which basically boils down to not doing anything other than system calls in the child context).

                    3. 2

                      I’m by no means an expert in this, but aren’t usability issues meant to be addressed at the language level more so than at the architecture level? As in, just because a bare architecture has usability issues, doesn’t mean it isn’t worth considering building on top of, right? The opposite seems like letting the perfect be the enemy of the good.

                      1. 2

                        I’d argue that we already have the replacement for the conventional approach. It is … fork + exec, just not in sequence. This can be seen in some programs that have large working sets and/or strong priv sep needs. The program essentially forks a process off very early that can serve as the main program and then makes requests to the original process (which has a minuscule footprint) to launch other processes. The fork is cheaper and there are no system integration challenges, such as running on a new OS version that is subject to new IPC restrictions/trade-offs (as often happens with microkernel systems).

                        Android has the zygote setup for preforking processes with an initialized runtime, last I checked.

                        1. 2

                          Yeah that is the sort of approach I’m talking about. Chrome and openssh do something similar at the application level.

                          1. 1

                            Is that still true? I was under the impression that the zygote was removed a few years ago because it completely defeats ASLR. Every process was a forked copy of the same initial instance and so had a stable ASLR seed even across process restarts. This led to only about 8 bits of entropy in most pointers (from the allocator) and none for non-JIT’d code pointers, so an attacker who can try 256 times (made easier by the fact that Android helpfully restarts processes that crash) can deterministically compromise it.

                    4. 5

                      Sure, but the post implies it’s slower compared to x86, and I don’t know why that would be true.

                      Also shell scripts are notoriously slow on Windows compared to Linux because Windows doesn’t use a fork model. Perhaps that’s because UNIX shells are designed around forking. Fork certainly is more complicated and fragile as you outline in another comment, but slower? I’m not convinced. The specific case of fork+exec has been aggressively optimized. Is there a non-forking OS that trashes Linux in subprocess creation speed?

                      1. 3

                        Also shell scripts are notoriously slow on Windows compared to Linux because Windows doesn’t use a fork model

                        The lack of fork isn’t the reason that process creation is slower on Windows. Windows processes are very large. A new process has a load of DLLs mapped by the kernel, has multiple threads, and so on. Creating a new picoprocess on Windows is very fast.

                        The specific case of fork+exec has been aggressively optimized. Is there a non-forking OS that trashes Linux in subprocess creation speed?

                        Linux with vfork + execve is significantly faster than Linux with fork + execve.
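
                        A throwaway micro-benchmark sketch along those lines. Nothing rigorous: the 256 MB ballast only exists to give fork something to copy, /bin/true is an arbitrary target, and the numbers will vary wildly with the parent’s footprint, kernel version, and architecture.

                        #include <stdio.h>
                        #include <stdlib.h>
                        #include <time.h>
                        #include <unistd.h>
                        #include <sys/wait.h>

                        extern char **environ;

                        static double spawn_many(int n, int use_vfork)
                        {
                            char *argv[] = { "true", NULL };
                            struct timespec t0, t1;
                            clock_gettime(CLOCK_MONOTONIC, &t0);
                            for (int i = 0; i < n; i++) {
                                pid_t pid = use_vfork ? vfork() : fork();
                                if (pid == 0) {
                                    execve("/bin/true", argv, environ);
                                    _exit(127);
                                }
                                waitpid(pid, NULL, 0);
                            }
                            clock_gettime(CLOCK_MONOTONIC, &t1);
                            return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
                        }

                        int main(void)
                        {
                            /* Touch a big heap so fork has page tables worth duplicating. */
                            size_t len = 256u << 20;
                            char *ballast = malloc(len);
                            if (!ballast)
                                return EXIT_FAILURE;
                            for (size_t i = 0; i < len; i += 4096)
                                ballast[i] = 1;

                            printf("fork+execve : %.3fs\n", spawn_many(1000, 0));
                            printf("vfork+execve: %.3fs\n", spawn_many(1000, 1));
                            free(ballast);
                            return 0;
                        }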

                        1. 2

                          For POWER specifically, on Linux 5.4 at least, there is a slow path involving copy_from_user that is part of the syscall which is optimised away on x86. I believe part of this has to do with the fact that the MMU algorithm is swappable (HPT vs Radix). IIRC, the same copy_from_user happens on ARM, but I’ve long forgotten the arcana there.

                          For a good time, benchmark FreeBSD/sparc64’s fork against FreeBSD/amd64’s. That’s far worse because of how they handled process switching on the SPARC. (Linux comparison can be found in include/switch_to_64.h.)

                        2. 3

                          This is a thing that Windows does better. CreateProcess is its own call and you specify what the new process should be and get a kernel object handle that refers to it.

                          It does make porting unix programs that expect to fork workers instead of spawning threads something of a nightmare, mind you.

                          1. 1

                            What advantages do you see CreateProcess providing over posix_spawn?

                            1. 3

                              I never used posix_spawn so I don’t know its semantics. It wasn’t an option for most of the time when I was writing stuff that needed fork, and I haven’t updated my knowledge since.

                              1. 3

                                By itself, none, but posix_spawn was intentionally designed to be limited. It is a simple (hah!) API that fits the vast majority of cases, not a general-purpose process creation tool.

                                The NT APIs, in contrast, let you do absolutely anything to a process that you have a HANDLE for that you can do to your own process. You can create mappings, start threads, inject new handles, and so on, all without the target process needing to actively participate (in *NIX you could do this if you had some code running in the target that received FDs over a UNIX domain socket). Cygwin is able to implement fork in userspace on Windows by creating a new process and then creating mappings and copying memory across. It’s very slow and not something anyone should ever do, but it’s an interesting demonstration of what’s possible.
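
                                As a toy sketch of that point (minimal error handling, and it targets its own PID purely so it can run standalone): allocate a region in another process’s address space and write into it, with no cooperation from the target.

                                #include <windows.h>
                                #include <stdio.h>

                                static int inject_bytes(DWORD pid, const char *data, SIZE_T len)
                                {
                                    /* Only the rights needed for memory operations on the target. */
                                    HANDLE proc = OpenProcess(PROCESS_VM_OPERATION | PROCESS_VM_WRITE,
                                                              FALSE, pid);
                                    if (proc == NULL)
                                        return 0;

                                    /* Create a mapping in the *other* process's address space... */
                                    LPVOID remote = VirtualAllocEx(proc, NULL, len,
                                                                   MEM_COMMIT | MEM_RESERVE,
                                                                   PAGE_READWRITE);
                                    SIZE_T written = 0;
                                    /* ...and copy bytes into it. */
                                    BOOL ok = remote &&
                                              WriteProcessMemory(proc, remote, data, len, &written);

                                    CloseHandle(proc);
                                    return ok && written == len;
                                }

                                int main(void)
                                {
                                    const char msg[] = "hello";
                                    puts(inject_bytes(GetCurrentProcessId(), msg, sizeof msg)
                                             ? "wrote into the target process"
                                             : "failed");
                                    return 0;
                                }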

                            2. 3

                              So, why call it out for RISC?

                          1. 2

                            Last 3 computers in reverse order:

                            • Surface Pro 3 (Yes going back a ways): Loved it because I could code just fine on it, but also use it for D&D character sheets and DMing. It was so valuable for me to use that 2 other players in my group bought them as well just because of that. I ended up using that computer until it got smashed when I got rear ended.
                            • Dell XPS 15: I ended up buying mine for $1200 – it was the lowest specced one that I could get that still had an i7. I then spent $500 on a 2TB NVMe + 32GB RAM to upgrade it. I absolutely adored that computer. It was so good that my wife, who has been a long-time Apple-only user, actually bought one as her new laptop. She recently bought a NEW one after her old one died. It was amazing for dev work for me – I dual-booted linux/windows on it for coding/gaming. I had that computer for several years, until a really unfortunate drop broke a corner of it and then wear and tear on that corner over the next few months severed something important and it just completely stopped functioning.
                            • M1 Macbook Air (current): I bought this very recently after my Dell XPS 15 died. I had my eye on it after the M1 came out, and was waiting for a few months until all the dev software I cared about supported ARM and/or Rosetta. I bought it because, to be frank, I don’t need a fancy-ass computer. This thing was $1200 (for a Mac) and it has done everything I could possibly need. My entire dev environment is running including 1+ docker images and PyCharm, and it doesn’t even get warm (I don’t think I’ve ever heard sound from it either). Battery lasts for seemingly forever. My only gripe is that I wasn’t willing to get an MBP that just came out – magsafe + all the ports again was attractive, but it was almost double the price. No way.
                            • Future?: Unsure. If Apple keeps on the track they’re on right now with better prices and reasonable specs, I might legitimately stick with it. I am in no way married to a brand though, so I would readily move back to Dell/Lenovo/etc if it matched my needs.
                            1. 1

                              it doesn’t even get warm (I don’t think I’ve ever heard sound from it either)

                              The M1 MacBook Air is completely fanless.

                            1. 4

                              This would be really cool to see, but… C? Lambdas? Did I miss something?

                              1. 3

                                This paper builds on the assumption that at least simple lambdas are integrated into C23

                                1. 2

                                  ah. I guess I did miss something :)

                              1. 5

                                Layman’s question: why is systemd not using semantic versioning? Hard to understand if any breaking changes will be coming to distros upgrading to systemd 250. I am assuming it should correspond to something like 1.250.0 if full compatibility is preserved?

                                1. 8

                                  It may be as simple as Systemd predating the Semantic Versioning spec by almost five years.

                                  1. 2

                                    are you sure?

                                    none of the specs on semver.org are dated, but there is a wayback machine snapshot from 2009. systemd 1 was 2010.

                                    I also feel like the practice existed long before the semver spec, but that github co-founder certainly seems to be taking credit for it…

                                    1. 2

                                      SemVer certainly matches how version numbers were first explained to me in 2007.

                                      1. 0

                                        yet the guy who added tracking scripts to avatars is like “I propose a simple set of rules and requirements…”

                                      2. 1

                                         The first Semver commit (ca64580) was 14 Dec 2009 with its first release (v0.1.0) the next day. The first Systemd commit (f0083e3) was 27 Apr 2005 with its first release (0.1) the same day.

                                        I think you’re right that the first stable release of Systemd came after the Semver spec and that various forms of that practice were already around before it. In my (somewhat unreliable) memory it took years for Semver to reach the popularity it has now where it’s often expected of many projects.

                                        1. 1

                                          on wikipedia the initial release for systemd is listed as 30 March 2010, without any citation. perhaps it should read 27 April 2005.

                                           also I shouldn’t have assumed the initial release was called systemd 1. if they do point releases maybe they are indeed using semantic versioning.

                                    2. 3

                                      I think that semantic versioning is a lie, albeit a well-intentioned one. All it tells you is what the author thinks is true about their consumers’ expectations of their code, not what is actually true, so it can mislead. Having an incrementing release version says the same thing: “Something has changed” and gives the same practical guarantees: “We hope nothing you did depended on what we changed”.

                                      1. 17

                                        I don’t see it as a lie, more like a best effort guess at compatibility, which is really the best we could hope for.

                                        Semantic versioning is more of a social contract than a formal proof of compatibility.

                                        1. 4

                                          With that reasoning, why bother with any communication at all?

                                          1. 2

                                             True in this case, but there are ecosystems that will help authors enforce semantic versioning, e.g. Elm, where the compiler makes you increase the major version when it detects API changes, i.e. when the type of an exported function has changed.

                                          2. 1

                                            I don’t think it’s supposed to have breaking changes, is it?

                                            1. 1

                                              The breaking changes are documented in the release notes, but there’s very few of them considering the scope available.

                                            2. 1

                                              are you sure they aren’t using semantic versioning? they do have minor releases like 249.7.

                                            1. 13

                                              Genuine comment (never used Nix before): is it as good as it seems? Or is it too good to be true?

                                              1. 51

                                                I feel like Nix/Guix vs Docker is like … do you want the right idea with not-enough-polish-applied, or do you want the wrong idea with way-too-much-polish-applied?

                                                1. 23

                                                  Having gone somewhat deep on both this is the perfect description.

                                                   Nix as a package manager is unquestionably the right idea. However, Nix the language itself made some choices that are regrettable in practice.

                                                  Docker works and has a lot of polish but you eat a lot of overhead that is in theory unnecessary when you use it.

                                                2. 32

                                                  It is really good, but it is also full of paper cuts. I wish I had this guide when learning to use nix for project dependencies, because what’s done here is exactly what I do, and it took me many frustrating attempts to get there.

                                                  Once it’s in place, it’s great. I love being able to open a project and have my shell and Emacs have all the dependencies – including language servers, postgresql with extensions, etc. – in place, and have it isolated per project.

                                                  1. 15

                                                     The answer depends on what you are going to use nix for. I use NixOS as my daily driver. I am running a boring Plasma desktop. I’ve been using it for about 6 years now. Before that, I used Windows 7, a bit of Ubuntu, a bit of MacOS, and Arch. For me, NixOS is a better desktop than any of the others, by a large margin. Some specific perks I haven’t seen anywhere else:

                                                    NixOS is unbreakable. When using windows or arch, I was re-installing the system from scratch a couple of times a year, because it inevitably got into a weird state. With NixOS, I never have to do that. On the contrary, the software system outlives the hardware. I’ve been using what feels the same instance of NixOS on six different physical machines now.

                                                     NixOS allows messing with things safely. That’s a subset of the previous point. In Arch, if I installed something temporarily, it inevitably left some residue on the system. With NixOS, I install random one-off software all the time, I often switch between stable, unstable, and head versions of packages together, and that just works and is easily rolled back via an entry in the boot menu.

                                                    NixOS is declarative. I store my config on GitHub, which allows me to hop physical systems while keeping the OS essentially the same.

                                                    NixOS allows per-project configuration of environment. If some project needs a random C++ package, I don’t have to install it globally.

                                                    Caveats:

                                                    Learning curve. I am a huge fan of various weird languages, but “getting” NixOS took me several months.

                                                    Not everything is managed by NixOS. I can use configuration.nix to say declaratively that I want Plasma and a bunch of applications. I can’t use NixOS to configure plasma global shortcuts.

                                                    Running random binaries from the internet is hard. On the flip side, packaging software for NixOS is easy — unlike Arch, I was able to contribute updates to the packages I care about, and even added one new package.

                                                    1. 1

                                                      NixOS is unbreakable. When using windows or arch, I was re-installing the system from scratch a couple of times a year, because it inevitably got into a weird state. With NixOS, I never have to do that. On the contrary, the software system outlives the hardware. I’ve been using what feels the same instance of NixOS on six different physical machines now.

                                                      How do you deal with patches for security issues?

                                                      1. 8

                                                        I don’t do anything special, just run “update all packages” command from time to time (I use the rolling release version of NixOS misnamed as unstable). NixOS is unbreakable not because it is frozen, but because changes are safe.

                                                        NixOS is like git: you create a mess of your workspace without fear, because you can always reset to known-good commit sha. User-friendliness is also on the git level though.

                                                        1. 1

                                                           Ah I see. That sounds cool. Have you ever found an issue on updating a package, rolled back, and then taken the trouble to sift through the changes to take the patch-level changes but not the minor or major versions, etc.? Or do you just try updating again after some time to see if somebody fixed it?

                                                          1. 4

                                                             In case you are getting interested enough to start exploring Nix, I’d personally heartily recommend trying to also explore the Nix Flakes “new approach”. I believe it fixes most pain points of “original” Nix; two exceptions not addressed by Flakes being: secrets management (which will have to wait for another time), and documentation quality (which for Flakes is now at an even poorer level than that of “Nix proper”).

                                                            1. 2

                                                               I didn’t do exactly that, but, when I was using a non-rolling release, I combined a base system of older packages with a couple of packages I kept up to date manually.

                                                      2. 9

                                                        It does what it says on the box, but I don’t like it.

                                                        1. 2

                                                          I use Nixos, and I really like it, relative to how I feel about Unix in general, but it is warty. I would definitely try it, though.

                                                        1. 1

                                                          I’ve been using massren[0] for a long time. It lets you rename all the files in CWD with $EDITOR.

                                                          1. https://github.com/laurent22/massren
                                                          1. 3

                                                            vidir, which comes with moreutils, also lets you do exactly this.

                                                            1. 1

                                                              IIRC, moreutils also has qmv and qcp. Great tools.

                                                            1. 10

                                                              Considerably better than I was expecting. High-level summary: computer vision company founded before the deep learning revolution outperforms neural nets via huge amounts of baked-in domain knowledge.

                                                              1. 4

                                                                baked-in domain knowledge

                                                                Bravo, sir!

                                                                1. 1

                                                                  Fortunately, not everything is “deep” learning

                                                                1. 18

                                                                  Other posts in this series:

                                                                  • Why All My Servers Have a Process that Allocates 1 GB of RAM and Then Sleeps Forever
                                                                  • Why All My Servers Have a Process that Runs an Infinite Loop on One of the Cores
                                                                   • Why All My Databases Have a Table With 1 GB of NUL Strings
                                                                    1. 11

                                                                      You can rely on Go picking up stupid ideas and doing them for real ..

                                                                      1. 3

                                                                        Thanks for sharing – that was a fun read! “Ballast” is such a perfect name for this.

                                                                    1. 2

                                                                      until recently GHC’s runtime relied on a mix of volatile variables and explicit memory barriers to enforce memory consistency

                                                                      This is kind of shocking. One of the first things I learned about memory ordering is that I shouldn’t use volatile for it. Maybe this just shows GHC’s age.

                                                                      1. 4

                                                                         Yeah, the fact that GHC’s runtime was written well before memory consistency models were widely understood, and nearly two decades before the standardization of C11 atomics (emphasis mine) really stood out to me.
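
                                                                         For anyone who hasn’t seen the newer style, here is a generic sketch (not anything from GHC’s RTS) of a single-producer flag/payload handoff written with C11 atomics instead of volatile plus hand-rolled barriers.

                                                                         #include <stdatomic.h>
                                                                         #include <stdbool.h>
                                                                         #include <stdio.h>
                                                                         #include <threads.h>

                                                                         static int payload;                /* ordinary data */
                                                                         static atomic_bool ready = false;  /* synchronisation goes through the atomic */

                                                                         static int producer(void *arg)
                                                                         {
                                                                             (void)arg;
                                                                             payload = 42;
                                                                             /* release: the payload write is visible before the flag reads as true */
                                                                             atomic_store_explicit(&ready, true, memory_order_release);
                                                                             return 0;
                                                                         }

                                                                         int main(void)
                                                                         {
                                                                             thrd_t t;
                                                                             thrd_create(&t, producer, NULL);
                                                                             /* acquire: pairs with the release store; no volatile, no explicit barrier */
                                                                             while (!atomic_load_explicit(&ready, memory_order_acquire))
                                                                                 ;
                                                                             printf("%d\n", payload);       /* guaranteed to print 42 */
                                                                             thrd_join(t, NULL);
                                                                             return 0;
                                                                         }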

                                                                      1. 18

                                                                        the user may specify extra, language-specific dependencies to model the edits more accurately, for example the dependency between introducing a function and using it in another file or another part of the same file.

                                                                        Wow. That’s the first time I’ve heard about this feature. That’s super interesting! I guess that relationship will have to be stored in the patch, then?

                                                                        1. 11

                                                                          Yes: in Pijul, dependencies are already stored in patches, and there is already a way to add them manually.

                                                                          1. 8

                                                                            FWIW, you might want a hat for making it clear when you talk from a position of authority about the subject.

                                                                            1. 3

                                                                                 I’ve requested one, thanks.

                                                                              1. 1

                                                                                That’s actually a really interesting feature. I never paid those hats much attention. I just assumed they were like user-set Reddit flairs.

                                                                              2. 3

                                                                               I really like the idea of going somewhere beyond current (D)VCSs. The associativity is great. I would like to know how far your goals go. Are you going towards semantic diff/merge? Because some operations are not possible without support for a particular programming language or data format.

                                                                                With Pijul, developers can be 100% confident that the code they reviewed is the code that gets merged, which is not necessarily the case in Git and Mercurial.

                                                                                I am in doubt whether this is possible at all. If I want to be really sure, I have to do (a second) review after the merge. There are also cases where the merge runs flawlessly from the text/line point of view, but the result is wrong on the semantic level (two methods with same name/signature, two enum items with same value etc.)

                                                                                The question is whether these semantic operations belong to the VCS or should be done separately (and possibly on top of any VCS)…

                                                                                1. 4

                                                                                  There are also cases where the merge runs flawlessly from the text/line point of view, but the result is wrong on the semantic level (two methods with same name/signature, two enum items with same value etc.)

                                                                                 Sure, but at least with Pijul you can predict what the merge will be with absolute confidence. You can also be sure that merging changes one by one and merging them all at once will yield the same result.

                                                                                  The question is whether these semantic operations belong to the VCS or should be done separately (and possibly on top of any VCS)…

                                                                                  You can model dependencies between changes in Pijul, and one could totally imagine adding hooks to add dependencies.

                                                                                2. 2

                                                                                  Are there any papers / specifications that you relied on for writing pijul that you could share? The ‘towards 1.0’ post gave a nice idea what is going on, and the posts on initialcommit are also nice but I’d also be interested in seeing some of the research underlying the design.

                                                                                  1. 5

                                                                                    It started with “A categorical theory of patches” (Mimram, Di Giusto), but there is a lot more in Pijul than just that paper. I still have to write the papers. I do have some of the proofs written up, but it’s not a paper yet.

                                                                                    1. 2

                                                                                      Cool, eagerly awaiting the writeups, in the meanwhile there is the code :)

                                                                                      Thanks for all your hard work.

                                                                                      1. 1

                                                                                        Out of curiosity, do you know what the relationship is between academic papers and FOSS licenses? Theoretically, could someone read the papers you publish and then turn around and write an MIT-licensed DVCS similar to pijul?

                                                                                        To be clear, I have absolutely no interest (is negative interest a thing?) in doing so. I’m just wondering how academia views licensing and public information and what the standards are.

                                                                                  2. 7

                                                                                    This is really interesting. When using git I try to write small, atomic commits where each change is as small as possible and passes CI (and could be deployable to production without breaking anything). I feel like, with git, it takes a lot more effort to track dependencies between changes than it should and this kind of functional dependency tracking system would make it easier.

                                                                                  1. 11

                                                                                    There’s a great blog post by Conal Elliott about the false belief that “everything is a function” in Haskell:

                                                                                    There a belief about Haskell that keeps popping up in chat rooms and mailing lists — one that I’ve been puzzling over for a while. One expression of the belief is “everything is a function” in Haskell.

                                                                                    Of course, there are all of these non-functions that need to be accounted for, including integers, booleans, tuples, and lists. What about them? A recurring answer is that such things are “functions of no arguments” or functions of a one-element type or “constant functions”.

                                                                                    I wonder about how beliefs form, spread, and solidify, and so I asked around about how people came to this notion and how they managed to hold onto it. I had a few conjectures in mind, which I kept to myself to avoid biasing people’s responses. Of the responses I got, some were as I’d imagined, and some were quite surprising to me, revealing some of my blind spots about others’ thinking and about conversation dynamics.

                                                                                    http://conal.net/blog/posts/everything-is-a-function-in-haskell

                                                                                    1. 19

                                                                                       This interview with the (former) Helm maintainer is really good. Turns out he’s an alpine guide who never programmed for a living.

                                                                                      https://sachachua.com/blog/2018/09/interview-with-thierry-volpiatto/

                                                                                      1. 19

                                                                                        Thanks for sharing this! I’m the author of Gleam (and this post). Very happy to answer any questions :)

                                                                                        1. 6

                                                                                          Thank you for your work on Gleam! It looks really promising, and it’s been great seeing it progress from the sideline.

                                                                                          Is it easy to integrate it (e.g. writing one module in Gleam) in an existing Erlang + rebar3 project? (Is this documented somewhere?)

                                                                                          1. 7

                                                                                            Yes for sure. Currently we don’t have a dedicated build tool so all Gleam projects are rebar3 projects with a project plugin (https://github.com/gleam-lang/rebar_gleam), so compilation of Erlang works as per usual.

                                                                                            There’s also a mix plugin for Elixir projects (https://github.com/gleam-lang/mix_gleam).

                                                                                            The tooling is a bit rough-and-ready at the moment, I’m hoping to improve it in the near future.

                                                                                        1. 9

                                                                                          What is your favorite pitfall in Date?

                                                                                          Has to be toISOString(). Claims to return ISO8601, which contains the timezone offset, but instead it just gives you the GMT string, even though it’s perfectly aware of the timezone information:

                                                                                          // It's 15.44 in Europe/Warsaw
                                                                                          > dt.getTimezoneOffset()
                                                                                          -120
                                                                                          > dt.toISOString()
                                                                                          '2020-08-02T13:44:03.936Z'
                                                                                          
                                                                                          1. 5

                                                                                            That is a valid ISO 8601 timestamp. The ‘Z’ (“zulu”) means zero UTC offset, so it’s equivalent to 2020-08-02T15:44:03.936+02:00.

                                                                                            1. 3

                                                                                               Oh, it is valid, yes. It’s just less useful than one containing the TZ information that is stored in that Date object. It’s correct, but less useful than it could be (and making it more useful would have taken little extra effort).

                                                                                              1. 3

                                                                                                Ah, I misunderstood you, then. When you wrote “claims to return ISO 8601” I thought you meant that it wasn’t actually an ISO 8601 string.

                                                                                                So what you mean is that the “encoding” of the of the ISO 8601 string should reflect the local timezone of the system where you call .toISOString()? I.e. 2020-08-02T15:44:03.936+02:00 if you called .toISOString() on a CEST system and 2020-08-02T09:44:03.936-04:00 if you called it on an EDT system?

                                                                                                1. 2

                                                                                                  I’d expect it to not lose the timezone information, given that it already uses a format that supports that information. It’s not incorrect, it’s just less useful that it could be. Perhaps that’s just the implementation, not the spec – but I’m yet to see it implemented differently. It’s not a huge deal, it’s just frustrating that it could’ve been better at a little cost and yet no one bothered, apparently.

                                                                                                   It’s not about the system it’s called on – that determines the timezone that’s already in the object, as my code snippet showed. I’d expect the data that’s already there to be included in the formatting, instead of being converted to UTC, lost and disregarded. If implemented better, toISOString could’ve been a nice, portable, lossless serialization format for Dates – but as it is, a roundtrip gives you a different date than you started with, because it will now always come back as UTC.

                                                                                                  1. 2

                                                                                                    I would actually assume that getTimezoneOffset is a class method that just looks at your system’s configured time zone and does not read anything from the Date object. I’m pretty sure the object does not store information about the timezone of the system in which it was generated, because it’s never needed. You can always convert to the timezone you want at read time.

                                                                                                    This is also what PostgreSQL does. If you create a column for “timestamps with timezone” it will discard the timezone information at write time and just use UTC (because why not?). The only thing that is different when you choose a timestamp column with timezone is that at read time it will convert values from columns to the configured timezone. All it stores is the number of seconds since the epoch.

                                                                                                    If you look at Firefox’s JS source, it looks like they also just store the seconds since the Unix epoch in a Date object, no timezone information: https://github.com/mozilla/gecko-dev/blob/d9f92154813fbd4a528453c33886dc3a74f27abb/js/src/vm/DateObject.h

                                                                                                2. 3

                                                                                                  I don’t believe Date contains a time offset. As far as I’m aware, like many languages, the problem is not that the APIs ignore the time offset - they would have to silently reach into the client locale to get it, which would be misleading and make it easy to create bugs. the problem is that they named it “Date” when it’s really just a point in absolute time. Combine a Date with the client locale’s time offset and you’ve got yourself a date, but a Date is not a date.

                                                                                              2. 5

                                                                                                This is a namespacing error that’s common when methods are on objects like this. getTimezoneOffset is a property here of the client locale, not of the date time object.

                                                                                              1. 5

                                                                                                 With all the enthusiasm for zettelkasten/second-brain like systems (roam, org-roam, now this), I’m surprised that I haven’t heard of an external format/tool that various UIs can interface with. VSCode, at least that’s my impression, is the kind of editor that gets displaced from its throne every few years by the next new thing, as has happened to Sublime and Atom before, so I certainly wouldn’t be too confident in making my “second brain” depend on it, except maybe if it’s used as a brainstorming tool for projects, but then it would have to be distributable too – but from skimming the article that doesn’t seem to be the case.

                                                                                                Edit: Fixed the first sentence, sorry for my ignorance. Also I missed that this is markdown based, so I guess the rest of the comment isn’t quite right either, but I guess/hope my general point is still legitimate.

                                                                                                1. 6

                                                                                                  I’m surprised that nobody has been working on an external format/tool that various UI’s can interface

                                                                                                   Check out neuron, which is editor-independent, has native editor extensions, but can also interface (in the future) with editors through LSP.

                                                                                                  Some examples of neuron published sites:

                                                                                                  Easiest way to get started (if you don’t want to install yet): https://github.com/srid/neuron-template

                                                                                                  1. 3

                                                                                                     That sounds cool, but I don’t really get why LSP would help? I (personally) would much prefer a native client, in my case for Emacs, to something that forces itself into a protocol for program analysis.

                                                                                                    1. 2

                                                                                                      Well, neuron does have native extensions for emacs and vim (see neuron-mode and neuron.vim) - but LSP support just makes multiple editor support easier by shifting common responsibility to a server on neuron.

                                                                                                      EDIT: I’ve modified the parent comment to clarify this.

                                                                                                    2. 1

                                                                                                      Is there any easier way to install (i.e. without nix?) I’m on a laptop and installing new toolchains is prohibitive for the low storage I have.

                                                                                                      1. 1

                                                                                                        Nix is the only way to install neuron (takes ~2GB space including nix and deps), until someone contributes support for building static binaries.

                                                                                                        But I’d encourage you give Nix a try anyway, as it is beneficial even outside of neuron (you can use Nix to install other software, as well as manage your development environments).

                                                                                                        1. 2

                                                                                                          I got a working binary with nix-bundle, that might be a simpler option. It’s a bit slow though, especially on first run when it extracts the archive. nix-bundle also seems to break relative paths on the command line.

                                                                                                          1. 1

                                                                                                            Interesting. Last time I tried nix-bundle, it had all sorts of problem. I’ll play with it again (opened an issue). Thanks!

                                                                                                    3. 3

                                                                                                      Isn’t the markdown that this thing runs on exactly that external format, and one that has been getting adoption across a wide range of platforms and usecases at that?

                                                                                                      1. 3

                                                                                                        There is tiddlywiki and the tiddler format.

                                                                                                        1. 2

                                                                                                          I wish the extension used the org format instead of markdown (so if something happens to vscode, I can use it with emacs), but otherwise I totally agree with your comment!

                                                                                                          1. 2

                                                                                                            You can use markdown files with org-roam in emacs by using md-roam. I prefer writing in Markdown most of the time, so most of my org-roam files are markdown files.

                                                                                                        1. 6

                                                                                                          Is there a video link for this talk? I want to watch it!

                                                                                                          1. 10

                                                                                                            This one has me conflicted. On the one hand, I understand this reasoning and agree with it, in the specific case discussed — a local University computing system. I think the author is probably making the right choice for their users.

                                                                                                            On the other hand, I will still almost always recommend people just use UTC because it’s a safe long term default. I’ve worked with multiple companies now where all the servers are still on Pacific Time despite opening many global offices over the years. Many of their users and developers are now doing time zone math anyway, because they don’t all live in California anymore. But now they have the added adventure of daylight savings time! 😉

                                                                                                            Granted, if your whole mission is focused on serving a given locality, like a University, you’re probably safe with local time. But even then… as soon as you look at computers intended for research collaboration, that might go out the window too. I’ve seen plenty of academic HPC systems that eventually have more external users than internal, as they get linked into wider research programs.

                                                                                                            1. 5

                                                                                                              On the other hand, I will still almost always recommend people just use UTC because it’s a safe long term default.

                                                                                                              Anything should be a safe long-term default, as long as you’re consistent or store the corresponding time zone. The problems usually happen when you have something like 2020-05-15 17:56:34 and no idea which TZ that refers to, or need to get the user’s configured TZ (which may change). But if it’s stored as 2020-05-15 17:56:34 +0800 then it’s always safe and easily convertible to whatever timezone.

                                                                                                              IMO “always use UTC” is much better phrased as “always know which TZ a time is”. Using UTC internally everywhere is often a convenient way of doing that, but not always.
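                                                                                                              As a minimal sketch of the “always know which TZ a time is” rule, using Python’s standard-library zoneinfo module (3.9+); the zone names and timestamps are only illustrative:

                                                                                                                  from datetime import datetime, timedelta, timezone
                                                                                                                  from zoneinfo import ZoneInfo

                                                                                                                  # A timestamp with an explicit offset is unambiguous...
                                                                                                                  stamp = datetime(2020, 5, 15, 17, 56, 34, tzinfo=timezone(timedelta(hours=8)))

                                                                                                                  # ...so it can be converted to UTC or any other zone later.
                                                                                                                  print(stamp.astimezone(timezone.utc))             # 2020-05-15 09:56:34+00:00
                                                                                                                  print(stamp.astimezone(ZoneInfo("Europe/Oslo")))   # 2020-05-15 11:56:34+02:00

                                                                                                                  # The same wall-clock time *without* a zone is the problematic case:
                                                                                                                  naive = datetime(2020, 5, 15, 17, 56, 34)
                                                                                                                  print(naive.tzinfo)  # None - no way to know which instant this refers to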

                                                                                                              1. 1

                                                                                                                But if it’s stored as 2020-05-15 17:56:34 +0800 then it’s always safe and easily convertible to whatever timezone.

                                                                                                                I think that’s a lucky example, since there’s little daylight saving time in that part of the world. Much of the world does move its clocks around, though, so for part of the year you may share a DST offset with someone else precisely when it matters, and these timestamps are being used specifically for correlation.

                                                                                                                The reason UTC is better is that we know where 0° longitude is and we know it doesn’t observe daylight saving time. Was that the result of a car crash into a telephone pole at almost six o’clock? In parts of the world a local timestamp can’t tell you that, but a UTC-reported date will.

                                                                                                                The reason UTC is worse, of course, is because sometimes people don’t provide UTC dates, and sometimes people confuse the time in London or Greenwich with UTC so their reports are bad, and people rarely make this mistake with local time (as the author points out).

                                                                                                                1. 4

                                                                                                                  Won’t a format like that take care of DST? For example in Western Europe it would be +0100 in the winter, and +0200 in the summer.
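                                                                                                                  For concrete timestamps that’s what happens; a quick check with Python’s zoneinfo (Europe/Berlin chosen purely as an example) shows the recorded offset already following DST:

                                                                                                                      from datetime import datetime
                                                                                                                      from zoneinfo import ZoneInfo

                                                                                                                      berlin = ZoneInfo("Europe/Berlin")

                                                                                                                      # Same wall-clock time in winter and summer: the offset differs.
                                                                                                                      print(datetime(2020, 1, 15, 9, 0, tzinfo=berlin).isoformat())  # 2020-01-15T09:00:00+01:00
                                                                                                                      print(datetime(2020, 7, 15, 9, 0, tzinfo=berlin).isoformat())  # 2020-07-15T09:00:00+02:00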

                                                                                                                  1. 2

                                                                                                                    Not always. Consider an event in the future that was planned before a decision to change daylight saving time: it’s still going to open at 9 am local time, whatever local means by then, on some given future date.

                                                                                                                    1. 2

                                                                                                                      Assuming the DST change is communicated clearly and in good enough time to the TZ database maintainers… you’d be surprised how often this is not the case.

                                                                                                                      Off the top of my head, I can mention Jordan, Turkey, Armenia, Egypt and Russia announcing changes to their DST change schedules with very short notice.

                                                                                                                      My fear is that this will happen in the EU too, considering that most politicians don’t really seem to understand the implications of changing the DST schedule…

                                                                                                                      1. 1

                                                                                                                        Even with past dates, calculations will only be correct if the library is using a correct time zone database that includes the full change history, and is using it correctly. That is not always the case.

                                                                                                                        Using UTC and converting to local time only when needed saves one from a lot of “if’s”.
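                                                                                                                        A small Python sketch of that approach (dates and zone picked arbitrarily): keep storage and arithmetic in UTC, and derive a wall-clock time only at the edge.

                                                                                                                            from datetime import datetime, timezone
                                                                                                                            from zoneinfo import ZoneInfo

                                                                                                                            # Two events stored in UTC, straddling the European DST change on 29 March 2020.
                                                                                                                            a = datetime(2020, 3, 28, 23, 30, tzinfo=timezone.utc)
                                                                                                                            b = datetime(2020, 3, 29, 1, 30, tzinfo=timezone.utc)

                                                                                                                            # Elapsed time is plain subtraction - no DST rules involved.
                                                                                                                            print(b - a)  # 2:00:00

                                                                                                                            # Local time is computed only when it has to be displayed.
                                                                                                                            print(a.astimezone(ZoneInfo("Europe/Berlin")))  # 2020-03-29 00:30:00+01:00
                                                                                                                            print(b.astimezone(ZoneInfo("Europe/Berlin")))  # 2020-03-29 03:30:00+02:00 (clocks jumped forward)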

                                                                                                                        1. 2

                                                                                                                          Only if you’re using a format that doesn’t save the UTC offset, right? I don’t see how the interpretation of an ISO-8601 datetime like 2020-05-18T08:42:57+02:00 can change in the future.

                                                                                                                          1. 1

                                                                                                                            It can change if you’re planning to schedule something at 09:00 (AM) on 15 Jun 2022 in Berlin, and the current schedule for DST tells you that Germany will be observing DST at that time.

                                                                                                                            Right now we don’t know what EU countries are going to do with DST - abolish it, and stay on normal time? Abolish it and stay on summer time?

                                                                                                                            If Germany decides to stay on normal time, the intended 9 AM local meeting becomes 2022-06-15T09:00:00+01:00 (9 AM CET), one hour later than the instant you stored as 2022-06-15T09:00:00+02:00.
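                                                                                                                            One way to express that difference in Python (purely illustrative, and assuming the machine’s tz database is up to date): store the wall-clock time plus the zone name when the intent is “9 AM local”, and resolve the offset only when it’s needed.

                                                                                                                                from datetime import datetime
                                                                                                                                from zoneinfo import ZoneInfo

                                                                                                                                # Intent: "9 AM local time in Berlin on 15 June 2022".
                                                                                                                                meeting = datetime(2022, 6, 15, 9, 0, tzinfo=ZoneInfo("Europe/Berlin"))

                                                                                                                                # The offset is resolved against the tz database in use right now:
                                                                                                                                print(meeting.isoformat())  # 2022-06-15T09:00:00+02:00 under today's rules

                                                                                                                                # Persisting only the resolved "+02:00" freezes today's DST rules. If the
                                                                                                                                # rules change, re-resolving "09:00 Europe/Berlin" against an updated tz
                                                                                                                                # database still matches the local-time intent; the frozen offset does not.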

                                                                                                                            1. 2

                                                                                                                              Yeah, that makes sense, but the comment above mine was talking about calculations on past dates.

                                                                                                                              1. 1

                                                                                                                                Sorry, I misinterpreted your comment! Yeah, past dates are generally “safe”.

                                                                                                              1. 23

                                                                                                                It only works in Google Chrome and Microsoft Chrome, unfortunately:

                                                                                                                For the best experience with Codespaces, we recommend using a Chromium-based browser, like Google Chrome or Microsoft Edge. Firefox is currently unsupported, and there are known issues using Safari.

                                                                                                                1. 12

                                                                                                                  Codespaces allows you to develop in the cloud instead of locally. Developers can contribute from anywhere, on any machine, including tablets or Chromebooks

                                                                                                                  …and on iOS all browsers including Chrome use the Safari rendering engine so this doesn’t really open up development on the most popular tablet platform at all.

                                                                                                                  1. 1

                                                                                                                    I imagine they will add that.

                                                                                                                  2. 4

                                                                                                                    Before that note is this paragraph, though:

                                                                                                                    During the beta, functionality is limited.

                                                                                                                    So hopefully once it’s actually released it will be usable in every browser.

                                                                                                                    1. 1

                                                                                                                      It only works in Google Chrome and Microsoft Chrome, unfortunately:

                                                                                                                      To be honest, it’s quite scary to run all of that inside a browser. Can you imagine the performance on that?

                                                                                                                      1. 1

                                                                                                                        It probably performs fine on most development machines, to be fair.