Threads for rovaughn

    1. 7

      Hang on… by “native” they mean “in Go”? Interesting choice!

      1. 20

        I imagine it’s easier to port their codebase when they don’t have to deal with adding memory management. Concerns like ownership and reference cycles can affect the way you design the data structures, and make it difficult to preserve an architecture that wasn’t designed that way.

        I’m kind of hoping that as a result of this project they come up with a TS-to-Go transpiler. That would make TS close to being my dream language.

        1. 9

          Hejlsberg addresses the question of a native TypeScript runtime towards the end of this interview https://youtu.be/10qowKUW82U?t=3280s

          He talks a bit about the difficulties of JavaScript’s object model. If you implement that model according to the JavaScript spec in a simple manner in Golang, the performance will be terrible. There are a lot of JS performance tricks that depend on being able to JIT.

          What might be amusing is an extra-fancy-types frontend for Golang, that adds TypeScript features to Golang that the TypeScript developers want to use when writing tsgo.

          1. 2

            He also mentions the syntax-level TS-to-Go transpiler they wrote; I don’t know the timestamp for that part, though.

        2. 8

          I’m surprised that a group inside Microsoft that’s presumably led by the creator of C# (author of the post) chose Go. Not because I think C# would have been better for a TypeScript compiler, but because I would have guessed C# AOT would have been the default (even if just for intellectual property reasons) and they had good reasons to use something else.

          Did they prefer Go’s tooling? Was it Go’s (presumably) smaller runtime? Maybe just Go is more mature for AOT (since it was always AOT)?

          1. 6

            Good guesses! Kerrick’s comment links to a video interview which addresses this.

          2. 5

            I was slightly surprised as well. They may have been influenced by esbuild (see Evan Wallace’s summary on why he went with Go over Rust). They may even be reusing some code from or integrating with esbuild in some way, though it doesn’t seem likely to me. My personal preference would lean toward Rust for something like this but I can see why they’d use a native GC’d language.

            1. 2

              I think if they’d had the experience and time, they might have chosen Rust, but they wanted to deliver something in under a year, without learning a new language, and get the performance gains now.

            1. 13

              pmeunier has an account here and would be most qualified to add some context, but I can add a bit here.

              Pijul is a patch-based VCS, as opposed to snapshot-based, like Git, Mercurial, etc. Up to this point, the leading example of a patch-based VCS other than Pijul has been Darcs, written in Haskell. I haven’t used it much. Pijul’s main motivation over Darcs was algorithmic improvements that resolve the worst-case exponential time Darcs can run into; see the Why Pijul? and Theory pages for a bit more context. There’s supposed to be some theoretical soundness improvements as well over Darcs, but I don’t know as much about that.

              Nest has been the main service for natively hosting Pijul projects with a web UI, made by the same team/developer. From what I remember, like git, a Pijul repository can be used remotely over SSH; Nest is more like having gitea/gitlab in addition to that.

              I remember it being closed source with the intention of making it open source later; I think the rationale was around Nest still being alpha and not having the resources at the time to field it as a full open source project in addition to Pijul itself, though nest.pijul.com was available to host other open source projects with Pijul.

              The news here would be that Nest’s been recently open sourced. As someone who’s been interested in Pijul but hasn’t had much opportunity to use it yet, this sounds like significant news that should make adoption more practical in general. Congrats to pmeunier and co., and thank you for your interesting and generous work in the VCS space!

              1. 8

                There’s supposed to be some theoretical soundness improvements as well over Darcs, but I don’t know as much about that.

                As I recall, the problem is that patches in Darcs don’t necessarily commute. For all the nice math to work out, you want independent patches to commute, that is to say, applying patch A followed by patch B, should give you the same result as applying patch B first, followed by patch A. But patches aren’t guaranteed to do that in Darcs, and the only way to ensure this is to simply test pairs of patches by applying them and seeing if both orders give the same result.

                In Pijul, if you have two patches that don’t depend on each other, they always commute. Either they don’t conflict, in which case the non-conflicting outcome is the same regardless of the order, or they do conflict, in which case you get the exact same conflict state regardless of the order.
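                To make non-commutation concrete, here’s a toy Python model of purely positional patches. This is not how Darcs or Pijul actually represent changes; it’s just the naive representation that illustrates the problem:

```python
# A toy model (not Darcs' or Pijul's real representation): patches are
# positional line insertions. Two "independent" insertions at the same
# index fail to commute, which is exactly the kind of case a sound
# patch theory has to handle.
def apply(doc, patch):
    index, line = patch
    return doc[:index] + [line] + doc[index:]

doc = ["a", "b"]
p1 = (1, "x")
p2 = (1, "y")

print(apply(apply(doc, p1), p2))  # ['a', 'y', 'x', 'b']
print(apply(apply(doc, p2), p1))  # ['a', 'x', 'y', 'b'] -- order matters
```

                Roughly speaking, Pijul gives inserted lines stable identities instead of raw indices, which is what lets independent changes commute (or produce the very same conflict state) in either order.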

                1. 5

                  I found these public posts about the history of this Nest implementation and the plans to open source it:

                  1. pmeunier’s 2023-05-23 blog post “A new direction for Pijul’s hosting service” announced a rewrite of Nest that would be more maintainable and would also be open source.
                  2. pmeunier’s 2023-11-10 forum post said “There are currently two very different versions of the Nest, of which the most recent one was supposed to become open source and self-hostable, but it never worked out. I’m working on merging their features.”
                  1. 4

                    Great answer, nothing to add from me!

                    The reason it was closed-source wasn’t really by design, it was just that the service had accumulated a lot of tech debt after transitioning through way too many versions of the Rust async ecosystem (the Nest started in 2015). So, this is a marker of Pijul being mature enough that I was able to spend some time rewriting things using the (now) stabilised Rust libs such as Diesel and Axum. Also, Svelte is fun on the front-end but didn’t exist back then, I love how you can have the best of both worlds (static and dynamic).

                  2. 1

                    Pijul is a version control system (alternative to git) descended from darcs, which is built around patches as opposed to snapshots (e.g. commits).

                    Nest is like gitea for pijul repositories, and this is the source code for nest hosted on the public instance of nest run by the pijul org.

                    1. 5

                      I wouldn’t say “descended from Darcs” because that may give the wrong connotations. Pijul isn’t a fork of Darcs. Pijul has a rigorous mathematical foundation, unlike Darcs. They are conceptually related though, so I think it is clearer to say Pijul is inspired by Darcs.

                      Pijul is the first distributed version control system to be based on a sound mathematical theory of changes. It is inspired by Darcs, but aims at solving the soundness and performance issues of Darcs.

                      From Why Pijul?

                      I started paying attention to Pijul many years ago. When it comes to systems that manage essential information, I tend to be biased in favor of systems with formal guarantees.

                  3. 20

                    I’m working on a follow-up post about the technical details of my search engine, but the tl;dr is I’m not actually doing any spidering yet; I’m just taking one direct list of URLs, checking robots.txt and content-type, sending the HTML through pandoc, and stuffing it in SQLite’s full-text indexer. The search side is a very simple web app written in moonmint, a framework created and abandoned in 2016, which uses “luv”, the Lua bindings to libuv (the I/O subsystem used by Node.js).

                    All the code is written in Fennel: https://git.sr.ht/~technomancy/search/

                    1. 5

                      A bit off-topic, but thanks for your work on Fennel, it made working on Factorio mods a lot more pleasant. I think it might finally be my gateway Lisp as well, after so many glancing blows with that family of languages.

                      1. 4

                        I’ve gotta ask the question I always ask when SQLite FTS comes up. How do you sanitise the input of the search query before putting it in the SQL query? FTS queries have their own DSL of sorts.

                          1. 1

                            Yeah, the search function in that file is key: it replaces every character that isn’t an ASCII alphanumeric with a space. SQLite really should provide a better way to do this natively.
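                            For anyone curious, the approach reads roughly like this in Python (a sketch with illustrative names, not code from the linked repo). Note that bare uppercase keywords like AND or OR would still pass through as operators, but the punctuation that causes FTS5 syntax errors is gone:

```python
import re
import sqlite3

# Replace every character that isn't an ASCII alphanumeric with a space,
# so quotes, parentheses, column filters, etc. can't reach the FTS5
# query parser.
def sanitize(query):
    return re.sub(r"[^A-Za-z0-9]+", " ", query).strip()

db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE pages USING fts5(url, body)")
db.execute(
    "INSERT INTO pages VALUES "
    "('https://example.com', 'hello full text search engine')"
)

user_input = 'search-engine "hello'  # passed raw, this is an FTS5 syntax error
rows = db.execute(
    "SELECT url FROM pages WHERE pages MATCH ?", (sanitize(user_input),)
).fetchall()
print(rows)  # [('https://example.com',)]
```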

                      2. 6

                        I’m surprised that Helix is the 4th most used editor among Rust developers (at least in the survey). But it makes sense; it feels quite like home.

                        1. 3

                          Helix’s relative lack of obscurity has been a pleasant surprise. VSCode was the first IDE that got me off of pure vim (for the most part), and then trying Helix I just stayed with it since it turned out all I really missed from VSCode was the language servers, and I was happy with everything out of the box and generally prefer terminal-based workflows.

                          Then I got vendor-locked into Helix’s keybindings way faster than I expected. Knowing vim was nice since so many things speak it as a portable editing language, and I feared I’d be stuck in my cottage with my hipster terminal editor for the foreseeable future, but seeing the effort in Zed for instance increases the range of tooling I’ll feel fully fluent in at this point.

                          1. 2

                            I was surprised by how widespread it is in my company. It may have to do with Rust users also looking for tools written in Rust and/or adopting newer tech?

                            1. 8

                              I think I heard about Helix due to it being on some list of cool new things made in Rust, but I started using it because I ran out of the needed stamina to fix my nvim config, and I don’t like VS Code. A modal terminal editor with good LSP support just seemed like the natural choice.

                              1. 1

                                I got stuck in VSCode, so there’s a high bar of entry to getting productive in Helix: things that would just work in VSCode don’t, and if it’s not usability, it’s the ecosystem of plugins.

                          2. 10

                            I don’t mind the hyphenation [1]; it’s the weird gaps that make justified text look odd. I think Safari looks a lot better than Chrome here.

                            I mean, the Gutenberg Bible was justified. On my Kindle, I read justified. Every paper book is justified; those are controlled outputs. It’s just on the web that, as they say, the results are unpredictable.

                            [1] FWIW, when I learned to write I would hyphenate “all” the lines, like I was trying to not waste any paper

                            1. 18

                              Modern paper books are using an algorithm for text layout that’s less greedy and more backtracking. This means that justified text typeset with InDesign or LaTeX looks substantially better than most web browsers (which use greedy algorithms in service of faster page loads).

                              The HTML/CSS spec doesn’t preclude alternative text layout algorithms, and there are tools like PrinceXML for Books which implement some of those tricks (incurring something like a 3x slowdown in rendering time for my use cases).
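                              The difference between the two strategies can be sketched in a few lines of Python: greedy breaking fills each line as far as it can, while a heavily simplified Knuth–Plass-style pass minimizes total squared slack over all lines but the last (TeX’s real badness function, penalties, and hyphenation are far more elaborate):

```python
# Greedy line breaking: commit to a break as soon as the next word
# no longer fits. Fast, but can leave very loose lines later on.
def greedy(words, width):
    lines, line = [], []
    for w in words:
        if line and len(" ".join(line + [w])) > width:
            lines.append(" ".join(line))
            line = []
        line.append(w)
    lines.append(" ".join(line))
    return lines

# Dynamic programming over break points: minimize the sum of squared
# slack (unused width) per line, with the last line free of charge.
def optimal(words, width):
    n = len(words)
    INF = float("inf")
    best = [(0.0, 0)] + [(INF, 0)] * n  # best[j] = (cost, last-line start)
    for j in range(1, n + 1):
        for i in range(j):
            length = len(" ".join(words[i:j]))
            if length > width:
                continue
            slack = 0 if j == n else (width - length) ** 2
            if best[i][0] + slack < best[j][0]:
                best[j] = (best[i][0] + slack, i)
    lines, j = [], n
    while j > 0:
        i = best[j][1]
        lines.append(" ".join(words[i:j]))
        j = i
    return lines[::-1]

words = "aaa bb cc ddddd".split()
print(greedy(words, 6))   # ['aaa bb', 'cc', 'ddddd']
print(optimal(words, 6))  # ['aaa', 'bb cc', 'ddddd']
```

                              The greedy result leaves a nearly empty middle line; the backtracking version accepts a slightly short first line so all lines come out evenly filled.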

                              1. 3

                                Interesting, I hadn’t seen that web rendering system.

                                For anyone curious about seeing TeX’s justification algorithm on webpages, I’ve used this bookmarklet occasionally: https://github.com/robertknight/tex-linebreak

                                I’ve been curious about how well one could precompute typesetting information for webpages and apply it in a progressively enhanced way, like providing hints to the browser for at least some predetermined widths using soft hyphens and non-breaking spaces.

                            2. 23

                              Deciseconds aren’t a crazy unit of time: they’re also used for timeouts in some Linux bootloaders, and they’re a convenient unit for describing quick but still human-perceptible amounts of time.

                              The real problem isn’t that the git configuration file interpreted integer literals as deciseconds, but that it interpreted them as booleans at all (repeating an ancient poor design decision of C). I think the ideal configuration language for this option would have something like a lightweight tag system: help.autocorrect = true, help.autocorrect = false, help.autocorrect = (timeout 10), or something like that.
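                              A minimal Python sketch of how such a tagged scheme might parse (the syntax and names here are hypothetical, not actual git behaviour):

```python
# Hypothetical tagged config values, sketching the suggestion above.
# The point of the shape: integers can never be silently read as
# booleans, and booleans can never be read as timeouts.
def parse_autocorrect(value):
    value = value.strip()
    if value == "true":
        return ("on-immediate",)
    if value == "false":
        return ("off",)
    if value.startswith("(timeout ") and value.endswith(")"):
        deciseconds = int(value[len("(timeout "):-1])
        return ("on-after", deciseconds)
    raise ValueError(f"bad help.autocorrect value: {value!r}")

print(parse_autocorrect("(timeout 10)"))  # ('on-after', 10)
print(parse_autocorrect("true"))          # ('on-immediate',)
```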

                              1. 23

                                Milliseconds also isn’t crazy - and people are generally used to it. timeout_ms should basically be clear to 90% of people who have ever edited a config file.

                                1. 16

                                  Including the unit in the name really should be standard by now!
                                  Though “min” for minute is confusing without the context and a good reason to use seconds instead IMO.

                                  1. 1

                                    Yeah, I think it’s wise to stick to prefixes for powers of a thousand in human-readable presentation by default unless there’s a particular reason not to, like engineering notation does. And for a database field, I’d prefer the unadjusted SI unit if possible.

                                    Out of the non-thousand prefixes (centi-, deci-, deca-, hecto-), the most standard uses are probably centimeters and decibels, though each field has their own conventions that may weigh in heavily as well.

                                      1. 14

                                        @hwayne, you have to link to Frink’s Sample Calculations. You just gotta.

                                        1. 4

                                          Also Frink’s default units page is a good read for education and entertainment value, linked to from another one of @hwayne’s posts. I can’t get enough of metrology drama when I see it

                                        2. 36

                                          The article buries the lede: they built a Rust async runtime that is basically a wrapper around Grand Central Dispatch on macOS! This is the operating system task scheduler, which means they basically get the dispatch behavior of a native Mac application. Very cool stuff!

                                          1. 2

                                            Ah this is quite interesting actually. The major dichotomy I see in languages tends to be 1:1 threading (or just “threads”) and M:N threading (often called “async”; M userspace threads multiplexed over N kernel threads, like green threads, goroutines, etc.). I’ve also seen async refer to M:1 threading (coroutines/state machines/event loops on a single thread like Nginx or Node.js).

                                            I hadn’t had much opportunity to empirically compare the different approaches myself, but the main argument I heard in favor of 1:1 threading is that an M:N model creates a “left hand doesn’t know what the right hand is doing” situation, frustrating the kernel’s ability to take advantage of its knowledge when scheduling, and that avoiding context switches or general threading overhead isn’t the bugbear it used to be anyway. So seeing what looks like an additional option here—using Rust’s async to schedule threads M:N, but with the kernel still steering the scheduling—sounds appealing.

                                            This does remind me of Google’s switchto proposal alongside futexes to facilitate user-space threading as well.

                                            I’m not particularly well informed or up-to-date on these things so happy to hear if I accidentally misrepresented something.

                                            1. 4

                                              M:N threading (often called “async” […] I’ve also seen async refer to M:1 threading

                                              Asynchronicity is orthogonal to threading. For extra confusion, “async” these days tends to mean async/await (state machines built from “normal looking” code), as opposed to stackful coroutines.

                                              with the kernel still steering the scheduling

                                              Ehhh, AFAIK this isn’t any more kernel-based than any other common scheduler. libdispatch is just Apple’s userspace library that provides a scheduler and other concurrency stuff, and this article describes using it because it’s “platform native” and has better integration with Apple’s frameworks and the power management and whatnot.

                                              This sounds roughly equivalent to running on glib’s event loop which would be “platform native” for a GNOME application.

                                              Either way, it is indeed great to see the flexibility of Rust’s async/await in action!

                                              One perhaps-not-so-well-named innovation arriving in the server space is a little “more kernel-based” in a way: the “thread-per-core” runtimes. Basically: lots of server applications are pretty much shared-nothing, so let’s just spin up completely independent event loops on $num_cpus threads, and (here’s the kernel part) let SO_REUSEPORT distribute requests between them.

                                              (The linked runtimes also all switch to completion-based I/O—because io_uring go fast wooo—which results in ecosystem compatibility pain due to different read/write interfaces; but you can also replicate the architecture itself in a few lines with Tokio as the TechEmpower benchmark entries do)
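                                              The SO_REUSEPORT half of that architecture can be sketched in Python (a toy using blocking sockets and an arbitrary fixed port rather than per-thread event loops; SO_REUSEPORT is Linux/BSD-specific):

```python
import os
import socket
import threading

# Thread-per-core sketch: every worker thread opens its OWN listening
# socket bound to the same port, and the kernel distributes incoming
# connections between the sockets via SO_REUSEPORT. A real runtime
# would run an independent event loop in each thread instead of a
# single blocking accept().
def worker(port, ready):
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    srv.bind(("127.0.0.1", port))
    srv.listen()
    ready.set()
    conn, _ = srv.accept()  # a real runtime would loop here
    conn.sendall(b"hello from a worker\n")
    conn.close()
    srv.close()

PORT = 8979  # arbitrary test port, an assumption of this sketch
ready = threading.Event()
for _ in range(os.cpu_count() or 2):
    threading.Thread(target=worker, args=(PORT, ready), daemon=True).start()
ready.wait()

client = socket.create_connection(("127.0.0.1", PORT))
data = client.recv(64)
client.close()
print(data)  # the kernel picked one of the workers to answer
```

                                              Each worker is fully independent; no queue or lock is shared between them, which is the “shared-nothing” property the linked runtimes lean on.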

                                              1. 5

                                                For extra confusion, “async” these days tends to mean async/await (state machines built from “normal looking” code), as opposed to stackful coroutines.

                                                You’re right, but usually people contrast async/await with virtual threads, not stackful coroutines, which are different constructs. The grandparent is more confused though, in that there’s no connection between the interface exposed to users (virtual threads, stackful/stackless coroutines, actors, whatever) and the scheduling mechanism (1:1, N:1, M:N, etc).

                                                Basically: lots of server applications are pretty much shared-nothing, so let’s just spin up completely independent event loops on $num_cpus threads, and (here’s the kernel part) let SO_REUSEPORT distribute requests between them.

                                                But as I wrote in the article you linked: a lot of the time these cite benchmarks which do a well-balanced amount of work on each connection, so load balancing by distributing connections evenly with SO_REUSEPORT is an effective strategy. Unless your real workload also has that property, it’s questionable to cite these benchmarks in contrast to a more responsive (but also higher constant overhead) strategy like work-stealing.

                                                That is to say, it’s not really about being shared-nothing (though that’s a prerequisite for avoiding coordination between threads); it’s about whether merely balancing connections is an effective strategy for balancing work in your application.

                                                1. 2

                                                  Right, it’s definitely not a property every workload has, but I feel like it’s not uncommon at all. For web app backends especially it’s typical for all request handling to be essentially “fire off queries to DB/cache/third-party-service/etc, await, template/serialize, return” which results in little variety of work per request (and so—with some handwaving about clients doing roughly similar numbers of requests—per connection too). So to me it makes sense that that approach is getting some traction.

                                                  …oh, and of course the thing is that it’s kind of a tried-and-true architecture for web apps: all those GIL-burdened scripting language interpreters were often being scaled in an OS process per core way (or “just some larger number of worker processes than cores” when each process was running fully synchronously instead of having an async event loop). Those got away with merely balancing connections!

                                                  1. 1

                                                    I don’t believe it’s common for web clients to perform roughly similar numbers of requests per connection, and if the number of requests performed per connection is not the same, that is also an imbalance of work, even for stateless services.

                                                    Your comment about “GIL-burdened scripting language interpreters … running fully synchronously instead of having an async event loop” seems confused. You’re conflating multiple categories of architecture: some spawn a process per core and use async IO, some spawn a thread per connection and use blocking IO. In the latter case they do use work-stealing because the OS will dynamically balance those threads across cores so that cores don’t sit idle. Some even spawn a process per connection, so they don’t share state but they use OS-level work stealing. Regardless, these are hardly relevant to a comparison of architectures that both outperform all of these.

                                                    So yes if you were to assume these things you might reach a certain conclusion, but they seem counterfactual to me so I don’t know why we’d engage in this thought exercise.

                                                    1. 2

                                                      IIUC, they’re saying that even if the requests per connection vary greatly, the requests themselves require little active CPU time and just orchestrate more IO (“fire off more queries”). In other words, they’re so small, or so dominated by asynchronous waiting, that it’s not worth introducing work-stealing to dynamically spread processing load.

                                                      Anecdotally, I’ve found this to be true as well: services which function more as high-perf “routing” plus simple business logic scale just as well, and sometimes better, by sticking to a single thread plus non-blocking IO. And this describes most backends which handle database and sub-service calls.

                                                      When heavy processing is needed, they can offload that work to a thread pool before returning to the IO thread. I believe mixing the pool with the IO thread is an unfortunate programming and perf burden that exists to avoid straying from opaque OS threads as a concurrency model.

                                                  2. 1

                                                    That article was informative, thank you. I mainly wanted to contrast user-space vs. kernel scheduling implementations for threads (virtual or otherwise, including stackless coroutines with explicit async/await points for linear-looking code), but I conflated all of “async” and user-threading together, which wasn’t very accurate. E.g. goroutines aren’t usually referred to as async, just as a concurrency construct that largely presents itself as threads with transparent preemption, where M:N threading is an implementation detail.

                                                    The most outspoken opponent of M:N user-space scheduling I recall was Bryan Cantrill, though perhaps that was more specifically about work stealing. I don’t recall the argument exactly, and don’t necessarily agree, mostly it left me thinking that perhaps my assumptions about the overhead of kernel threads were outdated. Nonetheless I’ve been happily using async Rust and haven’t been working in the kind of space where getting into the weeds of optimizing these things has mattered much.

                                                    1. 3

                                                      I guess I’m honored to be the model’s most outspoken opponent? I reflected on this a while back[0] – and now 16 years further down life’s path (!), I stand by not just my thoughts on the M:N threading model, but also (obviously?) transactional memory, which is rightfully in history’s dustbin.

                                                      [0] https://bcantrill.dtrace.org/2008/11/03/concurrencys-shysters/

                                                      1. 2

                                                        The most outspoken opponent of M:N user-space scheduling I recall was Bryan Cantrill

                                                        I wonder if that was about the 1990s Solaris threads, or something more recent.

                                                        That old Solaris M:N threading implementation was crippled because they didn’t provide enough new kernel facilities for the userland part to be able to work well, and the userland part didn’t do enough to compensate. So for instance, all filesystem ops were blocking and were not offloaded to worker threads.

                                                2. 2

                                                  +1 for using the platform!

                                                  this also explains why I’m patiently awaiting the windows version… but I imagine it’ll be a similar approach, to use the system’s event scheduler!

                                                3. 16

                                                  I used exclusively dwm as my desktop window manager for many years, and as my config.h file diverged further from the provided config.def.h, consistently more maintenance was needed on every upgrade.

                                                  There were other minor problems too – like if I didn’t bind a particular function to any binding, the compiler would spit out “unused function” warnings.

                                                  Basically, from experience I agree with the thesis of this post. But I still like the idea of source code configuration; maybe there’s some cleverer, better way to execute it?

                                                  1. 10

                                                    You might be surprised to hear that from me, but it also bothers me to some extent. In the ideal case, your software is mature and its configuration interface ‘complete’.

                                                    The issue is that data structure refactorings rarely happen during development, even if they would yield tangible simplifications.

                                                    To give an example for dwm, the layout functions are well-modularised with a clean and extensible interface. Many other aspects have not reached this zen though.

                                                    One idea could be to guarantee backward/forward compatibility for minor versions within semantic versioning.

                                                    Overall though, configuration via source code is a viable choice for some projects. However, the Pareto principle dictates that not all will be satisfied by it.

                                                    1. 3

                                                      You might be surprised to hear that from me

                                                      I am, but only a little. When I read this, my first reaction was “I bet suckless.org feels called out.” My more refined reaction was that you are the home to projects which probably fit in the box of “exception that proves the rule” for this case.

                                                      1. 1

                                                        A lot of it is just historical baggage, to be honest, and suckless.org is more heterogeneous than you might think.

                                                      2. 2

                                                        You might be surprised to hear that from me, but it also bothers me to some extent. In the ideal case, your software is mature and its configuration interface ‘complete’.

                                                        FWIW, I’m not surprised to hear that from you! I use suckless programs when they do what I want out of the box (and when they do I totally love them, so thank you!) and if not then I ignore them, and I recommend that modus vivendi to everyone.

                                                      3. 9

                                                        Maybe using Scheme or Lua is the best path here.

                                                        1. 4

                                                          Given that dwm is < 2500 LOC, it doesn’t seem to make a lot of sense. Maybe a budget of 200 LOC for a config language would make sense, though.

                                                          1. 2

                                                            Yeah, while I have generally liked the trend of using Lua for config for other projects, it looks like it’s about 10× that at 24k SLoC. I could see a Lisp getting much smaller. This might exist, but ideally it’d be a standard enough dialect to have basic typing and a language server: autocomplete hints and realtime validation go a long way to self-documentation.

                                                            1. 3

                                                              Lua

                                                              Xlib is about 100k SLoC. For config you can strip down Lua substantially as well since you don’t need or might not even want to expose the stdlib.

                                                        2. 1

                                                          would there have been a downside to not upgrading?

                                                          1. 1

                                                            The question of why anyone ever upgrades anything is an interesting one and worthy of discussion, but that’s not really the conversation I was aiming for with my comment.

                                                            1. 1

                                                              I’m asking about dwm specifically, though. If upgrading is not necessary, then not upgrading is an easy way to avoid the problems you described.

                                                              1. 1

                                                                I think @FRIGN’s suggestion of config API following semantic versioning might be the best way for this to work

                                                                1. 2

                                                                  it might be, or it might be unnecessary complexity

                                                                  1. 1

                                                                    I upvoted you because I think this is a valid argument, but in my experience semantic versioning on an API is absolutely worth the extra development overhead, even from a developer’s point of view
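
                                                                    To make the suggestion concrete, here is a minimal sketch (all names hypothetical, not from any real project) of what a semantically versioned config API could look like in C: the program declares its config API version, the user’s config states which version it was written against, and silent drift becomes a clear compile-time error.

                                                                    ```c
                                                                    #include <assert.h>

                                                                    /* Hypothetical sketch: the program declares its config API version. */
                                                                    #define CONFIG_API_MAJOR 2
                                                                    #define CONFIG_API_MINOR 1

                                                                    /* The user's config.h states which API version it was written against. */
                                                                    #define MY_CONFIG_API_MAJOR 2
                                                                    #define MY_CONFIG_API_MINOR 0

                                                                    /* A major-version mismatch means breaking changes: refuse to build. */
                                                                    #if MY_CONFIG_API_MAJOR != CONFIG_API_MAJOR
                                                                    #error "config.h targets an incompatible config API; see the changelog"
                                                                    #endif

                                                                    int main(void) {
                                                                        /* An older minor version is fine: newer minors only add options. */
                                                                        assert(MY_CONFIG_API_MINOR <= CONFIG_API_MINOR);
                                                                        return 0;
                                                                    }
                                                                    ```

                                                                    The point is that the failure mode changes: instead of mysterious compile errors mid-merge, the user gets one message telling them which changelog entry to read.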

                                                        3. 3

                                                          I like the idea of config.h, with config.def.h kept in the repository.

                                                          I think that C is a pretty nice configuration language. It takes quite a bit of time to learn it though.

                                                          1. 16

                                                            The normal problem with config.def.h and config.h is that you’re still asking people to merge files; it’s just that you’re not threatening to overwrite their copy. If you add new things to your config.def.h, on a version update people have to merge those changes into their current (old) config.h, alongside their changes and settings. If they don’t, at best they get compile errors.

                                                            (I’m the author of the linked-to article.)

                                                            1. 1

                                                              I see. I think that at some point configuration will inevitably require patching the source code and there’s little one can do. To be absurd, one can consider the entire process of developing software to be an act of configuration. It doesn’t go that far for everyone, of course.

                                                              Furthermore, if the user disagrees with upstream changes, there’s also only so much the project can do to mitigate that before the user has to effectively fork it. The lines between the user and the developer get blurry.

                                                              However, the idea of mergeability is interesting and is worth attention, in my opinion.

                                                              Also I see similarities to the idea of early and late binding. Editing a source file directly seems to be early, having a separate build directory seems to be late.

                                                              In the end, I think that it depends, whether one should require people to change the source or not.

                                                              1. 4

                                                                I think that at some point configuration will inevitably require patching the source code and there’s little one can do.

                                                                A lot of my career has been in this liminal space between configuration and patching. It happens to me when there is some software that is core to the service I am running, which doesn’t quite work the way I would like it to. The options are, roughly,

                                                                • live with the limitation;

                                                                • work around it with scripting or custom hooks or plugins;

                                                                • if the workaround is particularly awkward, or if there isn’t a suitable customization facility, hack in a local patch;

                                                                • the advanced mode is to develop a feature that is generally useful, independent of your local peculiarities, with tests and documentation, that can be contributed upstream.

                                                                The middle two are tech debt, that has an ongoing maintenance cost. The last is a positive-sum outcome, but it’s much more effort in the short term.

                                                                1. 3

                                                                  I’ve felt similarly over time. After using some systems like Pijul or Darcs, I’m more optimistic about leaning on a merging-based approach in the future, as they do so much better than Git during conflicts, especially since many things that Git would mark as conflicts don’t require manual intervention. Despite those tools already existing, it’s more of a pipe dream currently, as they’re still pretty esoteric or eccentric at best. (Also, the best experience would probably come from a variant that was more language-aware, but I think that may still be more of a research-level problem.)

                                                                  Overall (if I’m understanding your point correctly) I agree that it makes sense to view configuring and patching as different flavors of the same thing, and configuration can require cleaving your code in premature or less-than-ideal ways unfortunately.

                                                                  1. 2

                                                                    What I really want is something like mercurial’s changeset evolution. Perhaps in the past I should have learned stgit instead of hacking my own patch management script - I have a friend who has been successfully using stgit for years. More recently I have just been rebasing and relying on the reflog if I need to look at older versions of a branch, though that isn’t a great solution. I’ve never noticed a conflict that I thought Darcs or Pijul might have handled better, I think because I have used workflows that do not lean heavily on three-way merge. A much more promising option is jj, but I am hesitant because I might have to give up magit.

                                                                    1. 3

                                                                      I’ve never noticed a conflict that I thought Darcs or Pijul might have handled better

                                                                      “Handled” is an interesting word here. Being able to cherry-pick conflict resolution is a much better “handling” than what Git gives you.

                                                                      A much more promising option is jj, but I am hesitant because I might have to give up magit.

                                                                      Probably a misunderstanding, jj is essentially a Git frontend, like Gitless, StackedGit, Git branchless, GitButler… or even Magit! So (1) jj doesn’t handle conflicts any better than git rerere, it just makes the UI around rerere easier. How promising is that? And (2) you can still use Magit, and jj on the command-line.

                                                                      If you like your Git UI (Magit), why get another one?

                                                                      1. 3

                                                                        Jujutsu makes significant improvements to the vcs interaction model: it hides git’s index, it auto-commits, it represents conflicts in persistent commits instead of ephemerally in the working tree. Magit retains git’s interaction model, e.g the index is front-and-centre in magit’s ui, so I would not get the benefit of jj if I continue using magit.

                                                                        I’ve never used git rerere because (as I said) I’ve used rebase-oriented workflows which resolve each conflict once when it occurs during a rebase. So there’s never a need to repeatedly resolve the same conflict. And there’s never an existing resolution that can be cherry-picked.

                                                                        1. 1

                                                                          it hides git’s index, it auto-commits

                                                                          So does GitButler, another Git UI which doesn’t claim to be a “revolutionary new VCS” while being a Git UI. Honesty is the promising approach here I think.

                                                                          it represents conflicts in persistent commits instead of ephemerally in the working tree

                                                                          So does git rerere, if you’ve never used it because you don’t see any conflicts, I’m not sure why Jujutsu would make your conflicts so much better, it just represents conflicts slightly differently from git rerere.

                                                                          1. 2

                                                                            A couple of corrections:

                                                                            • git rerere stores its state in .git/rr-cache, not in commits
                                                                            • I do see conflicts, but I don’t see the same conflict more than once

                                                                            Jujutsu’s ultimate aim is to have its own repository format. That will allow it to do better than re-using Git, for instance by storing conflicts in a less hacky manner, and by discarding baggage like the index.

                                                                            1. 2

                                                                              git rerere stores its state in .git/rr-cache, not in commits

                                                                              That difference only matters if you’re doing heavy cherry-picking, which you typically don’t do in a system where commits change their identity.

                                                                              I do see conflicts, but I don’t see the same conflict more than once

                                                                              Same for rerere users.

                                                                              Jujutsu’s ultimate aim is to have its own repository format.

                                                                              I thought the ultimate goal was to be compatible with Git, what you’re saying sounds like reinventing an incompatible snapshot-based VCS with a nicer interface than Git, which Mercurial already does extremely well.

                                                                              1. 5

                                                                                Probably a misunderstanding, jj is essentially a Git frontend, like Gitless, StackedGit, Git branchless, GitButler… or even Magit!

                                                                                As the original author of Jujutsu, I think I’m qualified to say that it’s not just a Git frontend. For example, we have integrations for our in-house centralized VCS at Google which allow you to use jj as a client for it. There’s no Git involved there, so I don’t think “a Git frontend” makes sense. I have no experience with Bazaar, but I believe it has pluggable storage backends in a similar way to Jujutsu, so that’s probably a better comparison. (EDIT: I think I meant Breezy, not Bazaar, as the VCS that has some Git-compatible storage format.)

                                                                                I’m not sure why Jujutsu would make your conflicts so much better

                                                                                Jujutsu has an understanding of conflicts that’s completely different from rerere’s. For example, if you rebase a commit and it results in conflicts, you can rebase it back and the conflicts will be gone. rerere doesn’t do anything like that. More details here: https://martinvonz.github.io/jj/prerelease/conflicts/

                                                                                So does GitButler, another Git UI which doesn’t claim to be a “revolutionary new VCS” while being a Git UI. Honesty is the promising approach here I think.

                                                                                It sounds like you’re implying that I or other Jujutsu devs are being dishonest. As I said above, it is a new VCS, and I do think it has some new ideas. It is not just a Git UI. But I don’t think we’ve claimed it’s “revolutionary” anywhere. Can you point to where we did that so we can fix it?

                                                                                Mercurial already does extremely well.

                                                                                Agreed! That’s why I copied a bunch from it. I presented at Git Merge 2022 about some of the problems we’re having with Mercurial at Google. There are links to slides and a recording from Jujutsu’s GitHub repo if you’re interested.

                                                                                1. 2

                                                                                  It sounds like you’re implying that I or other Jujutsu devs are being dishonest.

                                                                                  The page linked just above that claim of yours starts with “like Darcs and Pijul”, which is very intentionally false or at least misleading, as I’ve told you many times in the past. Your total lack of consideration for our feedback is indeed very clearly dishonest. Either provide an objective, honest comparison also stating what Darcs and Pijul do that Jujutsu doesn’t do, or don’t mention these other systems at all.

                                                                                  Can you point to where we did that so we can fix it?

                                                                                  I’ve answered that question of yours many times already, so I can certainly answer it once more: combining the “best of Git and the best of Pijul” (which you state in virtually every single page of your documentation you’ve ever referred me to, including your README) would indeed be something really new and interesting, possibly revolutionary. This is what you have been claiming all along, but it is incredibly far from what Jujutsu actually does.

                                                                                  1. 3

                                                                                    The page linked just above that claim of yours starts with “like Darcs and Pijul”, which is very intentionally false or at least misleading, as I’ve told you many times in the past.

                                                                                    Ah, I think I see what you mean now. I don’t remember you pointing out that before. I’ll drop the mention of Pijul and Darcs there because it doesn’t really add much anyway. For reference, here’s what it says:

                                                                                    Like Pijul and Darcs but unlike most other VCSs, Jujutsu can record conflicted states in commits.

                                                                                    I think your point is that readers might think that we’re saying that conflicts are modeled in a similar way in the three different systems. That’s fair. As I said, I’ll remove that part (and all other mentions of Pijul) from our docs (https://github.com/martinvonz/jj/pull/3503).

                                                                                    Can you point to where we did that so we can fix it?

                                                                                    I’ve answered that question of yours many times already, so I can certainly answer it once more: combining the “best of Git and the best of Pijul”

                                                                                    I said that once here: https://news.ycombinator.com/item?id=29792092

                                                                                    (which you state in virtually every single page of your documentation you’ve ever referred me to, including your README)

                                                                                    For reference, so other readers here can decide for themselves, here’s what we currently say (the mention of Pijul is being removed in the PR above):

                                                                                    We combine many distinct design choices and concepts from other version control systems into a single tool. Some of those sources of inspiration include:

                                                                                    • Pijul & Darcs: Jujutsu keeps track of conflicts as first-class objects in its model; they are first-class in the same way commits are, while alternatives like Git simply think of conflicts as textual diffs. While not as rigorous as systems like Darcs and Pijul (which are based on a formalized theory of patches, as opposed to snapshots), the effect is that many forms of conflict resolution can be performed and propagated automatically.
                                                                          2. 1

                                                                            If you start a rebase and abandon in the middle for some reason (after solving some conflicts), you’ll have the conflicts again when you restart the rebase. rerere helps then.

                                                                        2. 1

                                                                          stgit

                                                                          Thanks for mentioning this! I’m willing to drop magit to try this for a while.

                                                                      2. 2

                                                                        if the workaround is particularly awkward, or if there isn’t a suitable customization facility, hack in a local patch;

                                                                        Depending on your situation, you can use Nix for the patching, building, and installation, and make this process a bit more robust. You’ll still need to rebase your patches against the upstream when it changes, though, of course.

                                                                        1. 2

                                                                          Choice of tools doesn’t matter as much as being engaged with upstream so you have a better idea of how much effort the ongoing maintenance is likely to be. I prefer to keep my patches in a local fork of upstream git, tho that generally means the .deb build or whatever also needs to be adapted.

                                                                    2. 1

                                                                      While I generally agree with your article, the pedantic part of my brain keeps telling me the issue here is not code, it is change management. Concretely, one could implement something like this in a non-editable config.h:

                                                                      // config.h
                                                                      // DO NOT EDIT (for local changes create a local_config.h in the root of the project and set variables there)
                                                                      #if __has_include("local_config.h")
                                                                      #include "local_config.h"
                                                                      #endif
                                                                      
                                                                      #ifndef SOME_CONFIGURATION_STRING
                                                                      // Not defined in local_config.h, use the default
                                                                      #define SOME_CONFIGURATION_STRING "Default Value"
                                                                      #endif
                                                                      
                                                                      #ifdef OLD_BROKEN_CONFIG_OPTION
                                                                      // We can even handle breaking changes in way where the user is given notice and directed to new options, etc
                                                                      #error "OLD_BROKEN_CONFIG_OPTION is no longer supported, please update local_config.h"
                                                                      #endif
                                                                      

                                                                      (I know that __has_include is new and you cannot really depend on it. There are many other ways to achieve the same goal, but most of the ones I could think of required the build script to check for the existence of a local file and use that to set a define, which adds a lot of boilerplate around the basic concept.)
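
                                                                      For reference, here’s a rough sketch of that build-script alternative (names hypothetical): the build system checks whether local_config.h exists and passes a define, so the C side never needs __has_include.

                                                                      ```c
                                                                      /* Sketch of the build-script alternative (names hypothetical). A
                                                                       * Makefile rule would test for the file and set a define, e.g.:
                                                                       *
                                                                       *   CFLAGS += $(shell test -e local_config.h && echo -DHAVE_LOCAL_CONFIG)
                                                                       */
                                                                      #include <stdio.h>

                                                                      #ifdef HAVE_LOCAL_CONFIG
                                                                      #include "local_config.h"
                                                                      #endif

                                                                      #ifndef SOME_CONFIGURATION_STRING
                                                                      /* Not overridden locally: fall back to the default. */
                                                                      #define SOME_CONFIGURATION_STRING "Default Value"
                                                                      #endif

                                                                      int main(void) {
                                                                          puts(SOME_CONFIGURATION_STRING);
                                                                          return 0;
                                                                      }
                                                                      ```

                                                                      Compiled without -DHAVE_LOCAL_CONFIG this prints the default; with the define (and a local_config.h present) the local overrides win. The boilerplate just moves from the C file into the build script.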

                                                                  2. 8

                                                                    Something I’ve recently realized about WASI:

                                                                    It has blocking IO

                                                                    I haven’t thought about this too much, but it feels very surprising — it seems like this could add up to a lot of systems complexity down the line, relative to an alternative world where there’s only async IO.

                                                                    This re-invites irreconcilable duplication between the sync and async worlds, forces a readiness-based IO model, and drains resources from the language-design effort for ergonomic evented programming.

                                                                    And that’s going to be an ecosystem-wide cost, not something you abstract away or avoid paying for if not using.

                                                                    Worth re-reading: https://www.tedinski.com/2018/10/16/concurrency-vs-parallelism.html

                                                                    Synchronous APIs “pretend” that there’s no concurrency, but there always is! (At minimum, you can get ^C.) But once you’ve dug yourself deep enough into “look ma, no concurrency”, it’s impossible to dig yourself out without becoming a zombie in the process!

                                                                    1. 3

                                                                      This talk leads me to believe there is a path forward with a solution. In it the speaker describes how async will be able to call sync and vice-versa.

                                                                      1. 2

                                                                        I agree, though if I’m understanding the state of things correctly, the current interface seems to provide a primarily async API via pollable and streams and the other interfaces are based on them (filesystem, http, tcp, and udp).

                                                                        I do see that in streams, the stream types provide some blocking alternatives like blocking-read in addition to read, though it seems a nonblocking version is always available. I’m not sure precisely what the intention is behind their implementation or why they’re provided. Perhaps the blocking version would be polyfilled with a loop in most cases to be more convenient for some languages.
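
                                                                        A rough sketch of that polyfill idea in ordinary POSIX C (this is not WASI’s actual API, just the shape such a loop might take): a “blocking” read built on top of a non-blocking read plus a readiness wait.

                                                                        ```c
                                                                        #include <assert.h>
                                                                        #include <errno.h>
                                                                        #include <fcntl.h>
                                                                        #include <poll.h>
                                                                        #include <string.h>
                                                                        #include <unistd.h>

                                                                        /* Emulate a blocking read over a non-blocking fd, the way a
                                                                         * guest-language shim might wrap a readiness-based host API. */
                                                                        ssize_t blocking_read(int fd, void *buf, size_t len) {
                                                                            for (;;) {
                                                                                ssize_t n = read(fd, buf, len);
                                                                                if (n >= 0)
                                                                                    return n;                 /* got data (or EOF) */
                                                                                if (errno != EAGAIN && errno != EWOULDBLOCK)
                                                                                    return -1;                /* real error */
                                                                                struct pollfd p = { .fd = fd, .events = POLLIN, .revents = 0 };
                                                                                poll(&p, 1, -1);              /* wait until fd is readable */
                                                                            }
                                                                        }

                                                                        int main(void) {
                                                                            int fds[2];
                                                                            assert(pipe(fds) == 0);
                                                                            /* Make the read end non-blocking, as a readiness-based API would. */
                                                                            assert(fcntl(fds[0], F_SETFL, O_NONBLOCK) == 0);
                                                                            assert(write(fds[1], "hi", 2) == 2);
                                                                            char buf[8];
                                                                            assert(blocking_read(fds[0], buf, sizeof buf) == 2);
                                                                            assert(memcmp(buf, "hi", 2) == 0);
                                                                            return 0;
                                                                        }
                                                                        ```

                                                                        A language runtime could expose only the blocking wrapper for convenience while the host sees nothing but readiness-based calls, which is one plausible reading of why both variants exist.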

                                                                        From the article it sounds like waiting on further work on the Component Model will be the main thing to fully realize the potential of the async interface (from the article, “The major banner of WASI 0.3 is async, and adding the future and stream types to Wit”).

                                                                        Apologies if I’m not looking at the right sources, I’m not super familiar with how to make sure I’m looking at the most up-to-date canon for Wasm/WASI. Or perhaps your concern is mainly the presence of the blocking alternatives and that there may be over-reliance on them?

                                                                        1. 1

                                                                          The filesystem APIs that manipulate metadata (rename, delete, open, …) are all blocking, as far as I can tell from the interface declarations.

                                                                          1. 1

                                                                            I see. I do see a would-block error code (“Resource unavailable, or operation would block, similar to EAGAIN and EWOULDBLOCK in POSIX.”), but I’m not sure if it’s intended to be used with those operations.

                                                                            Looking at a reference implementation in wasmtime it looks like open_at does block. If I understand the comment correctly it sounds like there’s an intention to support nonblocking later on. There seems to be a File::allow_blocking_current_thread flag as well, but I’m not sure where/how it’s used.

                                                                            Hopefully down the line they are able to pursue a more fully asynchronous-first API. So far I think I like the outcomes of the design process, given what they have available, but I don’t know too much about the internal conversations and direction.

                                                                      2. 42

                                                                        Maybe I’m wrong, but…

                                                                        This seems to try to mimic Rust’s safety rails as impractical annotations. Even if these annotations can express non-trivial cases (which I doubt), they will constrain you for the reasons Rust lifetimes can be constraining; you’ll have to do tons of extra work, with a crappy bare-bones programming language, inside an ecosystem where no one gives a crap and that can’t even agree on common sane tooling.

                                                                        And then I have no idea how it follows that at the end it will be “Safer than Rust”.

                                                                        How is that a better idea than just using the most-loved programming language, along with a maturing, large ecosystem that actually has a culture of building safe and reliable software, using approaches that are built in rather than bolted on?

                                                                        1. 10

                                                                          Speaking generally (this is not the first project like this, and I don’t know or care about the details here): these sorts of annotations can be added incrementally to existing c code without all-or-nothing compatibility breaks, and the resultant code can still be compiled by an unmodified c compiler (and hence interoperate somewhat freely with the existing c ecosystem, although there are obviously tradeoffs w.r.t. safety then).

                                                                          1. 11

                                                                            This one doesn’t seem to support unmodified C compilers, but it’s probably simple enough to build a preprocessing step that strips the annotations and makes the code palatable for a regular C compiler.

                                                                            Other systems like Frama-C or SPARK before 2014 (for Ada) put annotations in comments to avoid such compatibility problems.
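
                                                                            Xr0’s own annotations apparently aren’t plain C, but the general pattern described here — annotations an unmodified compiler ignores — can be sketched with macros that expand to nothing. Everything below is hypothetical illustration, not Xr0’s actual syntax.

                                                                            ```c
                                                                            #include <assert.h>
                                                                            #include <stdlib.h>

                                                                            /* Hypothetical annotation macros: a checking tool could define
                                                                             * SAFETY_CHECKER and give these meaning; a plain C compiler sees
                                                                             * them expand to nothing, so annotated code builds unchanged. */
                                                                            #ifdef SAFETY_CHECKER
                                                                            #define OWNED    __attribute__((annotate("owned")))
                                                                            #define BORROWED __attribute__((annotate("borrowed")))
                                                                            #else
                                                                            #define OWNED
                                                                            #define BORROWED
                                                                            #endif

                                                                            /* Caller takes ownership of the returned buffer. */
                                                                            OWNED char *make_buffer(size_t n) {
                                                                                return malloc(n);
                                                                            }

                                                                            /* Callee only borrows the pointer; it must not free it. */
                                                                            size_t use_buffer(BORROWED char *buf, size_t n) {
                                                                                return buf ? n : 0;
                                                                            }

                                                                            int main(void) {
                                                                                char *b = make_buffer(16);
                                                                                assert(b != NULL);
                                                                                assert(use_buffer(b, 16) == 16);
                                                                                free(b);
                                                                                return 0;
                                                                            }
                                                                            ```

                                                                            This is also roughly how the comment-based approaches (Frama-C’s ACSL, pre-2014 SPARK) sidestep the problem: the annotations live where an ordinary compiler never looks.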

                                                                            1. 8

                                                                              C is relatively easy to gradually rewrite in Rust. You can use Rust’s C ABI support to link the two and move function by function, or you can use c2rust to convert the C source code and gradually rewrite it into safer idioms.

                                                                            2. 1

                                                                              And yet, that “crappy bare-bone programming language inside an ecosystem where no one gives a crap” has been shown to lead to fewer non-memory vulnerabilities than Rust.

                                                                              I didn’t know Rust had the mantle of most-loved programming language. Has anyone recorded a song for Rust that can hold a candle to Write in C? Did you mean “love” in the Big Brother sense?

                                                                              1. 3

                                                                                Where does that page talk about Xr0?

                                                                                  1. 5

                                                                                    OK, let me rephrase. That page doesn’t mention Xr0. Why is it relevant?

                                                                                    1. 2

                                                                                      because it suggests C and C++ lead to fewer non-memory vulnerabilities than Rust.

                                                                                      1. 7

                                                                                        There isn’t any breakdown of which languages have how many non-memory vulnerabilities so I don’t think it suggests that at all.

                                                                                        1. 1

                                                                                          it says the decrease in memory unsafe languages and the increase in memory safe languages correlates with a decrease in memory vulnerabilities and an increase in non-memory vulnerabilities, and it strongly implies that this is due in part to efforts to shift new code from C/C++ to Rust.

                                                                                          1. 8

                                                                                            Unless I missed something, the signal for that from the article is weak at best. Rust as a fraction of new native code has increased, and it shows Rust’s percentage of new code in Android 13, but not for the other years, or how many non-memory-safety vulnerabilities were found by language.

                                                                                            All we know for sure from the article is memory-safety vulnerabilities only originated from C/C++, and that total vulnerability severity has decreased significantly as memory-safety vulnerabilities have decreased.

                                                                                            There is the section discussing why the rate of vulnerabilities overall has stayed about the same, implying more non-memory-safety vulnerabilities reported compared to before. It’s not very clear which languages are contributing the most now. It suggests that although memory-safety vulnerabilities pay the most, with less opportunity to find them the focus has shifted to logic bugs, such as the larger Java attack surface. They mention interest in further applying Rust’s type system to reduce logic bugs in general.

                                                                                            Without the raw data it’s hard to infer much but from what it does say, it sounds like with vulnerability severity decreasing significantly, there’s more focus on eliminating logic bugs in general, perhaps with some of Rust’s features. Overall it sounds like the project was a success and it doesn’t sound like they’re going to backtrack.

                                                                                            (I personally like using C and owe it a lot, but I don’t use it for new code where correctness/safety matters, which is most code.)

                                                                                            1. 1

                                                                                              It takes some assumptions and reasoning, but I think it’s revealing to notice what they don’t say, which they definitely would say if they could, because it would support the projects they are trying to promote.

                                                                                              For instance:

                                                                                              All we know for sure from the article is memory-safety vulnerabilities only originated from C/C++

                                                                                              If anything I would draw the opposite conclusion, since I feel like they would say this if it were true, yet they do not say it, and it cannot be inferred from anything that they do say. Perhaps I’m missing something?

                                                                                              The caginess with which they discuss the issue of non-memory safety bugs is interesting too. General “thoughts” but no actual data, when they clearly have the data and are basing the core claims of the post on certain statistics.

                                                                                              I will admit that it was too strong of a claim to say that it “shows” C leads to fewer non-memory vulnerabilities. Honestly the signal from that article for any claim is weak at best.

                                                                                              1. 3

                                                                                                Regarding “All we know for sure from the article is memory-safety vulnerabilities only originated from C/C++”, I inferred it from “To date, there have been zero memory safety vulnerabilities discovered in Android’s Rust code”, and it looked like the only other languages were Java and Kotlin, though I suppose there may be some gray zones where maybe Java/Kotlin is driving some C/C++ calls and breaking a contract due to a logic bug.

                                                                                                The caginess with which they discuss the issue of non-memory safety bugs is interesting too. General “thoughts” but no actual data, when they clearly have the data and are basing the core claims of the post on certain statistics.

                                                                                                I will admit that it was too strong of a claim to say that it “shows” C leads to fewer non-memory vulnerabilities. Honestly the signal from that article for any claim is weak at best.

                                                                                                Yeah I agree on both these paragraphs. Seeing specifically the vulnerabilities by language would’ve been much more informative; from what’s shown it wouldn’t be possible to infer Rust produced fewer logic bugs either. It did still seem fair to see it as a success overall, but being able to learn more about the nature and prevalence of the logic bugs with more actual data would’ve been nice.

                                                                                                1. 1

                                                                                                  I suppose there may be some gray zones where maybe Java/Kotlin is driving some C/C++ calls and breaking a contract due to a logic bug.

                                                                                                  They say “memory safety vulnerabilities are exceptionally rare in our Java code.” Are you sure these rare Java memory vulnerabilities are all the result of calling out to fragments of C/C++ code? I feel like there could be some assembly or something else super weird.

                                                                                                  1. 1

                                                                                                    Oh I overlooked that, I’m not really familiar with JNI so I’m not sure.

                                                                                            2. 1

                                                                                              No it doesn’t. It says that the decrease in the rate of increase of code in memory unsafe languages alongside an increase in the rate of increase of code in memory safe languages correlates to a decrease in the rate of increase of memory vulnerabilities, and also a decrease in the observed severity of bugs. This is totally different from what you are claiming. It does not at any point claim that C and C++ have fewer non-memory vulnerabilities than Rust; instead it claims that Rust’s feature set opens up a fruitful approach for reducing logic bugs.

                                                                                              1. 1

                                                                                                No it doesn’t. It says that the decrease in the rate of increase of code in memory unsafe languages alongside an increase in the rate of increase of code in memory safe languages correlates to a decrease in the rate of increase of memory vulnerabilities, and also a decrease in the observed severity of bugs. This is totally different from what you are claiming.

                                                                                                that’s perfectly in line with what I’m saying; you just left out the part where it says the overall number of vulnerabilities has remained steady. do I need to explain?

                                                                                  2. 3

                                                                                    Did you mean “love” in the Big Brother sense?

                                                                                    dpc_pw likely meant the Stack Overflow Developer Survey sense: the survey results said “loved”, but the survey question had more specific wording, asking which languages respondents “have … done extensive development work in over the past year, and … want to work in over the next year” (in the 2022 version; they changed “loved” to “admired” in 2023).

                                                                                1. 6

                                                                                  I have implemented Rosettacode tasks in Odin in the past, I find it to be a good way to learn a new language in small steps:

                                                                                  https://github.com/eterps/odin_rosettacode

                                                                                  1. 3

                                                                                    I still wonder why GraphQL exists. It just seems to reinvent parts of SQL in curly braces syntax?

                                                                                    1. 10

                                                                                      I think the “QL” part in GraphQL is misleading; it’s more of a way for a frontend component to declare what data it needs in a way that can be composed, so that a parent component can fetch data for a tree of components upfront (avoiding waterfalls) while also maintaining the abstraction boundary between child and parent components and keeping each child component’s data needs colocated with it.

                                                                                      To do that you basically just need a way to say “I need these fields from this type to render”; otherwise it’s pretty orthogonal to SQL and so forth. GraphQL doesn’t inherently have an equivalent to WHERE, GROUP BY, LIMIT, etc. Though there are conventions (like Relay’s connections spec for automatic pagination).
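
                                                                                      To make that concrete, here’s a minimal sketch of the idea (every type and field name here is made up for illustration): a child component colocates its data needs as a fragment, and a parent composes the fragments into one upfront query.

                                                                                      ```graphql
                                                                                      # Hypothetical schema: an Avatar component declares the fields it needs…
                                                                                      fragment AvatarFields on User {
                                                                                        name
                                                                                        avatarUrl
                                                                                      }

                                                                                      # …and the page-level component composes them into a single fetch.
                                                                                      query ProfilePage {
                                                                                        viewer {
                                                                                          id
                                                                                          ...AvatarFields
                                                                                        }
                                                                                      }
                                                                                      ```

                                                                                      Neither piece knows how the server resolves the data; the query just states field requirements, which is why it composes so cheaply.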

                                                                                      1. 6

                                                                                        I’d say it’s better than a random half-specified subset of SQL. GraphQL also seems to provide better access control than the approach taken in OP.

                                                                                        1. 4

                                                                                          My take is it solves a Conway’s Law problem. Frontenders don’t know what data they need until they know, but they don’t feel empowered to ask for new endpoints. Using GraphQL means they can just write a new query and then the BE team can react after dev ops tells them the query has terrible performance or whatever.

                                                                                          1. 1

                                                                                            Or they feel overly empowered and you end up with five dozen subtly different god endpoints.

                                                                                          2. 3

                                                                                            I forget where I saved this quote from, but:

                                                                                            GraphQL is just a declarative syntax to force a server to build a viewmodel for you.

                                                                                            I roughly agree and I think it can be good at that. GraphQL has flaws, and I wouldn’t turn to it for most projects, but I understand the situations where it has advantages.

                                                                                            1. 3

                                                                                              I see this sentiment a lot. I personally appreciated the why of GraphQL after I spent more time understanding Ent, Meta’s “executable schema”. It’s worth poking around the blog posts for an open source version. Ent is an extremely expressive ORM that encodes a full graph of all of Meta’s data and properties (like PII, auth). GraphQL is just a way of querying/mutating that graph from networked clients. AFAICT Meta engineers don’t actually write GraphQL schemas; they are just generated from the Ent schema. The benefit of Ent over ANSI SQL is its expressiveness: Meta can encode business-specific logic and have it enforced (and introspected) by all clients.

                                                                                              1. 2

                                                                                                I still wonder why GraphQL exists. It just seems to reinvent parts of SQL in curly braces syntax?

                                                                                                Brace syntax is really nice. People struggle with JOINing data together, but in GraphQL it’s a trivial {.

                                                                                                1. 2

                                                                                                  Last I checked (which has been a couple years; I’m not claiming the following is up to date), GraphQL didn’t actually join anything; you had to plumb everything into GraphQL yourself or use some middleware that can reflect over your postgres schema and configure GraphQL appropriately.

                                                                                                  I care much less about the syntax and more about the amount of work I have to do to get something working.

                                                                                                  1. 1

                                                                                                    GraphQL doesn’t “do” anything. Resolvers do. So either you use a custom resolver or you use something that provides resolvers automatically, but from the GraphQL user’s perspective joins are just {
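
                                                                                                    To illustrate (with made-up types and a hypothetical resolver map, not any particular library’s API): the client writes a nested brace, and the server-side resolver does the actual lookup.

                                                                                                    ```typescript
                                                                                                    // Toy data standing in for two tables; names are illustrative.
                                                                                                    type Post = { id: number; title: string; authorId: number };
                                                                                                    type Author = { id: number; name: string };

                                                                                                    const authors: Author[] = [{ id: 1, name: "Ada" }];
                                                                                                    const posts: Post[] = [{ id: 10, title: "Hello", authorId: 1 }];

                                                                                                    // What a framework would invoke for `post(id: 10) { title author { name } }`:
                                                                                                    // the nested `author { … }` brace maps to the `author` resolver below.
                                                                                                    const resolvers = {
                                                                                                      post: (id: number): Post | undefined => posts.find((p) => p.id === id),
                                                                                                      author: (post: Post): Author | undefined =>
                                                                                                        authors.find((a) => a.id === post.authorId),
                                                                                                    };

                                                                                                    const post = resolvers.post(10)!;
                                                                                                    console.log(resolvers.author(post)!.name); // prints "Ada"
                                                                                                    ```

                                                                                                    The client never sees the join; whether `author` is resolved via an SQL JOIN, a key-value lookup, or an RPC is entirely the resolver’s business.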

                                                                                                2. 1

                                                                                                  rummages around in old notes … ah, here:

                                                                                                  you can build a robust GraphQL API by rewriting incoming queries into non-redundant ground-typed normal form, computing the expected result size in polynomial time, and then proceeding to execute the query only if the expected size is below an acceptable threshold.

                                                                                                  (https://blog.acolyer.org/2018/05/21/semantics-and-complexity-of-graphql/, discussing https://dl.acm.org/citation.cfm?id=3186014)

                                                                                                  I.e., you can inspect an incoming query and decide on whether to execute it instead of having to timeout queries that are too expensive in (execution) time or space.

                                                                                                  1. 1

                                                                                                    The answer is: Marketing.

                                                                                                    I do agree that it is unfortunate that such things sneak into areas that should be left to engineering.

                                                                                                  2. 11

                                                                                                    Jepsen is great. As I commented elsewhere https://jepsen.io/consistency is a go-to for me. I don’t use MySQL so I don’t have too much to say about the substance of this particular test, but I wanted to take a moment to appreciate how this is an excellent example of exposition for the web.

                                                                                                    Many of these points below aren’t necessarily unique or novel, and are subjective, but it shows an attention to detail and service to the reader.

                                                                                                    • Document date upfront (unambiguous ISO too).
                                                                                                    • Abstract upfront that’s a microcosm for the rest of the essay. Sometimes I feel like I and others have a habit of not wanting to “spoil the good parts up front”. Maybe this is from grade school essays where I felt like I had to save the good stuff for the necessary but vaguely defined “conclusion” section. The abstract informs me with the opportunity to know more, rather than just being an advertisement for the rest of the essay.
                                                                                                    • Reads well to the already experienced, but with affordances for those who need to gather more context (links and footnotes and so forth).
                                                                                                    • Body typeface seems to be plain system serif, which is probably pretty good for most people as they’re used to reading it. But for the italics in the abstract they made sure to use a typeface with an actual italic variant rather than relying on the browser’s italic polyfill.
                                                                                                    • Text color is unnoticeably lighter than true black.
                                                                                                    • Correct typographical characters (curly quotes, en-dashes, em-dashes, ellipses), including in the block quotes. Even the copyright declaration at the bottom uses an en-dash for the range. Ellipses are like my version of that “rule” where you check if a dentist’s waiting room’s magazines are up to date as a signal for their attention to detail. “...” usually looks nearly identical to “…” and so seems harmless to ignore, but nonetheless a sign of conscientiousness.
                                                                                                    • I personally like the justified text here, though results can be mixed using the browser’s typesetting. But I was impressed the PDF was typeset with microtypographical adjustments (e.g. the trailing hyphens extend a bit into the margin so that the margins are optically aligned rather than just geometrically. I’ve been seeing this a bit more often in PDFs in general so maybe it’s just updates to pdftex propagating.)
                                                                                                    • Diagrams are SVG with selectable/copyable text.
                                                                                                    • Inline code font is adjusted (by 85%) to read more smoothly with the surrounding prose. I also like that inline code is used sparingly, letting prose be prose except where the code font helps. Code blocks fit in pretty well too; minimal extra distraction, and color scheme fits the rest of the page. Tables are structured but without too much extra decoration as well.
                                                                                                    • I like ensuring there’s a recommendation section at the end. Exposition is good, and making sure you don’t forget to have a “ok, so what do I do with this information, if anything?” from the expositor is very helpful.
                                                                                                    • I learned the word “lagniappe” today.

                                                                                                    More on the substantial side, glad to see they also tested RDS as well, since that’s certainly going to be relevant to many.

                                                                                                    1. 13

                                                                                                      Thanks for writing this. I work my butt off on Jepsen’s writing, of course, but folks don’t necessarily see the weeks of work I put in to typesetting, color choice, visualization design. Stuff like color-blind accessibility, italic families, footnote backlinks, hand-editing code snippets to fit well within column widths… As someone with a bit of background in book arts, it makes me really happy to see people appreciate those aspects of the work.

                                                                                                      As you note, every report gets compiled (thanks to Pandoc and a pile of TeX templates, Lua Pandoc filters, and custom HTML rewriting in the web site’s Clojure backend) to both HTML and PDF, including embedding (sometimes interactive!) SVGs. Some of the awkward choices in layout and wrapping come about from trying to write and compile for both formats–everything has to work in two-column text, on large browsers, and on phones.

                                                                                                      A lot of the typographical stuff you’re noting is from Pandoc & XeTeX. I write in Markdown with a lot of embedded LaTeX hints (-/–/—, math typesetting, etc) and it does a darn good job of translating that to both PDF and HTML. The web site code also uses a bunch of the Flexmark typographical extensions to Markdown. Body fonts are TeX Gyre Schola. Titles and some metadata are Museo Slab and Museo Sans.

                                                                                                      Some of Jepsen’s design is just BAD. The SVG renderings of anomalies from GraphViz? That stuff is such a hack, and it sticks out like a sore thumb. It’s machine-generated, it gets written as an SVG, munged to EPS, there’s terrible bounding-box hacks, it’s just… it’s a nightmare. Kind of amazed it works at all.

                                                                                                      On the writing side–one of the things I work really hard on, both in my writing and teaching, is having a scale of affordances. Jepsen reports should be straightforward and specific for database people and researchers. But they should also have top-line recommendations that a motivated VP or CTO can digest. Most importantly, I want newcomers to have handholds throughout the work. They may skip some of the jargon or formalism, but in every section there should be some kind of beginner-friendly exposition that helps them learn something new, and motivate further learning.

                                                                                                      1. 4

                                                                                                        Oh I didn’t realize you were the poster so I’m glad you saw my comment!

                                                                                                        The effort does show, and thanks for sharing your tooling and process. I think it was the moment I opened the PDF and saw how consistent it was with the HTML that I went back and started writing this comment, because like you said that can be a surprisingly large amount of work. Typesetting a fixed-size PDF is one thing; typesetting for effectively 3+ formats at once is another. I was recently doing something similar and seeing how far I could take web-typesetting so it was all pretty fresh for me. (Side note, I wish there was a better way to scale down the code fonts but keeping the weight consistent. Variable fonts maybe?) Also amusing to note that with P3 and Rec2020 we literally have more colors to worry about too.

                                                                                                        I also didn’t realize until you mentioned it how awkward the SVG across both formats would be. Hopefully we’ll see improvements in diagram-setting tooling, it’s still so manual and finicky.

                                                                                                        I wanted to write more on the prose but worried my comment was already too long. It is interesting to optimize for both skimming and depth. A while back I read that one book, How to Read a Book (usually gets a chuckle when I mention to someone I felt the need to read such a thing) but I remember its main thesis being: If you only had 30 seconds to extract as much information as possible from a book, what would you do? How about 5 minutes, 30 minutes, an hour, a day? And developing methods for all those levels. What you said reminded me of applying that in reverse, providing something at each level that can further guide their attention where it’s most useful.

                                                                                                        (Funny enough, that book had a little rant about how most books are way too long, but to be honest How to Read a Book was way longer than it needed to be. Nonetheless that book has stuck with me. I write better abstracts now, I don’t immediately skip tables of contents, and I felt vindicated in my habit of skimming and economizing my attention.)

                                                                                                    2. 26

                                                                                                      We gotta put a moratorium on developers saying things are or aren’t real. I mean, I get that it’s just a clickbait way of making your piece more appealing and that’s not great…but it’s actually worse than that, you could trigger discussions in the comments about whether tech debt is real.

                                                                                                      1. 9

                                                                                                        Makes me want to write a post titled “i isn’t real”

                                                                                                        1. 13

                                                                                                          You’re not wrong, but it’s a bit more complex than that

                                                                                                          1. 9

                                                                                                            There’s an uncountable number of jokes in this.

                                                                                                            1. 3

                                                                                                              The rational subset of those jokes is countable.

                                                                                                            2. 1

                                                                                                              I can’t imagine how

                                                                                                          2. 8

                                                                                                            Existence (or lack thereof) considered harmful? :^)

                                                                                                            1. 3

                                                                                                              The title is wholly addressed in the very first paragraph.

                                                                                                              In software development, “Technical Debt” often emerges as a foreboding specter, casting a long shadow over codebases and development teams alike. Yet, herein lies a provocative truth: technical debt is not a tangible entity lurking within lines of code. It’s a metaphor, a way of thinking about the accumulated consequences of past decisions and shortcuts.

                                                                                                              1. 8

                                                                                                                Owning up to your clickbait after it’s done its job is (a) pretty common and (b) no defence at all, IMO. I think Lobsters should demand better.

                                                                                                                1. 1

                                                                                                                  I’m confused by your complaining about clickbait. Surely you didn’t click on an article titled “Technical Debt is not real” and expect that idea/concept was wholesale rejected? The article goes on to make several nuanced points, with the title being the punchline.

                                                                                                                  1. 4

                                                                                                                    Surely you didn’t click on an article titled “Technical Debt is not real” and expect that idea/concept was wholesale rejected?

                                                                                                                    As a data point, I absolutely did expect that. I didn’t expect to agree with the article, but I did expect the article to advocate for rejection of the concept.

                                                                                                              2. 1

                                                                                                                Marked as spam. Presumably it’s yet another case of equating (for fun and profit) “it’s not ‘real’ debt” and “it’s not real”.

                                                                                                                1. 1

                                                                                                                  Post is better than the title makes it seem. Calls out to Kent Beck for coming up with tech debt and how to prevent building “legacy” code by reworking the system first to accommodate the new feature easily (which is hard), then build the easy feature.

                                                                                                              3. 3

                                                                                                                Jepsen is my hero

                                                                                                                1. 3

                                                                                                                  https://jepsen.io/consistency alone is a good resource. At work if we start debating edge cases I can usually just get us to point at which of these we’re happy with and from there it’s relatively easy to choose the right tools/techniques.

                                                                                                                2. 40

                                                                                                                  Me: Tracking my finances is hard, there’s lots of steps and it’s time consuming. I wonder if nerds have figured out a better way…

                                                                                                                  Article: Track your finances manually with a verbose data format!

                                                                                                                  Me: Maybe I’ll just check my bank account once a week like I do now…

                                                                                                                  1. 34

                                                                                                                    Why concentrate on managing your finances when you can reinvent double-entry bookkeeping from first principles!

                                                                                                                    1. 15

                                                                                                                      Every nerd who teaches themselves financial literacy builds their own double-entry bookkeeping tool.

                                                                                                                      1. 12

                                                                                                                        The Lost Wisdom of the Ancient Renaissance Bankers.

                                                                                                                        1. 3

                                                                                                                          I preferred Business Secrets of the Pharaohs.

                                                                                                                        2. 5

                                                                                                                          Hah, maybe if I understood double-entry bookkeeping, a lot of these discussions would make more sense.

                                                                                                                          1. 17

                                                                                                                            A git diff shows you what has been removed, what has been added, and where. That's double-entry bookkeeping.

                                                                                                                            1. 9

                                                                                                                              The point of double-entry bookkeeping isn’t the ledger (the diff) but the whole way credit balance accounts, the balance sheet and the accounting equation work together.

                                                                                                                              1. 1

                                                                                                                                Yeah I think of the main principle as more that transactions always add up to 0. I.e. the conservation of money: Money is never created or destroyed, just moved between accounts. (Though revenue and expenses are both treated as bottomless and effectively “create” money in that sense).
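                                                                                                                                That zero-sum invariant is easy to sketch in code. Here's a minimal, hypothetical example (the `Posting` type and the account names are made up for illustration, not from any of the tools discussed): a transaction is valid only if its postings cancel out.

```python
from dataclasses import dataclass

@dataclass
class Posting:
    account: str  # e.g. "assets:checking" or "expenses:groceries"
    amount: int   # in cents; positive and negative amounts must cancel

def is_balanced(postings: list[Posting]) -> bool:
    # The double-entry invariant: a transaction's postings sum to zero,
    # so money is only moved between accounts, never created or destroyed.
    return sum(p.amount for p in postings) == 0

# Buying groceries: money leaves checking and "enters" the expense account.
tx = [Posting("assets:checking", -2599), Posting("expenses:groceries", 2599)]
assert is_balanced(tx)
```

                                                                                                                                (Using integer cents rather than floats sidesteps rounding issues, which is why most PTA tools do the same internally.)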

                                                                                                                              2. 4

                                                                                                                                I only regret I have but one upvote to give.

                                                                                                                                The second step is “and your diffs are every transaction, instead of every month”

                                                                                                                                1. 1

                                                                                                                                  Very interesting analogy, I wonder if you could use this to actually build an accounting system with git?

                                                                                                                            2. 6

                                                                                                                              It’s virtually impossible to create a future-proof way to solve this, given the (natural) difficulty of integrating with loads of different banks around the world, without doing some manual data entry. However, I should’ve clarified in the article that I do not recommend you enter data manually in TOML, but rather do some importing based on e.g. CSV statements. The advantage of TOML is that it is easy to read, edit and script around.
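                                                                                                                              As a rough sketch of that import flow, here's a hypothetical Python snippet that turns a CSV bank statement into TOML `[[transaction]]` entries. The column names (`date`, `description`, `amount`) and the TOML schema are assumptions for illustration, not the article's actual format.

```python
import csv
import io

def csv_to_toml(csv_text: str) -> str:
    # Read the bank's CSV export and emit one TOML array-of-tables
    # entry per transaction. Quoting here is naive: it assumes the
    # description contains no double quotes.
    entries = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        entries.append(
            "[[transaction]]\n"
            f'date = "{row["date"]}"\n'
            f'description = "{row["description"]}"\n'
            f'amount = {row["amount"]}\n'
        )
    return "\n".join(entries)

statement = "date,description,amount\n2024-01-05,COFFEE SHOP,-4.50\n"
print(csv_to_toml(statement))
```

                                                                                                                              Once the data is in TOML, any language's TOML parser can read it back for reporting or conversion to a PTA format, which is the scripting advantage mentioned above.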

                                                                                                                              1. 2

                                                                                                                                I do not recommend you enter data manually in TOML

                                                                                                                                Oh yeah, that makes a lot more sense hah. My reply was in reaction to thinking that’s what you were proposing. Automating it in another way and just using TOML as the backing store makes a lot more sense.

                                                                                                                                1. 4

                                                                                                                                  I totally get your confusion! I should’ve been a bit more clear about that

                                                                                                                                2. 2

                                                                                                                                  My initial impression, too, was that you were proposing to enter data in TOML, and that by the end of your article you would offer some converter from TOML to the ledger, hledger and/or beancount formats.

                                                                                                                                  If you do not recommend entering data manually in TOML, then, if I gather it right, you argue for storing data in TOML because there are tons of robust libraries for working with this human-readable and machine-friendly format. Hmm? The article title is a bit clickbait-y then (-;

                                                                                                                                  I believe some of the mentioned PTA software has plug-in systems, perhaps reusable parser code, and some sort of export to some sort of CSV? How does that not solve the problems you are trying to solve? So far it seems to me that you add a superfluous layer between the wild world of financial data formats and PTA. Although, you said that it was for your personal use, so I guess it's totally okay (-:

                                                                                                                                  1. 1

                                                                                                                                    There are plugin systems and reusable parser code, yes, but they often lock you into a certain programming language, and I find that libraries for parsing common data formats like TOML are often of higher quality.

                                                                                                                                    As for the title; yes, it’s clickbait

                                                                                                                                3. 3

                                                                                                                                  Mint getting shut down has been a bit of a nightmare scenario for me as that’s where everything I have is budgeted.

                                                                                                                                4. 43

                                                                                                                                  I broadly agree with the article, but I also want to note that there’s probably going to be some typing involved with the thinking and planning bits as well. ;-) To quote Leslie Lamport:

                                                                                                                                  If you’re thinking without writing, you only think you’re thinking.

                                                                                                                                  Which is to say… some amount of typing (or diagramming, or sketching on a white board) is an incredibly useful tool for doing your thinking. Writing down an idea in a concrete fashion can help you think more concretely about it. Also, I strongly believe that thinking and planning are a team activity in their own right, and that becomes a lot easier when the idea is written down.

                                                                                                                                  I recognize that objecting to the title because of this is me being a little over-literal… but the author says that “Writing code is, for many engineers, a lot more fun and approachable than sitting around and thinking about problems in the abstract”… and I think that it’s worth being explicit that thinking about the problem doesn’t have to mean spinning in your chair staring off into space. Which is how a lot of junior developers seem to react when you encourage them to think!

                                                                                                                                  1. 6

                                                                                                                                    This is a really good point and something that I agree with wholeheartedly! I find that many are often predisposed enough to action that advocating for more thinking is the right thing to do, but I think you’re correct that there are a lot of situations in which something concrete can greatly help the process of understanding what you want to do and what different solutions might look like.

                                                                                                                                    1. 6

                                                                                                                                      Here’s a few more quotes along this line I like.

                                                                                                                                      George Boole, Laws of Thought:

                                                                                                                                      That language is an instrument of human reason, and not merely a medium for the expression of thought, is a truth generally admitted.

                                                                                                                                      David McCullough:

                                                                                                                                      Writing is thinking. To write well is to think clearly. That’s why it’s so hard.

                                                                                                                                      Interview between Charles Weiner and Richard Feynman in 1973:

                                                                                                                                      Weiner: [Referring to Feynman’s journals] And so this represents the record of the day-to-day work.

                                                                                                                                      Feynman: I actually did the work on the paper.

                                                                                                                                      Weiner: That’s right. It wasn’t a record of what you had done but it is the work.

                                                                                                                                      Feynman: It’s the doing it—it’s the scrap paper.

                                                                                                                                      Weiner: Well, the work was done in your head but the record of it is still here.

                                                                                                                                      Feynman: No, it’s not a record, not really, it’s working. You have to work on paper and this is the paper. OK?

                                                                                                                                      Grothendieck:

                                                                                                                                      He was improvising, in his fast and elegant handwriting. He said that he couldn’t think without writing. I, myself, would find it more convenient first to close my eyes and think, or maybe just lie down, but he could not think this way, he had to take a sheet of paper, and he started writing. He wrote X → S, passing the pen several times on it, you see, until the characters and arrow became very thick. He somehow enjoyed the sight of these objects.

                                                                                                                                      Another one I’ve seen attributed to Lamport:

                                                                                                                                      Writing is nature’s way of telling us how lousy our thinking is. Formal mathematics is nature’s way of letting you know how sloppy your mathematics is.

                                                                                                                                      And of course Iverson’s Notation as a Tool of Thought.


                                                                                                                                      For me the best thinking tool is usually whatever has the fastest feedback loop. I write/sketch by hand constantly, but I also find tools like Alloy particularly useful as a whiteboard that can argue back at you, like a rubber duck with a SAT solver. Alloy is generally considered to be in the formal methods camp, which can make it sound like it’s more of an overhead above and beyond mere implementation, but for me it’s more in the conceptual-whiteboard-sketching part of my process, especially since it gives me a simple standard language for describing my nouns, verbs, and relationships of a system.

                                                                                                                                      I also like static typing for the same reason; the typechecker gives me back more complete feedback more rapidly than running code often can, and I generally see writing types without necessarily implementing things yet as part of my conceptualization/design process. REPLs, of course, help with this too, so ideally I have both available.

                                                                                                                                      1. 3

                                                                                                                                        One of my latest purchases for my home working setup is a cheap drawing tablet. It makes sketching, diagramming, whiteboarding super simple and effortless. Even on Ubuntu, xournal++ immediately picked it up and was able to handle tilt, pressure, etc. out of the box. On Windows, a bunch of Office programs will recognize you’ve got a pen input and unearth a bunch of new controls dedicated for it. I still haven’t worn through the pen tip included, and the thing came with like eight extra.

                                                                                                                                        10/10 would recommend.

                                                                                                                                        1. 2

                                                                                                                          Seems interesting, which model did you get?

                                                                                                                                          1. 2

                                                                                                                                            It’s a Huion Inspiroy H950P but idk if I would necessarily recommend this exact make and model. It’s… fine.