Threads for tazjin

  1. 16

    Fully agree. I also think it’s important to note that RAII is strictly more powerful than most other models, in that they can be implemented in terms of RAII. Some years ago I made this point by implementing defer in Rust: https://code.tvl.fyi/about/fun/defer_rs/README.md
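
    The core of it is tiny. As a rough sketch of the idea (not the exact code from the linked README):

    // defer via RAII: a guard whose Drop impl runs a closure when it
    // goes out of scope (including on early return or unwinding panic).
    struct Defer<F: FnOnce()>(Option<F>);

    impl<F: FnOnce()> Drop for Defer<F> {
        fn drop(&mut self) {
            if let Some(f) = self.0.take() {
                f();
            }
        }
    }

    fn defer<F: FnOnce()>(f: F) -> Defer<F> {
        Defer(Some(f))
    }

    fn main() {
        let _cleanup = defer(|| println!("deferred: runs second"));
        println!("body: runs first");
    }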

    1. 5

      What other models are you talking about? Off the top of my head, linear types are more powerful; with-macros (cf common lisp) are orthogonal; and unconstrained resource management strategies are also more powerful.

      1. 2

        unconstrained resource management strategies are also more powerful.

        How is that more “powerful” in OP’s sense? Can you implement RAII within the semantics of such a language?

        1. 5

          I should perhaps have said ‘expressive’. There are programs you can write using such semantics that you cannot write using RAII.

      2. 2

        That’s interesting, but it wouldn’t work in Go because of garbage collection. You could have a magic finalizer helper, but you wouldn’t be able to guarantee it runs at the end of a scope. For a language with explicit lifetimes though, it’s a great idea.

        1. 11

          Lua (which has GC) has the concept of a “to-be-closed”. If you do:

          local blah <close> = ...
          

          That variable will be closed when it goes out of scope, right then and there (no need to wait for the GC): if the value has a __close metamethod, it will be called at that point.

          1. 7

            Sounds like the Python with statement.

          2. 8

            It doesn’t have to be such a clear-cut distinction. C# is a GC’d language but also has a using keyword for classes that implement IDisposable, which runs their Dispose method at the end of a lexical scope. This can be used to implement RAII and to manage the lifetimes of other resources.

            1. 2

              What do you do when you want the thing to live longer? For the Rust case, you just don’t let the variable drop. For Go, you can deliberately not defer a close/unlock. What do you do in C#?
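
              (A sketch of the Rust version, assuming a mutex guard as the resource: cleanup follows the value, so you extend its lifetime by moving the guard instead of letting it drop.)

              use std::sync::{Mutex, MutexGuard};

              // Returning the guard moves it to the caller; the unlock
              // happens whenever the caller eventually drops it.
              fn lock_and_keep(m: &Mutex<i32>) -> MutexGuard<'_, i32> {
                  let guard = m.lock().unwrap();
                  guard
              }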

              1. 3

                What do you do in C#?

                Hold onto a reference, don’t use using.

                1. 1

                  Ah. Seems a lot like with in Python.

                  1. 1

                    Basically, except I believe C# will eventually run Dispose even if you don’t do it explicitly, unlike Python. I can’t find evidence of when C# introduced using, but IDisposable has been there since 1.0 in 2002, while Python introduced with in 2.5 (2006).

                    1. 2

                      Python explicitly copied several features from C#. I wouldn’t be surprised if with was inspired by using.

              2. 1

                That’s a nice approach. Wish more languages would do something like this.

                1. 2

                  It still suffers from the point in the article where you don’t know who held a reference to your closeable thing and it’s not always super clear what is IDisposable in the tooling. I think VS makes you run the code analytics (whatever the full code scan is called) to see them.

                  1. 5

                    Has anyone written a language where stack / manual allocation is the default but GC’d allocations are there if you want them?

                    It seems mainstream programming jumped to GC-all-the-things back in the 90s with Java/C# in response to the endless problems commercial outfits had with C/C++, but they seem to have thrown the proverbial baby out with the bathwater in the process. RAII is fantastic & it wasn’t until Rust came along & nicked it from C++ that anyone else really sat up & took notice.

                    1. 2

                      Has anyone written a language where stack / manual allocation is the default but GC’d allocations are there if you want them?

                      D?

          1. 2

            This looks very cool and the way different cargo features are exposed fits neatly into a readTree-based CI system! I never quite understood why naersk bundles all the dependencies together, and I’m very glad that someone figured out how to do it “correctly”.

            I’m gonna try this out, just need to jump through the hoops of making it work with stable Nix first.

            1. 7

              I don’t see a difference between this and contributions to most other projects requiring a Microsoft (GitHub) account.

              1. 5

                you don’t have to have an @outlook.com account to contribute to projects on microsoft github.

                1. 5

                  I think the point is that since Google develops both their email service and Go, they are the first party in both situations and have total control over whether this is required. If you were contributing to microsoft/dotnet on GitHub it would be the same situation, but most developers on that code hosting site are not Microsoft and don’t also control the account requirements.

                1. 9

                  Finally it means you need two separate sets of keys to do almost the same thing inside or outside of Emacs. Pretty nasty and pointless!

                  You can also solve this by going in the other direction by making Emacs your window manager. I moved to EXWM a few years ago and have never looked back, it’s incredible how much better basically all of my workflows have become. It also doesn’t really need any frequent handholding, I seem to touch this config about once per month.

                  This post reminds me of another nested version of this problem that some EXWM users have: Applications (especially browsers) with tabs, which they’d prefer to move into Emacs as well.

                  1. 1

                    And even if you don’t use exwm - this problem becomes less significant the more you live inside Emacs. So that’s one motivation to do things in Emacs.

                    1. 1

                      Applications (especially browsers) with tabs, which they’d prefer to move into Emacs as well.

                      surf is a handy little browser for this sort of thing

                      1. 1

                        I tried one-tab-per-window inside exwm with firefox but it was significantly slower to create a new window than creating a new tab, so I ended up having to ditch it. Hadn’t tried it with one of the webkit ones; maybe once I get forced to stop using exwm by wayland I’ll give it a shot. Is it easy to configure surf with emacs keys? When I looked at it years ago it didn’t look straightforward.

                        1. 1

                          Is it easy to configure surf with emacs keys?

                          I don’t use surf, but there are similar patches for dwm. I added emacs keys to tabbed (haven’t published the patch yet), and it was pretty easy there too. I’m guessing surf would be similar to these other suckless projects.

                    1. 3

                      Yes, but not in a program that is standing still.

                      What I mean is that in my experiences, mutability and lack of static checks becomes a maintainability concern. Most codebases, even if they heavily rely on things like global state, will always kind of be in a state where they work - but at the cost of being unable to do large changes.

                      1. 1

                        To summarize, you mean a bigger problem than mutable state and side effects is not being flexible enough to make large changes?

                        1. 1

                          No, I think they lead to large-scale changes being hard/impossible because you end up in situations where you touch some innocent looking thing and then the whole application segfaults.

                      1. 14

                        I am not a lawyer but there’s no way I’d touch this license. First, it’s an EULA: it doesn’t give me any rights to modify the software, so I’m stuck with the provider as a single supplier, making it a try-before-you-buy proprietary license, not anything like an F/OSS license. Even if the terms are FRAND, this just means that they have to screw all of their customers equally. The rest of the license has a bunch of things that make me super nervous:

                        You may use the software for any noncommercial purpose.

                        Okay, what does that mean? If it’s a CMS, for example, can I host my blog on it? What if my blog receives ad revenue? Does that change if the ad revenue is less than the cost of hosting versus if it’s sufficient that I can quit my job? If it’s an image editing program and I create a picture with it for fun, but then a year later sell the copyright of the image for $1m, am I retroactively violating the license? Note that all of the revenue-related things are answered if I’m a small company, but not if I’m an individual. This is really fun if I produce something for free, give it to someone else, and they then sell it for a load of money - was my use now retroactively commercial?

                        Previous advice I’ve received from lawyers is to avoid anything that permits noncommercial use without explicitly defining it because there are so many corner cases like this that you may discover that you’re a commercial use or, worse, a particular use may retroactively become commercial.

                        Use by any charitable organization, educational institution, public research organization, public safety or health organization, environmental protection organization, or government institution counts as use for noncommercial purposes, regardless of the source of funding or obligations resulting from the funding.

                        A charitable organisation is defined by law in most jurisdictions but this doesn’t mention anything about jurisdiction. There are reciprocal treaties between a lot of countries for the purpose of donation (for example, a German company donating to a UK registered charity can claim a deduction if the charity meets the requirements of a German charity). In some jurisdictions this explicitly includes or excludes religious organisations. If my jurisdiction regards my organisation as a charity and the licenser’s jurisdiction does not, who wins?

                        Educational institution is similarly poorly defined. I think we’d all agree that schools and universities probably qualify. Does Pluralsight? Or LinkedIn (which includes LinkedIn Learning) and, by extension, Microsoft? Does Pearson Education (it’s right in the name!)? If my company does in-house training, is it an educational institution? If I run an online store that sells clue-bats embossed with the phrase ‘the world contains many legal jurisdictions’ for hitting lawyers with, am I an educational institution?

                        Similarly, does ‘public research organization’ include anything that self-identifies as such? Including think tanks that exist solely to lobby for specific policies?

                        had fewer than 20 total individuals working as employees and independent contractors at all times during the last tax year

                        A company with 13 full-time employees can be pushed over this limit if they hire half a dozen contractors for an afternoon to staff a booth at a marketing event. I believe this clause excludes agency employees, so you’d be fine if you hire these folks via an agency (this adds to your compliance costs) but if that’s the case then you can defer triggering this rule by hiring as many full-time staff via an agency as possible.

                        earned less than 1,000,000 USD (2019) total revenue in the last tax year

                        This is probably fine for anyone in the USA (though note that it implicitly biases the license towards high-margin businesses). Not so much in the rest of the world. In the time that I’ve been paying attention, the USD:GBP exchange rate has gone over 0.8:1 and below 0.45:1. It can fluctuate by over 20% in a year. Any non-US company making over half a million USD needs to worry about this because fluctuations in exchange rate could invalidate the license. Oh, and the exchange rate to use isn’t defined. HMRC, for example, publishes exchange rates to use for tax purposes (monthly and annual averages) against GBP. I presume the IRS publishes something similar, which could have been explicitly referenced here.

                        indefinitely, if the licensor or their legal successor does not offer fair, reasonable, and nondiscriminatory terms for a commercial license for the software within 32 days of written request and negotiate in good faith to conclude a deal

                        Note that the clock doesn’t stop ticking during the negotiation. So first you have to notice that you’ve crossed the threshold. Easy if it’s hiring the 21st employee, a bit harder if it’s the end-of-tax-year accounting for final revenue (it’s fairly common for that to take a month or more after the end of the tax year, to get accountants to sign off on the end-of-year accounts and know if you’ve just crossed the threshold). By the time you know that you need to send the letter, you’ve probably used a quarter of that time. The licenser can then take up to 32 days to make a FRAND offer. You now have 64 days to complete contract negotiation and signing. Good luck with that if this is anything other than an off-the-shelf license agreement with a previously published price that your legal team agreed to before you started the process. If you’re not happy with the FRAND terms then the supplier just has to run out the 64-day clock before you’re in violation.

                        1. 3

                          then the supplier just has to run out the 64-day clock before you’re in violation.

                          That’s not negotiating in good faith, is it?

                          1. 4

                            Okay, now prove that in a court of law.

                            1. 1

                              That’s not how law works. Having a disagreement doesn’t automatically teleport you before a judge—they would really hate that. It doesn’t even land you on a call with a lawyer the overwhelming majority of the time.

                              Negotiation in good faith is a magic phrase, or “term of art”, in law. The search term is “implied covenant of good faith and fair dealing”—the idea that by default, the law requires a baseline of responsibility and straight shooting from all sides to a deal. Honor the thrust of the deal.

                              Big Time’s language merely makes that implied rule explicit. It also makes clear it extends to negotiation of the new terms, not just doing what’s agreed under Big Time itself. That gives folks who don’t know the legal defaults comfort that Big Time licensors can’t simply send a qualifying offer and sit on it, effectively ghosting would-be customers. If that happens, the company’s choice is to drop the software or keep using it, in reliance on the Big Time terms. If the licensor comes back and threatens to sue, the user can cite the language back at them. Negotiate. Which is the point.

                              1. 1

                                This is the flow that I imagine:

                                • A negotiation requires multiple rounds of review by both sides’ lawyers (from personal experience, this happens when a company buys anything)
                                • Time zone differences mean that any query takes at least 1-2 days for a full round trip.
                                • The lawyers on the provider side keep raising entirely reasonable (or reasonable seeming) issues with any proposed changes that the buyer requests.
                                • You run out the clock and the providing company says ‘we’ve been negotiating in good faith, you are now out of time and must either agree to the terms or be violating the license’.

                                Now what happens? If I keep using the software, my compliance team will be very unhappy: I don’t have a license. If FACT does an audit, I am not in compliance. If the supplier decides to take me to court for using the software without a license, then I have to prove that I do have a valid license under the terms of the Big Time Public License, which requires demonstrating that they were not acting in good faith.

                                From a corporate perspective, this looks like a big tangle of compliance risks and something that I wouldn’t even bother trying to get my compliance team to look at because they’d run screaming away from it.

                                In some ways it’s easier if I’m a big company to start with because I’m definitely not covered by the Big Time license and so need to apply for the FRAND license anyway, so there’s no risk from a license that I don’t use. If I’m a startup with acquisition by a big company as a possible exit strategy then this is the kind of thing that would show up as a big risk when a potential buyer did due diligence, and make my sale price lower. If I’m an individual, I can’t afford to hire a lawyer to tell me if I’m a commercial entity or not and so I just avoid it.

                          2. 2

                            Big Time’s noncommercial language, which descends directly from PolyForm Noncommercial, is the clearest I’ve seen in a public license. The idea that legal terms like this ought to be perfectly clear in every case is a myth. That’s not how natural language—or legal terms—work in practice.

                            The “safe harbors” for personal uses and noncommercial organizations address the vast majority of truly difficult edge cases we actually hear about with Creative Commons NC. CC’s own studies on understanding of its language indicate very little utility from their extra language. So Big Time doesn’t include any.

                            If a use case is arguably commercial, and the user doesn’t qualify as a small business, they should reach out for a paid license. That’s the other half of the point with Big Time. Using this license won’t make sense unless the licensor does (or did) offer paid licenses.

                            As for modification, if you’re covered for “use”, you’re covered under copyright. See the Copyright License section.

                            The headcount limit for small business was lowered from 100 to 20 in version 2.0.0. It’s only approximate, and it only needs to be, functioning as one of three rough proxies for bigness. If you have thirteen full-time employees and are hiring on contractors by the handful, you probably have the wherewithal to get a license. Nothing stops you from reaching out for terms before you’re clearly no longer “small”. There is a special transition time frame for companies that start out small and grow out of those limits.

                            We have active discussion ongoing, including on GitHub, about adjusting the small-business thresholds for economy size internationally. Perhaps we might add an adjustment for purchasing power parity, on top of inflation. But it’s not altogether clear yet whether that’s what developers would want.

                            The most important thing we’re hearing is that devs definitely want to sell to big, well known firms and kinds of firms that ought to be paying, like banks and startups who’ve raised millions. Not that they want to draw a particularly fine line between different degrees of small. All of our “small business” thresholds clearly put the big firms of concern out-of-bounds for free small-business use.

                            1. 1

                              in the last tax year

                              Also .. in which tax year? Here in Russia the tax year runs from January 1st to December 31st, but e.g. in the UK it ends some time in April.

                              1. 2

                                Could that language make sense without referring to the tax year where the company pays tax?

                                Greetings from California! Happiness and success in the new year, 2022!

                                1. 1

                                  This is part of a recurring problem with the license: it does not mention jurisdiction and makes a load of US-centric assumptions. This phrase would be better if it included an explicit note that it means the tax year of the organisation receiving the license. Without that, there are problems if my revenue is under $1m in my tax year but isn’t evenly distributed, so that in the seller’s tax year I am over. Am I violating the license?

                                  All of that said, to your question:

                                  Could that language make sense without referring to the tax year where the company pays tax?

                                  Absolutely yes. Because for most companies the thing that matters is the accounting year (which may be different for each company) and not the tax year (which is defined by statute in each jurisdiction). That defines a point in time each year at which all accounts must be reconciled and, for publicly traded companies, the point at which they must report earnings.

                                  1. 1

                                    Even companies that define their own accounting year have to prepare accounts for the tax year, in order to pay their tax. This isn’t a US-specific phenomenon. Companies pay tax everywhere. Do you know of a country that requires tax reporting for other than a defined, year-long period?

                                    The point of using tax year in the terms was to use the measure companies will already have. Plenty of companies only do accounting on the tax year. Those that also do other accounts still have to do tax accounting. I believe that’s true even if the self-defined accounting year dominates conversations about external reporting or internal compensation and budgeting.

                                    I certainly don’t think I agree that this language could refer to the seller’s tax year:

                                    You may use the software for the benefit of your company if it meets all these criteria: … 2. earned less than $1,000,000 total revenue in the last tax year

                            1. 3

                              It’s kind of interesting how many of the problems the author lists are just GitHub/Gitlab problems.

                              1. 10

                                Overall, the Bank Python environment seems to be indicative of how cultural norms prune the design space. If you suggested this stuff in a normal environment you would get shut down pretty quickly. However, it seems to have worked well in that space so maybe it isn’t as crazy as it sounds on paper. What other ideas do you think we haven’t explored because of normative pressures?

                                1. 20

                                  Versioning all UIs you produce so users can stay on the old UI while the backend just doesn’t care.

                                  Yes I understand how this could be a lot of work in proxying to map old abstractions on top of new backends.

                                  We do already version APIs though, so why not UIs?

                                  1. 26

                                    I think the people hanging on to old.reddit.com just shed a solitary tear.

                                    1. 14

                                      The day old reddit dies is the day I… move to 4chan? IDK what else there is even.

                                      1. 15

                                        the day I start wasting a lot less time online!

                                        1. 3

                                          Active forum communities still exist!

                                          1. 4

                                            Honestly, I’ve come to appreciate forums more with time. While I love lobste.rs, voting on comments and using that to weigh opinions takes away the feeling of community. Forums feel much flatter and more approachable. I should probably spend more time on those parts of the internet.

                                          2. 3

                                            Lobsters instances focused on the particular subcommunity maybe? There are some I could see going that route.

                                            1. 1

                                              Perhaps teddit is an option for you, even now.

                                              1. 1

                                                Lemmy.

                                              2. 6

                                                Reddit is an outlier in actually still providing old.reddit for those that prefer it. Any other social network would have switched over to the new one without any sort of transition.

                                                The transition is happening. There’s a lot wrong with new.reddit but at least it supports the three-backticks method of indicating code in Markdown (instead of the standard way of preceding each line with at least 4 spaces). A lot of the submissions in the daily solutions thread for Advent of Code are from people for whom this works perfectly, but it looks like garbage in old. The mods are fighting a losing battle trying to keep the threads usable for old reddit users.

                                                1. 6

                                                  I optimistically assume that some people on the Reddit dev team still remember how Reddit got big because Digg forced a terrible UX transition without a way to opt out.

                                                  1. 8

                                                    This is assuming that anyone in the dev team has a voice in the management of the company. I highly doubt it.

                                                    1. 6

                                                      I don’t see any evidence that Reddit has a dev team that gives two shits about Reddit. The experience is terrible and has been terrible for years. I dunno, maybe they are just wildly incompetent. I’m not sure which interpretation is more charitable. But in any case, they clearly do not use Reddit, because if you were a Reddit dev who was also a Reddit user, an itch that you would 100% have scratched when new Reddit launched is that code formatting is inconsistent between new and old Reddit. It just is not that hard to use the same Markdown rendering tool across stacks. Yeah, there’s some weirdness about CommonMark and the original Gruber Markdown is underspecified and blah blah, but it is a completely solvable problem that is totally in their wheelhouse to solve, and they have not made even a single step of progress on it. People write Markdown parsers for fun. They just don’t care.

                                                      1. 5

                                                        Fun fact: I wrote the Markdown parser that New Reddit uses, for fun. I also executed the transition to CommonMark-based Markdown at GitHub. Working on Markdown stacks used by sites as big as these is pretty intense. I think your analysis is off the mark.

                                                        1. 3

                                                          Working on Markdown stacks used by sites as big as these is pretty intense.

                                                          I definitely agree with you that having millions of messages run through a parser is going to effectively fuzz out tons of weird shit, so it’s not something you do casually, but I still don’t see how having two parsers that work completely differently makes any sense. Either stick to the old Reddit parser because it’s tried and true or switch to a new one because it’s faster and more capable, but why would you only sort of switch in some views but not others? I’m sure there’s some weird and bad architecture that makes it hard to change old reddit, but I cannot imagine working on Reddit for 40 hours a week and not fixing it eventually. New Reddit isn’t actually new anymore; it’s been around for years. At some level, it has to come down to people saying “I just work here; I don’t care that the product is bad; I’m not going to waste my time pressing for a fix because it’s difficult and it’s not my department.”

                                                2. 5

                                                  Couldn’t agree more. I see backwards-incompatible UI changes as a form of fairly cavalier disrespect for one’s users (I’m looking at you, Firefox), and very much in the same category as breaking API compatibility without warning. The user is a critical part of the overall functioning system, not unlike whatever library or service is on the other side of an API – changes to the interface they use should be managed accordingly. Adding a new keyboard shortcut? Sure. Arbitrarily changing an existing one? Bzzzt Nope.

                                                  1. 2

                                                    “Copy link location” moved from very reachable A to L. So you need to move your left hand all the way to the right side of the keyboard or put down the mouse. Sigh.

                                                    1. 2

                                                      Precisely the example I was thinking of. (If you’re not aware, there’s a hack that can be applied to revert it, but it’s irritating to have to re-apply on every update, and who knows how long it’ll keep working.)

                                              1. 15

                                                No more strict separation of evaluation and build phases: Generating Nix data structures from build artefacts (“IFD”) should be supported first-class and not incur significant performance cost.

                                                I don’t know much about Nix (although I’m about to learn with Oil), but this sounds like the same problem Bazel has, which I mentioned here:

                                                https://lobste.rs/s/virbxa/papers_i_love_gg

                                                and linked in this post:

                                                http://www.oilshell.org/blog/2021/04/build-ci-comments.html#two-problems-with-bazel-and-gg-again

                                                i.e. if you have to compute all dependencies beforehand, it both makes things slow and clashes with the way that certain packages want to be built. You have a lot of work to do before you can start to parallelize your build.

                                                Nix also very much has the “rewrite the entire upstream build system” problem that Bazel has.

                                                It sounds like we still need:

                                                1. A more dynamic notion of dependencies like gg
                                                2. A middleground between Docker-like layers and /nix/store composability

                                                related: https://lobste.rs/s/65xymz/will_nix_overtake_docker#c_tfjfif

                                                1. 9

                                                  Thanks for linking the gg paper! It’s quite interesting how most concepts in there directly map to concepts that already exist in Nix today, for example:

                                                  gg models computation as a graph of “thunks,” which represent individual steps of computation. Thunks describe the execution of a specific binary on a set of inputs, producing one or more outputs. Thunks are entirely self-contained; any dependencies of the invoked binary must also be explicitly named as dependencies.

                                                  That’s the concept behind Nix derivations.

                                                  Either a primitive value, representing a concrete set of bytes stored into the object store, or

                                                  Fixed-output derivation.

                                                  As the output of a named thunk (referred to using a hash of the serialized thunk object), which must be first executed to produce its output(s).

                                                  That’s a standard derivation.

                                                  You can encode, for instance, “Link(Compile(Preprocess(foo.cc)))” as a sequence of three thunks

                                                  This is where it gets interesting. Nothing in Nix’s model prevents us from doing things at this abstraction level, but it is far more common to write a single derivation that is more like Wrap(EntireBuildSystemOfTheThing). We would like to elevate Nix one level up.

                                                  A gg thunk can return either one or more primitive values, or it can return a new thunk, with its own set of inputs.

                                                  In Nix this is known as import from derivation (IFD). It refers to any time the build graph is extended with things that first need to be computed as the result of derivations. This process is currently a major pain in Nix as you can not “chunk” these parts of the evaluation if you’re doing something like computing an entire repository’s build graph for a CI system.
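
                                                  To make the mapping concrete, here’s a rough sketch (hypothetical Rust types, not actual Tvix code) of gg’s value model, with the Nix analogues noted:

                                                  // gg value: either concrete bytes in the object store
                                                  // (cf. a fixed-output derivation), or the output of another
                                                  // thunk, named by the hash of the serialized thunk
                                                  // (cf. a standard derivation's output).
                                                  enum Value {
                                                      Primitive(Vec<u8>),
                                                      ThunkOutput { thunk_hash: String, output_name: String },
                                                  }

                                                  // A self-contained computation step (cf. a derivation): a binary
                                                  // plus explicitly named inputs, producing one or more outputs.
                                                  struct Thunk {
                                                      binary: Value,
                                                      args: Vec<String>,
                                                      inputs: Vec<Value>,
                                                  }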

                                                  Nix also very much has the “rewrite the entire upstream build system” problem that Bazel has.

                                                  I find this statement confusing. Nix rather does the opposite right now - most build systems are wrapped fully in derivations. The one thing that often causes a lot of overhead (e.g. through code generation) is that the majority of build systems either don’t provide Nix with usable hashes of dependencies, or they make it really hard to “inject” “pre-built” artefacts.

                                                  A middleground between Docker-like layers and /nix/store composability

                                                  Can you expand on this?

                                                  1. 2

                                                    Nix also very much has the “rewrite the entire upstream build system” problem that Bazel has.

                                                    I find this statement confusing.

                                                    I believe GP meant “rewrite the upstream package management system”, which is mostly true for Nix; Unless a particular language’s package manager cooperates heavily with Nix and exposes its guts, you can’t reuse it to express your Nix derivations. That’s why we have projects like cabal2nix, node2nix etc. that try to bridge the gap.

                                                    1. 4

                                                      The majority of those tools just pin hashes, because the upstream package managers don’t provide appropriate lockfiles, but it’s still the upstream build system that is being invoked.

                                                      A notable exception is Rust packaging with buildRustCrate, which reimplements a huge chunk of cargo as cargo is impossible to deal with from a perspective of wanting to package individual crates (it doesn’t support injecting these pre-built artefacts in any way).

                                                      On the other end of the spectrum is something like naersk, which uses lockfiles natively.

                                                    2. 1

                                                      Yeah gg and Bazel are both very functional, parallel, and fine-grained. Actually the fine-grained-ness of Bazel is nice at times but it also takes a large amount of time and memory to handle all that file-level metadata.

                                                      It’s also worth looking at Llama, which is an open source experiment inspired by gg: https://blog.nelhage.com/post/building-llvm-in-90s/

                                                      (by the author of the gg post)

                                                      There were a few more blog posts and comments on lobste.rs about llama.


                                                      I think gg’s main contribution is that it’s fast. However it’s also not a production system like Bazel or Nix; it’s research.

                                                      Though it is open source: https://github.com/StanfordSNR/gg

                                                      What stood out to me as being somewhat unreasonable and not-fit-for-production is the idea of “model substitution”, which seems like just a fancy way of saying we have to reimplement a stub of every tool that runs, like GCC.

                                                      https://github.com/StanfordSNR/gg/blob/master/src/models/gcc.cc

                                                      The stub is supposed to find all the dependencies. This seems like a lot of tech debt to me, and limits real world applicability. However gg is fast and that’s interesting! And I do think the notion of dynamic dependencies is important, and it seems like Bazel and Nix suffer there. (FWIW I feel Nix is a pioneering system with many of the right ideas, though it’s not surprising that after >15 years there is opportunity for a lot of improvement. Just because the computing world has grown so much, etc.)


                                                       By “rewrite the whole build system” I basically mean stuff like PyPI; I have some particular experience with R packages (in Bazel, not Nix).

                                                      As for the middleground, I’m experimenting with that now :) As far as I understand, writing Nix package definitions can be tricky because nothing is ever in a standard place – you always have to do ./configure --prefix. Although I just looked at a few definitions, and I guess that’s hidden? But the actual configure command underneath can never be stock.

                                                      In my experience this is slightly annoying for C code, but when you have say R and Python code which rely on native libraries, it starts to get really tricky and hard to debug. My experience is more with Bazel but I’ve seen some people not liking “100 KLOC” of Nix. I’d prefer package definitions that are simpler with less room for things to go wrong.

                                                      For me a key issue is that Nix was developed in a world before Linux containers and associated mechanisms like OverlayFS, and I think with that new flexibility you might choose something different than the /nix/store/$hash mechanism.

                                                      From a philosophical level, Bazel and Nix both have a very strong model of the world, and they get huge benefits from that. But if you don’t fit in the model, then it tends to fall off a cliff and you struggle to write solid build config / package defs.

                                                      Also I don’t remember the details, but I remember being excited about Nix Flakes, because I thought Nix had some of those guarantees all along but apparently doesn’t.


                                                      Anyway it sounds like a very cool project, and I look forward to future blog posts. BTW it might also be worth checking out the Starlark language from Bazel. It started out as literally Python, but evolved into a language that can be evaluated in parallel quite quickly. This partially mitigates the “two stage” latency of evaluating dependencies and then evaluating the graph. I didn’t see any mention of parallel Nix evaluation but it seems to make sense, especially for a functional language? Starlark is not as functional but the semantics are such that there’s a lot of parallelism.

                                                      1. 2

                                                        BTW it might also be worth checking out the Starlark language from Bazel. It started out as literally Python, but evolved into a language that can be evaluated in parallel quite quickly.

                                                        Oh, I’m aware of Starlark - my previous full-time position was in SRE at Google :)

                                                        I didn’t see any mention of parallel Nix evaluation but it seems to make sense, especially for a functional language

                                                        Yeah, we want to get to a parallel evaluation model for sure. In fact the lack of that model is a huge problem for current Nix evaluation, as any use of IFD will force a build process and block the rest of the language evaluation completely until it is done. It makes a lot more sense to mark the thunk as being evaluated and continue on a different part of the graph in the meantime.
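
                                                        In rough sketch form (hypothetical types, not the actual Tvix internals), the idea is a small thunk state machine:

                                                        use std::collections::HashMap;

                                                        enum ThunkState {
                                                            Unevaluated,
                                                            InProgress,   // an IFD build is running for this thunk
                                                            Done(String), // e.g. the resulting store path
                                                        }

                                                        struct Evaluator {
                                                            thunks: HashMap<u64, ThunkState>,
                                                        }

                                                        impl Evaluator {
                                                            // Claim a thunk for evaluation; on false, the caller moves on
                                                            // to a different part of the graph and revisits this one later.
                                                            fn try_claim(&mut self, id: u64) -> bool {
                                                                match self.thunks.get_mut(&id) {
                                                                    Some(s) if matches!(s, ThunkState::Unevaluated) => {
                                                                        *s = ThunkState::InProgress;
                                                                        true
                                                                    }
                                                                    _ => false,
                                                                }
                                                            }
                                                        }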

                                                  1. 61

                                                    I support this. There’s a Nix post on the front page almost every day at this point and it seems to make sense.

                                                    1. 2

                                                      Hijacking the top comment to say THANK YOU SO MUCH for including the tag!

                                                    1. 3

                                                      This is so great. The bus factor on the Nix tool is ~1, and this has been causing lots of issues.

                                                      Is this meant to be flakes-aware?

                                                      Also, what’s it being implemented in? If there’s any code available, I couldn’t see it - navigating the SourceGraph-based forge on mobile was an exercise in frustration.

                                                      1. 6

                                                        Is this meant to be flakes-aware?

                                                        We’re not planning to support experimental features for now beyond what is required for nixpkgs, but note that experiments like flakes can also be implemented in the Nix language - you don’t need tooling support for them.

                                                        Also, what’s it being implemented in?

                                                        The evaluator is in Rust, store/builder are undecided.

                                                        navigating the SourceGraph-based forge on mobile was an exercise in frustration

                                                        There’s also a cgit instance at https://code.tvl.fyi - we have a search service that takes a cookie for redirecting to the preferred code browser.

                                                        1. 1

                                                          I think it’s a shame that Flakes are still considered experimental.

                                                          1. 3

                                                            I disagree, but opinions are split on this :-)

                                                            1. 4

                                                              what are the problems you see with flakes?

                                                        2. 1

                                                          I think I recall reading on Matrix earlier that whatever current source is available is from the ~failed fork attempt, and that the reimplementation would likely be in rust?

                                                          (For that matter, I also saw the suggestion that they may be over-estimating how readily ~compatible their effort may be with guix–though I don’t recall for sure if that was on Matrix or orange site.)

                                                          1. 2

                                                            how readily ~compatible their effort may be with guix

                                                            You’re probably referring to this thread. The thing is that we don’t explicitly intend to be compatible with the derivation files, but the fundamental principles of Guix (language evaluation leading to some sort of format for build instructions) have - to our knowledge - not diverged enough to break conceptual compatibility.

                                                            Note that we haven’t really sat down with anyone from Guix to discuss this yet, for now our efforts are focused on Nix (and personally I believe that a functional, lazy language is the right abstraction for this kind of problem).

                                                        1. 3

                                                          My experience with Nix in the past has been slightly less advanced/dynamic (mainly NixOS and simple packages) but the performance point was a major factor to me. I understand that Flakes are meant to address some of this, but as it stands, Nix evaluations can get really slow. I’d personally love to see something closer to the speed of an apk add.

                                                          I’d be curious if there is a “simpler” version of Nix that could exist which gets speed ups from different constraints. For example, I’ve found please to be faster than most bazel projects, partly due to being written in go and having less of a startup cost, but also because the build setup seems to be simpler.

                                                          I think that the root of the problem might be that Nix is a package build system, not a development build system, and so is built with different assumptions. I wonder if there’s a way to build a good tool that does both package builds (tracks package dependencies, builds binary artifacts, has install hooks) and a build tool (tracks file dependencies, has non-build rules such as linting, and caches artifacts for dev not installation). I’m just spitballing but it seems to me like trying to reconcile these two different systems might force a useful set of constraints that results in a fast & simple build system? (though it could just as easily go the other way and become unwieldy and complex).

                                                          1. 10

                                                            Nix is a package build system, not a development build system

                                                            Ah, but this is exactly the point :-)

                                                            There is nothing fundamental about Nix that prevents it from covering both, other than significant performance costs of representing the large build graphs of software itself (rather than a simplified system build graph of wrapped secondary build systems). At TVL we have buildGo and buildLisp as “Bazel-like” build systems written in Nix, and we do use them for our own tools, but evaluation performance suffers significantly and stops us from adding more development-focused features that we would like to see.

                                                            In fact this was a big driver behind the original motivation that led to us making a Nix fork, and then eventually starting Tvix.

                                                            1. 6

                                                              I wonder if there’s a way to build a good tool that does both package builds (tracks package dependencies, build binary artifacts, has install hooks) and a build tool (tracks file dependencies, has non-build rules such as linting, and caches artifacts for dev not installation).

                                                              I believe there is! It mostly comes from a paper called A Sound and Optimal Incremental Build System with Dynamic Dependencies (PDF), which is not my work (although I’m currently working on an implementation of the ideas).

                                                              There are three key things needed:

                                                              1. Dynamic dependencies.
                                                              2. Flexible “file stamps.”
                                                              3. Targets can execute arbitrary code, instead of just commands.

                                                               The first item is needed because dependencies can change based on the configuration used for the build of a package. Say you have a package that needs libcurl, but only if users enable network features.

                                                               It is also needed to import targets from another build. I’ll use the libcurl example above. If your package’s build target has libcurl as a dependency, then it should be able to import libcurl’s build files and continue the build, making the dependencies of libcurl’s build target dependencies of your package’s build targets.

                                                              In other words, dynamic dependencies allow a build to properly import the builds of its dependencies.

                                                               The second item is the secret sauce and is, I believe, the greatest idea from the paper. The paper calls them “file stamps,” and I call them “stampers.” They are basically arbitrary code that returns a Boolean indicating whether or not a target needs updating.

                                                               A Make-like target’s stampers would check whether the target’s mtime is older than that of any of its dependencies (see the sketch below). A more sophisticated one might check that any file attributes of a target’s dependencies have changed. Another might hash a file.

                                                              The third is needed because otherwise, you can’t express some builds, but tying it with dynamic dependencies is also the bridge between building in the large (package managers) and building in the small (“normal” build systems).
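
                                                               As a sketch of the stamper idea (a hypothetical Rust interface; the paper and my implementation have their own): a stamper is just code answering “does this target need updating?”, and the Make rule becomes one implementation among many.

                                                               use std::fs;
                                                               use std::path::Path;

                                                               trait Stamper {
                                                                   fn needs_update(&self, target: &Path, deps: &[&Path]) -> bool;
                                                               }

                                                               // Make-style stamper: rebuild if the target is missing or any
                                                               // dependency is newer than it.
                                                               struct MtimeStamper;

                                                               impl Stamper for MtimeStamper {
                                                                   fn needs_update(&self, target: &Path, deps: &[&Path]) -> bool {
                                                                       let target_mtime = match fs::metadata(target).and_then(|m| m.modified()) {
                                                                           Ok(t) => t,
                                                                           Err(_) => return true,
                                                                       };
                                                                       deps.iter().any(|dep| {
                                                                           fs::metadata(dep)
                                                                               .and_then(|m| m.modified())
                                                                               .map(|mtime| mtime > target_mtime)
                                                                               .unwrap_or(true)
                                                                       })
                                                                   }
                                                               }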

                                                               Why does this tie it all together? Well, first consider trying to implement a network-based caching system. In most build systems, it’s a special thing, but in a build system with the above things, you just need to write a target that:

                                                               1. Uses a custom stamper that checks the hash of a file and, if it has changed, checks the network for a cached built version of the new version of the file.
                                                               2. If such a cached version exists, make updating that target mean downloading the cached version; otherwise, make updating the target mean building it as normal.

                                                              Voila! Caching in the build system with no special code.
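
                                                               In sketch form (with stub functions standing in for the real hashing, network, and build code):

                                                               // The target's update action under the caching stamper: download
                                                               // the prebuilt artifact on a cache hit, build as normal on a miss.
                                                               fn update_target(input_hash: &str) {
                                                                   match remote_cache_lookup(input_hash) {
                                                                       Some(url) => download(&url),
                                                                       None => build_as_normal(),
                                                                   }
                                                               }

                                                               // Hypothetical stubs for the real implementations.
                                                               fn remote_cache_lookup(_hash: &str) -> Option<String> { None }
                                                               fn download(_url: &str) {}
                                                               fn build_as_normal() {}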

                                                              That, plus being able to import targets from other build files is what ties packages together and what allows the build system to tie package management and software building together.

                                                              I’ll leave it as an exercise to the reader to figure out how such a design could be used to implement a Nix-like package manager.

                                                              (By the way, the paper uses special code and a special algorithm for handling circular dependencies. I think this is a bad idea. I think this problem is neatly solved by being able to run arbitrary code. Just put mutually dependent targets into the same target, which means targets need to allow multiple outputs, and loop until they reach a fixed point.)

                                                              I’m just spitballing but it seems to me like trying to reconcile these two different systems might force a useful set of constraints that results in a fast & simple build system?

                                                              I think that design is simple, but you can judge for yourself. As to whether it’s fast, I think that comes down to implementation.

                                                              1. 2

                                                                 If categorised in the terminology of the ‘build systems a la carte’ paper (expanded JFP version from 2020), where would your proposal fit? Though you haven’t mentioned scheduling or rebuilding strategies (page 27).

                                                                1. 4

                                                                  That is a good question.

                                                                  To be able to do dynamic dependencies, you basically have to have a “Suspending” scheduler strategy, so that’s what mine will have.

                                                                  However, because targets can run arbitrary code, and because stampers can as well, my build system doesn’t actually fit in one category because different stampers could implement different rebuilding strategies. In fact, there could be stampers for all of the rebuilding strategies.

                                                                  So, technically, my build system could fill all four slots under the “Suspending” scheduler strategy in the far right column in table 2 on page 27.

                                                                  In fact, packages will probably be build files that use deep constructive traces, thus making my build system act like Nix for packages, while in-project build files will use any of the other three strategies as appropriate. For example, a massive project run by Google would probably use “Constructive Traces” for caching and farming out to a build farm, medium projects would probably use “Verifying Traces” to ensure the flakiness of mtime didn’t cause unnecessary cleans, and small projects would use “Dirty Bit” because the build would be fast enough that flakiness wouldn’t matter.

                                                                  This will be what makes my build system solve the problem of scaling from the smallest builds to medium builds to the biggest builds. That is, if it actually does solve the scaling problem, which is a BIG “if”. I hope and think it will, but ideas are cheap; execution is everything.

                                                                  Edit: I forgot to add that my build system also will have a feature that allows you to limit its power in a build script along three axes: the power of targets, the power of dependencies, and the power of stampers. Limiting the last one is what causes my build system to fill those four slots under the “Suspending” scheduler strategy, but I forgot about the ability to limit the power of dependencies. Basically, you can turn off dynamic dependencies, which would effectively make my build system use the “Topological” scheduler strategy. Combine that with the ability to fill all four rebuilder strategy slots, and my build system will be able to fill 8 out of the 12.

                                                                  Filling the other four is not necessary because anything you can do with a “Restarting” scheduler you can do with a “Suspending” scheduler. And restarting can be more complicated to implement.

                                                            1. 1

                                                              Every time you see an open-source rewrite of a system from a Google paper, you should keep in mind that Google wouldn’t have published the system unless it had already been replaced by something better.

                                                              1. 16

                                                                This is not true at all.

                                                                1. 7

Spanner, Zanzibar and Borg, among others, are proof that this isn’t true. gRPC is even a little different, in that Google was still adopting it internally when it was released publicly.

Systems are going to change over time, so in the long term there will be outdated papers, but that is simply because papers are a snapshot of knowledge at the time of authorship.

                                                                  1. 5

Even if they have, so what? If the concept is one that (when implemented) works well, and there isn’t an obviously better alternative publicly available, who cares whether or not it’s pulled straight from Google’s latest stack?

                                                                    1. 4

                                                                      Though, even if Google’s moved on internally, these implementations can have a big impact - Hadoop comes to mind.

                                                                    1. 3

                                                                      This link 404s for me.

                                                                      1. 1

                                                                        I’m not seeing any /wiki links from the home page of the site. I wouldn’t be surprised if paths under /wiki were not intended to be publicly shared. It’s working again.

                                                                      1. 12

Docker has some suboptimal design choices (which clearly haven’t stopped its popularity). The build process is one thing, but that’s fixable: the line-continuation-ridden Dockerfile format makes it impossible to comment on what each thing is there for, implicit transactions bury temporary files in the layers, and the layers themselves behave nothing like an ideal dependency tree. What makes me sad are the fundamental design choices that can’t be satisfyingly fixed by adding stuff on top: being a security hole by design, and containers being stateful and writable – and therefore inefficient to share between processes, and something you have to delete afterwards.

What is a more ideal way to build an image? For a start, run a shellscript in a container and save the result. The best part is that you don’t need to copy the resources into the container, because you can mount them as a readonly volume. You do need to implement rebuilding logic yourself, but you can, and it will be better. Need layers? Just build one upon another. Even better, use a proper build system that treats dependencies as a tree, and then make an image out of it.
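
As a hedged sketch of that workflow, driving the docker CLI from Python (the container name, script path and image names are made up):

    import subprocess

    def build_image(base, src_dir, tag):
        # Mount the build resources read-only instead of copying them in,
        # run the build script, then save the container as an image.
        subprocess.run(["docker", "run", "--name", "bld",
                        "-v", f"{src_dir}:/src:ro",
                        base, "/src/build.sh"], check=True)
        subprocess.run(["docker", "commit", "bld", tag], check=True)
        subprocess.run(["docker", "rm", "bld"], check=True)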

As for reimplementing Docker the right way from the ground up, there is fortunately no lack of alternatives these days. My attempt, selfdock, is just one.

                                                                        1. 8

As for reimplementing Docker the right way from the ground up, there is fortunately no lack of alternatives these days.

                                                                          What about nixery?

Especially their idea of “think graphs, not layers” is quite an improvement over previous projects.

                                                                          1. 4

I spent some time talking about the optimisations Nixery does for layers in my talk about it (the bit about layers starts at around 13:30).

An interesting constraint we had to work with was the restriction on the maximum number of layers permitted in an OCI image (which, as I understand it, is a historical implementation artefact), and there’s a public version of the design doc we wrote for this on my blog.

                                                                            In theory an optimal implementation is possible without that layer restriction.
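
As a toy illustration of the popularity-based grouping this leads to (a drastic simplification of what Nixery actually does; the cap and names below are illustrative):

    MAX_LAYERS = 125  # illustrative; OCI implementations cap layer counts

    def layerize(ref_counts):
        # ref_counts: {store_path: number of images referencing it} (made up)
        by_popularity = sorted(ref_counts, key=ref_counts.get, reverse=True)
        dedicated = by_popularity[:MAX_LAYERS - 1]
        rest = by_popularity[MAX_LAYERS - 1:]
        # popular paths get their own shareable layer; the long tail shares one
        return [[p] for p in dedicated] + ([rest] if rest else [])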

                                                                            1. 2

Hey! Thanks for sharing, and also thank you for your work – a true source of inspiration :)

                                                                          2. 2

                                                                            My attempt, selfdock is just one.

This looks neat. But your README left me craving examples.

Say I want to run or distribute my Python app on top of this. Could you provide an example of the equivalent of a Dockerfile?

                                                                            1. 4

Thanks for asking! The idea is that instead of building, distributing and using an image, you build, distribute and use a root filesystem, or part of one (it can of course run out of the host’s root filesystem), and you do this however you want (this isn’t the big point, though).

                                                                              To start with something like a base image, you can undocker a docker image:

docker pull python:3.9.7-slim
sudo mkdir -p /opt/os/myroot
docker save python:3.9.7-slim | sudo undocker -i -o /opt/os/myroot
                                                                              

                                                                              Now, you have a root filesystem. To run a shell in it:

                                                                              selfdock --rootfs /opt/os/myroot run bash
                                                                              

                                                                              Now, you are in a container. If you try to modify the root filesystem from a container, it’s readonly – that’s a feature!

                                                                              I have no name!@host:/$ touch /touchme
                                                                              touch: cannot touch '/touchme': Read-only file system
                                                                              

                                                                              When you exit this process, the reason for this feature starts to show itself: The process was the container, so when you exit it, it’s gone – there is no cleanup. Zero bytes written to disk. Writing is what volumes are for.

To build something into this filesystem, replace run with build, which gives you write access. The idea is as outlined above: mount your resources readonly and run whatever you need:

                                                                              selfdock --rootfs /opt/os/myroot --map $PWD /mnt build pip install -r /mnt/requirements.txt
                                                                              

                                                                              … except that if it modifies files owned by root, you need to be root. As the name implies, selfdock doesn’t just give you root.

                                                                              Then, you can run your thing:

                                                                              selfdock --rootfs /opt/os/myroot --vol $PWD/data /home python app.py
                                                                              

Note that we didn’t specify a user and group ID to run as – it just runs as your own user and group (anything else would be a security hole). This is important for file permissions, especially when giving write access to a volume as above. But since the root filesystem is readonly, you can run thousands of instances out of it, and the overhead isn’t much more than spawning a process. The big point here is not the features, but doing things correctly.

                                                                              1. 2

That sounds very similar to what systemd-nspawn offers. Once you deal with unpacked root filesystems, it may be another solution to look at.

                                                                                1. 1

                                                                                  So, it has even more resemblance to chroot but with more focus on isolation and control of resource usage, IIUIC.

A bit of feedback, if I may. The whole requirement of carrying files around will put people off, myself included. I refrain from using Docker because of the gigantic storage footprint any simple thing requires. But the reason it is so popular is that it abstracts away the binary blobs. People run docker commands and let it do its thing; they don’t need to fiddle with, or even know about, the images stored on their hard drive. It shipped with Docker Hub connectivity by default, so people only worry about their Dockerfiles and refer to images by a URL, or even just a slug if the image is on Docker Hub.

Similarly, back in the day, many chroot power users had a script to copy a basic file structure to a folder and run chroot. I think most people would want this, even if unconsciously: a command that does the complicated parts behind a simple porcelain.

                                                                            1. 3

                                                                              Proper declarative unit files go so much deeper than being “syntax sugar” for templating scripts. My view is that any new service manager should be fundamentally based on the concept of repeatedly attempting to reconcile the current state with what has been declared (think reconciliation loops in Kubernetes, or running something like Terraform in a loop).
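
A minimal sketch of what I mean, in Python pseudocode (the observe/converge hooks are hypothetical):

    import time

    def reconcile_forever(declared, observe, converge, interval=5.0):
        # Never "run once and hope": keep re-observing reality and
        # driving it back towards the declared state.
        while True:
            current = observe()
            if current != declared:
                converge(declared, current)
            time.sleep(interval)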

                                                                              Designing such a format constrains what can be done inside of these service definitions - but that isn’t a problem, it’s a feature. Long-term moving towards more homogenous setups for all software will chip away at the accidental complexity that is currently part of building any service.

On a different note, I have a hard time getting excited about another important, highly-privileged system component being written in a memory-unsafe language …

                                                                              1. 5

                                                                                This doesn’t seem to motivate why having a security key proves that you’re human. What prevents people from just automating this interaction?

                                                                                1. 14

                                                                                  “Prove you are human” has always been the sort of marketing spin on captchas; it’s about making automation marginally more expensive than it’s worth.

                                                                                  1. 13

In the case of Google, I think it is also about getting image-recognition training data for free. Most of their image captchas are clearly image recognition for automotive applications. I strongly suspect that a subset of the tiles they serve are unannotated, and that they then use the annotations for which there is high agreement.

                                                                                    1. 2

                                                                                      If you click wrong, someone could die. It isn’t just a captcha.

                                                                                      1. 4

                                                                                        First, people are incentivized to click the right tiles, since they want to bypass the captcha.

                                                                                        Second, they would not base the label on a single annotation, but rather on thousands or even tens of thousands of annotations which have a high level of inter-annotator agreement.
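
A toy sketch of that kind of aggregation (the threshold is made up):

    from collections import Counter

    def consensus_label(annotations, threshold=0.95):
        # annotations: many users' answers for one unannotated tile
        label, votes = Counter(annotations).most_common(1)[0]
        return label if votes / len(annotations) >= threshold else None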

                                                                                        1. 7

                                                                                          And yet, a lot of the time CAPTCHA insists that that mailbox is actually a parking meter.

                                                                                    2. 3

Prove we should be allowed to advertise to you!

                                                                                    3. 6

                                                                                      The article addresses this toward the end:

                                                                                      We also have to consider the possibility of facing automated button-pressing systems. A drinking bird able to press the capacitive touch sensor could pass the Cryptographic Attestation of Personhood. At best, the bird solving rate matches the time it takes for the hardware to generate an attestation. With our current set of trusted manufacturers, this would be slower than the solving rate of professional CAPTCHA-solving services, while allowing legitimate users to pass through with certainty.

                                                                                      1. 3

Essentially, they are relying on the physical-presence requirement of FIDO security keys to limit the rate at which challenges can be passed. Having a key does not prove you are human; it merely constrains anyone using one to roughly the challenge-passing rate attainable by a human. Since these keys come from trusted manufacturers, I imagine there is some mechanism implemented or planned that would ban keys with superhuman challenge rates.
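
If keys were individually identifiable (which, to be clear, batch attestation is designed to avoid), a toy version of such a rate check might look like this (numbers made up):

    import time

    MIN_INTERVAL = 2.0   # made-up floor on seconds between attestations

    last_seen = {}       # hypothetical key identifier -> last attestation time

    def plausibly_human(key_id, now=None):
        now = time.time() if now is None else now
        prev = last_seen.get(key_id)
        last_seen[key_id] = now
        return prev is None or (now - prev) >= MIN_INTERVAL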

                                                                                      1. 10

                                                                                        I’ve used the Kinesis Advantage almost exclusively for about 7 years. I have a wrist injury from a motorbike accident a decade ago that has made me extra sensitive to RSI issues, but with the Kinesis and reduced mouse usage it’s okay. The main drawback is that it’s quite bulky - I’m currently traveling around Africa with one and it takes up a non-insignificant chunk of luggage space.

There are a few other similar keyboards, like the Ergodox, but none of them work for me as they don’t have key wells. The Dactyl might be interesting …

                                                                                        1. 3

I love my Kinesis, but the problem is that I don’t have a really good cadence for cleaning it. I’ve heard yearly, monthly, even weekly. I’d love to start a conversation around that.

                                                                                          1. 3

                                                                                            I do it whenever it annoys me, and I don’t really know how often that is. Maybe twice a year? My usual process is to pull all key caps and submerge them in a bleach dilution for a while, then wash them off and wipe/vacuum the rest as appropriate.

                                                                                            1. 3

                                                                                              Twice a year sounds about what I do for the one I use daily. I snap a picture of the keyboard to help remember what goes where, then pop all the key caps off, and usually clean them thoroughly with a damp cloth. I usually take the whole keyboard apart, and thoroughly clean out any crumbs, dust, etc. For my older Kinesis keyboards I would use Deoxit on the connectors, to try to make them less flaky.

I cleaned out a keyboard I’d left sitting unused for a few years, and it had spiders inside. So a good clean-out never hurts…

                                                                                              1. 2

                                                                                                Nice, tazjin and astangl, thanks for the advice.

                                                                                        1. 1

I’m not a web dev by profession, but I was under the impression that the issue with generating HTML was the cost on the server. I thought it was a lot easier to scale out by just using a CDN for static assets.

                                                                                          1. 7

                                                                                            Generating HTML dynamically is of course more expensive than static HTML, but this is how every site in the world worked as of ~5-10 years ago: eBay, Amazon, Yahoo, Google, Wikipedia, slashdot, reddit, dating sites, etc.

                                                                                            There is a newer crop of apps (that seem to have unified mobile apps) that generate more HTML on the client, and seem to be slow and want me to look at spinny things for long periods. I’m thinking of Doordash and Airbnb, but I haven’t looked in detail at how they work.

                                                                                            But all the former sites still generate HTML on the server of course, and many new ones do too. This was done with 90’s hardware and 90’s languages. It’s essentially a “solved problem”.
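
For illustration, a complete 90s-style server-rendered page needs nothing beyond the standard library (a sketch, not production code):

    from html import escape
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class Page(BaseHTTPRequestHandler):
        def do_GET(self):
            # Render the HTML on the server, per request; no client-side JS.
            body = f"<html><body><h1>You asked for {escape(self.path)}</h1></body></html>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.end_headers()
            self.wfile.write(body.encode("utf-8"))

    HTTPServer(("127.0.0.1", 8000), Page).serve_forever()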

                                                                                            1. 3

                                                                                              The venerable C2 Wiki (a.k.a. the original wiki) switched to being all client-side.💔

                                                                                              1. 2

                                                                                                And that’s terrible.

                                                                                                1. 1

                                                                                                  Heartbreakingly so.

                                                                                              2. 1

Only some of the former still render on the server – for example, Google is a mix (Search is partially server-side; pretty much everything else is client-side), and new Reddit is all client-side.

                                                                                                Everything is slowly trending towards the newer, slower, client-side apps - I guess in an attempt to uphold Page’s Law :-)

                                                                                              3. 3

That’s sometimes true, but usually isn’t. Web dev blogs often focus on esoteric problems, though, which gives the illusion that they are more common than they really are.

                                                                                                1. 3

For Phoenix’s LiveView (and I suspect Hotwire), they don’t actually generate the HTML, but instead pass some minimal data to the frontend, which generates the HTML. It acts a bit like a client-side application, but the logic is driven by the backend and the developer doesn’t need to write much JavaScript. It’s primarily aimed at replacing client-side rendering.

                                                                                                  You can read this blog post for some details on the underlying data transfer in LiveView.
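
As a toy model of the “minimal data” idea (loosely based on descriptions of LiveView’s payloads, not its actual wire format):

    # The static template parts are sent once on mount; later messages
    # carry only the changed dynamic values, and the client re-interleaves.
    statics = ["<p>Temperature: ", "&#176;C</p>"]

    def render(dynamics):
        return statics[0] + str(dynamics[0]) + statics[1]

    mount_payload  = {"s": statics, "d": [21]}   # initial render
    update_payload = {"d": [22]}                 # subsequent diff only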

                                                                                                  Caveat emptor: I haven’t worked with this tech. I’ve just read a bit about it.

                                                                                                  1. 2

Hotwire explicitly says that the server generates the HTML and the client-side logic just replaces nodes in the tree. This is why it can technically be used without any specialised backend support.

                                                                                                    1. 2

                                                                                                      Thank you for the correction.

                                                                                                1. 4

                                                                                                  It would be amazing if something like this was standardised, so that the client side could be implemented directly by browsers.