Threads for talideon

  1. 2

    Surprising that their proposed 1.0 feature list has Mastodon API support. Why not use the ActivityPub protocol?

    1. 1

      Because ActivityPub support is probably implicit. However, it’s not really a general client protocol AFAIK, and there are plenty of clients that support Mastodon’s protocol.

      1. 3

        it’s not really a general client protocol AFAIK

        It is, though. ActivityPub has two sections: the client-to-server (C2S) one and the server-to-server (S2S) one. Historically, services have mostly implemented the latter but avoided the former for various reasons: either claiming that it’s a little underspecified (which it is) or that they had their own client APIs by the time ActivityPub was usable (as is the case for Mastodon, for example).

        I am working on a suite of libraries (for Go) and a generic ActivityPub server that implements both.

        1. 2

          Since Mastodon uses ActivityPub for server-to-server communication only, nearly all the clients created use the Mastodon API for client-to-server communications. A very small minority support both AP C2S and Mastodon’s API but it’s nearly a lost cause at this point; Mastodon’s API is the de-facto standard. If you want good client support, it’s the only way.

          1. 1

            it’s the only way.

            I disagree. It just requires more work.

            1. 2

              “Become an expert in iOS, Android, Electron, native Windows apps, etc so you can add C2S support to the existing apps” isn’t really feasible for most people. Technically it is “just more work” but it’s unrealistic.

    1. 5

      I unfortunately chose .io a number of years ago before I knew better, and because I could actually get a decent name there.

      These days it seems like there are a billion TLDs and most aren’t very recognizable by the average user, other than really obscure or long domain names, and many still have similar baggage to .io. Does anyone here have a recommendation on where to look for good domain names? Sometimes it seems like all of them are already taken.

      1. 3

        Well, if there were a good recommendation, the names there would already be squatted.

        1. 2

          i’m on .me (vgel.me), Montenegro’s cctld run by doMEn which I assume is a single purpose company, and haven’t had any complaints. i mean i’m sure they’re eating babies in the corporate offices or something, but the email delivery seems alright, i was able to get a 4-letter domain without dropping $200k, and they’re not overtly horrid as far as i know.

          1. 2

            In a former life, I used to work for a registrar and built their domain management platform. That meant I dealt with the .me registry - the staff there were all lovely to deal with!

          2. 1

            For non-email use, I honestly just look at whatever namecheap has on sale at the moment. I’ve had good luck with .fyi, .site and .us for a few PoC-y things lately.

            1. 2

              Are you allowed to have anonymous whois on a .us domain? Back when I had one, you needed to provide your real name and address to the record for “anti-terrorism” reasons or something silly like that. I even wrote to my (then) Senator about it because I thought it was dumb… I ended up moving my personal domain from .us to .net over it.

              1. 3

                No. And that just bit me once again. For the stuff I use it for (tech demos, etc.) I don’t really care. But I’m so used to my registrar’s generous private whois service that I didn’t notice the absence of that checkbox when I bought my most recent one.

                Then my phone started ringing with “scam risk” numbers wanting to sell me offshore site development services. I’ve set up a free google voice number with screening just to list in whois, now.

                None of them seem to bother with direct mail because it’s too costly.

          1. 9

            .io isn’t going anywhere any time soon. .su still exists, as do practically all of the other ccTLDs that got any use.

            I’d love the Chagossians to get all the money they’re rightfully owed.

            1. 18

              It would be good in general if people were more aware of the political considerations of choosing a TLD and the dangers they might pose to a registration.

              I’ve seen people using .dev and .app a lot, it’s worth considering these are Google-controlled TLDs. What really rubbed me the wrong way about these TLDs is Google’s decision to make HSTS mandatory for the entire TLD, forcing HTTPS for any website using them. I’m sure some people will consider this a feature but for Google to arbitrarily impose this policy on an entire TLD felt off to me. No telling what they’ll do in the future.

              1. 12

                .app and .dev aren’t comparable to ccTLDs like .sh and .io, however. gTLDs like .app and .dev have to stick to ICANN policies; ccTLDs don’t, and you’re at the mercy of the registry and national law for the country in question.

                1. 11

                  I was actually just discussing this fact with someone, but interestingly, we were discussing it as a positive, not a negative.

                  All of the newTLDs are under ICANN’s dominion, and have to play by ICANN’s rules, so they don’t provide independence from ICANN’s influence. Whereas the CCTLDs are essentially unconditional handouts which ICANN can’t exert influence over. So there’s a tradeoff here depending on whom you distrust more: ICANN, or the specific country whose TLD you’ve chosen.

                2. 10

                  HSTS preload for the entire TLD is a brilliant idea, and I think every TLD going forward should have it.

                  Defaulting to insecure HTTP URLs is a legacy problem that creates a hole in the web’s security (it doesn’t matter what’s on insecure-HTTP sites; their mere existence is an entry point for MITM attacks against browser traffic). TOFU HSTS is only a partial band-aid, and the per-domain preload list is not scalable.
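
                  (For reference, the per-domain opt-in works roughly like this: a site serves an HSTS header such as the one below over HTTPS and then submits itself to hstspreload.org; TLD-wide preloading skips that per-site step entirely. The max-age value here is just an example.)

                  Strict-Transport-Security: max-age=63072000; includeSubDomains; preload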

                  1. 1

                    Does HTTPS really count as TOFU? Every cert is ultimately checked against a known list of CAs.

                    1. 4

                      The Trust-On-First-Use aspect is that HSTS is remembered by the browser only after the browser has loaded the site once; this leaves first-time visitors willing to connect over unencrypted HTTP.

                      (Well, except for the per-domain preload list mentioned by kornel.)

                      1. 2

                        Sure, but HSTS is strictly a hint that HTTPS is supported, and browsers should use that instead, right? There is no actual trust there, because the TLS certificate is still authenticated as normal.

                        Compare this to SSH, which actually is TOFU in most cases.

                        1. 3

                          Not quite - HSTS prevents connection over plaintext HTTP and prevents users from creating exceptions to ignore invalid certificates. It does more than be a hint, it changes how the browser works for that domain going forward. The TOFU part is that it won’t apply to a user’s first connection - they could still connect over plaintext HTTP, which means that a suitably positioned attacker could respond on the server’s behalf with messages that don’t include the HSTS header (if the attacker is fast enough). This works even if the site itself isn’t serving anything over HTTP or redirects immediately to HTTPS.

                          Calling it TOFU is admittedly a bit of a semantic stretch as I’m not sure what the specific act of trust is (arguably HSTS tells your browser to be less trustful), but the security properties are similar in that it only has the desired effect if the initial connection is trustworthy.

                          1. 1

                            Okay, I see the point about first-time connections, but that wouldn’t change regardless of the presence or absence of HSTS. So why single that header out? It seems to me that having HSTS is strictly better than not having one.

                            1. 2

                              The discussion was about HSTS preload, which avoids the first-connection problem just explained by pre-populating HSTS enforcement settings for specific domains directly in the browser distribution. There is no risk of that first-connection hijack scenario, because the browser acts as if it had already received the header even if it has never actually connected before.

                              Normally this is something you would opt in to and request for your own domain after you registered it, if desired… but Google preloaded HSTS for the TLDs in question in their entirety, so you don’t have the option to make the decision yourself. If you register a domain under one of those TLDs, then Chrome will effectively refuse to ever connect via HTTP to anything under that domain (and, to my knowledge, every other major browser uses the preload list from Chrome).

                              It’s this lack of choice that has some people upset, though it seems somewhat overblown, as Google was always very upfront that this was a requirement, so it shouldn’t have been a surprise to anyone. There is also some real concern that there’s a conflict of interest in Google’s being effectively in total control of both the TLDs and the preload list for all browsers.

                              1. 1

                                The discussion was about HSTS preload, which avoids the first-connection problem just explained by pre-populating HSTS enforcement settings for specific domains directly in the browser distribution. There is no risk of that first-connection hijack scenario, because the browser acts as if it had already received the header even if it has never actually connected before

                                Ahh, THIS is the context I was missing here. In which case, @kornel’s original comment about this being a non-scalable bandaid solution is correct IMO. It’s a useful mitigation, but probably only Google could realistically do it like this.

                                I think the more annoying thing about .dev is that a bunch of local development dns systems like puma-dev and pow used .dev and then Google took it away and made us all change our dev environments.

                                1. 2

                                  I think the more annoying thing about .dev is that a bunch of local development dns systems like puma-dev and pow used .dev and then Google took it away and made us all change our dev environments.

                                  That seems unfortunate, but a not terribly surprising consequence of ignoring the names that were specifically reserved for this purpose and making up their own thing instead.

                      2. 1

                        I mean a user typing a “site.example.com” URL into their browser’s address bar. If the URL isn’t in the HSTS preload list, then it is assumed to be an HTTP URL, and the HTTPS upgrade is like TOFU (the first use is vulnerable to HTTPS-stripping). There are also plenty of http:// links on the web that haven’t been changed to https://, because HTTP->HTTPS redirects keep them working seamlessly, but they’re also a weak link if not HSTS-ed.

                    2. 5

                      uh! I chose .app (unaware, stupid me) for a software project that discarded the Go toolchain for this very reason. Have to reconsider, thx!

                      1. 3

                        I have no idea where to even start to research this stuff. I use .dev in my websites but I didn’t know it was controlled by Google. I legitimately thought these all are controlled by some central entity.

                        1. 2

                          I have no idea where to even start to research this stuff.

                          It is not really that hard. You can start with https://en.wikipedia.org/wiki/.dev

                          If you are going to rent a property (a domain name) for your www home, and you are going to let your content live in that home for many years, it pays off to research where you are renting that property from.

                          1. 1

                            .test is untainted.

                            1. 6

                              Huh? There’s no registrar for .test, it’s just for private test/debug domains.

                        1. 2

                          I write Go at work, and we’re moving to codegen-ing quite a bit of our everyday stuff. Now, I wouldn’t call myself a Go fan, in fact I’m fairly frustrated by it on the regular. But, there is something to be said about a very simple semantic core and leaning on codegen to handle the expressivity. I just wish Go had macros built in, so you didn’t need to pick your codegen tool and it would all be AST-based instead of string-based.

                          I’m of the opinion recently that we’re at a local maximum with PL syntax expressivity. I do not see what could be done to really change things all that much. I’ve used all of the state of the art type systems - Scala, F*, OCaml, Haskell, Idris, you name it. None of them affect the raw amount of code that you have to write all that much. Probably the only thing that affects it is not having types at all, and I don’t even think that is all that expressive across the whole system. There is clear essential complexity with the level of logic that we’re writing.

                          So I’m very open to codegen and macros recently. Sure, they can consist of total black magic and be hard to debug and fully understand. But, I don’t think we have an alternative. There’s an upper limit on how much code a human can produce in a given timespan, and more importantly there’s a limit on the surface area of how much a human can understand enough to successfully modify code correctly. There are some clear information-theoretic limits at play, and no, I don’t think that we’re one beautiful PL feature away from getting past those limits in a meaningful way, and yes, I’m proposing that the answer is macros and/or codegen to get around it.

                          1. 3

                            A thing to keep in mind is that none of the languages you’ve mentioned are trying to make you write less code; they’re trying to ensure certain classes of errors are caught sooner and that code is more likely to be “obviously correct”. There’s a limit to how much code someone can produce, but you can increase how much of their time isn’t wasted on silly things that the compiler can/should catch.

                            Codegen does help, as do macros, but only when they’re hygienic. Generics help here too, as they’re a form of typesafe codegen.

                            1. 1

                              That’s true, I didn’t mean to focus on type systems exactly, but meant that these are the languages with the most advanced features overall, which should translate to programmer productivity in some way. I fully agree that “productivity” has two aspects: raw surface area, but also manageability of that surface area. Types / advanced PL features help with the manageability, but the surface area magnitude is what I’m most concerned about now.

                            2. 2

                              I’ve used all of the state of the art type systems - Scala, F*, OCaml, Haskell, Idris, you name it. None of them affect the raw amount of code that you have to write all that much.

                              I don’t completely disagree with your larger point, but I think you’re overstating the case with respect to the amount of code.

                              For example, even with the same language, it is not uncommon to see a 2-3x difference in the amount of code needed depending on who writes it. And I’m not just talking about golf tricks – I mean the difference between two fully formed solutions whose goal was readability and correctness.

                              On top of that, while OCaml and Haskell, say, might be close in expressiveness, there is a substantial average reduction in code between Go and Haskell, say. At least 2x, and it can be greater. Whether this translates to less time spent overall is a separate question, but there is no question that some languages are substantially more concise than others.

                              1. 2

                                I have no quantitative info about this, this is definitely just my current intuition, and I’d actually really like to get some more quantitative data here. So I can’t disagree with you, because I have nothing to base it on other than feelings.

                                My feelings, though, are that of course you’re right, but I don’t think that means what I’m saying was overstated. You can get marginal improvements by “just writing the code better” and “choosing a more expressive language,” but I still don’t think it’s good enough. We need a much higher level of abstraction.

                            1. 3

                              So, s-expressions, but the brackets are curly and after the first atom.

                              1. 1

                                A rock doesn’t do computation when provided with energy. Now the question is, “what is computation”.

                                1. 1

                                  The Church–Turing thesis is the accepted answer. Energy of what form? Matter is energy.

                                  1. 1

                                    Almost exactly, except that matter != energy, but you can convert matter into energy. I’ll go a bit further though: the rock is not a computer, and nor are any of the basic components of a computer themselves a computer. Doping some silicon doesn’t give you a computer, but a way to direct one energy flow and in the case of transistors, based on another. What actually makes them a computer is their arrangement, and it’s that arrangement, mediated by the components, that does the computation by directing some form of energy in some manner.

                                1. 2

                                  Adding an additional page covering salted hashing with an algorithm like scrypt or bcrypt would be really good.

                                  1. 5

                                    BLAKE2 is probably a good choice, given it’s in the standard library and easy enough to use. It would require an extra column to store the salt, naturally.

                                    The method for producing the salted hash in the first place is:

                                    from hashlib import blake2b
                                    import os
                                    
                                    # ...
                                    
                                    # pwd must be a bytes object (e.g. password.encode())
                                    salt = os.urandom(blake2b.SALT_SIZE)
                                    hash = blake2b(pwd, salt=salt).hexdigest()
                                    

                                    You can then use either hmac.compare_digest() or secrets.compare_digest() (they’re the same function) to do the comparison securely without any timing information leaking out.
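
                                    And a minimal sketch of the verification side, assuming the salt and hex digest produced above are stored alongside the user (verify_password is just an illustrative name):

                                    import hmac
                                    from hashlib import blake2b
                                    
                                    def verify_password(pwd, salt, stored_hash):
                                        # Recompute the salted digest and compare in constant time.
                                        candidate = blake2b(pwd, salt=salt).hexdigest()
                                        return hmac.compare_digest(candidate, stored_hash)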

                                    1. 3

                                      It’s a shame the standard library has no “password” module with a “hash_password(password, version)” function that returns an opaque string blob containing the hash, salt, and version, which you can then use with “compare_password(stored_hash, input)”. You should never have to type the name of some crypto algorithm, much less be expected to know how to safely generate, store, and compare hashes. A generic “safe-enough” standard module would cover 99% of developers’ needs.
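
                                      Something along those lines could be built on hashlib.scrypt, which is already in the standard library (when Python is built against a recent OpenSSL). A rough, hypothetical sketch; the names and the blob format are made up for illustration, not a vetted design:

                                      import base64
                                      import hmac
                                      import os
                                      from hashlib import scrypt
                                      
                                      def hash_password(password, version=1):
                                          # Opaque blob: version, salt and digest, base64-encoded and joined with "$".
                                          salt = os.urandom(16)
                                          digest = scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
                                          return "$".join([f"v{version}",
                                                           base64.b64encode(salt).decode(),
                                                           base64.b64encode(digest).decode()])
                                      
                                      def compare_password(stored_hash, password):
                                          _version, salt_b64, digest_b64 = stored_hash.split("$")
                                          salt = base64.b64decode(salt_b64)
                                          digest = scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
                                          return hmac.compare_digest(digest, base64.b64decode(digest_b64))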

                                      1. 3

                                        There’s a middle ground between “developer must manually roll their own” and “standard library does everything”, and it’s “third-party libraries/frameworks implement this, with knowledge of their domain”. Which is really where people ought to be.

                                        The standard library provides the constant-time comparison utility, but beyond that does not move fast enough, have enough ability to do hard compatibility breaks, or have enough context to make the choice of the One True KDF For All Use Cases Everywhere. Third-party libraries/frameworks can move fast enough, do have extra context from being closer to specific use cases, and can provide migration paths as needed.

                                        1. 2

                                          Sounds like secrets

                                    1. 23

                                      There are some pretty egregious issues with this code. Let’s leave aside the plain-text storage of the password (storing at least a salted hash is easy): the use of string interpolation in database queries is bad, especially when using prepared statements is easier!

                                      Instead of:

                                      cursor.execute("""
                                          INSERT INTO user
                                          (username, password)
                                          VALUES ('%s', '%s')
                                          """ % (username, password))
                                      

                                      You do:

                                      cursor.execute("""
                                          INSERT INTO user
                                          (username, password)
                                          VALUES (?, ?)
                                          """,
                                          (username, password))
                                      

                                      It’s best not to take shortcuts like this, even in demonstration code. Not only is it a bad habit, but you can be guaranteed that somebody will copy the code and propagate the error, and in this case the right thing is easier to do than the wrong thing.

                                      1. 9

                                        Thanks so much! I really don’t know much of anything about any of this; I was just getting it working for a project. But boy am I happy to get feedback like this to make it better!

                                      1. 8
                                        if realPassword[0] == password
                                        

                                        I know this wasn’t the point of the article, but can we please either write pseudo code for this or a safe implementation? :(

                                        1. 6

                                          I am going to keep this whole thing really simple. This should NEVER be done in production for authentication.

                                          This is only for demonstration purposes. Overly simplified to have the core connections exemplified.

                                          I do agree, though, that at least a short comment about the proper way would be good.

                                          1. 12

                                            The right way would be to use the secrets.compare_digest() function. It’s right there in the standard library and has been since 3.6. It’s an alias of hmac.compare_digest(), which has been in the standard library since 3.3.

                                            1. 5

                                              More context on why you should do this: if you want to compare two things for security, use the compare_digest() function. If you do a naive comparison in Python, it will stop at the first mismatched character and is therefore prone to timing attacks.
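
                                              A tiny sketch of the difference, assuming both values are strings of the same kind (e.g. hex digests):

                                              import secrets
                                              
                                              def check(stored, supplied):
                                                  # `stored == supplied` bails out at the first differing character;
                                                  # compare_digest takes the same time regardless of where they differ.
                                                  return secrets.compare_digest(stored, supplied)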

                                          2. 3

                                            I didn’t know how to do a safe implementation but getting these comments, I’m learning quickly. Will make another article on how to do it the right way after learning more!

                                          1. 2

                                            This was/is a pretty common way of doing things in PETSCII on the various Commodore 8-bit machines.

                                            1. 12

                                              Mastodon

                                              Too power hungry for my taste. No easy way to host inside docker, which made it a pain to keep running. I’m very happy with Fosstodon, and don’t see a reason to switch to a self-hosted instance any time soon.

                                              I run an instance too and agree it’s too power hungry. Just the RAM requirements are a bit excessive. Ruby seems to use a lot, as do Postgres and Elasticsearch (not required, but needed if you want good searchability).

                                              1. 12

                                                There’s also a newer implementation of an activitypub server at https://docs.gotosocial.org/en/latest/

                                                1. 6

                                                  I’ve been running a node from under my desk and I gotta say I’ve been really impressed with the ease of installation and the responsiveness of the dev team. Everything that I ran into was due to generic self-hosting problems like dynamic DNS and hairpin routing, gotosocial itself hasn’t given me any trouble.

                                                  1. 2

                                                    This is what I use to run my server on. I have a server that could run Mastodon, but it’s so fiddly to set up and operate that I never bothered. GtS on the other hand is extremely easy to run, and they are very quickly (surprisingly so) adding features to give it parity with Mastodon.

                                                    1. 2

                                                      What I’m waiting for is a “migrate from Pleroma to GTS” guide. I can probably figure it out but it looks like a mountain of faff and pain that my brain fog prevents right now.

                                                    2. 8

                                                      There’s also Honk! if you’re willing to go SUPER minimalist.

                                                      I’m with the OP as well. I ran a Mastodon instance of my own for a bit less than a year, and after a few iterations of Rails migrations, the back-end and the front-end refused to speak and nobody could figure out why so I gave up :)

                                                      1. 5

                                                        Have you considered swapping to pleroma? If I were to host a fediverse node, I’d try that first, looking at the current shape right now.

                                                        I think you can even graft the slick mastodon frontend to a pleroma backend.

                                                        1. 15

                                                          For people like me who would never consider Pleroma because of their, um, politics, it’s worth noting there is a hard fork called Akkoma that is working to save the technology from its toxic community: https://coffee-and-dreams.uk/development/2022/06/24/akkoma.html

                                                          https://akkoma.dev/AkkomaGang/akkoma/

                                                          I can’t promise they’re better, not having personally used Pleroma or interacted with either dev community directly, but I’m cautiously optimistic.

                                                          They’ve put a Code of Conduct in place too: https://akkoma.dev/AkkomaGang/akkoma/src/branch/develop/CODE_OF_CONDUCT.md

                                                          the community must create an environment which is safe and equitable

                                                          1. 3

                                                            …I’ve never seriously considered getting involved in fediverse dev (mastodon is no better for my mental health than twitter is), but I have to admit that hacking on Akkoma sounds fun. I’ve been wanting a good excuse to get into Elixir/Erlang’s ecosystem more.

                                                            1. 7

                                                              I did some development on Pleroma back before their dev team got overtaken by shitlords, and I have to say I was impressed with how approachable it was. I’ve never done Elixir before but I have some dusty experience with Erlang and some very dusty experience with Rails and everything seemed to fit together in a sensible way, kind of what I wish Rails could have been. I wrote about my experience here: https://technomancy.us/191

                                                            2. 4

                                                              I have great difficulty understanding the approach of “this tool is made by people I don’t like so I will deny myself the utility of this tool”.

                                                              1. 21

                                                                It’s possible that part of your confusion is that with an open source project, it is often possible to use the software without directly giving the developers money or other obvious support. But this seems unwise if you want the software to continue to be developed and maintained, as most users of software do. And if you engage in less monetary ways like filing bug reports, you then have to interact with the people you do not like.

                                                                Fortunately this is a demonstration of one strength of FOSS, the right to fork: people who do not want to work with the Pleroma developers can take the codebase and go their own way, as Akkoma seems to be doing. Why spend time with people you don’t get along with, if you could just… not?

                                                                1. 12

                                                                  People who write open source software write it, primarily, for themselves. It will end up optimised for their use cases. If they are interested in creating a society that is antithetical to one in which I want to live then, pragmatically, they will probably evolve the software in directions that I dislike.

                                                                  1. 4

                                                                    This seems like quite a bit of a stretch. Perhaps for social media, since different groups have different ideas on how to regulate discourse, but vast amounts of software don’t fall in this bucket.

                                                                    If libpng was written by Hitler, it still does the job.

                                                                    This divisive attitude leaking (primarily?) out of America is seriously not healthy. For better or worse, people you do not agree with will not simply disappear. If we stop talking, all that is left is violence.

                                                                    1. 10

                                                                      If libpng was written by Hitler, it still does the job.

                                                                      It does run the same, yes. But as skyfaller was saying, if you want to report a bug or send a patch, you depend on Hitler. Unless you fork. I don’t think such an extreme example serves your argument well.

                                                                      This divisive attitude leaking (primarily?) out of America is seriously not healthy. For better or worse, people you do not agree with will not simply disappear. If we stop talking, all that is left is violence.

                                                                      Case in point: out of context, I would like to agree with this. But now that you mentioned Hitler, I have to remind you that Western democracies actually kept talking with him until very late in the ’30s. It didn’t stop the violence.

                                                                      1. 3

                                                                        Most people are not Hitler, and we all know it. It was hyperbole for effect, and we all know that too. I feel you’re intentionally missing my obvious point.

                                                                        The vast majority of your (likely) fellow Americans who you disagree with are not bad people. This is a deeply unhealthy perspective that will only make things worse, and outside this argument you surely know this too.

                                                                        You’ll forgive me if I bow out now.

                                                                      2. 5

                                                                        If libpng was written by Hitler, it still does the job.

                                                                        This isn’t about libpng; it’s about Pleroma, which is a social media tool.

                                                                        It turns out when these kinds of people have atrocious opinions about minorities, they tend to also have bad ideas about moderation and harassment; they only care about the use cases that matter to straight white males.

                                                                        I think it’s a bad idea to run social software that’s written by people who don’t care about moderation and protecting their users.

                                                                        1. 4

                                                                          they only care about the use cases that matter to straight white males.

                                                                          Citation needed please.

                                                                          I think it’s a bad idea to run social software that’s written by people who don’t care about moderation and protecting their users.

                                                                          Social software is about bringing people together, right? Moderation and protecting users is about keeping people apart. I’ll cheerfully admit that there are reasons we keep people apart, but if the criteria is “software to bring people together” it seems obvious to me that the more laid-back software is the way to go.

                                                                          The platonic ideal of protecting users is putting them in a box by themselves.

                                                                          1. 7

                                                                            Social software is about bringing people together, right? Moderation and protecting users is about keeping people apart.

                                                                            This kind of simplistic thinking is exactly the kind of thing that would be an enormous red flag if I was evaluating social media servers and I heard one of the maintainers saying it.

                                                                            1. 4

                                                                              Sure, but you’ve neither explained why it’s incorrectly simplistic nor why it’s a red flag (nor justified your lazy dig at “straight white males”).

                                                                              I’ll drop it, but if you want to have a discussion of substance DMs are always open. :)

                                                              2. 4

                                                                Ruby seems to use a lot

                                                                I think it’s mostly Rails, actually. Ruby has a bit of a bad reputation when it comes to performance, when it’s mostly Rails that’s at fault. In that area it always used to do the best other than JS, and compared to language implementations like Python it’s quite fast.

                                                                At least it was like that ages ago, when someone told me they want to rewrite some big project thinking switching away from Ruby would somehow magically make things faster.

                                                                1. 2

                                                                  I’ve never heard of Ruby being described as ‘quite fast’ compared to Python. Way back in the Ruby 1.8 days, Python was faster, but they’re now more or less neck and neck in terms of performance. Ruby got a bad reputation because of how slow its AST-based interpreter was back in the day.

                                                                  On the other hand, JavaScript (specifically V8) and PHP are faster than both.

                                                                  1. 1

                                                                    Yeah, I think you’re right. I don’t know much about either, but just going off what I see in top

                                                                1. 17

                                                                  When Bazel works, it’s bliss and the development cycle is fast. Caching is nice too.

                                                                  However, when you need to make changes to your pipeline and your pipeline breaks, it’s the worst kind of hell. Bazel uses Starlark, so editor support is poor. You can’t really script in Starlark; you’re forced to implement interop with Bash/Python/your favourite scripting language via command arguments or inputs/outputs. So when your script breaks, you’re left debugging multiple layers of abstraction. I had to deal with Starlark layered on top of a Go extension, on top of Python scripts executing Bash scripts at the bottom. Not fun. And then you have gazelle or whatever BUILD-file generation machinery. Because Bazel is meant to be declarative, you have to use those if you need bigger control over your project. So on our Haskell project we’re using gazelle to configure Bazel to call Haskell Stack to generate cabal files, and in the end generate BUILD files from the cabal files to use the Haskell rules to run GHC. And this whole cake is very brittle. In the end the build takes a bit less time than with just the Haskell toolchain, but there’s so much overhead when changing dependencies, and it’s hard to avoid. And then we have simple Bash targets to start a database or do some automation, and for some reason Bazel quite often redownloads and recompiles the whole Go toolchain just to run a Bash script. That last bit is probably a misconfiguration on our side, but debugging that isn’t easy in Bazel, especially when it reinvents a bunch of abstractions.

                                                                  My conclusion is pretty much the same as in the article - YMMV. Bazel solves a lot of problems, but the learning curve is steep and you are opening a new can of worms.

                                                                  1. 3

                                                                    I have a bunch of issues with Bazel, but Starlark isn’t one of them. It’s basically a cut-down version of Python, so any Python support can help. As far as it not being Turing complete, that’s on purpose to make it decidable. By making it primitive recursive, Bazel can actually analyse the work that needs to be done. If it were more general, it couldn’t. It’s a feature, not a bug. The only thing that irks me is that sometimes sets would be nice to have.

                                                                  1. 2

                                                                      SQL as a language is awesome and I love its declarative nature. But its usage has always suffered from a fatal flaw: there’s no good programming interface for it. For example, how do you programmatically add a WHERE clause to a given base query, say under a certain user-controlled condition? Similarly, how would you conditionally add a JOIN? An ORDER BY? The answer to all of these is string concatenation.

                                                                    The interface is basically: throw me a string and I’ll tell you if it means anything. It’s awkward to do safely and correctly because, IMO, it exists at the wrong level of abstraction.
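
                                                                      The usual workaround looks something like the sketch below (Python with a DB-API cursor and qmark-style placeholders; the names are just for illustration): the query text is assembled by concatenation, while the values still go through placeholders.

                                                                      def find_users(cursor, username=None, order_by=None):
                                                                          # Illustrative only: build the SQL text and the parameter list side by side.
                                                                          query = "SELECT id, username FROM user"
                                                                          params = []
                                                                          if username is not None:
                                                                              query += " WHERE username = ?"
                                                                              params.append(username)
                                                                          if order_by in ("id", "username"):
                                                                              # Identifiers can't be bound as parameters, so whitelist them.
                                                                              query += " ORDER BY " + order_by
                                                                          cursor.execute(query, params)
                                                                          return cursor.fetchall()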

                                                                    1. 4

                                                                      There are SQL APIs that are safer and more powerful than string concatenation and that support programmatic query construction. jOOQ is the one I use nowadays but there are plenty of others.

                                                                      1. 1

                                                                        Yeah, but under the hood doesn’t that do string concatenations anyway? The benefit is a better API, but someone still has to make the sausage.


                                                                        To add on to the OP, some people suggest you do things like

                                                                        SELECT a, b, c
                                                                        FROM foo
                                                                        WHERE %(a)s IS NULL or a = %(a)s;
                                                                        

                                                                          Aside from the duplication (I think there’s a way to do this with one reference, but I forget), this also tends to be optimized poorly (at least in PostgreSQL). I had several queries like this in my app, but I had to switch to dynamic query construction to avoid pathological query plans.

                                                                        1. 1

                                                                          The process of constructing a SQL query using a library like jOOQ isn’t fundamentally any different than the process of constructing a human-readable JSON payload using a client library for a network service. Yes, it’s probably doing string concatenation somewhere along the line, but that fact is largely hidden from the calling code.

                                                                          From a more abstract point of view, at some point, no matter what the in-memory representation is, you have to render your query to a flat sequence of bytes to send it to the database server. You can think of SQL as a human-readable wire format. If you have a safe, feature-rich API to build the query in memory before it gets rendered to the wire format, whether the rendering is done via string concatenation or some other technique shouldn’t be of much concern from the application code’s point of view.

                                                                          But I suspect you mean something more specific by “string concatenation” that I’m not seeing.

                                                                          1. 1

                                                                              String concatenation isn’t bad so long as you’re building a prepared statement or a client-side equivalent that can take care of the interpolation in a more secure way than the average user of the DB driver can.

                                                                      1. 2

                                                                        A better title for this would be “The Second Coming of SQL”.

                                                                        1. 2

                                                                          People think SQL will die, but no, it never left.

                                                                          1. 1

                                                                            Quite true!

                                                                        1. 5

                                                                          It would probably be better to link to the homepage, where it can be downloaded in various formats: https://sre.google/books/building-secure-reliable-systems/

                                                                          1. 1

                                                                            It would be, thanks, I’ve swapped it in.

                                                                          1. 5

                                                                            This means learning SQL will benefit your career as a programmer—and it’s a fairly intuitive language to pick up.

                                                                            “It’s taught in colleges and universities, and it’s really easy to learn.”

                                                                              Basic things in SQL are easy, even straightforward. But anything advanced rapidly gets far from “easy to learn”, especially when you are doing data analysis. More than once, I did what I could in SQL, then switched to some other language to use my SQL query and filter/aggregate my results, when it could have been done in pure SQL :x

                                                                            1. 1

                                                                                The thing is that while those things are still not easy in SQL, as a general rule they are vastly easier than how you’d hand-roll these things if SQL wasn’t a thing in the first place. Obviously, you have certain cases where this hasn’t proven to be the case, but it would be interesting to know what places you hit a wall.

                                                                              1. 1

                                                                                That’s because it’s a leaky DSL, making it a bad DSL.

                                                                                  DSLs need to stay small in scope in order to be useful and not become more complicated than just doing the things they try to abstract away and make easier. (Another example is CSS, where the simple things are simple, but the complex stuff requires understanding way too many fundamental rendering/layout concepts.)

                                                                                So in this case, yeah, SQL is “basically English” for all the small and simple things, and then for everything else, you need to understand why a given sub-select is 10x slower than a join. “No, John, you can’t just sub-select every single select column expression! You need to understand the execution plan is going to do this and that.” And thus, the abstraction of this terrible DSL breaks, and now you’re finding out the right incantations and forms to get the underlying system to do what you want.

                                                                                  So while I personally like relational databases, I can’t stand SQL, and can’t believe we’ve not replaced it yet, or at least offered an alternative option to use instead.

                                                                                1. 1

                                                                                  All abstractions leak. A DSL is an abstraction, so it shouldn’t be a surprise that SQL leaks. That doesn’t make SQL a bad abstraction. What could be said to make SQL a bad abstraction is that it implements a compromised version of relational calculus and has done since the beginning.

                                                                                2. 1

                                                                                  What kind of things? SQL’s main expressiveness problem is that subqueries are required to generate new name bindings; the main reason I’ve found to switch out of sql is that sometimes aggregations result in really slow execution which can be performed more quickly by explicitly specifying the algorithm.

                                                                                1. 17

                                                                                        I disagree with R6. Specifically, the advice of moving sub-blocks to a separate function in the name of avoiding too much nesting. My problem here is that the nesting is still there. It’s just hidden behind the function call.

                                                                                  I’ll concede that in some cases this makes sense. Especially when the resulting sub-function ends up performing a well defined task and needs few arguments. But it’s not what we should go to first. What we should try instead is to flatten the scopes.

                                                                                        In practice, the main sources of deeply nested scopes are loops and conditionals. (Like, duh.) And while nested loops can rarely be collapsed, I’ve seen many conditionals that could be flattened. A classic example is the following pyramid, which we often see when applying the braindead “single exit point” fake quality rule:

                                                                                  if (test1) {
                                                                                      if (test2) {
                                                                                          if (test3) {
                                                                                              OK();
                                                                                          } else { error3(); }
                                                                                      } else { error2(); }
                                                                                  } else { error1(); }
                                                                                  

                                                                                  It can generally be flattened to:

                                                                                        if (!test1) { error1(); return; }
                                                                                        if (!test2) { error2(); return; }
                                                                                        if (!test3) { error3(); return; }
                                                                                        OK();
                                                                                  

                                                                                  (When your language doesn’t have destructors or defer it can be a bit more complex. Worst case, you’ll need to use goto to handle the cleanup without duplicating code all over the place.)

                                                                                  Only when everything is nice and flattened can we ask ourselves whether the nesting is still too deep.

                                                                                  1. 6

                                                                                    And while nested loops can rarely be collapsed

                                                                                    But often extracted. An extraordinary number of the comments I make while reviewing other people’s code is pointing out to them where they can do a tiny amount of preprocessing to avoid unnecessary nested loops and ways they can chain functions that accept and return iterators to replace nested loops with what are in effect pipelines, while making their code simpler.

                                                                                    1. 3

                                                                                      It can potentially offer more readable code.

                                                                                      hasPineapple *> isFresh *> isNotOnPizza
                                                                                      

                                                                                      This can’t happen if you inline the conditions and errors. Hiding the specifics is desirable if it’s behind a good function name. Contrast:

                                                                                      do
                                                                                        when (foo1 `bar1` baz1) $ Left "tastes dull"
                                                                                        when (foo2 `bar2` baz2) $ Left "could be fresher"
                                                                                        when (foo3 `bar3` baz3) $ Left "heathen"
                                                                                      
                                                                                      1. 5

                                                                                        This is a good example of the tension between naming things and keeping the code local. I’d say it depends what the emphasis should be on. Your 3 function chain at the beginning is perfect for showing the big picture at a glance. The expanded version below however is better at showing the internals. The question is, which is more important in any given case?

                                                                                        1. 3

                                                                                          Big picture, but possibly only when the language gives you the tools to do so comfortably.

                                                                                          A delightful feature of Haskell that’s not made it to any other language I use is where clauses, where you can have local definitions after the function body. It’s essentially an inverted let..in.

                                                                                          f x = hasPineapple *> isFresh *> isNotOnPizza
                                                                                            where hasPineapple = etc
                                                                                                  isFresh = etc
                                                                                                  etc
                                                                                          

                                                                                          This allows you to define these bindings at the scope of the outer function, and also hide them below the main function body. In use it acts as if to say “read the main body first, and you can drill down to any of the specifics if/when you care about them”.

                                                                                          On the other hand in a language like TypeScript you need to define these bindings above the main body. (You could make an exception for hoisted functions but the scope is still unclear and not all bindings will be functions.)

                                                                                        2. 3

                                                                                          hasPineapple *> isFresh *> isNotOnPizza

                                                                                          I can’t parse this. What is it meant to express?

                                                                                          1. 1

                                                                                            It’s Haskell’s applicative functor syntax. In fp-ts it’d be:

                                                                                            pipe(hasPineapple, apSecond(isFresh), apSecond(isNotOnPizza))
                                                                                            

                                                                                            It’s similar to monadic bind (which is sort of found in other languages such as JavaScript’s Promise.then and Rust’s Result.and_then). The difference is that here, with the weaker applicative dependency, the code needn’t run sequentially, and we don’t care about the result on one side of the operator assuming it succeeds.

If we imagine their types are Either e a, then this will either give us back isNotOnPizza’s a (which we don’t really care about in this example), or the left-most e, representing failure. Here are some REPL-friendly examples:

                                                                                            -- Right 'y'
                                                                                            Right 'x' *> Right 'y'
                                                                                            
                                                                                            -- Left 'e'
                                                                                            Left 'e' *> Right 'y'
                                                                                            
                                                                                            -- Left 'e'
                                                                                            Right 'x' *> Left 'e'
                                                                                            

If the types were Validation instead, the same code would collect all the errors (in, say, a list) rather than failing fast on the first one.
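To make that concrete, here’s a minimal, hand-rolled sketch of such a type (illustrative only - real libraries like validation provide something along these lines); the point is that its Applicative instance combines failures instead of short-circuiting:

data Validation e a = Failure e | Success a
  deriving Show

instance Functor (Validation e) where
  fmap _ (Failure e) = Failure e
  fmap f (Success a) = Success (f a)

-- Unlike Either, <*> (and therefore *>) accumulates failures.
instance Semigroup e => Applicative (Validation e) where
  pure = Success
  Failure e1 <*> Failure e2 = Failure (e1 <> e2)
  Failure e1 <*> Success _  = Failure e1
  Success _  <*> Failure e2 = Failure e2
  Success f  <*> Success a  = Success (f a)

-- Failure ["tastes dull"] *> Failure ["heathen"] *> Success 'x'
-- ==> Failure ["tastes dull","heathen"]
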

                                                                                            Applicative syntax is also extremely pleasant for writing recursive descent parsers.
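As a small, hedged illustration of that last point, here’s a toy parser built from nothing but Functor and Applicative (satisfy, char and parenDigit are my own names, not from any particular library); the shape of the grammar reads straight off the *> / <* chain:

import Data.Char (isDigit)

-- A bare-bones parser: consume a prefix of the input, or fail.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> do
    (a, rest) <- p s
    pure (f a, rest)

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> do
    (f, rest)  <- pf s
    (a, rest') <- pa rest
    pure (f a, rest')

-- Parse a single character satisfying a predicate.
satisfy :: (Char -> Bool) -> Parser Char
satisfy p = Parser $ \s -> case s of
  (c:rest) | p c -> Just (c, rest)
  _              -> Nothing

char :: Char -> Parser Char
char c = satisfy (== c)

-- A digit wrapped in parentheses; *> and <* keep only the part we care about.
parenDigit :: Parser Char
parenDigit = char '(' *> satisfy isDigit <* char ')'

-- runParser parenDigit "(7)!"  ==>  Just ('7',"!")
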

                                                                                            1. 1

                                                                                              I’m sorry, I still can’t parse the actual intent here. When I read that expression I see two boolean qualifiers (hasPineapple, isFresh) which make sense to apply to a given value (a pizza). But then there’s this parameterized qualifier (isNotOnPizza) without any apparent parameter. Is what not on pizza?

                                                                                          2. 1

but for chains like this you need languages that support Result-like types well

                                                                                            1. 2

                                                                                              It’s becoming more common. Rust’s “enums” come to mind. I’m positive there are others in more mainstream, non-functional languages.

                                                                                              1. 3

                                                                                                Wider use of algebraic data types can’t come soon enough.

                                                                                          3. 1

                                                                                            And while nested loops can rarely be collapsed, I’ve seen many conditionals that could be flattened.

                                                                                            I tend to think of error handling as a special case, where the ideas of structured programming must be suspended. Errors are exceptional states that usually require execution to be aborted, which is why they are often handled using non-structured jumps, like exceptions or gotos.

                                                                                            Outside of error handling, I am not sure if early returns are a good idea in general. Compare:

                                                                                            void UpdateTheme()
                                                                                            {
    if (!IsThemeActive) return;  // IsThemeActive is a function pointer that may be null
                                                                                                const bool bThemeActive = IsThemeActive();
                                                                                                g_dlv->UpdateTheme(bThemeActive);
                                                                                                g_elv->UpdateTheme(bThemeActive);
                                                                                            }
                                                                                            

                                                                                            and

                                                                                            void UpdateTheme()
                                                                                            {
                                                                                                if (IsThemeActive) {
                                                                                                    const bool bThemeActive = IsThemeActive();
                                                                                                    g_dlv->UpdateTheme(bThemeActive);
                                                                                                    g_elv->UpdateTheme(bThemeActive);
                                                                                                }
                                                                                            }
                                                                                            

                                                                                            I recently changed the former to the latter, because I think the intent and logical structure are more transparent.

                                                                                            1. 5

In your specific example I think the before and after are equivalently clear, but in the general case early returns for “exceptional” cases are more readable.

                                                                                              Align the happy path to the left edge

The nesting forces you to keep more context, and more distant context, in your short-term memory.

While technically speaking you could say the same of early returns (all the previous early returns are context, after all), in practice this isn’t really the case, because the early-return style lets you take advantage of the knowledge “ok, there’s a bunch of crap that could go wrong, we’re dealing with each of them, and now it’s all done and we can handle the main case” – which is an easier way to think about things.

                                                                                              It’s also nice to have it highlighted that way in code with a consistent structure. In the nested version, your “main logic”, the happy path, is this deeply indented little branch in the middle of a bunch of other code or, even worse, split up across multiple such branches.

                                                                                              1. 1

                                                                                                in the general case early returns for “exceptional” cases are more readable.

                                                                                                I agree. I suppose I should say that in my specific example, it isn’t really an error or an exceptional case that IsThemeActive is null. That it is non-null is simply a condition for the following three lines of code to be executed. The benefit of using a conditional statement rather than an early return is that it is harder for the reader to miss the condition, and it is easier to refactor the code, e.g. if one were to move the block into another function.

                                                                                                My general point is that early returns and gotos are great when they’re needed, but that I’m not sure whether it is a good idea to use them when a normal conditional statement would fit just as well. If the only reason an early return is used is to avoid another level of indentation, then perhaps it actually makes the code less clear and harder to understand. Indentation highlights the logical structure of the code visually in a way that non-structured control flow tends to obscure.

                                                                                                1. 1

                                                                                                  The thing is that indentation is a pretty good proxy for state, and state is absolutely worth minimizing. It’s not an exceptional case that IsThemeActive is false, but if the majority of the logic of a block of code assumes that IsThemeActive is true, then it’s good if you can eliminate the counterfactual condition early-on, and therefore be able to drop that state going forward.

                                                                                                  And I don’t think early returns and gotos are really comparable. Early returns work within the rules of the call stack, but gotos can do basically whatever.

                                                                                              2. 5

                                                                                                Errors are exceptional states that usually require execution to be aborted, which is why they are often handled using non-structured jumps, like exceptions or gotos.

                                                                                                Is it exceptional if an HTTP GET request fails? Or a disk read? Or a JSON parse of untrusted data? Or an execution of a subprocess? Or a function call that takes user input? Assuming you’re programming with syscalls – errors are absolutely normal, equivalent to happy-path control flow, and should be returned to callers same as a successful result.

                                                                                                UpdateTheme

                                                                                                YMMV, but I wouldn’t approve a PR that changed your first version to the second. Early returns are God-sends! They’re one of the best ways to reduce the amount of state that human beings need to maintain as they read and model a block of code.

                                                                                                1. 2

                                                                                                  errors are absolutely normal, equivalent to happy-path control flow

                                                                                                  The difference is that errors usually require the entire execution to be aborted early, and they may arise in different parts of the logical structure of a function. Exceptions are so useful because they cater to this specific, but very common need. Early returns are another way of handling it.

                                                                                                  Early returns are God-sends! They’re one of the best ways to reduce the amount of state that human beings need to maintain as they read and model a block of code.

                                                                                                  On the one hand, yes. Outside of error handling, I find early returns to make the code more clear when the execution of an entire function depends on a variety of preconditions.

                                                                                                  On the other hand, some problems with early returns are that

                                                                                                  1. they’re easy to miss, and
                                                                                                  2. they require the reader to carefully read all the code in order to understand the logical structure and control flow.

I think both points apply to the code sample I shared above. In the second example, it is clear visually that the execution of the indented lines depends on some condition. One doesn’t even really have to read the code. It is also easier to refactor.

                                                                                                  There is a presentation about structured programming by Kevlin Henney, which influenced some of my thoughts on this.

                                                                                                  1. 1

                                                                                                    The difference is that errors usually require the entire execution to be aborted early, and they may arise in different parts of the logical structure of a function. Exceptions are so useful because they cater to this specific, but very common need.

                                                                                                    I bet we work in pretty different programming contexts, and I’m sure that rule makes sense in yours. But in my world, there is essentially no situation where an e.g. network request error, or input validation error, or whatever else, should abort execution at any scope beyond the current callstack.

                                                                                                    On the other hand, some problems with early returns are they’re easy to miss, and they require the reader to carefully read all the code in order to understand the logical structure and control flow.

                                                                                                    Early returns model error handling as normal control flow. If errors are normal, and an error is no different than any other value, then readers will naturally need to do these things to understand some code. They have to do that anyway! And exceptions actually make understanding control flow harder rather than easier, I think. With early returns you don’t get any surprises — the function will return when it says return. But with exceptions, all bets are off — the function can return at any expression.

                                                                                            1. 5

It seems GNOME’s building for a wide audience of “normies” while their actual users are “geeks”. Their heart is in the right place in wanting an accessible and nice-looking UI, but they completely miss what their users want. They want the freedom to tinker and break their stuff at the expense of accessibility and a nice UI.

GNOME should stop fighting their users and stop breaking stuff out of spite. Any support request for a broken theme should be redirected to the distro that shipped it. Yes, it’s a big burden and might look like finger pointing at times, but that’s part of the cost of FOSS. As OP rightly mentioned, no one has infinite support capacity, and most GNOME users understand that.

                                                                                              1. 18

I’m a GNOME user, very much a geek, and I love the direction they’re taking. I don’t want to mess with my UI, I want it to get out of my way and let me use the computer. GNOME does that spectacularly well, much better than any other DE I have tried over the years. I love that I don’t have to tinker with it, because that lets me focus on what I want to do, rather than having to fight my DE. I do not enjoy tinkering with my desktop, it is not my area of interest. If it were, I’d use something else; that’s the beauty of having a diverse set of options. That GNOME focuses on providing an accessible, consistent experience out of the box with only a few knobs to tweak is great. It’s perfect for those of us - geek or non-geek alike, and anything in between - who just want to get shit done, and honestly don’t care about tweaking it to the last detail.

                                                                                                GNOME stays out of my way, doesn’t overwhelm me with tweaks and knobs I couldn’t care less about. It’s perfect. It’s perfect for me, a geek who keeps tweaking stuff that matters to him (like, my keyboard firmware is still not quite where I want it to be after half a decade of tweaking it). I love tinkering with things where tinkering makes sense. Tinkering with my firmware makes me more productive, and/or the experience more ergonomic, easier on my hands and fingers. Tinkering with my editor helps me get things done faster.

                                                                                                My DE? My DE stays out of my way, why would I want to tinker with that?

As for theming, I’d much prefer a single theme in light & dark variants, both carefully designed, over a hodge-podge of half-broken distro-branded “stuff”. The whole “let’s make the distro look different” idea is silly, if you ask me. A custom splash screen, or background, or something unobtrusive like that? Sure. But aggressively theming it so it’s distro-branded? Nope, no thanks. I’d much prefer it if it didn’t matter whether I’m using RedHat, Ubuntu, or whatever else, and my GNOME would look the same. That’s consistent. I don’t care about the brands, it’s not useful.

So, dear GNOME, please keep on doing what you’re doing. People who don’t like the direction have alternatives; if they like to tinker so much, they can switch away too. Those of us who want something that Just Works, and is well designed out of the box, will stay with GNOME.

                                                                                                1. 6

                                                                                                  I think the problem is, you’re not getting a desktop you don’t have to fight, you’re just getting a desktop that you can’t fight.

                                                                                                  1. 12

                                                                                                    I am getting a desktop I don’t have to fight, thank you. I don’t want to fight it, either. If I wanted to, there are many other options. I prefer not to, and GNOME does what I need it to do. For me, that’s what matters.

                                                                                                    It doesn’t work for everybody, and that’s fine, there are other options, they can use something that fits their needs better. But do let GNOME fit ours.

                                                                                                    1. 4

I mean, I guess I just don’t see why removing options would give you a desktop that you don’t want to fight. You don’t have to fight KDE either. The only difference, aside from the default preferences, is that you can fight KDE if you want to.

                                                                                                      If Gnome can be a desktop you don’t have to fight without customisability, it can be a desktop you don’t have to fight with customisability just as easily.

                                                                                                      1. 5

                                                                                                        You misunderstood. I don’t care about customizability of my desktop. I want it to stay out of my way, and provide a nice, cohesive design out of the box. Simple as that. If the developers believe the best way to achieve that is libadwaita, I’m fine with that. I don’t want to tinker with my DE. If I have to, I’ll find one where I don’t.

                                                                                                        Besides, libadwaita can be customised. Perhaps not themed, as in, completely change it, but it does provide the ability to customise it. Pretty much how macOS Carbon does customisation. Personally, I find libadwaita’s customisation a lot more approachable than GTK3’s theming. It’s simpler, easier to use.

                                                                                                        1. 4

I think people misunderstand - it’s not just “fewer options is simpler for the user”, but also simpler for the people maintaining the application, as the application has fewer permutations of configuration to test and debug.

                                                                                                    2. 4

                                                                                                      And what happens if I’m using KDE and need to use a single GNOME app?

You install one GNOME app, which, so far, was automatically themed with Breeze and looked at least somewhat like a native app, and used native file pickers. Now with the recent GNOME changes, just installing a single GNOME app forces you to look at their theme, and forces you to use their broken filepicker.

Apps should try to be native to whichever desktop they’re running in; they shouldn’t forcefully bring their own desktop into whatever environment they’re in.

                                                                                                      GIMP isn’t using adwaita on Windows either, and neither should Bottles bring adwaita into my KDE desktop.

                                                                                                      1. 12

                                                                                                        And what happens if I’m using KDE and need to use a single GNOME app, and now I’m forced to look at their hideous and unusable adwaita theme?

Then you go and write - or fund - a KDE alternative, if you hate the GNOME look so much and there’s no KDE alternative.

                                                                                                        GNOME is like a virus, it infests your desktop more and more.

                                                                                                        Every single toolkit is like that.

QT isn’t any different. macOS’s widget set isn’t any different. Windows’ isn’t any different. They all look best in their native environments, and they’re quite horrible in others. The macOS and Windows widget sets aren’t even portable. QT is, but even when it tries to look native, it fails miserably, and we’d be better off if it didn’t even try. It might look out of place then, but it would at least be usable. Even if it tries to look like GNOME, it doesn’t, and just makes things worse, because it looks neither GNOME-native nor KDE/QT-native, but a weird mix of both. Yikes.

                                                                                                        GNOME is doing the right thing here. Seeing apps of a non-native widget set try to look native is horrible, having to fight to make them use their native looks rather than try - and fail - to emulate another is annoying, to say the least. I’d much prefer if QT apps looked like QT apps, whether under KDE or GNOME, or anywhere else.

                                                                                                        The only way to have a consistent look & feel is to use the same widget set, because emulating another will always, without exception, fail.

Now with the recent GNOME changes, just installing a single GNOME app forces you to look at their theme, and forces you to use their broken filepicker.

                                                                                                        Opinions. I see no problem with the GNOME file picker. If you dislike it so much, don’t install GNOME apps, help write or fund alternatives for your DE of choice.

Apps should try to be native to whichever desktop they’re running in; they shouldn’t forcefully bring their own desktop into whatever environment they’re in.

                                                                                                        No, they should not. Apps should be native to whichever desktop they were designed for. It is unreasonable to expect app developers to support the myriad of different desktops and themes (because we’d have to include themes then, too).

KDE/QT apps bring their own desktop to an otherwise GNOME/GTK one. Even if they try to mimic GNOME, the result is bad at best, and we’d be better off if they didn’t try. GNOME is doing the right thing by not trying to mimic something it isn’t and then failing. It stays what it is, and so should QT apps, and we’d be free of the broken stuff that stems from apps trying to pretend they’re something they really are not.

                                                                                                        GIMP isn’t using adwaita on Windows either

                                                                                                        Last I checked, GIMP isn’t even using GTK4 yet to begin with, so it doesn’t use libadwaita anywhere. They didn’t make a windows-exception, they just didn’t port GIMP to GTK4 yet. Heck, the stable version of it isn’t even GTK3, let alone 4.

                                                                                                        1. 3

                                                                                                          help write or fund alternatives for your DE of choice.

                                                                                                          Considering the funding for open source projects is limited, this means I’ll have to try to get Gnome users to stop donating to Gnome, and instead donate for my own project. I’m not sure if you actually want that to happen (because it’d mean I’d have to actively try to defund Gnome).

It’d be much better if we just had one well-funded project that looks native in multiple DEs, rather than separate per-DE projects.

                                                                                                          1. 5

                                                                                                            Considering the funding for open source projects is limited, this means I’ll have to try to get Gnome users to stop donating to Gnome

                                                                                                            Huh? Why? They use GNOME, why would they want to fund something else? People should help projects they use.

                                                                                                            and instead donate for my own project.

                                                                                                            Find your own users. Seeing the backlash against GNOME - usually from people not even using GNOME - suggests that there’s a sizable userbase that would be interested in having alternatives to some applications that do not have non-GNOME alternatives. Perhaps that’s an opportunity there.

                                                                                                            1. 1

                                                                                                              Huh? Why? They use GNOME, why would they want to fund something else? People should help projects they use.

                                                                                                              The absolute majority of GNOME users only use it because they either don’t know of alternatives, or because they have to use a few GNOME apps because there’s no alternative. If true alternatives existed, a lot of people would stop using and funding GNOME.

                                                                                                              (This sentence was written by me using Budgie, which uses parts of GNOME, solely because I need to run a GTK based desktop just for one single app that doesn’t properly work otherwise. If I could, I’d never touch Gnome or GTK, ever)

                                                                                                              1. 6

                                                                                                                The absolute majority of GNOME users only use it because

Do you have a credible source for that? Because my experience is the exact opposite. Every GNOME user I know (with wildly varying backgrounds) is aware of alternatives, yet they use GNOME, and they are, in general, happy with it.

                                                                                                                If true alternatives existed, a lot of people would stop using and funding GNOME.

                                                                                                                I very much doubt that people who otherwise wouldn’t use GNOME, would fund it.

                                                                                                                solely because I need to run a GTK based desktop just for one single app that doesn’t properly work otherwise

                                                                                                                I very much doubt that there’s a GTK app that cannot be used unless you run a full GTK desktop. Link, please?

                                                                                                                1. 2

                                                                                                                  n=1, but the reason I threw up my hands and stuck with GNOME on Fedora 36 was because my custom theme wasn’t entirely broken. Some apps use libadwaita and stick out like a sore thumb, though at least I can still move the window buttons to the left which is where I prefer them (for now?), but others still use the theme, and my system-wide font choices are apparently still honoured (again, for now?). But none of this means I don’t think that their UI choices are wasteful of space or find some of their design decisions personally suspect. I tolerate it, but I’m increasingly not happy with it, and eventually it will exceed my daily inertia. I have a custom window manager I’ve been working on, and I might be able to make KDE into enough of what I want that I have alternatives.

                                                                                                                  1. 7

                                                                                                                    You dislike the direction GNOME is taking then. That’s fine, and understandable: neither the looks, nor their approach suits everybody. Thankfully, in the free software world, there are alternatives.

                                                                                                                    I hate that KDE has so many knobs, it’s overwhelming and distracting. The default theme looks horrible too, in my opinion. So I don’t use KDE, because I accept that I’m not their target audience. I don’t complain about it, I don’t hate on them, I am genuinely happy they take a different approach, because then other people can choose them.

                                                                                                                    Sometimes the DE we use takes a different direction than one would like. That’s a bit of a bummer, but it happens. We move on, and find something else, because we can. Or fork, that happened too before, multiple times.

                                                                                                                    Taking a different direction is not wrong. It’s just a different direction, is all. You may not like it, there are plenty who do.

                                                                                                          2. 1

The macOS and Windows widget sets aren’t even portable.

                                                                                                            Tell that to the wine darlings.

                                                                                                            1. 3

                                                                                                              Apps running under Wine stick out like a sore thumb if they’re not basically compositing everything, in which case it’s at least on purpose. I believe that was Algernon’s point.

                                                                                                              1. 1

                                                                                                                Then every widget set is cross platform, because we can just run stuff in emulators. Good luck trying to look native then!

                                                                                                                1. 4

                                                                                                                  run stuff in emulators

                                                                                                                  wine is not an emulator. It is an implementation of the Windows library on top of Linux. It is exactly as equally “native” as GTK and Qt, which are also just libraries implemented on top of Linux.

                                                                                                                  The only question is what collection of applications you prefer. That’s really how native is defined on the linux desktop - that it fits in with the other things you commonly use.

                                                                                                            2. 3

                                                                                                              I mean you’re the one choosing to use a Gnome app. “A Gnome app looks like a Gnome app” is, at its core, something that makes sense imo.

                                                                                                              That said I would like for there to be more unification on the low hanging fruit.

                                                                                                          3. 10

It’s not “spite” - there are a million Linux desktops for tinkering and breaking. Give “normies” something productive and usable in the meantime and they might not all neglect what could be the best platform for their purposes. I use Gnome 4(?) on Wayland and it’s great - I had it looking basically as clean as macOS, without the ugly icons, in like 10 minutes. Real geeks waste their time in the terminal anyway, not customising it. (:p)

                                                                                                            1. 6

                                                                                                              It’s not “spite”

                                                                                                              Well, what is it then? For decades GNOME had flexibility, users created horribly broken themes and everyone was more or less happy. GNOME was happy to have users. Users were happy they had freedom to do whatever. Yes, not everything was perfect. Custom widgets were mostly broken, accessibility was lacking, etc.

As I said, GNOME’s heart is in the right place in wanting a working/accessible default, but does it have to come at the expense of flexibility? OP presents it as if there are only two options: either we let users do whatever, or we have a good, nice-looking theme. And the main driving force behind the decision to remove configurability was distros having a bad default theme.

                                                                                                              I think GNOME is completely misguided in their approach. Instead of creating a good, pretty, accessible default theme and telling people use this if you want a good, pretty, accessible theme, they decided they won’t let distros break their default theme and lump in users into the distro category. It goes completely against the spirit of FOSS. Instead of creating better options for users they chose to remove options.

                                                                                                            2. 8

It seems GNOME’s building for a wide audience of “normies” while their actual users are “geeks”. Their heart is in the right place in wanting an accessible and nice-looking UI, but they completely miss what their users want. They want the freedom to tinker and break their stuff at the expense of accessibility and a nice UI.

                                                                                                              I mean, technical professionals are trying to get their job done. Give me a desktop that works well, and I don’t want to touch it beyond using it. I want to work with compilers, not window managers.

                                                                                                              1. 6

                                                                                                                Give me a desktop that works well, and I don’t want to touch it beyond using it. I want to work with compilers, not window managers.

                                                                                                                I’ve said before that this is why Apple ended up being the manufacturer of the default “developer laptop”. They never really set out to do that, they just wanted to make nice and powerfully-spec’d machines targeting a broad “pro” market. But as a result of accidents of their corporate history, they ended up doing what no Linux distro vendor ever managed: ship something that works well and is Unix-y enough for developers at the same time.

                                                                                                                I ran various Linux distros as my primary desktop operating system for much of the 00s, and I know my first experience with an Apple laptop and OS X was a breath of fresh air.

                                                                                                            1. 41

                                                                                                              The interface of Git and its underlying data models are two very different things, that are best treated separately.

The interface is pretty bad. If I wasn’t so used to it I would be fairly desperate for an alternative. I don’t care much for the staging area, I don’t like having to clean up my working directory every time I need to switch branches, and I don’t like how easy it is to lose commits from a detached HEAD (though there’s always git reflog I guess).

The underlying data model however is pretty good. We can probably ditch the staging area, but apart from that, viewing the history of a repository as a directed graph of snapshots is nice. It captures everything we need. Sure, patches have to be derived from those snapshots, but we care less about the patches than we care about the various versions we saved. If there’s one thing we need to get right, it’s those snapshots. You get reproducible builds & tests from them, not from patches. So I think patches are secondary. I used to love DARCS, but I think patch theory was probably the wrong choice.

                                                                                                              Now one thing Git really really doesn’t like is large binary files. Especially if we keep changing them. But then that’s just a compression problem. Let the data model pretend there’s a blob for each version of that huge file, even though in fact the software is automatically compressing & decompressing things under the hood.

                                                                                                              1. 62

                                                                                                                What’s wrong with the staging area? I use it all the time to break big changes into multiple commits and smaller changes. I’d hate to see it removed just because a few people don’t find it useful.

                                                                                                                1. 27

                                                                                                                  Absolutely, I would feel like I’m missing a limb without the staging area. I understand that it’s conceptually difficult at first, but imo it’s extremely worth the cost.

                                                                                                                  1. 7

                                                                                                                    Do you actually use it, or do you just do git commit -p, which only happens to use the staging area as an implementation detail?

And how do you test the code you’re committing? How do you make sure that the staged hunks aren’t missing another hunk that, for example, changes the signature of the function you’re calling? It’s a serious slowdown in workflow to need to wait for CI rounds, stash and rebase to get a clean commit, and push again.

                                                                                                                    1. 25

                                                                                                                      Do you actually use it

                                                                                                                      Yes.

                                                                                                                      And how do you test the code you’re committing?

                                                                                                                      rebase with --exec

                                                                                                                      1. 12

                                                                                                                        I git add -p to the staging area and then diff it before generating the commit. I guess that could be done without a staging area using a different workflow but I don’t see the benefit (even if I have to check git status for the command every time I need to unstage something (-: )

                                                                                                                        As for testing, since I’m usually using Github I use the PR as the base unit that needs to pass a test (via squash merges, the horror I know). My commits within a branch often don’t pass tests; I use commits to break things up into sections of functionality for my own benefit going back later.

                                                                                                                        1. 7

                                                                                                                          Just to add on, the real place where the staging area shines is with git reset -p. You can reset part of a commit, amend the commit, and then create a new commit with your (original) changes or continue editing. The staging area becomes more useful the more you do commit surgery.

                                                                                                                          1. 2

                                                                                                                            Meh, you don’t need a staging area for that (or anything). hg uncommit -i (for --interactive) does quite the same thing, and because it has no artificial staging/commit split it gets to use the clear verb.

                                                                                                                          2. 2

                                                                                                                            I guess that could be done without a staging area using a different workflow but I don’t see the benefit

                                                                                                                            I don’t see the cost.

                                                                                                                            My commits within a branch often don’t pass tests;

                                                                                                                            If you ever need to git bisect, you may come to regret that. I almost never use git bisect, but for the few times I did need it it was a life saver, and passing tests greatly facilitate it.

                                                                                                                            1. 9

                                                                                                                              I bisect every so often, but on the squashed PR commits on main, not individual commits within a PR branch. I’ve never needed to do that to diagnose a bug. If you have big PRs, don’t squash, or don’t use a PR-based workflow, that’s different of course. I agree with the general sentiment that all commits on main should pass tests for the purposes of bisection.

                                                                                                                          3. 3

I use git gui for committing (the built-in git gui command), which lets you pick by line, not just hunks. Normally the things I’m excluding are stuff like enabling debug flags, or just extra logging, so it’s not really difficult to make sure it’s correct. Not saying I never push bad code, but I can’t recall an instance where I pushed bad code because of that. I also use the index to choose parts of my unfinished work to save in a stash (git stash --keep-index), and sometimes if I’m doing something risky and iterative I’ll periodically add things to the staging area as I go so I have some way to get back to the last known good point without actually making a bunch of commits (I could rebase after, yeah, but meh).

                                                                                                                            It being just an implementation detail in most of that is a fair point though.

                                                                                                                            1. 2

                                                                                                                              I personally run the regression test (which I wrote) to test changes.

                                                                                                                              Then I have to wait for the code review (which in my experience has never stopped a bug going through; when I have found bugs, in code reviews, it was always “out of scope for the work, so don’t fix it”) before checking it in. I’m dreading the day when CI is actually implemented as it would slow down an already glacial process [1].

                                                                                                                              Also, I should mention I don’t work on web stuff at all (thank God I got out of that industry).

                                                                                                                              [1] Our customer is the Oligarchic Cell Phone Company, which has a sprint of years, not days or weeks, with veto power over when we deploy changes.

                                                                                                                            2. 5

                                                                                                                              Author of the Jujutsu VCS mentioned in the article here. I tried to document at https://github.com/martinvonz/jj/blob/main/docs/git-comparison.md#the-index why I think users don’t actually need the index as much as they think.

                                                                                                                              I missed the staging area for at most a few weeks after I switched from Git to Mercurial many years ago. Now I miss Mercurial’s tools for splitting commits etc. much more whenever I use Git.

                                                                                                                              1. 1

                                                                                                                                Thanks for the write up. From what I read it seems like with Jujutsu if I have some WIP of which I want to commit half and continue experimenting with the other half I would need to commit it all across two commits. After that my continuing WIP would be split across two places: the second commit and the working file changes. Is that right? If so, is there any way to tag that WIP commit as do-not-push?

                                                                                                                                1. 3

Not quite. Every time you run a command, the working copy is snapshotted and becomes a real commit, amending the previous working-copy commit. The changes in the working copy are thus treated just like any other commit. The corresponding thing to git commit -p is jj split, which creates two stacked commits from the previous working-copy commit, and the second commit (the child) is what you continue to edit in the working copy.

                                                                                                                                  Your follow-up question still applies (to both commits instead of the single commit you seemed to imagine). There’s not yet any way of marking the working copy as do-not-push. Maybe we’ll copy Mercurial’s “phase” concept, but we haven’t decided yet.

                                                                                                                            3. 8

                                                                                                                              Way I see it, the staging area is a piece of state needed specifically for a command line interface. I use it too, for the exact reason you do. But I could do the same by committing it directly. Compare the possible workflows. Currently we do:

                                                                                                                              # most of the time
                                                                                                                              git add .
                                                                                                                              git commit
                                                                                                                              
                                                                                                                              # piecemeal
                                                                                                                              git add -p .
                                                                                                                              # review changes
                                                                                                                              git commit
                                                                                                                              

Without a staging area, we could instead do this:

                                                                                                                              # most of the time
                                                                                                                              git commit
                                                                                                                              
                                                                                                                              # piecemeal
                                                                                                                              git commit -p
                                                                                                                              # review changes
                                                                                                                              git reset HEAD~ # if the changes are no good
                                                                                                                              

                                                                                                                              And I’m not even talking about a possible GUI for the incremental making of several commits.

                                                                                                                              1. 7

                                                                                                                                Personally I use git add -p all of the time. I’ve simply been burned by the other way too many times. What I want is not to save commands but to have simple commands that work for me in every situation. I enjoy the patch selection phase. More often than not it is what triggers my memory of a TODO item I forgot to jot down, etc. The patch selection is the same as reviewing the diff I’m about to push but it lets me do it incrementally so that when I’m (inevitably) interrupted I don’t have to remember my place.

                                                                                                                                From your example workflows it seems like you’re interested in avoiding multiple commands. Perhaps you could use git commit -a most of the time? Or maybe add a commit-all alias?
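For what it's worth, such a commit-all alias could look something like this; the alias name is made up and this is only a sketch:

git config --global alias.ca '!git add -A && git commit'
git ca    # now stages everything (including new files) and opens the commit editor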

                                                                                                                                1. 1

Never got around to writing that alias, and if I'm being honest I quite often run git diff --cached to see what I've added before I actually commit it.

                                                                                                                                  I do need something that feels like a staging area. I was mostly wondering whether that staging area really needed to be implemented differently than an ordinary commit. Originally I believed commits were enough, until someone pointed out pre-commit hooks. Still, I wonder why the staging area isn’t at least a pointer to a tree object. It would have been more orthogonal, and likely require less effort to implement. I’m curious what Linus was thinking.
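For what it's worth, the index can already be written out as a tree object with plumbing commands, which is roughly what git commit does internally; a small sketch:

git add -p                  # stage some hunks
tree=$(git write-tree)      # snapshot the current index as a tree object
git cat-file -p "$tree"     # inspect it; git commit-tree can wrap a tree in a commit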

                                                                                                                                  1. 2

                                                                                                                                    Very honourable to revise your opinion in the face of new evidence, but I’m curious to know what would happen if you broadened the scope of your challenge with “and what workflow truly requires pre-commit hooks?”!

                                                                                                                                    1. 1

                                                                                                                                      Hmm, that’s a tough one. Strictly speaking, none. But I can see the benefits.

                                                                                                                                      Take Monocypher for instance: now it’s pretty stable, and though it is very easy for me to type make test every time I modify 3 characters, in practice I may want to make sure I don’t forget to do it before I commit anything. But even then there are 2 alternatives:

                                                                                                                                      • Running tests on the server (but it’s better suited to a PR model, and I’m almost the only committer).
• Having a pre-push hook. That way my local commits don't need the hook, and I could go back to using the most recent one as a staging area (a sketch of such a hook follows below).
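A minimal sketch of such a pre-push hook, assuming make test runs the whole suite:

#!/bin/sh
# .git/hooks/pre-push (must be executable)
set -e
make test    # a non-zero exit aborts the push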
                                                                                                                                  2. 1

                                                                                                                                    I use git add -p all the time, but only because Magit makes it so easy. If I had an equally easy interface to something like hg split or jj split, I don’t think I’d care about the lack of an index/staging area.

                                                                                                                                  3. 6

                                                                                                                                    # most of the time

                                                                                                                                    git add .

                                                                                                                                    Do you actually add your entire working directory most of the time? Unless I’ve just initialized a repository I essentially never do that.

                                                                                                                                    Here’s something I do do all the time, because my mind doesn’t work in a red-green-refactor way:

                                                                                                                                    Get a bug report

                                                                                                                                    Fix bug in foo_controller

                                                                                                                                    Once the bug is fixed, I finally understand it well enough to write an automated regression test around it, so go do that in foo_controller_spec

                                                                                                                                    Run test suite to ensure I didn’t break anything and that my new test is green

                                                                                                                                    Add foo_controller and foo_controller_spec to staging area

Revert working copy (but not staged copy!) of foo_controller (but not its spec)

                                                                                                                                    Run test suite again and ensure I have exactly one red test (the new regression test). If yes, commit the stage.

                                                                                                                                    If no, debug spec against old controller until I understand why it’s not red, get it red, pull staged controller back to working area, make sure it’s green.

                                                                                                                                    Yeah, I could probably simulate this by committing halfway through and then doing some bullshit with cherry-picks from older commits and in some cases reverting the top commit but, like, why? What would I gain from limiting myself to just this awkward commit dance as the only way of working? That’s just leaving me to cobble together a workflow that’s had a powerful abstraction taken away from it, just to satisfy some dogmatic “the commit is the only abstraction I’m willing to allow” instinct.

                                                                                                                                    1. 4

                                                                                                                                      Do you actually add your entire working directory most of the time?

                                                                                                                                      Yes. And when I get a bug report, I tend to first reproduce the bug, then write a failing test, then fix the code.

Revert working copy (but not staged copy!) of foo_controller (but not its spec)

                                                                                                                                      Sounds useful. How do you do that?

                                                                                                                                      1. 7

Revert working copy (but not staged copy!) of foo_controller (but not its spec)

                                                                                                                                        Sounds useful. How do you do that?

                                                                                                                                        You can checkout a file into your working copy from any commit.

                                                                                                                                        1. 6

                                                                                                                                          Yes. And when I get a bug report, I tend to first reproduce the bug, then write a failing test, then fix the code.

                                                                                                                                          Right, but that was just one example. Everything in your working copy should always be committed at all times? I’m almost never in that state. Either I’ve got other edits in progress that I intend to form into later commits, or I’ve got edits on disk that I never intend to commit but in files that should not be git ignored (because I still intend to merge upstream changes into them).

                                                                                                                                          I always want to be intentionally forming every part of a commit, basically.

                                                                                                                                          Sounds useful. How do you do that?

                                                                                                                                          git add foo_controller <other files>; git restore -s HEAD foo_controller

                                                                                                                                          and then

                                                                                                                                          git restore foo_controller will copy the staged version back into the working set.
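Putting those two commands together with the workflow described earlier, a rough sketch (the test-runner name is hypothetical):

git add foo_controller foo_controller_spec   # stage the fix and the new test
git restore -s HEAD foo_controller           # working copy: old controller, new spec
./run_tests                                  # expect exactly one red test
git restore foo_controller                   # copy the staged (fixed) controller back
./run_tests                                  # expect all green
git commit                                   # commit what was staged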

                                                                                                                                      2. 1

                                                                                                                                        TBH, I have no idea what “git add -p” does off hand (I use Magit), and I’ve never used staging like that.

                                                                                                                                        I had a great example use of staging come up just yesterday. I’m working in a feature branch, and we’ve given QA a build to test what we have so far. They found a bug with views, and it was an easy fix (we didn’t copy attributes over when copying a view).

So I switched over to views.cpp and made the change. I built, tested that specific view change, and in Magit I staged that specific change in views.cpp. Then I committed, pushed it, and kicked off a pipeline build to give to QA.

                                                                                                                                        I also use staging all the time if I refactor while working on new code or fixing bugs. Say I’m working on “foo()”, but while doing so I refactor “bar()” and “baz()”. With staging, I can isolate the changes to “bar()” and “baz()” in their own commits, which is handy for debugging later, giving the changes to other people without pulling in all of my changes, etc.

                                                                                                                                        Overall, it’s trivial to ignore staging if you don’t want it, but it would be a lot of work to simulate it if it weren’t a feature.
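A sketch of that bar()/baz() isolation, with made-up file names:

git add -p util.cpp                # stage only the bar()/baz() refactor hunks
git commit -m "Refactor bar() and baz()"
git add -p feature.cpp views.cpp   # then stage the foo() work separately
git commit -m "WIP: foo()"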

                                                                                                                                      3. 6

                                                                                                                                        What’s wrong with the staging area? I use it all the time to break big changes into multiple commits and smaller changes.

I’m sure you do – that’s how it was meant to be used. But you might as well use commits as the staging area – it’s easy to commit and squash. This has the benefit that you can work with your whole commit stack at the same time. I don’t know what problem the staging area solves that isn’t better solved with commits. And yet, the mere existence of this unnecessary feature – this implicitly modified invisible state that comes and crashes your next commit – adds cognitive load: commands like git mv, git rm and git checkout pollute the state, then git diff hides it, and finally, git commit --amend accidentally invites it into the topmost commit.

                                                                                                                                        The combo of being not useful and a constant stumbling block makes it bad.
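A sketch of the kind of surprise being described (the file name is invented):

git add -p notes.c               # stage one hunk, then keep editing the file
git diff                         # shows only the unstaged edits; the staged hunk is hidden here
git commit --amend --no-edit     # quietly folds that staged hunk into the previous commit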

                                                                                                                                        1. 3

                                                                                                                                          I don’t know what problem the staging area solves that isn’t better solved with commits.

                                                                                                                                          If I’ve committed too much work in a single commit how would I use commits to split that commit into two commits?

                                                                                                                                          1. 4

                                                                                                                                            Using e.g. hg split or jj split. The former has a text-based interface similar to git commit -p as well as a curses-based TUI. The latter lets you use e.g. Meld or vimdiff to edit the diff in a temporary directory and then rewrites the commit and all descendants when you’re done.

                                                                                                                                            1. 3

That temporary directory sounds a lot like the index – a temporary place where changes to the working copy can be batched. Am I right to infer that the benefit you find in having a second working copy in a temp directory is that it works better with other tools that expect to work on files?

                                                                                                                                              1. 1

The temporary directory is much more temporary than the index - it only exists while you split the commit. For example, if you’re splitting a commit that modifies 5 files, then the temporary directory will have only 2×5 = 10 files (a before and an after copy of each). Does that clarify?

                                                                                                                                                The same solution for selecting part of the changes in a commit is used by jj amend -i (move into parent of specified commit, from working-copy commit by default), jj move -i --from <rev> --to <rev> (move changes between arbitrary commits) etc.

                                                                                                                                            2. 2

I use git revise. Interactive revise is just like interactive rebase, except that it has a cut subcommand. This can be used to split a commit by selecting and editing hunks like git commit -p.

Before git-revise, I used to manually undo part of the commit, commit that, then revert it, and then squash the undo-commit into the commit to be split. The revert-commit then contains the split-off changes.
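Roughly sketched, assuming the commit to split is HEAD and contains changes A and B:

# edit the files to undo change B by hand, then:
git commit -a -m "undo B"       # temporary undo-commit
git revert --no-edit HEAD       # the revert-commit re-introduces B on its own
git rebase -i HEAD~3            # mark "undo B" as fixup into the original commit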

                                                                                                                                            3. 3

I don’t know, I find it useful. Maybe if git built in Mercurial’s “place changes into a commit that isn’t the most recent” amend thing then I might have an easier time doing things, but just staging up relevant changes in a patch-based flow is pretty straightforward and helpful IMO.

                                                                                                                                              I wonder if this would be as controversial if patching was the default

                                                                                                                                            4. 6

                                                                                                                                              What purpose does it serve that wouldn’t also be served by first-class rollback and an easier way of collapsing changesets on their way upstream? I find that most of the benefits of smaller changesets disappear when they don’t have commit messages, and when using the staging area for this you can only rollback one step without having to get into the hairy parts of git.

                                                                                                                                              1. 3

The staging area is difficult to work with until you understand what’s happening under the hood. In most version control systems, an object under version control would be in one of a handful of states: either the object has been cataloged and stored in its current state, or it hasn’t. From a DWIM standpoint, a new git user would expect adding a file to catalog and store it in its current state. With the stage, you can stage, and change, stage again, and change again. I’ve used this myself to logically group commits, so I agree with you that it’s useful. But I do see how it breaks people’s DWIM view of how git works.

Also, if I stage, and then change, is there a way to have git restore the file as I staged it if I haven’t committed?

                                                                                                                                                1. 7

Also, if I stage, and then change, is there a way to have git restore the file as I staged it if I haven’t committed?

git restore .

                                                                                                                                                  1. 3

                                                                                                                                                    I’ve implemented git from scratch. I still find the staging area difficult to use effectively in practice.

                                                                                                                                                  2. 1

                                                                                                                                                    Try testing your staged changes atomically before you commit. You can’t.

                                                                                                                                                    A better design would have been an easy way to unstage, similar to git stash but with range support.

                                                                                                                                                    1. 5

                                                                                                                                                      You mean git stash --keep-index?
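That’s roughly the “testing partial commits” recipe from the git-stash docs, if memory serves:

git add -p                      # stage what you intend to commit
git stash push --keep-index     # shelve the rest; the working tree now matches the index
make test                       # test exactly what will be committed
git commit
git stash pop                   # bring the rest of the work back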

                                                                                                                                                      1. 3

                                                                                                                                                        Interesting, that would solve the problem. I’m surprised I’ve not come across that before.

                                                                                                                                                        In terms of “what’s wrong with the staging area”, what I was suggesting would work better is to have the whole thing work in reverse. So all untracked files are “staged” by default and you would explicitly un-stage anything you don’t want to commit. Firstly this works better for the 90% use-case, and compared to this workaround it’s a single step rather than 2 steps for the 10% case where you don’t want to commit all your changes yet.

                                                                                                                                                        The fundamental problem with the staging area is that it’s an additional, hidden state that the final committed state has to pass through. But that means that your commits do not necessarily represent a state that the filesystem was previously in, which is supposed to be a fundamental guarantee. The fact that you have to explicitly stash anything to put the staging area into a knowable state is a bit of a hack. It solves a problem that shouldn’t exist.
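For comparison, the opt-out flow I’m proposing can be approximated with today’s git, as a sketch (the excluded path is made up) – though it’s two steps instead of one:

git add -A                             # treat everything as staged by default
git restore --staged debug-notes.txt   # explicitly opt one file out
git commit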

                                                                                                                                                        1. 2

The way I was taught this, the way I’ve taught this to others, and the way it’s represented in at least some GUIs is not compatible with that.

                                                                                                                                                          I mean, sure, you can have staged and unstaged changes in a file and need to figure it out for testing, or unstage parts, but mostly it’s edit -> stage -> commit -> push.

That feels, to me and to newbies who barely know what version control is, like a logical additive flow. In tons of cases you stage everything and commit, so it’s a very small operation.

The biggest gripe may be devs who forget to add files in the proper commit, which makes bisect hard. Your approach may well solve that, but I find it a special case of bad GUIs and sloppy devs who do that. Also, at some point the fs layout gets fewer new files.

                                                                                                                                                          1. 2

                                                                                                                                                            Except that in a completely linear flow the distinction between edit and stage serves no purpose. At best it creates an extra step for no reason and at worst it is confusing and/or dangerous to anyone who doesn’t fully understand the state their working copy is in. You can bypass the middle state with git add .; git commit and a lot of new developers do exactly that, but all that does is pretend the staging state doesn’t exist.

                                                                                                                                                            Staging would serve a purpose if it meant something similar to pushing a branch to CI before a merge, where you have isolated the branch state and can be assured that it has passed all required tests before it goes anywhere permanent. But the staging area actually does the opposite of that, by creating a hidden state that cannot be tested directly.

                                                                                                                                                            As you say, all it takes is one mistake and you end up with a bad commit that breaks bisect later. That’s not just a problem of developers being forgetful, it’s the bad design of the staging area that makes this likely to happen by default.

                                                                                                                                                            1. 1

                                                                                                                                                              I think I sort of agree but do not completely concur.

                                                                                                                                                              Glossing over the staging can be fine in some projects and dev sloppiness is IMO a bigger problem than an additive flow for clean commits.

These are societal, per-project issues - what’s the practice or policy or mandate - and thus they could be upheld by anything, even using the undo buffer for clean commits like back in the day. Which isn’t to say you never gotta do trickery like that with Git, just that it’s a flow that feels natural and undo trickery is less common.

Skimming the other comments, maybe jj is more like your suggestion, and I wouldn’t mind “a better Git”, but I can’t be bothered when e.g. Gitless IIRC dropped the staging area and would make clean commits feel like 2003.

                                                                                                                                                      2. 2

If git stash --keep-index doesn’t do what you want then you could help further the conversation by elaborating on what you want.

                                                                                                                                                        1. 1
                                                                                                                                                      3. 16

                                                                                                                                                        The underlying data model however is pretty good. We can probably ditch the staging area,

                                                                                                                                                        Absolutely not. The staging area was a godsend coming from Subversion – it’s my favorite part of git bar none.

                                                                                                                                                        1. 4

Everyone seems to suppose I would like to ditch the workflows enabled by the staging area. I really don’t. I’m quite sure there are ways to keep those workflows without using a staging area. If there aren’t, well… I can always admit I was wrong.

                                                                                                                                                          1. 9

Well, what I prize being able to do is to build up a commit piecemeal out of some but not all of the changes in my working directory, in an incremental rather than all-in-one-go fashion (i.e. I should be able to form the commit over time, and I should be able to modify a file, move its state into the “pending commit” and continue to modify the file further without impacting the pending commit). It must be possible for any commit coming out of this workflow to both not contain everything in my working area, and to contain things no longer in my working area. It must be possible to diff my working area against the pending commit and against the last actual commit (separately), and to diff the pending commit against the last actual commit.

                                                                                                                                                            You could call it something else if you wanted but a rose by any other name etc. A “staging area” is a supremely natural metaphor for what I want to work with in my workflow, so replacing it hardly seems desirable to me.

                                                                                                                                                            1. 2

                                                                                                                                                              How about making the pending commit an actual commit? And then adding the porcelain necessary to treat it like a staging area? Stuff like git commit -p foo if you want to add changes piecemeal.

                                                                                                                                                              1. 11

No. That’s cool too and is what tools like git revise and git absorb enable, but making it an actual commit would have other drawbacks: it would imply it has a commit message and passes pre-commit hooks and things like that. The staging area is useful precisely for what it does now—help you build up the pieces necessary to make a commit. As such it implies you don’t have everything together to make a commit out of it. As soon as I do, I commit; then, if necessary, I --amend, --edit, or git revise later. If you don’t make use of workflows that use staging then feel free to use tooling that bypasses it for you, but don’t try to take it away from the rest of us.

                                                                                                                                                                1. 9

                                                                                                                                                                  pre-commit hooks

Oh, totally missed that one. Probably because I’ve never used it (instead I rely on CI or manually pushing a button). Still, that’s the strongest argument so far, and I have no good solution that doesn’t involve an actual staging area there. I guess it’s time to change my mind.

                                                                                                                                                                  1. 2

I don’t think the final word has been said. These tools could also run hooks. It may be that new hooks need to be defined.

                                                                                                                                                                    Here is one feature request: run git hooks on new commit

                                                                                                                                                                    1. 1

I think you missed the point: my argument is that the staging area is useful as a place to stage stuff before things like commit-related hooks get run. I don’t want tools like git revise to run pre-commit hooks. When I use git revise, the commit has already been made and has presumably passed the pre-commit phase.

                                                                                                                                                                      1. 1

For the problem that git revise “bypasses” the commit hook when using it to split a commit, I meant the commit hook (not the pre-commit hook).

I get that the staging area lets you assemble a commit before the commit hook runs. But if this were possible to do statelessly (which would only be an improvement), you could do without it. And for other reasons, git would be so much better without this footgun:

Normally, you can look at git diff and commit what you see with git commit -a. But if the staging area is clobbered, which you might have forgotten, you also have invisible state that sneaks in!
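One concrete way that invisible state sneaks in, as a sketch (the file name is invented):

git add new_helper.c     # staged a new file earlier, then forgot about it
git diff                 # shows nothing for it: the working copy matches the index
git commit -a -m "fix"   # the forgotten file rides along into the commit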

                                                                                                                                                                        1. 1

                                                                                                                                                                          Normally, you can look at git diff and commit what you see with git commit -a.

Normally I do nothing of the kind. I might have used git commit -a a couple times in the last 5 years (and I make dozens to hundreds of commits per day). The statefulness of the staging area is exactly what benefits my workflow and not the part I would be trying to eliminate. The majority of the time I stage things I’m working on from my editor one hunk at a time. The difference between my current buffer and the last git commit is highlighted, and after I make some progress I start adding related hunks and shaping them into commits. I might fiddle around with a couple things in the current file, then when I like it stage up pieces into a couple different commits.

                                                                                                                                                                          The most aggressive I’d get is occasionally (once a month?) coming up with a use for git commit -u.

                                                                                                                                                                          A stateless version of staging that “lets you assemble a commit” sounds like an oxymoron to me. I have no idea what you think that would even look like, but a state that is neither the full contents of the current file system nor yet a commit is exactly what I want.

                                                                                                                                                                    2. 1

                                                                                                                                                                      Why not allow an empty commit message, and skip the commit hooks if a message hasn’t been set yet?

                                                                                                                                                                      1. 1

Why deliberately make a mess of things? Why turn a discrete concept of a “commit” into something else with multiple possible states? Why not just use staging like it is now? I see no benefit to jury-rigging more states on top of a working one. If the point is to simplify the tooling, you won’t get there by overloading one clean concept with an indefinite state and contextual markers like “if the commit message is empty then this is not a real commit”.

                                                                                                                                                                        1. 1

An empty commit message is how you abort a commit.

                                                                                                                                                                          1. 1

                                                                                                                                                                            With the current UI.

                                                                                                                                                                            When discussing changes, there’s the possibility of things changing.

                                                                                                                                                                      2. 5

                                                                                                                                                                        Again, what’s the benefit?

                                                                                                                                                                        Sure, you could awkwardly simulate a staging area like this. The porcelain would have to juggle a whole bunch of shit to avoid breaking anytime you merge a bunch of changes after adding something to the fake “stage”, pull in 300 new commits, and then decide you want to unstage something, so the replacement of the dedicated abstraction seems likely to leak and introduce merge conflict resolution where you didn’t previously have to worry about it, but maybe with enough magic you could do it.

But what’s the point? To me it’s like saying that I could awkwardly simulate if, while and for with goto, or simulate basically everything with enough NANDs. You’re not wrong, but what’s in it for me? Why am I supposed to like this any better than having a variety of fit-for-purpose abstractions? It just feels like I’d be tying one hand behind my back so there can be one less abstraction, without explaining why having N-1 abstractions is even more desirable than having N.

Seems more like a “foolish consistency is the hobgoblin of little minds” desire than anything beneficial, really.

                                                                                                                                                                        1. 1

                                                                                                                                                                          Again, what’s the benefit?

                                                                                                                                                                          Simplicity of implementation. Implementing the staging area like a commit, or at least like a pointer to a tree object, would likely make the underlying data model simpler. I wonder why the staging area was implemented the way it is.

                                                                                                                                                                          At the interface level however I’ve had to change my mind because of pre-commit hooks. When all you have is commits, and some tests are automatically launched every time you commit anything, it’s pretty hard to add stuff piecemeal.

                                                                                                                                                                          1. 3

                                                                                                                                                                            Yes, simplicity of implementation and UI. https://github.com/martinvonz/jj (mentioned in the article) makes the working copy (not the staging area) an actual commit. That does make the implementation quite a lot simpler. You also get backups of the working copy that way.

                                                                                                                                                                            1. 1

                                                                                                                                                                              Simplicity of implementation.

No offence, but why would I give a shit about this? git is a tool I use to enable me to get other work done; it’s not something I’m reimplementing. If “making the implementation simpler” means my day-to-day workflows get materially more unpleasant, the simplicity of the implementation can take a long walk off a short pier for all I care.

It’s not just pre-commit hooks that get materially worse with this. “Staging” something would then have to have a commit message, I would effectively have to branch off of HEAD before doing every single “staging” commit in order to be able to still merge another branch and then rebase it back on top of everything without fucking about in the reflog to move my now-buried-in-the-past stage commit forward, etc, etc. “It would make the implementation simpler” would be a really poor excuse for a user-hostile change.

                                                                                                                                                                              1. 3

                                                                                                                                                                                If “making the implementation simpler” means my day-to-day workflows get materially more unpleasant, the simplicity of the implementation can take a long walk off a short pier for all I care.

                                                                                                                                                                                I agree. Users shouldn’t have to care about the implementation (except for minor effects like a simpler implementation resulting in fewer bugs). But I don’t understand why your workflows would be materially more unpleasant. I think they would actually be more pleasant. Mercurial users very rarely miss the staging area. I was a git developer (mostly working on git rebase) a long time ago, so I consider myself a (former) git power user. I never miss the staging area when I use Mercurial.

                                                                                                                                                                                “Staging” something would then have to have a commit message

                                                                                                                                                                                Why? I think the topic of this thread is about what can be done differently, so why would the new tool require a commit message? I agree that it’s useful if the tool lets you provide a message, but I don’t think it needs to be required.

I would effectively have to branch off of HEAD before doing every single “staging” commit in order to be able to still merge another branch and then rebase it back on top of everything without fucking about in the reflog to move my now-buried-in-the-past stage commit forward

                                                                                                                                                                                I don’t follow. Are you saying you’re currently doing the following?

                                                                                                                                                                                git add -p
                                                                                                                                                                                git merge <another branch>
                                                                                                                                                                                git rebase <another branch>
                                                                                                                                                                                

                                                                                                                                                                                I don’t see why the new tool would bury the staging commit in the past. That’s not what happens with Jujutsu/jj anyway. Since the working copy is just like any other commit there, you can simply merge the other branch with it and then rebase the whole stack onto the other branch after.

                                                                                                                                                                                I’ve tried to explain a bit about this at https://github.com/martinvonz/jj/blob/main/docs/git-comparison.md#the-index. Does that help clarify?

                                                                                                                                                                                1. 1

                                                                                                                                                                                  Mercurial users very rarely miss the staging area.

Well, I’m not them. As somebody who was forced to use Mercurial for a bit and hated every second of it, I missed the hell out of the staging area, personally (and if memory serves, there was later at least one inevitably-nonstandard Mercurial plugin to paper over this weakness, so I don’t think I was the only person missing it).

                                                                                                                                                                                  I’ve talked about my workflow elsewhere in this thread, I’m not really interested in rehashing it, but suffice to say I lean on the index for all kinds of things.

Are you saying you’re currently doing the following? git add -p; git merge …

                                                                                                                                                                                  I’m saying that any number of times I start putting together a commit by staging things on Friday afternoon, come back on Monday, pull in latest from main, and continue working on forming a commit.

If I had to manually commit things on Friday (we’re discussing, among other things, the assertion that you could eliminate the stage because it’s pointless, and that you could “just” commit whenever you want to stage and revert the commit whenever you want to unstage), forget I’d done so on Monday, pull in 300 commits from main, and then realize whoops, I want to revert a commit 301 commits back, so now I get to back out the merge, etc. etc. - this is all just a giant pain in the ass to even type out.

                                                                                                                                                                                  Does that help clarify?

                                                                                                                                                                                  I’m honestly not interested in reading it, or in what “Jujutsu” does, as I’m really happy with git and totally uninterested in replacing it. All I was discussing in this thread with Loup-Vaillant was the usefulness of the stage as an abstraction and my disinterest in seeing it removed under an attitude of “well you could just manually make commits when you would want to stage things, instead”.

                                                                                                                                                                                  1. 2

                                                                                                                                                                                    I’m honestly not interested in reading it, or in what “Jujutsu” does

                                                                                                                                                                                    Too bad, this link you’re refusing to read is highly relevant to this thread. Here’s a teaser:

                                                                                                                                                                                    As a Git power-user, you may think that you need the power of the index to commit only part of the working copy. However, Jujutsu provides commands for more directly achieving most use cases you’re used to using Git’s index for.

                                                                                                                                                                                    1. 0

                                                                                                                                                                                      What “jujutsu” does under the hood has nothing whatsoever to do with this asinine claim of yours, which is the scenario I was objecting to: https://lobste.rs/s/yi97jn/is_it_time_look_past_git#c_k6w2ut

                                                                                                                                                                                      At this point I’ve had enough of you showing up in my inbox with these poorly informed, bad faith responses. Enough.

                                                                                                                                                                                      1. 1

                                                                                                                                                                                        I was claiming that the workflows we have with the staging area could be achieved without it. And Jujutsu has ways to do exactly that. It has everything to do with the scenario you were objecting to.

                                                                                                                                                                                        Also, this page (and what I cited specifically) is not about what jujutsu does under the hood, it’s about its user interface.

                                                                                                                                                                                        1. 1

                                                                                                                                                                                          I’ve made it clear that I’m tired of interacting with you. Enough already.

                                                                                                                                                                                2. 1

                                                                                                                                                                                  No offence but, why would I give a shit about [simplicity of implementation]?

                                                                                                                                                                                  It’s because people don’t give a shit that we have bloated (and often slow) software.

                                                                                                                                                                                  1. 0

                                                                                                                                                                                    And it’s because of developers with their heads stuck so far up their asses that they prioritize their implementation simplicity over the user experience that so much software is actively user-hostile.

                                                                                                                                                                                    Let’s end this little interaction here, shall we.

                                                                                                                                                                    3. 15

                                                                                                                                                                                      Sublime Merge is the ideal git client for me. Unlike all the other GUI clients I’ve used, it doesn’t pretend it’s not git, so you don’t have to learn something new and you don’t unlearn git. It uses simple git commands and shows them to you. Most of git’s day-to-day problems go away if you can just see what you’re doing (including what you’ve mentioned).

                                                                                                                                                                      CLI doesn’t cut it for projects of today’s size. A new git won’t fix that. The state of a repository doesn’t fit in a terminal and it doesn’t fit in my brain. Sublime Merge shows it just right.

                                                                                                                                                                      1. 5

                                                                                                                                                                        I like GitUp for the same reasons. Just let me see what I’m doing… and Undo! Since it’s free, it’s easy to get coworkers to try it.

                                                                                                                                                                        1. 4

                                                                                                                                                                          I didn’t know about GitUp but I have become a big fan of gitui as of late.

                                                                                                                                                                          1. 2

                                                                                                                                                                            I’ll check that out, thank you!

                                                                                                                                                                        2. 2

                                                                                                                                                                          I use Fork for the same purpose and the staging area has never been a problem since it is visible and diffable at any time, and that’s how you compose your commits.

                                                                                                                                                                        3. 6

                                                                                                                                                                          See Game of Trees for an alternative to the git tool that interacts with normal git repositories.

                                                                                                                                                                          Have to agree with others about the value of the staging area though! It’s the One Big Thing I missed while using Mercurial.

                                                                                                                                                                          1. 5

                                                                                                                                                                                            Well, on the one hand, people could long for a better way to store conflict resolutions so they can be reused on future merges.

                                                                                                                                                                                            On the other hand, of all the approaches to a DAG of commits, Git’s model is plainly worse than the older and parallel ones. Git is basically designed to lose valuable information about intent. The original target branch of a commit often tells as much as the commit message… but it is only available in the reflog, which is auto-GCed and impossible to sync.

                                                                                                                                                                            1. 10

                                                                                                                                                                              Half of my branches are called werwerdsdffsd. I absolutely don’t want them permanently burned in the history. These scars from work-in-progress annoyed me in Mercurial.

                                                                                                                                                                              1. 9

                                                                                                                                                                                                Honestly I have completely the opposite feeling. Back in the days before git crushed the world, I used Mercurial quite a lot, and I liked that it had both the ephemeral “throw away after use” model (bookmarks) and the permanent-part-of-your-repository-history model (branches). They serve different purposes, and both are useful and important to have. Git only has one, and mostly likes to pretend that the other is awful and horrible and nobody should ever want it, but any long-lived project is going to end up with major refactorings or rewrites or big integrations for which they’ll want some kind of “here’s how we did it” record to easily point to, and that’s precisely where the heavyweight branch shines.

                                                                                                                                                                                And apparently I wrote this same argument in more detail around 12 years ago.

                                                                                                                                                                                1. 1

                                                                                                                                                                                  ffs_please_stop_refactoring_and_review_this_pr8

                                                                                                                                                                                2. 2

                                                                                                                                                                                                  This is a very good point. It would be interesting to tag and attach information to a group of related commits. I’m curious about the Linux kernel workflows: if everything is an emailed patch, maybe features are done one commit at a time.

                                                                                                                                                                                  1. 2

                                                                                                                                                                                                    If you go further, there are many directions in which to extend what you can store and query in the repository, and of course they are useful. But even the data Git forces you to have could be used better (unlike Git, by the way, many other DVCSes let you keep multiple heads in parallel inside a branch if you don’t want to invent a meaningful name).

                                                                                                                                                                                  2. 2

                                                                                                                                                                                    I can’t imagine a scenario where the original branch point of a feature would ever matter, but I am constantly sifting through untidy merge histories that obscure the intent.

                                                                                                                                                                                    Tending to your commit history with intentionality communicates to reviewers what is important, and removes what isn’t.

                                                                                                                                                                                    1. 1

                                                                                                                                                                                                      It is not about the point a branch started from. It is about which of the recurring branches the commit was in. Was it in the quick-fix-train branch or in the update-major-dependency-X branch?

                                                                                                                                                                                      1. 2

                                                                                                                                                                                                        The reason this isn’t common is GitHub more than Git: they don’t provide a way to use merge commits that isn’t a nightmare.

                                                                                                                                                                                        When I was release managing by hand, my preferred approach was rebasing the branch off HEAD but retaining the merge commit, so that the branch commits were visually grouped together and the branch name was retained in the history. Git can do this easily.
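                                                                                                                                                                                                        Something like this, as a sketch (the branch name is invented):

                                                                                                                                                                                                            git checkout feature/update-deps
                                                                                                                                                                                                            git rebase main                         # replay the branch onto the current tip of main
                                                                                                                                                                                                            git checkout main
                                                                                                                                                                                                            git merge --no-ff feature/update-deps   # force a merge commit so the branch stays grouped

                                                                                                                                                                                                        The --no-ff merge is what keeps the branch name and its grouping visible in the history.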

                                                                                                                                                                                  3. 5

                                                                                                                                                                                    I never understood the hate for Git’s CLI. You can learn 99% of what you need to know on a daily basis in a few hours. That’s not a bad time investment for a pivotal tool that you use multiple times every day. I don’t expect a daily driver tool to be intuitive, I expect it to be rock-solid, predictable, and powerful.

                                                                                                                                                                                    1. 9

                                                                                                                                                                                      This is a false dichotomy: it can be both (as Mercurial is). Moreover, while it’s true that you can learn the basics to get by with in a few hours, it causes constant low-level mental overhead to remember how different commands interact, what the flag is in this command vs. that command, etc.—and never mind that the man pages are all written for people thinking in terms of the internals, instead of for general users. (That this is a common failing of man pages does not make it any less a problem for git!)

                                                                                                                                                                                      One way of saying it: git has effectively zero progressive disclosure of complexity. That makes it a continual source of paper cuts at minimum unless you’ve managed to actually fully internalize not only a correct mental model for it but in many cases the actual implementation mechanics on which it works.

                                                                                                                                                                                      1. 3

                                                                                                                                                                                        Its manpages are worthy of a parody: https://git-man-page-generator.lokaltog.net

                                                                                                                                                                                      2. 2

                                                                                                                                                                                        Its predecessors CVS and svn had much more intuitive commands (even if they were clumsy to use in other ways). DARCS has been mentioned many times as being much easier to use as well. People migrating from those tools really had a hard time, especially because git changed the meanings of some commands, like checkout.
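                                                                                                                                                                                        Checkout is a good example: in CVS and svn it meant “fetch a working copy”, while in git the same command name covers several unrelated operations (illustration only; the branch and tag names are invented):

                                                                                                                                                                                            git checkout my-branch      # switch to an existing branch
                                                                                                                                                                                            git checkout -b new-branch  # create a branch and switch to it
                                                                                                                                                                                            git checkout -- foo.c       # throw away local changes to a file
                                                                                                                                                                                            git checkout v1.2.0         # detach HEAD at a tag

                                                                                                                                                                                        (Newer git versions split some of this into git switch and git restore, which at least acknowledges the problem.)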

                                                                                                                                                                                        Then there were other tools that came up around the same time as git or shortly after, like hg and bzr, which never got git’s popularity but were much more pleasant to use as well.

                                                                                                                                                                                        1. 2

                                                                                                                                                                                          I think the issues people have are less about the CLI itself and more about how it interfaces with the (for some developers) complex and hard to understand concepts at hand.

                                                                                                                                                                                          Take rebase for example. Once you grok what it is, it’s easy, but trying to explain the concept of replaying commits on top of others to someone used to old school tools like CVS or Subversion can be a challenge, especially when they REALLY DEEPLY don’t care and see this as an impediment to getting their work done.
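                                                                                                                                                                                          A minimal illustration of what “replaying” means, as a sketch (branch names invented):

                                                                                                                                                                                              git checkout feature
                                                                                                                                                                                              git rebase main   # take each commit unique to 'feature' and re-apply it on top of main's tip

                                                                                                                                                                                          The result is the same changes recorded as brand-new commits based on main, which is exactly the part that feels alien if you’re coming from CVS or Subversion.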

                                                                                                                                                                                          I’m a former release engineer, so I see the value in the magic Git brings to the table, but it can be a harder sell for some :)

                                                                                                                                                                                        2. 5

                                                                                                                                                                                          The interface is pretty bad.

                                                                                                                                                                                          I would argue that this is one of the main reasons for git’s success. The CLI is so bad that people were motivated to look for tools to avoid using it. Some of them were motivated to write tools to avoid using it. There’s a much richer set of local GUI and web tools than I’ve seen for any other revision control system and this was true even when git was still quite new.

                                                                                                                                                                                          I never used a GUI with CVS or Subversion, but I wanted to as soon as I started touching the git command line. I wanted features like PRs and web-based code review, because I didn’t want to merge things locally. I’ve subsequently learned a lot about how to use the git CLI and tend to use it for a lot of tasks. If it had been as good as, say, Mercurial’s from the start then I never would have adopted things like gitx / gitg and GitHub and it’s those things that make the git ecosystem a pleasant place to be.

                                                                                                                                                                                          1. 4

                                                                                                                                                                                            The interface of Git and its underlying data models are two very different things, that are best treated separately.

                                                                                                                                                                                            Yes a thousand times this! :) Git’s data model has been a quantum leap for people who need to manage source code at scale. Speaking as a former release engineer, I used to be the poor schmoe who had to conduct Merge Day, where a branch gets merged back to main.

                                                                                                                                                                                            There was exactly one thing you could always guarantee about merge day: There Will Be Blood.

                                                                                                                                                                                            So let’s talk about looking past git’s god awful interface, but keep the amazing nubbins intact and doing the nearly miraculous work they do so well :)

                                                                                                                                                                                            And I don’t just mean throwing a GUI on top either. Let’s rethink the platonic ideal for how developers would want their workflow to look in 2022. Focus on the common case. Let the ascetics floating on a cloud of pure intellect script their perfect custom solutions, but make life better for the “cold dark matter” developers which are legion.

                                                                                                                                                                                            1. 2

                                                                                                                                                                                              I would say that you simultaneously give credit where it is not due (there were multiple DVCSes before Git, and approximately every one of them had a better data model; and then there are things that Subversion still, somehow, does better than everyone else) and ignore the part that actually made your life easier: the effort Linus Torvalds spent pushing Git down people’s throats, which was orders of magnitude more of his time than he spent on getting things right beyond basic workability in Git.

                                                                                                                                                                                              1. 1

                                                                                                                                                                                                Not a DVCS expert here, so would you please consider enlightening me? Which earlier DVCSes were forgotten?

                                                                                                                                                                                                My impressions of Mercurial and Bazaar are that they were SL-O-O-W, but they’re just anecdotal impressions.

                                                                                                                                                                                                1. 3

                                                                                                                                                                                                  Well, Bazaar is technically earlier. Monotone is significantly earlier. Monotone has a quite interesting and nicely decoupled data model in which the commit DAG is just one thing; the changelog, the author, and the branches are not parts of a commit but separately stored claims about a commit, and this claim system is extensible and queryable. And of course Git was Linus Torvalds speedrunning an implementation of the parts of BitKeeper he really, really needed.

                                                                                                                                                                                                  It might be that, in the old days, running on Python limited the speed of both Mercurial and Bazaar. Rumour has it that the Monotone version Torvalds found too slow was actually suffering a performance regression (they had one particularly slow release at around that time; Monotone is not written in Python).

                                                                                                                                                                                                  Note that one part of what makes Git fast is that it bakes in some optimisations that systems like Monotone make optional (it is quite optimistic about how quickly you can decide that a file must not have been modified, for example). Another is that it was originally only intended to be FS-safe on ext3… and then everyone forgot to care, so now it is quite likely to break the repository in case of an unclean shutdown mid-operation. Yes, I have damaged repositories that way, to a state where I could not find any advice short of re-cloning for getting back even a partially working repository.

                                                                                                                                                                                                  As for Subversion, it has narrow checkouts, which are a great feature; DVCSes could also have them, but I don’t think any of them does properly. You can kind of hack something together with remote-automate in Monotone, but probably flakily.

                                                                                                                                                                                            2. 4

                                                                                                                                                                                              Let the data model pretend there’s a blob for each version of that huge file, even though in fact the software is automatically compressing & decompressing things under the hood.

                                                                                                                                                                                              Ironically, that’s part of the performance problem – compressing the packfiles tends to be where things hurt.
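                                                                                                                                                                                              For instance (a rough illustration; timings obviously depend on the repository), the expensive part is the repack that recompresses everything:

                                                                                                                                                                                                  git count-objects -v    # see how much data is sitting in packs
                                                                                                                                                                                                  git gc --aggressive     # recompress the packfiles with more effort; can take a very long time on big repos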

                                                                                                                                                                                              Still, this is definitely a solvable problem.

                                                                                                                                                                                              1. 2

                                                                                                                                                                                                I used to love DARCS, but I think patch theory was probably the wrong choice.

                                                                                                                                                                                                I created and maintain the official test suite for Pijul; I am the happiest user ever.

                                                                                                                                                                                                1. 2

                                                                                                                                                                                                  Hmm, knowing you I’m sure you’ve tested it to death.

                                                                                                                                                                                                  I guess they got rid of the exponential conflict resolution that plagued DARCS? If so, perhaps I should give patch theory another go. Git ended up winning the war before I got around to actually studying patch theory; maybe it is sounder than I thought.

                                                                                                                                                                                                  1. 1

                                                                                                                                                                                                    Pijul is a completely different thing than Darcs, the current state of a repository in Pijul is actually a special instance of a CRDT, which is exactly what you want for a version control system.

                                                                                                                                                                                                    Git is also a CRDT, but HEAD isn’t (unlike in Pijul); the CRDT in Git is the entire history, and that is not a very useful property.

                                                                                                                                                                                                  2. 1

                                                                                                                                                                                                    Best test suite ever. Thanks again, and again, and again for that. It also helped debug Sanakirja, a database engine used as the foundation of Pijul, but usable in other contexts.

                                                                                                                                                                                                  3. 2

                                                                                                                                                                                                    There are git-compatible alternatives that keep the underlying model and change the interface. The most prominent of these is probably gitless.

                                                                                                                                                                                                    1. 1

                                                                                                                                                                                                      I’ve been using git entirely via a UI because of that. Much better overview, much more intuitive, fewer unwanted side effects.

                                                                                                                                                                                                      1. 1

                                                                                                                                                                                                        You can’t describe Git without discussing rebase and merge: these are the two most common operations in Git, yet they don’t satisfy any interesting mathematical property such as associativity or symmetry:

                                                                                                                                                                                                        • Associativity matters when you merge commits one by one from a remote branch: doing so should intuitively give the same result as merging the remote HEAD in one go, but Git manages to make it different sometimes (see the sketch after this list). When that happens, your lines can be shuffled around more or less randomly.

                                                                                                                                                                                                        • Symmetry means that merging A and B is the same as merging B and A. In Git, two coauthors doing the same conflictless merge might end up with different results. This is one of the main benefits of GitHub: merges are never done concurrently when you use a central server.
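                                                                                                                                                                                                        Concretely, these two sequences can end up with different contents in unlucky cases (a sketch; suppose origin/main is three commits ahead of you):

                                                                                                                                                                                                            # merging the remote commits one by one
                                                                                                                                                                                                            git merge origin/main~2
                                                                                                                                                                                                            git merge origin/main~1
                                                                                                                                                                                                            git merge origin/main

                                                                                                                                                                                                            # versus merging the remote HEAD directly
                                                                                                                                                                                                            git merge origin/main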

                                                                                                                                                                                                        1. 1

                                                                                                                                                                                                          Well, at least this is not the fault of the data model: if you have all the snapshots, you can deduce all the patches. It’s the operations themselves that need fixing.

                                                                                                                                                                                                          1. 1

                                                                                                                                                                                                            My point is that this is a common misconception: no datastructure is ever relevant without considering the common operations we want to run on it.

                                                                                                                                                                                                            For Git repos, you can indeed deduce all the patches, but merge and rebase can’t be fixed while keeping reasonable performance, since the merge problem Git tries to solve (“merge the HEADs, knowing their youngest common ancestor”) is the wrong one: that problem simply doesn’t carry enough information to satisfy basic intuitive properties.

                                                                                                                                                                                                            The only way to fix it is to fetch the entire sequence of commits from the common ancestor. This is certainly doable in Git, but merges become O(n) in time complexity, where n is the size of history.

                                                                                                                                                                                                            The good news is, this is possible. The price to pay is a slightly more complex datastructure, slightly harder to implement (but manageable). Obviously, the downside is that it can’t be consistent with Git, since we need more information. On the bright side, it’s been implemented: https://pijul.org

                                                                                                                                                                                                            1. 1

                                                                                                                                                                                                              no datastructure is ever relevant without considering the common operations we want to run on it.

                                                                                                                                                                                                              Agreed. Now, how often do we actually merge stuff, and how far is the common ancestor in practice?

                                                                                                                                                                                                              My understanding of the usage of version control is that merging two big branches (with an old common ancestor) is rare. Far more often we merge (or rebase) work units with just a couple of commits. Even more often than that, we have one commit that’s late, so we just pull in the latest change and then merge or rebase that one commit. And there are the checkout operations, which in some cases occur most frequently. While a patch model would no doubt facilitate merges, it may not be worth the cost of making other, arguably more frequent, operations slower.

                                                                                                                                                                                                              (Of course, my argument is moot until we actually measure. But remember that Git won in no small part because of its performance.)

                                                                                                                                                                                                              1. 2

                                                                                                                                                                                                                I agree with all that, except that:

                                                                                                                                                                                                                • the only proper modelling of conflicts, merges and rebases/cherry-picking I know of (Pijul) can’t rely on common ancestors only, because rebases can make some future merges more complex than a simple 3-way merge problem.

                                                                                                                                                                                                                • I know many engineers are fascinated by Git’s speed, but the algorithm running on the CPU is almost never the bottleneck: the operator’s brain is usually much slower than the CPU in any modern version control system (even Darcs has fixed its exponential merge). Conflicts do happen, as do cherry-picks and rebases. They aren’t rare in large projects, and they can be extremely confusing without proper tools. Making these algorithms fast is IMHO much more important from a cost perspective than gaining 10% on an operation that already takes less than 0.1 seconds. I won’t deny the facts though: if Pijul isn’t used more in industry, it could be partly because that opinion isn’t widely shared.

                                                                                                                                                                                                                • some common algorithmic operations in Git are slower than in Pijul (pijul credit is much faster than git blame on large instances), and most operations are comparable in speed. One thing where Git is faster is browsing old history: the datastructures are ready in Pijul, but I haven’t implemented the operations yet (I promised I would do that as soon as this is needed by a real project).