1. 27
  1. 4

    I agree somewhat with the conclusion, but not the argument.

    Ok, there’s no free software implementation to verify against - but so what? If it breaks, it breaks; It’s then on NVidia, or someone else who cares, to fix it. You could easily continue development under the general contract that you don’t have to test the EGLStreams support. I understand that developers will generally feel uneasy about the possibility that they are breaking things without being able to verify that they aren’t, but it wouldn’t be that hard a pill to swallow. You’d quickly find out whether the code was going to be properly maintained, and if it wasn’t, you could rip it right back out. That leaves the main argument as only that you’re likely wasting effort because you don’t think it will be maintained, but the fact that the code was submitted in the first place is some evidence that someone cares enough that it works.

    In my view, a much better technical argument is that supporting two different APIs, one of which is single-vendor, is a maintenance burden regardless of whether you have to test it - you still have to compile it, and changes may break compilation of the EGLStreams component, which then has to be resolved. In particular, this makes general changes to a higher-level abstraction (which presumably exists as a layer over the two underlying implementations, i.e. EGLStreams and GBM) more costly.
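
    To make that more concrete, here’s a rough sketch of the kind of higher-level abstraction I mean (the class and method names are my own invention for illustration, not actual KWin or wlroots code). Every change to the base interface has to be answered by both backends, and for most contributors the EGLStreams one can only ever be compile-tested:

    ```cpp
    // Hypothetical sketch of a compositor's buffer-backend abstraction.
    // Names are invented for illustration; this is not actual KWin/wlroots code.
    #include <cstdint>

    struct Frame {
        uint32_t fb_id; // handle the display/KMS code can scan out
    };

    class BufferBackend {
    public:
        virtual ~BufferBackend() = default;
        virtual bool init(int drmFd) = 0;
        virtual Frame renderFrame() = 0;
        // Any method added here (say, for direct scan-out or screen capture)
        // needs an answer from *both* subclasses before it can ship.
    };

    class GbmBackend : public BufferBackend {        // mesa / open drivers
    public:
        bool init(int drmFd) override { /* gbm_create_device(drmFd), ... */ return true; }
        Frame renderFrame() override { return {}; }
    };

    class EglStreamsBackend : public BufferBackend { // NVIDIA proprietary driver
    public:
        bool init(int drmFd) override { /* eglCreateStreamKHR(...), ... */ return true; }
        Frame renderFrame() override { return {}; }
    };
    ```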

    (I don’t know enough about the two APIs to make a technical judgement regarding which is better, but when it appears that neither API can be implemented in terms of the other, I worry that they are both flawed).

    On the political side: I don’t think trying to force NVidia to use an API that they have persistently insisted is not suitable is really being on the right side of the argument. It’s reasonable to not expend effort supporting a proprietary API, but not to refuse to allow the vendor to provide their own support for that API. And if that, or part of it, comes in the form of closed source software, let the people who feel strongly enough about it choose the competitor with the more open alternative (and let the problems caused by closed source components cause the frustration that they do; that’s the choice of the user). Would I (personally) prefer that NVidia release and support open-source drivers? Yes, I would, but only because I’d rather have more options for hardware with open drivers, not because I begrudge NVidia their choice of providing closed drivers. The latter I solve easily by buying only AMD graphics cards (or using integrated Intel graphics; and yes, there may be a few other options).

    And it’s all very well to offer to buy a competitor’s graphics card for “anyone who works on Linux graphics” but that is pretty meaningless for regular consumers who just want the graphics card that they bought to work as it should. If, on some level, NVidia seem to be taking steps to make that the case, then I don’t see why they should be discouraged.

    Ultimately, that leaves the maintenance burden issue, and I think it could easily be argued that this is a sufficient reason not to merge the code. But I personally think it’s the only valid reason in this case.

    1. 4

      This. The wlroots people have an extreme stance against merging patches that they themselves won’t maintain / patches that could become dead code. I really don’t agree with them on that. Merge all the things, delete later if it’s abandoned by the original author. Linux merged the AMD Display Core after all, and no one outside of AMD is going to maintain it, and it’s fine. And here, it’s a much simpler, tiny patch.

      The conclusion… well… so there are two goals – pressuring Nvidia into being a real open source citizen and making Wayland widely adopted. Pretty much everyone would like both to happen, but which to prioritize? I’m leaning towards the latter. I think getting desktop users off Xorg ASAP might be more important after all.

      1. 2

        Merge all the things, delete later if it’s abandoned by the original author.

        Even if it works for Linux, it might not work well for other projects like wlroots.

        First, Linux is a much older project; wlroots has many more breaking changes. Second, having code you don’t maintain in your tree makes it impossible to refactor code (or very expensive). Third, this code can prevent new features from being added. For instance, it’s not clear how EGLStreams can work with multi-GPU setups, direct scan-out and efficient screen capture.

        pressuring Nvidia into being a real open source citizen and making Wayland widely adopted

        I personally think it’s better to have a sane ecosystem. There will always be X11 users, and that’s not harmful. Users should stay on X11 if they don’t want to switch to Wayland.

        1. 2

          having code you don’t maintain in your tree makes it impossible to refactor code (or very expensive)

          Not necessarily. If you have a well defined stable interface, and someone adds an implementation of that interface, it’s not going to be a huge problem.

          KWin is written in C++ and it seems like they have already planned for GBM alternatives — there was an abstract base class for various backends, right?

          can prevent new features from being added

          Usually not a big problem to add features supported on one backend but not the other.
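
          Something like the usual capability-query pattern covers it (purely illustrative; these aren’t KWin’s real class or method names):

          ```cpp
          // Illustrative only; names are invented, not taken from KWin.
          class BufferBackend {
          public:
              virtual ~BufferBackend() = default;
              // New features default to "unsupported"; backends opt in when ready.
              virtual bool supportsDirectScanout() const { return false; }
          };

          class GbmBackend : public BufferBackend {
          public:
              bool supportsDirectScanout() const override { return true; }
          };

          class EglStreamsBackend : public BufferBackend {
              // Inherits the default 'false' until someone wires it up for EGLStreams.
          };
          ```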

          Users should stay on X11 if they don’t want to switch to Wayland

          Sure, but

          • many do want to, but nvidia + compositor authors made the choice for them
          • most casual users don’t care about protocols (but do care about benefits like proper HiDPI and zero tearing), and the distro maintainers made the choice for them
          1. 1

            If you have a well defined stable interface, and someone adds an implementation of that interface, it’s not going to be a huge problem.

            If this were the case, how would there be any benefit to merging the implementation?

            1. 1

              Not like an external interface. They have an abstract class it seems.

              1. 1

                So either:

                • The interface of the abstract class is considered stable, in which case it could be exposed as an external interface and the NVidia implementation could remain external.
                • The interface of the abstract class is considered experimental, and will want to evolve as development continues, in which case having a bunch of untestable code tied to it is a burden on anyone trying to change it.
                1. 1

                  I think that’s a bit of a false dichotomy. The interface might be mostly stable - enough so that major changes aren’t expected and that ongoing development should be minimally impacted, but not necessarily enough that no minor changes will need to be made to implementations. Undoubtedly there will be some maintenance cost from incorporating a second implementation, but if that reduces overall development burden across what would otherwise be two separate trees, and if the functionality added is worth the cost - then incorporation seems the clear choice. (I’m not convinced this is necessarily the case here, as I noted above).

                  1. 1

                    I agree that there is some middle ground between the two in terms of interface stability, but somewhere you have to draw a line between merging vs. the implementation remaining external.

                    As long as changes to the interface are minor, it shouldn’t be difficult for the external implementation to stay in sync, such that you rarely or never get an actual release where it doesn’t work.

                    The main benefit to having the implementation merged would be that developers could check their changes don’t break things, or fix minor breakages as they go. If the code isn’t testable, that benefit isn’t really there.

                    The problem with trying to reduce the ‘overall development burden’ in this way is that in doing so you shift some of the burden from a multi-billion dollar company to a much smaller group, many of whom are volunteers.

      2. 3

        If it breaks, it breaks; It’s then on NVidia, or someone else who cares, to fix it.

        No, it’s ultimately on the person/group distributing the software, not the original patch owner, to fix things. At least that’s where most of the pressure is. Good luck deferring responsibility in that case, when users do not understand the history/technical details.

        but the fact that the code was submitted in the first place is some evidence that someone cares enough that it works.

        Companies throw code over the wall all the time, and subject its maintenance to the whims of their current corporate agendas (which almost always change with the seasons).

        And it’s all very well to offer to buy a competitor’s graphics card for “anyone who works on Linux graphics” but that is pretty meaningless for regular consumers who just want the graphics card that they bought to work as it should.

        Regular consumers don’t work on Linux graphics. It seems like the author wants to encourage those who do work on Linux graphics to work on supporting devices from folks that are friendly towards Linux graphics.

        1. 2

          No, it’s ultimately on the person/group distributing the software, not the original patch owner, to fix things. At least that’s where most of the pressure is

          Yes, there might be some pressure in the sense that you get users complaining about things not working. But again, if the code isn’t maintained, you rip it out. And the pressure may well exist even if you don’t merge the code - “hey why is my NVidia card dog slow in KWin? Why doesn’t it support effects?” - in either case you have to try and pass the blame to NVidia. Much easier to do that when you can say “they have a proprietary driver with a proprietary API, they provided support code for it but it is buggy” rather than “we don’t support that API”. From a user perspective, NVidia looks more to blame in the first case.

          Companies throw code over the wall all the time, and subject its maintenance to the whims of their current corporate agendas (which almost always change with the seasons).

          “We won’t accept code because you might not maintain it” is a terrible argument.

          Regular consumers don’t work on Linux graphics.

          That was my point. They still want their hardware to work.

          1. 2

            Hey thanks for the response.

            But again, if the code isn’t maintained, you rip it out.

            “We won’t accept code because you might not maintain it” is a terrible argument.

            Is it though? KWin has a lot of users (compared to many DE/community projects), so the impact of accepting something which could stop being maintained is greater. Why not require new contributors who want to submit API changes to demonstrate that they can maintain code when they say they would? Why is nvidia automatically a “trusted contributor” in this case? Accepting code, dealing with issues/bugs (even if you attempt to pass blame), and reverting it still takes a non-zero amount of man hours.

            1. 2

              “We won’t accept code because you might not maintain it” is a terrible argument.

              Is it though?

              Well - “companies throw code over the wall all the time, [and then don’t maintain it]” can just as easily be turned around: Companies hand over code all the time and go on to do a great job of maintaining it. But what actually is the impact of accepting this code, if it ceases to be maintained? I’d say, again, that you just have the option of ripping out the unmaintained code.

              Why not require new contributors who want to submit API changes to demonstrate that they can maintain code when they say they would?

              As I understand it, NVidia aren’t submitting API changes here. They’re submitting an alternative implementation of parts of KWin, presumably selectable at runtime (and hopefully possible to disable at compile time), which makes use of an alternative underlying API.
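
              Roughly, “selectable at runtime, disableable at compile time” would look something like this (a guess at the shape of it; the macro and class names are invented, not KWin’s actual build options or code):

              ```cpp
              // Hypothetical sketch of backend selection; all names are invented
              // for illustration and are not taken from KWin.
              #include <memory>
              #include <string>

              // Minimal stand-ins so the sketch is self-contained.
              struct BufferBackend { virtual ~BufferBackend() = default; };
              struct GbmBackend : BufferBackend {};
              struct EglStreamsBackend : BufferBackend {};

              std::unique_ptr<BufferBackend> createBackend(const std::string &requested)
              {
              #ifdef HAVE_EGLSTREAMS               // compile-time switch: off means no EGLStreams
                  if (requested == "eglstreams")   // code at all; on means it must keep compiling
                      return std::make_unique<EglStreamsBackend>();
              #endif
                  return std::make_unique<GbmBackend>(); // default: the open-driver GBM path
              }
              ```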

              The suggestion of asking them to maintain the necessary changes out-of-tree for a bit perhaps isn’t so bad (it wasn’t clear that this is what you were actually arguing for). Personally, in this situation, assuming I thought the code should be merged at all, I’d prefer just merging it and hoping for the best; goodwill goes both ways (and I think the downsides are minimal, so the risk/reward trade-off is generally favourable). But I’m not totally against an out-of-tree-maintainership test, conceptually; I just also happen to think it probably does make sense to reject the changes outright, in this case.

              As for NVidia automatically being a trusted contributor, I think it’d be unusual to ask any new contributor to maintain their changes out-of-tree, but if that’s a standard practice, then sure, ask NVidia to do it too. If you don’t do that as standard practice, but ask NVidia to do it (without having any particular reason to suspect they will be any worse than any other contributor), then you’re automatically making them an “untrusted contributor” which also seems unfair.

              Accepting code, dealing with issues/bugs (even if you attempt to pass blame), and reverting it still takes a non-zero amount of man hours.

              Agreed! - But it feels like you’re starting from a position that this is likely what will happen, and I don’t think that’s necessarily fair; if it wasn’t for the fact that we were talking about an API which only has a proprietary implementation, I don’t think this question would’ve come up. Or if it had, it would’ve been not so much about “will it be maintained” but “is it worth the burden of having extra code and additional abstraction necessary to support two APIs”, which I think is the legitimate question here.

        2. 2

          In my view, a much better technical argument is that supporting two different APIs, one of which is single-vendor, is a maintenance burden regardless of whether you have to test it - you still have to compile it, and changes may break compilation of the EGLStreams component, which then has to be resolved. In particular, this makes general changes to a higher-level abstraction (which presumably exists as a layer over the two underlying implementations, i.e. EGLStreams and GBM) more costly.

          Exactly - THIS is an argument I can get behind. I have Nvidia hardware in my laptop. I don’t love that they use binary drivers, but forcing everyone to use a different API from the accepted standard is at the very least worthy of some discussion and debate.

        3. 2

          With much respect for Drew, who does amazing work with Sway - this is precisely the kind of thinking that keeps Linux on the desktop a fringe market with (in the grand scheme) very limited appeal.

          AMD has made great strides of late, but Nvidia still owns a very sizable portion of the market.

          Yes, everyone hates that Nvidia isn’t AMD and doesn’t open source their drivers, forcing Linux folks to interoperate with an ABI.

          Can we please, pretty please with sugar on top, get over ourselves for just a moment and recognize that the teeming droves of PC owners with Nvidia hardware want to enjoy running Linux and have decided that utilizing a proprietary binary blob is a reasonable trade-off to make? Why not honor that choice?

          1. 2

            The main issue with the proprietary driver is that it’s using its own special EGLStreams API, whereas all other drivers use GBM.
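
            For anyone wondering what that difference looks like in code, here’s a heavily simplified sketch of the two paths (my own rough illustration: DRM/KMS setup, error handling and capability checks are omitted, and the sizes/formats are placeholders):

            ```cpp
            // Heavily simplified sketch; not production code.
            #include <gbm.h>
            #include <EGL/egl.h>
            #include <EGL/eglext.h>

            // GBM path (mesa / open drivers): the compositor allocates buffers itself
            // and later hands each rendered frame to KMS for scan-out on its own terms.
            EGLSurface createGbmSurface(int drmFd, EGLDisplay dpy, EGLConfig cfg)
            {
                gbm_device  *gbm  = gbm_create_device(drmFd);
                gbm_surface *surf = gbm_surface_create(gbm, 1920, 1080, GBM_FORMAT_XRGB8888,
                                                       GBM_BO_USE_SCANOUT | GBM_BO_USE_RENDERING);
                // Per frame: gbm_surface_lock_front_buffer() -> drmModeAddFB2() -> page flip.
                return eglCreateWindowSurface(dpy, cfg, (EGLNativeWindowType)surf, nullptr);
            }

            // EGLStreams path (NVIDIA): buffers stay inside the driver; the compositor
            // connects a producer surface to a consumer (the display output layer) and
            // never touches the buffers or the page flip directly.
            EGLSurface createStreamSurface(EGLDisplay dpy, EGLConfig cfg, EGLOutputLayerEXT layer)
            {
                auto createStream = (PFNEGLCREATESTREAMKHRPROC)
                    eglGetProcAddress("eglCreateStreamKHR");
                auto consumerOutput = (PFNEGLSTREAMCONSUMEROUTPUTEXTPROC)
                    eglGetProcAddress("eglStreamConsumerOutputEXT");
                auto producerSurface = (PFNEGLCREATESTREAMPRODUCERSURFACEKHRPROC)
                    eglGetProcAddress("eglCreateStreamProducerSurfaceKHR");

                const EGLint streamAttribs[]  = { EGL_NONE };
                const EGLint surfaceAttribs[] = { EGL_WIDTH, 1920, EGL_HEIGHT, 1080, EGL_NONE };

                EGLStreamKHR stream = createStream(dpy, streamAttribs);
                consumerOutput(dpy, stream, layer);  // the display output consumes frames
                return producerSurface(dpy, cfg, stream, surfaceAttribs);
            }
            ```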

            1. 3

              From the outside, this feels like a confusing mess.

              GBM has become the de facto standard for Wayland, it would seem, but EGLStreams seems like a bona fide specification.

              Do we have any sense of why it’s important enough for Nvidia to fight this fight and why someone can’t just write an abstraction layer that makes one talk to the other or some such and keep everybody happy?