1. 50
  1.  

  2. 15

    Pointless posturing and an inflammatory piece of nonsense all in one (we’re talking about ~300 lines of code here).

    The whole schtick that Nvidia does this out of spite is just short-fused idiocy. It doesn’t take much digging into the presentations Nvidia has held at XDC2016/2017 (or just asking aritger/cubisimo, … directly) to understand that there are actual, and highly relevant, technical concerns that make the GBM approach terrible. The narrative Drew and others are playing with here - that there’s supposedly a nice “standard” that Nvidia just doesn’t want to play nice with - is a convenient one rather than a truthful one.

    So what’s this actually about?

    First, this is not strictly part of Wayland vs X11. Actually, there’s still no accelerated buffer-passing subprotocol accepted as ‘standard’ into Wayland; the only buffer transfer mechanism in the stable set is shared memory. Also, the GBM buffer subsystem is part of Xorg/DRI3, so the fundamental problem applies there as well.

    Second, this only covers the buffer-passing part of the stack. The other bit - the API for how you probe and control displays (KMS) - is the same, and even with the Nvidia blobs you can use this interface.

    Skipping to what’s actually relevant - the technical arguments - the problem they both (EGLStreams, GBM) try to address is passing some kind of reference to a GPU-bound resource from a producer to a consumer, in a way that the consumer can actually use as part of its accelerated drawing pipeline. This becomes hairy mainly because the device that produces the contents is not necessarily the same as the device that will consume it - and both need to agree on the format of whatever buffer is being passed.

    Android has had this sorted for a long while (gralloc + BufferQueues); the same goes for iOS/OSX (IOSurface) and Windows. It’s harder to fix in the Linux ecosystem due to the interactions with related subsystems that also need to accept whatever approach you pick. For instance, you want this to work for video4linux, so that a buffer from a camera or an accelerated video decoding device can be scanned out to a display directly, without a single unnecessary conversion or copy step. To be able to figure out how to do this, you pretty much need control and knowledge of the internal storage and scanout formats of the related devices, and with it comes a ton of politics.
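    To make the video4linux case concrete: one kernel-side building block there is exporting a capture buffer as a dma-buf file descriptor, which the GPU or scanout side can then import. A minimal sketch, assuming a memory-mapped capture queue has already been set up; the buffer index is a placeholder and error handling beyond the ioctl return is omitted:

    ```c
    /* Export one already-requested V4L2 capture buffer as a dma-buf fd
     * (VIDIOC_EXPBUF), so a GPU or scanout consumer can import it without
     * any conversion or copy. Assumes VIDIOC_REQBUFS with V4L2_MEMORY_MMAP
     * has been done beforehand. */
    #include <fcntl.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/videodev2.h>

    int export_capture_buffer(int v4l2_fd, unsigned index)
    {
        struct v4l2_exportbuffer exp;
        memset(&exp, 0, sizeof exp);
        exp.type  = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        exp.index = index;          /* which of the queued buffers to export */
        exp.flags = O_CLOEXEC;

        if (ioctl(v4l2_fd, VIDIOC_EXPBUF, &exp) < 0)
            return -1;              /* driver may not support dma-buf export */
        return exp.fd;              /* a dma-buf fd, ready for EGL/KMS import */
    }
    ```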

    GBM passes one or several file descriptors (contents like planar YUV video can have one per plane) from producer to consumer, with a side channel for metadata (plane sizes, formats, …) describing the properties of those descriptors. The consumer side collects all of these, binds them into an opaque resource (your texture) and then draws with it. When a new set is waiting, you send a release on the ones you had and switch to the new set.
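    On the consumer side that looks roughly like the following - a minimal sketch assuming a single-plane buffer and the EGL_EXT_image_dma_buf_import path; the fd, dimensions, stride and fourcc are whatever arrived over the metadata side channel, multi-planar formats just add PLANE1/PLANE2 attributes, and error handling is left out:

    ```c
    /* Wrap a received dma-buf fd in an EGLImage and bind it as an external
     * GL texture. Assumes EGL_EXT_image_dma_buf_import and
     * GL_OES_EGL_image_external are available. */
    #include <stdint.h>
    #include <EGL/egl.h>
    #include <EGL/eglext.h>
    #include <GLES2/gl2.h>
    #include <GLES2/gl2ext.h>

    GLuint import_dmabuf(EGLDisplay dpy, int fd, int width, int height,
                         int stride, uint32_t fourcc)
    {
        const EGLint attribs[] = {
            EGL_WIDTH,                     width,
            EGL_HEIGHT,                    height,
            EGL_LINUX_DRM_FOURCC_EXT,      (EGLint)fourcc,
            EGL_DMA_BUF_PLANE0_FD_EXT,     fd,
            EGL_DMA_BUF_PLANE0_OFFSET_EXT, 0,
            EGL_DMA_BUF_PLANE0_PITCH_EXT,  stride,
            EGL_NONE
        };

        /* Extension entry points have to be fetched at runtime. */
        PFNEGLCREATEIMAGEKHRPROC create_image =
            (PFNEGLCREATEIMAGEKHRPROC)eglGetProcAddress("eglCreateImageKHR");
        PFNGLEGLIMAGETARGETTEXTURE2DOESPROC image_target_texture =
            (PFNGLEGLIMAGETARGETTEXTURE2DOESPROC)
                eglGetProcAddress("glEGLImageTargetTexture2DOES");

        EGLImageKHR img = create_image(dpy, EGL_NO_CONTEXT,
                                       EGL_LINUX_DMA_BUF_EXT, NULL, attribs);

        GLuint tex;
        glGenTextures(1, &tex);
        glBindTexture(GL_TEXTURE_EXTERNAL_OES, tex);
        image_target_texture(GL_TEXTURE_EXTERNAL_OES, (GLeglImageOES)img);
        return tex;   /* draw with it; release the old set when a new one lands */
    }
    ```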

    EGLStreams passes a single file descriptor, once. You bind this descriptor to a texture and then draw using it. Implicitly, the producer side signals when new contents are available, and the consumer side signals when it’s time to draw. The metadata, matchmaking and format conversions are kept opaque so that the driver can switch and decide based on the end use case (direct to display, composition, video recording, …) and what’s available.
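    The consumer side of that is correspondingly terse - a rough sketch assuming the EGL_KHR_stream, EGL_KHR_stream_cross_process_fd and EGL_KHR_stream_consumer_gltexture extensions are present; the function names are mine and error handling is omitted:

    ```c
    /* Turn the single fd received from the producer into an EGLStream,
     * latch it onto an external texture, then acquire/release frames
     * around each draw. */
    #include <EGL/egl.h>
    #include <EGL/eglext.h>
    #include <GLES2/gl2.h>
    #include <GLES2/gl2ext.h>

    GLuint attach_stream(EGLDisplay dpy, int stream_fd, EGLStreamKHR *out_stream)
    {
        PFNEGLCREATESTREAMFROMFILEDESCRIPTORKHRPROC create_stream =
            (PFNEGLCREATESTREAMFROMFILEDESCRIPTORKHRPROC)
                eglGetProcAddress("eglCreateStreamFromFileDescriptorKHR");
        PFNEGLSTREAMCONSUMERGLTEXTUREEXTERNALKHRPROC bind_consumer =
            (PFNEGLSTREAMCONSUMERGLTEXTUREEXTERNALKHRPROC)
                eglGetProcAddress("eglStreamConsumerGLTextureExternalKHR");

        *out_stream = create_stream(dpy, stream_fd);  /* the one fd, passed once */

        GLuint tex;
        glGenTextures(1, &tex);
        glBindTexture(GL_TEXTURE_EXTERNAL_OES, tex);
        bind_consumer(dpy, *out_stream);  /* bound texture becomes the consumer */
        return tex;
    }

    void draw_frame(EGLDisplay dpy, EGLStreamKHR stream)
    {
        PFNEGLSTREAMCONSUMERACQUIREKHRPROC acquire =
            (PFNEGLSTREAMCONSUMERACQUIREKHRPROC)
                eglGetProcAddress("eglStreamConsumerAcquireKHR");
        PFNEGLSTREAMCONSUMERRELEASEKHRPROC release =
            (PFNEGLSTREAMCONSUMERRELEASEKHRPROC)
                eglGetProcAddress("eglStreamConsumerReleaseKHR");

        acquire(dpy, stream);   /* latch the newest frame into the texture */
        /* ... draw with the external texture ... */
        release(dpy, stream);   /* hand the frame back to the driver */
    }
    ```

    Note how the metadata and format negotiation never shows up in the consumer’s code - that is the opaqueness described above.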

    1. 4

      The whole schtick that Nvidia does this out of spite is just short-fused idiocy. It doesn’t take much digging into the presentations Nvidia has held at XDC2016/2017 (or just asking aritger/cubisimo, … directly) to understand that there are actual, and highly relevant, technical concerns that make the GBM approach terrible.

      I think they may have had more of a point if they had been pushing EGLStreams back when these APIs were initially under discussion, before everyone else had gone the GBM route. And I question the technical merit of EGLStreams - it’s not like the Linux graphics community is full of ignorant people who are pushing for the wrong technology because… well, I don’t know what reason you think we have. And at XDC, did you miss the other talk by nouveau outlining how much Nvidia is still a thorn in their side and is blocking their work?

      The narrative Drew and others are playing with here - that there’s supposedly a nice “standard” that Nvidia just doesn’t want to play nice with - is a convenient one rather than a truthful one.

      What, you mean the interfaces that literally every other gfx vendor implements?

      To be able to figure out how to do this, you pretty much need control and knowledge of the internal storage and scanout formats of the related devices, and with it comes a ton of politics.

      Necessary politics. A lot of subsystems are at stake here, with a lot of maintainers working on each. Instead of engaging with the process, Nvidia is throwing blobs/code over the wall and expecting everyone to change for them without being there to support them in doing so.

      Your explanation of how EGLStreams and GBM are different is accurate, but the drawbacks of GBM are not severe. It moves the responsibility for some stuff but this stuff is still getting done. Read through the wayland-devel discussions if you want to get a better idea of how this conversation happened - believe it or not, it did happen.

      1. 2

        I think they may have had more of a point if they had been pushing EGLStreams back when these APIs were initially under discussion, before everyone else had gone the GBM route. And I question the technical merit of EGLStreams - it’s not like the Linux graphics community is full of ignorant people who are pushing for the wrong technology because… well, I don’t know what reason you think we have.

        Well, the FOSS graphics “community” (as in loosely coupled, tiny, incestuous factions that just barely tolerate each other) pretty much represents the quintessential definition of an underdog in what is arguably the largest, most lock-in-happy control surface there is. The lack of information, hardware and human resources at every stage is sufficient explanation for the current state of affairs without claiming ignorance or incompetence. I don’t doubt the competence of the Nvidia driver teams in this regard either, and they hardly act out of spite or malice - their higher-level managers, on the other hand, are a different story, with about as broken a plot as the corresponding one at AMD or Intel in other areas where they are not getting their collective asses handed to them.

        Your explanation of how EGLStreams and GBM are different is accurate, but the drawbacks of GBM are not severe. It moves the responsibility for some stuff but this stuff is still getting done. Read through the wayland-devel discussions if you want to get a better idea of how this conversation happened - believe it or not, it did happen.

        I have, along with the handful of IRC channels where the complementary passive-aggressive annotations are being delivered (other recommended reading is the ‘luc verhaagen/libv’ blog, starting with https://libv.livejournal.com/27799.html - such a friendly bunch, n’est-ce pas?). What I’m blurting out comes from rather painful first-hand experience - dig through the code I’ve written on the subject and the related wayland-server stuff and you can see that I’ve done my homework; for every hour of code there’s a hidden hour of studying/reversing prior use. I’ve implemented both (and delivered evaluations on three others in a more NDA-bound setting), experimented with them since the initial proposal, and bothered Nvidia about it (~Dec 2014) - neither delivers what I think “we” need, but that’s a much longer thread.

        On GBM: the danger is splitting up an opaque ‘handle’ into a metadata-dependent set of file descriptors, passing the metadata over a side channel, and recombining them again with no opportunity for compositor-side validation at reassembly - validation for >type/completion<, for >synchronization< and for >context-of-use<. It’s “try and hope”, with failures delivered asynchronously as a crash or worse. There’s ample opportunity here for race conditions, DoS and/or type confusion (I’ve hit hard-to-reproduce issues of that kind, with wild writes in both GPU space and kernel space, in this very layer with Xorg/DRI3 bound as producer via a render node). Look at the implementation of wl_drm and tell me how that approach should be able to bring us back from buffer-bloat land to chase-the-beam producer-to-scanout (the VR/AR ideal and an 8k/16k HDR bandwidth-scaling necessity) - last I checked, Mesa could quad-buffer a single producer just to avoid tearing. FWIW, Nvidia are already there, on Windows.

        What, you mean the interfaces that literally every other gfx vendor implements?

        Yes, the Mali and Exynos support for GBM/KMS is just stellar – except that “implements” ranges from half-assed to straight-out polite gestures rather than actual efforts, so about the quality of compositor ‘implementations’ on the Wayland side. Out of 5 AMD cards I run in the test rig here right now, 2 actually sort of work if I don’t push them too hard or request connector-level features they are supposed to offer (10-bit + FreeSync + audio, for instance). To this day, Xorg is a more reliable “driver” than KMS/GBM - just disable the X protocol parts. I started playing with KMS in late 2012, and the experience has been the same every step of the way. If anything, gralloc/HWC is more of a widespread open “standard”. Then again, Collabora and friends couldn’t really excel at selling consulting for Android alternatives if they used that layer, now could they?

        Necessary politics. A lot of subsystems are at stake here, with a lot of maintainers working on each. Instead of engaging with the process, Nvidia is throwing blobs/code over the wall and expecting everyone to change for them without being there to support them in doing so.

        and

        I think they may have had more of a point if they had been pushing EGLStreams back when these APIs were initially under discussion, before everyone else had gone the GBM route

        Again, gralloc/hwc predates dma-buf and is half a decade into seamless multi-GPU handover/composition. As for Nvidia, recall that they agree on KMS as a sufficiently reasonable basis for initial device massage (even though it’s very much experimental-level quality: shoddy synchronisation, no portable hotplug detection, no mechanism for resource quotas, and clients can literally priority-invert the compositor at will) - it’s the buffer synchronisation and the mapping to accelerated graphics that is being disputed (likely to be repeated for Vulkan, judging by movements in that standard). Dri-devel are as much the Monty Python ‘knights who say NIH’ as anyone in this game: unless my memory is totally out of whack, EGLStreams as a mechanism for this problem stretches as far back as 2009, a time when Khronos still had “hopes” for a display server protocol. Dma-buf (GEM is just a joke without a punchline) was presented in 2012 via Linaro, got serious at XDC 2013, with Xorg integration via DRI”3000” by keithp the same year. Nvidia proposed Streams in 2014, and the real discussion highlighted issues over 2014-2015. The talks since then have been about how to actually move forward. Considering the “almost enough for a decent lunch” levels of resource allocation here, the pacing is just about on par with expectations.

    2. 5

      And proprietary driver users have the gall to reward Nvidia for their behavior by giving them hundreds of dollars for their GPUs, then come to me and ask me to deal with their bullshit for free. Well, fuck you, too.

      Boy, is he mad.

      Strangely enough, nVidia seems to be the only vendor pushing for HDR and DeepColor.

      1. 4

        Not to mention pushing performance; the 1080Ti and Titan Xp(p) may be overpriced and from an evil company, but those products have literally no competition at all.

        1. 1

          Yeah, but most people aren’t getting those; they’re getting cards with definite AMD equivalents, like the 1050/1060.

      2. 4

        Damn Nvidia - the open source driver gives me a black screen on nix, and I’m an i3 user who would love to try sway and Wayland.

        1. 3

          Sounds about right. I’m stuck with a 2K resolution on a 4K screen and no HDMI sound. Even picked a pricey GPU that’s supposed to be well supported. Can anyone suggest an AMD card that supports the above?

          1. 3

            I currently have an AMD RX580 and use dual 1440p monitors without issue. I have also tested another 2x 1080p simultaneously with those 2x 1440p, all at 60Hz, so it can definitely push the necessary pixels for 4K on Linux. You need a more recent Linux kernel (4.8 or above IIRC; I’m on 4.12) for the newer AMDGPU driver. There are still improvements to be made to certain codec playback (I get occasional tearing on VP9 at 1440p, for instance), but ever since the switch to the AMDGPU stack I have a lot of confidence in AMD fixing things as time goes on.

            My previous Nvidia card was supposed to support 2x 1440p, but wouldn’t run the second one at the correct resolution, instead limiting it to some odd resolution of 2160x1200 (iirc?). Would not work in nouveau, would not work in the latest proprietary drivers.

            As for HDMI audio on the RX 580, I am traveling but will try and test when I return home. I don’t know if I have an HDMI audio device though.

            1. 2

              I believe HDMI audio is still not available out of the box until a set of AMD patches (which have been around for quite a while now) is mainlined.

              https://www.phoronix.com/scan.php?page=news_item&px=AMDGPU-Linux-4.15-1

              1. 1

                Yeah, HDMI audio with the latest AMD cards got pushed back when their big code dump was refused back in the spring. They’ve been rewriting it to use kernel APIs instead of wrappers ever since, and the main code merge was accepted by Dave Airlie a week or two ago IIRC. So, barring some ironing out of wrinkles, 4.15 looks hopeful.

            2. 2

              I have an R9 390X that drives 3840x2160 + 1920x1080, all at 60Hz, in Linux without issue. I’d do some research, though; I don’t even come close to stressing it in Linux - it’s there so the Windows install on that machine can run games.

              No idea about sound; my monitors are connected over DisplayPort and DVI, respectively, and my speakers are plugged into my sound card.

              1. 1

                Even my old Intel i915/i5 testing setups can push 4K@60Hz, so it’s likely not about your card but rather about the combination of GPU, connector, cable and display, or about bandwidth saturation. KVMs and similar stuff in between can mess with the negotiation. For 4K@60Hz you want a chain capable of HDMI 2.0 or DisplayPort. It might simply be that Xorg probes the connector, gets the choice of 2K@60Hz or 4K@30Hz, and picks the former.
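                You can check what the kernel actually advertises per connector with a few lines of libdrm - a rough sketch, assuming /dev/dri/card0 is the right node; if 3840x2160@60 isn’t in the list, the limitation is somewhere in the link (connector/cable/EDID) rather than in Xorg’s pick:

                ```c
                /* List every mode the kernel exposes on each connected
                 * connector, via drmModeGetResources/drmModeGetConnector.
                 * Build with: gcc probe.c $(pkg-config --cflags --libs libdrm) */
                #include <fcntl.h>
                #include <stdio.h>
                #include <xf86drm.h>
                #include <xf86drmMode.h>

                int main(void)
                {
                    int fd = open("/dev/dri/card0", O_RDWR);  /* assumption: card0 */
                    drmModeRes *res = drmModeGetResources(fd);
                    if (!res)
                        return 1;

                    for (int i = 0; i < res->count_connectors; i++) {
                        drmModeConnector *c = drmModeGetConnector(fd, res->connectors[i]);
                        if (!c)
                            continue;
                        if (c->connection == DRM_MODE_CONNECTED) {
                            printf("connector %u:\n", c->connector_id);
                            for (int m = 0; m < c->count_modes; m++)
                                printf("  %ux%u@%uHz\n",
                                       (unsigned)c->modes[m].hdisplay,
                                       (unsigned)c->modes[m].vdisplay,
                                       (unsigned)c->modes[m].vrefresh);
                        }
                        drmModeFreeConnector(c);
                    }
                    drmModeFreeResources(res);
                    return 0;
                }
                ```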

                AMD cards won’t give you HDMI sound on the open source drivers unless you manually integrate their HAL patches. The relevant fixes are slowly trickling in, and chances are the support will get there in 4.15. But really, “recent” features like DisplayPort chaining and event-driven sync like FreeSync all have spotty support, and it’s a gamble whether they will work on any given card.

                1. 2

                  I’m using the DisplayPort cable that came with my monitor. I can set it to 4K but the DE becomes unusably sluggish on both Wayland and Xorg. Thanks for the pointers.

              2. 3
                1. 1

                  Looking to buy a new GPU to replace my Nvidia card for exactly those reasons.

                  Does someone know when (or if) lower-tier Vega models will be available?

                  Currently there are only the Vega 56 and the Vega 64, I’m looking to buy a “Vega 48” or “Vega 40” or something. :-)

                  1. 1

                    Unfortunately, the same will happen. His project or Wayland will end up being forked to support Nvidia…

                    Also realised that it’s named “Sway” because its first letter is actually the first letter of his alias.