1. 1

    Is “withdrawl” a different way of spelling “withdrawal”? Thought it was a typo, but the post consistently spells it that way.

    1. 2

      Perhaps the author actually thinks that is how it is spelled.

      1. 1

        Spelling has never been my strong suit! I will get that fixed.

      1. 18

        Neat, but you may want to disclose in fine print somewhere that you’re the author of fosspay, even though it’s free/libre software and not a platform you’re pushing.

        1. 1

          Fair point. This wasn’t meant to be an endorsement of any of these platforms (after all, there are situations where either of the other processors gets more of your money into the creator’s pocket), but I will add a note.

        1. 3

          There’s a cool flag that makes it so you don’t have to reap the process, too, which is nice because reaping children is another really stupid idea.

          I… is it? It doesn’t seem a wholly unreasonable way to arrange to get the exit status (or other termination details) of your child processes.

          1. 4

            Author here. I originally expanded on this in my first draft, but cut it out to better balance complaints with solutions. In my opinion, waiting on your children is fine if you can afford to block; if not, you have to set up SIGCHLD handlers, which is a non-trivial amount of code and involves signal handling, a mess in its own right that is easy to get wrong. Or you can use non-blocking waitpid, but that wasn’t a thing until recently. In all of these cases, if the parent doesn’t do its job well, your process table is littered with a bunch of annoying dead entries.
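
            The non-blocking approach described here can be sketched as follows (a hedged Python illustration of the SIGCHLD-plus-WNOHANG pattern the comment mentions; the structure is illustrative, not anyone’s actual code):

```python
import os
import signal
import time

# Sketch: a SIGCHLD handler that reaps every exited child with
# waitpid(..., WNOHANG) so no zombie entries accumulate.

def reap_children(signum, frame):
    # Loop: several children may exit before the signal fires, and
    # signals do not queue, so one delivery can mean many exits.
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children left at all
        if pid == 0:
            return  # children remain, but none have exited yet

signal.signal(signal.SIGCHLD, reap_children)

pid = os.fork()
if pid == 0:
    os._exit(0)  # child exits immediately
else:
    time.sleep(0.2)  # give the handler a chance to run
    # The handler already reaped the child, so waiting again fails.
    try:
        os.waitpid(pid, 0)
        reaped_elsewhere = False
    except ChildProcessError:
        reaped_elsewhere = True
```

            The handler loops because a single SIGCHLD delivery can stand for multiple dead children; this is exactly the kind of easy-to-miss detail the comment calls error-prone.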

          1. 15

            Such pointless posturing and such an inflammatory piece of nonsense, all in one (we’re talking about ~300 lines of code here).

            The whole schtick that Nvidia does this out of spite is just short-fused idiocy. It doesn’t take much digging into the presentations Nvidia has held at XDC2016/2017 (or just asking aritger/cubisimo,… directly) to understand that there are actual, and highly relevant, technical concerns that make the GBM approach terrible. The narrative Drew and others are playing with here - that there’s supposedly a nice “standard” that Nvidia just doesn’t want to play nice with - is a convenient one rather than a truthful one.

            So what’s this actually about?

            First, this is not strictly part of Wayland vs X11. Actually, there’s still no accelerated buffer-passing subprotocol accepted as ‘standard’ in Wayland; the only buffer-transfer mechanism in the stable set is shared memory. Also, the GBM buffer subsystem is part of Xorg/DRI3, so the fundamental problem applies there as well.

            Second, this only covers the buffer-passing part of the stack. The other bits, such as the API for probing and controlling displays (KMS), are the same; even with the Nvidia blobs, you can use this interface.

            Skipping to what’s actually relevant - the technical arguments - the problem they both (EGLStreams, GBM) try to address is to pass some kind of reference to a GPU-bound resource from a producer to a consumer in a way that the consumer can actually use as part of its accelerated drawing pipeline. This becomes hairy mainly for the reason that the device which produces the contents is not necessarily the same as the device that will consume it - and both need to agree on the format of whatever buffer is being passed.

            Android has had this sorted for a long while (gralloc + bufferQueues), same goes for iOS/OSX (IOSurface) and Windows. It’s harder to fix in the Linux ecosystem due to the interactions with related subsystems that also need to accept whatever approach you pick. For instance, you want this to work for video4linux so that a buffer from a camera or accelerated video decoding device can be directly scanned out to a display without a single unnecessary conversion or copy step. To be able to figure out how to do this, you pretty much need control and knowledge of the internal storage- and scanout formats of the related devices, and with it comes a ton of politics.

            GBM passes one or several file descriptors (content like planar YUV video can have one per plane) from producer to consumer, with a side channel for metadata (plane sizes, formats, …) describing the properties of these descriptors. The consumer side collects all of these, binds them into an opaque resource (your texture), and then draws with it. When a new set is waiting, you send a release on the ones you had and switch to the new set.

            EGLStreams passes a single file descriptor, once. You bind this descriptor to a texture and then draw using it. Implicitly, the producer side signals when new content is available, and the consumer side signals when it’s time to draw. The metadata, matchmaking, and format conversions are kept opaque so that the driver can switch and decide based on the end use case (direct to display, composition, video recording, …) and what’s available.
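
            The fd-plus-metadata handover GBM relies on can be illustrated with plain Unix fd passing over a socket (a hedged sketch: an ordinary pipe stands in for a dma-buf, and the JSON metadata fields are invented for the example; real compositors do move dma-buf descriptors between processes with SCM_RIGHTS ancillary messages):

```python
import array
import json
import os
import socket

# Producer and consumer connected by a Unix domain socket pair.
producer, consumer = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)

r, w = os.pipe()            # a pipe stands in for a dma-buf fd
os.write(w, b"pixel data")  # "buffer contents"

# Metadata travels in-band; the descriptor itself rides in the
# SCM_RIGHTS ancillary payload (field names invented for the sketch).
meta = json.dumps({"format": "XR24", "width": 64, "height": 64}).encode()
fds = array.array("i", [r])
producer.sendmsg([meta], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])

# Consumer side: receive metadata plus a duplicated descriptor.
msg, ancdata, flags, addr = consumer.recvmsg(1024,
                                             socket.CMSG_LEN(fds.itemsize))
received = json.loads(msg)
level, ctype, data = ancdata[0]
new_fd = array.array("i")
new_fd.frombytes(data[:new_fd.itemsize])
contents = os.read(new_fd[0], 64)  # read through the received fd
```

            The kernel duplicates the descriptor into the receiver, which is why the consumer can read the contents through a different fd number; with GBM, the consumer then binds such descriptors into a texture instead of reading them directly.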

            1. 4

              The whole schtick that Nvidia does this out of spite is just short-fused idiocy. It doesn’t take much digging into the presentations Nvidia has held at XDC2016/2017 (or just asking aritger/cubisimo,… directly) to understand that there are actual, and highly relevant, technical concerns that make the GBM approach terrible.

              I think they may have had more of a point if they had been pushing EGLStreams back when these APIs were initially under discussion, before everyone else had gone the GBM route. And I question the technical merit of EGLStreams - it’s not like the Linux graphics community is full of ignorant people who are pushing for the wrong technology because… well, I don’t know what reason you think we have. And at XDC, did you miss the other talk by nouveau outlining how much Nvidia is still a thorn in their side and is blocking their work?

              The narrative Drew and others are playing with here - that there’s supposedly a nice “standard” that Nvidia just doesn’t want to play nice with - is a convenient one rather than a truthful one.

              What, you mean the interfaces that literally every other gfx vendor implements?

              To be able to figure out how to do this, you pretty much need control and knowledge of the internal storage- and scanout formats of the related devices, and with it comes a ton of politics.

              Necessary politics. A lot of subsystems are at stake here, with a lot of maintainers working on each. Instead of engaging with the process, Nvidia is throwing blobs/code over the wall and expecting everyone to change for them without being there to support them in doing so.

              Your explanation of how EGLStreams and GBM are different is accurate, but the drawbacks of GBM are not severe. It moves the responsibility for some stuff but this stuff is still getting done. Read through the wayland-devel discussions if you want to get a better idea of how this conversation happened - believe it or not, it did happen.

              1. 2

                I think they may have had more of a point if they had been pushing EGLStreams back when these APIs were initially under discussion, before everyone else had gone the GBM route. And I question the technical merit of EGLStreams - it’s not like the Linux graphics community is full of ignorant people who are pushing for the wrong technology because… well, I don’t know what reason you think we have.

                Well, the FOSS graphics “community” (as in loosely coupled tiny incestuous factions that just barely tolerate each other) pretty much represents the quintessential definition of an underdog in what is arguably the largest, most lock-in-happy control surface there is. The lack of information, hardware, and human resources at every stage is sufficient explanation for the current state of affairs without claiming ignorance or incompetence. I don’t doubt the competence of the Nvidia driver teams in this regard either, and they hardly act out of spite or malice. Their higher-level managers, otoh, are a different story, with about as broken a plot as the corresponding ones at AMD or Intel in the areas where they are not getting their collective asses handed to them.

                Your explanation of how EGLStreams and GBM are different is accurate, but the drawbacks of GBM are not severe. It moves the responsibility for some stuff but this stuff is still getting done. Read through the wayland-devel discussions if you want to get a better idea of how this conversation happened - believe it or not, it did happen.

                I have, along with the handful of IRC channels where the complementary passive-aggressive annotations are being delivered (other recommended reading is Luc Verhaegen’s / libv’s blog, starting with https://libv.livejournal.com/27799.html - such a friendly bunch, n’est-ce pas?). What I’m blurting out comes from rather painful first-hand experience: dig through the code I’ve written on the subject and the related wayland-server stuff and you can see that I’ve done my homework; for every hour of code there’s a hidden hour of studying/reversing prior use. I’ve implemented both (+ delivered evaluations on three others in a more NDA-y setting), experimented with them since the initial proposal, and bothered Nvidia about it (~dec 2014). Neither delivers what I think “we” need, but that’s a much longer thread.

                On GBM: The danger is splitting up an opaque ‘handle’ into a metadata-dependent set of file descriptors, passing the metadata over a side channel, and recombining them again with no opportunity for compositor validation at reassembly. That is validation both for >type/completion<, for >synchronization<, and for >context-of-use<. It’s “try and hope”, with async delivery of failures, crashes, or worse. There’s ample opportunity here for race conditions, DoS, and/or type confusion (I got hard-to-reproduce issues of this kind, with wild writes both in GPU space and kernel space, in this very layer with Xorg/DRI3 bound as producer via a render node). Look at the implementation of wl_drm and tell me how that approach should be able to bring us back from buffer-bloat land to chase-the-beam producer-to-scanout (the VR/AR ideal and an 8k/16k HDR bandwidth-scaling necessity) - last I checked, Mesa could quad-buffer a single producer just to avoid tearing. FWIW, Nvidia are already there, on Windows.
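
                The reassembly validation being argued for here can be sketched as a consumer-side check before binding (hedged: the format-to-plane-count table and function names are illustrative, not a real DRM fourcc mapping or any compositor’s actual code):

```python
import os

# Illustrative plane counts per pixel format (not real DRM fourccs):
# one fd for packed RGB, two for NV12, three for fully planar YUV.
PLANES_PER_FORMAT = {"XR24": 1, "NV12": 2, "YU12": 3}

def validate_buffer(fmt, fds):
    """Reject fd sets that don't match the advertised format before
    they are recombined into an opaque GPU resource."""
    expected = PLANES_PER_FORMAT.get(fmt)
    if expected is None:
        raise ValueError(f"unknown format {fmt!r}")
    if len(fds) != expected:
        raise ValueError(
            f"{fmt} needs {expected} plane fd(s), got {len(fds)}")
    for fd in fds:
        os.fstat(fd)  # raises OSError if the descriptor is dead
    return True

r, w = os.pipe()  # stand-in descriptors for the example
ok = validate_buffer("NV12", [r, w])

# A metadata/descriptor mismatch is caught instead of being handed
# to the driver to fail asynchronously later.
try:
    validate_buffer("NV12", [r])
    mismatch_caught = False
except ValueError:
    mismatch_caught = True
```

                The point of the sketch is only where the check sits: rejecting an inconsistent set synchronously at reassembly, rather than discovering the mismatch as an async failure deep in the driver.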

                What, you mean the interfaces that literally every other gfx vendor implements?

                Yes, the Mali and Exynos support for GBM/KMS is just stellar – except that “implements” ranges from half-assed to straight-out polite gestures rather than actual efforts, so about the quality of compositor ‘implementations’ on the Wayland side. Out of 5 AMD cards I run in the test rig here right now, 2 actually sort of work if I don’t push them too hard or request connector-level features they are supposed to offer (10bit+FreeSync+audio, for instance). To this day, Xorg is a more reliable “driver” than kms/gbm - just disable the X protocol parts. I started playing with KMS in late 2012, and the experience has been the same every step of the way. If anything, Gralloc/HWC is more of a widespread open “standard”. Then again, Collabora and friends couldn’t really excel at selling consulting for Android alternatives if they used that layer, now could they?

                Necessary politics. A lot of subsystems are at stake here, with a lot of maintainers working on each. Instead of engaging with the process, Nvidia is throwing blobs/code over the wall and expecting everyone to change for them without being there to support them in doing so.

                and

                I think they may have had more of a point if they had been pushing EGLStreams back when these APIs were initially under discussion, before everyone else had gone the GBM route

                Again, gralloc/hwc predate dma-buf and are half a decade into seamless multi-GPU handover/composition. For Nvidia, recall that they agree on KMS as a sufficiently reasonable basis for initial device massage (even though it’s very much experimental-level in quality: shoddy synchronisation, no portable hotplug detection, no mechanism for resource quotas, and clients can literally priority-invert the compositor at will). It’s the buffer synchronisation and the mapping to accelerated graphics that is being disputed (likely to be repeated for Vulkan, judging by movements in that standard). Dri-devel are as much the Monty Python ‘knights who say NIH’ as anyone in this game: unless my memory is totally out of whack, EGLStreams as a mechanism for this problem stretches as far back as 2009, a time when Khronos still had “hopes” for a display server protocol. Dma-buf (GEM is just a joke without a punchline) was presented in 2012 via Linaro and seriously at XDC in 2013, with Xorg integration via DRI”3000” by keithp the same year. Nvidia proposed Streams in 2014, and the real discussion highlighted issues in the span 2014-2015. The talks since have been about how to actually move forward. Considering the “almost enough for a decent lunch” levels of resource allocation here, the pacing is just about on par with expectations.

            1. 7

              To echo the problems described in the article with small Linux distros: there’s this huge problem with not enabling localization (I speak English, and pretty much only English, so yay for me, but I could see it being a problem for others), and not including all documentation.

              I can’t remember if it was Alpine or Void, but one of them either doesn’t include the man command in the default installation or doesn’t have man pages for the default package manager, or both. Obviously a problem. And nothing is more irritating than reading a man page, seeing “look at /usr/share/docs/FOO”, and finding nothing in /usr/share/docs.

              (And don’t even get me started on texinfo. That shit needs to die.)

              1. 3

                Void comes with man pages and mdocml.

                1. 3

                  Alpine has no man by default, though easily installed. I gave up shortly after, though, upon discovering there’s no xterm package.

                  1. 1

                    Seems to be on the community repo: https://pkgs.alpinelinux.org/packages?name=xterm

                  2. 1

                    l10n is definitely on my radar, but it’s a bit of a head-scratcher wrt how to implement it correctly in line with the principles of the distro. I don’t want to include anything on your system you’re not actually using, like l10n for languages you don’t speak.

                  1. 1

                    This sounds exactly like what I’ve imagined a perfect Linux distro to be like. I love it when people who can put in the work get the same ideas :)

                    Gotta try to help him out if I can. I created #agunix:matrix.org if people wanna join in.

                    1. 1

                      Try #agunix on irc.freenode.net, that’s where agunix development takes place.

                      1. 1

                        Cool, thanks. Creating a bridge between these should be trivial if we ever need to. Probably not yet, since I’m the only one on the Matrix side :)