1. 70
    1. 5

      The really interesting bit here is the fact that close can be surprisingly slow, so if you really need to care about it (like in this case), you should probably call it on a worker thread.
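
      A rough sketch of that idea, assuming plain pthreads (`close_async` and `close_worker` are made-up helper names):

      ```c
      #include <pthread.h>
      #include <stdint.h>
      #include <unistd.h>

      /* Worker that performs the potentially slow close(). */
      static void *close_worker(void *arg)
      {
          close((int)(intptr_t)arg);
          return NULL;
      }

      /* Hypothetical helper: hand the fd to a detached thread so the caller
       * (e.g. the render/input loop) never waits on a slow close(). */
      static void close_async(int fd)
      {
          pthread_t t;
          if (pthread_create(&t, NULL, close_worker, (void *)(intptr_t)fd) == 0)
              pthread_detach(t);
          else
              close(fd);  /* fall back to closing synchronously */
      }
      ```

      A real implementation would more likely feed one long-lived worker through a queue rather than spawn a thread per close, but the idea is the same.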

      1. 1

        That is probably fine in this use case, but has some annoying interactions with UNIX semantics for file descriptors. UNIX requires that any new file descriptor will have the number of the lowest unallocated file descriptor. This means that any file-descriptor-table manipulation requires some serialisation (one of the best features of io_uring is providing a separate namespace for descriptors that is local to the ring and so can be protected by a lock that’s never contended). I think Linux uses RCU here, other systems typically use locks. Some userspace code actually depends on this and so localises all fd creation / destruction to a single thread. If the program isn’t already multithreaded then adding a thread that calls close can break it.
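
        For illustration, the lowest-free-slot rule is easy to see in a tiny standalone program (a sketch, not taken from any real codebase):

        ```c
        #include <fcntl.h>
        #include <stdio.h>
        #include <unistd.h>

        int main(void)
        {
            int a = open("/dev/null", O_RDONLY);   /* typically 3 */
            int b = open("/dev/null", O_RDONLY);   /* typically 4 */
            printf("a=%d b=%d\n", a, b);

            close(a);                              /* frees the lowest slot */
            int c = open("/dev/null", O_RDONLY);   /* must reuse it: c == a */
            printf("c=%d\n", c);

            /* Any thread calling open()/dup()/socket() concurrently races for
             * that same lowest slot, which is why the fd table needs
             * serialisation. */
            close(b);
            close(c);
            return 0;
        }
        ```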

    2. 3

      This is both a marvelous write-up and a reminder as to why I am near exclusively a console gamer these days.

      1. 4

        It’s funny that game consoles have become more and more like PCs with a single fixed configuration. Almost like Apple hardware.

        1. -1

          Not really, it’s the fixed configuration (and appliance-like customer lockout) that has the value, not having a weird, low-volume CPU.

          1. 5

            it’s the fixed configuration (and appliance-like customer lockout) that has the value

            I mean, you literally just described game consoles and apple hardware, so I’m not sure what your point is?

            1. 4

              I think his point is that, probably until about 10 years ago, consoles shipped differentiating hardware. You could do things on an 8-bit Nintendo that you couldn’t do on a commodity general-purpose computer at the time, even though the PC was more expensive. When 3D acceleration started to be the norm (around the PS1 / N64 era), consoles had very exciting accelerator designs to get the best possible performance within a price envelope. Most PCs didn’t have 3D accelerators at all and when they did they were either slower than the ones in consoles or a lot more expensive.

              Over time, the economies of scale for CPUs and GPUs have meant that a mostly commodity CPU and GPU are faster than anything custom that you could build for the same price. Consoles typically have custom SoCs still (which makes economic sense because they’re selling a large number of exactly the same chip), but most of the IP cores on them are off-the-shelf components. They even run commodity operating systems (Windows on Xbox, FreeBSD on PS4), though tuned somewhat to the particular use case.

              It’s unlikely that a future console will have much custom hardware unless it is doing something very new and exciting. HoloLens, for (a non-console) example, has some custom chips because off-the-shelf commodity hardware for AR doesn’t really exist and so a console wanting to do AR might get custom chips.

              Even in the classic Nintendo era, the value of consoles to developers was twofold:

              • They had hardware optimised for games.
              • Every single instance of the console had exactly the same hardware and so the testing margin was small.

              The first is now far less important than the second. This is somewhat true for the Apple hardware but the scales are different. The Xbox One, for example, came out in 2013. The Xbox One S was almost identical hardware, just cheaper. The Xbox One X wasn’t released until 2017 and was faster but any game written for the older hardware would run fine on it, so if you weren’t releasing a AAA game then you could just test on the cheaper ones. The Xbox Series X/S were released 7 years later. If it has a similar lifetime, that’s four devices to test on for 14 years. Apple generally releases at least 2-3 models of 4-5 different product lines every year.

            2. 1

              How is it funny? It’s sensible and predictable, and IMO kind of a bummer.

    3. 2

      I feel like the bug is in repeatedly trying to identify whether a device you’ve already checked should be added. I would think SDL would cache the list of devices already checked (invalidated when the list of devices changes) and only open/test/close devices the first time they are encountered. At least, that’s how I would have written it.
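
      A rough sketch of that caching idea (hypothetical names, probing and error handling elided):

      ```c
      #include <dirent.h>
      #include <stdbool.h>
      #include <string.h>

      #define MAX_DEVS 64

      /* Hypothetical cache of /dev/input entries we have already probed. */
      static char seen[MAX_DEVS][64];
      static int  seen_count;

      static bool already_checked(const char *name)
      {
          for (int i = 0; i < seen_count; i++)
              if (strcmp(seen[i], name) == 0)
                  return true;
          return false;
      }

      static void scan_joysticks(void)
      {
          DIR *d = opendir("/dev/input");
          if (!d)
              return;
          struct dirent *e;
          while ((e = readdir(d)) != NULL) {
              if (strncmp(e->d_name, "event", 5) != 0)
                  continue;
              if (already_checked(e->d_name))
                  continue;               /* skip the expensive open/test/close */
              /* probe_device(e->d_name);    open, test, close only once */
              if (seen_count < MAX_DEVS)
                  strncpy(seen[seen_count++], e->d_name, sizeof seen[0] - 1);
          }
          closedir(d);
      }
      ```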

      1. 1

        I suspect one issue is that “no new device” when scanning /dev/input periodically doesn’t necessarily mean that one of the existing device nodes doesn’t now refer to a new device. I.e. you pull out a mouse that was /dev/input/event5, plug in a joystick, and it becomes that same /dev/input/event5. So just comparing the list of device nodes isn’t enough to tell whether the devices have actually changed.

        Probably the right way to do it is to watch for hotplug events (via libudev, for example), though doing that properly in a library like SDL would also be a bit complicated, I think.
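
        Roughly what the libudev monitor route looks like (a sketch; error handling omitted, compile with -ludev):

        ```c
        #include <libudev.h>
        #include <poll.h>
        #include <stdio.h>

        int main(void)
        {
            struct udev *udev = udev_new();
            struct udev_monitor *mon = udev_monitor_new_from_netlink(udev, "udev");

            /* Only care about input-subsystem hotplug events. */
            udev_monitor_filter_add_match_subsystem_devtype(mon, "input", NULL);
            udev_monitor_enable_receiving(mon);

            struct pollfd pfd = { .fd = udev_monitor_get_fd(mon), .events = POLLIN };
            for (;;) {
                if (poll(&pfd, 1, -1) <= 0)
                    continue;
                struct udev_device *dev = udev_monitor_receive_device(mon);
                if (!dev)
                    continue;
                const char *action = udev_device_get_action(dev);   /* "add" / "remove" */
                const char *node   = udev_device_get_devnode(dev);  /* may be NULL */
                if (node)
                    printf("%s %s\n", action, node);
                udev_device_unref(dev);
            }
        }
        ```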

        1. 1

          In Ubuntu there is also a directory with the devices by name (which are symlinks to the actual devices). So you can open a device by name, then when you get an error (as happens when the device is unplugged), close the device and reopen it rather than scanning the directory. (I don’t know if it’s udev or the kernel or something else that creates the directory, so I’m not sure if it’s portable to other distros, but it’s quite useful.)

          This works great for performance but if you want to handle the case where a new controller can be plugged in while the game is running (a second player, perhaps), then you have to scan the directory or do something like you suggested.

          EDIT: I realized after posting that I’m not sure which process creates the directory and the symlinks so I updated my reply to reflect that.
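
          A sketch of that open-by-name / reopen-on-error pattern (the by-id path here is made up; real names encode the vendor and product):

          ```c
          #include <fcntl.h>
          #include <linux/input.h>
          #include <unistd.h>

          /* Hypothetical stable path; real names look something like
           * "usb-<vendor>_<product>-event-joystick". */
          #define PAD "/dev/input/by-id/usb-ExampleVendor_Pad-event-joystick"

          static void read_pad_forever(void)
          {
              for (;;) {
                  int fd = open(PAD, O_RDONLY);
                  if (fd < 0) {              /* unplugged: wait and retry */
                      usleep(200 * 1000);
                      continue;
                  }
                  struct input_event ev;
                  while (read(fd, &ev, sizeof ev) == sizeof ev) {
                      /* handle_event(&ev); */
                  }
                  close(fd);                 /* read failed: device went away, reopen */
              }
          }
          ```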

        2. 1

          I wonder if a scheme will be developed for input devices that is similar to networking devices. Meaning that they each get some unique-ish name.

          1. 4

            I’m not sure how SDL handles that, but you can (technically) get as unique an ID as is possible for a given /dev/input path. evdev exposes the type and physical topology for each entry. Querying that probably takes some extra time, but I don’t think it’s something SDL couldn’t handle for you (or if it doesn’t handle it for you, I wouldn’t know; I haven’t touched SDL since the 1.2 days…).
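
            For example, the name, physical topology and (nominally) unique ID are each one ioctl away once the event node is open (a sketch with a hard-coded example node):

            ```c
            #include <fcntl.h>
            #include <linux/input.h>
            #include <stdio.h>
            #include <sys/ioctl.h>
            #include <unistd.h>

            int main(void)
            {
                int fd = open("/dev/input/event0", O_RDONLY);  /* example node */
                if (fd < 0)
                    return 1;

                char name[256] = "", phys[256] = "", uniq[256] = "";
                ioctl(fd, EVIOCGNAME(sizeof name), name);  /* human-readable name */
                ioctl(fd, EVIOCGPHYS(sizeof phys), phys);  /* physical topology   */
                ioctl(fd, EVIOCGUNIQ(sizeof uniq), uniq);  /* unique id, if any   */

                printf("name='%s' phys='%s' uniq='%s'\n", name, phys, uniq);
                close(fd);
                return 0;
            }
            ```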

            Providing unique names for hotpluggable devices is not a very easy problem so I don’t know how much it would help if that information were exposed straight in the device name. The “predictable” network device naming scheme doesn’t really work for all of those, either. The predictable part relies on information provided by the motherboard firmware/BIOS so it only works as long as the enumerated devices are on-chip, soldered on the mainboard, or fitted in a slot that you can’t reach without getting a screwdriver. Once you get to cheap USB adapter land you’re back to unpredictable names, except they’re all called enpXsY because something something modern hotplugging whatever mumbles in BSD. Unpredictable as in, just because a device has the same name as one you’ve already seen doesn’t guarantee it’s the same one, and just because you got an event for a new device with a new name doesn’t mean it’s not the same device in a different physical location.

            It’s certainly true that most of the breakage (especially of the “just because it’s got the same name doesn’t mean it’s the same device” kind) is caused by bad hardware. Unfortunately that doesn’t stop people from buying bad hardware and they tend to blame software they get for free as opposed to hardware they paid money for, so ¯\_(ツ)_/¯.

            1. 5

              EVIOCGUNIQ often fails outright, returns empty values, or provides the same value for multiple devices of the same kind when they are plugged in. The coping mechanism is basically a hash of various device features, plus caching plug/unplug actions and hoping for a “proximate onset/proximate cause” kind of relationship to retain the logical mapping.
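
              That coping mechanism looks roughly like this (a sketch of such a feature-hash fallback, not any particular library’s implementation):

              ```c
              #include <linux/input.h>
              #include <stdint.h>
              #include <string.h>
              #include <sys/ioctl.h>

              /* FNV-1a over an arbitrary buffer. */
              static uint64_t fnv1a(uint64_t h, const void *buf, size_t len)
              {
                  const unsigned char *p = buf;
                  for (size_t i = 0; i < len; i++) {
                      h ^= p[i];
                      h *= 1099511628211ULL;
                  }
                  return h;
              }

              /* Fingerprint built from whatever the device will admit to:
               * bus/vendor/product/version, its name string and its key
               * capability bits. Used when EVIOCGUNIQ comes back empty or
               * duplicated. */
              static uint64_t device_fingerprint(int fd)
              {
                  uint64_t h = 1469598103934665603ULL;

                  struct input_id id = {0};
                  ioctl(fd, EVIOCGID, &id);
                  h = fnv1a(h, &id, sizeof id);

                  char name[256] = "";
                  int n = ioctl(fd, EVIOCGNAME(sizeof name), name);
                  h = fnv1a(h, name, n > 0 ? (size_t)n : 0);

                  unsigned long keybits[KEY_MAX / (8 * sizeof(unsigned long)) + 1] = {0};
                  ioctl(fd, EVIOCGBIT(EV_KEY, sizeof keybits), keybits);
                  h = fnv1a(h, keybits, sizeof keybits);

                  return h;
              }
              ```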

              Nothing in evdev is reliable if your test set of devices is big enough, and every low-level consumer of the thing ends up with workaround databases to cope - and those are in a similarly terrible shape. Step into the bowels of udev and trace how it goes from discovering a new input device via NETLINK onwards to figuring out what it is. There are some grand Clintonesque assumptions about what constitutes “a keyboard”.

              With current game devices it is practically easier to forget that the thing even exists and go for a userspace USB/Bluetooth implementation of your own. It’s no less painful than trying to stitch together the quilt of rusty kernel interfaces that nobody wants to touch and that carries decades of workarounds - in reality you get to do both or walk away.

              Recent console-class game “controllers” are an army of environment sensors: a camera or three, a handful of analog sticks, and buttons that are both analog and digital in various stages of their exciting journey from depressed to pressed. They also have more expressive LED outputs than the “designed for numlock and capslock” model evdev was ever “designed” for, as well as a lo-fi speaker (force feedback) and a hi-fi one (for local sound effects) that may or may not appear as a sound device, often split across multiple device nodes that look unrelated from the evdev consumer’s point of view. Then come assistive devices and VR…

              The kind of performance bugs mentioned in the article are everywhere in the stack. It makes sense to outsource opening/closing to another process to forget about a few of them and just not accept tickets on the matter - it’s probably some powersave PMIC dance gone horribly wrong. I have this fine USB hub here that, depending on the order devices are plugged into it, will introduce ~20ms of extra latency on its own (jittery as all hell) or add stalls every n seconds when it suddenly and silently forces a soft reset that the kernel tries to hide from the evdev side…

              1. 2

                buttons that are both analog and digital in various stages of their exciting journey from depressed to pressed

                This is the kind of thing that makes me wonder why prisons aren’t full of programmers who went insane, showed up at the office with a chainsaw one day, and did a different kind of hacking. God!

                Do you know if things are any better in, erm, other lands, like macOS/iOS or Windows? I first started hearing horror stories about evdev about 15 years ago and the design was never particularly forward-looking so I imagine that fifteen years of industrial evolution (!?) made things even worse. Did anyone get it any “righter”?

                1. 5

                  This is the kind of thing that makes me wonder why prisons aren’t full of programmers who went insane, showed up at the office with a chainsaw one day, and did a different kind of hacking. God!

                  If only the culprits weren’t so geographically spread ..

                  Do you know if things are any better in, erm, other lands, like macOS/iOS or Windows? I first started hearing horror stories about evdev about 15 years ago and the design was never particularly forward-looking so I imagine that fifteen years of industrial evolution (!?) made things even worse. Did anyone get it any “righter”?

                  There are pockets of “righter” in Windows/iOS/Android but they also have the benefit of being able to act more authoritatively and set a “minimal viable type and data model” to work from. OpenXR does a fair job of approaching some of the consumer API side of things by shifting the responsibility around: applications define the abstract inputs they want and the compositor side actually provides the translation and mapping.

                  Android has a very well-thought-out basic data model, and past the InputManager things look quite clean (about the same stage as the compositor would be at in the Linux stack) - but its data routing is punished both by being layered on top of evdev and by having a Java GC in nearly every process.

                  The procedure I usually apply is walking the path from the input device to the consumer, noting each ‘sample/pack/unpack/filter/translation’ stage and its cost along the way, and for each indirection asking “what happens on backpressure?”, “is source/identity retained?”, “can filter/translation/queueing be configured?”, “can sampling parameters be changed at runtime?”.

                  For the really neglected input devices though, hands and eyes - things are really grim. Enjoy this little nugget: https://github.com/leapmotion/leapuvc/blob/master/LeapUVC-Manual.pdf - and that’s not even all of the abuse needed for an input sample, there is something about “authentication” there at the end. Then comes the actual computer vision part ..

                  The eye tracker I use had a Linux input driver at one time. It was quickly pulled from public support. It pegged a CPU core for itself, bundled Electron to be able to provide a configuration step (“Look at these four reference points”), depended on systemd for .. things, and required certificates to provide a configuration of your own (otherwise you could get much better precision, which made it harder to justify the commercial option), … Now I run the driver in a Windows VM with the USB node forwarded, extract the samples and send them over the network. That is against the driver TOS.

                  1. 3

                    For the really neglected input devices though, hands and eyes - things are really grim. Enjoy this little nugget: https://github.com/leapmotion/leapuvc/blob/master/LeapUVC-Manual.pdf - and that’s not even all of the abuse needed for an input sample, there is something about “authentication” there at the end. Then comes the actual computer vision part ..

                    Oh wow. That whole thing is… one gem after another. I like that bit about gain control, too. Gotta hand it to them, though: at least they didn’t just throw their hands in the air and say well, they’re multiplexed, so reading the gain value will return either the gain control, the FPS ratio, or the dark frame interval, and you’ll figure out which one’s which eventually.

                    I wish I had something constructive to say but right now I mostly want to stay away from computers for a while…

          2. 3

            /dev/input/by-id/

    4. 1

      I ran into this issue in my summer side project. I had been having seemingly random multi-frame stalls in retroarch, so I decided to write a new libretro frontend (*) that would allow me to measure and visualize where stalls occur.

      In evaluating whether to use GLFW or SDL (retroarch uses SDL), I noticed that I still had stalls similar to retroarch’s and that they occurred in the joystick code. With GLFW the stalls would cause me to miss a frame, but I could catch up on the next frame. With SDL the stalls were multiple frames long.

      It turns out that I had a bad USB cable and the controller was getting disconnected/reconnected as I was playing. I replaced the cable and wrote my own input code (using the Linux gamepad API) that has better latency characteristics. As a bonus I also get access to the timing data from the Linux kernel, so I can see exactly when a button press was processed (my 8bitdo controller has an 8ms polling interval, so there is a high chance of any given input falling on the next frame).
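
      For reference, the kernel-side timing data is right there in each evdev event; a minimal sketch of reading it (hard-coded example node, not the frontend’s actual code):

      ```c
      #include <fcntl.h>
      #include <linux/input.h>
      #include <stdio.h>
      #include <unistd.h>

      int main(void)
      {
          /* Example node; a real frontend would locate the pad via udev or
           * /dev/input/by-id. */
          int fd = open("/dev/input/event0", O_RDONLY);
          if (fd < 0)
              return 1;

          struct input_event ev;
          while (read(fd, &ev, sizeof ev) == sizeof ev) {
              if (ev.type == EV_KEY)          /* button press/release edges */
                  printf("%ld.%06ld code=%d value=%d\n",
                         (long)ev.time.tv_sec, (long)ev.time.tv_usec,
                         ev.code, ev.value);
          }
          close(fd);
          return 0;
      }
      ```

      The timestamp is the kernel’s, taken when the event was generated, so it can be compared against frame times to see which frame a press actually lands on.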

      (*) I’ve called the new frontend fenestra, which is Latin for “window”, because it gives you a view into the performance of all the pieces inside. You can find it on my github account.