1. 15
  1.  

  2. 6

    Just painful.

    Now I know why I was told by gnn@freebsd to never try porting their bwn driver to OpenBSD because it “didn’t really work”…

    1. 1

      Broadcom wifi is a giant pain in the ass on Linux with its hundreds of thousands of users and developers devoted to making it at least sort of mostly work. I can only imagine how bad it must be on smaller platforms.

      The Linux folk did a huge reverse engineering effort on the binary Broadcom driver (wl) over many years, and generated a specification document with which they implemented b43 (and bcm-v3 for b43legacy.) It’s .. pretty amazing, to be honest. So, armed with that, I went off to attempt to implement support for the first 11n chip, the BCM4321.

      I’ve always wondered how common this is. It seems to me that having another open-source driver to examine while trying to reverse some bit of proprietary hardware would be incredibly useful, even if it couldn’t be directly ported for API or licensing reasons, but I’ve never tried to write a driver so I have no direct experience. Do Linux and FreeBSD (and other) driver authors coordinate or crib off each other regularly, or is it more typical to do driver writing and reverse engineering independently?

      1. 5

        It depends.

        In some cases (especially where no open source code existed) binary blob drivers from other systems were reverse engineered. If there is a bus between the blob driver and the device (e.g. USB) the communication on that bus can be recorded and analyzed. This is particularly useful for buses with a command/response communication model. It takes time to map all the commands, but they are often rather high level (e.g. a wireless device might have a “scan” command). Or you could wrap the blob in some code of your own that calls some of the blob’s API functions and traces the resulting memory reads/writes to device registers which are mapped into e.g. PCI memory space. Now implement a function of your own that does those same reads and writes. Repeat until the hardware does something useful and try to clean up the mess of code you’ve produced in the process.
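
        To make the trace-and-replay idea concrete, here is a rough sketch in C. Everything in it is made up for illustration: the register window is a plain array rather than real PCI memory, `blob_enable_radio()` is a hypothetical stand-in for an opaque blob function, and the offsets and values mean nothing. It only shows the shape of the technique: log every access, then redo the logged writes yourself.

        ```c
        #include <assert.h>
        #include <stddef.h>
        #include <stdint.h>
        #include <stdio.h>

        #define NREGS     64   /* size of the fake register window (assumption) */
        #define MAX_TRACE 256  /* how many accesses we are willing to record */

        /* Stand-in for memory-mapped device registers. On real hardware this
         * would be a mapping of the device's PCI memory space. */
        static uint32_t fake_regs[NREGS];

        /* One recorded access: 'R' for read, 'W' for write. */
        struct reg_op {
            char     kind;
            uint32_t off;
            uint32_t val;
        };

        static struct reg_op trace[MAX_TRACE];
        static size_t trace_len;

        /* Record an access and print it, so the log can be read later. */
        static void trace_add(char kind, uint32_t off, uint32_t val)
        {
            if (trace_len < MAX_TRACE)
                trace[trace_len++] = (struct reg_op){ kind, off, val };
            printf("%c 0x%03x 0x%08x\n", kind, (unsigned)off, (unsigned)val);
        }

        /* Accessors the wrapped code is forced to go through. */
        uint32_t traced_read(uint32_t off)
        {
            assert(off % 4 == 0 && off / 4 < NREGS);
            uint32_t val = fake_regs[off / 4];
            trace_add('R', off, val);
            return val;
        }

        void traced_write(uint32_t off, uint32_t val)
        {
            assert(off % 4 == 0 && off / 4 < NREGS);
            fake_regs[off / 4] = val;
            trace_add('W', off, val);
        }

        /* Hypothetical opaque routine standing in for a blob API function. */
        void blob_enable_radio(void)
        {
            uint32_t ctl = traced_read(0x10);
            traced_write(0x10, ctl | 0x1);
            traced_write(0x24, 0xcafe0001);
        }

        /* "Your own function": redo the recorded writes against a register
         * window and return how many writes were replayed. */
        size_t replay_writes(const struct reg_op *ops, size_t n, uint32_t *regs)
        {
            size_t writes = 0;
            for (size_t i = 0; i < n; i++) {
                if (ops[i].kind == 'W') {
                    regs[ops[i].off / 4] = ops[i].val;
                    writes++;
                }
            }
            return writes;
        }
        ```

        Calling `blob_enable_radio()` once fills `trace`, and `replay_writes(trace, trace_len, regs)` then reproduces the same writes without the blob. On real hardware the hard part is everything this sketch leaves out: timing, interrupts, DMA, and reads whose results change what gets written next.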

        Nowadays, it is more common to look at source code from other systems, if available.

        But that doesn’t always help. In some cases, the code is a joke of undocumented magic numbers. Using Broadcom as an example again, see linux/drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c. That file has about 28,000 lines. The upper half is mostly tables of numbers which get written to the hardware e.g. for calibration purposes. These numbers might be derived from measurements taken during hardware testing, so before you scream “blob!” keep in mind that even the hardware engineers may not actually know why these numbers work better than some other set of numbers. Actual code starts about halfway down, and wouldn’t pass code review in many places. One cannot make sense of this without the data sheet which gives meaningful names to register offsets. Note that this is a particularly bad example. Most vendors don’t obfuscate their code like this. Some vendors even provide comments which document some aspects of the hardware (see header files in the iwlwifi driver).

        Even with code available, I often find myself adding print statements to trace register reads/writes so I can look at the run-time behaviour of the driver, rather than a static maze of function calls. This makes it much easier to figure out which parts of the code are relevant.

        Even if the source code for other systems can be legally ported, it is still an error-prone incremental process. A lot of the time is spent on trial-and-error debugging for issues where the hardware doesn’t respond, raises error interrupts with magic error numbers, or just live-locks the whole system. The worst thing about it is that you can never know ahead of time when you’ll be done. The problem could be a single character in a line of code somewhere, but it might take months to track it down without knowing what to look for because, e.g., the wireless firmware raises a “sysassert 65” error interrupt when you give it a packet to transmit, with no further useful information. Once you’ve fixed that, it starts raising “sysassert 99” and you’re still not done yet.

        So having source code to look at is useful, but it doesn’t replace hardware documentation. In an ideal world, we could all just download these documents and look up what “sysassert 65” is supposed to mean. And perhaps have pages’ worth of it on Wikipedia to search through.

        1. 3

          Years ago, I wrote an RTL8139 driver. I both referred to the Engrish data sheet, and cribbed from the Linux and NetBSD code. All three were wrong in many interesting ways.