1. 4

    as we move towards computers which can use whatever endianness is appropriate for the situation

    What are the appropriate situations when you want to run your whole system in big endian? It might be my lack of imagination, but other than compatibility with buggy C programs that assume big endian, I can’t think of any. It would be nice to leave this part of computer history behind, like 36-bit words and ones’ complement arithmetic.

    1. 12

      I’ve been running big endian workstations for years. It’s slightly faster at network processing, and it’s a whole lot easier to read coredumps and work with low-level structures. Now that modern POWER workstations exist, I no longer even have an x86 on my desk.

      Many formats are big-endian and that won’t change. TIFF, JPEG, ICC colour profiles, TCP, etc…

      Ideally, higher level languages would make this irrelevant to most people, so we could just run everything in BE and nobody would notice except the people doing system-level work where it’s relevant. Unfortunately, we haven’t gotten there yet. So it’s best for user freedom to let the user decide what suits their workload.

      1. 5

        Modern x86 has a special instruction for byte-swapping moves, MOVBE: https://godbolt.org/z/juJ6VL

I disagree that low-level languages are a problem when it comes to this. Even higher-level languages need to deal with endianness when working with the formats you mentioned, so we’ll never be rid of it on that level. On the other hand, it’s possible to do it properly in low-level languages as well: don’t read u16/u32/u64 data directly, and avoid ntohl()/htonl(), etc. The C function in my link works on both big- and little-endian systems because it expresses the desired result without relying on the native endianness.

        1. 3

I wish more people knew the proper ways to do that in C.

          1. 5

Simple: “Reading several bytes as if they were a 32-bit integer is implementation-defined (or even undefined if your read is not aligned). Now here’s the file format specification; go figure out a way to write a portable program that reads it. #ifdef is not allowed.”

From there, reading bytes one by one and combining them with shifts and adds is pretty obvious.

        2. 5

          I’ve been running big endian workstations for years.

I’m curious about your setup. What machines are you running with MIPS? I guess I haven’t really looked into “alternative architectures” since the early 2000s, so I’m quite intrigued by what people are actually running these days.

          1. 3

            My internal and external routers are both MIPS BE.

            My main workstation is a Raptor Talos II, POWER9 in BE mode. Bedroom PC is a G5.

            My media computer is an old Mac mini G4. I haven’t felt the need to replace it.

            1. 3

              I suspected that you had a Talos machine. The routers make total sense, too. Thanks for taking the time to reply!

              1. 1

                My internal and external routers are both MIPS BE.

                May I ask what the make and model codes are?

                1. 3

                  Netgear WNR3500L.

                  1. 2

                    Thank you @awilfox!

            2. 3

              Many formats are big-endian and that won’t change. TIFF, JPEG, ICC colour profiles, TCP, etc…

All those standards use big-endian for various reasons related to hardware, down to the chip level.

• IP, for example, is used for routing based on prefixes, where you only look at the first few bits to decide which port a packet should exit through. In more than 99.9% of cases, it simply does not make sense to look at the low end of the address.
• TIFF, JPEG, and ICC colour profiles all deal with pixels and some form of light sensor connected to an analog-to-digital converter circuit. Such a circuit is essentially a string of resistors interlaced with digital comparators that output 1 when the input voltage is above a certain threshold. If the first half of all comparators returns 1, you switch on the MSB; if not, you switch it off. The MSB (which comes first in big-endian notation) denotes 50% of the input signal’s strength and is therefore more important to “get right” than the lower bits.

So why is little endian winning on modern CPUs? Well, that’s because we have different concerns when we are running computer programs, in which a pattern like this

              for(int i=0; i<length; i++) {}
              

              is common.

It would make no sense to start comparing numbers from the high end, because that almost never changes; the low end changes all the time. This makes it easier to put the low-end bytes up front and only check the higher bytes when a low-end byte has overflowed.

              So it’s a story about: Different concerns -> different hardware.

As for Debian: they must have looked through the results of their package popularity contest and judged that the amount of work required to maintain the mips architecture cannot be justified by the small number of users who use it.

              This is also why I always opt for yes when I’m asked to vote in the popcon. Because they can’t see you if you don’t vote!

              1. 2

                Ideally, higher level languages would make this irrelevant to most people

                See Erlang binary patterns. It provides what you want.

                1. 1

                  It’s slightly faster at network processing

New protocols these days tend to have a little-endian wire format. TCP/IP is still big-endian, but whatever lies on top of it might not be. Maybe that explains the rise of dual-endian machines: little endian has won, but some support for big endian still comes in handy.

                  1. 1

Yes and no. The Zcash algorithm (used by Ethereum et al.) always serialises numbers to BE. But some LE protocols and formats exist. I think the real winner is not LE, nor BE, but systems that let you use both.

                    1. 4

And everything designed by DJB is little-endian: Salsa/ChaCha, Poly1305, Curve25519… And then there’s BLAKE/BLAKE2, Argon2, and more. I mean, the user hardly cares about the endianness of those primitives (it’s mostly about mangling bytes), but their underlying structure is clearly little-endian. Older stuff like SHA-2 is still big-endian, though.

Now sure, we still see some big-endian stuff. The so-called “network byte order” is far from dead; hence big-endian support in otherwise little-endian systems. But I think it is fair to say that big endian by default is mostly extinct by now. New processors are little endian first; they just have additional support for big-endian formats.

And if you were to design a highly constrained microcontroller now (one that must not cost more than a few cents), and your instruction set were not big enough to support both byte orders efficiently, which endianness would you choose? Personally, I would think very hard before settling on big endian.

              1. 2

I want to run a self-hosted issue tracker, which is my favorite thing about GitHub. I do NOT want a replacement for GitHub. This has nothing to do with the Microsoft/GitHub merger. This is purely about the fact that I do not like fork-and-PR workflows and I don’t like the way GitHub has implemented code reviews. So I’m not looking to run GitLab, Gitea, Gogs, or any other GHE clone.

                I’d rather host my own raw Git server (possibly using Patchwork to manage patches). I just need some sort of issue-tracking software that has the ability to link to specific patches and commits in my Git repos.

                Does anybody have any suggestions please?

                1. 1

                  There are plenty of standalone issue trackers. Bugzilla is the godfather of them all; Request Tracker is similarly venerable but is more often used for IT helpdesks, and only occasionally OSS projects (e.g. Perl).

The trouble with standalone issue-tracking software is that, since issue tracking is its entire reason for existing, it tends to end up a lot more complex than something like GitHub issues, if something that simple is what you’re looking for. If you want something GitHub-issues-like, I wonder whether a mild modification of Gitea to shut off the code-hosting aspects would be productive.

                  Another thing I’ve been thinking about lately is tracking issues in a branch of the repository (similarly to how GitHub uses an unrelated gh-pages branch for website hosting). This would have the not insignificant advantage that the issues would then become as portable as Git itself, and be versioned using standard Git processes. I think there are some tools that do this, but I haven’t looked at them yet.

                  1. 1

If those issue trackers are too complex for your needs, I reckon it’d be about an afternoon’s work to throw together a simple one (which might be why there isn’t one packaged - it’s not big enough!). Of course, within a few months you’ll start wanting to add more features…

                    Agree that tracking issues in a git repo is great.