1. 15

    Does “master” in the context of git mean master like in “master and slave”, like “master bedroom” or like “come here young master!”? I’m not a native English speaker, but it’s a word with multiple meanings, right?

    1. 12

      It is indeed a word with multiple meanings. But the people who initially developed git had previously used a tool called bitkeeper, and were inspired by its workflow. It used the term master like in “master and slave”.

      https://github.com/bitkeeper-scm/bitkeeper/blob/master/doc/HOWTO.ask#L223

      So the most benign explanation is that git used the term in the same sense.

      1. 17

        And until people started talking about it recently, if you asked anyone what the “master” branch meant, they didn’t give that answer – they thought it meant like “master copy”.

        So in a very real sense, the people promoting this explanation are actually creating an association that did not exist in people’s minds before, and incurring stereotype threat that did not need to exist.

        I understand the sentiment, but I think this has negative utility overall.

        1. 2

          I don’t think any of the meanings commonly assigned to “master” really fit git’s usage. The only explanation I have heard that I find particularly satisfying is that people accustomed to bitkeeper adopted a familiar term.

          It’s not really like a “master copy” or a “master key”. Nor does it control anything, which is usually the sense of “master/slave”. I expect that if they had been working without the context of BK, it’d have been called “primary”, “default”, or “main” in all likelihood. I think giving it a clearer name rather than continuing to overload the poorly chosen term “master” has some small utility, as long as it doesn’t break too much tooling too badly.

          And I think a much more interesting question about git is whether Larry McVoy still thinks it was a good move to spawn its creation by revoking the kernel developers’ license to use it on account of Tridge’s reverse engineering efforts.

          1. 6

            I think it means master copy in that all branches come back to it when finished. So at any given point in time it has the most finished, most production-ready, most merged copy.

            Like if you are mixing a song and you put all the tracks together into a master copy. That’s like bringing all the branches together and tagging a release on master.

            If anything, git branches are in no way “slave” or “secondary,” just works in progress that will eventually make it into master, if they are good enough.

            That’s at least how I understood it.

            1. 1

              I certainly would have no argument with using main, default, or primary if creating a new system. It would be a little more descriptive, which is good. I don’t think it’s better enough to upset a convention, though.

              (One argument against master/slave in disk and database terminology, besides the obvious and very valid societal one, is that it can be terribly misleading and isn’t a good description.)

          2. 9

            But git never adopted the concept of master as in the meaning “master/slave” only in the meaning “master branch” (like “master key”), right?

            1. 3

              I thought it was more in reference to “master copy” akin to an audio recording.

              1. 1

                I don’t recall any usage of the word slave in git.

                I find it impossible to say, though, because none of the meanings you listed are really a good fit for git’s usage of the term. I think the only answer is that it was familiar from BK.

                Something like “main” or “primary” would better match the way it gets used in git.

                1. 2

                  Or something like “tip” or “trunk”…

                  yep I’m using trunk in new projects now, as an SVN reference :D
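
                  For anyone wanting to do the same, here’s a minimal sketch (the repo path is made up; `init.defaultBranch` needs Git 2.28+):

                  ```shell
                  # Create a throwaway repo whose initial branch is "trunk".
                  # The -c flag scopes the setting to this one command; use
                  # `git config --global init.defaultBranch trunk` to make it the default.
                  tmp=$(mktemp -d)
                  git -c init.defaultBranch=trunk init -q "$tmp/demo"
                  head=$(git -C "$tmp/demo" symbolic-ref HEAD)
                  echo "$head"
                  # An existing repo can rename its local branch with:
                  #   git branch -m master trunk
                  # (remotes, CI config, and branch-protection rules need separate updates)
                  ```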

                  1. 1

                    Heh, I’m tempted to use attic to be even more contrarian then.

                    1. 1

                      I would encourage main simply because it autocompletes the same for the first two characters. :-)

                      1.  

                        The “headhoncho” branch it is.

                2. 17

                  Yes, for example Master’s Degree

                  1. 2

                    But it’s the dumbest branch. It knows less than the other active branches; it only eventually collects the products of the work on other branches.

                    Of all the meanings of master, I can only think of one where this analogy applies.

                    It also doesn’t “do everything” like a master key, it does the same thing as all the active branches, or one thing less if a feature is completed on that branch. Code in the master branch should be the most active, so it’s not a bedroom. It’s the parent of all the others so it’s not a young master.

                    It’s a boss, a leader, a main, or indeed a slave master. Any of these analogies would fit.

                    1. 9

                      Master in git doesn’t mean master like any of those things. It means the finished product: the master. The exact same way it’s used in media, for example when a “remastered” song is released.

                      1. 6

                        Gold master.

                    2. 9

                      There is some further discussion about this on the GNOME desktop-devel mailing list. Petr Baudis, who was the first to use “master” in the git context, had intended it in the “master recording” sense.

                      Edit: Added additional link and removed “Apparently”

                      1. 6

                        Arguably language changes with usage, but in this case if you look at where the word came from, Git is based in many ways on Bitkeeper, which had master and slave both, so it would fall into the first category.

                        1. 15

                          But surely git isn’t using the word with that meaning, since there are no slave branches? Or?

                          1. 8

                            That’s how I feel about it, but apparently others disagree.

                            1. -1

                              Git was made by Linus Torvalds. If you know anything about the guy, you’d know that the only human aspect he takes into consideration is efficiency of tool use. Having named slaves is more useful, and once they have names there’s no reason to call them that anymore.

                              1. 8

                                Linus didn’t introduce the master branch concept, that was a dude named Petr ‘Pasky’ Baudis. He recently clarified that he intended to use it in the sense of ‘master copy’, but he may have been influenced by ‘master-slave’ terminology.

                            2. 3

                              Thinking in terms of what the authors themselves meant at the time, and whether or not the word “slave” is explicitly stated is a pretty limiting framing of the issue IMO. In reality, people react negatively to using metaphors of human dominance to describe everyday tools.

                              1. 3

                                In reality, git’s use of master has not resulted in a preponderance of negative reactions.

                                It’s used millions (billions?) of times a day with neutral to positive reactions, I expect.

                                I would like to see this empirically validated, but I think “In reality, people react negatively to using metaphors of human dominance to describe everyday tools.” is unverified at best and probably false.

                                1.  

                                  You can either argue the need for some sort of empirical sociological analysis of the quantity of people bothered vs not bothered by the word “master” to gauge the importance of the topic, or you can make your own anecdotal assertion as to how big or important the controversy is, but it’s not terribly consistent to advocate for both IMO.

                                  I make no claim as to the number of users bothered by “master”, and I certainly wouldn’t say it’s a “preponderance” of the userbase. But again IMO you’re further missing the point if you think that the broader issue has anything to do with the particular size of the anti-“master” crowd. The fact is, if you’ve followed recent online discussion on the topic, you’ll have noticed that there’s clearly some number of users that would prefer for their main branch not to be named “master”. Does it bother you if they choose an alternative name that doesn’t draw a metaphor - intentionally or not - to systems of human hierarchy and control?

                                  1.  

                                    I certainly haven’t read all the discussion, but I feel I’ve read a decent amount, and while there are some people bothered by it, the number doesn’t seem that large.

                                    For me, the issue seems to be whether there is any intended malice in the term. If not, then the individuals who are offended may want to reconsider being offended.

                                    I say this because it seems like a small amount. While I would like to know the actual level, I don’t think it’s reasonable for many people using a common, non-racist connotation of the term “master” to change because some people think that there’s a metaphor to human systems of hierarchy and control that wasn’t intended by the author and isn’t interpreted as such by the vast majority.

                                    Avoiding potential offense and misinterpretation doesn’t seem worth the level of effort, mainly because people can be offended by all sorts of stuff. The three-tabbers are upset with two and four; should we change to prevent offense?

                                    I would feel very differently if this was a racist term or the number of people offended was very large.

                                    Also, if someone chooses to make the change on their own project, it wouldn’t bother me at all. It’s their project, they can name master whatever they feel like.

                                    1.  

                                      For me, the issue seems to be whether there is any intended malice in the term. If not, then the individuals who are offended may want to reconsider being offended.

                                      That’s highly unlikely to happen, I’m afraid.

                                      The parallel to draw is the re-branding of Uncle Ben’s and Aunt Jemima. Despite Aunt Jemima’s old marketing material looking quaint and “racist”, they were laudations of excellence in times when racism was much more rampant. Uncle Ben was a competent farmer, and no one knew as much about cake as your (likely black female) housekeeper.

                                      Now with all that’s going on, those companies have not stated that case, but instead issued statements bending a knee to the masses. This is counter-productive to any minority cause, because it literally kills off appreciation if it happens.

                                      On the coder side of the fence, where master isn’t even misinterpreted marketing material from the late 1800s but a technical detail, this will swiftly blow over. Like I believe the master/slave vocabulary pretty much blew over in networking and other contexts as well.

                                      Which is not to say another word than “slave” is inherently worse, but the effort put into churning a codebase to get rid of something that’s essentially a homonym combined with a neologism is simply wasteful.

                                      It’s marginally sad, in the sense it will incur some technical difficulties, that you can’t rely on the name of the master copy of the code in the Git repository. You could before.

                                      It’s also sad that there’s very little, or nothing I know, in common code vernacular that elevates minority achievements, but I un-ironically believe that such vernacular would be at risk of being labeled racist as well :(

                                      Wouldn’t expect too many “Oh yeah, I can see that!” type responses, because people tend to “Raise shields! Go to red alert!” (In Sir Patrick Stewart’s voice) when their view is challenged, and instead give some Off-topic or Troll downvotes, maybe a defensive reply, whenever this point of view is brought up.

                                2.  

                                  Just calling out that I have “-1 incorrect” and “-1 troll” on this reply. I wrote two sentences, the first qualified as opinion and the second is a good faith summary of the anti-“master” branch opinion. Please tell me how I’m trolling and what about this reply is incorrect.

                              2. 3

                                Do words become taboo only due to their original meaning, or to their current meaning? Or both? What if I make up a false etymology to make a word sound bad, do you then have an obligation to stop using it?

                            1. 1

                              For now, only H.264 is supported in the rkvdec driver. Support for H.264 High-10 profile is currently being discussed by the community. In addition, VP9 and HEVC are planned to be added soon.

                              Wait, does the RK3399 hardware already support both VP9 and HEVC?

                              1. 1
                                1.  

                                  Yeah. The upcoming RK3588 will also have AV1 decoding support (4K 60fps 10bit) acc. to CNX.

                                1. 8

                                  There are some really specious assertions in here.

                                  I don’t use any of the extensions the author cites as the “best” parts of Visual Studio Code.

                                  I also use vscodium instead of the Microsoft release because it’s been blessed by my employer’s security team and is 100% open source.

                                  Sure, emacs and vim are amazing tools. I’ve used them productively for many years and will continue to do so, but I also use and love VSCodium as my editor of choice when I’m not on a remote server, and I don’t use any of the extensions the author cites.

                                  I’m not suggesting anyone else use VSCode, I just want to be clear about the fact that the author’s “best parts” are at best highly subjective.

                                  1. 6

                                    I really want to use the Remote extension. The fact that it’s proprietary is not a problem, the fact that it is not very portable is. It basically runs a full VS Code headless on the remote end and so only supports the platforms where there are supported binary releases. I can’t run it on FreeBSD, for example, which is what my dev VMs typically run.

                                    1. 3

                                      I found the Remote extension to be problematic in a number of ways.

                                      First, they didn’t market it very well :) It wasn’t at all clear to me when I first tried to use it that what’s really happening is I’m spinning up a headless VSCode on the remote system.

                                      That’s a lot of JavaScript and a lot of complexity. That kind of complexity makes security teams crazy, and I don’t blame them.

                                      It’s an incredibly powerful extension, but its massive scope and complexity make it a poor fit for the security-conscious environments I tend to work in.

                                      1. 4

                                        Yes, I was quite surprised when I learned how it worked. I expected it to basically need to be able to launch a terminal to run build commands and to read / write files. Running other extensions on the remote side, rather than proxying read/write-file and run-command makes me really nervous.

                                    2. 3

                                      As a (mostly) non-user of VSCode, looking in from the outside, Remote and LiveShare (along with solid first-class LSP integration) absolutely look like a couple of things that I can’t do as well in other editors at the moment. So while there are clearly other use cases and many people may not need them, I do think they are among the few things that actually clearly differentiate VSCode from other options.

                                      1. 3

                                        They’re only differentiators IMO if you can actually use them.

                                        As I mentioned in another response, the Remote extension is great but basically amounts to installing a headless VSCode on your remote host. That’s gonna be a total deal breaker in any environment where security is even remotely a factor.

                                        1. 4

                                          In the same way as installing Emacs or a compiler is?

                                          1. 2

                                            In the same way as installing Emacs or a compiler is?

                                            The reason that VS Code’s remote extension runs a headless VS Code is extensions. Any extension that you use may come with arbitrary binaries that it wants to run to provide some of the functionality. In theory, it’s no less secure than running it locally but it can make people nervous if the remote machine is a production machine - generally, crashing the developer’s desktop or running it out of memory is less of a problem than doing the same thing on production hardware.

                                            The big problem for me with this approach is that it limits the targets to machines with official VS Code binary releases. I think ARM Linux is now supported but if you want to develop on a remote *BSD / Solaris / whatever VM or PowerPC / MIPS / RISC-V / whatever system, it’s a problem.

                                            1. 1

                                              Actually potentially much worse.

                                              emacs and most mainstream compilers have been fairly well vetted by any number of different security teams.

                                              The problem IMO with the VSCode Remote extension is that it represents an arbitrarily large body of Javascript that can’t easily be audited.

                                            2. 2

                                              Fair enough. It still seems like it might be useful for development environments at least, but that is a pretty big concern.

                                              1. 1

                                                basically amounts to installing a headless VSCode on your remote host

                                                Well there’s no need for any proprietary extensions to do just that: https://github.com/cdr/code-server

                                                1. 1

                                                  Have you actually tried / deployed this thing? I have, and it doesn’t do what you’re asserting here :)

                                                  The Remote extension lets you run your VSCode front end Electron-esque GUI on your local workstation but act on files on a remote Linux server.

                                                    The thing you’re pointing at here presents a web UI where you do everything, and it’s got some rather serious limitations.

                                          1. 1

                                            Neat build! I may have missed it, but do you just have the knob tied to volume?

                                              I’ve entered a few group buys recently to try out an ortholinear build and possibly an ErgoDox.

                                            I haven’t soldered anything before and thought it would be a fun project.

                                            Any suggestions on soldering irons? (I was looking at the TS80)

                                            1. 1

                                                Get the TS80P over the original TS80. You have to have PD; what’s even the point of USB-C if you’re going to use QC over it?

                                              1. 1

                                                Yup! I have one knob mapped to volume. The other currently is mapped to scroll up/down, but that’s just because it was the default. Not sure yet what to map that second knob to…

                                                I don’t have any good suggestions for soldering irons, I still have an old no-name one I bought from a hardware store several years ago. Gets the job done!

                                                1. 1

                                                  I made an ErgoDox layout with volume control buttons. When I added the second layer, I came up with the idea of having those keys do screen brightness.

                                                  There’s a kind of symmetry between the L1 and L2 functionalities IMO, and maybe brightness is something you could try out for the second knob!

                                              1. 2

                                                Is there no mode that would share the physical network port but tag all IPMI traffic with a VLAN you configure?

                                                1. 6

                                                    Many HPE servers have a dedicated network port for the iLO card but can also optionally share one of the regular network ports if needed. When in shared mode, you can indeed configure a VLAN tag for the management traffic, which can be different from the VLAN tag normally used by the host operating system.

                                                  1. 1

                                                      Unfortunately, in the same way that chris explained that any compromised host might be able to switch the device’s IPMI mode from dedicated to shared, using a VLAN for segregation can have a similar problem. If the compromised host adds a sub-interface with the tagged VLAN to its networking stack, it can gain network access to the entire IPMI VLAN.
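
                                                      As a sketch of why a tagged VLAN alone doesn’t isolate anything from the host itself, the attack described above is just a few commands on a Linux box (the interface name, VLAN ID, and address are hypothetical, and this needs root):

                                                      ```shell
                                                      # eth0 is the shared NIC; 42 is the (supposedly isolated)
                                                      # IPMI management VLAN.
                                                      ip link add link eth0 name eth0.42 type vlan id 42
                                                      ip addr add 10.0.42.99/24 dev eth0.42
                                                      ip link set eth0.42 up
                                                      # The host now has direct L2/L3 reachability to every BMC
                                                      # on VLAN 42, not just its own.
                                                      ```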

                                                    1. 2

                                                        In addition, there are other annoyances with using a shared interface. Because the OS has control of the NIC, it can reset the PHY. If the PHY is interrupted while, for example, you’re connected over Serial over LAN or a virtual KVM, you lose access. If you’re lucky, that’s temporary. If you’re really unlucky, the OS can continually reset the PHY, making IPMI access unusable. A malicious actor could abuse this to lock someone out of remote management.

                                                      That can’t happen when you use a dedicated interface for IPMI (other than explicit IPMI commands sent over /dev/ipmi0). Generally switching a BMC from dedicated mode to shared mode requires a BIOS/UEFI configuration change and a server reset.

                                                      (Speaking from experience with shared mode and the OS resetting the NIC. The malicious actor is merely a scenario I just dreamt up.)

                                                      1. 1

                                                        Indeed, although I suspect in many cases these IPMI modules are already accessible from the compromised host over SMBus/SMIC or direct serial interfaces anyway - possibly even with more privileged access than over the network. That’s how iLOs and DRACs can have their network and user/group settings configured from the operating system.

                                                        1. 4

                                                          The increased risk mostly isn’t to the compromised host’s own IPMI; as you note, that’s more or less under the control of the attacker once they compromise the host (although network access might allow password extraction attacks and so on). The big risk is to all of the other IPMIs on the IPMI VLAN, which would let an attacker compromise their hosts in turn. Even if an attacker doesn’t compromise the hosts, network access to an IPMI often allows all sorts of things you won’t like, such as discovering your IPMI management passwords and accounts (which are probably common across your fleet).

                                                          (I’m the author of the linked to article.)

                                                          1. 3

                                                            The L2 feature you are looking for is called a protected port. This should be available on any managed switch, but I’ll link to the cisco documentation:

                                                            https://www.cisco.com/en/US/docs/switches/lan/catalyst3850/software/release/3.2_0_se/multibook/configuration_guide/b_consolidated_config_guide_3850_chapter_011101.html
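
                                                              For reference, on Cisco IOS it’s a single per-interface command (the interface number here is illustrative); a protected port simply refuses to forward L2 traffic to any other protected port on the same switch:

                                                              ```
                                                              interface GigabitEthernet1/0/1
                                                               description ipmi-bmc
                                                               switchport protected
                                                              ```

                                                              Note that protected ports only isolate traffic within a single switch; to get the same isolation across multiple switches you need private VLANs.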

                                                            1. 1

                                                              In a previous life at a large hosting company, we used this feature on switch ports that were connected to servers for our managed backup services.

                                                  1. 1

                                                    In truth, Google was never going to buy your software. If you don’t use the AGPL, they’re just going to take your software and give nothing back. If you do use the AGPL, they’re just going to develop a solution in-house. There’s no outcome where Google pays you.

                                                    This is as valid an argument for the WTFPL as it is for the AGPL!! No outcome where corporations pay you → might as well not put any restrictions on it at all.

                                                    This is my licensing philosophy: for most personal projects — especially libraries — my code is a throw-away gift to the world, I don’t even need anyone to credit me, so Unlicense it is.

                                                    Copyleft only really makes sense for big, serious, directly “commercializable” applications, like LibreOffice or Postgres. (and yet, Postgres is permissively licensed, heh) These projects actually have the resources to hire lawyers to enforce copyleft, or at least the clout to get help with this from non-profit foundations. For personal or little-community projects, is anyone ever going to sue for GPL violations?!

                                                    1. 3

                                                      It’s interesting, because starting from the same point

                                                      my code is a throw-away gift to the world

                                                      I arrived at a totally different conclusion: I’d rather gift it to deserving individuals than to tax-dodging entities not bound to any kind of morality or ethical behavior.

                                                      So I distribute my stuff at least under the MPL, and if that makes big-corp X not use my work, I’m more than fine with it.

                                                      (I have reverted from MPL to a weaker license once in the past due to pressure from Rust people. In hindsight – I will never do this again.)

                                                      1. 2

                                                        If you do care about who gets to use it, it’s already not throw-away anymore.

                                                    1. 6

                                                      Deduplication (heh, my phone wants to correct to “reduplication”??) in ZFS is kind of a mis-feature that makes it easy to destroy performance. (I’ve had some painful experiences with it on a small mail server…) Pretty much everyone recommends not enabling it, ever. So indeed it’s not a realistic concern, but it is fun to think about.

                                                      It shouldn’t be that hard to add a setting to ZFS that would only show logicalused to untrusted users, not used.
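
                                                      For anyone curious, both properties already exist per dataset; a sketch of the difference (the pool/dataset name is made up):

                                                      ```shell
                                                      # "used" reflects on-disk space after dedup/compression kick in;
                                                      # "logicalused" is the logical size of what was written, so it
                                                      # leaks nothing about blocks shared with other datasets.
                                                      zfs get used,logicalused tank/jails/mail
                                                      ```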

                                                      1. 9

                                                        For folks not familiar with ZFS, just want to expand on what @myfreeweb said: “pretty much everyone” even includes the ZFS folks themselves. The feature has a bunch of warnings all over it about how you really need to be sure you need deduplication, and really you probably don’t need it, and by the way you can’t disable it later, so you’d better be damn sure.

                                                        btrfs’ implementation, though, does not AFAIK suffer from the performance problems ZFS’ does, because btrfs is willing to rewrite existing data pretty extensively, whereas ZFS is not: this operation (“Block Pointer Rewrite”) would, among other problems, break a bunch of the really smart algorithms it uses to make stuff like snapshot deletion fast. A btrfs filesystem after offline deduplication is not fundamentally different from the same filesystem before. ZFS deduplication fundamentally changes the filesystem because it adds a layer of indirection.

                                                        logicalused seems like a good idea. It doesn’t fix the timing side channel, though. I think you’d want to keep a rolling average of how long recent I/O requests took to service, plus the standard deviation. Then pick a value from that range somehow (someone better at statistics than me could tell you exactly how) and don’t return from the syscall for that amount of time. Losing the performance gain from a userspace perspective is unavoidable since that’s the whole point, but you can use that time (and more importantly, I/O bus bandwidth) to service other requests to the disk.

                                                        (Side note: my phone also wanted to correct to “reduplication”. Hilarious. Someone should stick that “feature” in a filesystem based on bogosort or something.)

                                                        1. 2

                                                          It shouldn’t be that hard to add a setting to ZFS that would only show logicalused to untrusted users, not used.

                                                          I think that’s harder than you think. The df(1) command will show you free space; I’m not sure you can set a quota that hides whether a file was deduplicated. Also, a user can use zpool(8) to see how much space is used in total.

                                                          However, I hardly think this is going to be a problem with ZFS, because, as you say, “Pretty much everyone recommends not enabling it ever”. I have never experienced a use case where deduplication in ZFS would be advantageous for me, on the contrary; ZFS gets slower because it has to look up every write in a deduplication table, and it uses more space because it has to keep a deduplication table. If you enable deduplication on ZFS without thorough research, you will be punished for it with poor performance long before security becomes an issue.

                                                          1. 2

                                                            I mean reporting logicalused everywhere, like in df, and hiding the zpools.

                                                            The pools would already be hidden if it’s e.g. a FreeBSD jail with a dataset assigned to it.

                                                        1. 3

                                                          I’m not sure if it’s just me, but suggesting that disk utilisation statistics can realistically be used as a law enforcement weapon in this way seems far-fetched. It’s also worth noting that there are all sorts of reasons why apparent disk utilisation might not be what it appears, including but not limited to overlay mounts, in-memory caching, and compression.

                                                          I’m also not sure that inferring that data exists across virtual machine boundaries is practical either. With lightweight containers maybe, but in a “real” virtual machine with emulated or paravirtual disk controllers, on-disk deduplication is invisible to the virtual machine. The virtual machine will have a filesystem of its own and will report the disk space as used regardless, even if it was deduplicated in the real world, because the filesystem descriptors on the virtual hard disk will say that it is used. How is the virtual machine supposed to know otherwise?

                                                          1. 4

                                                            to suggest that disk utilisation statistics can be realistically used as a law enforcement weapon in this way seems farfetched

                                                            Very much so. I worked as a digital forensics analyst for several years, and never once did this kind of technique even remotely appear. I wouldn’t go so far as to say that it would never be used; maybe in some crazy high-stakes case something like this could be tried as a very targeted last-ditch effort, but that would be a major exception. In most forensic cases you’re looking at either raw data blocks, where permissions aren’t an issue anyway, or encrypted blobs, where a technique like this wouldn’t make sense either.

                                                            So perhaps the author is focusing more on typical malicious software.

                                                            1. 3

                                                              VMs were mentioned in the context of timing, not space utilization; i.e., you could detect (I guess) a suspiciously fast sync write and infer that it didn’t have to write the data to disk because it was detected as being already there.

                                                              1. 4

                                                                Perhaps this can be reproduced in controlled conditions with very specific hardware but it also seems like a stretch otherwise. The moment you move away from a single 5400rpm disk up to a hardware RAID controller or a full SAN appliance, you’re suddenly subject to fabric congestion, flash write caches and possibly even multi-tiered storage. That’s still ignoring the fact that the emulated/paravirtual disk controller still has to handle the writes as if they are really happening, as deduplication won’t be happening until the hypervisor writes virtual blocks out to real disk. I suppose the notable exception here is with raw device mappings or cases where something like iSCSI is used to present a LUN to the virtual machine itself, but you’d still be subject to a whole range of other conditions first.

                                                            1. 7

                                                              https://en.wikipedia.org/wiki/Betteridge%27s_law_of_headlines but seriously, I don’t see this taking off. Open source OSs can take on Microsoft with enough coders because it’s just software; hardware is a very different business. I wish it could happen, but it’s very doubtful IMHO.

                                                              1. 31

                                                                Depends on what you mean by ‘taking off’. RISC-V has successfully killed a load of in-house ISAs (and good riddance!). For small control-plane processors, you don’t care about performance or anything else much, you just want a cheap Turing-complete processor with a reasonably competent C compiler. If you don’t have to implement the C compiler, that’s a big cost saving. RISC-V makes a lot of sense for things like the nVidia control cores (which exist to set up the GPU cores and do management things that aren’t on the critical path for performance). It makes a lot of sense for WD to use instead of ARM for the controllers on their SSDs: the ARM license costs matter in a market with razor-thin margins, power and performance are dominated by the flash chips, and they don’t need any ecosystem support beyond a bare-metal C toolchain.

                                                                The important lesson for RISC-V is why MIPS died. MIPS was not intended as an open ISA, but it was a de-facto one. Aside from LWL / LWR, everything in the ISA was out of patent. Anyone could implement an almost-MIPS core (and GCC could target MIPS-without-those-two-instructions) and many people did. Three things killed it in the market:

                                                                First, fragmentation. This also terrifies ARM. Back in the PDA days, ARM handed out licenses that allowed people to extend the ISA. Intel’s XScale series added a floating-point extension called Wireless MMX that was incompatible with the ARM floating point extension. This cost a huge amount for software maintenance. Linux, GCC, and so on had to have different code paths for Intel vs non-Intel ARM cores. It doesn’t actually matter which one was better; the fact that both existed prevented Linux from moving to a hard-float ABI for userland for a long time: the calling convention passed floating-point values in integer registers, so code could either call a soft-float library or be compiled for one or the other floating-point extensions and still interop with other libraries that were portable across both. There are a few other examples, but that’s the most painful one for ARM. In contrast, every MIPS vendor extended the ISA in incompatible ways. The baseline for 64-bit MIPS is still often MIPS III (circa 1991) because it’s the only ISA that all modern 64-bit MIPS processors can be expected to handle. Vendor extensions only get used in embedded products. RISC-V has some very exciting fragmentation already, with both a weak memory model and TSO: the theory is that TSO will be used for systems that want x86 compatibility, the weak model for things that don’t, but code compiled for the TSO cores is not correct on weak cores. There are ELF header flags reserved to indicate which is which, but it’s easy to compile code for the weak model, test it on a TSO core, see it work, and have it fail in subtle ways on a weak core. That’s going to cause massive headaches in the future, unless all vendors shipping cores that run a general-purpose OS go with TSO.

                                                                Second, a modern ISA is big. Vector instructions, bit-manipulation instructions, virtualisation extensions, two-pointer atomic operations (needed for efficient RCU and a few other lockless data structures) and so on. Dense encoding is really important for performance (i-cache usage). RISC-V burned almost all of their 32-bit instruction space in the core ISA. It’s quite astonishing how much encoding space they’ve managed to consume with so few instructions. The C extension consumes all of the 16-bit encoding space and is severely over-fitted to the output of an unoptimised GCC on a small corpus of C code. At the moment, every vendor is trampling over all of the other vendors in the last remaining bits of the 32-bit encoding space. RISC-V really should have had a 48-bit load-64-bit-immediate instruction in the core spec to force everyone to implement support for 48-bit instructions, but at the moment no one uses the 48-bit space and infrequently used instructions are still consuming expensive 32-bit real-estate.

                                                                Third, the ISA is not the end of the story. There’s a load of other stuff (interrupt controllers, DMA engines, management interfaces, and so on) that need to be standardised before you can have a general-purpose compute platform. Porting an OS to a new ARM SoC used to be a huge amount of effort because of this. It’s now a lot easier because ARM has standardised a lot of this. x86 had some major benefits from Compaq copying IBM: every PC had a compatible bootloader that provided device enumeration and some basic device interfaces. You could write an OS that would access a disk, read from a keyboard, and write text to a display for a PC that would run on any PC (except the weird PC98 machines from Japan). After early boot, you’d typically stop doing BIOS thunks and do proper PCI device enumeration and load real drivers, but that baseline made it easy to produce boot images that ran on all hardware. The RISC-V project is starting to standardise this stuff but it hasn’t been a priority. MIPS never standardised any of it.

                                                                The RISC-V project has had a weird mix from the start of explicitly saying that it’s not a research project and wants to be simple and also depending on research ideas. The core ISA is a fairly mediocre mid-90s ISA. It’s fine, but turning it into something that’s competitive with modern x86 or AArch64 is a huge amount of work. Some of those early design decisions are going to need to either be revisited (breaking compatibility) or are going to incur technical debt. The first RISC-V spec was frozen far too early, with timelines largely driven by PhD students needing to graduate rather than the specs actually being in a good state. Krste is a very strong believer in micro-op fusion as a solution to a great many problems, but if every RISC-V core needs to be able to identify 2-3 instruction patterns and fuse them into a single micro-op to do operations that are a single instruction on other ISAs, that’s a lot of power and i-cache being consumed just to reach parity. There’s a lot of premature optimisation (e.g. instruction layouts that simplify decoding on an in-order core) that hurts other things (e.g. uses more encoding space than necessary), where the saving is small and the cost will become increasingly large as the ISA matures.

                                                                AArch64 is a pretty well-designed instruction set that learns a lot of lessons from AArch32 and other competing ISAs. RISC-V is very close to MIPS III at the core. The extensions are somewhat better, but they’re squeezed into the tiny amount of left-over encoding space. The value of an ecosystem with no fragmentation is huge. For RISC-V to succeed, it needs to get a load of the important extensions standardised quickly, define and standardise the platform specs (underway, but slow, and without enough of the people who actually understand the problem space contributing, not helped by the fact that the RISC-V Foundation is set up to discourage contributions), and get software vendors to agree on those baselines. The problem is that, for a silicon vendor, one big reason to pick RISC-V over ARM is the ability to differentiate your cores by adding custom instructions. Every RISC-V vendor’s incentives are therefore diametrically opposed to the goals of the ecosystem as a whole.

                                                                1. 3

                                                                  Thanks for this well laid out response.

                                                                  The problem is that, for a silicon vendor, one big reason to pick RISC-V over ARM is the ability to differentiate your cores by adding custom instructions. Every RISC-V vendor’s incentives are therefore diametrically opposed to the goals of the ecosystem as a whole.

                                                                  This is part of what makes me skittish, as well. I almost prefer the ARM model, which keeps a lid on fragmentation, to RISC-V’s “linux distro” model. But also, deep down, if we manage to create the tooling for binaries to adapt to something like this and have a form of Universal Binary that progressively enhances with present CPUIDs, that would make for an exciting space.

                                                                  1. 6

                                                                    But also, deep down, if we manage to create the tooling for binaries to adapt to something like this and have a form of Universal Binary that progressively enhances with present CPUIDs, that would make for an exciting space.

                                                                    Apple has been pretty successful at this, encouraging developers to distribute LLVM IR so that they can do whatever microarchitectural tweaks they want for any given device. Linux distros could do something similar if they weren’t so wedded to GCC and FreeBSD could if they had more contributors.

                                                                    You can’t do it with one-time compilation very efficiently because each vendor has a different set of extensions, so it’s a combinatorial problem. The x86 world is simpler because Intel and AMD almost monotonically add features. Generation N+1 of Intel CPUs typically supports a superset of generation N’s features (unless they completely drop something and are never bringing it back, such as MPX) and AMD is the same. Both also tend to adopt popular features from the other, so you have a baseline that moves forwards. That may eventually happen with RISC-V but the scarcity of efficient encoding space makes it difficult.

                                                                    On the other hand, if we enter Google’s dystopia, the only AoT-compiled code will be Chrome and everything else will be JavaScript and WebAssembly, so your JIT can tailor execution for whatever combination of features your CPU happens to have.

                                                                    1. 1

                                                                      Ultimately, vendor extensions are just extensions. Suppose a CPU is RV64GC plus proprietary extensions; RV64GC code would still work on it.

                                                                      This is much, much better than the alternative (vendor-specific instructions implemented without extensions).

                                                                    2. 2

                                                                      Vendor extensions only get used in embedded products. RISC-V has some very exciting fragmentation already, with both a weak memory model and TSO: the theory is that TSO will be used for systems that want x86 compatibility, the weak model for things that don’t, but code compiled for the TSO cores is not correct on weak cores. There are ELF header flags reserved to indicate which is which, but it’s easy to compile code for the weak model, test it on a TSO core, see it work, and have it fail in subtle ways on a weak core. That’s going to cause massive headaches in the future, unless all vendors shipping cores that run a general-purpose OS go with TSO.

                                                                      I don’t understand why they added TSO in the first place.

                                                                      Third, the ISA is not the end of the story. There’s a load of other stuff (interrupt controllers, DMA engines, management interfaces, and so on) that need to be standardised before you can have a general-purpose compute platform. Porting an OS to a new ARM SoC used to be a huge amount of effort because of this. It’s now a lot easier because ARM has standardised a lot of this. x86 had some major benefits from Compaq copying IBM: every PC had a compatible bootloader that provided device enumeration and some basic device interfaces. You could write an OS that would access a disk, read from a keyboard, and write text to a display for a PC that would run on any PC (except the weird PC98 machines from Japan). After early boot, you’d typically stop doing BIOS thunks and do proper PCI device enumeration and load real drivers, but that baseline made it easy to produce boot images that ran on all hardware. The RISC-V project is starting to standardise this stuff but it hasn’t been a priority. MIPS never standardised any of it.

                                                                      Yeah this part bothers me a lot. It looks like a lot of the standardization effort is just whatever OpenRocket does, but almost every RISC-V cpu on the market right now has completely different peripherals outside of interrupt controllers. Further, there’s no standard way to query the hardware, so creating generic kernels like what is done for x86 is effectively impossible. I hear there’s some work on ACPI which could help.

                                                                      1. 7

                                                                        I don’t understand why they added TSO in the first place.

                                                                        Emulating x86 on weakly ordered hardware is really hard. Several companies have x86-on-ARM emulators. They either only work with a single core, insert far more fences than are actually required, or fail subtly on concurrent data structures. It turns out that after 20+ years of people trying to implement TSO efficiently, there are some pretty good techniques that don’t sacrifice much performance relative to software that correctly inserts the fences and perform a lot better on the software a lot of people write where they defensively insert too many fences because it’s easier than understanding the C++11 memory model.

                                                                        Yeah this part bothers me a lot. It looks like a lot of the standardization effort is just whatever OpenRocket does, but almost every RISC-V cpu on the market right now has completely different peripherals outside of interrupt controllers. Further, there’s no standard way to query the hardware, so creating generic kernels like what is done for x86 is effectively impossible. I hear there’s some work on ACPI which could help.

                                                                        Initially they proposed their own thing that was kind-of like FDT but different, because Berkeley. Eventually they were persuaded to use FDT for embedded things and something else (probably ACPI) for more general-purpose systems.

                                                                        The weird thing is that Krste really understands the value of an interoperable ecosystem. He estimates the cost of building it at around $1bn (ARM thinks he’s off by a factor of two, but either way it’s an amount that the big four tech companies could easily spend if it were worthwhile). Unfortunately, the people involved with the project early were far more interested in getting VC money than in trying to build an open ecosystem (and none of them really had any experience with building open source communities and refused help from people who did).

                                                                        1. 2

                                                                          Are the Apple and Microsoft emulators on the “far more fences than are actually required” side? They don’t seem to have many failures..

                                                                          1. 2

                                                                            I don’t know anything about the Apple emulator and since it runs only on Apple hardware, it’s entirely possible that either Apple’s ARM cores are TSO or have a TSO mode (TSO is strictly more strongly ordered than the ARM memory model, so it’s entirely conformant to be TSO). I can’t share details of the Microsoft one but you can probably dump its output and look.

                                                                        2. 2

                                                                          there’s no standard way to query the hardware, so creating generic kernels like what is done for x86 is effectively impossible

                                                                          Well, device trees (FDT) solve the “generic kernel” problem specifically, but it all still sucks. Everything is so much better when everyone has standardized most peripherals.

                                                                          1. 1

                                                                            That’s the best solution, but you still have to have the bootloader pass in a device tree, and that device tree won’t get updated at the same cadence as the kernel (so a fix may take a while to reach you if someone finds a bug in a device tree).

                                                                            1. 2

                                                                              For most devices it’s the kernel that maintains the device tree. FDT is not really designed for a stable description, it changes with the kernel’s interface.

                                                                              1. 2

                                                                                FDT is not specific to a kernel. The same FDT blobs work with FreeBSD and Linux, typically. It’s just a description of the devices and their locations in memory. It doesn’t need to change unless the hardware changes and if you’re on anything that’s not deeply embedded it’s often shipped with U-Boot or similar and provided to the kernel. The kernel then uses it to find any devices it needs in early boot or which are attached to the core via interface that don’t support dynamic enumeration (e.g. you would put the PCIe root complex in FDT but everything on the bus is enumerated via the bus).

                                                                                The reason for a lot of churn recently has been the addition of overlays to the FDT spec. These allow things that are equivalent to option roms to patch the root platform’s FDT so you can use FDT for expansions connected via ad-hoc non-enumerable interfaces.

                                                                                1. 2

                                                                                  It doesn’t need to change.. but Linux developers sometimes like to find “better” ways of describing everything, renaming stuff, etc. To be fair in 5.x this didn’t really happen all that much.

                                                                                  And of course it’s much worse if non-mainline kernels are introduced. If there’s been an FDT for a vendor kernel that shipped with the device, and later drivers got mainlined, the mainlined drivers often expect different properties completely because Linux reviewers don’t like vendor ways of doing things, and now you need very different FDT..

                                                                                  The reason for a lot of churn recently has been the addition of overlays to the FDT spec

                                                                                  That’s not that recent?? Overlays are from like 2017..

                                                                          2. 1

                                                                            Further, there’s no standard way to query the hardware, so creating generic kernels like what is done for x86 is effectively impossible. I hear there’s some work on ACPI which could help.

                                                                            There’s apparently serious effort put into UEFI.

                                                                            With rpi4 uefi boot, FDT isn’t used. I suppose UEFI itself has facilities to make FDT redundant.

                                                                            1. 2

                                                                              With RPi4-UEFI, you have a choice between ACPI and FDT in the setup menu.

                                                                              It’s pretty clever what they did with ACPI: the firmware fully configures the PCIe controller by itself and presents a generic XHCI device in the DSDT as if it was just a directly embedded non-PCIe memory-mapped XHCI.

                                                                              1. 1

                                                                                I have to ask, what is the benefit of special casing the usb3 controller?

                                                                                1. 2

                                                                                  The OS does not need to have a driver for the special Broadcom PCIe host controller.

                                                                                  1. 1

                                                                                    How is the Ethernet handled?

                                                                                    1. 2

                                                                                      Just as a custom device, how else? :)

                                                                                      Actually it’s kinda sad that there’s no standardized Ethernet “host controller interface” still… (other than some USB things)

                                                                                      1. 1

                                                                                        Oh. So Ethernet is not on PCIe to begin with, then. Only XHCI. I see.

                                                                          3. 1

                                                                            This doesn’t paint a very good picture of RISC-V, IMHO. It’s like some parody of worse-is-better design philosophy, combined with basically ignoring all research in CPU design since 1991 for a core that’s easy to make an educational implementation for that makes the job of compiler authors and implementers harder. Of course, it’s being peddled by GNU zealots and RISC revanchists, but it won’t benefit the things they want; instead, it’ll benefit vanity nationalist CPU designs (that no one will use except the GNU zealots; see Loongson) and deeply fragmented deep embedded (where software freedom and ISA doesn’t matter other than shaving licensing fees off).

                                                                            1. 3

                                                                              Ignoring the parent and focusing on hard data instead, RV64GC has higher code density than ARM, x86 and even MIPS16, so the encoding they chose isn’t exactly bad, objectively speaking.

                                                                              1. 8

                                                                                Note that Andrew’s dissertation is using integer-heavy, single-threaded, C code as the evaluation and even then, RISC-V does worse than Thumb-2 (see Figure 8 of the linked dissertation). Once you add atomics, higher-level languages, or vector instructions, you see a different story. For example, RISC-V made an early decision to make the offset of loads and stores scaled with the size of the memory value. Unfortunately, a lot of dynamic languages set one of the low bits to differentiate between a pointer and a boxed value. They then use a complex addressing mode to combine the subtraction of one with the addition of the field offset for field addressing. With RISC-V, this requires two instructions. You won’t see that pattern in pure C code anywhere but you’ll see it all over the place in dynamic language interpreters and JITs.

                                                                                1. 1

                                                                                  I think there was another example of something far more basic that takes two instructions on RISC-V for no good reason, just because of their obsession with minimal instructions. Something return related?? Of course I lost the link to that post >_<

                                                                                  1. 1

                                                                                    Interesting. There’s work on an extension to help interpreters and JITs, which may or may not help mitigate this.

                                                                                    In any event, it is far from ready.

                                                                                    1. 6

                                                                                      I was the chair of that working group but I stepped down because I was unhappy with the way the Foundation was being run.

                                                                                      The others involved are producing some interesting proposals, though a depressing amount of it is trying to fix fundamentally bad design decisions in the core spec. For example, the i-cache is not coherent with respect to the d-cache on RISC-V. That means you need explicit sync instructions after every modification to a code page. The hardware cost of making them coherent is small (i-cache lines need to participate in cache coherency, but they can only ever be in shared state, so the cache doesn’t have to do much; if you have an inclusive L2, then the logic can all live in L2) but the overheads from not doing it are surprisingly high. SPARC changed this choice because the overhead on process creation, from the run-time linker having to do i-cache invalidates on every mapped page, was huge. Worse, RISC-V’s i-cache invalidate instruction is local to the current core. That means that you actually need to do a syscall, which does an IPI to all cores, which then invalidates the i-cache. That’s insanely expensive, but the initial measurements were from C code on a port of Linux that didn’t do the invalidates (and didn’t break because the i-cache was so small you were never seeing the stale entries).

                                                                                      1. 1

                                                                                        L1$ not coherent

                                                                                        Christ. How did that go anywhere?

                                                                                        1. 4

                                                                                          No one who had worked on an non-toy OS or compiler was involved in any of the design work until all of the big announcements had been made and the spec was close to final. The Foundation was set up so that it was difficult for any individuals to contribute (that’s slowly changing) - you had to pay $99 or ask for the fee to be waived to give feedback on the specs as an individual. You had to pay more to provide feedback as a corporation and no corporation was going to pay thousands of dollars membership and the salaries of their contributors to provide feedback unless they were pretty confident that they were going to use RISC-V.

                                                                                          It probably shouldn’t come as a surprise that saying to people ‘we need your expertise, please pay us money so that you can provide it’ didn’t lead to a huge influx of expert contributors. There were a few, but not enough.

                                                                            2. 7

                                                                              Keep in mind an ISA isn’t hardware, it’s just a specification.

                                                                              1. 6

                                                                                That ties into my point - RISC-V is kinda useless without fabbing potential. And that’s insanely expensive, which means the risk involved is too high to take on established players.

                                                                                1. 9

                                                                                  According to the article, it seems that Samsung, Western Digital, NVIDIA, and Qualcomm don’t think the risk is too high, since they plan to use RISC-V. They have plenty of money to throw at any problems, such as inadequate fabbing potential. Hobbyists may benefit from RISC-V, but (like Linux) it’s not just for hobbyists.

                                                                                  1. 8

                                                                                    According to the article, it seems that Samsung, Western Digital, NVIDIA, and Qualcomm don’t think the risk is too high, since they plan to use RISC-V.

                                                                                    I think it is more accurate to say they plan to use the threat of RISC-V to improve their negotiating position, use it in some corner cases, and keep it as a last-ditch hedge. Tizen is a prime example of such a product.

                                                                                    1. 2

                                                                                      I think it is more accurate to say they plan to use the threat of RISC-V to improve their negotiating position, use it in some corner cases, and keep it as a last-ditch hedge.

                                                                                      Yet WD and NVIDIA designed their own RISC-V cores. Isn’t it a bit too much for “insurance”?

                                                                                      The fact here is that they do custom silicon and need CPUs in them for a variety of purposes. Until now, they paid the ARM tax. From now on, they don’t have to, because they can and do just use RISC-V.

                                                                                      I’m appalled at how grossly the impact of RISC-V is being underestimated.

                                                                                      1. 4

                                                                                        Yet WD and NVIDIA designed their own RISC-V cores. Isn’t it a bit too much for “insurance”?

                                                                                        I don’t think so – it isn’t purely insurance, it is negotiating power. The power can be worth tens (even hundreds) of millions for companies at the scale of WD and NVIDIA. Furthermore, they didn’t have to build fabs from scratch; both have existing manufacturing prowess and facilities. I think it is a rather straightforward ROI-based business decision.

                                                                                        The fact here is that they do custom silicon and need CPUs in them for a variety of purposes. Until now, they paid the ARM tax. From now on, they don’t have to, because they can and do just use RISC-V.

                                                                                        They will use this to lower the ARM tax without actually pulling the trigger on going with something as different as RISC-V (except on a few low yield products to prove they can do it, see Tizen and Samsung’s strategy).

                                                                                        I’m appalled at how grossly the impact of RISC-V is being underestimated.

                                                                                        Time will tell, but I think RISC-V only becomes viable if Apple buys ARM and snuffs out new customers, maintaining only existing contracts.

                                                                                        1. 1

                                                                                          I don’t think so – it isn’t purely insurance, it is negotiating power.

                                                                                          Do you think they have any reason left to license ARM, when they clearly can do without?

                                                                                          Time will tell, but I think RISC-V only becomes viable if Apple buys ARM and snuffs out new customers, maintaining only existing contracts.

                                                                                          I see too much industry support behind RISC-V at this point. V extension will be quite the spark, so we’ll see how it plays out after that. All it’ll take is one successful high performance commercial implementation.

                                                                                          1. 2

                                                                                            Do you think they have any reason left to license ARM, when they clearly can do without?

                                                                                            I think you are underestimating the cost of rebuilding an entire ecosystem. I have run ThunderX arm64 servers in production – ARM has massive support behind it and we still fell into weird issues, niches, and problems. Our task (large-scale OCR) was a fantastic fit, and it was still a tough setup; in the end, due to poor optimizations and other support issues, it probably wasn’t worth it.

                                                                                            I see too much industry support behind RISC-V at this point. V extension will be quite the spark, so we’ll see how it plays out after that. All it’ll take is one successful high performance commercial implementation.

                                                                                            Well – I think it actually takes a marketplace of commercial implementations, so that selecting RISC-V isn’t single-vendor lock-in forever, but I take your meaning.

                                                                                    2. 3

                                                                                      As I said up top, I hope this really happens. But I’m not super confident it’ll ever be something we can use to replace our AMD/Intel CPUs. If it just wipes out the current microcontroller and small CPU space that’s good too, since those companies don’t usually have good tooling anyway.

                                                                                      I just think features-wise it’ll be hard to beat the current players.

                                                                                      1. 1

                                                                                        I just think features-wise it’ll be hard to beat the current players.

                                                                                        Can you elaborate on this point?

                                                                                        What are the features? Who are the current players?

                                                                                        1. 4

                                                                                          Current players are AMD64 and ARM64. Features lacking in RV64 include vector extension.

                                                                                          1. 4

                                                                                            I notice you’re not the author of the parent post. Still,

                                                                                            Features lacking in RV64 include vector extension.

                                                                                            The V extension is due to be an active standard by September if all goes well. That is practically like saying “tomorrow” from an ISA-timeline perspective. To put it into context, RISC-V was introduced in 2010.

                                                                                            Bit manipulation (B) is also close to active standard, and also pretty important.

                                                                                            With these extensions out of the way, and software support where it is today, I see no features stopping low power, high performance implementations appearing and getting into smartphones and such.

                                                                                            AMD64 and ARM64.

                                                                                            The amd64 ISA is CISC legacy. Popular or not, it’s long overdue for replacement.

                                                                                            ARM64 isn’t a thing. You might have meant aarch64 or armv8.

                                                                                            I’m particularly interested whether the parent meant ISAs or some company names regarding current players.

                                                                                            1. 4

                                                                                              ARM64 isn’t a thing. You might have meant aarch64 or armv8.

                                                                                              The naming is a disaster :/ armv8 doesn’t specifically mean 64-bit because there’s technically an armv8 aarch32, and aarch64/32 is just an awful name that most people don’t want to say out loud. So even ARM employees are okay with the unofficial “arm64” name.


                                                                                              Another player is IBM with OpenPOWER. Relatively fringe compared to ARM64 (which the Bezos “Cloud” Empire is all-in on, yay), but hey, there is a supercomputer and some very expensive workstations for open source and privacy enthusiasts :) and all the businesses buying IBM’s machines that we don’t know much about. That’s much more than desktop/server-class RISC-V… and they made the POWER ISA royalty-free now too, I think.

                                                                                              1. 4

                                                                                                SPARC is also completely open. Yes, POWER is open now, but I don’t see why it would fare better than SPARC.

                                                                                                1. 1

                                                                                                  In terms of diversity of core designers and chip makers, maybe not. But POWER generally just as an ISA is doing much better. IBM clearly cares about making new powerful chips and is cultivating a community around open firmware.

                                                                                                  Who cares about SPARC anymore? Seems like for Oracle it’s kind of a liability. And Fujitsu, probably the most serious SPARC company as of late, is on ARM now.

                                                                                                2. 3

                                                                                                  The naming is a disaster :/ armv8 doesn’t specifically mean 64-bit because there’s technically an armv8 aarch32

                                                                                                  Amusingly, the first version of the ARMv8 spec made both AArch32 and AArch64 optional. I implemented a complete 100% standards-compliant soft core based on that version of the spec. They later clarified it so that you had to implement at least one out of AArch32 and AArch64.

                                                                                              2. 1

                                                                                                Current players are AMD64 and ARM64

                                                                                                And ARM32/MIPS/AVR/SuperH/pick your favorite embedded ISA. The first real disruption brought by RISC-V will be in microcontrollers and in ASICs. With RISC-V you aren’t tying your board/chip to a single company (like ARM Holdings, MIPS Technologies, Renesas, etc.). If they go under/decide to get out of the uC market/slash their engineering budgets/start charging double, you can always license from another vendor (or roll your own core). In addition, the tooling for RISC-V is getting good fairly fast and is mostly open source. You don’t have to use the vendor’s closed-source C compiler, or be locked into their RTOS.

                                                                                                1. 1

                                                                                                  The first real disruption

                                                                                                  Indeed. The second wave is going to start soon, triggered by the V and B extensions reaching stable status.

                                                                                                  This will enable Qualcomm and friends to release smartphone-tier SoCs with RISC-V CPUs in them.

                                                                                        2. 6

                                                                                          Yes, fabbing is expensive, but SiFive is a startup and it still managed to fab RISC-V chips that can run a Linux desktop. I don’t think there is a need to be too pessimistic.

                                                                                          1. 4

                                                                                            The economics are quite interesting here. Fabs are quite cheap if you are a brand-new startup or a large established player. They give big discounts to small companies that have the potential to grow into large customers (because if you get them early then they end up at least weakly tied into a particular cell library and you have a big long-term revenue stream). They give good prices to big customers, because they amortise the setup costs over a large number of wafers. For companies in the middle, the prices are pretty bad. SiFive is currently getting the steeply discounted rate. It will be interesting to see what happens as they grow.

                                                                                          2. 5

                                                                                            RISC-V is kinda useless without fabbing potential.

                                                                                            The RISC-V Foundation has no interest in fabbing chips itself.

                                                                                            And that’s insanely expensive, which means the risk involved is too high to take on established players.

                                                                                            Several chips with CPUs in them based on RISC-V have been fabricated. Some already ship as components in other products. Some of these chips are available for sale.

                                                                                            RISC-V’s got significant industry backing.

                                                                                            Refer to: https://riscv.org/membership/

                                                                                            1. 3

                                                                                              There are a number of companies that provide design and fabbing services, or at least help you realize that.

                                                                                              The model is similar to e.g. Solr, where the core is an open-source implementation, but enterprise services are actually provided by a number of companies.

                                                                                              1. 2

                                                                                                With ARM on the market, RISC-V has to be on a lot of people’s minds; specifically, those folks that are already licensing ARM’s ISA, and producing chips…

                                                                                            2. 3

                                                                                              Open source OSs can take on Microsoft with enough coders because it’s just software

                                                                                              Yet we haven’t seen that happen either. In general, creating a product that people love requires a bit more than open-source software. It requires vision, a deep understanding of humans, and a rock-solid implementation. This usually means the cathedral approach, which is exactly the opposite of the FOSS approach.

                                                                                              1. 5

                                                                                                Maybe not for everyone on the market, but I’ve been using Linux exclusively for over 10 years now, and I’m not the only one. Also, for some purposes (smartphones, servers, SBCs, a few others) Linux is almost the only choice.

                                                                                                1. 3

                                                                                                  You are absolutely in the minority though in terms of desktop computing. The vast majority of people can barely get their hand held through Mac OS, much less figuring out wtf is wrong with their graphics drivers or figuring out why XOrg has shit out on them for the 3rd time that week, or any number of problems that can (and do) crop up when using Linux on the desktop. Android, while technically Linux, doesn’t really count IMO because it’s almost entirely driven by the vision, money, and engineers of a single company that uses it as an ancillary to their products.

                                                                                                  1. 6

                                                                                                    That’s a bit of a stereotype - I haven’t edited an Xorg conf file in a very long time. It’s my daily driver so stability is a precondition. My grandma runs Ubuntu and it’s fine for what she needs.

                                                                                                    1. 3

                                                                                                      Not XOrg files anymore, maybe monitors.xml, maybe it’s xrandr, whatever. I personally just spent 4+ hours trying to get my monitor + graphics setup to behave properly with my laptop just last week. Once it works, it tends to keep working (though not always, it’s already broken on me once this week for seemingly no reason) unless you change monitor configuration or it forgets your config for some reason, but getting it to work in the first place is a massive headache depending on the hardware. Per-monitor DPI scaling is virtually impossible on XOrg, and Wayland is still a buggy mess with limited support. Things get considerably more complex with a HiDPI, multi-head setup, which are things that just work on Windows or Mac OS.

                                                                                                      The graphics ecosystem for Linux is an absolute joke too. That being said, my own mother runs Ubuntu on a computer that I built and set up, it’s been relatively stable since I put in the time to get it working in the first place.

                                                                                                2. 2

                                                                                                  Not on the desktop, for sure. Server side, however, GNU/Linux is a no-brainer, the default choice.

                                                                                              1. 16

                                                                                                In here, we see another case of somebody bashing PGP while tacitly claiming that x509 is not a clusterfuck of similar or worse complexity.

                                                                                                I’d also like a more honest read on how a mechanism that provides ephemeral key exchange and host authentication can be used with the same goal as PGP, which is closer to end-to-end encryption of an email (granted they aren’t using something akin to Keycloak). The desired goals of an ideal vulnerability-reporting mechanism would be good to know, in order to see why PGP is an issue now, and why an HTTPS form is any better in terms of vulnerability information management (both at rest and in transit).

                                                                                                1. 22

                                                                                                  In here, we see another case of somebody bashing PGP while tacitly claiming that x509 is not a clusterfuck of similar or worse complexity.

                                                                                                  Let’s not confuse the PGP message format with the PGP encryption system. Both PGP and x509 encodings are a genuine clusterfuck; you’ll get no dispute from me there. But TLS 1.3 is dramatically harder to mess up than PGP, has good modern defaults, can be enforced on communication before any content is sent, and offers forward secrecy. PGP-encrypted email offers none of these benefits.

                                                                                                  1. 6

                                                                                                    But TLS 1.3 is dramatically harder to mess up than PGP,

                                                                                                    With a user-facing tool that has plugged out all the footguns? I agree

                                                                                                    has good modern defaults,

                                                                                                    If you take care to, say, curate your list of ciphers often and check the ones vetted by a third party (say, by checking https://cipherlist.eu/), then sure. Otherwise I’m not sure I agree (hell, TLS has a null cipher).

                                                                                                    can be enforced on communication before any content is sent

                                                                                                    There’s a reason why there’s active research trying to plug privacy holes such as SNI. There’s so much surface to the whole stack that I would not be comfortable making this claim.

                                                                                                    offers forward secrecy

                                                                                                    I agree, although I don’t think it would provide non-repudiation (at least without adding signed exchanges, which I think is still a draft) and without mutual TLS authentication, which can be achieved with PGP quite easily.

                                                                                                    1. 1

                                                                                                      take care to, say, curate your list of ciphers often and check the ones vetted by a third party

                                                                                                      There are no bad ciphers in 1.3; it’s a small list, so you could just kill the earlier TLS versions :)

                                                                                                      Also, popular web servers already come with reasonable default cipher lists for 1.2. Biased towards more compatibility but not including NULL, MD5 or any other disaster.
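                                                                                                      For instance, in nginx the whole decision can be one directive (a sketch, not a drop-in config; whether you can drop 1.2 depends on your clients):

                                                                                                      ```nginx
                                                                                                      # Only TLS 1.3 suites remain, and all of them are AEAD with forward secrecy.
                                                                                                      ssl_protocols TLSv1.3;
                                                                                                      # If legacy clients still matter, keep 1.2 with a curated list instead:
                                                                                                      # ssl_protocols TLSv1.2 TLSv1.3;
                                                                                                      # ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
                                                                                                      ```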

                                                                                                      I don’t think it would provide non-repudiation

                                                                                                      How often do you really need it? It’s useful for official documents and stuff, but who needs it on a contact form?

                                                                                                    2. 3

                                                                                                      I want to say that it only provides DNS-based verification, but then again, how are you going to get the right PGP key?

                                                                                                      1. 3

                                                                                                        PGP does not have only one trust model, and that is a good part of it: you choose, according to the various sources of trust (TOFU through Autocrypt, having also seen the key on the website, having got the keys IRL, or having signed messages proving it is the right one for Mr. Doe…).

                                                                                                        Hopefully browsers and various TLS clients could mainstream such a model, and let YOU choose what you consider safe, rather than what (highly) paid certificate authorities decide.

                                                                                                        1. 2

                                                                                                          I agree that there is more flexibility and that you could get the fingerprint from the website and have the same security.

                                                                                                          Unfortunately, for example the last method doesn’t work. You can sign anybody’s messages. Doesn’t prove your key is theirs.

                                                                                                          The mantra “flexibility is an enemy of security” may apply.

                                                                                                          1. 1

                                                                                                            I meant content whose exclusive disclosure is in a signed message, such as “you remember that time at the bridge, I told you the boat was blue, you told me you are colorblind”.

                                                                                                            [EDIT: I realize I had in mind that these messages would be sent through another secure transport, until external facts about the identity of the person at the other end of the pipe get good enough. This brings us to the threat model of Autocrypt (which aims to work through email only): a passive attacker, along with the aim of helping the crypto bonds build up, considering that “everyone does the PGP dance NOW” is not working well enough.]

                                                                                                            1. 1

                                                                                                              Unfortunately, for example the last method doesn’t work. You can sign anybody’s messages. Doesn’t prove your key is theirs.

                                                                                                              I can publish your comment on my HTTPS protected blog. Doesn’t prove your comment is mine.

                                                                                                              1. 2

                                                                                                                Not sure if this is a joke, but: A) You sign my mail. OP takes this as proof that your key is mine. B) You put your key on my website… wait, no you can’t… I put my key on your webs- uh… you put my key on your website and now I can read your email…

                                                                                                                Ok, those two things don’t match.

                                                                                                      2. 9

                                                                                                        I’d claim I’m familiar with both the PGP ecosystem and TLS/X.509. I disagree with your claim that they’re a similar clusterfuck.

                                                                                                        I’m not saying X.509 is without problems. But TLS/X.509 gets one thing right that PGP doesn’t: it’s mostly transparent to the user; it doesn’t expect the user to understand cryptographic concepts.

                                                                                                        Also the TLS community has improved a lot over the past decade. X.509 is nowhere near the clusterfuck it was in 2010. There are rules in place, there are mitigations for existing issues, there’s real enforcement for persistent violation of rules (ask Symantec). I see an ecosystem that has its issues, but is improving on the one side (TLS/X.509) and an ecosystem that is in denial about its issues and which is not handling security issues very professionally (efail…).

                                                                                                        1. 3

                                                                                                          Very true, but the transparency part is a bit fishy, because TLS included an answer to “how do I get the key” (which nowadays is basically DNS plus timing), while PGP was trying to give people more options.

                                                                                                          I mean, we could do the same for PGP, but whether that fits your security requirements is a question that needs answering… but by whom? TLS says CA/DNS; PGP says “you get to make that decision”.

                                                                                                          Unfortunately the latter also means “your problem” and often “idk/idc” and failed solutions like WoT.

                                                                                                          How could we do the same? We can do some validation in the form of: we send you an email encrypted to what you claim is your public key, at what you claim is your address, and you have to return the decrypted challenge. Seems fairly similar to DNS validation for HTTPS.
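
The challenge flow described above can be sketched in a few lines. This is a toy model only: the OpenPGP encryption step is stubbed out, and all names here (`make_challenge`, `build_validation_mail`, `verify_response`) are hypothetical, not from any real validation service:

```python
import secrets

def make_challenge() -> bytes:
    # The validator generates a random nonce to embed in the mail.
    return secrets.token_bytes(32)

def build_validation_mail(claimed_key: str, challenge: bytes) -> dict:
    # In a real system the challenge would be OpenPGP-encrypted to the
    # claimed public key; here the encryption step is stubbed out.
    return {"to_key": claimed_key, "ciphertext": challenge}

def verify_response(expected: bytes, returned: bytes) -> bool:
    # Only someone holding the matching private key could have decrypted
    # the mail and echoed the nonce back.
    return secrets.compare_digest(expected, returned)
```

As with ACME’s DNS or HTTP challenges, passing this proves control of the mailbox and the private key, not the identity of the person behind them.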

                                                                                                          While we’re at it…. Add some key transparency to it for accountability. Fix the WoT a bit by adding some DOS protection. Remove the old and broken crypto from the standard. And the streaming mode which screws up integrity protection and which is for entirely different use-cases anyway. Oh, and make all the mehish or shittyish tools better.

                                                                                                          That should do nicely.

                                                                                                          Edit: except, of course, as Hanno said: “an ecosystem that is in denial about its issues and which is not handling security issues very professionally”…that gets in the way a lot

                                                                                                          1. 2

                                                                                                            I’d wager this is mostly a user-facing tooling issue, rather than anything else. Would you agree that a more mature tooling ecosystem would make PGP more salvageable for, say, vulnerability disclosure emails instead of a Google web form?

                                                                                                            If anything, I’m more convinced that the failure of PGP lies in trusting GnuPG as the only implementation worthy of blessing. How different would it be if we had funded alternative, industry-backed implementations after EFAIL, in the same way we got many TLS implementations after Heartbleed?

                                                                                                            Similarly, there is a reason why there’s active research on fuzzing TLS implementations for their different behaviors (think frankencerts). Mostly, this is due to the fact that reasoning about X.509 is impossible without reading through stacks and stacks of RFCs, extensions and whatnot.

                                                                                                            1. 0

                                                                                                              I use Thunderbird with Enigmail. I made a key at some point and by now I just send and receive as I normally do. Mails are encrypted when they can be encrypted, and the UI is very clear on this. Mails are always signed. I get a nice green bar over mails I receive that are encrypted.

                                                                                                              I can’t say I agree with your statement that GPG is not transparent to the user, nor that it expects the user to understand cryptographic concepts.

                                                                                                              As for the rules in the TLS/X.509 ecosystem, you should ask Mozilla if there’s real enforcement for Let’s Encrypt.

                                                                                                            2. 4

                                                                                                              The internal complexity of x509 is a bit of a different one than the user-facing complexity of PGP. I don’t need to think about or deal with most of that as an end-user or even programmer.

                                                                                                              With PGP … well… There are about 100 things you can do wrong, starting with “oops, I bricked my terminal because gpg outputs binary data by default”, and it gets worse from there. I wrote a Go email sending library a while ago and wanted to add PGP signing support. Thus far, I have not succeeded in getting the damn thing to actually work. In the meantime, I have managed to get a somewhat complex non-standard ACME/X.509 generation scheme to work.

                                                                                                              1. 3

                                                                                                                There have been a lot of vulns in x509 parsers, though. They are really hard to get right.

                                                                                                                1. 1

                                                                                                                  I’m very far removed from an expert on any of this, so I don’t really have an opinion on the matter as such. All I know is that as a regular programmer and “power user” I usually manage to do whatever I want with X.509 just fine without too much trouble, but that using or implementing PGP is generally hard and frustrating to the point where I just stopped trying.

                                                                                                                2. 1

                                                                                                                  You are thinking of GnuPG. I agree GnuPG is a usability nightmare. I don’t think PGP (RFC 4880) makes many claims about user interaction (in the same way that the many X.509-related RFCs say little about how users deal with tooling).

                                                                                                                3. 1

                                                                                                                  Would you say PGP has a chance to be upgraded? I think there is a growing consensus that PGP’s crypto needs some fixing, and GPG’s implementation as well, but I am no crypto person.

                                                                                                                  1. 2

                                                                                                                    Would you say PGP has a chance to be upgraded?

                                                                                                                    I think there’s space for this, although open source (and standards in general) are also political to some extent. If the community doesn’t want to invest in improving PGP but would rather replace it with $NEXTBIGTHING, then there is very little you can do. There’s also something to be said that 1) it’s easier when communities are more open to change and 2) it’s harder when big names at Google, you-name-it, are constantly bashing it.

                                                                                                                    1. 2

                                                                                                                      Can you clarify where “big names at Cloudflare” are bashing PGP? I’m confused.

                                                                                                                      1. 1

                                                                                                                        Can you clarify where “big names at Cloudflare” are bashing PGP? I’m confused.

                                                                                                                        I actually can’t, I don’t think this was made in any official capacity. I’ll amend my comment, sorry.

                                                                                                                1. 2

                                                                                                                  This proxy technique, which safe-guards the privacy of millions of users every day

                                                                                                                  Far from the only thing protecting privacy there. Some browsers (GNOME Web) actually just download the whole database (which is very fun on a metered mobile connection hehehe), but the mainstream solution is to use the v4 update API which is based on hash prefix matches, kinda similar to pwned passwords.
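
For illustration, the hash-prefix idea behind the v4 update API can be sketched like this (a simplified model only, not the actual Safe Browsing protocol, canonicalization, or wire format):

```python
import hashlib

PREFIX_LEN = 8  # hex characters the client reveals; chosen for the sketch

def sha256_hex(url: str) -> str:
    return hashlib.sha256(url.encode("utf-8")).hexdigest()

def check(url: str, server_db: set, prefix_len: int = PREFIX_LEN) -> bool:
    full = sha256_hex(url)
    prefix = full[:prefix_len]
    # The server sees only the short prefix and returns every full hash
    # sharing it; the client finishes the match locally, so the server
    # never learns exactly which URL was looked up.
    candidates = [h for h in server_db if h.startswith(prefix)]
    return full in candidates
```

The same k-anonymity trick is what the pwned-passwords range API uses.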

                                                                                                                  I wonder if all of this could be replaced with a CRLite style cascade of bloom filters..

                                                                                                                  1. 2

                                                                                                                    This is very cool (and I’m very impressed by the offering of the kits).

                                                                                                                    But isn’t this basically just an external BMC? Can someone point out what the differences are (in general or in this instance)?

                                                                                                                    1. 4

                                                                                                                      In addition to KVM functionality, a “real” BMC typically offers things like power control, thermal monitoring/fan control, virtual boot media (BMC pretends to be a CD-ROM drive and lets you boot the system from an image it loads over the network via CIFS/NFS or the like), etc.

                                                                                                                      I don’t know offhand what the upstream support status currently is or exactly what functionality it offers (probably not the full smorgasbord), but I know people have gotten OpenBMC running on Raspberry Pis; this could potentially be a nice supplement to something like that?

                                                                                                                      1. 2

                                                                                                                        Being a virtual disk drive is easy: https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/usb-device-mode-storage.html

                                                                                                                        Connecting a gpio for reset is also easy (though you don’t really need this if you already have a smart power plug or managed UPS)

                                                                                                                        Thermal/fan would be the most difficult part.

                                                                                                                    1. 0

                                                                                                                      I agree with the feeling, but there’s nothing really new here. Google Analytics is still the most convenient solution, as any alternative means either maintaining your own server or paying yet another monthly fee for a SaaS.

                                                                                                                      I have GA on my site mostly because I want to see the referrals, so that if there’s a post somewhere about my app I can contribute to the discussion. I figure that users who don’t like GA have the option to disable it via various extensions. It’s not opt-in, but at least it’s possible to opt out.

                                                                                                                      1. 11

                                                                                                                        Seeing referrals from server logs is simple, especially with tools like goaccess. GA has many functions which are not covered by simpler tools like goaccess, but this is not one of them.
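
For what it’s worth, pulling referrers out of a stock Nginx/Apache “combined” access log takes only a few lines. A sketch assuming the default combined format, where the referrer is the second-to-last quoted field:

```python
import re
from collections import Counter

# In the "combined" log format the last two quoted fields are
# the referrer and the user agent.
LINE_RE = re.compile(r'"([^"]*)" "[^"]*"$')

def top_referrers(lines):
    counts = Counter()
    for line in lines:
        m = LINE_RE.search(line.strip())
        if m and m.group(1) not in ("-", ""):
            counts[m.group(1)] += 1
    return counts.most_common()
```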

                                                                                                                        1. 3

                                                                                                                          Yes, and installing Plausible and running your own server is simple too, but all this takes time. The day it’s down because your logs are full, or because an update broke the Nginx server, you still need to spend a few hours fixing all this. With GA, you copy some JS and you never have to do anything more. I won’t deny there are problems with GA, but currently no solution is as simple as this.

                                                                                                                          1. 8

                                                                                                                            I think you’re rather overstating the likelihood of Nginx breaking due to an update (especially if you’re running a stable distribution) and how long it would take to fix it in the very rare case that it did. According to this article, log rotation is enabled automatically for Nginx in at least Ubuntu.

                                                                                                                            The way I see it is that if you decide that you require analytics then you can either:

                                                                                                                            1. Make the small amount of effort required to self host
                                                                                                                            2. Pay someone to host it for you
                                                                                                                            3. Be lazy and decide that your users should pay with their data and go with GA

                                                                                                                            If you choose the latter then that’s entirely on you.

                                                                                                                            1. 1

                                                                                                                              Yep, would do the latter

                                                                                                                              1. 2

                                                                                                                                Since you are not motivated by the advantages of not using GA, your objection to Plausible is not very relevant.

                                                                                                                        2. 1

                                                                                                                          It’s not an opt-in but at least it’s possible to opt-out.

                                                                                                                          This is really important, and I really don’t understand how almost all of these new-wave minimal analytics tools we are currently seeing do not even offer an opt-out mechanism of any kind (let alone opt-in), other than (maybe) telling people to install an ad blocker in their documentation. This is something people will need to fix before they can consider themselves a Google Analytics alternative.

                                                                                                                          1. 2

                                                                                                                            Is it really important? Does every analytics tool require its own opt out browser extension when a single, more generic blocker extension can do the same thing, without requiring that you trust the company that made the tracking tool in the first place? Is this really more important in your view than the simple fact that by using GA, you’re giving Google of all companies free access to even more data? How many people do you think know about or use that opt out extension? I certainly didn’t until it was mentioned here, and I’m not going to start using it.

                                                                                                                            1. 2

                                                                                                                              There’s this thing that was designed for opting out of everything, it’s called Do-Not-Track…

                                                                                                                              (But if something just collects statistics about browsers and screen sizes and doesn’t track, is it even appropriate to respect DNT?)
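
Respecting it server-side is about as small as privacy features get; a minimal sketch (`should_track` is a hypothetical helper, not from any framework):

```python
def should_track(headers: dict) -> bool:
    # DNT: 1 means the user has opted out of tracking;
    # anything else (absent, "0") expresses no preference.
    return headers.get("DNT") != "1"
```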

                                                                                                                        1. 17

                                                                                                                          How is that “finally”? AWK has been there since 1977.

                                                                                                                          1. 5

                                                                                                                            The title seems exaggerated for the sake of the wordplay (Plausible is the tool they switched to).

                                                                                                                          1. 2

                                                                                                                            That’s why I’ve always been slightly bothered when people call the return key “enter”.

                                                                                                                            1. 9

                                                                                                                              Windows keyboards tend to have an Enter key but no Return key, Mac keyboards tend to have a Return key but no Enter key. They’re in the same place and at first glance do exactly the same thing so it’s easy to understand why people would say that.

                                                                                                                              1. 4

                                                                                                                                Even portable Macs used to have an enter key, where the right option key is now situated. I really missed it when it disappeared, only to discover fn+enter quite recently.

                                                                                                                                1. 3

                                                                                                                                  Did you mean Fn+Return for Enter?

                                                                                                                                  1. 3

                                                                                                                                    Haha, of course.

                                                                                                                                2. 2

                                                                                                                                  Well, non-Apple keyboards tend to label it “enter”, but it sends the keycode that means “return”…

                                                                                                                                  1. 1

                                                                                                                                    That sounds like a recent development. Almost all non-Apple keyboards I’ve seen have labeled the key with the carriage return symbol, not the text “Enter”.

                                                                                                                                    1. 2

                                                                                                                                      I think the symbol is more common on ISO layout keyboards. The ANSI Thinkpad and Pixelbook I have both say “Enter”.

                                                                                                                                3. 1

                                                                                                                                  Yes, this is one of my favourite pieces of trivia to bring up when I’m in the mood for teasing someone.

                                                                                                                                  I’m also consistent in calling it Return, and most people seem to not even notice. (Unless I want to tease them by correcting them when they say Enter.)

                                                                                                                                1. 16

                                                                                                                                  Most text editors (and even IDEs) have a surprising lack of semantic knowledge. Editing programs as flat text is brittle and prone to error. If we had better, language-aware transforms and querying systems built into our text editors, we’d be able to more easily build interactive tools/macros/linters rather than relying on the batch processing we use these days.

                                                                                                                                  Some cool, language-aware tools that exist today (ish) are:

                                                                                                                                  1. 17

                                                                                                                                    The problem is that plain text is proven to carry information across thousands of years, whereas custom formats rot. I can read a paper from 1965 and understand the Fortran source in it, but it’s next to impossible to read many binary formats from the 90s without custom code.

                                                                                                                                    I think that we need to focus on simplifying the analysis of languages: effect systems and limiting global state should make it easier to analyze the semantics of syntactic structures, and thus make structure and semantics easier to highlight. (I certainly don’t need syntax highlighting when working in Haskell, but it’s hard to go without in C-likes.)

                                                                                                                                    1. 2

                                                                                                                                      Well, ASCII has only been around for a few decades, so I’m not sure it’s been shown to last thousands of years yet ;)

                                                                                                                                      Granted, there haven’t been any graph-like (for program ASTs) or table-like (for other program data) data structures that are as pervasive as ASCII or UTF-8 plain text, and if you want to argue that it makes sense to keep the serialization format plain text so it’s human-readable (like JSON or Graphviz or CSV), that’s fine. It still doesn’t prevent us from storing richer, more semantic information beyond flat symbolic source code.

                                                                                                                                      The problem with source code is that it’s difficult to build a parser for it, and there’s only one representation for the code. For instance, if all source code were stored as an AST in JSON, think of how easy it would be to build custom tools to analyze and transform your code. (This is a terrible idea for other reasons, but it illustrates the point.)
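
Python’s standard library comes close to making this concrete; a small sketch that serializes a parsed module to JSON using only stdlib:

```python
import ast
import json

def ast_to_dict(node):
    # Recursively turn AST nodes into plain dicts/lists that
    # json.dumps can serialize directly.
    if isinstance(node, ast.AST):
        out = {"_type": type(node).__name__}
        for field, value in ast.iter_fields(node):
            out[field] = ast_to_dict(value)
        return out
    if isinstance(node, list):
        return [ast_to_dict(item) for item in node]
    return node  # identifiers, constants, None

tree_json = json.dumps(ast_to_dict(ast.parse("x = 1 + 2")), indent=2)
```

Any tool that speaks JSON can now query or rewrite the program without writing its own parser.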

                                                                                                                                      1. 2

                                                                                                                                        True, I’m using a wider definition of “plain text” than just ASCII.

                                                                                                                                        You’re right about being able to deserialize plain text into more semantically interesting structures, of course. Then, though, you’re tying visualization (or, at least, editing) to a probably-limited set of tools. I think about the XML ecosystem, which fifteen years ago probably seemed unstoppable, a sure bet for further tool development… but these days the only really powerful one of which I’m aware is Oxygen, which is dated and costs $lots for business licenses.

                                                                                                                                        Other problems are possible as well, such as vulnerability to deserialization attacks, like CVE-2017-2779.

                                                                                                                                        Ultimately I think that many things could be helped by plain text structures that allow more sophisticated namespacing and structuring than the usual function/class/const options we get: first class modules, as in OCaml, for example. I think these sorts of things are coming, but it’s a slow process.

                                                                                                                                    2. 10

                                                                                                                                      Editing programs as flat text is brittle and prone to error.

                                                                                                                                      I strongly disagree. I work on a rather large code base and nearly everyone on my team prefers to use vim or emacs. There’s something to be said for walking through a neighbourhood rather than driving through one when you want to buy a house. The vast majority of our time (99%?) is spent reading code or debugging rather than writing code. Every line of code should be thoughtful and we should ALWAYS optimize for readability. Not just the semantics of variables and objects but the design of the whole system.

                                                                                                                                      Languages like java are impossible to write without tool assistance. They’re aggregating large and miserable frameworks where code refers to variables in other files through inheritance and all that other stuff. Just trying to figure out which implementation of foo() an object will call can be difficult or impossible without assistance. That sort of complexity now needs to be internalized in your limited human memory banks as you try to make sense of it all.

                                                                                                                                      1. 2

                                                                                                                                        Oh, I use Vim too – I dislike IDEs for their bloat, and I also prefer languages that are more oriented towards small, compact solutions (and even have an interest in taking it to an extreme with, e.g., APL). If the entire program can be kept in a single file (or, even better, a single page of text), all the better. Spatial compactness is useful for understanding and debugging, and less code means fewer bugs.

                                                                                                                                        My original point still stands though. Having better tools doesn’t mean code quality has to suffer. The fact of the matter is that we end up having larger codebases that require more complicated code transforms or linting checks. At minimum, having a syntax-aware way of doing variable or function renaming in a text editor is superior to blindly running sed over an arbitrarily[1] line-oriented character array. Even from a programmer’s perspective, I’m not convinced a purely symbolic representation of code is always superior. It’s certainly a compact and information-dense way of viewing small pieces of code, but it quickly becomes overwhelming when coming to grips with larger systems. Plus, there’s only so much info you can cram into one screenful of code.
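
To make the contrast with sed concrete, here is a sketch of a syntax-aware rename using Python’s stdlib `ast` module (Python 3.9+ for `ast.unparse`). It only touches genuine identifier nodes, so occurrences inside string literals survive:

```python
import ast

class Rename(ast.NodeTransformer):
    def __init__(self, old: str, new: str):
        self.old, self.new = old, new

    def visit_Name(self, node: ast.Name) -> ast.Name:
        # Only real identifiers reach this visitor; text inside string
        # literals or comments never does.
        if node.id == self.old:
            node.id = self.new
        return node

src = 'count = 1\nprint("count =", count)\n'
new_src = ast.unparse(Rename("count", "total").visit(ast.parse(src)))
# The identifier is renamed, but the "count =" string literal is untouched.
```

A blind `sed s/count/total/g` would have mangled the string literal too.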

                                                                                                                                        I think, ideally, we’d have multiple ways of viewing the same code depending on the context we’re working in. For instance, when trying to jump into a new codebase to add a feature, data flow is more important than directly understanding the specific implementations of any function. It would be useful to be able to take a function, and view it in the context of a block diagram to see how it fits into the rest of the system and all code paths that lead to it. In another situation, you may want to view it from a documentation perspective that allows you to semantically tie documentation, proofs, formulas, or diagrams directly into the code, even to specific expressions (kind of like docstrings, but more structured and format rich). Or in a situation where you’re working with a protocol, rather than having an implicit finite state machine that’s only viewable from a code point of view (with a switch statement or through functions that are tail called), you could flip into a graphical view of the FSM or a tabular view of the state transitions.

                                                                                                                                        Some of the things I’ve mentioned above are somewhat possible today with external tools, but the problem is they each construct their own AST and semantic knowledge of the source (sometimes incorrectly). There’s no communication between the tools, no referential integrity (if you update the source, do you have to rebuild an index for each tool from scratch?). A standardized, semantic storage format for code would help to address some of these issues.

                                                                                                                                        [1]: I say arbitrarily here because sometimes the line-oriented nature of sed or grep conflicts with the true expression-oriented structure of the code. For instance, if a function signature is split across two lines, trying to search for all return types with grep -E '\w+ \w+\(.*\) {' wouldn’t work. Besides, most syntax structures are recursive, which regexes are inherently incapable of parsing.

                                                                                                                                        1. 4

                                                                                                                                          It would be useful to be able to take a function, and view it in the context of a block diagram to see how it fits into the rest of the system and all code paths that lead to it.

                                                                                                                                          I think it would be more useful if a function needed to consider less and less of the rest of the system. Otherwise you have a poor contract and high coupling. I think code and architecture need to blend together, and if you need a tool to make sense of it all then you’ve failed.

                                                                                                                                          This is a perfect example of nightmarish code for me. There are about 200 methods, and the inheritance chain is maybe 6 or 7 levels deep. It’s barely possible to manage even with an IDE and a WPF textbook sitting on your desk. https://docs.microsoft.com/en-us/dotnet/api/system.windows.shapes.rectangle?view=netcore-3.1

                                                                                                                                      2. 8

                                                                                                                                        I strongly agree with you. We’ve been hamstrung by primitive editors for decades. This fixation on text cripples other tools like version control as well - semantic diffs would be an obvious improvement, but they’re rarely available. (The usual counterarguments about the universality and accessibility of text don’t stack up for me.)

                                                                                                                                        1. 2

                                                                                                                                          The insistence on using plain text for canonical storage, API interface, and user interface is IMO the thing most holding us back (some other top contenders being the pursuit of “performance” and compilation-as-DRM).

                                                                                                                                          1. 10

                                                                                                                                            Looking at the current web, I would have to disagree with the idea that the pursuit of performance is holding anything or anyone back…

                                                                                                                                            1. 1

                                                                                                                                              If you’d seen all the node-gyp build failures I had, you might think differently. But I’m thinking more about stack busting and buffer overruns at runtime and hobbled tooling at devtime in this case.

                                                                                                                                              1. 2

                                                                                                                                                Native modules and the whole node-gyp system is horrible, but I don’t think that’s due to pursuing performance? Most of the time, packages with native code seem to just have taken the easiest path by creating node bindings for an existing library, and I don’t think node-gyp itself is bad due to a pursuit of performance…

                                                                                                                                                AFAIK, though this could be wrong, the main reason for node’s horrible native code support is that people just use the V8 C++ API directly, and Google is institutionally incapable of writing stable interfaces which other people can depend on. They constantly rename methods, rename or remove classes, move header files around, even deprecate functionality before replacements exist. Even that isn’t just due to a pursuit of performance though, but due to a fear of tech debt and a lack of care for anyone outside of Google.

                                                                                                                                        2. 2

                                                                                                                                          Comby is definitely a huge upgrade from writing regexps. There’s also Retrie for Haskell and Coccinelle + coccigrep for C. I’d really love to see a semantic search/replace/patch tool for Rust…

                                                                                                                                        1. 4

                                                                                                                                          Now, can [IP mobility] work for the whole Internet?

                                                                                                                                          Yes. Apple has been using MPTCP in production since iOS 7 was released. QUIC supports mobility.

                                                                                                                                          1. 1

                                                                                                                                            Confusingly, my ThinkPad laptop keyboard has only an Enter key (which isn’t L-shaped and doesn’t have a return arrow printed on it), but xev says it sends Return, and it really does seem to act as a Return key in every way I can tell from Linux. I’ve been typing for so long that I don’t think I’ve ever bothered to look at the label on it before, and now I find it highly annoying!

                                                                                                                                            1. 1

                                                                                                                                              which isn’t L-shaped

                                                                                                                                              The (IMO awful) L-shape is found on ISO layouts, while ANSI has the (good) single row key.

                                                                                                                                              1. 1

                                                                                                                                                The (IMO awful) L-shape is found on ISO layouts, while ANSI has the (good) single row key.

                                                                                                                                                Each to their own - I’ve learnt to always hit Return somewhere around its centre, and I usually miss it when typing on a keyboard with an ANSI layout. There is no awful or good here, my friend - just a matter of what one is used to.

                                                                                                                                            1. 5

                                                                                                                                              if I may lose my server and lose some important email

                                                                                                                                              This is the biggest problem I have with most of the “host your own {x}” advice. Yes, I have to do maintenance; yes, things may break. I can probably even deal with spam. I have run my own email a few times - though only as a secondary server.

                                                                                                                                              Because hosting it myself means I need to have a looooong-term backup and recovery strategy. Unless Google goes bust, I’m pretty certain I’ll be able to read an email from 2004 or 2005 or whenever it was I switched to Gmail.

                                                                                                                                              And some of my blog attempts I can’t even find on the Internet Archive, let alone mails from that server. Or photos. Or whatever. I don’t even know what will happen in 10 or 20 years.

                                                                                                                                              I’m curious how people deal with that issue. Okay, keeping a newsletter from 2004 is probably my hoarding-impulse problem, and the inability to go back and clean it up now is just making it worse. Probably the same with the 100GB of photos I have (again, they need a cleanup, and only 30% are everyday smartphone snaps).

                                                                                                                                              What’s your strategy? What are your ultra long term backup and recovery plans?

                                                                                                                                              What will you do if you give up on computers in 10 years?

                                                                                                                                              1. 9

                                                                                                                                                What’s your strategy? What are your ultra long term backup and recovery plans?

                                                                                                                                                ZFS. Mirroring. A pair of 4TB drives is not prohibitively expensive in this age.

                                                                                                                                                Periodic snapshots.

                                                                                                                                                One offsite backup, in case of earthquakes, fires, etc. There are lots of ways to do this: it could be AWS Glacier or similar, or a third drive hosted at a workplace (if allowed) or at a friend’s or relative’s house. In the latter case, zfs send/recv.
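For reference, that workflow comes down to roughly the following stock ZFS commands. Pool, device, host, and snapshot names are all placeholders, and these need root plus real disks, so treat it as a sketch rather than something to paste:

```sh
# create a mirrored pool from two drives (hypothetical device names)
zpool create tank mirror /dev/sda /dev/sdb

# periodic snapshots, e.g. from cron
zfs snapshot -r tank@$(date +%F)

# initial offsite copy: full replication stream to a machine offsite
zfs send -R tank@2020-06-01 | ssh friend zfs recv -F backup/tank

# later runs only send the increments between two snapshots
zfs send -RI tank@2020-06-01 tank@2020-07-01 | ssh friend zfs recv backup/tank
```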

                                                                                                                                                What will you do if you give up on computers in 10 years?

                                                                                                                                                My drives will keep in a closet if I decide to run off and live in the woods for a few years. Google data will not, if you stop paying the Google bill.

                                                                                                                                                1. 2

                                                                                                                                                  A pair of 4tb drives is not prohibitively expensive in this age.

                                                                                                                                                  As a reminder, SMR is still a problem, and even more so with a ZFS setup.

                                                                                                                                                  It’s not possible to just buy any pair of 4TB drives; extra effort is needed to avoid SMR models.

                                                                                                                                                2. 6

                                                                                                                                                  What’s your strategy? What are your ultra long term backup and recovery plans?

                                                                                                                                                  Tarsnap

                                                                                                                                                  1. 4

                                                                                                                                                    And an upgrade plan. While setting everything up is fun as you learn some things, upgrading software and hardware quickly becomes a chore. That’s why I avoid owning any servers as much as I can.

                                                                                                                                                    1. 1

                                                                                                                                                      Oh yes, I totally forgot to mention maintenance and upgrades. These days, things like that are a commodity.

                                                                                                                                                    2. 4

                                                                                                                                                      Just keeping up with the maintenance is too much hassle for me to host anything on my own if I think it’s somewhat important. Imagine going on vacation for two weeks without a notebook and having to fix your mail server because it went down for whatever reason.

                                                                                                                                                      1. 3

                                                                                                                                                        I’ve embraced the impermanence of everything. I delete most mail I get. Not archive: trash, and it gets auto-cleaned from there.

                                                                                                                                                        1. 2

                                                                                                                                                          I use isync/mbsync. My personal email archive dating back to 2001 seems to be about 3.4GB, so I just download every mail I’ve ever received to all my devices. That’s mirroring taken care of. PCs and laptops need to be backed up anyway, so that’s backups taken care of. This strategy will work if your mail archive is 0.3, 30 or 300GB.
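For anyone curious, a minimal mbsync channel looks roughly like this. Host, user, and paths are placeholders, and note that recent isync releases spell the endpoints Far/Near (older ones used Master/Slave):

```
IMAPAccount personal
Host imap.example.com
User me@example.com
PassCmd "pass show mail/personal"
SSLType IMAPS

IMAPStore personal-remote
Account personal

MaildirStore personal-local
Path ~/mail/
Inbox ~/mail/INBOX
SubFolders Verbatim

Channel personal
Far :personal-remote:
Near :personal-local:
Patterns *
Create Near
SyncState *
```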

                                                                                                                                                          I’m pretty certain I’ll be able to read an email from 2004 or 2005 or whenever it was I’ve switched to gmail.

                                                                                                                                                          Mail is probably safe because the storage costs are negligible. But I wonder how long Google will allow people to store photos and video on their servers for free.