Threads for DeciusCaeciliusMetellus

    1. 5

      Still waiting for that POWER9 based laptop with classic 7-row ThinkPad keyboard …

      1. 3

        it’s 90W - it’s going to be one of those incredibly thick gamer laptops at best

        1. 3

          What prevents them from making one with 2 cores and SMT4 in a 25W envelope? :)

          1. 6

            The market for it. (Also, the 90W figure is for the 4-core part, which pretty much only exists as chaff that RCS uses and IBM otherwise wouldn’t. POWER9 is designed for 4-8 thread clumps. IBM sells single-core/thread models, but those use firmware DRM to be restricted.)

              1. 1

                Servers like the S812 Mini. The AIX configuration has more cores, but the i configuration is limited to a single one.

                1. 2

                  Oh, yes. IBM i definitely plays by different rules.

          2. 3

            The limiting factor for the 4-core parts isn’t the cores themselves, but all the peripherals like the PCIe host bridges, the core interconnect, the on-chip accelerators, the MMU, etc.

    2. 6

      So what does all this mean? In practice it seems the only architectures one runs across are x86-64 and ARM. Is anyone here using Openpower? What is your use case?

      1. 24

        I’m typing this (and wrote up the post) on a Talos II. It’s my daily driver computer. I wanted something I could trust, and I didn’t want to feed the x86 monoculture, and I have more money than sense.

        It’s great. I like it a lot. It just works. (Fedora 30.)

        1. 6

            Can you share some impressions of power consumption and noise? I’m not asking for specifics (unless you measured it already and have the data handy), but I live in a small single-bedroom flat and the only place I could put it is the living room, so it can’t be too noisy. Power also plays a big role, because if it is like the dual G5 it will add a ton to my usage.

          1. 12

            This dual-4 T2 pulls around 170W. The earliest firmware was deafening, but the current firmware is pretty much silent. It’s much less noisy than the Quad G5 sitting next to it, even when the G5 is throttled down.

            That said, in your situation given that it’s a big EATX hulk, you’d probably be happier with a Blackbird. It’s smaller (mATX, though I’d strongly advise a standard ATX case for it) and cheaper. My notes on my own Blackbird are here ( ) but the TL;DR is budget for a single-8 and a GPU and you’ll be very happy with it.

        2. 2

          Why are these things so expensive? Nearly $3k for a single 4-core CPU + motherboard? How does it compare in speed to other commercially available processors?

          1. 6

            Economies of scale. AMD and Intel are shipping ~a million times more units.

            1. 0

              Yep, and I can get a Raspberry Pi for $35. $3k is absurd regardless of scale. Also, IBM seems to be behind a lot of this? It’s not like it’s being produced out of someone’s basement (which still wouldn’t justify $3k). Sounds like people ripping you off. :)

              1. 15

                Running an obscure architecture is always going to cost (much) more than a mainstream one (which is why I don’t, personally).

                “Yep, and I can get a Raspberry Pi for $35”

                ARM chips have much better economies of scale even than x86.

                “It’s not like it’s being produced out of someone’s basement”

                Setup costs dominate in a chip fabrication run. If you only sell 10k units, you have to sell them for quite a bit more to cover those costs.

                1. 5

                  Why is everyone here talking about the chips? The POWER9 CPUs are relatively reasonable, $400-500 for the 4-core is similar to Ryzen 7 1800X launch price. It’s the Raptor mainboards that are extremely expensive.

                  1. 3

                    They aren’t extremely expensive. They’re cheaper than the low-volume RISC workstations from SGI and Sun that came before them. They were quoting me five digits for good workstations. Anyone wanting many CPUs would pay six to seven. What people are missing is that the Non-Recurring Engineering [1] [2] expenses are huge, must be recovered at a profit, and are divided over the number of units sold. The units sold are many times fewer than those of Intel, AMD, and ARM. These boards are also probably more complex, with more QA, than a Pi or something.

                    So, they’ll cost more unless many times more people buy them allowing scale up with lower per-unit price to recover NRE costs. If they don’t and everything gets DRM’d/backdoored, then everyone who didn’t buy the non-DRM’d/backdoored systems voted for that with their wallet to get a lower, per-unit price in the past. Maybe they’re cool with that, too. Just their choice. Meanwhile, higher-priced products at low volume are acceptable to some buyers trying to send the market a different signal: give us more-inspectable, high-performance products and we’ll reward you with higher profit. That’s Raptor’s market.
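
                    A rough sketch of that NRE arithmetic, in Python. Every number here is made up for illustration (these are not Raptor’s, IBM’s, or anyone’s real costs); the point is only that a fixed cost divided over few units dominates the price:

```python
def break_even_price(nre_cost, unit_cost, units_sold):
    """Per-unit price needed to recover NRE plus the marginal cost of each unit."""
    if units_sold <= 0:
        raise ValueError("units_sold must be positive")
    return unit_cost + nre_cost / units_sold

# Hypothetical figures, chosen only to show the shape of the curve.
NRE = 5_000_000   # one-off engineering cost (assumed)
UNIT_COST = 400   # parts and assembly per board (assumed)

low_volume = break_even_price(NRE, UNIT_COST, 10_000)        # 900.0
high_volume = break_even_price(NRE, UNIT_COST, 10_000_000)   # 400.5
```

                    With these assumed inputs, the NRE share is $500 per board at 10k units and 50 cents at 10M units. Profit margin is left out of the sketch.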



                  2. 1

                    Ah, that’s good to know! Thanks for clarifying :)

                    I was mostly asking about the CPU + motherboard combo, and I didn’t see that you could buy them individually. Are there other motherboards that also work w/ it?

                    1. 1

                      Only Raptor sells boards standalone; the others are part of very expensive servers, e.g. from IBM itself.

                2. 1

                  Sure, this all makes sense as a producer - but seems like a lot to ask of consumers.

                  I guess it’s working, though. 🤷

              2. 1

                That Pi probably doesn’t come close to a POWER in performance, esp single-threaded. Intel and AMD are only real comparisons.

                1. 5

                  Yeah, POWER9 is a big hot chip, while the Pi can run without a heatsink. But the latest Pi, upgraded to Cortex-A72, is pretty much “ultrabook grade” performance. Totally desktopable :) I’m writing this from a quad A72 system in fact (with 8GB RAM though, and a big AMD GPU).

                  The Pi has a software advantage: e.g. Firefox has a full IonMonkey JIT for aarch64 enabled out of the box. For POWER, there’s only a WIP unofficial baseline JIT port by /u/classichasclass. My ARM system might even beat the POWER9 in some JavaScript benchmarks right now :)

                  1. 1

                    I know they’re impressive cuz I did some web browsing on my Pi 3 at the house. They’re just not a POWER9. The difference comes from full-custom design that costs a fortune. The POWERs will have higher per-unit prices due to the much lower volume vs x86s.

                    1. 2

                      The performance of the Pi4 is supposed to be 4x that of the Pi3 at the same price point, so the Pi3 isn’t really a reasonable basis for comparison.

                2. 1

                  Sure, so that’s where my original question gets to. What does this compare to? Is it intended to compete w/ Xeons? Intel extreme CPUs? Which ones? I was just asking for a means of comparison, which seems hard to find.

                  1. 2

                    Xeons and EPYCs, as far as I can tell. High-end performance, esp multi-threaded. IBM’s material. A questionable attempt at a benchmark. Throw in the side benefit that almost all malware targets x86. That will continue to be true so long as POWER-based desktops remain niche.

                    Since it’s RISC, you can also get better performance on some security mitigations due to the fact that x86 optimizes for specific usage. For example, implementing a reverse stack where data flows away from the stack pointer might require more indirection on the x86 stack-based design than on POWER. There was also work in OpenBSD on reducing ROP gadgets or something that got way more done on ARM than x86 for similar reasons. Could be true for POWER, too.

                    I’m also wondering about acceleration possibilities from modifying microcode (i.e. custom opcodes) if it’s as open as they claim. Karger et al modified VAX’s microcode to both speed up and boost security of their VMM’s fast path. One team long ago had a HLL-to-microcode compiler, too. I figure there might still be NDAs involved in that, though.

                    1. 2

                      “work in OpenBSD on reducing ROP gadgets or something that got way more done on ARM than x86 for similar reasons”

                      Yeah, because x86 instructions are arbitrary length, polymorphic gadgets are a thing (jumping into the middle of an instruction so that everything from there on is interpreted as unintended instructions). Any ISA that’s not ridiculous-length doesn’t have this “feature” :)
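
                      A tiny illustration of that effect, as a Python sketch. It is a toy decoder that only knows two real x86 opcodes (B8 = mov eax, imm32, five bytes long; C3 = ret, one byte long); the function name and the stop-at-ret behaviour are my own simplifications, not anything from a real disassembler:

```python
def decode(code, offset):
    """Decode from `offset` until a ret or the end of the buffer.

    Only two real x86 opcodes are modelled; everything else is reported
    as unmodelled. This is a toy, not a disassembler.
    """
    out = []
    i = offset
    while i < len(code):
        op = code[i]
        if op == 0xB8 and i + 5 <= len(code):
            imm = int.from_bytes(code[i + 1:i + 5], "little")
            out.append(f"mov eax, {imm:#x}")
            i += 5
        elif op == 0xC3:
            out.append("ret")
            break  # a gadget ends at its ret
        else:
            out.append(f"(byte {op:#04x} not modelled)")
            i += 1
    return out

code = bytes([0xB8, 0xC3, 0x00, 0x00, 0x00])
intended = decode(code, 0)    # ['mov eax, 0xc3']
unintended = decode(code, 1)  # ['ret']
```

                      The same five bytes are one harmless mov when entered at offset 0 and a ret when entered at offset 1. With fixed-width, aligned instructions, offset 1 would not be a valid entry point at all.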

            1. 3

              Ah! That’s a bit more reasonable. Still, my original question was more about what other kinds of CPUs these should be compared against. In terms of performance/watt, are they competing w/ the Xeon line? or like i7 extremes? Is that the right way to compare them or does it give someone an unfair advantage? That’s more what I was curious about.

      2. 14

        I use it as my desktop computer as well (Ubuntu 19.04), and everything basically just works for me as well. For me, there are two main reasons for using it. One is that all the firmware is open source (Apache licence) w/o any tivoisation (this includes the firmware that is somewhat akin to the Intel Management Engine or AMD’s PSP), so it is truly “owner-controlled” and free. The other thing that makes it very appealing to me is the Power ISA itself, which doesn’t have a lot of the insanity you find in x86. As an added benefit, they have a very well documented and interesting microarch with Power9.

    3. 67

      There are a few key factors I think that come into play. I should note that I only have direct experience with the US market (although I know many people who do or have worked in other parts of the world).

      The first thing I think is to point out that and other sites that focus on salaries at FAANG/tier 1 companies greatly distort the perception of the market. Most developers don’t make anything near the total comp that you see at those companies. Even folks with comparable base salaries aren’t making nearly the amount in equity, even out on the west coast.

      Those salaries are also often for jobs that are clustered in the most expensive parts of the country. For folks living outside of the Bay Area, salaries are much lower in general. The discrepancy in cost of living is astounding.

      Equity is another big part of it. Equity isn’t normal money. Even at large public companies where you can easily sell your shares, the accounting is different. You have to wait for those shares to vest, you can only sell them at certain times, and you don’t know the exact value you’re getting when you accept a job with an equity component. From the company’s perspective issuing equity doesn’t have the same cost as issuing cash, and most people will never actually get all of their equity in reality (they will leave with some equity unvested, for example).

      Even with all of that, you might wonder why salaries are so high. Supply and demand is a big part of that. If you look at the trend of companies investing in boot camps and other training programs, it’s obvious that they see increasing the supply of developers as a strategic way of lowering salaries. In particular I’ve noticed that the salaries for some very common sorts of web development have plummeted over the last 5 years as hordes of bootcampers who’ve been trained in those very specific skills (but intentionally not given the skills to be mobile in the industry or get better jobs) have entered the market and started turning that part of the development field more blue collar.

      Related to supply and demand, I think one strategy that has been inflating developers’ salaries is that some large companies are hiring everyone they can to starve the market of talent. A lot of developers are paid a lot of money just to prevent them from starting up a competing company, or going to work for a competitor. The laws in California against non-competes have probably helped this some, but even if you can prevent your employees from doing exactly the same work they have been doing, there’s a lot of strategic value in depriving your competitors of talent.

      The last factor I think is the immaturity of the industry. As time goes on and the industry matures, we are going to see salaries for developers depressed and more compensation going to investors and executives (this has already happened with startup equity. Nobody gets rich from being an early employee anymore because VCs have sucked the marrow from that bone already). I personally think a well organized union for software developers to represent our interests is the only way to stop the complete commoditization of our jobs over the next decade, but for it to be effective we’d have to do it now, and the strong undercurrent of right-leaning and libertarian-leaning culture in tech is likely to prevent that kind of organizing before it’s far too late to be effective.

      1. 14

        If you want accurate statistics about pay for different jobs in the US, the Bureau of Labor Statistics provides very good ones: I think it would be a good idea to use these numbers rather than the ones from

        1. 2

          I’d never heard of this website before, so I suppose this post was good advertisement for it. I’m especially intrigued by the $2.5M salaries at Microsoft. I agree with your statement. A good place for actual numbers is also the H1B database, say

        2. 1

          Very true, those numbers are more representative of reality, BUT even in my state/metro, I know many developers who are way above the median for the area. I am myself. Of course that’s why it’s a ‘median’, but on-the-ground pay can be substantially higher than those numbers indicate.

      2. 12

        “I think one strategy that has been inflating developers’ salaries is that some large companies are hiring everyone they can to starve the market of talent.”

        This is a hard claim to swallow. It requires that very few companies have so much market share that they can overpay developers, and make it up by monopolizing their niches.

        Maybe this is true of specialized niches, perhaps subfields of machine learning or search, where there are three or fewer companies operating at massive scale. But for “generic” web/application/systems/distributed systems programmers[0], you have at least Apple, Amazon, Microsoft, Google, Facebook, and “everybody else” competing for talent. Even if you very generously suppose that “everyone else” is only as big as one of those five, that gives you six players.

        So with 6+ players, I think that’s too many for a stable overpaying strategy to be going on–someone would start cutting salaries and would save lots of money.

        [0] I’m “generic”, so don’t take that as a criticism.

        1. 4

          They don’t want all the talent as in every developer. She might have meant all the best hires, such as graduates of top universities. They’re flooding toward FAANG for the big bucks and prestige.

          1. 1

            I was talking about the top of the market, hence my references to GAMFA.

            They’re a significant minority of overall hiring, but probably a much larger chunk of the very high salaries.

      3. 7

        Overall I think that’s a good summary, but it’s missing possibly the most important aspect: the revenue per employee of the top companies is unprecedented. It’s worth paying someone $500k/yr when they’re making you $1m/yr.

        “I personally think a well organized union for software developers to represent our interests is the only way to stop the complete commoditization of our jobs over the next decade”

        I’m not sure how unionization would help. Unions I have been involved with (mostly federal government) are as, or more, guilty of treating people as replaceable cogs in the machine than “management”. In their drive to push for equal treatment, they often end up pushing the fallacy of equal ability. I’ve had discussions with union reps who were literally saying “an employee at pay level X should be able to do the job of any other employee at pay level X”, completely disregarding specialist fields, let alone actual skill.

        What will stop the (complete) commoditization of software dev is quality. I’m yet to see “commodity dev shops” (including most big name consultancies, TBH) deliver products that are fit for use or maintainable.

        1. 16

          Profit per employee isn’t unimportant, but I see it more as something that enables the other factors, rather than a cause in and of itself. Absent the other factors that are driving salaries up, I think more companies would prefer to have a larger profit margin rather than pass that money on to developers. It’s also very hard to know how much profit actual developers are generating in a large company. Even when the story is ostensibly clear (I made code change X that brought our AWS spend down from 100k/month to 50k/month) the complexities of the business make direct attribution fuzzy at best.

          Regarding unions: I think that people tend to look at the most degenerate cases of union behavior and use that to explain why we don’t need one, without considering what a union that was organized specifically for tech workers could bring. Groups like the Screen Actors Guild might be a better starting place than something like a teachers’ union, because it has to scale across a much broader range of talent and demand. Personally I’d love to be part of collective bargaining for better terms for equity offered by early stage startups (like a longer exercise window), or an organization that helps create guidelines so that “unlimited vacation” doesn’t effectively become “no vacation”. Tech workers are doing well right now and I don’t want to make it sound like things are terrible, but the best time to organize and collectively bargain is when you have leverage.

          1. 3

            “I think more companies would prefer to have a larger profit margin rather than pass that money on to developers.”

            Of course, that’s where competition comes into it. But fundamentally, without high revenue per employee, none of the other factors come into play.

            “I think that people tend to look at the most degenerate cases of union behavior and use that to explain why we don’t need one”

            I’m open to the idea that there are possibly good, useful unions. They’re just not something I’ve seen in the real world. The main union I have experience with represents tens of thousands of skilled workers across many disciplines and still struggles to be, in my estimation, a net positive.

            The collectivist nature of unions means that they often act too much in the interest of the collective, regardless of what that means for individuals. That can translate into being more concerned with maintaining their power base than necessarily improving member outcomes (though you’d hope those two things would be at least partly aligned).

          2. 4

            “It’s also very hard to know how much profit actual developers are generating in a large company.”

            And irrelevant. Companies don’t pay what people are worth, they pay the minimum they believe they can while keeping the person on staff and productive. It is all about the competitive marketplace and individual skill (both in the trade and in negotiation).

            Right now, it is easy to leave a company for a large salary boost, so to retain people you have to pay them enough to make taking the risk of leaving not worth it – that is all.

            1. 1

              “It is all about the competitive marketplace and individual skill (both in the trade and in negotiation).”

              It is for the low-level, production workers. The executives get routinely overpaid with all kinds of special deals and protections. I think either they should be forced to compete in a race to the bottom like the rest of us (free market) or we get protectionism and/or profit sharing, too.

          3. 1

            “It’s also very hard to know how much profit actual developers are generating in a large company.”

            Divide the total profit by the number of employees.

            This isn’t a perfect solution, but it gets you pretty close.

      4. 4

        Thanks for the detailed answer. I am not that knowledgeable about the job market being a junior CS student myself, but the point about bootcamps intentionally not teaching the skills to be mobile and get better jobs caught my attention. I am aware that learning ‘basic web development’ is perhaps the fastest way into the job market, so most bootcamps focus on that, but what are those other skills you mentioned that bootcamps should be teaching but aren’t?

        1. 4

          “but what are those other skills you mentioned that bootcamps should be teaching but aren’t?”

          Mostly: math, algebra, algorithms, data structures, complexity theory, analysis, logic, set theory, consultancy skills, scientific methods and everything that allows you to build better software and frameworks than there are currently out there. You need some hardcore coding skills and formal skills to build something better than what’s already out there, but it’s totally doable with the right education, because most of the industry is just selling “hot air”. Basically: You should be able to read and understand the four volumes of “The Art of Computer Programming” without too much trouble. If you can do that, you’re there.

          That being said: It’s more about you picking your educators carefully, than it is about getting a degree. You’d still need that degree, but where it comes from is way more important.

      5. 2

        I can’t upvote this enough! Great explanation.

      6. 1

        “In particular I’ve noticed that the salaries for some very common sorts of web development have plummeted over the last 5 years as hordes of bootcampers who’ve been trained in those very specific skills (but intentionally not given the skills to be mobile in the industry or get better jobs) have entered the market and started turning that part of the development field more blue collar.”

        One consideration with web development is that the amount of money you can make scales with development speed much more than total cost-per-project/website made.

        Small/local business websites generally go for somewhere in the range of $1k to $3k, but the difference in development speed to get those sites live is massive between developers. There are developers in India et al. who will take 150 hours to complete a semi-custom WordPress site (which would be around $10 per hour) and there are developers who can build the same site at the same level of quality in 15 hours (which would be $100 per hour).
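
        To spell out the arithmetic behind those two hourly figures, here is a minimal Python sketch. The $1,500 project price is my assumption (it sits inside the $1k to $3k range and reproduces the $10 and $100 numbers); the function name is just for illustration:

```python
def effective_hourly_rate(project_price, hours_spent):
    """Fixed-price work: the hourly rate is whatever the price works out to per hour."""
    if hours_spent <= 0:
        raise ValueError("hours_spent must be positive")
    return project_price / hours_spent

PROJECT_PRICE = 1500  # assumed fixed price for the site, in dollars

slow = effective_hourly_rate(PROJECT_PRICE, 150)  # 10.0
fast = effective_hourly_rate(PROJECT_PRICE, 15)   # 100.0
```

        Same price, same site, a 10x difference in what an hour of work earns.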

        Also, accessibility to CMS tools, plugins, and so on is another reason why that type of work has become more blue-collar-y. You can make very solid small-business type websites with little-to-no programming experience in Current Year.

        This is from a contract-based/freelancer point-of-view. It’s possible that the scenario is different in the corporate/employee-basis web development world.

    4. 1

      Is there much of a case for VLIW over SIMT/SIMD? (SIMT is the model most/all modern GPUs use, which is basically SIMD, but with conditionals that mask parts of the register/instruction, rather than the entire instruction)

      My basic thinking is that if you have SIMD, conditional masking, and swizzling, you’re going to be able to express the same things VLIW can in a lot less instruction width. And SIMT is data-dependent, which is going to be more useful than the index-dependent instructions of VLIW.

      Basically, I don’t see the case for having ~32 different instructions executing in lockstep, rather than 32 copies of one (conditional) instruction. It seems like it’s optimizing for something rare. But maybe my creativity has been damaged by putting problems into a shape that GPUs enjoy
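
      For anyone who hasn’t seen the masking idea, a toy Python model of it follows. This is only a sketch: real hardware runs each side of the branch with the inactive lanes masked off, while this version simply evaluates both sides for every lane and uses the mask to pick a result. The function name and the example branch are made up:

```python
def simt_abs_or_double(lanes):
    """Model of: y = -x if x < 0 else 2 * x, with one instruction stream for all lanes."""
    mask = [x < 0 for x in lanes]        # per-lane predicate, computed from the data
    then_side = [-x for x in lanes]      # every lane evaluates the 'then' side
    else_side = [2 * x for x in lanes]   # every lane evaluates the 'else' side
    # The mask decides, lane by lane, which result is kept.
    return [t if m else e for m, t, e in zip(mask, then_side, else_side)]

result = simt_abs_or_double([-1, 2, -3, 4])  # [1, 4, 3, 8]
```

      All lanes share one instruction at a time; only the mask differs per lane, which is the “data-dependent” part.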

      1. 2

        It is more a question between VLIW and superscalar out-of-order architectures (and not between SIMT and VLIW), and there the latter ones clearly win. On a fundamental level, they are faster because they have more information at runtime than the compiler has at compile time.

    5. 5

      “With VLIW, you would very likely have to recompile your program for every single unique CPU”

      Fun fact, GPUs were VLIW (e.g. TeraScale), this wasn’t a problem for them at all (you recompile shaders on load anyway), and yet the current GPUs (GCN) are all more “RISC” like (not VLIW). VLIW is kind of a failure.

      1. 1

        My question is… why is VLIW a failure? Yes, the compiler technology wasn’t up to the challenge in 2000, but in my opinion, it got there by 2010. Is the technical advantage not real, or is it just another case of path dependency?

        1. 4

          The biggest problem with VLIW is that, to use it effectively, you need large numbers of independent instructions inside a basic block. As it turns out, though, most code has a lot of branches and interdependent code, which forces the compiler to emit large quantities of NOPs. Also, for tight loops you usually don’t need VLIW; vectorization is enough.
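
          A small Python sketch of the NOP problem. It is a deliberately naive greedy packer, not what a real VLIW compiler does; the bundle width of 4, the instruction names, and the rule that a result is only usable in the next bundle are all assumptions for illustration:

```python
def pack_bundles(deps, order, width):
    """Greedily pack instructions into fixed-width bundles.

    deps maps each instruction to the set of instructions it depends on.
    order must list dependencies before their users. An instruction can
    only go in a bundle after the bundles holding all its dependencies.
    Returns (bundles, number_of_nop_slots).
    """
    bundle_of = {}
    bundles = []
    for ins in order:
        slot = max((bundle_of[d] + 1 for d in deps[ins]), default=0)
        while slot < len(bundles) and len(bundles[slot]) >= width:
            slot += 1  # bundle is full, try the next one
        while len(bundles) <= slot:
            bundles.append([])
        bundles[slot].append(ins)
        bundle_of[ins] = slot
    nops = sum(width - len(b) for b in bundles)
    return bundles, nops

order = ["a", "b", "c", "d"]
independent = {"a": set(), "b": set(), "c": set(), "d": set()}
chain = {"a": set(), "b": {"a"}, "c": {"b"}, "d": {"c"}}

packed_ok, nops_ok = pack_bundles(independent, order, 4)    # 1 bundle, 0 NOPs
packed_bad, nops_bad = pack_bundles(chain, order, 4)        # 4 bundles, 12 NOPs
```

          Four independent instructions fill one bundle; a four-long dependency chain needs four bundles with three empty slots each.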

          1. 1

            And that is the one place where the people who complain about Algol-like languages may have a point. I have never seen any work on e.g. executing APL-like languages on a VLIW machine, but it seems like it should be interesting. My suspicion, however, is that this is just fundamental to how digital computers do arithmetic.

      2. 1

        Oh? Where can I read about why they went RISC-like?

    6. 14

      Hot take: RISC/CISC as design philosophies are both fairly obsolete in 2019. RISC was revolutionary at the time because it designed the language for compilers to write more than humans, at a time when lots of assembly was still written by humans. They had to make fewer compromises and could make faster hardware as a result.

      Now essentially all software comes out of a compiler with a sophisticated optimizer, and the tradeoffs hardware designers are interested in are vastly different from those of the 1980s. RISC and CISC principles only matter as far as they affect those tradeoffs: memory bandwidth/latency, cache coherence, prediction/speculative execution, etc. (I am not a hardware guy, but I do like making compilers.)

      Edit: actually, despite the slightly ranty start and finish, this article is great. It talks about basically all this and more.

      1. 10

        One interesting problem is that software comes out of a compiler whose model of the machine is essentially a PDP-11.

        1. 6

          That RISC vs CISC distinction is about the compiler backend rather than the language itself. On the PDP-11, *p++ was translated to just one instruction, because the PDP-11 designers anticipated that people would do it often and included instructions for that. For the same reason, in CISC architectures most if not all instructions support all operand combinations: register-register, register-memory, and memory-memory. People hate doing the load/store cycle by hand.

          RISC is explicitly about not caring for people who write it by hand. For a compiler backend author, it’s trivial to add load and store—once. It’s perfectly RISC’y to break down a thing a human would want to be a single instruction like *p++ into individual parts and let compiler writers decide which one to use. It was never about raw instruction count.

          1. 3

            Funny thing, *p++ = reg is a single instruction on ARM, I don’t think there is a way to do that in one instruction on x86.

            1. 1

              Interesting. I haven’t looked deep into ARM, somehow. I need to get to finally reading its docs, though if you know of a good introduction, please point me to it.

              1. 1

                Sorry, I don’t know a good entry point, I started reading the ISA some time ago but never finished it. It is reasonably readable, though (personally, I think the most readable is by far ppc64, followed by ARM with x86 being the worst)

                I just looked up the instruction though: the store is str xT, [xN], #imm, and the matching load is ldr xT, [xN], #imm (ARM instructions are sooo unreadable :( )

                1. 2

                  x86 has the benefit of third-party tutorials, even though they tend to be a) outdated b) worse than outdated c) plain wrong d) all of the above. My personal favorite is the GFDL-licensed “Programming From the Ground Up”, which is in the first category; ironic, because it was completed just before x86_64 became widely available.

                  An ARM64 fork would be a perfect raspberrypi companion.

                  1. 3

                    I’d use the book with caution; for example, all the floating point stuff is completely out of date (no one uses x87 anymore, everyone just uses SSE with half-initialized registers).

              2. 1

                The ARM System Developer’s Guide is dated (ARM v6) but I found it helpful. If you know another assembly language then I would just skim an instruction set manual.

              3. 1

                The official A64 ISA guide is quite nice

            2. 1

              x86 string store instructions (STOSB, STOSW, and STOSD) let you do this for the DI or EDI register.

              1. 6

                True. Use of these is discouraged, though. On Zen, they are implemented in microcode.

                Also, I found this gem in their documentation:

                At the assembly-code level, two forms of the instruction are allowed: the “explicit-operands” form and the “no-operands” form. The explicit-operands form (specified with the STOS mnemonic) allows the destination operand to be specified explicitly. Here, the destination operand should be a symbol that indicates the size and location of the destination value. The source operand is then automatically selected to match the size of the destination operand (the AL register for byte operands, AX for word operands, EAX for doubleword operands). The explicit-operands form is provided to allow documentation; however, note that the documentation provided by this form can be misleading. That is, the destination operand symbol must specify the correct type (size) of the operand (byte, word, or doubleword), but it does not have to specify the correct location. The location is always specified by the ES:(E)DI register. These must be loaded correctly before the store string instruction is executed.

                x86 really is the pinnacle of human insanity

          2. 1

            Load/store by hand is a lot easier when you have a decent number of registers and do not have weird rules about which registers can be accessed from which instructions.

          3. 1

            I dunno… I wrote a lot of VAX assembly, which was nice but I always had to have the reference nearby because I couldn’t remember all the instructions. I also wrote a lot of ARM4 assembly, and it was small enough that I could remember the whole thing without looking stuff up. So I really preferred the ARM instruction set — of course nowadays it seems to have bloated up to VAX size, but back then the “reduced” nature of it seemed like a good thing even for humans.

            (Edit) As an aside, I really wish I could find a copy of the bitblt routine I wrote in VAX assembly using those crazy bitfield instructions. I worked on it for months. Then DEC came out with a faster one and I had to disassemble it to figure out how on earth they did it — turned out they were generating specialized inner loop code on the stack and jumping to it, which blew my mind.

        2. 1

          I know that is a common story, but it’s not correct. Compilers now can optimize for out of order execution, large numbers of registers (a key RISC insight although maybe one people should revisit), big caches and cache line size, giant address spaces, simd, … - all sorts of things that were not supported by or key to PDP11s. C itself was working on multiple architectures very early in the history of the language.

    7. 10

      The traditional difference between RISC and CISC is memory operands for every instruction instead of just loads and stores (as you will notice, x86 doesn’t have a load instruction, it uses the move-register instruction with a memory operand for that). Another defining feature of CISC architectures is the extensive use of microcode for complicated instructions, such as x86 sine and cosine instructions (FJCVTZS is not a complicated instruction, what it does is quite simple though it does look like a cat walked over the keyboard). Use of instructions encoded in microcode is strongly discouraged, because it is usually slower than the code written by hand. I don’t see how the rest of the article (cache access times etc.) is in any way relevant to the discussion of how well RISC scales.

      1. 2

        What’s the point of having complex instructions (like sin, cos) encoded in microcode, if it’s slower?

        I always thought the only reason many of the heavyweight x86 instructions existed was because they were faster than the fastest hand-optimized assembler code doing the same thing. It certainly is true of the AES instruction set.

        1. 4

          What’s the point of having complex instructions (like sin, cos) encoded in microcode, if it’s slower?

          Because it was once faster.

        2. 3

          The only one of the AES instructions encoded in microcode is AESKEYGENASSIST, all the other ones aren’t (split into 2 uops at most) on Broadwell processors. On AMD Zens, all of them are in hardware.

        3. 1

          What’s the point of having complex instructions (like sin, cos) encoded in microcode, if it’s slower?

          Otherwise they need to be in hardware, taking up silicon.

          1. 1

            Sorry, I meant: why have these additional instructions at all, if they’re slower?

            1. 3

              Otherwise they need to be in hardware, taking up silicon.

              Because they are part of the ISA, and it needs to remain backwards compatible. They didn’t add the instructions for SSE and AVX, though.

              1. 1

                It’s easy and safe to break backwards compatibility for x86 extensions, because every program is supposed to check whether the CPU it’s running on supports the instruction before executing it. The CPUID instruction tells you which ISA extensions are supported.

                For example, if you run CPUID with EAX set to 1, it sets bit 30 of ECX to indicate RdRand support and bit 0 of EDX to show x87 support. The floating point trigonometric functions are part of x87. Intel and AMD could easily drop support for x87, by setting the CPUID flag for x87 to 0. The only programs that would break would be badly-behaving programs that don’t check before using x86 extensions’ instructions.
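                A minimal sketch of that check, assuming GCC or Clang and their `<cpuid.h>` helper `__get_cpuid`. The `CpuFeatures` struct and the function names are made up for illustration. On non-x86 targets it simply reports nothing as supported.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  // GCC/Clang helper header (toolchain assumption)
#endif

// Returns true when bit `index` (0-based) of `reg` is set.
inline bool bit_is_set(std::uint32_t reg, unsigned index) {
    assert(index < 32);
    return ((reg >> index) & 1u) != 0;
}

// Hypothetical result type for the two flags mentioned above.
struct CpuFeatures {
    bool x87;     // CPUID leaf 1, EDX bit 0
    bool rdrand;  // CPUID leaf 1, ECX bit 30
};

// Queries CPUID leaf 1; both flags stay false if the query is unavailable.
inline CpuFeatures query_cpu_features() {
    CpuFeatures features{false, false};
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        features.x87 = bit_is_set(edx, 0);
        features.rdrand = bit_is_set(ecx, 30);
    }
#endif
    return features;
}
```

                A well-behaved program would call `query_cpu_features()` once at startup and pick a code path from the result instead of executing the instruction unconditionally.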

                1. 7

                  The only programs that would break would be badly-behaving programs

                  So, pretty much all of them? Programs don’t check for features that 99% of their userbase has. For example, on macOS you unconditionally use everything up to SSE3.1, because there are no usable Macs without them.

                  When programs stop working, or become much slower due to traps and emulation, users aren’t any happier knowing whose fault it was.

                2. 1

                  Are you an LLVM developer?

      2. 2

        Use of instructions encoded in microcode is strongly discouraged, because it is usually slower than the code written by hand.

        I don’t think this is correct. Microcode is a very limited resource on the CPU, so why would anyone waste space encoding instructions that could be implemented elsewhere if they were slower?

        1. 4

          Because the instruction set has to be backward compatible. They decided that floating point sine and cosine were a great idea when introducing the x87, but they soon found out it wasn’t. There’s a suspicious lack of these instructions in SSE and AVX. Still, these instructions waste a lot of space in the microcode ROM.

    8. 3

      It’s unfortunate that most of these posts clump C with C++. Yes, it does reference Modern C++ Won’t Save Us. The question I would love answered is, does modern C++ solve 80% of the problems? Because 80% is probably good enough IMO if solving for 99% distracts us from other important problems.

      1. 11

        The question I would love answered is, does modern C++ solve 80% of the problems?

        The reason this isn’t really answered is because the answer is a very unsatisfying, “yes, kinda, sometimes, with caveats.”

        The issues highlighted in Modern C++ Won’t Save Us are not straw men; they are real. The std::span issue is one I’ve actually hit, and the issues highlighted with std::optional are likewise very believable. They can and will bite you.

        On the other hand, there is nothing keeping you from defining a drop-in replacement for e.g. std::optional that simply doesn’t define operator* and operator->, which would suddenly and magically not be prone to those issues. As Modern C++ Won’t Save Us itself notes, Mozilla has done something along these lines with std::span, too, preventing it from the use-after-free issue that the official standard allows. These structures behave the way they do because they’re trying to be drop-in replacements for bare pointers in the 90% case, but they’re doing it at the cost of safety. If you’re doing a greenfield C++ project, you can instead opt for safe variants that aren’t drop-in replacements, but that avoid use-after-free, buffer overruns, and the like. But those are, again, not the versions specified to live in std::.
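        To make the std::optional point concrete, here is a rough sketch assuming C++17. `CheckedOptional` is a hypothetical name, not anything in std:: or in Mozilla's code. It wraps std::optional and simply leaves out the unchecked accessors, so the only way to read the value is one that fails loudly when empty.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical wrapper: exposes only the checked accessors of std::optional.
template <typename T>
class CheckedOptional {
public:
    CheckedOptional() = default;
    CheckedOptional(T value) : inner_(std::move(value)) {}

    bool has_value() const { return inner_.has_value(); }

    // Throws std::bad_optional_access when empty instead of invoking UB.
    const T& value() const { return inner_.value(); }

    // Returns the contained value, or `fallback` when empty.
    T value_or(T fallback) const { return inner_.value_or(std::move(fallback)); }

    // Deliberately no operator* and no operator->.

private:
    std::optional<T> inner_;
};
```

        The cost is that code written against bare pointers or std::optional no longer compiles unchanged, which is the drop-in trade-off described above.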

        And that’s why the answer is so unsatisfying: std::move, rvalue references, unique_ptr, and so on give you the foundation for C++ to be…well, certainly not Rust, but a lot closer to Rust than to C. But the standard library, due to a mixture of politics and a strong desire for backwards compatibility with existing codebases, tends to opt for ergonomics over security.

        1. -1

          I think you hit the nail on the head, C++ is ergonomic. I guess I don’t like the idea that Rust would get in the way of me expressing my ideas (even if they are bad). Something about that is offensive to me. But of course, that isn’t a rational argument.

          Golang, on one hand, is like speaking like a 3-year-old, and Rust is speaking the language in 1984. C++, on the other hand, is kind of poetic. I think that people forget software can be art and self-expression, just as much as it can be functional.

          1. 10

            I guess I don’t like the idea that Rust would get in the way of me expressing my ideas (even if they are bad). Something about that is offensive to me.

            Isn’t it more offensive to tell users that you are putting them at greater risk of security vulnerabilities because you don’t like to be prevented from expressing yourself?

            1. 3

              That’s an original take on it.

          2. 6

            It doesn’t get in the way of expressing your ideas. It gets in the way of you expressing them in a way where it can’t prove they’re safe. A way where the ideas might not actually work in production. That’s a distinction I think is worthwhile.

          3. 5

            I think we agree strongly that Rust constrains what you can say. Where we have different tastes is that I like that. To me, it’s the kind of constraint that sparks artistic creativity, and by reasoning about my system so that I can essentially prove that its memory access patterns are safe, I think I get a better result.

            But I understand how a different programmer, or in different circumstances, would value the freedom to write pretty much any code they like.

          4. 3

            I don’t like the idea that Rust would get in the way of me expressing my ideas

            Every language allows you to express yourself in a different way; a Javascript programmer might say the same of C++. There is poetry in the breadth of concepts expressible (and inexpressible!) in every language.

            I started out with Rust by adding .clone() to everything that made it complain about borrowing, artfully(?) skirting around the parts that seem to annoy everyone else until I was ready. Sure, it might have made it run a bit slower, but I knew my first few (er, several) programs would be trash anyway while I got to grips with the language. I recommend it if you’re curious but reticent about trying it out.

            – The Rust Evangelion Strike Force

          5. 3

            That is true, you have to do things “Rust way” rather than your way. People do react with offense to “no, you can’t just modify that!”

            However, I found Rust gave me a vocabulary and building blocks for common patterns, which in C I’d “freestyle” instead. Overall this seems more robust and readable, because other Rust users instantly recognize what I’m trying to do, instead of second-guessing ownership of pointers, thread-safety, and meaning of magic booleans I’d use to fudge edge cases.

      2. 5

        Tarsnap is written in C. I think it’s ultra unfortunate that C has gotten a bad rap due to the undisciplined people who use it.

        C and C++ are tools to create abstractions. They leave many ways to burn yourself. But they also represent closely how machines actually work. (This is more true of C than C++, but C is a subset of C++, so the power is still there.)

        This is an important quality often lost in “better” programming languages. It’s why most software is so slow, even when we have more computing power than our ancestors could ever dream of.

        I fucking love C and C++, and I’m saddened to see it become a target for hatred. People have even started saying that if you actively choose C or C++, you are an irresponsible programmer. Try writing a native node module in a language other than C++ and see how far you get.

        1. 25

          Tarsnap is written in C. I think it’s ultra unfortunate that C has gotten a bad rap due to the undisciplined people who use it.

          I think the hate for C and C++ is misplaced; I agree. But I also really dislike phrasing the issue the way you have, because it strongly implies that bugs in C code are purely due to undisciplined programmers.

          The thing is, C hasn’t gotten a bad rap because undisciplined people use it. It’s gotten a bad rap because disciplined people who use it still fuck up—a lot!

          Is it possible to write safe C? Sure! The techniques involved are a bit arcane, and probably not applicable to general programming, but sure. For example, dsvpn never calls malloc. That’s definitely a lot safer than normal C.

          But that’s not the default, and not doing it that way doesn’t make you undisciplined. A normal C program is gonna have to call malloc or mmap at some point. A normal C program is gonna have to pass around pointers with at least some generic/needs-casting members at some point. And as soon as you get into those areas, C, both the language and the ecosystem, make you one misplaced thought away from a vulnerability.

          This is an important quality often lost in “better” programming languages. It’s why most software is so slow, even when we have more computing power than our ancestors could ever dream of.

          You’re flirting around a legitimate issue here, which is that some languages that are safer (e.g. Python, Java, Go) are arguably intrinsically slower because they have garbage collection/force a higher level of abstraction away from the hardware. But languages like Zig, Rust, and (to be honest) bygones like Turbo Pascal and Ada prove that you don’t need to be slower to be safer, either in compilation or runtime. You need stricter guarantees than C offers, but you don’t need to slow down the developer in any other capacity.

          No, people shouldn’t hate on C and C++. But I also don’t think they’re wrong to try very hard to avoid C and C++ if they can. I think you are correct that a problem until comparatively recently has been that giving up C and C++, in practice, meant going to C#, Java, or something else that was much higher on the abstraction scale than you needed if your goal were merely to be a safer C. But I also think that there are enough new technologies either already here or around the corner that it’s worth admitting where C genuinely is weak, and looking to those technologies for help.

          1. 2

            You need stricter guarantees than C offers, but you don’t need to slow down the developer in any other capacity.

            Proven in a couple of studies with this one (pdf) being the best. I’d love to see a new one using Rust or D.

          2. 1

            I also really dislike phrasing the issue the way you have, because it strongly implies that bugs in C code are purely due to undisciplined programmers.

            You dislike the truth, then. If you don’t know how to free memory when you’re done with it and then not touch that freed memory, you should not be shipping C++ to production.

            You namedrop Rust. Note that you can’t borrow subsets of arrays. Will you admit that safety comes at a cost? That bounds checking is a cost, and you won’t ever achieve the performance you otherwise could have, if you have these checks?

            Note that Rust’s compiler is so slow that it’s become a trope. Any mention of this will piss off the Rust Task Force into coming out of the woodwork with how they’ve been doing work on their compiler and “Just wait, you’ll see!” Yet it’s slow. And if you’re disciplined with your C++, then instead of spending a year learning Rust, you may as well just write your program in C++. It worked for Bitcoin.

            It worked for Tarsnap, Emacs, Chrome, Windows, and a litany of software programs that have come before us.

            I also call to your attention the fact that real world hacks rarely occur thanks to corrupted memory. The most common vector (by far!) to breach your corporation is via spearphishing your email. If you analyze the amount of times that a memory corruption actually matters and actually causes real-world disasters, you’ll be forced to conclude that a crash just isn’t that significant.

            Most people shy away from these ideas because it offends them. It offended you, by me saying “Most programmers suck.” But you know what? It’s true.

            I’ll leave off with an essay on the benefits of fast software.

            1. 12

              It worked for […] Chrome

              It’s difficult for me to reconcile this with the perspective of a senior member of the Chrome security team:

              Chrome, Chrome OS, Linux, Android — same problem, same scale.

              Here’s some of the Fish in a Barrel analysis of Chrome security advisories:

              1. 1

                This implies an alternative could have been used successfully.

                Even today, would anyone dare write a browser in anything but C++? Even Rust is a gamble, because it implies you can recruit a team sufficiently proficient in Rust.

                Admittedly, Rust is a solid alternative now. But most companies won’t make the switch for a long time.

                Fun exercise: Criticize Swift for being written in C++. Also V8.

                C++ is still the de facto for interoperability, too. If you want to write a library most software can use, you write it in C or C++.

                1. 13

                  C++ is still the de facto for interoperability, too.

                  C is the de facto for interoperability. C++ is about as bad as Rust, and for the same reason: you can’t use generic types without compiling a specialized version of the template-d code.
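                  A small sketch of what that looks like in practice; the function names are made up for illustration. The template has no symbol a foreign caller can link against until it is instantiated, so a library ends up exporting one plain C entry point per concrete type.

```cpp
#include <cassert>
#include <cstddef>

// Generic code: exists only as a template until instantiated for some T.
template <typename T>
T sum_array(const T* data, std::size_t count) {
    assert(data != nullptr || count == 0);
    T total = T();
    for (std::size_t i = 0; i < count; ++i) total += data[i];
    return total;
}

// C-callable entry points: one specialized, unmangled symbol per type.
extern "C" int sum_array_i32(const int* data, std::size_t count) {
    return sum_array<int>(data, count);
}

extern "C" double sum_array_f64(const double* data, std::size_t count) {
    return sum_array<double>(data, count);
}
```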

                2. 8

                  You’re shifting the goalposts here. “No practical alternative to C++” is altogether unrelated to “C and C++ are perfectly safe in disciplined programmers’ hands” which you claimed above.

                  And no, empirically they are not safe, a few outlier examples notwithstanding (and other examples like Chrome and Windows don’t really support your claim). It’s also illogical to suggest that just because there are a handful of developers in the world who managed to avoid all the safety issues in their code (maybe), C and C++ are perfectly fine for wide industry use by all programmers, who, in your own view, aren’t disciplined enough. Can’t you see that it doesn’t follow? I can never understand why people keep making this claim.

                  1. -1

                    Also known as “having a conversation.”

                    But, sure, let’s return to the original claim:

                    C and C++ are perfectly safe in disciplined programmers’ hands

                    Yes, I claim this with no hubris, as someone who has been writing C++ off and on for well over a decade.

                    I’m prepared to defend that claim with my upcoming project, SweetieKit (NodeJS for iOS). I think overall it’s quite safe, and that if you manage to crash while using it, it’s because you’ve used the Objective-C API in a way that would normally crash. For example, pick apart the ARKit bindings:

                    I don’t think SweetieKit could have been made in any other language, partly because binding to V8 is difficult from any other language.

                    I do not claim at the present time that there are no bugs in SweetieKit (nor will I ever). But I do claim that I know where most of them probably are, and that there are few unexpected behaviors.

                    Experience matters. Discipline matters. Following patterns, matters. Complaining that C++ is inherently unsafe is like claiming that MMA fighters will inherently lose: the claim makes no sense, first of all, and it’s not true. You follow patterns while fighting. And you follow patterns while programming. Technique matters!

                    1. 5

                      Perhaps you are one of the few sufficiently disciplined programmers! But I really can’t agree with your last paragraph when, for example, Microsoft says this:

                      the root cause of approximately 70% of security vulnerabilities that Microsoft fixes and assigns a CVE (Common Vulnerabilities and Exposures) are due to memory safety issues. This is despite mitigations including intense code review, training, static analysis, and more.

                      I think you have a point about the impact of these types of issues compared to social engineering and other attack vectors, but I’m not quite sure that it’s sufficient justification if there are practical alternatives which mostly remove this class of vulnerabilities.

                      1. 1

                        For what it’s worth, I agree with you.

                        But only because programmers in large groups can’t be trusted to write C++ safely in an environment where safety matters. The game industry is still mostly C++ powered.

                        1. 3

                          I agree with that. I’ll add that games:

                          (a) Have lots of software problems that even the players find and use in-game.

                          (b) Sometimes have memory-related exploits that have been used to attack the platforms.

                          (c) Dodge lots of issues languages such as Rust address with the fact that you can use memory pools for a lot of stuff. I’ll also add that’s supported by Ada, too.
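                          For readers who haven't used one, a memory pool in its simplest form is a bump arena. This is a minimal sketch with a hypothetical `Arena` class, not code from any particular engine. Everything allocated from it is released at once by `reset()`, so there are no individual frees to get wrong.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-capacity bump arena over caller-provided storage.
class Arena {
public:
    Arena(unsigned char* buffer, std::size_t capacity)
        : base_(buffer), capacity_(capacity), used_(0) {}

    // Returns storage aligned to `align`, or nullptr if the arena is exhausted.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);  // power of two
        const std::uintptr_t start =
            reinterpret_cast<std::uintptr_t>(base_) + used_;
        const std::uintptr_t aligned =
            (start + (align - 1)) & ~(static_cast<std::uintptr_t>(align) - 1);
        const std::size_t padding = static_cast<std::size_t>(aligned - start);
        if (padding > capacity_ - used_) return nullptr;
        if (size > capacity_ - used_ - padding) return nullptr;
        used_ += padding + size;
        return reinterpret_cast<void*>(aligned);
    }

    // Releases every allocation at once, e.g. at the end of a frame or level.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    unsigned char* base_;
    std::size_t capacity_;
    std::size_t used_;
};
```

                          Pointers into the arena still dangle after `reset()`, so this narrows the problem to one well-defined point in the program without removing it.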

                          Preventable, buggy behavior in games developed by big companies continues to annoy me. That’s a sampling bias that works in my favor. If what I’ve read is correct, they’re better and harder-working programmers than average in C++ but still have these problems alternatives are immune to.

                          1. 3

                            That, and games encourage an environment of ignoring security or reliability in favour of getting the product out the door, and then no long-term maintenance. If it weren’t for the consoles, they wouldn’t even have security on their radar.

                            1. 1

                              Yeah. On top of it, the consoles showed how quickly the private market could’ve cranked out hardware-level security for our PC’s and servers… if they cared. Also, what the lower end of the per-unit price might be.

                3. 6

                  Would anyone dare write a browser in anything but C++?

                  That’s preeeetty much the whole reason Mozilla made Rust. It now powers a decent chunk of Firefox, esp the performance-sensitive parts like, say, the rendering engine.

            2. 5

              If you don’t know how to free memory when you’re done with it and then not touch that freed memory, you should not be shipping C++ to production.

              Did you ever botch up memory management? If you say “no” I am going to assume you haven’t ever used C nor C++.

              1. 5

                There’s a difference between learning and shipping to production.

                Personally, I definitely can’t get memory management right by myself and I’m pretty suspicious of people who claim they can, but people can and do write C++ that only has a few bugs.

                1. 5

                  There’s always one or other edge case that makes it slip into prod even with the experts. A toolchain on a legacy project that has no sanitizer flags you are used to. An integrated third party library with ambiguous lifecycle description. A tired dev on the end of a long stint. Etc etc.

                  Tooling helps, but anytime you have an allocation bug caught in that safety net means you screwed up on your part.

                2. 5

                  Amen. I thought I was pretty good a while back having maintained a desktop app for years (C++/MFC (yeah, I know)). Then I got on a team that handles a large enterprise product - not exclusively C++ but quite a bit. There are a couple of C++ guys on the team that absolutely run circles around me. Extremely good. Probably 10x good. It was (and has been) a learning experience. However, every once in a while we will still encounter a memory issue. It turns out that nobody’s perfect, manual memory management is hard (but not impossible), and sometimes things slip through. Tooling helps tremendously - static analyzers, smarter compilers, and better language features are great if you have access to them.

                  I remember reading an interview somewhere in which Bjarne Stroustrup was asked where he felt he was on a scale of 1-10 as a C++ programmer. His response, iirc, was that he was a solid 7. This from the guy who wrote the language (granted, standardization has long since taken over.) His answer was in reference to the language as a whole rather than memory management in particular, but I think it says an awful lot about both.

                  1. 4

                    My first job included a 6 month stint tracking down a single race condition in a distributed database. Taught me quite a bit about just how hard it is to get memory safety right.

                    1. 2

                      You’re probably the kind of person that might be open-minded to the idea that investing some upfront work into TLA+ might save time later. Might have saved you six months.

                      1. 2

                        The employer might not have needed me in 2007 if the original author had used TLA+ (in 1995, when they first built it).

                        1. 2

                          Yeah, that could’ve happened. That’s why I said you. As in, we’re better off if we learn the secret weapons ourselves, go to employers who don’t have them, and show off delivering better results. Then, leverage that to level up in career. Optionally, teach them how we did it. Depends on the employer and how they’ll react to it.

                          1. 2

                            This particular codebase was ~600k lines of delphi, ~100k of which was shared between 6 threads (each with their own purpose). 100% manually synchronized (or not) with no abstraction more powerful than mutexes and network sockets.

                            It took years to converge on ‘only crashes occasionally’, and has never been able to run on a hyperthreaded CPU.

                            1. 1

                              Although Delphi is nice, it has no tooling for this that I’m aware of. Finding the bugs might have to be a manual job. Whereas, Java has a whole sub-field dedicated to producing tools to detect this. They look for interleavings, often running things in different orders to see what happens.

                              “~100k of which was shared between 6 threads (each with their own purpose).”

                              That particularly is the kind of thing that might have gotten me attempting to write my own race detector or translator to save time. It wouldn’t surprise me if the next set of problems took similarly long to deal with.

              2. 1

                Oh yes. That’s how you become an expert.

                You quickly learn to stick to patterns, and not deviate one millimeter from those patterns. And then your software works.

                I vividly remember when I became disillusioned with shared_ptr: I put my faith into magic to solve my problems rather than understanding deeply what the program was doing. And our performance analysis showed that >10% of the runtime was being spent solely incrementing and decrementing shared pointers. That was 10% we’d never get back, in a game engine where performance can make or break the company.
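                A tiny illustration of where that refcount traffic comes from (the function names are made up). Passing a shared_ptr by value copies it, which bumps the count on entry and drops it on exit, typically with atomic operations. Passing by const reference touches nothing.

```cpp
#include <cassert>
#include <memory>

// By value: the parameter is a copy, so the count is one higher inside.
inline long count_by_value(std::shared_ptr<int> p) {
    assert(p != nullptr);
    return p.use_count();
}

// By const reference: no copy, so no increment or decrement happens.
inline long count_by_ref(const std::shared_ptr<int>& p) {
    assert(p != nullptr);
    return p.use_count();
}
```

                In a hot loop the by-value version pays for two refcount updates per call, which is the kind of overhead that only shows up in a profile.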

                1. 2

                  Ok, I take it you withheld shipping code into prod until you reached that level of expertise? I’m almost there after 20+ years and feel like a cheat now ;)

            3. 4

              It’s funny you mention slow compiles given your alternative is C++: the language that had the most people complaining about compile times before Rust.

              Far as other comment, the C++ alternative for a browser should be fairly stable. Rust and Ada are safer. D compiles faster for quicker iterations. All can turn off the safety features or overheads on a selective basis where needed. So, yeah, I’d consider starting a browser without C++.

              The other problem with C++ for security is that it’s really hard to analyze with few tools compared to just C. There still isn’t even a formal semantics for it because the language itself is ridiculously complicated. Unnecessarily so given more powerful languages, PreScheme and Scheme48, had verified implementations. It’s just bad design far as security is concerned.

        2. 6

          Comparing something as large as an OS to a project like Tarsnap seems awfully simplistic. C has a bad rap because of undisciplined developers, sure, but also because manual memory management can be hard. The more substantial the code base, the more difficult it can get.

        3. 6

          Tarsnap is written in C

          I want a rule in any conversation about C or C++ that nobody defending what can be done in those languages by most people uses an example from brilliant, security-focused folks such as Percival or DJB. Had I lacked empirical data, I’d not know whether it was just their brilliance or the language contributing to the results they get. Most programmers, even smart ones, won’t achieve what they achieved in terms of secure coding if given the same amount of time and similar incentives. Most won’t anyway given the incentives behind most commercial and FOSS software that work against security.

          Long story short, what those people do doesn’t tell us anything about C/C++ because they’re the kind of people that might get results with assembly, Fortran, INTERCAL, or whatever. It’s a sampling bias that shows an upper bound rather than what to expect in general.

          1. 8

            Right. As soon as you restrict the domain to software written by teams, over a period of time, then it’s game over. Statistically you’re going to get a steady stream of CVE’s, and you can do things to control the rate (like using sanitizers) but qualitatively there’s really nothing you can do about it.

        4. 4

          My frustration with C is that it makes lots of things difficult and dangerous that really don’t need to be. Ignoring Rust as a comparison, there are still lots of things that could be improved quite easily.

          1. 4

            That’s pretty much Zig. C with the low-hanging fruit picked.

          2. 0

            Now we’re talking! What kind of things could be improved easily?

            I like conversations like this because it highlights areas where C’s designers could have come up with something just a bit better.

            1. 5

              A significant amount of the undefined behavior in C and C++ is from integer operations. For example, int x = -1; int y = x << 1; is UB. (I bet if you did a poll of C and C++ programmers, a majority would say y == -2). There have been proposals (Regehr’s Friendly C, some more recent stuff in WG21) but so far they haven’t gotten much traction.
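              The usual workaround, sketched here under the assumption of a two's-complement target, is to do the shift on the unsigned type, where it is well defined for counts below the width, and convert back. `shl_wrapping` is a made-up name.

```cpp
#include <cassert>
#include <cstdint>

// Left shift with wraparound semantics for 32-bit signed values.
inline std::int32_t shl_wrapping(std::int32_t x, unsigned count) {
    assert(count < 32);  // shifting by >= the width is UB even when unsigned
    const std::uint32_t bits = static_cast<std::uint32_t>(x) << count;
    // The conversion back is implementation-defined before C++20, but every
    // mainstream two's-complement compiler yields the expected value.
    return static_cast<std::int32_t>(bits);
}
```

              With this, `shl_wrapping(-1, 1)` gives the -2 that most people expect from `x << 1`.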

              1. 5

                I tweeted this as a poll. As of the time I posted the answer, 42% said -2, 16% correctly said it was UB, another 16% said implementation defined, and 26% picked “different in C and C++.” Digging a little further, I’m happy to see this is fixed in the C++20 draft, which has it as -2.

              2. 1

                Agreed; int operations are one area I find hard to defend. The best I could come up with is that int64_t should have been the default datatype. This wouldn’t solve all the problems, but it would greatly reduce the surface.

        5. 4

          I wonder about how well C maps to machine semantics. Consider some examples; for each, how does C expose the underlying machine’s abilities? How would we do this in portable C? I would humbly suggest that C simply doesn’t include these. Which CPU are you thinking of when you make your claim?

          • x86 supports several extensions for SIMD logic, including SIMD registers. This grants speed; performance-focused applications have been including pages of dedicated x86 assembly and intrinsics for decades.
          • amd64 supports “no-execute” permissions per-page. This is a powerful security feature that helps nullify C’s inherent weaknesses.
          • Modern ARM supports embedded “thumb” ISAs which trade functionality for size improvements. This is an essential feature of ARM which has left fingerprints on video game consoles, phones, and other space-constrained devices.
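          To illustrate the SIMD bullet, here is a sketch of what "dedicated intrinsics" means in practice, assuming a compiler that defines `__SSE__` and ships `<xmmintrin.h>`. The function name is made up. Note that the vector path is vendor-specific and has to sit behind a preprocessor guard next to a scalar fallback, since nothing in the portable language expresses it.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__SSE__)
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_add_ps, ...
#endif

// Adds two float arrays element-wise, four lanes at a time where SSE exists.
inline void add_floats(const float* a, const float* b, float* out,
                       std::size_t n) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    std::size_t i = 0;
#if defined(__SSE__)
    for (; i + 4 <= n; i += 4) {
        const __m128 va = _mm_loadu_ps(a + i);  // unaligned 4-float load
        const __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    // Scalar tail, and the whole loop on targets without SSE.
    for (; i < n; ++i) out[i] = a[i] + b[i];
}
```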

          Why is software slow? This is a sincere and deep question, and it’s not just about the choice of language. For example, we can write an unacceptably-slow algorithm in any (Turing-equivalent) language, so speed isn’t inherently about choice of language.

          I remember how I learned to hate C; I wrote a GPU driver. When I see statements like yours, highly tribal, of forms like, “try writing [a native-code object with C linkage and libc interoperability] in a language other than C[++],” I wonder why you’ve given so much of your identity over to a programming language. I understand your pragmatic points about speed, legacy, interoperability, and yet I worry that you don’t understand our pragmatic point about memory safety.

          1. 4

            It was designed specifically for the advantages and limitations of the PDP-11 on top of its authors’ personal preferences. It’s been a long time since there was a PDP-11. So, the abstract machine doesn’t map to current hardware. Here’s a presentation on its history that describes how many of the “design” decisions came to be. It borrowed a lot from Martin Richards’ BCPL which wasn’t designed at all: just what compiled on even worse hardware.

            1. 1

              I keep hearing this trope, but coming from the world of EE, I’m readily convinced it is false. C never was designed to give full access to the hardware.

              1. 1

                The K&R book repeatedly describes it as using data types and low-level ops that reflect the computer capabilities of the time. Close to the machine. Then, it allows full control of memory with pointers. Then, it has an asm keyword to directly program in assembly language. It also was first used to write a program, UNIX, that had full access to and manipulated hardware.

                So, it looks like that claim is false. It was designed to do two things:

                1. Provide an abstract machine close to hardware to increase productivity (over assembly), maintain efficiency, and keep the compiler easy to implement.

                2. Where needed, provide full control over hardware with a mix of language and runtime features.

                1. 1

                  Yet even the PDP-11 had a cache. C might have been low enough to pop in to assembly or write to an arbitrary memory position, but that does not mean it ever truly controlled the processor.

                  1. 1

                    That would be an exception to the model if C programmers routinely controlled the cache. It wouldn’t be if the cache was an accelerator that works transparently in the background, following what the program is doing with RAM. Which is what I thought it did.

                    Regardless, C allows assembly. If instructions are available, it can control the cache with wrapped functions.
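
                    A minimal sketch of such a wrapper, with one swap: it uses the GCC/Clang builtin __builtin_prefetch rather than inline assembly, and the names PREFETCH_READ and sum_with_prefetch are made up for illustration. On other compilers the macro expands to nothing:

                    ```c
                    #include <stddef.h>

                    /* Hypothetical wrapper around a cache hint. __builtin_prefetch is
                       a GCC/Clang builtin, not standard C. Arguments: address,
                       0 = prefetch for read, 3 = keep in all cache levels. */
                    #if defined(__GNUC__) || defined(__clang__)
                    #define PREFETCH_READ(addr) __builtin_prefetch((addr), 0, 3)
                    #else
                    #define PREFETCH_READ(addr) ((void)0)
                    #endif

                    /* Sum an array while hinting that data 16 elements ahead will be
                       read soon. The hint never changes the result, only (possibly)
                       the timing. */
                    long sum_with_prefetch(const int *data, size_t n)
                    {
                        long total = 0;
                        for (size_t i = 0; i < n; i++) {
                            if (i + 16 < n)
                                PREFETCH_READ(&data[i + 16]);
                            total += data[i];
                        }
                        return total;
                    }
                    ```

                    It is only a hint; the hardware is free to ignore it, so this is influence over the cache rather than control.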

          2. 1

            In my experience, C is very close to how the processor works. Pretty much every C “phoneme” maps to one or two instructions. Assembly is a bit more expressive, especially when it comes to bit operations and weird floating point stuff (and loads of weird, specialized stuff), but C can only use features it can expect any reasonable ISA to have. It is usually easy to extend to accommodate the more specific things, far easier than most other languages.

            About your three examples:

            1. Adding proper support for SIMD is difficult, because it differs greatly between architectures and even between versions of one architecture. Designing around these differences in a performant way (and if someone is vectorizing by hand, performance matters) is hard enough that I haven’t seen a good implementation. GCC has an extension that tries, but it is a bit of a PITA to use. There are relatively easy-to-use machine-specific extensions out there that fit well into the language.

            2. If you malloc anything, you’ll get memory in a non-executable page from any sane allocator. If you want memory with execute permissions, you’ll have to mmap() yourself.

            3. Thumb doesn’t really change the semantics of the logical processor, it just changes the instruction encoding. This is almost completely irrelevant for C.
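
            For point 1, a minimal sketch of what the GCC vector extension looks like (Clang accepts it too). This is a compiler extension, not standard C, and the names v4i32 and add4 are made up for illustration:

            ```c
            #include <stdint.h>
            #include <string.h>

            /* GCC/Clang vector extension: four 32-bit lanes packed in 16 bytes. */
            typedef int32_t v4i32 __attribute__((vector_size(16)));

            /* Hypothetical helper: out[i] = a[i] + b[i] for i in 0..3. */
            void add4(const int32_t *a, const int32_t *b, int32_t *out)
            {
                v4i32 va, vb, vsum;
                /* memcpy avoids assuming the inputs are 16-byte aligned. */
                memcpy(&va, a, sizeof va);
                memcpy(&vb, b, sizeof vb);
                /* Element-wise add; the compiler may lower this to one SIMD
                   instruction, or to scalar code if the target has none. */
                vsum = va + vb;
                memcpy(out, &vsum, sizeof vsum);
            }
            ```

            The simple cases read fine; the pain starts once you need shuffles, wider lanes, or anything that differs per target.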

            You can of course argue that most modern ISAs are oriented around C (I’m looking at you, byte addressability) and not the other way around, but that is a debate for another day.

            1. 2

              “Adding proper support for SIMD is difficult, because it is very different between architectures or between versions of an architecture.”

              There have been parallel languages that can express that and more for years. C just isn’t up to the task. Chapel is my current favorite given all the deployments it supports. IIRC, the Cilk language was a C-like one for data-parallel apps.

              1. 1

                Cilk is a C extension. Also, it is based upon multithreading, not SIMD.

                1. 1

                  Oh yeah. Multithreading. Still an example of extending the language for parallelism. One I found last night for SIMD in C++ was here.

        6. 3

          I see your point regarding people sh*tting all over C/C++. These are clearly good languages and they definitely have their place. However, I work with C++ pretty frequently (not low-level OS stuff, strictly backend and application stuff on Windows) and I’ve encountered a couple of instances in which people way more capable than I am managed to shoot themselves in the foot. That changed my perspective.

          To be clear, I also love C (mmmm…less C++), and I think that most developers would do well to at least pick up the language and be able to navigate (and possibly patch) a large C codebase. However, I’d also wager that an awful lot of stuff that is written in C/C++ today probably doesn’t need the low-level access that these languages provide. I think this is particularly true now that languages like Rust and Go are proving to be both very capable at the same kinds of problems and also substantially less foot-gunny.

      3. 2

        This is a bit tangential, but your link for “Modern C++ Won’t Save Us” points to the incorrect page.

        It should point to: