This seems counterproductive.
In the near term, the machine code generated by Zig will become less competitive. Long-term, it may catch up or even surpass LLVM and GCC.
Technically true, but realistically, how confident is the author that they can surpass a project backed by the biggest corporations in the world and a project underpinning a few (all?) of the largest FLOSS efforts?
In the near term it would also reduce the number of targets Zig supports, since LLVM has a nice laundry list. However, LLVM supports fewer targets than you think… we are constantly running into bugs and embarrassing O(N^2) code for pretty much every target except x86, aarch64, and webassembly.
Coincidentally, those are the targets that are listed a little earlier in the OP as being worked on for Zig. I assume those are going to be the only targets for a while. LLVM might not be optimal for other platforms, but it does support those other platforms. Is going from suboptimal support to no support an upgrade?
We can attract direct contributions from Intel, ARM, RISC-V chip manufacturers, etc., who have a vested interest in making our machine code better on their CPUs.
Again, technically true, but how realistic is it? I’m sure all those vested interests already invest in LLVM. Would they want to also invest in Zig? Let’s face it, Zig is pretty small at the moment. Zig would get all the results of those vested interests improving LLVM anyway, but can it effectively divert effort from LLVM, or get those vested interests to make an additional effort to contribute to Zig in parallel?
The benefit list is tempting, true. It’s full of potential. I’m just a little sceptical whether it can be realised to the extent it’s pitched.
I don’t have a horse in this race. I don’t know the language or the community, so I can’t give an accurate judgement on the announcement. If the authors want to work on the compiler (as opposed to on the language), all power to them. I just don’t see it as a good way to advance the project in the “product” sense. To me it seems like a lot of effort is going to be spent on reimplementing what’s already been done by someone else.
Given the enormous effort that goes into LLVM I’m a little skeptical about Zig code generation eventually being able to surpass either, too. But I can sort of see where this is coming from.
A few months ago I drew the short stick writing some LLVM-related build and glue code (I was the only contractor who’d ever touched it, and that was in like 2007 – I’m only moderately interested in compiler design and completely uninterested in language design). 15 years is of course a lot of time, so I wasn’t expecting LLVM to be anything like what I remembered, but holy crap. Bootstrapping is quite a goat carnage, and you have to do it, because it moves pretty quickly and regressions are pretty abundant, so your distro’s/Homebrew’s/whatever packages tend to be useless if you want to develop against it, and you start getting into weird bugs as soon as you do something non-trivial.
It’s not bad. I’m admittedly entirely uneducated when it comes to language tooling, but as far as I can tell it’s an extraordinarily good piece of software, and the bug churn is inherent to its massive scope and the extremely difficult problem that it solves. However, like most corporate-backed projects, it seems to have evolved to the point where it’s also difficult to use if you don’t have corporate-level resources to throw at it.
Looking through Zig’s bug tracker, it looks like they spend an uncanny amount of time fighting packaging problems or upstream bugs. Maybe they think they’re just too small for a tool this big.
skeptical about Zig code generation eventually being able to surpass either
Maybe it doesn’t have to?
There is increasing evidence that optimising compilers hit diminishing returns quite a while ago. See Proebsting’s Law, which states that compiler advances double computing power every 18 years, compared to 18 months for hardware advances. That might seem depressing, but it was actually wildly optimistic: the current rate seems to be a doubling every 50 years, with the interval increasing, and at the cost of a much larger increase in compile times.
See also The Death of Optimising Compilers by Daniel Bernstein of cryptography and Qmail fame. Frances Allen got all the good ones.
The Oberon optimising compiler is 6 source files, LLVM is what, a million lines of code?
For my own Objective-S, I made three attempts with LLVM, and each time gave up, because the cost of using it was just too high. Last summer I broke down and decided to just do my own native compiler. Are there times when I am struggling with some aspect of code-generation or object-file layout that I regret this choice? Absolutely! But overall, both progress and developer happiness are much better.
Maybe it doesn’t have to?
Maybe not. I was merely commenting on what is in the OP. I assume that is their ambition, based on what they said in the OP alone.
A couple of points:
The efficacy of compiler optimisations follows a pareto distribution (see e.g.). Llvm’s investors really need (or, at least, think they need) those extra few single-digit %; if you don’t, you may be able to get away with something much smaller that still works pretty well.
Llvm is not the be-all and end-all of compilers—it is very much a product of its time, and taking advantage of developments since its creation would allow one to create a compiler that generates code of a given degree of quality for appreciably less effort and code. Redundancies in its ir mean that optimisations may, to a great extent, be brute-forced where a more unified mechanism would serve better. The example I always use is this. It is not per se a problem that the code for g is suboptimal—getting that right involves annoying cost models that may not generalise from a three-line function. But it is a problem that the compiler is able to tell the difference between f and g (and note llvm and gcc both fall down in exactly the same way, and for the same reason).
That is an interesting example, but I’m having a little trouble grokking this:
But it is a problem that the compiler is able to tell the difference between f and g
Did you mean that it is a problem that the compiler is not able to tell that they are semantically the same?
Why is it a problem? Are you just referring to the fact that it ends up emitting two function bodies instead of one?
Are you just referring to the fact that it ends up emitting two function bodies instead of one?
No—that part is trivial. The problem is not the relation of different functions, only the analysis of a single function.
In general, it’s desirable for a compiler intermediate representation to canonicalise, in order to eliminate redundancies—it’s undesirable to have two ways of expressing the same thing. One trivial example is addition and negation vs subtraction; if the ir includes addition, subtraction, and negation, then there is a redundancy: do you write x-y or x+(-y)? The problem with this is that a lot of optimisations have to be written twice—once, to apply to x-y, and once again to apply to x+(-y). The solution is to not have subtraction in the intermediate representation; a source-level x-y is always lowered to x+(-y) for optimisation purposes. (We can say that x+(-y) is the canonical form of a subtraction.) Hence, the optimiser is utterly unable to tell whether the user wrote x-y or x+(-y), and so it does not need to care.
Needing to express an optimisation twice—once for addition, and once for subtraction—is what I mean by ‘brute forcing’; you’re spending less time looking for patterns and underlying structure, and more time just throwing rules at the wall.
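A toy sketch of the shape this takes (invented node names, C++ only for illustration, not any real compiler’s IR):

```cpp
#include <memory>
#include <variant>

// Hypothetical expression nodes for illustration only.
struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

struct Var { const char* name; };
struct Neg { ExprPtr operand; };
struct Add { ExprPtr lhs, rhs; };

// Deliberately no Sub node: the IR simply cannot express subtraction.
struct Expr { std::variant<Var, Neg, Add> node; };

// The front end lowers a source-level x - y to x + (-y), so every rule the
// optimiser has for Add/Neg automatically covers both spellings.
ExprPtr lower_sub(ExprPtr lhs, ExprPtr rhs) {
    auto neg = std::make_shared<Expr>(Expr{Neg{std::move(rhs)}});
    return std::make_shared<Expr>(Expr{Add{std::move(lhs), std::move(neg)}});
}
```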
Llvm ir doesn’t have subtraction—that is, again, a fairly trivial example, and is not hard to fix retroactively. But its intermediate representation (like gcc’s) carries around redundant information about sequencing, instead of including only as much sequencing information as is necessary to preserve the program semantics, and this is much harder to fix ex post facto. The result of this is that f and g have distinct ir. And this comes through in the final generated code, because the compiler did not go to extra effort to account for it. (Even if it had done so, it would not change the fact that the redundancy in the ir is there, and forces additional redundancy and complexity into the compiler; but the fact that it has not makes it an easy way to demonstrate the deficiency.)
One trivial example is addition and negation vs subtraction
In practice this can be handled by having a pass looking for the x+(-y) pattern and replacing occurrences with x-y (or the other way around), effectively giving you a canonical form. I’m not sure that I buy the idea that not having subtraction in the IR is really a win, when it’s so simple to remove it anyway, but sure, this is just another way of achieving the same goal.
However, I’m dubious about the (potential) ability for semantically equivalent code to be canonicalised to an identical IR. In your simple example the sequencing isn’t important, so it’s true that the IR need not carry the sequencing of the call vs the arithmetic operation, but of course there are plenty of cases where A really must happen before B, and some where the compiler can’t easily (or can’t at all) determine whether the sequence must be preserved. (Consider the simple case of storing through two different pointer variables: if the pointers can alias, the order of the stores must be preserved; otherwise, in the absence of volatile, it’s not necessary. But determining whether pointers can alias can be undecidable.)
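To make the aliasing point concrete, a minimal sketch (the __restrict spelling is a common compiler extension, not standard C++):

```cpp
// If p and q may alias, the two stores must stay in this order: when p == q,
// the value observed afterwards through either pointer has to be 2.
void stores(int* p, int* q) {
    *p = 1;
    *q = 2;
}

// With a no-alias guarantee, the stores are independent and the compiler is
// free to reorder or combine them.
void stores_noalias(int* __restrict p, int* __restrict q) {
    *p = 1;
    *q = 2;
}
```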
Still, it’s an interesting example.
Of course you can’t canonicalise all semantically equivalent functions to the same IR—that would violate rice’s theorem. I’m not arguing that you should be able to. Only that the irs of llvm and gcc carry around a shitload of redundant sequencing information (among other things) and it holds them back considerably.
E:
this can be handled by having a pass looking for the x+(-y) pattern and replacing occurrences with x-y (or the other way around)
I guess. I don’t think it makes much difference (only being able to represent one form vs only having one form that you ever use), except that the former seems much nicer.
The forms of redundancy found in llvm and gcc’s irs are more interesting, and you will likely find them harder to treat that way. Compare block-oriented cfg vs sea of nodes.
Bootstrapping is quite a goat carnage, and you have to do it, because it moves pretty quickly and regressions are pretty abundant, so your distro’s/Homebrew’s/whatever packages tend to be useless if you want to develop against it, and you start getting into weird bugs as soon as you do something non-trivial.
That’s not been my experience in recent releases (after about LLVM 8). I’ve maintained out-of-tree things that had a small abstraction layer to paper over differences and were able to build against 3-4 LLVM releases. The opaque pointer work took a long time to land and was a big change, but it was also supported with a very long (multi-year) deprecation period where you could build code that was conditional on whether the current version was built with opaque pointers or not.
I don’t doubt it, LLVM is large and covers a lot of architectures – it’s very likely that I’ve only touched some weird part. Every good project has one of these weird swamps. Thing is, if you have to deal with the one or two things it does… not even badly, just not quite the way you need it, it can seem like it’s not worth it, especially if you rely on indie developers working in their spare time a lot.
My post wasn’t intended as a jab at LLVM – it’s a big project, and whatever unwieldiness it has is likely unevenly spread, not just inherent to its size and the difficulty of the problem it solves, but further compounded by my being an amateur in its field. I rather wanted to point out that, if you’re not specifically interested in writing toolchains with features LLVM is particularly good at, or with features that are particularly important for LLVM (or, in the extreme case, you’re not that interested in writing toolchains, like me), this may seem like a far more reasonable trade-off than it looks at first.
Fun fact, Crystal currently supports all major LLVM releases from 8 to 16 (!)
I think the biggest reason to be optimistic is the Zig C backend, so we’re not really losing any platform support. Granted, I don’t know how well it works in practice and what the performance is like compared to the LLVM backend.
I don’t know how well it works in practice
We use it to bootstrap Zig, so it works at least that well.
what the performance is like compared to the LLVM backend
The C backend compiles Zig code to C code, so performance is up to the C compiler that you end up using.
I don’t agree. LLVM is a huge value add for something like Rust or C++, where having a hyper-optimizing backend helps make some of the much higher-level abstractions zero cost. But Zig is a relatively simple language, like Go, which also has its own code generation backend. Go’s code generation is much less sophisticated than LLVM’s, but Go is still considered a high-performance language.
Go is considered high performance by people comparing it to Python; it is much slower than C++ for a lot of things. The go front end for GCC was much faster for single-threaded execution but used one pthread per goroutine, so it ended up being slower overall. I don’t know if this was ever fixed.
The go front end for GCC
I know of exactly 0 people that have ever used this. Everyone I know of uses the default, written in Go, compiler and toolchain.
My favourite Go performance story was that it always passed arguments on the stack, even when it could have used registers.
The Zig author and the small team that he has built around the language seem quite capable. I would like to see them go in the direction of having their own IR and an optimizing backend.
The approach that they have been taking with everything (at least in my understanding) is
a) give better language and better tooling for multi-OS and multi-CPU targets
b) consider developer experience as a whole (programming, debugging, managing build systems, using external dependencies)
c) Recognize that operating systems and their user-land revolve around C
d) Carefully consider build-times and incremental build times as part of developer experience (smaller is better)
I think they are also seeing that (d) may not be achievable with the LLVM backend – because LLVM’s goals would drive it to accept larger compile/link times, and less exhaustive testing of non-tier-1 platforms, in exchange for small optimization wins.
So, all in all, for new CPUs and new or less well-served OSes, a Zig-supported backend may be a win, which in turn results in more competition with the established vendors (OS- and CPU-wise). Yes, there will be more fragmentation, but I hope the overall outcome will be good.
Again, technically true, but how realistic is it? I’m sure all those vested interests already invest in LLVM.
It’s quite common for microcontrollers to have a vendor-supplied C compiler. If the license suits, it’s not inconceivable they might base it on the Zig C compiler.
Project homepage is here https://eyg.run/
I like the portability and effect system. Sending code over the wire made me think of Telescript. It’s also a nice-looking summary page.
Thanks. I actually think effect systems are going to find use in many languages in the future. For a while I was thinking of describing eyg as a strongly typed Lua, and of dropping the structural editor, but for now I’m keen to keep both experiments going, even if it makes the language a bit more niche in the meantime.
Here’s a temporary pastebin with that program in Rust, generated by both ChatGPT versions. Is this good or bad Rust? Or something else entirely?
Seems reasonable in my eyes, though I only have 6 months of professional experience with Rust. GPT-4 even gets the dependency on tokio right. I can’t help but think there must be a way to join all the handles at once, but I haven’t looked into it.
If you ever use it, one thing I’ve learned about it is that it’s better at whatever it sees lots of, probably due to how they learn by example. It’s really good at Python, Flask, and HTML. Rust is newer, so it might perform worse. Many LLMs are being trained using GitHub data. They’ll learn those patterns.
Using that theory, one of the best things you can do is to use whatever is the most common library or approach on Github. Also, use it for whatever people commonly do in Rust code. It might generate mostly working code that way. The more you stray off that path, the worse its results might get. Also, I use it to generate code in one language or library that’s easy to port to another. If it didn’t understand Rust web stuff, I could do Flask and HTML generation because that would be straightforward to port to some Rust frameworks.
A lot on why you should read CS papers and which papers you should read, but not how to read them. First rule: read the abstract, skim the introduction, read the conclusion, then decide if it’s worth reading the rest of the paper.
Those are good steps. I also always read “related work” while skimming the implementation and limitations.
The related work of a few papers gave me a start on a whole survey of that sub-field. How they do the implementation hints at whether what they did even matters in the real world. Limitations will do that while telling you the next research problem to work on, and also what tech to mix and match it with, since its limitations will be the advertised strengths of other works.
So so true. I spent about 10 years of my life trying to find something better than C for programming stuff that needed strong control over memory. You can do it in C#, Lisp, etc but it requires incredibly detailed knowledge of the implementation.
C++? The adoption/learning curve is so shallow — for instance you can keep writing C code but just use (one or more of) std::string, std::vector, and std::unique_ptr, and most of your memory management code and its bugs go away. And of course use new/delete/malloc/free if you really need to.
After writing an OS in C++, I really hate having to go back to C. I have to write far more code in C, and (worse) I have to think about the same things all of the time. Even just having RAII saves me a huge amount of effort (smart pointers are a big part, but so is having locks released at the end of a scope). For systems code, data structure choice is critical and C++ makes it so much easier to prototype with one thing, profile usage patterns, and then replace it with something else later.
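As a minimal sketch of what that buys (invented names, not from any real OS code):

```cpp
#include <memory>
#include <mutex>
#include <vector>

std::mutex table_lock;
std::vector<std::unique_ptr<int>> table;

// RAII version: the lock is released and nothing leaks on every exit path,
// including the early return, without any explicit cleanup code.
bool add_entry(int value) {
    std::lock_guard<std::mutex> guard(table_lock);
    if (table.size() >= 1024)
        return false;                      // guard unlocks here automatically
    table.push_back(std::make_unique<int>(value));
    return true;                           // ...and here
}
```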
Do you think it would be helpful to have a C with Lisp-style metaprogramming to implement higher-level constructs that compile down to C? And then you use only what you’re willing to pay for or put up with?
One that got into semi-usable form was the ZL language, which hints at many possibilities in C, C++, and Lisp/Scheme.
Since you wanted destructors, I also found smart pointers for C whose quality I couldn’t evaluate because I don’t program in C. It looked readable at least. There’s been many implementations of OOP patterns in C, too. I don’t have a list of them but many are on StackOverflow.
Or you could just use a language that’s widely supported by multiple compilers and has these features. I implemented a bunch of these things in C, but there were always corner cases where they didn’t work, or where they required compiler-specific extensions that made them hard to port. Eventually I realised I was just implementing a bad version of C++.
In the example that you link to, it uses the same attribute that I’ve used for RAII locks and for lexically scoped buffers. From the perspective of gcc, these are just pointers. If you assign the value to another pointer-type variable, it will not raise an error and will give you a dangling pointer. Without the ability to overload assignment (and, ideally, move), you can’t implement these things robustly.
C++ metaprogramming got a lot better with constexpr and the ability to use structural types as template arguments. The only thing that it lacks that I want is the ability to generate top-level declarations with user-defined names from templates.
The only thing that it lacks that I want is the ability to generate top-level declarations with user-defined names from templates.
Can you elaborate on this? I wonder if it’s related to a problem I have at work.
We have a lot of std::variant-like types, like using BoolExpr = std::variant<Atom, Conjunction, Disjunction>;. But the concise name BoolExpr is only an alias: the actual symbol names use the full std::variant<Atom, Conjunction, Disjunction>. Some of these variants have dozens of cases, so any related function/method names get reaallly long!
I think I would want a language feature like “the true name of std::variant<Atom, Conjunction, Disjunction> is BoolExpr”. Maybe this would be related to explicit template instantiation: you could declare this in bool_expr.h and it would be an error to instantiate std::variant<Atom, Conjunction, Disjunction> anywhere else.
The main thing for me is exposing things to C. I can use X macros to create a load of variants of a function that use a name and a type in their instantiations, but I can’t do that with templates alone. Similarly, I can create explicit template instantiations in a file (so that they can be extern in the header) individually, but I can’t write a template that declares extern templates for a given template over a set of types and another that generates the code for them in my module.
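A rough sketch of that X-macro workaround, with invented names (sum is just a stand-in):

```cpp
// One list of (name, type) pairs drives everything below.
#define FOR_EACH_EXPORT(X) \
    X(sum_int,    int)     \
    X(sum_float,  float)   \
    X(sum_double, double)

template <typename T>
T sum(const T* data, int n) {
    T total{};
    for (int i = 0; i < n; ++i) total += data[i];
    return total;
}

// C-visible wrappers: a template alone cannot do this, because each exported
// symbol needs a user-chosen, top-level name.
#define DEFINE_C_WRAPPER(name, type) \
    extern "C" type name(const type* data, int n) { return sum<type>(data, n); }
FOR_EACH_EXPORT(DEFINE_C_WRAPPER)

// Explicit instantiations so the template bodies live in this one module
// (the header would carry the matching extern template declarations).
#define INSTANTIATE(name, type) template type sum<type>(const type*, int);
FOR_EACH_EXPORT(INSTANTIATE)
```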
The reflection working group has a bunch of proposals to address these things and I’ve been expecting them to make it into the next standard since C++17 was released. Maybe C++26…
My motivation was this. At one point, I was also considering embedded targets which only support assembly and C variants.
What do you think of my Brute-Force Assurance concept that reuses rare, expensive investments in tooling across languages?
I think platforms without C++ support are dying out. Adding a back end for your target to LLVM is cheaper than writing a C compiler, so there’s little incentive not to support C++ (and Rust). The BFA model might work, but I’d have to see the quality of the code that it generated. Often these tools end up triggering UB, which is a problem, or they leave out the kind of micro-optimisations that are critical to embedded systems and make them impossible to add in the generated code.
a) this started in 2002 when half that stuff didn’t exist, and
b) C++ is a hateful morass of bullshit and misdesign, and that won’t change until they start removing things instead of adding them.
Yes, I am biased. Not going to change though.
Pretty sure at least string and vector existed in 2002; not unique_ptr but you can implement that yourself in 10 minutes.
You couldn’t implement unique_ptr in 2002 with the semantics that it has today. unique_ptr requires language support for move semantics in order to give you that uniqueness promise automatically and move semantics came to C++ in 2011.
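For illustration, a minimal sketch of a move-only owning pointer (a hypothetical owning_ptr, not the real std::unique_ptr) showing why rvalue references are the missing piece:

```cpp
template <typename T>
class owning_ptr {                 // hypothetical name, not std::unique_ptr
    T* p_ = nullptr;
public:
    explicit owning_ptr(T* p = nullptr) : p_(p) {}
    ~owning_ptr() { delete p_; }

    // Copying would mean two owners, so it is simply forbidden.
    owning_ptr(const owning_ptr&) = delete;
    owning_ptr& operator=(const owning_ptr&) = delete;

    // Transferring ownership needs an rvalue reference, which only arrived
    // with C++11; pre-2011 attempts (std::auto_ptr) had to abuse the copy
    // constructor to steal from the source, which broke in subtle ways.
    owning_ptr(owning_ptr&& other) noexcept : p_(other.p_) { other.p_ = nullptr; }
    owning_ptr& operator=(owning_ptr&& other) noexcept {
        if (this != &other) { delete p_; p_ = other.p_; other.p_ = nullptr; }
        return *this;
    }

    T* get() const { return p_; }
    T& operator*() const { return *p_; }
    T* operator->() const { return p_; }
};
```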
but it requires incredibly detailed knowledge of the implementation.
Not just that - you actively need to work around the problems and limitations of the runtime. E.g. when garbage collection bogs your application down, you need to start creating object pools. Hence, you end up manually managing memory again - precisely the thing you tried to avoid in the first place. Many runtimes do not let you run the garbage collector manually or specify fine-granular garbage collection settings. In addition to that, an update to the runtime (which you often do not control because it’s just the runtime that is installed on the user’s machine) can ruin all your memory optimizations again and send you back to square one, which is a heavy maintenance burden.

It just doesn’t make any sense to use these languages for anything that requires fine-grained control over the execution. Frankly, it doesn’t make any sense to use these languages at all if you know C++ or Rust, unless the platform forces you to use them (like the web pretty much forces you to use JavaScript if you want to write code that is compatible with most browsers).
It’s been a long time (over a decade) since I had to deal with GC problems being noticeable at an application level. A lot of these problems disappeared on their own, as computers became faster and GC algorithms moved from the drawing boards into the data center. (I was working on software for trading systems, algo trading, stock market engines, operational data stores, and large scale distributed systems. Mostly written in Java.)
In the late 90s, GC pauses were horrendous. By the late aughts, GC pauses were mostly manageable, and we had had enough time to work around the worst causes of them. Nowadays, pauseless GC algorithms are becoming the norm.
I still work in C and C++ when necessary, just like I still repair junk around the house by hand. It’s possible. Sure, it would be far cheaper and faster to order something brand new from China, but there’s a certain joy in wasting a weekend trying to do what should be a 5 minute fix. Similarly, it’s sometimes interesting to spend 40 person years (e.g. a team of 10 over a 4 year period) on a software project in C++ that would take a team of 5 people maybe 3 months to do in Go. Of course, there are still a handful of projects that actually need to be built in C or C++ (or Rust or Zig, I guess), but that is such a tiny portion of the software industry at this point, and those people already know who they are and why they have to do what they have to do.
You said “It just doesn’t make any sense to use these languages for anything that requires fine-grained control over the execution.” But how many applications still require that level of fine grained control?
For literally decades, people have been saying that GC has now improved so much that it’s become unnoticeable, and every single time I return to try it, I encounter uncontrollable, erratic runtime behavior and poor performance. Unless you write some quick and dirty toy program to plot 100 points, you will notice it one way or another. Try writing a game in JavaScript - you still have to do object pooling. Or look at Minecraft - the amount of memory the JVM allocates and then frees during garbage collection is crazy. Show me a garbage collector and I’ll show you a nasty corner case where it breaks down.
Similarly, it’s sometimes interesting to spend 40 person years (e.g. a team of 10 over a 4 year period) on a software project in C++ that would take a team of 5 people maybe 3 months to do in Go.
Okay, I’m not a big C++ fan but this is obviously flamebait. Not even gonna comment on it further.
But how many applications still require that level of fine grained control?
A lot. Embedded software, operating systems, realtime buses, audio and video applications… Frankly, I have a hard time coming up with something I worked on that doesn’t require it. Not to mention, even if the application doesn’t strictly require it, a GC is still intrinsically wasteful, making the software run worse, especially on weaker machines. And even if we say performance doesn’t matter, using languages with GC encourages bad and convoluted design and incoherent lifetime management. So, no matter how you look at it, GC is a bad deal.
Okay, I’m not a big C++ fan but this is obviously flamebait. Not even gonna comment on it further.
I managed a large engineering organization at BigTechCoInc for a number of years, and kept track (as closely as possible) of technical projects and what languages they used and what results they had. Among other languages we used in quantity: C, C++, Java, C#. (Other languages too, including both Python and JS on the back end, but not enough to draw any clear conclusions.)

The cost per delivered function point was super high in C++ compared to everything else (including C). C tended to be cheaper than C++ because it seemed to be used mostly for smaller projects, or (I believe) on more mature code bases making incremental changes; I think if we had tried building something new and huge in C, it may have been as expensive as the C++ projects, but that never happened. Java and C# are very similar languages, and had very similar cost levels, much lower than C or C++, and while I didn’t run any Go projects, I have heard from peers that Go costs significantly less than Java for development (but I don’t know about long-term maintenance costs).

One project I managed was implemented nearly simultaneously in C++, C#, and Java, which was quite illuminating. I also compared notes with peers at Amazon, Facebook, Google, Twitter, Microsoft, eBay, NYSE (etc.), and lots of different financial services firms, and their anecdotal results were all reasonably similar to mine. The two largest code bases for us were Java and C++, and the cost with C++ was an order of magnitude greater than Java.
Embedded software, operating systems, realtime buses, audio and video applications
Sure. Like I said: “Of course, there are still a handful of projects that actually need to be built in C or C++ (or Rust or Zig, I guess), but that is such a tiny portion of the software industry at this point, and those people already know who they are and why they have to do what they have to do.”
Or look at Minecraft - the amount of memory the JVM allocates and then frees during garbage collection is crazy.
This is absolutely true. The fact that Java works at all is a freaking miracle. The fact that it manages not to fall over with sustained allocation rates of gigabytes per second (mostly all tiny objects, too!) is amazing. That Minecraft succeeded is a bit shocking in retrospect.
Very interesting. Do you have more fine-grained knowledge about the cost per delivered function point with respect to C++? Is the additional cost caused by debugging crashes, memory leaks, etc.? Is it caused by additional training and learning or tooling and build systems? Does the usage of modern C++ idioms make a difference? Or does everything simply take longer, death by a thousand cuts?
Some more data points occurred to me. I was thinking about an old presentation I did at a few different conferences on the topic, e.g. https://www.infoq.com/presentations/Keynote-Lessons-Java-CPlusPlus-History-Cloud/
Specifically, looking at areas that Java was able to leverage:
My thinking has evolved in the subsequent decade, but there are some key things in that list that really show the pain points in C++, specifically around the difficulty of re-using libraries and components. But the other thing that’s important to keep in mind is that the form of applications has changed dramatically over time: An app used to be a file (.bin .com .exe whatever). Then it was a small set of files (some .so or .dll files and some data files in addition to the executable). And at some point, the libraries went from being 1% of the app to 99% of the app.
Just like Java/C# ate C++’s lunch in the “Internet application” era, some “newer” platforms (the modern browser plus the phone OSs) show how ill equipped Java/C# are, although I think that stuff like React and Node (JS) are just interim steps (impressive but poorly thought out) toward an inevitable shift in how we think about applications.
Anyhow, it’s a very interesting topic, and I wish I had more time to devote to thinking about this kind of topic than just doing the day job, but that’s life.
I’m going to go into opinion / editorial mode now, so please discount accordingly.
C++ isn’t one language. It’s lots of different languages under one umbrella name. While it’s super powerful, and can do literally everything, that lack of “one true way to do everything” really seems to hurt it in larger teams, because within a large team, no subgroups end up using the same exact language.
C++ libraries are nowhere near as mature (either in the libraries themselves, or in the ease of using them randomly in a project) as in other languages. It’s very common in other languages to drag in different libraries arbitrarily as necessary, and you don’t generally have to worry about them conflicting somehow (even though I guess they might occasionally conflict). In C++, you generally get burnt so badly by trying to use any library other than boost that you never try again. So then you end up having to build everything from scratch, on every project.
Tooling (including builds) is often much slower and quite complicated to get right, particularly if you’re doing cross platform development. Linux only isn’t bad. Windows only isn’t bad. But Linux + Windows (and anything else) is bad. And compile times can be strangely bad, and complex to speed up. (A project I worked on 10+ years ago had 14 hour C++ builds on Solaris/Sparc, for example. That’s just not right.)
Finding good C++ programmers is hard. And almost all good C++ programmers are very expensive, if you’re lucky enough to find them at all. And a bad C++ programmer will often do huge harm to an entire project, while a bad (for example) Python developer will tend to only shit in his own lunchbox.
I think the “death by 1000 cuts” analogy isn’t wrong. But it might only be 87 cuts, or something like that. We found that we could systematize a lot of the things necessary to make a C++ project run well, but the list was immense (and the items on the list more complex) compared to what we needed to do in Java, or C#, etc.
Finding good C++ programmers is hard
This depends a lot on your baseline. It’s easier to find a good C++ programmer than a Rust programmer of any skill level. Over the last 5 years, it’s become easier to find good C++ programmers than good C programmers. It’s orders of magnitude easier to find a good Java, C#, or JavaScript programmer than a good C++ programmer and noticeably easier than finding C++ programmers of any competence level.
Embedded software, operating systems, realtime buses, audio and video applications…
Yep! In other words, almost all the things I’m most interested in!
So, no matter how you look at it, GC is a bad deal.
…Ok I gotta call you out there. :P There’s plenty of times when a GC is a perfectly fine and/or great deal. The problem is just that when you don’t want a GC, you really don’t want a GC, and most languages with a GC use it as a way to make simplifying assumptions that have not stood the test of time. I think a bright future exists for languages like Swift, which use a GC or refcounting and have a good ownership system to let the compiler optimize the bits that don’t need it.
It’s a bad deal you can sometimes afford to take when you have lots of CPU cycles and RAM to spare ;-)
Don’t get me wrong, I’m open to using any tool as long as it gets the job done reliably. I wouldn’t want to manage memory when writing shell scripts. On the other hand, the use-case for shell scripts is very narrow, I wouldn’t use them for most things. The larger the project, the more of a liability GC becomes.
It’s a bad deal you can sometimes afford to take when you have lots of CPU cycles and RAM to spare ;-)
It’s not always that clear cut. Sometimes the performance gains from being able to easily use cyclic data structures that model your problem domain and lead to efficient algorithms can significantly outweigh the GC cost.
Ok, fair. :-) Hmmmm though, I actually thought of a use case where GC of some form or another seems almost inevitable: dealing with threads and/or coroutines that have complex/dynamic lifetimes. These situations can sometimes be avoided, but sometimes not, especially for long-running things. Even in Rust it’s pretty common to deal with them via “fiiiiiiine just throw the shared data into an Rc”.
Also, since killing threads is so cursed on just about every operating system as far as I can tell, a tracing GC has an advantage there in that it can always clean up a dead thread’s resources, sooner or later. One could argue that a better solution would be to have operating systems be better at cleaning up threads, but alas, it’s not an easy problem.
Am I missing anything? I am still a novice with actually sophisticated threading stuff.
dealing with threads and/or coroutines that have complex/dynamic lifetimes
The more code I write, the more I feel that having a strong hierarchy with clearly defined lifetimes and ownership is a good thing. Maybe I’m developing C++ Stockholm syndrome, but I find myself drawn to these simpler architectures even when using other languages that don’t force me to. About your point with Rc, I don’t think this qualifies as a garbage collector because you don’t delegate the cleanup to some runtime, you still delete the object inside the scope of one of your own functions (i.e. the last scope that drops the object) and thus on the time budget of your own code. Additionally, often just a few key objects/structs need to be wrapped in a std::shared_ptr or Rc, so the overhead is negligible.
Also, since killing threads is so cursed on just about every operating system as far as I can tell
Threads are supposed to be joined cooperatively, not killed (canceled). At the point of being canceled, the thread might be in any state, including in a critical section of a Mutex. This will almost certainly lead to problems down the road. But even joining threads is cursed because people do stuff like sleep(3), completely stalling the thread, which makes it impossible to terminate the thread cooperatively within a reasonable time frame.

The proper way for threads to wait is to wait on the thing you want to wait on plus additionally a cancellation event which would be triggered if the thread needs to be joined. So you wait on two things at the same time (also see select and epoll). It’s not so much the OS that is the problem (though the OS doesn’t help because it doesn’t provide simple to use, good primitives) but the programmer.

Threads should clean up their own state upon being joined. The owner of the thread, the one who called join (usually the main thread) will clean up potential remains like entries in thread lists. There should never be an ownerless thread. Threads must be able to release their resources and be stopped in a timely manner anyway, for example when the system shuts down, the process is stopped or submodules are detached. Here, a garbage collector does not provide much help.
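In C++20 terms, that wait-on-two-things-at-once pattern can be sketched roughly like this (using std::jthread and std::stop_token; the same idea applies with an extra pipe or eventfd in a select/epoll set):

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <stop_token>
#include <thread>

std::mutex m;
std::condition_variable_any cv;
std::queue<int> work;

void process(int /*item*/) { /* hypothetical work */ }

void worker(std::stop_token stop) {
    std::unique_lock lock(m);
    while (!stop.stop_requested()) {
        // Wakes on new work *or* on a stop request, never a blind sleep().
        cv.wait(lock, stop, [] { return !work.empty(); });
        if (stop.stop_requested())
            break;
        int item = work.front();
        work.pop();
        lock.unlock();
        process(item);               // the thread cleans up its own state
        lock.lock();
    }
}

int main() {
    std::jthread t(worker);          // the owner of the thread
    {
        std::lock_guard g(m);
        work.push(42);
    }
    cv.notify_all();
}   // ~jthread() requests stop and joins cooperatively; nothing is killed
```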
The Go projects I’m (somewhat) involved with still very much have GC related performance issues. Big-data server stuff. Recent releases of Go have helped, though.
I first heard of it reading A.I. books, such as Paradigms of A.I. Programming. Here’s a few impressions I both experienced and heard from others.
It was a weird-looking language, outside the normal paradigm, and used formal logic. Each of those can limit adoption of a language. It ran slower than native languages. Like other esoteric languages, it was easier to do Prolog-like things in a mainstream language than to do normal things in Prolog-like languages. When I looked at it, there were transitions happening to GUI programming with no clear way I’d do that. (I found Visual Prolog later.) So, these are quite a few reasons uptake wouldn’t happen.
One more is overhyping it as part of the AI push. I remember big claims by projects like Fifth Generation in Japan. A lot of tech disappeared into the AI Winter. Prolog could’ve been impacted by that. Even if it wasn’t, it might still look high risk given that many programming languages were succeeding while investments that big in Prolog failed. Perception issue.
This language was part of the HPCS program. The others were Fortress and X10. They all had very interesting ideas.
There were two issues: HPC had many hardware targets, each with its own programming approach, and the challenge of quickly making efficient programs on them.
The solutions included data parallelism (SIMD/SPMD), multi-threading libraries on NUMA machines (pthreads/OpenMP), and clusters using message passing (PVM/MPI). SPMD was getting languages like Cilk. NUMA let you code remote reads as a memory access but had the issue of keeping threads near data (data locality) to avoid reads from remote nodes. If you had a cluster, you had that problem plus efficient message packing. Distributed shared memory (DSM) treated a cluster like a single memory (NUMA-like); some of these were called Partitioned Global Address Space (PGAS). There’s also the issue of expressing single-threaded vs parallel ops at the language level. Lastly, a unified model might dodge HPC lock-ins or ruts by allowing architecture swap-outs.
Enter Chapel. They’ve attempted to make one language that can easily express multiple models. Their website already describes their mechanisms well. Compared to prior models, both Chapel and X10 certainly looked a lot easier. My only question that I didn’t get to answer was: “Do they have sufficiently smart compilers? Did they get the good speedups across each category?” Recently, I wondered if it could be a DSL layered on C++, Rust, etc.
Anyway, that’s the history. Hope you all enjoyed it. The most recent program to look into is Exascale Project with Oak Ridge summarizing some of it here.
The best description of Chapel is “language designed from the ground up for high-performance computing”. Ten different modules for different types of fine-grained concurrency, but JSON parsing is a single method in the IO module, because if you need fancy JSON parsing you’re using the wrong language.
@hwayne is definitely correct that JSON parsing isn’t what Chapel was designed for. But for those who would like some JSON mixed in with their scalable computations, it’s worth noting that the Chapel team is currently revamping its IO infrastructure to introduce a new serializer/deserializer framework. One of the motivators for this effort is to move JSON from being a single, happens-to-be-built-in format string (e.g., writef("%jt", myVar);) to a Chapel library that users can modify, extend, or create their own serialization formats within.
For JSON, I found a few links to help people trying to parallelize it. There were more that did queries. I’m just posting these:
Parallel Parsing Made Practical
As far as sequential-like programs go, you can parallelize them in some cases if you can break them up with random execution. I was looking for an example of the technique I’m thinking of; this one (pdf) looks like it, with randomized, parallel branching. It’s kind of like how hardware does speculative execution. You can do that in software sometimes.
Redesigning my TLA+ workshop for a corporate event next week. Then celebrating someone else’s birthday. Then celebrating my own birthday.
This doesn’t feel very different to the cross-compile version to me. To build the WASM version, I need to have the compiler running on some other system. If I can do that, then why can it build a WASM version that can target my platform and architecture but not target my platform and architecture directly?
The reason for wanting to bootstrap is to avoid trusting-trust issues: to start with a compiler that I trust already and know that it faithfully reproduced the behaviour expressed in the source code of the compiler that I’m compiling. I lose that if there is a blob of WASM in the middle.
The important difference for me is that I can put wasm spec & wasm seed into a spaceship and be reasonably sure that the aliens would figure that out.
I could pack x86 seed and the Intel manual instead, but I would feel embarrassed about that!
But what problem are you actually trying to solve? Can you bootstrap the compiler (written in language X) when no existing compilers exist for language X? If you have to do this by implementing a compiler or interpreter for Y, then this makes sense only if Y is significantly simpler than X or if you can assume that an implementation of Y exists. WebAssembly is quite a complex language with a lot of features that are intended to make it possible to statically verify very coarse-grained memory safety and control-flow integrity properties. You’re also going to need at least WASI to be able to do I/O, so there’s a lot more in the embedding than just WebAssembly. SPARC, MIPS, and RISC-V are all simpler than WebAssembly, so you could just ship a binary for these and a spec for the architecture and a UART that you feed source into and get binaries out of. For any non-aliens use case, you can just run it in QEMU.
That’s never the problem I’m trying to solve with bootstrapping, though. The problem that I am trying to solve is how I get a binary for the compiler that I am reasonably confident that I can trust. If my FreeBSD/AArch64 FooLang compiler needs me to run a Linux/x86-64 FooLang compiler to cross-compile the first version, and the Linux version of the compiler needs ten different versions to be built with each other to get from source code to a binary, then there are a lot of places where someone could have inserted a trojan. Going via WebAssembly on the Linux/x86-64 -> FreeBSD/AArch64 step doesn’t address any of that; it still puts every single version of the Linux compiler in my TCB. It also isn’t easier to use.
No one bothers to solve this problem for C (and, increasingly, not for C++) because C/C++ compilers are assumed to be trusted for no good reason, and so the usual dodge is to write a bootstrap compiler in C/C++, then: build your real compiler with the bootstrap compiler and again with your original compiler, use each result to build the compiler once more, and check whether the two outputs are bit-for-bit identical.
If they are, then you can be moderately confident that the bootstrap compiler produces semantically equivalent output to your original.
I liked Andrew Kelly’s article on Zig’s new bootstrap process, which uses WASM+WASI as matklad outlined. WRT Y being significantly simpler (where Y = WASM), they ended up with a wasm2c compiler in 4000 lines of C, which impressed me with its smallness. They skipped the wasm verification and safety features (not needed in this situation) to make it simpler.
Zig still relies on a C compiler for bootstrapping, tho. I guess an alternative would be for Zig’s wasm bootstrap compiler to have a wasm backend instead of a C backend, then the stage2 compiler runs on a wasm compiler or interpreter instead of natively via C. Then porting to a new target requires a wasm interpreter or compiler, eg a wasm2asm instead of wasm2c.
We explored this a while back in the Bootstrapping group. That page, esp “groups” page, has about every tool we found. I’m linking it in case anything there is useful to you. What I write next is specifically about bootstrapping for trustworthy software.
My own interest goes back to Paul Karger designing a compiler subversion during the MULTICS evaluation. The high-assurance security field that they launched found that attacks would happen on every level at every step. So, all of it would need to be secured with rigorous design. What does that take for a trustworthy compiler binary?
For source to object code, we would need verified compilers that do the right thing at a design level, code that’s actually reviewable by humans, that’s analyzable by machines, proof that the object code matched their source, exhaustive testing of all of that, high-security repos, trusted distribution, and ability for users to re-check and re-generate all those artifacts on their own. Most projects, from bootstrapping to reproducible builds, don’t attempt to address the whole thing even minimally. If anything, I believed that most projects are trying to make sure that, in the worst scenario, the potentially-malicious source on one machine will execute the same attacks on their own machine. A huge drop from prior goals.
There are common attacks filtered by some of their work. Let’s say it’s beneficial. Let’s say we use bootstrapping with an eye toward having a result we can trust. This sub-field’s solutions have been all over the place. Here’s a few categories I saw.
Some are about running a long chain of programs to get from simple tools to the binary of a gigantic compiler that nobody can fully review. I think of that as untrustworthy source to untrustworthy source compilation. The source can still be attacked with subversion. It might also already drop security-critical code, such as via optimization. If we build this chain, even most security-focused users are not going to both inspect and re-run it themselves. The reality of the situation is that almost all of them take some expert’s word that the system is secure. That seems similar to just relying on a trusted team to build a compiler. There’s a lot of OSS demand for these designs, though.
Next type: formally-verified or certifying compiler. Examples of formal verification included VLISP, Myreen’s LISP, CompCert, and CakeML. Although CompCert is proprietary, GPL version could be used for bootstrapping. Proof tools often output ML which CakeML might handle. Its intermediate languages could be building blocks for non-ML languages and bootstrapping projects. On certifying, there were tools like Flint’s for ML and Necula’s for Java. Most people on demand side neither know nor want to learn these methods. Those who know them mostly won’t build components like these because they often have different interests. That’s killed on the demand side.
Next type: build a LISP, add higher-level features, turn it into a real language, and write the compiler in that. I’ve seen so many versions of this that I think it’s the easiest. Most people coding in C, C++, Java, and .NET won’t learn Lisp at all. Even if they do, they don’t want it to be a critical part of their non-Lisp system. So, the highest-productivity option is gone.
Many hand-written interpreters existed for many simple languages that could bootstrap the compiler. Like Lisp, they often had features that would make it easier than what GCC was written in. Others, like Pascal/P to P-code, made the target platform easier, too. Their simplicity also made code review easier. Even if they’d like the language, most people interested in bootstrapping wanted to use the heavy language the compiler is for, or their personal favorites that were less suitable. The demand side again makes the language requirements harder that way.
Conclusion: this is not a technology problem. There’s examples of everything I listed already built in many forms. The resistance to that was about non-interest in security requirements, specific languages, use of formally-verified components, etc. From the requirements to the implementations, most things are being decided for non-security-centered reasons which also sometimes lowered productivity. So, I don’t worry about it any more.
I’ll just compliment, trust, and use whatever their teams end up producing. Like I’ve been doing for GCC, LLVM, and recently Python. :)
Yeah, security angle here is fascinating for some people, but not me :0) I already trust soo many things that hacking my compiler isn’t really the easiest way to pwn me.
I am interested in the bootstrapping-from-literally-nothing case, because starting with an arbitrary Turing tarpit and ending up with Rust in O(1) steps with a rather small constant is nifty!
It sounds like you have a more realistic goal! Many in the bootstrapping community were doing it more for fun or an intellectual challenge. One thing we discussed was the balance of expressiveness, ease of implementation, and resulting performance (eg optimizations). We were curious how many steps were needed in between two points to express those ideas cleanly with acceptable performance. Also, what techniques at one layer dramatically reduced the work at another layer. All fascinating stuff.
I’m curious what your goal is. The article opened with “if you start out with nothing.” We almost always have something on one of the platforms. The language designer can also choose portable options in their implementation. “From nothing” designs might not be necessary, but maybe they’re still helpful. For example, WebAssembly’s design might aid portability. On a higher level, I’m curious what you’re personally aiming for or would like to see in this space, whether for practical or fun reasons?
Practically, I am pretty happy with downloading prebuilt compilers and/or cross compiling.
Theoretically, I am interested in self-contained seeds, which do not rely on existence of a particular computation device outside (https://250bpm.com/blog:157/index.html)
I guess, important clarification is “self contained seeds for real software”.
Obviously, if you start from forth or lisp, you can get a rather neat chain. But, if you want to jump from that to Rust, you’d have “implement Rust compiler in forth” step, which doesn’t seem elegant
In the linked example, I’d suggest a good chunk of what the DNA encodes is instructions directly to its environment on what and how to produce the thing. More like a series of instructions in a Linux box that have the syscalls, ABI, etc. You’d have to re-create whatever environment that information is designed for so it can “run.” In computing, you’d have to either change your instructions to match the target’s environment or your environment to match the instructions. Finding the balance between the two, or how many links in the chain are necessary, is the hobby of many boostrappers.
Back to compilers. Before I mention my concept, I’ll mention what inspired it: Pascal/P. Wirth’s design philosophy was to stop adding features when the compiler got too hard to build. His work was easy to read and build. For Pascal/P, he wanted non-compiler experts to be able to port the compiler. His P-code interpreter was close to the metal. The Pascal/P system, including libraries, was compiled entirely to P-Code. Then, just port the interpreter to bootstrap the system on a new architecture. They claimed to have 70 ports or something in two years on machines ranging from tiny machines to mainframes with unusual bits. Proven model.
So, instead of P-code, you start with the primitives that C or Rust is built on. You use their machine-level data types. You use pointers with operations directly on memory, and the ability to move what’s in a file back and forth into memory. You’ll probably need lookup storage for names, a tree w/ tree ops, maybe a stack, and so on. Active Oberon suggests objects might be easy to add. Macros or templates might go in tree ops.
All of this is stuff you can encode as bytecodes in an interpreter. The bytecodes of this interpreter are mostly self-contained functions that operate directly on memory or some shared state (eg interpreter). They’re implemented in non-optimized, straight-forward assembler. I advise even minimizing which instruction types you use so reviewers can learn just a few.
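For a sense of scale, a toy version of such a seed interpreter might look like the sketch below (C++ purely for readability, since the proposal above is hand-written assembler, and with an invented opcode set):

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Invented opcode set -- just enough to show the shape of the seed.
enum Op : uint8_t { PUSH, ADD, LOAD, STORE, PRINT, HALT };

void run(const std::vector<uint8_t>& code) {
    std::vector<int64_t> stack;
    int64_t mem[256] = {0};                 // flat memory the ops touch directly
    size_t pc = 0;
    while (pc < code.size()) {
        switch (code[pc++]) {
        case PUSH:  stack.push_back(code[pc++]); break;
        case ADD:   { int64_t b = stack.back(); stack.pop_back();
                      stack.back() += b; } break;
        case LOAD:  stack.push_back(mem[code[pc++]]); break;
        case STORE: mem[code[pc++]] = stack.back(); stack.pop_back(); break;
        case PRINT: std::printf("%lld\n", (long long)stack.back()); break;
        case HALT:  return;
        }
    }
}

int main() {
    // 2 + 3, stored in slot 0, then loaded and printed.
    run({PUSH, 2, PUSH, 3, ADD, STORE, 0, LOAD, 0, PRINT, HALT});
}
```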
How to pick those things. I noticed that people often described C a certain way when explaining what goes on “under the hood.” They might say we’d put the function arguments on a stack, load the address of the pointer from memory, malloc, do a syscall, etc. I imagine Rust’s coders do something similar. The primitive operations should either be those very things or easy to compose into those things. Directly execute it or compose it with function calls. The concept is that you can either build up your interpreter as your language gets more expressive or you can more easily convert an existing compiler to simpler source for your interpreter. They match up.
You can drop a lot in the compiler, too. Your untrusted compiler can do the type-checking, error handling, standard library, 100+ optimizations, etc. Your source will be in an acceptable state before you mess with bootstrapping. Then, forget all those complex features in the seed compiler. You’re just reading and directly executing the statements in the compiler with no checks, optimizations, etc. That’s simpler. You build it from the start to support this or just trim it into this form in a fork for bootstrapping. This lowers the bar on what your seed tool must support.
You have multiple routes in this last step. It depends on how many links in the chain you want with how much rewriting of basically the same code. Some people keep adding language features, rewriting the compiler at that level of abstraction, and running old version through it. Bigger and bigger. Others like Wirth just translate the high-level form of compiler plus standard library to the simpler form one time to produce a full-featured compiler. Then, they keep using the current version of the compiler which produces code for the interpreter.
So, that was my concept. Grow the machine just a bit toward your language. Build it in your platform’s assembler. Then, swap the heavy language for just lots of function calls in its simpler language. Keep those with your high-level source. Realistically, it could be just another back-end of your compiler, like many projects do when compiling to C. Alternatively, the transpiler re-uses existing source of the compiler for a separate pass. That pass ensures compilation is traceable (diffable), more human-readable, and aimed at an easier target.
It effectively gets the same result as cross-compiling, but now you’ve got one binary instead of N binaries (one per platform), and that one binary can be checked into git. And since there’s this blob in the middle, you can optimize the WASM compiler separately, or let users bring their own WASM+WASI environment to run it.
It seems like trusting-trust issues weren’t the biggest reason for Zig to switch to this system - if I understand correctly, this was the fastest way to get off the old C++-based Zig compiler + Zig-based Zig compiler combo. It also has the added benefit of not requiring a C++ compiler (one less thing you need to trust). I think you currently need blind faith that updates to this WASM blob aren’t malicious. Though, since it’s WASM, you could attempt to sandbox it appropriately.
You have the trusting-trust issue if your bootstrap is C too, no? And C actually did have a trusting-trust attack from Thompson, whereas there are no known WASM trusting-trust attacks. :-) In either case, to be properly paranoid you want multiple implementations with independent provenance.
For those interested in it, Synflow was the last tool I looked at in this space. I’m not sure if they’re still active or not.
Washington Post has a huge list of them here. Some of these could definitely sell.
An infinite loop detector could be useful. When refactoring in a hurry, I’d sometimes forget to put an increment in a while loop. The effects were consistent: a stall in the flow of the program followed by the CPU fan turning on. If that happened, I’d look at the while loops first. I had to learn to just not rush through control flow…
Seems like those infinite loops could be easy to spot with a code analyzer. It might help new programmers, who often make those errors.
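For illustration, here’s a made-up example of the shape of the bug I mean; the loop condition depends on a variable that nothing in the body would change if the marked line were deleted, which is exactly the pattern a simple analyzer could flag:

```rust
fn main() {
    let items = [10, 20, 30];
    let mut i = 0;
    let mut total = 0;
    while i < items.len() {
        total += items[i];
        i += 1; // forget this line and the loop never terminates
    }
    println!("{total}");
}
```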
@projectgus, this was the BIOS thing I was talking about. calvin found it. (Thanks, calvin!)
On the first day of this weekend, all kinds of software and services will be deployed that change the world in unbelievable ways.
On the second day of this weekend, many creators will confess something about what they did the first day.
That’s an interesting article. You did well showing how many languages try to patch in the effects of a good module system. Those patch-ups do extra damage. I was looking at functional programming via the preprocessor in C just last month. I think the scariest thing about such hacks is that separate systems end up composing without really understanding what each other is doing. There’s no consistency protection across those separate formalisms or applications. If anything, we’d need a real module theory with formal specs for each thing we use to fake the modules. We don’t have that either. So, it might always be a mess to deal with.
I especially loved how you illustrated that many interfaces might have satisfied an implementation. I’ve been thinking about software rewriting again (e.g., language-to-language translation). Others are using machine learning to improve software. Tools aiming for one-to-one correspondence between an original and an improved version might produce different, unintended pieces of software. While they’d treat that as a bug, it might be inherent to how they try to guess interfaces (or other abstract info) from the implementations people feed them. Issues like you mentioned might mean there are upper limits to what we can do.
One thing missing, but almost there in the “Not Just for…” section, is that you can publish a clean prototype module as documentation along with code in a different language. This is basically the old concept of prototyping in a high-level language before implementing in a low-level one. Instead, we’re applying it as a specification technique for interfaces, which might be easier to learn than full, formal specification. It would help if the readers of the messy language knew the cleaner language. If they didn’t, it could be an excuse to expose them to unfamiliar ideas that broaden their minds.
On your Trinity reference, I think verses 19-20 here apply. God said He designed all of creation to testify to His presence. Thinking of the articles here, one way is that what should be a random mess of a universe instead follows specific laws that follow mathematical/logical rules like you shared. Many of the same maths show up in seemingly-unrelated things, like how an artist leaves repeating signatures on their work. While they might have changed often, the machinery consistently works with no deviations from its specs. God said in v35-36 here that the fixed laws, which secure our existence physically, were a reminder that He’d keep a greater promise to hold us secure forever. If we repent of our lives’ many sins and put our faith in Jesus Christ (God) instead.
I put a short explanation with proof of that up here. I hope you receive His free gift, old friend. If you do, you get to meet the Creator of those abstractions, know Him personally, and explore the depths of His designs over eternity.
The “computational trinity” is an older name for what we now call the computational trilogy, a grand connection between logic/syntax/type theory, semantics/category theory, and computation. Harper is quoted on that page, too:
Computational trinitarianism entails that any concept arising in one aspect should have meaning from the perspective of the other two. If you arrive at an insight that has importance for logic, languages, and categories, then you may feel sure that you have elucidated an essential concept of computation–you have made an enduring scientific discovery.
So, when we consider syntactic modules as presented in the blog post, we should also consider the category-theoretic meaning of modules.
The “Holy Trinity” is an older name for God as one being of three persons. People have a history of taking his name in vain or re-applying it in different ways. His names, like His creation, are meant to give Him glory. I just point it back to Him when I see it.
That said, I thank you for sharing the link. It’s very interesting. On top of mathematical education, there’s even another theological concept in it which is potentially useful:
“The central dogma of computational trinitarianism holds that Logic, Languages, and Categories are but three manifestations of one divine notion of computation. There is no preferred route to enlightenment: each aspect provides insights that comprise the experience of computation in our lives.”
This is great wording for a concept that’s hard to explain. One indeed similar to the Trinity. We have God existing as His own essence with all His properties (or nature). Then, each person has their own attributes, sets an example for us, and takes seemingly-independent action, yet they are highly interdependent. After describing specifics, one might say something similar to the above quote: “each aspect provides insights that represent the experience of God moving in our lives.”
This might be another example where God creates things on earth to be shadows of things in heaven to help us understand them and marvel at Him.
Thanks for sharing a second off-topic comment! I appreciate it. In return, here is a quick proof that Jehovah and other omnipotent beings don’t exist. First, consider potentially-infinite matrices of Booleans. If any row of such a matrix is all true, then there cannot be any column which is all false, and vice versa; proof left to the reader.
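Spelling the lemma out (notation mine):

```latex
% If some row i of a Boolean matrix M is all true, then no column can be all
% false, because every column meets row i in a true entry. The dual direction
% is the same argument with true and false swapped.
\[
\exists i.\; \forall j.\; M_{ij} = \top
\quad\Longrightarrow\quad
\forall j.\; \exists i.\; M_{ij} = \top
\]
```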
Create an open tournament. By tradition, it will be a rock-throwing tournament. Entrants can register rocks which will be thrown, and also champions who will throw rocks. A judgemental person decides whether each champion/rock pair is a successful throw, and tallies the result in a matrix of Booleans. Then, by the above lemma, there cannot be an entrant who simultaneously registers an unthrowable rock and also a champion who can throw any rock whatsoever. Dually, there also cannot be an entrant who simultaneously registers an always-throwable rock and also a champion who is unable to throw rocks. Proof: what does the judge tally after each attempt? This is just a rigorous version of the paradox of the stone, with fluff removed.
In general, the logical relations within our observable reality are sufficient to exclude the possibility of omnipotence. Similar basic statements can be used to exclude trivialism, solipsism, and dialetheism/paraconsistency from our reasoning, too.
The article is the topic. Many authors like to drag our faith into unrelated writing. So, my response addressed multiple angles. Plus, it was just out of compassion for my old buddy who submitted it. I’d like to see the man be saved by the very God he referenced. :) Back to the tangent.
Re: the proof. Although interesting, your proof ditches the scientific method for conjecture based on imagined worlds. Science says we must observe our universe’s attributes and how things work in it, build hypotheses based on those observations, test them with controlled experiments, and have skeptical parties independently replicate that. The evidentiary method used in historical situations at least requires corroborated evidence from honest observers. Divine revelation with prophecy and miracles works similarly to the evidentiary method. So, let’s also start with observation of the real world.
We see a universe whose parameters are very fine-tuned. Even non-believers argue this point. They tend to dodge the implication. I wrote that we never see such well-maintained order emerge from chaos. If working from observations, we’re forced to default to the conclusion that God made the universe.
We struggle to make physics work precisely in tiny areas (example). Our AI’s, whether NN or GA, require lots of intelligent design to even operate, much less work correctly. People struggle to develop, optimize, and verify logics with consistency. We marvel at mathematicians and engineers who pull off these feats. How much more should we marvel at God who drives all of that to work with no failures?! We say those people’s output was clearly designed as they fought an uphill battle against chaos. Then, we treat the universe like we could make it by shaking up a shoebox long enough. Logically inconsistent. Why do we do this?
“Boast no more so very proudly, Do not let arrogance come out of your mouth; For the Lord is a God of knowledge, And with Him actions are weighed.” (1 Sam. 2:3)
“The LORD Almighty has a day in store for all the proud and lofty, for all that is exalted (and they will be humbled).” (Isa. 2:12)
I was the most guilty of this. Although a good tool, science has failed countless times with so many rewrites of our books. Empirical observation should lead scientists to be the most humble of all people with carefully-worded statements they know might be disproven the next day. Instead, they tell believers what’s definitely true, what’s definitely not possible, and often believe theories with neither observation nor skeptical replication (aka faith). Many mock people who believe in God with far greater evidence than exists for some theories they have much confidence in.
God’s Word said people like me were arrogant people who wanted to be god, had no interest in submitting to a real one, and ignored all the evidence He existed. When God judged how we lived our lives, we’d still be trying to argue with Him about why our moral theories and lifestyles made more sense. The proof was all around us but we already saw god in the mirror. We’d speak so certainly about our beliefs due to our wicked, arrogant hearts. I had to repent.
It gets better still! If you humble yourself and seek God, then He gives you a shortcut around all of those proof requirements. God’s Word says the Father draws us in, “my sheep hear my voice,” and faith comes by hearing the Word. After digging into His Word, God will supernaturally do something that makes it click at some point. Then, a testable prediction is that we’re transformed from inside out by His power. We may also experience answered prayers and miracles. That one message, without modification, has gotten those same results in around 2,000 people groups for thousands of years. Its power is proven.
Pray to God that He reveals Himself and saves you and start in John. You’ll meet Him. Then, what happens to His followers in His Word will happen to you as well.
Thanks for a third off-topic comment. Let’s see if you’re really a person of science.
Rock-throwing tournaments are completely scientific. Nothing I described is impossible to build in a laboratory, and because Jehovah doesn’t exist, of course they can’t show up and participate in a contradictory fashion. You’re going to have to reply with logic and maths if you don’t like the rock-throwing tournament!
Quantum mechanics is incompatible with omniscience, all-knowing, all-seeing, or any other similar superpower. The Kochen-Specker lemma shows that there are properties of particles which are not locally determined, and thus cannot be observed without measurement. A Stern-Gerlach device allows us to make such a measurement. The same device shows that particles midway through the device, accelerated but not yet measured, are indeterminate – if some deity were measuring them, then the particles would be deflected measurably.
To be blunt: You’re using a device laden with semiconductors, and the theories which were used to design that device have no room for omniscient beings. Also, the device was programmed using theories of logic which have no room for omnipotent beings and can only implement the realizable theorems.
The “Rowhammer” paper came out the same year. I wonder if there are systems where isolating processes’ memory to individual banks has security benefits.
Although I couldn’t find it, there was one system way back that did either a VM or dual-boot setup by putting each system in a different RAM stick. They did that to mitigate covert storage channels. I think, though I’m really fuzzy on it, that they mitigated timing channels by suspending one and switching to the other, and vice versa, with only one allowed to run at a time. The alternative was multiple PC’s with a KVM switch. The authors thought the tradeoff of one stick of RAM and suspend/resume times might be worth it.
My recommendations way back were: PCI cards with secure OS’s connected via a high-speed backplane (or NUMA with IOMMU); SMP with per-socket and RAM isolation; per-core isolation with shared cache (L3) off. Each of these turned some hard security problems into “meeting app requirements will cost a few grand more per system unit with some extra configuration” kinds of problems.
At the risk of venturing too far off-topic, or getting too deep into zero-sum thinking, I wonder if the rise of CHERI means that we can and should abort efforts to rewrite stuff in memory-safe languages like Rust. If we take for granted that we’re going to leave behind folks stuck on old hardware anyway, and fixing memory-safety through something like CHERI just requires another hardware upgrade cycle, then maybe that gives us freedom, or even an obligation (particularly for those of us that actually like Rust), to go back to the old workhorses of C++ and even C.
I think CHERI shifts the narrative for memory-safe languages somewhat. In terms of memory safety, they are no longer about improving confidentiality and integrity, but they are still a win for availability. In CHERIoT, we can reuse the FreeRTOS network stack almost unmodified, but that doesn’t prevent it from having memory-safety bugs. If an attacker can find and exploit them, they can crash the network stack. We can limit the fallout from that, but it still crashes. In contrast, if someone were to rewrite the network stack in Rust, they could eliminate those bugs at design time.
The flip side of that is that rewriting the network stack in Rust is a lot more effort than recompiling it with CHERI. Of the code that we’ve tried building for CHERIoT, we have very rarely needed to modify the code for CHERI, only to add compartmentalisation. In the worst cases, we’ve needed to modify around 1% of the lines of code, so the cost of rewriting in Rust is at least 100x higher.
For new code, it’s almost certainly a better choice to use a safer language. For existing code, compiling it for CHERI and putting it in a sandbox is cheaper than rewriting it (assuming CHERI hardware) and may give you sufficient availability guarantees.
The important thing to remember about rewriting existing software is the opportunity cost. Any time spent rewriting something in Rust is time not spent writing software in Rust that solves new problems.
“The flip side of that is that rewriting the network stack in Rust is a lot more effort than recompiling it with CHERI.” “assuming CHERI hardware”
Which leads to the real problem. The market has tanked or forced a detour on [almost?] every company that bet on secure hardware for about sixty years. B5000 had some memory safety in 1961. System/38 had capability security. The market chose raw price/performance, integration, and backward compatibility over those techniques. Even today, buyers would be more likely to avoid really leveraging MCP or IBM i for security. In embedded, people avoided secure processors (esp. Java CPU’s) for similar reasons. Even mid-sized to high-profit companies mostly avoided RTOS’s like INTEGRITY-178B and hypervisors like LynxSecure, even though they’d have amounted to a fraction of a percent of their costs.
Intel had it worse. Schell said a Burroughs guy at Intel added segments to x86 for him, which secure OS’s used (e.g., GEMSOS). The market was mostly annoyed by that and didn’t use it. Then Intel lost billions on i432, i960, and Itanium. I liked i960MX, Secure64 built on Itanium, and companies using both had to port away later. Azul’s Vegas were the safest bet since you could hide them underneath enterprise Java, but they got ditched IIRC. It looks like market suicide to make or target a secure, clean-slate architecture. Whereas memory-safe languages and software-level mitigations have been getting stronger in the market over time.
I really like your work on CHERI because it stays close to existing architectures vs dumping compatibility. Others took too many design risks. Yet, we’ve seen the RISC market abandon good RISC’s, the server market abandon good server CPU’s, and Java abandon Java CPU’s. The reasons appear to be:
The market-leading CPU’s had the highest performance-per-dollar-per-watt-per-etc. Even if prototypes met specs, the same design might not meet spec requirements in full-custom, high-speed CPU’s.
Compatibility with x86. That still dominates much of the desktop and server market. There are a lot of reasons to think it would be harder to get right than ARM or RISC-V. The ARM market for non-embedded is huge now, though.
For enterprises, integration into server boards they can plug into their kind of infrastructure. These things need to be ready to go with all the extras they like having on them. For consumers, it’s whatever goes into their phones and stuff.
Integration with common, management software that lets them deploy and administer it as easily as their Windows and Linux machines. Now maybe on clouds, too. For consumers, integration with their favorite apps.
Whoever sells secure hardware in major markets must make a tremendous investment in hardware/software development. There are lots of ASIC blocks, prototype boards, ports, and so on. After all that is spent, they still have to sell it at or near the cost of everything else to be competitive.
That said, embedded and accelerator cards are the best bets. They might spur on the risk-averse markets, too. Your project is in one category but still with some of the risks. Accelerator cards would be things like ML boards, baseband boards, etc. I was hoping Cavium would slap secure CPU’s on their network processors, but they went to ARM. Same pattern. I’m glad ARM has a CHERI prototype.
Meanwhile, I encourage people to do R&D in all directions, including rewrites and transpilers, in case secure hardware never arrives, is hard to buy, or the company goes bankrupt. (Again, again, and again.)
I agree on most of your points. The ‘massive investment’ bit is why the Digital Security by Design project is investing £170m in ecosystem bring-up, including having Arm fab a server-class test chip and development system. This can now run FreeBSD, Wayland and KDE in a memory-safe build (Chromium is not there yet but is closer than I expected), but it doesn’t give up on backwards compatibility: it can still run unmodified AArch64 binaries. Linux support is getting there, though Linux’s lack of the same abstractions made the bring-up cost higher than for FreeBSD.
In the embedded space, people are a lot more willing to recompile their code for each SoC. We’ve done a lot of work to make sure that, in the common case, that’s all that they have to do.
I would love to hear your thoughts on how to craft regulations that would mandate some of the security guarantees that CHERI provides.
I’m on the advisory board for the DESCRIBE project, which is looking at exactly that. It’s mostly comprised of social scientists, who are actually qualified to answer that question, I’m just there to help them understand what the technology can (and, more importantly, can’t) do.
On the government side, a number of people involved remember mandating Ada and what a colossal mistake that was. Regulation requiring the technology is unlikely, but the two biggest things that I think could make a difference are:
I’m on the advisory board for the DESCRIBE project, which is looking at exactly that. It’s mostly comprised of social scientists, who are actually qualified to answer that question, I’m just there to help them understand what the technology can (and, more importantly, can’t) do.
I’ve done think-tank work in the past and I was thinking about doing some policy memos framing security as a human rights issue. The problem is finding a host institution; I could probably get an internship at Galois if there was some funding available….
On the government side, a number of people involved remember mandating Ada and what a colossal mistake that was. Regulation requiring the technology is unlikely, but the two biggest things that I think could make a difference…
Increasing liability (structured around bug bounties) and altering government procurement regulations to accept a ~20% cost increase for various security requirements are both reasonable. However, verifiable security claims are something that I think could be reasonably mandated, such as having formal proofs of correctness for various components.
That’s a really good start. I think the JIT’s were the wall that some others hit after getting that far. That at least narrows the problem down. That it’s usually a few critical runtimes might save you trouble. A reusable one per ISA might help for future apps. I’m sure y’all are already all over that.
On the marketing end, one thing you might consider is hiding the CHERI hardware in non-security offerings. Customers already want to buy certain products for specific capabilities; maybe they don’t care about security, or only a tiny percentage will. Put in a switch that lets them turn it on or off with something not easily bypassable, like an antifuse or a jumper. It’s a normal product (e.g., FreeRTOS) unless they want the protections on. The CHERI vendor can also offer them a low-cost license for the core to create demand. So, the sales of the non-security product pay for the secure hardware which, configured differently, gets used for its true purpose.
(Note: You still try to market secure-by-default products in parallel. This would just be an option to pitch suppliers, esp hot startups, with a free or cheap OEM license to get them to build what might have less demand.)
Similarly, my older concept was trying to put it into SaaS platforms. The customer buys the app mainly with configs or programs in high-level languages. If it performs well, they neither know nor care what it actually runs on. Companies can gradually mix in high-security chips with the load-balancer spreading it out. If the chips start failing or aren’t competitive, move load back to mature systems. You still need people who want to buy it with some risk there. It’s not a business-ending risk this way, though.
I think the JIT’s were the wall that some others hit after getting that far.
There’s work ongoing on OpenJDK and v8. There’s a long tail after that, but those are probably the key ones.
On the marketing end, one thing you might consider is hiding the CHERI hardware in non-security offerings.
Specifically in the CHERIoT case, one of the advantages is that you can use a shared heap between different phases of computation, provided by different vendors, with non-interference guarantees. Oh, and we’re giving the core away: any SoC maker can integrate it at no cost.
Even better. “Giving the core away” really jumps out. That demands a few follow-up questions:
You said CHERIoT. Is it just that core pre-made? Or can they integrate CHERI architecture, your build or their independent builds, into any commercial core with no licensing cost?
Is that in writing somewhere so that suppliers know they can’t get patent-trolled by Microsoft or ARM? Are they doing the open-patent thing or no patents on any of it? ISA’s etc. are a patent minefield. Companies will want assurance.
The CHERIoT Ibex core is Apache licensed. You can do basically anything you want with it. The Apache license includes a patent grant, but we also have an agreement with all of the other companies involved in CHERI not to patent ‘capability-essential IP’, which basically covers anything necessary to implement a performant CHERI core. We have patented a design for a particular technique for accelerating temporal safety, but that’s not in the open source release (it may be at some point; there probably isn’t much competitive advantage in keeping it proprietary).
All of the software stack is MIT licensed, except the LLVM bits which are Apache+GPL linking exemption.
Arm doesn’t have any patents required for CHERI. They may have some on other parts of the core but, considering that it’s a similar design to 30-year-old RISC chips, I think it’s quite unlikely.
Please encourage them to reach out to me directly - we’re looking to connect to silicon partners that might want to build something based on it.
I consider this a regulatory issue: we need to be able to mandate the type of formal security guarantees that CHERI provides. I’m trying to get a position at a think tank to try working on this exact problem. But it’s not easy to get funding….
Regulation to produce higher-quality systems has been done before (pdf). Critics point out it had big issues (pdf). DO-178B proved it out again with that market still existing (i.e. DO-178C). I proposed on Lobsters just a few rules for routers that should reduce lots of DDOS attacks. I believe it can be done.
The biggest worry I have is that new regulations become just a pile of paperwork with boxes to check off. That’s costly and useless. That’s what a lot of Common Criteria certification was. On the other end, a product might be so secure that its features and development pace drag behind the rest of the market. Steve Lipner said the VAX Security Kernel was unmarketable partly because high-assurance engineering made every change take two to three quarters to pull off, and there were many features nobody knew how to secure at all.
Another issue, explored in the second paper, is that the requirements sometimes don’t make any sense. Regulators often cause that problem once they get too much control. Companies will respond deviously. Then, there have to be enough suppliers so it meets competition requirements in places like DOD. If it’s not compatible with profitable products, the lobbyists of the tech firms will probably kill it off to protect billions of their revenue. Examples of big companies’ responses to what threatened them include the DMCA, patent trolling, and Oracle’s API ruling.
Just a few things I remembered having to balance in my proposals in case they help you.
I’m aware of past attempts which all devolved into a bureaucratic exercise in paperwork and box ticking. There is a rumor (which I can’t debunk without spending WAY too much personal time grokking obscure NIST standards) that INTEGRITY didn’t actually get an EAL6 cert but claimed to have one anyway. My favorite paper on the subject is Certification and evaluation: A security economics perspective.
But to say that past attempts failed doesn’t mean it isn’t a good idea. It took decades for airbag technology to advance to the point that it was reliable enough to actually improve safety. They have been mandatory for 20 years now.
Formal verification becoming practical for small code bases would make it feasible to mandate correctness of specific sub-components, such as specific cryptographic operations and process isolation.
We can avoid prescriptive rules by mandating large public bug bounties and an insurance policy to cover them. That gives the insurance policy provider flexibility to adjust to whatever security procedures vendors come up with. If Comcast had to pay a million dollars for every router exploit, they would certainly be more invested in high assurance software….
“Formal verification becoming practical for small code bases would make it feasible to mandate correctness of specific sub-components, such as specific cryptographic operations and process isolation.”
“If Comcast had to pay a million dollars for every router exploit, they would certainly be more invested in high assurance software….”
I really like both of those concepts. Totally agree.
That paper was really good, too. Thanks for it.
If you know anyone in the policy space that would like to chat (yourself included), let me know. I have think-tank experience and have developed both some interesting infosec policy ideas and how to frame them effectively.
CHERI also makes both garbage collection and formal verification of certain properties easier.
Rust developers regularly give up on fighting the borrow checker and switch to some form of reference counting (which is less efficient than a garbage collector). Making garbage collection less of a performance barrier opens up a market for programming languages that target infrastructure code. That’s not to say it’s not possible to get rid of (A)RC within Rust, just that a language built for CHERI would offer a more efficient use of a given engineering budget.
Rust doesn’t provide any formal guarantees, even in the intermediate representations. CHERI’s formally verified architecture makes it easier to prove some security properties across programming languages and down to the binary level.
Rust means much more than memory safety. For instance (my favourite), it allows for type-level state machines, which let HAL implementors write more robust APIs that, e.g., make logical errors unrepresentable. As a result, as long as user code compiles, it has a higher probability of simply being correct, which means fewer classes of errors at runtime.
So, if one wants to use a specific peripheral in a microcontroller that requires a GPIO in a proper mode, we can represent this GPIO pin with its mode as a specific type and demand ownership over an instance of it during construction of the peripheral abstraction. Changing the mode of a GPIO should consume the previous instance and yield a new one with a new type. And so on, and so on. These are just simple examples, but the possibilities are endless. Maybe Ada could compete with Rust in this area, but C and C++ are just a no-go zone in comparison.
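A minimal sketch of that pattern, with made-up type names rather than anything from a real HAL crate:

```rust
use core::marker::PhantomData;

struct Input;
struct Output;

struct GpioPin<MODE> {
    _mode: PhantomData<MODE>,
}

impl GpioPin<Input> {
    // Changing the mode consumes the old pin and yields a new type,
    // so a stale handle in the wrong mode can never be used again.
    fn into_output(self) -> GpioPin<Output> {
        GpioPin { _mode: PhantomData }
    }
}

// A peripheral that needs an output pin takes ownership of it, so
// "wrong mode" and "pin also used elsewhere" become compile-time errors.
struct Spi {
    _chip_select: GpioPin<Output>,
}

impl Spi {
    fn new(chip_select: GpioPin<Output>) -> Self {
        Spi { _chip_select: chip_select }
    }
}

fn main() {
    let pin = GpioPin::<Input> { _mode: PhantomData };
    let cs = pin.into_output(); // `pin` is moved and gone
    let _spi = Spi::new(cs);    // only compiles with an Output pin
}
```

Once `into_output` consumes the pin, the old handle no longer exists and the constructor only accepts the right mode; both mistakes fail to compile instead of failing at runtime.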
I don’t understand the last sentence in your comment. You most definitely can write C++ code that expresses this kind of state machine and it’s one of the reasons that I prefer C++ to C for embedded development.
Move semantics are quite weak in C++ (although still much, much better than C). It’s not that difficult to make a mistake and use an “empty object” that was moved from, whereas in Rust such an object cannot be used at all.
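A tiny made-up sketch of the difference: in Rust, a moved-from value cannot even be named again, so the “empty object after move” mistake is a compile error rather than a latent bug.

```rust
struct Config {
    name: String,
}

fn consume(c: Config) {
    println!("configuring {}", c.name);
}

fn main() {
    let c = Config { name: String::from("uart0") };
    consume(c);    // ownership of `c` moves into the call
    // consume(c); // error[E0382]: use of moved value: `c`
}
```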
Here is a comment from 6 days ago that states essentially the opposite. Who’s one believe..? Life is so confusing… Brings to mind this quote by Bertrand Russell:
The fundamental cause of the trouble is that in the modern world the stupid are cocksure while the intelligent are full of doubt.
I’m not sure it’s exactly the same scenario – it’s concerned with reading from peripherals with a different type model than the host (the parent comment that I was replying to also has a similar example), which the compiler can “misunderstand” and optimize away.
Write access obviously doesn’t incur the same kind of pitfalls, although it can be bumpy, too. For instance, on platforms that only allow accessing GPIO ports, as opposed to individual pins, the GPIO API needs to expose pins but trade ownership of ports for pins. That can be surprisingly difficult to represent, especially if you need to deviate from the straightforward case of exclusive ownership that’s acquired and relinquished synchronously, and only relinquished after an operation is completed.
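For the straightforward exclusive case, the usual trick is to consume the port and hand back individually-owned pins; here’s a made-up sketch of just that easy case (the hard part, relinquishing ownership asynchronously, isn’t shown):

```rust
// Splitting consumes the port, so nothing else can hand out the same pins again.
struct GpioPort;
struct Pin0;
struct Pin1;

impl GpioPort {
    fn split(self) -> (Pin0, Pin1) {
        (Pin0, Pin1)
    }
}

fn main() {
    let port = GpioPort;
    let (_pin0, _pin1) = port.split();
    // port.split(); // error: `port` was moved by the first split
}
```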
Edit: FWIW, I wouldn’t be surprised if “accidental” exclusive ownership of dependent resources were a remarkably frequent source of priority inversion bugs. However, this isn’t really a language’s fault. It’s very easy to bake it into a runtime model, too.
(Even later edit: that being said, none of this is really Rust’s fault per se, some things just can’t be adequately modeled at compile time. It’s a bad idea to take any compiler on its word, Rust or otherwise.)
It does not seem to state anything opposite to what I said. Modelling HW in software is hard, period. When dealing with unsafe code while writing a HAL library, one has to be extra cautious to avoid violating the platform’s invariants. The enums example is very good, as they are tricky to work with directly on an FFI boundary.
However, I don’t really see in what way Rust is worse than C or C++ in that regard. I don’t consider either C or especially C++ to be meaningfully more inspectable. If anything, Rust is still a massive improvement in all these areas: robust abstraction building, tooling and a coherent build system, ease of code reuse between the firmware and the host, and so on. With RTIC you can write safe, interrupt-driven, deadlock-free applications with resource sharing and priorities that do minimal locking using the stack resource policy, with barely any overhead. It has a research paper behind it if anyone is interested. It does not compile if you violate its invariants, which is unheard of by C/C++ standards. cargo expand can be used if a proc-macro is not trustworthy enough, and its output is possible to read just fine.
I’m doubtful where doubt is due and confident where confidence is due. Embedded software development was for decades abysmally bad: poor Windows-centric toolchains, reskins of Eclipse IDEs by different vendors (mcuxpressoide), code generation with placeholders for “where user code should go” (STM Cube), build systems based on XML and magic wizard GUIs - the list could probably go on for a very long time. It feels insanely dishonest to write off Rust considering how empowering it is regardless of its shortcomings, although some tools admittedly are not there yet.
I don’t think there was a need to call me stupid by quoting a famous person. I just think we have very low standards in this specific branch of the IT industry when it comes to the technology we use. I look forward to another lang/tech stack that will attempt to fix everything that Rust got wrong. Again, I think we could probably learn quite a lot from Ada, for example.
However, I don’t really see in what way Rust is worse than C or C++ in that regard.
The C and C++ type systems guarantee less. That’s normally a bad thing, but it conversely means that there’s less for the compiler to take advantage of. Low-level I/O generally steps outside of the language’s abstract machine. That opens more opportunities for accidentally invoking undefined behaviour in Rust than in C, because there are more things that the type system doesn’t allow but that hardware or software in other languages can do. Defensively programming around this is hard to get right for the same reason that signed integer overflow checking is hard in C: you have to make sure that your checks all happen before the thing that would be undefined behaviour is reachable. Rust relies quite heavily on being able to prune impossible things during optimisation for performance. Hardware interfaces, particularly in the presence of glitch attacks, can easily enter the space of things that should be impossible.
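A small made-up example of the kind of defensive step this implies: read the register as a raw integer and validate it, instead of letting the type system assume values the hardware might not honour. (On real hardware `reg` would point at a memory-mapped register; converting an unchecked byte into this enum, e.g. via transmute, would be undefined behaviour for any value other than 0 or 1.)

```rust
#[derive(Debug)]
#[repr(u8)]
enum Mode {
    Idle = 0,
    Busy = 1,
}

// `reg` would point at a memory-mapped register on real hardware.
fn read_mode(reg: *const u8) -> Option<Mode> {
    // read_volatile is the usual way to touch memory-mapped I/O.
    let raw = unsafe { core::ptr::read_volatile(reg) };
    match raw {
        0 => Some(Mode::Idle),
        1 => Some(Mode::Busy),
        _ => None, // a value the type system would otherwise call impossible
    }
}

fn main() {
    // Stand-in for a glitched or unexpected register value, so this runs on a host.
    let glitched: u8 = 7;
    println!("{:?}", read_mode(&glitched as *const u8)); // prints None
}
```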
Memory safety enforced at compile time is strictly better than doing it by crashing at runtime in case of violations. So, no, this doesn’t make Rust any less attractive, even as it makes C/C++ behave better.
I think it’s important to point out that hardware w/o capability enforcement is probably going to continue to exist for a very long time, so if you’re going to write new code, you may as well do it in a language that protects you in both cases.
CHERI won’t stop you overflowing a small int, won’t make sure you free that malloc, and won’t keep you from writing to a closed file handle, and for a lot of attacks a runtime crash is as good as an overrun.
for a lot of attacks a runtime crash is as good as an overrun.
I don’t think I’d agree with that. The goal of most widely deployed mitigations is to take a bug that can attack confidentiality or integrity and turn it into an attack on availability. In general, security ranks the three properties, in descending importance, as integrity, confidentiality, and availability:
A breach of integrity lets the attacker corrupt state and can usually be used to attack the other two. A breach of confidentiality may leak secrets that the rest of your security depends on (see: Heartbleed), or commercially or personally sensitive information, with potentially very expensive implications for a long time, including the reputational cost of reduced trust. A breach of availability may cost some money in the short term, but that generally matters only in safety-critical systems.
It’s also much easier to build in resistance to availability issues higher up in a system. If you are running a datacenter service and an attacker can compromise it to do their own work or leak customer data, that’s likely to cost millions of dollars. If they can crash a node, you restart it and block the attacking IP at the edge. You also record telemetry and fix the bug that they’re exploiting. As soon as the fix is deployed, the attack has no lasting effect. If they were able to install root kits or leak your customers’ data, the impact of the attack may extend for years after the breach.
Oh, and on the memory leak side: we can’t prevent leaks, but we can 100% accurately identify pointers in memory, so we can provide sanitisers that check for them with periodic scans.
At best one might be able to claim that lightweight processes and supervisors are the key mechanisms at play, but I think it would be more honest to recognise the structure that behaviours provide and how that ultimately leads to reliable software.
I think there’s a little bit more to it - specifically a runtime for managing links and other special features. But otherwise I think this really is a case of “Erlang’s foundational primitives let you take something fairly simple like an interface and leverage it in a way more powerful way”. The fundamental idea is really that isolated processes and message passing are the primitives you need for reliable systems. When you layer on top of that things that would otherwise just be “good patterns” in other languages, you get a lot of stuff “for free” in Erlang. Plus its syntax (especially the multiple handlers + destructuring in function overloads) makes it really easy to express states across behaviours.
So basically I maybe disagree a bit with the title but I thought this was a very nice writeup.
I would suggest that you check out the P programming language, I think you’d enjoy it.
So basically I maybe disagree a bit with the title but I thought this was a very nice writeup.
I chose the title partly because I don’t think you actually need lightweight processes to implement behaviours (I sketch an alternative towards the end of the post).
Another reason is that I feel a lot of people associate Erlang with lightweight processes and message passing already, and in order to “make room” for the new association of Erlang/OTP and behaviours it perhaps makes sense to weaken the existing association first.
Finally, I wanted to create some tension to lure the reader into reading the introduction (I’m not sure if this is a good tactic or if it worked).
I also considered something along the lines of: “explicit processes and message passing is the goto of concurrent programming” (as opposed to the structured approach that behaviours encourage).
I would suggest that you check out the P programming language, I think you’d enjoy it.
Yeah, P’s cool.
I enjoyed your article. I also liked seeing Jim Gray in your links at the bottom. To build on some things, here are three gifts that you might enjoy (all PDF’s):
Fault-Tolerance in Computer Systems which thoroughly describes the architecture Gray et al built.
An Architectural Overview of QNX, which describes a microkernel with lightweight processes, message passing, and high reliability. It’s been powering all kinds of things for decades. There are probably lessons to learn from it.
Microreboots applied the restart concept a bit further in a Java stack. The paper does a good job of covering the many failures and components one might want to consider.
🇬🇧 The UK geoblock is lifted, hopefully permanently.
Nick, we spent lots of time talking here and in private messages over the years, and I’m pretty used to the way you talk. This message does not read like any conversation we’ve ever had, and it deeply troubles me to see how drastically your thoughtful conversations with me have changed. Someday I hope you go back through and read those to compare.
Honestly, as someone who used to consider you an “internet friend” I hope you get banned. This sort of rhetoric is eliminationist by nature and is not just a statement of differences of opinion. This is what makes people feel less safe.
Serious q: do we tolerate this on lobsters? Anyone going to do something (ban)?
I really don’t care for the way this nice Unicode hack for Pride has been denigrated by someone who comes back, repeatedly, to spread hate and lies about people like me. It makes me not want to be here.
Edit so I’m relatively new here; how am I supposed to respond to hate speech written directly to me? I’m trying very hard to be polite but it’s just so exhausting having even the most innocuous LGBT content be attacked by bigotry.
It’s considered very abusive and not allowed here. I banned the user and the comments are all removed.
Ban after one comment? Isn’t this a bit harsh?
I read the multiple comments expressing religiously motivated bigotry before they were removed. The ban was entirely justified.
Lobsters has a public moderation log if you want to review.
Peter (@pushcx) will probably prune this comment thread any moment now (it does not relate to computing), but banning someone just for having a different viewpoint than your own seems a bit harsh.
It’s not clear cut, of course. But I do not think it would be disproportionate to at least give a time limited ban with a warning. Because it is not just a different viewpoint; Nick attacked a marginalized group just because they were mentioned. When moderating a community you sort of have to choose between the attackers and those being attacked. You either force out the former or they will alienate the latter.
Leaving a hatted comment to note that this is pretty much the mod philosophy at work here, yes, the paradox of tolerance.
There is no realistic pro-LGBT movement. There are anti-LGBT, and anti-anti-LGBT movements (this is not a logical negation).
Bigotry with a nice veneer does not become a “different viewpoint” simply because of that nice veneer.
Seems like exterminist bigotry is fine and dandy on this site, which is a real shame. Makes me much less inclined to hang around.
It’s not fine. I don’t think it’s fine. See also his massive amount of flags, and the deluge of people telling him to shut it. Going to -10 is practically unheard of on Lobsters.
I’ve been a long time active member (9+ years 1600+ comments), and this is by far the worst anti-LGBT garbage I’ve seen on this site. I wholeheartedly support a ban.
Railing against gay people on a tech focused site is massively off-topic, which is why you are getting your comments flagged.
You are free to express your views on these topics elsewhere. When posting here, please confine yourself to tech, where I know your input is valued.
This reads like copypasta
did not read this but I think you are confused and it is a shame to see you being negative about gays.
I think your comment is off-topic here because lobste.rs is primarily a computing-focused forum, but your commentary has no relation to computing.
But since I read the linked paper anyways, I’ll include my summary of its argumentation for other readers:
Regardless of whether one agrees or disagrees with those points, this doesn’t seem like a matter of pseudoscience at all, but rather philosophy discussing the classification of “disorder”. It doesn’t directly make claims like “individuals of population X are more likely to have property P than those of population Y”, which would be more scientific. Nor does it make a claim that treating non-distressing disordered behavior would be good.
Overall, I am concerned because your comment is citing sources that don’t seem to justify the statement you’re making.
Oops, I realize now that my comment is serving to artificially push this post higher in the rankings, along with the rest of this thread.