Please take it easy in the comments. Yes, the primary author of the language is famous for posting incendiary, attention-grabbing rants. If you respond to this link with vitriol, even if you’re explaining how you dislike parts of this project’s design, you reward trolling and diminish our community.
Remember there are always a few thousand readers for every commenter. If you’re disagreeing with someone, you’re more persuasive to readers when you make your points and let them judge than when you descend into bickering.
Do you have a link to the apology you’re talking about? I’ve steered clear of his projects for a while now because of his toxicity, but if he’s truly making an effort to change that, I may be willing to reconsider.
(Disclaimer: I don’t know Drew personally. I have found several of his public comments abrasive. I also use and like sr.ht quite a bit.)
Because I couldn’t remember when that toxic series of posts went up, or when his conciliatory (apology, or at least apology-adjacent) comments went up, I used search engines to find the words “fuck” and “asshole” on his site. Those turned up both the earlier vitriolic posts and his stated ambition to no longer make such posts. They also showed that he has largely avoided those two words in the past year, as far as the search engines are concerned. The posts where he did use them don’t direct them pointedly at people, IMO.
Sample size is small, obviously, but it looks like the stated determination to be less toxic has been accompanied by progress in that direction.
I went looking for it, and… both the apology and the angry rants which prompted him to apologize have been deleted from his blog. (Unless I’m misremembering which status update the apology originally appeared in.)
Thanks for looking! I’m assuming it was his Wayland rant, since I can’t find a link to it from his blog index, but I can confirm the page still exists. I even wrote a pretty large response to it on this site because it was in very poor taste. I’m avoiding linking to it directly because he seems to have taken it down on purpose, and I’m happy to respect his decision.
I appreciate that he’s taking steps in the right direction. I don’t think I’m quite ready to look past his behavior, but I’m still willing to reconsider if he continues in this direction. It’s a very good sign.
Less vitriol in the open source community and software in general is definitely a good thing.
I tried out this language while it was in early development, writing some of the standard library (hash::crc* and unix::tty::*) to test the language. I wrote about this experience in a somewhat haphazard way. (Note: that blog post is outdated and not all of my opinions are the same. I’ll be trying to take a second look at Hare in the coming days.)
In general, I feel like Hare just ends up being a Zig without comptime, or a Go without interfaces, generics, GC, or runtime. I really hate to say this about a project where the authors have put in such a huge amount of effort over the past year or so, but I just don’t see its niche – the lack of generics means I’d always use Zig or Rust instead of Hare or C. It really looks like Drew looked at Zig, said “too bloated”, and set out to create his own version.
Another thing I find strange: why are you choosing to not support Windows and macOS? Especially since, you know, one of C’s good points is that there’s a compiler for every platform and architecture combination on earth?
That said, this language is still in its infancy, so maybe as time goes on and the language finds more users we’ll see more use-cases for Hare.
why are you choosing to not support Windows and macOS?
DdV’s answer on HN:
We don’t want to help non-FOSS OSes.
(Paraphrasing a lot, obvs.)
My personal 2c:
Some of the nastier weirdnesses in Go are because Go supports Windows and Windows is profoundly un-xNix-like. Supporting Windows distorted Go severely.
Some of the nastier weirdnesses in Go are because Go supports Windows and Windows is profoundly un-xNix-like. Supporting Windows distorted Go severely.
I think that’s the consequence of not planning for Windows support in the first place. Rust’s standard library was built without the assumption of an underlying Unix-like system, and it provides good abstractions as a result.
Windows and Mac/iOS don’t need help from new languages; it’s rather the other way around. Getting people to try a new language is pretty hard, let alone getting them to build real software in it. If the language deliberately won’t let them target three of the most widely used operating systems, I’d say it’s shooting itself in the foot, if not in the head.
(There are other seemingly perverse decisions too. 80-character lines and 8-character indentation? Manual allocation with no cleanup beyond a general-purpose “defer” statement? I must not be the target audience for this language, is the nicest response I have.)
Just for clarity, it’s not my argument. I was just trying to précis DdV’s.
I am not sure I agree, but then again…
I am not sure that I see the need for yet another C-replacement. Weren’t Limbo, D, Go, & Rust all attempts at this?
But that aside: there are a lot of OSes out there that are profoundly un-Unix-like. Windows is actually quite close, compared to, say, Oberon or classic MacOS or Z/OS or OpenVMS or Netware or OS/2 or iTron or OpenGenera or [cont’d p94].
There is a lot of diversity out there that gets ignored if it doesn’t have millions of users.
Confining oneself to just OSes in the same immediate family seems reasonable and prudent to me.
My understanding is that the lack of generics and comptime is exactly the differentiating factor here – the project aims at simplicity, and generics/compile time evaluations are enormous cost centers in terms of complexity.
You could say that generics and macros are complex, relative to the functionality they offer.
But I would put comptime in a different category – it’s reducing complexity by providing a single, more powerful mechanism. Without something like comptime, IMO static languages lose significant productivity / power compared to a dynamic language.
You might be thinking about things from the tooling perspective, in which case both features are complex (and probably comptime even more because it’s creating impossible/undecidable problems). But in terms of the language I’d say that there is a big difference between the two.
I think a language like Hare will end up pushing that complexity out to the tooling. I guess it’s like Go where they have go generate and relatively verbose code.
I’m not being Zig-specific when I say that, by definition, comptime cannot introduce user-facing complexity. Unlike other attributes, comptime only exists during a specific phase of compiler execution; it’s not present during runtime. Like a static type declaration, comptime creates a second program executed by the compiler, and this second program does inform the first program’s runtime, but it is handled entirely by the compiler. Unlike a static type declaration, the user uses exactly the same expression language for comptime and runtime.
If we think of metaprogramming as inherent complexity, rather than incidental complexity, then an optimizing compiler already performs compile-time execution of input programs. What comptime offers is not additional complexity, but additional control over complexity which is already present.
To put all of this in a non-Zig context, languages like Python allow for arbitrary code execution during module loading, including compile-time metaprogramming. Some folks argue that this introduces complexity. But the complexity of the Python runtime is there regardless of whether modules get an extra code-execution phase; the extra phase provides expressive power for users, not new complexity.
Yeah, but I feel like this isn’t what people usually mean when they say some feature “increases complexity.”
I think they mean something like: Now I must know more to navigate this world. There will be, on average, a wider array of common usage patterns that I will have to understand. You can say that the complexity was already there anyway, but if, in practice, it was usually hidden, and now it’s not, doesn’t that matter?
then an optimizing compiler already performs compile-time execution of input programs.
As a concrete example, I don’t have to know about a new keyword or what it means when the optimizing compiler does its thing.
A case can be made that complexity under this definition is a “good thing” that improves code quality, i.e. that it “matters”:
Similar arguments can be used for undefined behavior (UB) as it changes how you navigate a language’s world. But for many programmers, it can be usually hidden by code seemingly working in practice (i.e. not hitting race conditions, not hitting unreachable paths for common input, updating compilers, etc.). I’d argue that this still matters (enough to introduce tooling like UBSan, ASan, and TSan at least).
The UB is already there, both for correct and incorrect programs. Providing tools to interact with it (i.e. __builtin_unreachable -> comptime) as well as explicit ways to do what you want correctly (i.e. __builtin_add_overflow -> comptime-specific language constructs interacted with using normal code, e.g. for vs inline for) would still be described as “increases complexity” under this model, which is unfortunate.
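To make that concrete, here’s a minimal C sketch of the two builtins mentioned (GCC/Clang extensions; the function names are mine, not from any project in this thread): the overflow is reported explicitly instead of being left as UB, and an impossible path is marked instead of being left implicit.

#include <stdbool.h>
#include <stdint.h>

bool add_i32(int32_t a, int32_t b, int32_t *out)
{
    /* Returns false on overflow; *out then holds the wrapped result. */
    return !__builtin_add_overflow(a, b, out);
}

int sign(int x)
{
    if (x < 0) return -1;
    if (x > 0) return 1;
    if (x == 0) return 0;
    __builtin_unreachable();   /* promise to the optimizer: this path cannot happen */
}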
The UB is already there, both for correct and incorrect programs.
Unless one is purposefully using a specific compiler (or set thereof) that actually defines the behaviour the standard didn’t, the program is incorrect. That it just happens to generate correct object code with this particular version of that particular compiler on those particular platforms is just dumb luck.
Thus, I’d argue that tools like MSan, ASan, and UBSan don’t introduce any complexity at all. They just reveal the complexity of UB that was already there, and they do so reliably enough that they actually relieve me of some of the mental burden I previously had to shoulder.
languages like Python allow for arbitrary code execution during module loading, including compile-time metaprogramming.
Python doesn’t allow compile-time metaprogramming for any reasonable definition of the word. Everything happens and is introspectable at runtime, which allows you to do similar things, but it’s not compile-time metaprogramming.
One way to see this is that sys.argv is always available when executing Python code.
(Python “compiles” byte code, but that’s an implementation detail unrelated to the semantics of the language.)
On the other hand, Zig and RPython are staged. There is one stage that does not have access to argv (compile time), and another one that does (runtime).
Related to the comment about RPython I linked here:
I am following the classic paper, “Out of the Tar Pit”, which in turn follows Brooks. In “Abstractive Power”, Shutt distinguishes complexity from expressiveness and abstractedness while relating all three.
We could always simply go back to computational complexity, but that doesn’t capture the usage in this thread. Edit for clarity: Computational complexity is a property of problems and algorithms, not a property of languages nor programming systems.
Good faith question: I just skimmed the first ~10 pages of “Out of the Tar Pit” again, but was unable to find the definition that you allude to, which would exclude things like the comptime keyword from the meaning of “complexity”. Can you point me to it or otherwise clarify?
Sure. I’m being explicit for posterity, but I’m not trying to be rude in my reply. First, the relevant parts of the paper; then, the relevance to comptime.
On p1, complexity is defined as the tendency of “large systems [to be] hard to understand”. Unpacking their em-dash and subjecting “large” to the heap paradox, we might imagine that complexity is the amount of information (bits) required to describe a system in full detail, with larger systems requiring more information. (I don’t really know what “understanding” is, so I’m not quite happy with “hard to understand” as a concrete definition.) Maybe we should call this “Brooks complexity”.
On p6, state is a cause of complexity. But comptime does not introduce state compared to an equivalent non-staged approach. On p8, control-flow is a cause of complexity. But comptime does not introduce new control-flow constructs. One could argue that comptime requires extra attention to order of evaluation, but again, an equivalent program would have the same order of evaluation at runtime.
On p10, “sheer code volume” is a cause of complexity, and on this point, I fully admit that I was in error; comptime is a long symbol, adding size to source code. In this particular sense, comptime adds Brooks complexity.
Finally, on a tangent to the top-level article, p12 explains that “power corrupts”:
[I]n the absence of language-enforced guarantees (…) mistakes (and abuses) will happen. This is the reason that garbage collection is good — the power of manual memory management is removed. … The bottom line is that the more powerful a language (i.e. the more that is possible within the language), the harder it is to understand systems constructed in it.
comptime and similar metaprogramming tools don’t make anything newly possible. It’s an annotation to the compiler to emit specialized code for the same computational result. As such, they arguably don’t add Brooks complexity. I think that this argument also works for inline, but not for @compileError.
My understanding is that the lack of generics and comptime is exactly the differentiating factor here – the project aims at simplicity, and generics/compile time evaluations are enormous cost centers in terms of complexity.
Yeah, I can see that. But under what conditions would I care how small, big, or icecream-covered the compiler is? Building/bootstrapping for a new platform is a one-time thing, but writing code in the language isn’t. I want the language to make it as easy as possible on me when I’m using it, and omitting features that have been around since the 1990s isn’t helping.
Depends on your values! I personally see how, eg, generics entice users to write overly complicated code which I then have to deal with as a consumer of libraries. I am not sure that not having generics solves this problem, but I am fairly certain that the problem exists, and that some kind of solution would be helpful!
I see what you mean, but I think in those situations it’s not too hard to, you know, refrain from using generics. I see no reason to force all language users to not use that feature. Unless Hare is specifically aiming for that niche, which I don’t think it is.
There are very few languages that let you switch between monomorphisation and dynamic dispatch as a compile-time flag, right? So if you have dependencies, you’ve already had the choice forced on you.
that’s the inside of the inside of a library modeling a very complex domain. Complexity needs to live somewhere, and I am not convinced that complexity that is abstracted away and provides value is a bad thing, as much of the “let’s go back to simpler times” discourse seems to imply. I’d rather someone take the time to solve something once than have to solve it myself every time, even if with simpler code.
Is this just complex, or is it actually doing more than the equivalent in other languages? Rust allows for expressing constraints that are not easily (or at all) expressible in other languages, and static types allow for expressing more constraints than dynamic types in general.
In sum, I’d reject a pull request with this type of code in an application, but don’t mind it at all in a library.
that’s the inside of the inside of a library modeling a very complex domain. Complexity needs to live somewhere,
I find that’s rarely the case. It’s often possible to tweak the approach to a problem a little bit, in a way that allows you to simply omit huge swaths of complexity.
I’ve done it repeatedly, as well as seeing others do it. Occasionally, though admittedly rarely, reducing the size of the codebase by an order of magnitude while increasing the number of features.
There’s a huge amount of code in most systems that’s dedicated to solving optional problems. Usually the unnecessary problems are imposed at the system design level, and changing the way the parts interface internally allows simple reuse of appropriate large-scale building blocks and subsystems, reduces the number of building blocks needed, and drops entire sections of translation and negotiation glue between layers.
Complexity rarely needs to be somewhere – and where it does need to be, it’s often in the ad-hoc, problem-specific data structures that simplify the domain. A good data structure can act as a Laplace transform for the entire problem space of a program, even if it takes a few thousand lines to implement. It lets you take the problem, transform it to a space where the problem is easy to solve, and put it back directly.
You can write complex code in any language, with any language feature. The fact that someone has written complex code in Rust with its macros has no bearing on the feature itself.
I am not entirely convinced – to me, it seems there’s a high correlation between languages with parametric polymorphism and languages with a culture of hard-to-understand abstractions (Rust, C++, Scala, Haskell). Even in Java, parts that touch generics tend to require some mind-bending (producer extends consumer super).
I am curious how Go’s generics will turn out in practice!
In general, I feel like Hare just ends up being a Zig without comptime, or a Go without interfaces, generics, GC, or runtime. … I’d always use Zig or Rust instead of Hare or C.
What if you were on a platform unsupported by LLVM?
When I was trying out Plan 9, lack of LLVM support really hurt; a lot of good CLI tools these days are being written in Rust.
Zig has rudimentary plan9 support, including a linker and native codegen (without LLVM). We’ll need more plan9 maintainers to step up if this is to become a robust target, but the groundwork has been laid.
Additionally, Zig has a C backend for those targets that only ship a proprietary C compiler fork and do not publish ISA details.
Finally, Zig has ambitions to become the project that is forked and used as the proprietary compiler for esoteric systems. Although of course we would prefer for businesses to make their ISAs open source and publicly documented instead. Nevertheless, Zig’s MIT license does allow this use case.
I think that implies that your platform is essentially dead (I would like to program my Amiga in Rust or Swift or Zig, too) or so off-mainstream (MVS comes to mind) that those tools wouldn’t serve any purpose anyway because they’re too alien.
Good news: LLVM does support 68k, thanks in part to communities like the Amiga community. LLVM doesn’t like to include stuff unless there’s a sufficient maintainer base, so…
MVS comes to mind
Bad news: LLVM does support S/390. No idea if it’s just Linux or includes MVS.
Good news: LLVM does support 68k
Unfortunately, that doesn’t by itself mean that compilers (apart from clang) get ported, or that the platform gets added as part of a target triple. For instance, Plan 9 runs on platforms with LLVM support, yet isn’t supported by LLVM.
Bad news: LLVM does support S/390.
I should have written VMS instead.
I won’t disagree with describing Plan 9 as off-mainstream ;) But I’d still like a console-based Signal client for that OS, and the best (only?) one I’ve found is written in Rust.
I imagine that many people will be wondering how Hare differs from Zig, which seems similar to me as an outsider to both projects. Could someone more familiar with the goals of Hare briefly describe why (assuming a future in which both projects are reasonably mature) someone may want to choose Hare over Zig, and Zig over Hare?
I imagine that many people will be wondering how Hare differs from Zig, which seems similar to me as an outsider to both projects.
As someone who used Hare briefly last year when it was still in development (so this may be slightly outdated), I honestly see no reason to use Hare for the time being. While it provides huge advances over C, it just feels like a stripped-down version of Zig in the end.
My understanding is that Hare is for people who want a modern C (fewer footguns, etc) but who also want a substantially more minimalist approach than what Zig offers. Hare differs from Zig by having a smaller scope (eg it doesn’t try to be a C cross-compiler), not using LLVM, not having generic/templated metaprogramming, and by not having async/await in the language.
That definitely sounds appealing to me as someone who has basically turned his back on 95% of the Rust ecosystem due to it feeling a bit like stepping into a candy shop when I just wanted a little fiber to keep my programs healthier by rejecting bad things. I sometimes think about what a less-sugary Rust might be like to use, but I can’t practically see myself doing anything other than what I am doing currently - using the subset of features that I enjoy while taking advantage of the work that occasionally improves the interesting subset to me. And every once in a while, it’s nice to take a bite out of some sugar :]
If I remember correctly, there was some discussion about a kind of barebones Rust at some point around here. Is that what you would ideally have/work with? Which features would survive, and which be dropped?
It looks like it’s a lot simpler. Zig is trying to do much more. I also appreciate that Hare isn’t self-hosting and can be built using any standard C compiler, and that it chooses QBE over LLVM, which is simpler and more lightweight.
As I understand it, the current Zig compiler is in C++; they are working on a self-hosting compiler, but intend to maintain the C++ compiler alongside it indefinitely.
Well, at some point it would make sense, much like C compilers are ubiquitously self-hosted. As long as it doesn’t make it too hard to bootstrap (for instance, if it has decent cross-compilation support), it should be fine.
Warning: this is not supposed to be taken very seriously. It’s not a joke, but I won’t bet 2 cents that I’m right about any of it.
Pretty much all widely used languages today have a thing. Having a thing is not, by far, the only determinant factor in whether a language succeeds, and you can even question whether wide adoption is such a good measure of success. But the fact is, pretty much all languages we know and use professionally have a thing, or indeed, multiple things:
Python has simplicity, and later, Django, and later even, data science
Ruby has Rails and developer happiness (whatever that means)
Go had simplicity (same name, but a different thing than Python’s) and concurrency (and Google, but I don’t think that it counts as a thing)
PHP had web, and, arguably, Apache and cheap hosts
JavaScript has the browser
Typescript has the browser, but with types
Java had the JVM (write once, run everywhere), and then enterprise
C# had Java, but Microsoft, and then Java, but better
Rust has memory safety even in the presence of threads
Even older languages like SQL, Fortran, Cobol, they all had a thing. I can’t see what Hare’s thing might be. And to be fair, it’s not a problem exclusive to, or especially represented by, Hare. 9/10 times, when there’s a post anywhere about a new language, it has no thing. None. It’s not even that it’s not actually particularly well suited for its thing; it can’t even articulate what its thing is.
“Well, Hare’s thing is systems programming” – that’s like saying that McDonald’s thing is hamburgers. A thing is more than a niche. It’s … well, it’s a thing.
It might well be the case that you can only see a thing in retrospect (I feel like that might be the case with Python, for instance), but still, it feels like it’s missing, and not only here.
It might well be the case that you can only see a thing in retrospect
Considering how many false starts Java had, there was an obvious and error-ridden search process to locate the thing—first delivering portability, mainly for the benefit of Sun installations nobody actually had, then delivering web applets, which ran intolerably poorly on the machines people needed them to run on, and then as a mobile device framework that was, again, a very poor match for the limited hardware of the era, before finding a niche in enterprise web platforms. Ironically, I think Sun did an excellent job of identifying platforms in need of a thing, seemingly without realizing that their thing was a piss-poor contender for being the thing in that niche. If it weren’t for Sun relentlessly searching for something for Java to do, I don’t think it would have gotten anywhere simply on its merits.
feels like it’s missing
I agree, but I also think it’s a day old, and Ruby was around for years before Rails. Although I would say that Ruby’s creator did so out of a desire for certain affordances that were kind of unavailable from other systems of the time—a Smalltalk placed solidly in the Perl-Unix universe rather than isolated in a Smalltalk image. What we seem to have here is a very small itch (Zig with a simpler compiler?) being scratched very intensely.
Ruby and Python were in the back of my mind the whole time I was writing the thing about things (hehe), and you have a point about Java, that thing flailed around A LOT before settling down. Very small itch is a good summary.
Pretty much all widely used languages today have a thing.
[…]
Even older languages like SQL, Fortran, Cobol, they all had a thing
An obvious language you do not mention is C. What’s C’s thing in that framework? And why couldn’t Hare’s thing be “C, but better”, like C# is to Java? (Or arguably C++ is to C, or Zig is to C)
Well, I did say a thing is not the only determinant for widespread adoption. I don’t think C had a thing when it became widely used. Maybe portability? It was the wild wild west days, though.
Hare could very well eat C’s lunch and become big. But being possible is far away from being likely.
But it’s not. At least not once you turn on optimizations. This is a belief people have that makes C seem friendlier and lower level, but there have been any number of articles posted here about the complex transformations between C and assembly.
(Heck, even assembly isn’t really describing what the CPU actually does, not when there’s pipelining and multiprocessing going on.)
But it is. Sure, you can tell the compiler to optimize, in which case all bets are obviously off, but it doesn’t negate the fact that C is the only mainstream high-level language that gets you as close to the machine language as it gets.
you can tell the compiler to optimize, in which case all bets are obviously off
…and since all deployed code is optimized, I’m not sure what your point is.
Any modern C compiler is basically humoring you, taking your code as a rough guideline of what to do, but reordering and inlining and unrolling and constant-folding, etc.
And then the CPU chip gets involved, and even the machine instructions aren’t the truth of what’s really going on in the hardware. Especially in x86, where the instruction set is just an interpreted language that gets heavily transformed into micro-ops you never see.
If you really want to feel like your C code tells the machine exactly what to do, consider getting one of those cute retro boards based on a Z-80 or 8086 and run some 1980s C compiler on it.
No need to lecture and patronize if you don’t get the point.
C was built around machine code, with literally every language construct derived from a subset of the latter and nothing else. It still remains true to that spirit. If you see a piece of C code, you can still make a reasonable guess as to what it roughly translates to, even if it’s unrolled, inlined or even trimmed. Compare that with other languages, where “a += b” or “x = y” may translate into pages of binary.
It’s not that C generates the exact assembly you’d expect; it’s that there’s a cap on what it can generate from a given piece of code you are currently looking at. “x = y” is a memcpy at worst, and dereferencing a pointer does more or less just that. Not the case with C++, let alone Go, D, etc.
I suggest reading an intro compilers class textbook. Compilers do basic optimizations like liveness analysis, dead store elimination, etc. Just because you write down “x = y” doesn’t mean the compiler will respect it and keep the load/store in your binary.
Some general observations. I don’t have specific examples handy and I’m not going to spend the time to conjure them up for what is already a thread that is too deep.
At -O0 there are many loads and stores generated that are not expected. This is because the compiler is playing it safe and accessing everything from the stack. Customers generally don’t expect that and some complain that the code is “stupid”.
At -O1 and above, lots of code gets moved around, especially when inlining and hoisting code out of loops. Non-obvious loop invariants and loops that have no effect on memory (because the user forgot a volatile) regularly result in bug reports saying the compiler is broken. In nearly every case, the user expects all the code they wrote to be there in the order they wrote it, with all the function calls in the same order. This is rarely the case.
Interrupt code will be removed sometimes because it is not called anywhere. The user often forgets to tag a function as an interrupt and just assumes everything they write will be in the binary.
Our customers program microcontrollers. They sometimes need timing requirements on functions, but make the assumption that the compiler will generate the code they expect to get the exact timing requirements they need. This is a bad assumption. They may think that a series of reads and writes from memory will result in a nearly 1-1 correspondence of load/stores, but the compiler may realize that because things are aligned properly, it can be done in a single load-multiple/store-multiple instruction pair.
People often expect one statement to map to a contiguous region of instructions. When optimization is turned on, that’s not true in many cases. The start and end of something as “simple” as x = y can be very far apart.
This is just from recent memory. There is really no end to it. I won’t get into the “this instruction sequence takes too many cycles” reports as those don’t seem to match your desired criteria.
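As a tiny illustration of the “forgot a volatile” case above (a hypothetical polling loop, assuming GCC/Clang at -O1 or higher):

/* Without volatile, the compiler may read 'ready' once, see that the loop
   body cannot change it, and turn this into an infinite loop (or drop it). */
extern int ready;               /* set from an interrupt handler */

void wait_for_ready(void)
{
    while (!ready) { }
}

/* With volatile, every iteration performs a fresh load. */
extern volatile int ready_ok;

void wait_for_ready_ok(void)
{
    while (!ready_ok) { }
}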
Thanks. These are still more or less in the ballpark of what’s expected with optimizations on.
I’ve run into at least a couple of these, but I can remember only one case when it was critical and required switching optimizations off to get what we needed from the compiler (it had to do with handling very small packets in the NIC driver). Did kill a day on that though.
Some initial thoughts, with the caveats that I quite like Rust and am put off by DdV:
Suspicion - the “new systems” niche seems filled pretty well by Rust and Zig IMO, and Hare would have to be pretty interesting to justify using it over one of those. As spacejam said, everyone’s initial question is “why this instead of Rust or Zig (or C)” - those comparisons should be front and center.
I’m confused about who the tutorial’s written for. It includes passages like this:
A function is the basic unit of executable code in a Hare program, which defines a step or series of steps for the program to perform. Functions can “call” each other to deputize various tasks, such as relying on “fmt” to format text for printing.
Which I can’t imagine is new or useful information to anyone reading this.
The tagged unions seem cool, especially if it’s possible to transparently pass a value of (a | b) to something expecting a (a | b | c). I wonder how that’s implemented?
The insert keyword seems like a strange choice. This means the syntax of the language relies on the presence of a single global dynamic allocator. C, Rust, and especially Zig all do a better job of limiting allocation to library functions, which makes those languages more suitable to embedded devices, contexts where allocation failure must be cleanly handled (like most C libraries), or contexts where multiple dynamic allocators are available (like the Linux kernel). I guess they did it this way because they don’t have user-accessible generics, so the only way to provide a function like this is compiler-level magic?
Edit:
Ok, yeah, the spec confirms that any dynamic allocation failure aborts the program. Rust’s standard library has this behavior, and it’s caused a lot of trouble for integrating Rust into things like curl and the Linux kernel, but there’s work on providing fallible allocation APIs in the standard library. Baking this behavior into the language, in a language that should be learning from Rust’s mistakes, is baffling.
Hare has both a ? operator (same as Rust’s ? or Zig’s try) and a ! operator (same as Rust’s unwrap). ? is great, but I’m not a huge fan of the single-character ! operator here - a big advantage of Rust’s approach is that unwrap is much longer than ?, and sticks out in code review.
Hare has both a ? operator (same as Rust’s ? or Zig’s try) and a ! operator (same as Rust’s unwrap). ? is great, but I’m not a huge fan of the single-character ! operator here - a big advantage of Rust’s approach is that unwrap is much longer than ?, and sticks out in code review.
I dunno, Zig has catch unreachable, which sticks out way more than .unwrap(), but I just find it annoying to type that out dozens of times, especially in testing/experimental/throwaway code where I couldn’t care less about robustness.
Each type in Hare is assigned a unique ID, which is stored as the first field of a tagged union type to indicate which type is stored there. Following the tag, a union of all of the possible types is stored.
Ah, cool, that’s clever. I guess that means all union values must have a 4-byte tag, even if they only need one byte - but that’s probably fine in practice.
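For readers unfamiliar with the layout being described, a rough C analogue might look like this (the type IDs and member names are made up for illustration; this is not Hare’s actual ABI):

#include <stdint.h>

enum { TAG_INT = 1, TAG_F64 = 2, TAG_ERROR = 3 };

struct tagged {
    uint32_t tag;   /* unique type ID: says which union member is live */
    union {
        int64_t i;
        double  f;
        int     err;
    } val;
};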
I’m always amazed by Drew DeVault’s breadth of projects: from window managers to microkernels, including a web forge and now a programming language with its compiler and libraries. While I find him sometimes too antagonistic/abrasive, this type of technical breadth is (to my knowledge) only surpassed by Fabrice Bellard.
Even other people whom I look up to, like Jason A. Donenfeld (aka /u/zx2c4 or “the wireguard guy”) and Andrew Gallant (aka /u/burntsushi or “the ripgrep guy”), only specialize in one field (security and text processing, respectively).
I can only be in awe when I see this level of skill. I’m looking forward to playing around with his new language when I find some free time.
I am also amazed by his work, and don’t wish to disparage him in any way, but in his own words
As a prolific maintainer of several dozen FOSS projects, I’m often asked how I can get so much done, being just one person. The answer is: I’m not just one person. I have enjoyed the help of thousands of talented people who have contributed to these works. Without them, none of the projects I work on would be successful.
– https://drewdevault.com/2022/03/14/It-takes-a-village.html
Where they were writing window managers, I was writing GPU drivers. We both worked on Minecraft servers. We both contributed various low-level experiments for DCPU-16. Where they were writing microkernels, I was reverse-engineering video-game consoles. While they were founding their small business for a Free Software forge, I was starting my small business for encrypted cloud storage. They’re working on Hare, which is basically their opinion on how to make an ergonomic C; I worked on Monte, which was basically our opinion of how to make a practical E.
And yet, I’m not a good person. Maybe they aren’t, either.
I ignored almost everything and went straight to the bit I have some expertise on: cryptography.
I like the relative lack of bloat. We could argue that their cryptographic library is not complete, but that can be fixed.
I like that (apparently) slices are used for the API. Having written a cryptographic library in C, I saw how we are reading from and writing to buffers all the time, and having to specify their length explicitly means my functions have many more arguments than I would have liked.
I like the choice of primitives. Except perhaps AES (slow or vulnerable to timing attacks on pure software implementations), but I understand its appeal in the face of widespread hardware support.
There is one thing I’d like to insist on, that I realised fairly late in my cryptographic career: naive key exchange is not high-level enough.
By “key exchange”, I mean the kind of key exchange that happens in NaCl’s crypto_box(): a key exchange proper followed by a hash, so the two parties have a shared key. Nowadays it’s not quite enough to just exchange Alice’s and Bob’s long term keys, we also want stronger properties like forward secrecy and key compromise impersonation resistance. To do that, you need a full key exchange protocol involving 2 or 3 messages in most cases. I don’t know what Hare is actually using here (the key exchange link is not live), but if they don’t have it already, something like Noise would be a good addition some time in the future.
As someone who is rather new to languages like C (I only recently got into it by making a game with it), I have a few newbie questions:
Why do people want to replace C? Security reasons, or just old and outdated?
What does Hare offer over C? They say that Hare is simpler than C, but I don’t understand exactly how. Same with Zig. Do they compile to C in the end, and these languages just make it easier for user to write code?
That being said, I find it cool to see these languages popping up.
Why do people want to replace C? Security reasons, or just old and outdated?
#include <foo.h> includes all functions/constants into the current namespace, so you have no idea what module a function came from
C’s macro system is very, very error prone and very easily abused, since it’s basically a glorified search-and-replace system that has no way to warn you of mistakes.
There are no methods for structs, you basically create struct Foo and then have to name all the methods of that struct foo_do_stuff (instead of doing foo_var.do_stuff() like in other languages)
C has no generics, you have to do ugly hacks with either void* (which means no type checking) or with the macro system (which is a pain in the ass).
C’s standard library is really tiny, so you end up creating your own in the process, which you end up carrying around from project to project.
C’s standard library isn’t really standard, a lot of stuff isn’t consistent across OS’s. (I have agreeable memories of that time I tried to get a simple 3kloc project from Linux running on Windows. The amount of hoops you have to jump through, tearing out functions that are Linux-only and replacing them with an ifdef mess to call Windows-only functions if you’re on compiling on Windows and the Linux versions otherwise…)
C’s error handling is completely nonexistent. “Errors” are returned as integer codes, so you need to define an enum/constants for each function (for each possible returned error), but if you do that, you need to have the actual return value as a pointer argument.
C has no anonymous functions. (Whether this matters really depends on your coding style.)
Manual memory management without defer is a PITA and error-prone.
Weird integer type system. long long, int, short, etc which have different bit widths on different arches/platforms. (Most C projects I know import stdint.h to get uint32_t and friends, or just have a typedef mess to use usize, u32, u16, etc.)
EDIT: As Forty-Bot noted, one of the biggest issues is null-terminated strings.
I could go on and on forever.
What does Hare offer over C?
It fixes a lot of the issues I mentioned earlier, as well as reducing footguns and implementation-defined behavior in general. See my blog post for a list.
They say that Hare is simpler than C, but I don’t understand exactly how.
It’s simpler than C because it comes without all the cruft and compromises that C has built up over the past 50 years. Additionally, it’s easier to code in Hare because, well, the language isn’t trying to screw you up every 10 lines. :^)
Same with Zig. Do they compile to C in the end, and these languages just make it easier for user to write code?
Zig and Hare both occupy the same niche as C (i.e., low-level manual memory managed systems language); they both compile to machine code. And yes, they make it a lot easier to write code.
#include <foo.h> includes all functions/constants into the current namespace, so you have no idea what module a function came from
This and your later point about not being able to associate methods with struct definitions are variations on the same point but it’s worth repeating: C has no mechanism for isolating namespaces. A C function is either static (confined to a single compilation unit) or completely global. Most shared library systems also give you a package-local form but anything that you’re exporting goes in a single flat namespace. This is also true of type and macro definitions. This is terrible for software engineering. Two libraries can easily define different macros with the same name and break compilation units that want to use both.
C++, at least, gives you namespaces for everything except macros.
C has no generics, you have to do ugly hacks with either void* (which means no type checking) or with the macro system (which is a pain in the ass).
The lack of type checking is really important here. A systems programming language is used to implement the most critical bits of the system. Type checks are incredibly important here, casting everything via void* has been the source of vast numbers of security vulnerabilities in C codebases. C++ templates avoid this.
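A small example of the kind of mistake that void*-based “generics” let through (standard qsort; the bug here is deliberate): the comparator is written for int but applied to doubles, and the compiler has no way to object.

#include <stdlib.h>

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

int main(void)
{
    double vals[3] = { 3.0, 1.0, 2.0 };
    qsort(vals, 3, sizeof vals[0], cmp_int);   /* wrong element type, compiles fine */
    return 0;
}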
C’s standard library is really tiny, so you end up creating your own in the process, which you end up carrying around from project to project.
This is less of an issue for systems programming, where a large standard library is also a problem because it implies dependencies on large features in the environment. In an embedded system or a kernel, I don’t want a standard library with file I/O. Actually, for most cloud programming I’d like a standard library that doesn’t assume the existence of a local filesystem as well. A bigger problem is that the library is not modular and layered. Rust’s no_std is a good step in the right direction here.
C’s error handling is completely nonexistent. “Errors” are returned as integer codes, so you need to define an enum/constants for each function (for each possible returned error), but if you do that, you need to have the actual return value as a pointer argument.
From libc, most errors are not returned; they’re signalled via the return value and then stored in a global (now thread-local) variable called errno. Yay. Option types for returns are really important for maintainable systems programming. C++ now has std::optional and std::variant in the standard library; other languages have union types as first-class citizens.
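For concreteness, the pattern being described usually ends up looking something like this (the names are illustrative, not from any real API): the error code comes back as the return value and the real result goes out through a pointer.

#include <errno.h>
#include <stdlib.h>

enum parse_err { PARSE_OK = 0, PARSE_EMPTY, PARSE_RANGE };

static enum parse_err parse_port(const char *s, int *out)
{
    char *end;
    long v;

    if (s == NULL || *s == '\0')
        return PARSE_EMPTY;
    errno = 0;
    v = strtol(s, &end, 10);
    if (errno == ERANGE || *end != '\0' || v < 0 || v > 65535)
        return PARSE_RANGE;
    *out = (int)v;
    return PARSE_OK;
}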
Manual memory management without defer is a PITA and error-prone.
defer isn’t great either because it doesn’t allow ownership transfer. You really need smart pointer types and then you hit the limitations of the C type system again (see: no generics, above). C++ and Rust both have a type system that can express smart pointers.
C has no anonymous functions. (Whether this matters really depends on your coding style.)
Anonymous functions are only really useful if they can capture things from the surrounding environment. That is only really useful in a language without GC if you have a notion of owning pointers that can manage the capture. A language with smart pointers allows you to implement this, C does not.
defer isn’t great either because it doesn’t allow ownership transfer. You really need smart pointer types and then you hit the limitations of the C type system again (see: no generics, above). C++ and Rust both have a type system that can express smart pointers.
True. I’m more saying that defer is the baseline here; without it you need cleanup: labels, gotos, and synchronized function returns. It can get ugly fast.
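For anyone who hasn’t had the pleasure, the baseline being referred to looks roughly like this in C (an illustrative function, not from any real codebase):

#include <stdio.h>
#include <stdlib.h>

int copy_header(const char *path, char **out)
{
    int rc = -1;
    FILE *f = NULL;
    char *buf = NULL;

    f = fopen(path, "rb");
    if (!f)
        goto done;
    buf = malloc(64);
    if (!buf)
        goto close_file;
    if (fread(buf, 1, 64, f) != 64)
        goto free_buf;

    *out = buf;
    rc = 0;
    goto close_file;    /* success path skips the free but still closes the file */

free_buf:
    free(buf);
close_file:
    fclose(f);
done:
    return rc;
}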
Anonymous functions are only really useful if they can capture things from the surrounding environment. That is only really useful in a language without GC if you have a notion of owning pointers that can manage the capture. A language with smart pointers allows you to implement this, C does not.
I disagree, depends on what you’re doing. I’m doing a roguelike in Zig right now, and I use anonymous functions quite extensively for item/weapon/armor/etc triggers, i.e., where each game object has some unique anonymous functions tied to the object’s fields and can be called on certain events. Having closures would be nice, but honestly in this use-case I didn’t really feel much of a need for it.
Note that C does have “standard” answers to a lot of these.
C’s macro system is very, very error prone and very easily abused, since it’s basically a glorified search-and-replace system that has no way to warn you of mistakes.
The macro system is the #1 thing keeping C alive :)
There are no methods for structs, you basically create struct Foo and then have to name all the methods of that struct foo_do_stuff (instead of doing foo_var.do_stuff() like in other languages)
Aside from macro stuff, the typical way to address this is to use a struct of function pointers. So you’d create a wrapper like
void do_stuff(struct foo *f)
{
    f->do_stuff(f);
}
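(For completeness, the struct of function pointers that wrapper assumes would look something like this; the field name matching the wrapper is my assumption.)

struct foo {
    int data;
    void (*do_stuff)(struct foo *self);
};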
C has no generics, you have to do ugly hacks with either void* (which means no type checking) or with the macro system (which is a pain in the ass).
Note that typically there is a “base class” which either all “subclasses” include as a member (and use offsetof to recover the subclass) or have a void * private data pointer. This doesn’t really escape the problem, however in practice I’ve never run into a bug where the wrong struct/method gets combined. This is because the above pattern ensures that the correct method gets called.
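A sketch of that pattern, using the container_of idiom (the names here are illustrative):

#include <stddef.h>

struct base {
    void (*describe)(struct base *self);
};

struct derived {
    int extra;
    struct base base;   /* embedded "parent" */
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void derived_describe(struct base *b)
{
    struct derived *d = container_of(b, struct derived, base);
    (void)d->extra;     /* the subclass fields are now reachable */
}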
C’s error handling is completely nonexistent. “Errors” are returned as integer codes, so you need to define an enum/constants for each function (for each possible returned error), but if you do that, you need to have the actual return value as a pointer argument.
Well, there’s always errno… And if you control the address space you can always use the upper few addresses for error codes. That said, better syntax for multiple return values would probably go a long way.
C has no anonymous functions. (Whether this matters really depends on your coding style.)
IIRC gcc has them, but they require executable stacks :)
Manual memory management without defer is a PITA and error-prone.
Agree. I think you can do this with GCC extensions, but some sugar here would be nice.
Weird integer type system. long long, int, short, etc which have different bit widths on different arches/platforms. (Most C projects I know import stdint.h to get uint32_t and friends, or just have a typedef mess to use usize, u32, u16, etc.)
Arguably there should be fixed-width types, size_t, intptr_t, and regsize_t. Unfortunately, C lacks the last one, which is typically assumed to be long. Rust, for example, gets this even more wrong and lacks the last two (cf. the recent post on 129-bit pointers).
IMO you missed the most important part, which is that C strings are (by-and-large) nul-terminated. Having better syntax for carrying a length around with a pointer would go a long way to making string support better.
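The usual shape of that workaround, for anyone who hasn’t seen it (the struct and macro names here are mine, purely illustrative):

#include <stddef.h>

struct str_slice {
    const char *data;   /* not necessarily nul-terminated */
    size_t len;
};

#define SLICE_LIT(s) ((struct str_slice){ (s), sizeof(s) - 1 })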
Even in C’s domain, where C lacks nothing and is fine for what it is, I would criticize C for maybe 5 things, which I would consider the real criticism:
It has undefined behaviour, of the kind that has come to mean that the compiler may disobey the source code. It turns working code into broken code just by switching compiler or inlining some code that wasn’t inlined before. You can’t necessarily point at a piece of code and say it was always broken, because UB is a runtime phenomenon. Not reassuring for a supposedly lowlevel language.
Its operator precedence is wrong.
Integer promotion. Just why.
Signedness propagates the wrong way: instead of the default type being signed (int) and comparison between signed and unsigned yielding unsigned, it should be the opposite: there should be a nat type (for natural number, effectively size_t), and comparison between signed and unsigned should yield signed. (A quick illustration follows this list.)
char is signed. Nobody likes negative code points.
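A small illustration of the integer-promotion and signedness points (standard C behaviour; the usual arithmetic conversions pull the signed operand to unsigned):

#include <stdio.h>

int main(void)
{
    /* -1 is converted to unsigned int (a huge value), so the test is false. */
    if (-1 < 1u)
        puts("signed comparison");
    else
        puts("unsigned comparison");   /* this branch runs */
    return 0;
}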
the kind that has come to mean that the compiler may disobey the source code. It turns working code into broken code
I’m wary of this same tired argument cropping up again, so I’ll just state it this way: I disagree. Code that invokes undefined behavior is already broken; changing compiler can’t (except perhaps in very particular circumstances, which I don’t think you were referring to) introduce undefined behaviour; it can change the observable behaviour when UB is invoked.
A compiler can’t “disobey the source code” whilst conforming to the language standard. If the source code does something that doesn’t have defined semantics, that’s on the source code, not the compiler.
“It’s easy to accidentally invoke undefined behaviour in C” is a valid criticism, but “C compilers breaks code” is not.
You can’t necessarily point at a piece of code and say it was always broken
You certainly can in some instances. But sure, for example, if some piece of code dereferences a pointer and the value is set somewhere else, it could be undefined or not depending on whether the pointer is valid at the point it is dereferenced. So code might be “not broken” given certain constraints (eg that the pointer is valid), but not work properly if those constraints are violated, just like code in any language (although in C there’s a good chance the end result is UB, which is potentially more catastrophic).
I’m not saying C is a good language, just that I think this particular criticism is unfair. (Also I think your point 5 is wrong, char can be unsigned, it’s up to the implementation).
Thing is, it certainly feels like the compiler is disobeying the source code. Signed integer overflow? No problem pal, this is x86, that platform will wrap around just fine! Right? Riiight? Oops, nope, and since the compiler pretends UB does not exist, it just deleted a security check that it deemed “dead code”, and now my hard drive has been encrypted by a ransomware that just exploited my vulnerability.
Though I agree with all the facts you laid out, and with the interpretation that UB means the program is already broken even if the generated binary didn’t propagate the error. But Chandler Carruth pretending that UB does not invoke the nasal demons is not fair. Let’s not forget that UB means the compiler is allowed to cause your entire hard drive to be formatted, as ridiculous as it may sound. And sometimes it actually happens (as it did so many times with buffer overflow exploits).
Sure, it’s not like the compiler is actually disobeying your source code. But since UB means “all bets are off”, and UB is not always easy to catch, the result is pretty close.
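The canonical example of such a deleted check (illustrative; what actually happens depends on the compiler and optimization level):

#include <limits.h>

int add_one_checked(int len)
{
    /* Because signed overflow is UB, the compiler may assume len + 1 never
       wraps, fold this test to false, and delete the "security check". */
    if (len + 1 < len)
        return -1;
    return len + 1;
}

int add_one_safe(int len)
{
    /* Well-defined version: test before doing the arithmetic. */
    if (len == INT_MAX)
        return -1;
    return len + 1;
}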
Sure, it’s not like the compiler is actually disobeying your source code. But since UB means “all bets are off”, and UB is not always easy to catch, the result is pretty close.
I feel like “disobeying the code” and “not doing what I intended it to do due to the code being wrong” are still two sufficiently different things that it’s worth distinguishing.
But it is also worth noting that C is quite special. This UB business repeatedly violates the principle of least astonishment. Especially the modern interpretation, where compilers systematically assume UB does not exist and any code path that hits UB is considered “dead code”.
The original intent of UB was much closer to implementation-defined behaviour. Signed integer overflow was originally UB because some platforms crashed or otherwise went bananas when it occurred. But the expectation was that on platforms that behave reasonably (like x86, which wraps around), we’d get the reasonable behaviour. But then compiler writers (or should I say their lawyers) noticed that, strictly speaking, the standard didn’t make that expectation explicit, and in the name of optimisation started to invoke nasal demons even on platforms that could have done the right thing.
Sure the code is wrong. In many cases though, the standard is also wrong.
I agree with some things but not others that you say, but these arguments have been hashed out many times before.
Sure the code is wrong
That’s the point I was making. Since we agree on that, and we agree that there are valid criticisms of C as a language (though we may differ on the specifics of those), let’s leave the rest. Peace.
It doesn’t compile it wrong. Code with no semantics can’t be compiled incorrectly. You’re making the exact same misrepresentation as in the post above that I responded to originally.
I’d almost agree, though I can think of some cases where such code could exist for a reason (and I’ll bet that such code exists in real code bases). In particular, hairy macro expansions etc which produce code that isn’t even executed (or won’t be executed in the case where it would be UB, at least) in order to make compile-time type-safety checks. IIRC there are a few such things used in the Linux kernel. There are probably plenty of other cases; there’s a lot of C code out there.
In practice though, a lot of code that potentially exhibits UB only does so if certain constraints are violated (eg if a pointer is invalid, or if an integer is too large and will result in overflow at some operation), and the compiler can’t always tell that the constraints necessarily will be violated, so it generates code with the assumption that if the code is executed, then the constraints do hold. So if the larger body of code is wrong - the constraints are violated, that is - the behaviour is undefined.
Contrary to what people are saying, C is just fine for what it is.
People complain about the std library being tiny, but you basically have the operating system at your fingertips, where C is a first-class citizen.
Then people complain C is not safe; yes, that’s true, but with a set of best practices you can keep things under control.
People complain you don’t have generics; you don’t need them most of the time.
Projects like nginx, SQLite and redis, not to speak of the Nix world, prove that C is a perfectly fine language. Also, most of the popular Python libraries nowadays are written in C.
Hi! I’d like to introduce you to Fish in a Barrel, a bot which publishes information about security vulnerabilities to Twitter, including statistics on how many of those vulnerabilities are due to memory unsafety. In general, memory unsafety is easy to avoid in languages which do not permit memory-unsafe operations, and nearly impossible to avoid in other languages. Because C is in the latter set, C is a regular and reliable source of security vulnerabilities.
I understand your position; you believe that people are morally obligated to choose “a set of best practices” which limits usage of languages like C to supposedly-safe subsets. However, there are not many interesting subsets of C; at best, avoiding pointer arithmetic and casts is good, but little can be done about the inherent dangers of malloc() and free() (and free() and free() and …) Moreover, why not consider the act of choosing a language to be a practice? Then the choice of C can itself be critiqued as contrary to best practices.
nginx is well-written, but Redis is not. SQLite is not written just in C, but in several other languages combined, including SQL and TH1 (“test harness one”); this latter language is specifically for testing that SQLite behaves properly. All three have had memory-unsafety bugs. This suggests that even well-written C, or C in combination with other languages, is unsafe.
Additionally, Nix is written in C++ and package definitions are written in shell. I prefer PyPy to CPython; both are written in a combination of C and Python, with CPython using more C and PyPy using more Python. I’m not sure where you were headed here; this sounds like a popularity-contest argument, but those are not meaningful in discussions about technical issues. Nonetheless, if it’s the only thing that motivates you, then consider this quote from the Google Chrome security team:
Since “memory safety” bugs account for 70% of the exploitable security bugs, we aim to write new parts of Chrome in memory-safe languages.
I am curious about your claim that Redis is not well-written? I’ve seen other folks online hold it up as an example of a well-written C codebase, at least in terms of readability.
I understand that readable is not the same as secure, but would like to understand where you are coming from on this.l
I also believe C will still have a place for long time. I know I’m a newbie with it, but making a game with C (using Raylib) has been pretty fun. It’s simple and to the point… And I don’t mind making mistakes really, that’s how I learn the best.
But again it’s cool to see people creating new languages as alternatives.
Here’s a list of ways that Drew says Hare improves over C:
Hare makes a number of conservative improvements on C’s ideas, the biggest bet of which is the use of tagged unions. Here are a few other improvements:
A context-free grammar
Less weird type syntax
Language tooling in the stdlib
Built-in and semantically meaningful static and runtime assertions
A lightweight system for dependency resolution
defer for cleanup and error handling
An optional build system which you can replace with make and standard tools
Even with these improvements, Hare manages to be a smaller, more conservative language than C, with our specification clocking in at less than 1/10th the size of C11, without sacrificing anything that you need to get things done in the systems programming world.
It’s worth reading the whole piece. I only pasted his summary.
This is a deeply philosophically incoherent statement:
Our design principles are:
Trust the programmer.
Provide tools the programmer may use when they don’t trust themselves.
Prefer explicit behavior over implicit behavior.
A good program must be both correct and simple.
First of all, I can only assume this is a list of differentiators, because there are many more implied principles which aren’t mentioned here; that familiar syntax is better, that *nix are the most important OSes, that modularity is important, etc. These are not just axioms for a coming argument; these are the raison d’etre of Hare.
Proceeding on that assumption, we immediately have a problem. The first point makes no sense as a differentiator. There is no language in the world that is built on the premise “don’t trust the programmer” (or, perhaps, no industrial language; an esolang like that would be fascinating.) This strikes me as a shot at garbage collected languages, or perhaps at Rust; the latter being a bit more likely as the two languages compete in the same space. Either way, it is a rather, ah, hare-brained criticism, since in most languages like this, and certainly in Rust, there are facilities for bypassing the GC (or borrow-checker) and doing whatever the heck you want.
The second point is fine, but in combination with the first, it seems… I don’t know, petty? “We’re good enough at programming that we don’t need any compiler telling us what to do, but we recognize that some of you aren’t, so we’re giving you some memory safety tools.” This attitude is reinforced in “Hare’s Advances Compared to C”:
Despite the veneration with which we look upon C, those who look upon it with scorn do have some valid concerns. A complete lack of memory-safe features and a miserable error handling experience both make it pretty easy to make mistakes in C that are not possible elsewhere. Some of this also makes it possible to do desirable things which are not possible in those other languages, but it would be nice to have options for when you don’t need to do those things and just want to write reliable code.
This is absolutely true and I agree with it (though “veneration” is a bit much - what are you, a Confucian?), but when placed in the context of other things DeVault has written, such as in “Rust is not a good C replacement”, it reads like a double standard.
Rust is more safe. I don’t really care. In light of all of these problems, I’ll take my segfaults and buffer overflows.
Here, “these problems” are, among others, a lack of portability compared to C, lack of a spec, and a lack of competing implementations. To its credit, Hare has a spec, but I don’t see competing implementations, and it doesn’t even support Windows, let alone the 90s RISC architectures people are complaining about missing in LLVM/Rust. Why is that okay in Hare but not in Rust? It’s almost enough to make one think that DDV just doesn’t like Rust because it opens systems programming up to people he doesn’t deem worthy.
Beyond this, point 4 is just a straight-up shot at programming languages with advanced type systems. “I’m smart enough to write correct code without having the compiler check my work. Why aren’t you?” This is ridiculous; nobody can write code as perfectly and consistently as a computer can check mathematically encoded invariants, and while there is a very important conversation to be had about what the right tradeoff is between flexibility and powerful static analysis, that is not the conversation being had here.
This language is interesting and I’m excited to see it progress, but I sincerely hope this is the end of DDV’s bashing of all languages which are not strictly better than C in every way, and that the elitism on display is toned down somewhat in later revisions of the meta-information around the language.
EDIT: I should talk about why I’m interested. It looks like an expression-oriented C with required initialization, slices, and slicey UTF-8 strings (and UCS-32 types!). A modern C, in other words. That’s valuable, if it can gain traction.
i’ve known about this language for awhile, and it has been a lot of fun to toy around with. i’m excited to see where the project goes.
my only complaint so far (fwiw, i know very little about programming in general) is the seemingly gratuitous use of semi-colons. here’s a snippet that displays what i mean:
# ctrl+f for ; - there are so many!
for (let i = 0z; i < len(items); i += 1) {
if (items[i] == "Hare") {
continue;
};
fmt::println(items[i])!;
};
to me, this represents the compiler yelling at me about five thousand times before i finally get it right. does anyone know why this might be, or have any insight into whether this syntax is here to stay?
I’m not related to Hare in any way, but in my experience, the formatting and style stuff like this tends to be laid down in stone right towards the very beginning. Sure enough: https://harelang.org/style/
Does it have more ; than it really needs? Maybe, but it’s the way they want it, so :)
If that’s the official style of Hare, then Drew should enforce it through the compiler—don’t be like Rob Pike and chicken out of enforcing it with the compiler.
Drew is known for not having a problem stating his opinions and enforcing them, I imagine it’s either still a work in progress or they just haven’t gotten around to it yet.
Random example: sr.ht doesn’t have a www DNS entry, so www.sr.ht doesn’t work at all.
I see it has defer. I would instead be tempted by explicit destructors (aka “higher RAII” or whatever Vale calls it). Defer-style cleanup kind of has the wrong default: it works well for classic resources that the user expects, like memory, files and mutexes, but when making an API around some other resource, it is nice to be able to give the user an object that they can’t simply forget to do something about. If they want to ignore it, they just call a function that consumes it – destructors in this sense are just normal functions.
No, not if you have to remember to call it. Not a special function name either. Forget defer (or maybe keep it for convenience’s sake). Think destructors, except that the calls are explicit – it’s an error to let such an object go out of scope alive.
// Let's say this means that HotPotato must not outlive its scope.
#[derive(Resource)]
type HotPotato = struct{};
/// Classic destructor – takes nothing, returns nothing
fn foo(self: HotPotato) void = {
whatever_cleanup();
// Only in a destructor implementation, after all obligations are met:
self.forget(); // Actually go out of scope (derived from Resource).
};
/// The typestate pattern – consumes the object, returns something else
fn bar(self: HotPotato, otherArgument: int) SomeOtherTypeState = {
// implementation defined
};
I find the interpretation of trust in a language context to be a really interesting division. For example, looking at Hare’s first two design principles:
Trust the programmer.
Provide tools the programmer may use when they don’t trust themselves.
Are generics not implemented because we don’t trust the programmer not to make complex code? I could use the above principles to justify Rust and Haskell levels of compiler logic too.
I’m not saying Hare’s interpretation is wrong and I quite like what it’s trying to do, I just find that those two principles can have wildly varying interpretations.
The space of languages that take C and improve on its dev experience without going fully different (e.g., C++ and Rust) is getting crowded lately. I can think of Zig, Odin, and V, which are recent languages taking slightly different approaches to C += ε.
I’m not sure this is exactly the same design space though unless you define “systems programming” as scripting systems together. In my mind at least, systems programming is more low-level and includes things like kernels, device drivers and low power devices. See also the insistence on not shipping a runtime.
@ddevault has used Go and there are many similarities. But I find some of the differences interesting:
syntax for error handling
no runtime
no goroutines (because of no runtime)
no generics (since go just got them)
tagged unions
no plans for async / await (go also doesn’t have them, but while I really like avoiding async/await since it colours functions, I would expect something like goroutines or a great threading library instead)
I wonder if it would be more straight-forward to translate C programmatically to Hare than to Go or Rust. If one could, with relative ease, improve the safety of existing programs, it could be a big win.
Rants which I might note that he apologized for, and is doing a lot less of as well.
There’s the note at the bottom of https://drewdevault.com/2021/04/26/Cryptocurrency-is-a-disaster.html .
Thanks! I wish it was a little more prominent, but I’m glad someone was able to find it.
In any case: good luck, Drew! Cheers!
DdV’s answer on HN:
We don’t want to help non-FOSS OSes.
(Paraphrasing a lot, obvs.)
My personal 2c:
Some of the nastier weirdnesses in Go are because Go supports Windows and Windows is profoundly un-xNix-like. Supporting Windows distorted Go severely.
I think that’s the consequence of not planning for Windows support in the first place. Rust’s standard library was built without the assumption of an underlying Unix-like system, and it provides good abstractions as a result.
Amos talks about that here: Go’s file APIs assume a Unix filesystem. Windows support was kludged in later.
Windows and Mac/iOS don’t need help from new languages; it’s rather the other way around. Getting people to try a new language is pretty hard, let alone getting them to build real software in it. If the language deliberately won’t let them target three of the most widely used operating systems, I’d say it’s shooting itself in the foot, if not in the head.
(There are other seemingly perverse decisions too. 80-character lines and 8-character indentation? Manual allocation with no cleanup beyond a general-purpose “defer” statement? I must not be the target audience for this language, is the nicest response I have.)
Just for clarity, it’s not my argument. I was just trying to précis DdV’s.
I am not sure I agree, but then again…
I am not sure that I see the need for yet another C-replacement. Weren’t Limbo, D, Go, & Rust all attempts at this?
But that aside: there are a lot of OSes out there that are profoundly un-Unix-like. Windows is actually quite close, compared to, say, Oberon or classic MacOS or Z/OS or OpenVMS or Netware or OS/2 or iTron or OpenGenera or [cont’d p94].
There is a lot of diversity out there that gets ignored if it doesn’t have millions of users.
Confining oneself to just OSes in the same immediate family seems reasonable and prudent to me.
My understanding is that the lack of generics and comptime is exactly the differentiating factor here – the project aims at simplicity, and generics/compile time evaluations are enormous cost centers in terms of complexity.
You could say that generics and macros are complex, relative to the functionality they offer.
But I would put comptime in a different category – it’s reducing complexity by providing a single, more powerful mechanism. Without something like comptime, IMO static languages lose significant productivity / power compared to a dynamic language.
You might be thinking about things from the tooling perspective, in which case both features are complex (and probably comptime even more because it’s creating impossible/undecidable problems). But in terms of the language I’d say that there is a big difference between the two.
I think a language like Hare will end up pushing that complexity out to the tooling. I guess it’s like Go where they have go generate and relatively verbose code.
Yup, agree that zig-style seamless comptime might be a great user-facing complexity reducer.
I’m not being Zig-specific when I say that, by definition, comptime cannot introduce user-facing complexity. Unlike other attributes, comptime only exists during a specific phase of compiler execution; it’s not present during runtime. Like a static type declaration, comptime creates a second program executed by the compiler, and this second program does inform the first program’s runtime, but it is handled entirely by the compiler. Unlike a static type declaration, the user uses exactly the same expression language for comptime and runtime.
If we think of metaprogramming as inherent complexity, rather than incidental complexity, then an optimizing compiler already performs compile-time execution of input programs. What comptime offers is not additional complexity, but additional control over complexity which is already present.
To put all of this in a non-Zig context, languages like Python allow for arbitrary code execution during module loading, including compile-time metaprogramming. Some folks argue that this introduces complexity. But the complexity of the Python runtime is there regardless of whether modules get an extra code-execution phase; the extra phase provides expressive power for users, not new complexity.
Yeah, but I feel like this isn’t what people usually mean when they say some feature “increases complexity.”
I think they mean something like: Now I must know more to navigate this world. There will be, on average, a wider array of common usage patterns that I will have to understand. You can say that the complexity was already there anyway, but if, in practice, it was usually hidden, and now it’s not, doesn’t that matter?
As a concrete example, I don’t have to know about a new keyword or what it means when the optimizing compiler does its thing.
A case can be made that this definition of complexity is a “good thing” to improve code quality / “matters”:
Similar arguments can be used for undefined behavior (UB) as it changes how you navigate a language’s world. But for many programmers, it can be usually hidden by code seemingly working in practice (i.e. not hitting race conditions, not hitting unreachable paths for common input, updating compilers, etc.). I’d argue that this still matters (enough to introduce tooling like UBSan, ASan, and TSan at least).
The UB is already there, both for correct and incorrect programs. Providing tools to interact with it (i.e. __builtin_unreachable -> comptime) as well as explicit ways to do what you want correctly (i.e. __builtin_add_overflow -> comptime-specific language constructs interacted with using normal code, e.g. for vs inline for) would still be described as “increases complexity” under this model, which is unfortunate.
Unless one is purposefully using a specific compiler (or set thereof) that actually defines the behaviour the standard didn’t, the program is incorrect. That it just happens to generate correct object code with this particular version of that particular compiler on those particular platforms is just dumb luck.
Thus, I’d argue that tools like MSan, ASan, and UBSan don’t introduce any complexity at all. They just reveal the complexity of UB that was already there, and they do so reliably enough that they actually relieve me of some of the mental burden I previously had to shoulder.
Python doesn’t allow compile-time metaprogramming for any reasonable definition of the word. Everything happens and is introspectable at runtime, which allows you to do similar things, but it’s not compile-time metaprogramming.
One way to see this is that sys.argv is always available when executing Python code. (Python “compiles” byte code, but that’s an implementation detail unrelated to the semantics of the language.)
On the other hand, Zig and RPython are staged. There is one stage that does not have access to argv (compile time), and another one that does (runtime).
Related to the comment about RPython I linked here:
http://www.oilshell.org/blog/2021/04/build-ci-comments.html
https://old.reddit.com/r/ProgrammingLanguages/comments/mlflqb/is_this_already_a_thing_interpreter_and_compiler/gtmbno8/
Yours is a rather unconventional definition of complexity.
I am following the classic paper, “Out of the Tar Pit”, which in turn follows Brooks. In “Abstractive Power”, Shutt distinguishes complexity from expressiveness and abstractedness while relating all three.
We could always simply go back to computational complexity, but that doesn’t capture the usage in this thread. Edit for clarity: Computational complexity is a property of problems and algorithms, not a property of languages nor programming systems.
Good faith question: I just skimmed the first ~10 pages of “Out of the Tar Pit” again, but was unable to find the definition that you allude to, which would exclude things like the comptime keyword from the meaning of “complexity”. Can you point me to it or otherwise clarify?
Sure. I’m being explicit for posterity, but I’m not trying to be rude in my reply. First, the relevant parts of the paper; then, the relevance to comptime.
On p1, complexity is defined as the tendency of “large systems [to be] hard to understand”. Unpacking their em-dash and subjecting “large” to the heap paradox, we might imagine that complexity is the amount of information (bits) required to describe a system in full detail, with larger systems requiring more information. (I don’t really know what “understanding” is, so I’m not quite happy with “hard to understand” as a concrete definition.) Maybe we should call this “Brooks complexity”.
On p6, state is a cause of complexity. But comptime does not introduce state compared to an equivalent non-staged approach. On p8, control-flow is a cause of complexity. But comptime does not introduce new control-flow constructs. One could argue that comptime requires extra attention to order of evaluation, but again, an equivalent program would have the same order of evaluation at runtime.
On p10, “sheer code volume” is a cause of complexity, and on this point, I fully admit that I was in error; comptime is a long symbol, adding size to source code. In this particular sense, comptime adds Brooks complexity.
Finally, on a tangent to the top-level article, p12 explains that “power corrupts”: comptime and similar metaprogramming tools don’t make anything newly possible. It’s an annotation to the compiler to emit specialized code for the same computational result. As such, they arguably don’t add Brooks complexity. I think that this argument also works for inline, but not for @compileError.
Yeah, I can see that. But under what conditions would I care how small, big, or icecream-covered the compiler is? Building/bootstrapping for a new platform is a one-time thing, but writing code in the language isn’t. I want the language to make it as easy as possible on me when I’m using it, and omitting features that have been around since the 1990s isn’t helping.
Depends on your values! I personally see how, eg, generics entice users to write overly complicated code which I then have to deal with as a consumer of libraries. I am not sure that not having generics solves this problem, but I am fairly certain that the problem exists, and that some kind of solution would be helpful!
In some situations, emitted code size matters a lot (and with generics, that can quickly grow out of hand without you realizing it).
I see what you mean, but I think in those situations it’s not too hard to, you know, refrain from using generics. I see no reason to force all language users to not use that feature. Unless Hare is specifically aiming for that niche, which I don’t think it is.
There are very few languages that let you switch between monomorphisation and dynamic dispatch as a compile-time flag, right? So if you have dependencies, you’ve already had the choice forced on you.
If you don’t like how a library is implemented, then don’t use it.
Ah, the illusion of choice.
Where is the dividing line? What makes functions “not complex” but generics, which are literally functions evaluated at compile time, “complex”?
I don’t know where the line is, but I am pretty sure that this is past that :D
https://github.com/diesel-rs/diesel/blob/master/diesel_cli/src/infer_schema_internals/information_schema.rs#L146-L210
Sure, that’s complicated. However:
that’s the inside of the inside of a library modeling a very complex domain. Complexity needs to live somewhere, and I am not convinced that complexity that is abstracted away and provides value is a bad thing, as much of the “let’s go back to simpler times” discourse seems to imply. I’d rather someone take the time to solve something once than have me solve it every time, even if with simpler code.
Is this just complex, or is it actually doing more than the equivalent in other languages? Rust allows for expressing constraints that are not easily (or at all) expressible in other languages, and static types allow for expressing more constraints than dynamic types in general.
In sum, I’d reject a pull request with this type of code in an application, but don’t mind it at all in a library.
I find that’s rarely the case. It’s often possible to tweak the approach to a problem a little bit, in a way that allows you to simply omit huge swaths of complexity.
Possible, yes. Often? Not convinced. Practical? I am willing to bet some money that no.
I’ve done it repeatedly, as well as seeing others do it. Occasionally, though admittedly rarely, reducing the size of the codebase by an order of magnitude while increasing the number of features.
There’s a huge amount of code in most systems that’s dedicated to solving optional problems. Usually the unnecessary problems are imposed at the system design level, and changing the way the parts interface internally allows simple reuse of appropriate large-scale building blocks and subsystems, reduces the number of building blocks needed, and drops entire sections of translation and negotiation glue between layers.
Complexity rarely needs to be somewhere – and where it does need to be, it’s often in the ad-hoc, problem-specific data structures that simplify the domain. A good data structure can act as a Laplace transform for the entire problem space of a program, even if it takes a few thousand lines to implement. It lets you take the problem, transform it to a space where the problem is easy to solve, and put it back directly.
You can write complex code in any language, with any language feature. The fact that someone has written complex code in Rust with its macros has no bearing on the feature itself.
It’s the Rust culture that encourages things like this, not the fact that Rust has parametric polymorphism.
I am not entirely convinced – to me, it seems there’s a high correlation between languages with parametric polymorphism and languages with a culture of hard-to-understand abstractions (Rust, C++, Scala, Haskell). Even in Java, parts that touch generics tend to require some mind-bending (producer extends, consumer super).
I am curious how Go’s generic would turn out to be in practice!
Obligatory reference for this: F# Designer Don Syme on the downsides of type-level programming
It’s a good example of the culture and the language design being related.
https://lobste.rs/s/pkmzlu/fsharp_designer_on_downsides_type_level
https://old.reddit.com/r/ProgrammingLanguages/comments/placo6/don_syme_explains_the_downsides_of_type_classes/
which I linked here: http://www.oilshell.org/blog/2022/03/backlog-arch.html
What if you were on a platform unsupported by LLVM?
When I was trying out Plan 9, lack of LLVM support really hurt; a lot of good CLI tools these days are being written in Rust.
Zig has rudimentary plan9 support, including a linker and native codegen (without LLVM). We’ll need more plan9 maintainers to step up if this is to become a robust target, but the groundwork has been laid.
Additionally, Zig has a C backend for those targets that only ship a proprietary C compiler fork and do not publish ISA details.
Finally, Zig has the ambitions to become the project that is forked and used as the proprietary compiler for esoteric systems. Although of course we would prefer for businesses to make their ISAs open source and publicly documented instead. Nevertheless, Zig’s MIT license does allow this use case.
I’ll be damned! That’s super impressive. I’ll look into Zig some more next time I’m on Plan 9.
I think that implies that your platform is essentially dead (I would like to program my Amiga in Rust or Swift or Zig, too) or so off-mainstream (MVS comes to mind) that those tools wouldn’t serve any purpose anyway because they’re too alien.
Good news: LLVM does support 68k, in part thanks to communities like the Amiga community. LLVM doesn’t like to include stuff unless there’s a sufficient maintainer base, so…
Bad news: LLVM does support S/390. No idea if it’s just Linux or includes MVS.
About that…
I won’t disagree with describing Plan 9 as off-mainstream ;) But I’d still like a console-based Signal client for that OS, and the best (only?) one I’ve found is written in Rust.
I imagine that many people will be wondering how Hare differs from Zig, which seems similar to me as an outsider to both projects. Could someone more familiar with the goals of Hare briefly describe why (assuming a future in which both projects are reasonably mature) someone may want to choose Hare over Zig, and Zig over Hare?
As someone who used Hare briefly last year when it was still in development (so this may be slightly outdated), I honestly see no reason to use Hare for the time being. While it provides huge advances over C, it just feels like a stripped-down version of Zig in the end.
My understanding is that Hare is for people who want a modern C (fewer footguns, etc) but who also want a substantially more minimalist approach than what Zig offers. Hare differs from Zig by having a smaller scope (eg it doesn’t try to be a C cross-compiler), not using LLVM, not having generic/templated metaprogramming, and by not having async/await in the language.
That definitely sounds appealing to me as someone who has basically turned his back on 95% of the Rust ecosystem due to it feeling a bit like stepping into a candy shop when I just wanted a little fiber to keep my programs healthier by rejecting bad things. I sometimes think about what a less-sugary Rust might be like to use, but I can’t practically see myself doing anything other than what I am doing currently - using the subset of features that I enjoy while taking advantage of the work that occasionally improves the interesting subset to me. And every once in a while, it’s nice to take a bite out of some sugar :]
If I remember correctly, there was some discussion about a kind of barebones Rust at some point around here. Is that what you would ideally have/work with? Which features would survive, and which be dropped?
It looks like it’s a lot simpler. Zig is trying to do much more. I also appreciate that Hare isn’t self-hosting and can be built using any standard C compiler and chooses QBE over LLVM, which is simpler and more light-weight.
As I understand it, the current Zig compiler is in C++; they are working on a self-hosting compiler, but intend to maintain the C++ compiler alongside it indefinitely.
Ah, thanks for the correction!
Indefinitely, as in, there will always be two official implementations?
Yes. See https://lobste.rs/s/0j45v4/zig_self_hosted_compiler_can_now_build#c_h052bp
If it’s not self-hosting today it looks like self-hosting is a goal:
Well, at some point it would make sense, much like C compilers are ubiquitously self-hosted. As long as it doesn’t make it too hard to bootstrap (for instance, if it has decent cross-compilation support), it should be fine.
Warning: this is not supposed to be taken very seriously. It’s not a joke, but I won’t bet 2 cents that I’m right about any of it.
Pretty much all widely used languages today have a thing. Having a thing is not, by far, the only determinant factor in whether a language succeeds, and you can even question whether wide adoption is such a good measure of success. But the fact is, pretty much all languages we know and use professionally have a thing, or indeed, multiple things:
Even older languages like SQL, Fortran, Cobol, they all had a thing. I can’t see what Hare’s thing might be. And to be fair, it’s not a problem exclusive to, or especially represented by, Hare. 9/10 times, when there’s a post anywhere about a new language, it has no thing. None. It’s not even that it’s not actually particularly well suited for its thing; it can’t even articulate what its thing is.
“Well, Hare’s thing is system’s programming” that’s like saying that McDonald’s thing is hamburgers. A thing is more than a niche. It’s … well, it’s a thing.
It might well be the case that you can only see a thing in retrospect (I feel like that might be the case with Python, for instance), but still, feels like it’s missing, and not not only here.
Considering how many false starts Java had, there was an obvious and error-ridden search process to locate the thing—first delivering portability, mainly for the benefit of Sun installations nobody actually had, then delivering web applets, which ran intolerably poorly on the machines people needed them to run on, and then as a mobile device framework that was, again, a very poor match for the limited hardware of the era, before finding a niche in enterprise web platforms. Ironically, I think Sun did an excellent job of identifying platforms in need of a thing, seemingly without realizing that their thing was a piss-poor contender for being the thing in that niche. If it weren’t for Sun relentlessly searching for something for Java to do, I don’t think it would have gotten anywhere simply on its merits.
I agree, but I also think it’s a day old, and Ruby was around for years before Rails. Although I would say that Ruby’s creator did so out of a desire for certain affordances that were kind of unavailable from other systems of the time—a Smalltalk placed solidly in the Perl-Unix universe rather than isolated in a Smalltalk image. What we seem to have here is a very small itch (Zig with a simpler compiler?) being scratched very intensely.
Ruby and Python were in the back of my mind the whole time I was writing the thing about things (hehe), and you have a point about Java, that thing flailed around A LOT before settling down. Very small itch is a good summary.
Time will tell, but I ain’t betting on it.
I’m with you. But we’ll see, I guess.
An obvious language you do not mention is C. What’s C’s thing in that framework? And why couldn’t Hare’s thing be “C, but better”, like C# is to Java? (Or arguably C++ is to C, or Zig is to C)
C’s thing was Unix.
Incorrect… C’s thing was being a portable, less terrible, macroassembler-ish tool.
Well, I did say a thing is not the only determinant for widespread adoption. I don’t think C had a thing when it became widely used. Maybe portability? It was the wild wild west days, though.
Hare could very well eat C’s lunch and become big. But being possible is far away from being likely.
C’s thing is that it’s a human-friendly assembly. strcpy is rep stosb, va_list is a stack parser, etc.
But it’s not. At least not once you turn on optimizations. This is a belief people have that makes C seem friendlier and lower level, but there have been any number of articles posted here about the complex transformations between C and assembly.
(Heck, even assembly isn’t really describing what the CPU actually does, not when there’s pipelining and multiprocessing going on.)
But it is. Sure, you can tell the compiler to optimize, in which case all bets are obviously off, but it doesn’t negate the fact that C is the only mainstream high-level language that gets you as close to the machine language as it gets.
That’s not a belief, it’s a fact.
…and since all deployed code is optimized, I’m not sure what your point is.
Any modern C compiler is basically humoring you, taking your code as a rough guideline of what to do, but reordering and inlining and unrolling and constant-folding, etc.
And then the CPU chip gets involved, and even the machine instructions aren’t the truth of what’s really going on in the hardware. Especially in x86, where the instruction set is just an interpreted language that gets heavily transformed into micro-ops you never see.
If you really want to feel like your C code tells the machine exactly what to do, consider getting one of those cute retro boards based on a Z-80 or 8086 and run some 1980s C compiler on it.
No need to lecture and patronize if you don’t get the point.
C was built around machine code, with literally every language construct derived from a subset of the latter and nothing else. It still remains true to that spirit. If you see a piece of C code, you can still make a reasonable guess as to what it roughly translates to, even if it’s unrolled, inlined or even trimmed. In comparison with other languages, where “a += b” or “x = y” may translate into pages of binary.
Do you understand the point?
C Is Not a Low-level Language
The post you’re replying to isn’t patronizing you, it’s telling the truth.
You are missing the point just the same.
It’s not that C generates the exact assembly you’d expect, it’s that there’s a cap on what it can generate from a given piece of code you are currently looking at. “x = y” is a memcpy at worst, and dereferencing a pointer does more or less just that. Not the case with C++, leave alone Go, D, etc.
I suggest reading an intro to compilers class textbook. Compilers do basic optimizations like liveness analysis, dead store elimination, etc. Just because you write down “x = y” doesn’t mean the compiler will respect it and keep the load/store in your binary.
I suggest trying to make a rudimentary effort to understand what others are saying before dispensing advice that implies they are dolts.
As someone who works on a C compiler for their day job and deals with customer support around this sort of thing, I can assure you this is not true.
See my reply to epilys.
Can you share an example of resulting code not being even roughly what one was expecting?
Some general observations. I don’t have specific examples handy and I’m not going to spend the time to conjure them up for what is already a thread that is too deep.
Optimizations around things like volatile regularly result in bug reports saying the compiler is broken. In nearly every case, the user expects all the code they wrote to be there in the order they wrote it, with all the function calls in the same order. This is rarely the case. The two halves of an x = y can be very far apart.
This is just from recent memory. There is really no end to it. I won’t get into the “this instruction sequence takes too many cycles” reports as those don’t seem to match your desired criteria.
Thanks. These are still more or less in the ballpark of what’s expected with optimizations on.
I’ve run into at least a couple of these, but I can remember only one case when it was critical and required switching optimizations off to get what we needed from the compiler (it had to do with handling very small packets in the NIC driver). Did kill a day on that though.
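A minimal, hypothetical example of the first kind of report mentioned above: code the user expects to survive as written, but which the optimizer may legally transform.
#include <stdbool.h>

bool done = false;          /* shared with an interrupt handler or another thread */

void wait_for_done(void)
{
    /* Without 'volatile' (or atomics), the compiler may hoist the load of
     * 'done' out of the loop and spin on a register forever, or even delete
     * the loop; the source order and number of loads is not preserved. */
    while (!done) {
        /* busy-wait */
    }
}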
Some initial thoughts, with the caveats that I quite like Rust and am put off by ddv:
Suspicion - the “new systems” niche seems filled pretty well by Rust and Zig IMO, and Hare would have to be pretty interesting to justify using it over one of those. As spacejam said, everyone’s initial question is “why this instead of Rust or Zig (or C)” - those comparisons should be front and center.
I’m confused about who the tutorial’s written for. It includes passages like this:
Which I can’t imagine is new or useful information to anyone reading this.
The tagged unions seem cool, especially if it’s possible to transparently pass a value of (a | b) to something expecting a (a | b | c). I wonder how that’s implemented?
The insert keyword seems like a strange choice. This means the syntax of the language relies on the presence of a single global dynamic allocator. C, Rust, and especially Zig all do a better job of limiting allocation to library functions, which makes those languages more suitable to embedded devices, contexts where allocation failure must be cleanly handled (like most C libraries), or contexts where multiple dynamic allocators are available (like the Linux kernel). I guess they did it this way because they don’t have user-accessible generics, so the only way to provide a function like this is compiler-level magic?
Edit: there’s a ? operator (same as Rust’s ? or Zig’s try) and a ! operator (same as Rust’s unwrap). ? is great, but I’m not a huge fan of the single-character ! operator here - a big advantage of Rust’s approach is that unwrap is much longer than ?, and sticks out in code review.
I dunno, Zig has catch unreachable, which sticks out way more than .unwrap(), but I just find it annoying to type that out dozens of times, especially in testing/experimental/throwaway code where I couldn’t care less about robustness.
Tagged union implementation is really simple. From https://harelang.org/tutorials/introduction/#tagged-unions-in-depth
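The quoted section didn’t survive in this thread, but the gist, sketched here in C as an assumption about the general layout rather than Hare’s exact scheme, is that each member type gets a unique 32-bit ID and a tagged-union value is just that ID followed by storage for the possible members:
#include <stdint.h>

/* Hypothetical layout sketch: a 32-bit type tag followed by the value storage.
 * Checking or narrowing the type is a comparison (or switch) on 'tag'. */
struct tagged {
    uint32_t tag;        /* unique ID of the member type */
    union {
        int     i;
        double  d;
        char   *s;
    } value;
};
Which lines up with the observation below that every value carries a 4-byte tag even when one byte would do.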
Ah, cool, that’s clever. I guess that means all union values must have a 4-byte tag, even if they only need one byte - but that’s probably fine in practice.
I’m always amazed by Drew DeVault’s breadth of projects: from window managers to microkernels, a web forge, and now a programming language with its compiler and libraries. While I find him sometimes too antagonistic/abrasive, this type of technical breadth is (to my knowledge) only surpassed by Fabrice Bellard.
Even other people whom I look up to, like Jason A. Donenfeld (aka /u/zx2c4 or “the wireguard guy”) and Andrew Gallant (aka /u/burntsushi or “the ripgrep guy”), only specialize in one field (respectively security and text processing).
I can only be in awe when I see this level of skill. I’m looking forward to playing around with his new language when I find some free time.
I am also amazed by his work, and don’t wish to disparage him in any way, but in his own words
I think that’s good to keep in mind, too.
Be careful with hero worship.
Where they were writing window managers, I was writing GPU drivers. We both worked on Minecraft servers. We both contributed various low-level experiments for DCPU-16. Where they were writing microkernels, I was reverse-engineering video-game consoles. While they were founding their small business for a Free Software forge, I was starting my small business for encrypted cloud storage. They’re working on Hare, which is basically their opinion on how to make an ergonomic C; I worked on Monte, which was basically our opinion of how to make a practical E.
And yet, I’m not a good person. Maybe they aren’t, either.
I ignored almost everything and went straight to the bit I have some expertise on: cryptography.
There is one thing I’d like to insist on, that I realised fairly late in my cryptographic career: naive key exchange is not high-level enough.
By “key exchange”, I mean the kind of key exchange that happens in NaCl’s crypto_box(): a key exchange proper followed by a hash, so the two parties have a shared key. Nowadays it’s not quite enough to just exchange Alice’s and Bob’s long-term keys; we also want stronger properties like forward secrecy and key compromise impersonation resistance. To do that, you need a full key exchange protocol involving 2 or 3 messages in most cases. I don’t know what Hare is actually using here (the key exchange link is not live), but if they don’t have it already, something like Noise would be a good addition some time in the future.
As someone who is rather new to languages like C (I only recently got into it by making a game with it), I have a few newbie questions:
Why do people want to replace C? Security reasons, or just old and outdated?
What does Hare offer over C? They say that Hare is simpler than C, but I don’t understand exactly how. Same with Zig. Do they compile to C in the end, and do these languages just make it easier for the user to write code?
That being said, I find it cool to see these languages popping up.
A few reasons off the top of my head:
#include <foo.h> includes all functions/constants into the current namespace, so you have no idea what module a function came from.
You can’t associate methods with types: you declare struct Foo and then have to name all the methods of that struct foo_do_stuff (instead of doing foo_var.do_stuff() like in other languages).
There are no generics: you fake them either with void* (which means no type checking) or with the macro system (which is a pain in the ass).
Cross-platform code is painful (an ifdef mess to call Windows-only functions if you’re compiling on Windows and the Linux versions otherwise…).
Cleanup and error handling without defer is a PITA and error-prone.
Integer types like long long, int, short, etc. have different bit widths on different arches/platforms. (Most C projects I know import stdint.h to get uint32_t and friends, or just have a typedef mess to use usize, u32, u16, etc.)
EDIT: As Forty-Bot noted, one of the biggest issues is null-terminated strings.
I could go on and on forever.
It fixes a lot of the issues I mentioned earlier, as well as reducing footguns and implementation-defined behavior in general. See my blog post for a list.
It’s simpler than C because it comes without all the cruft and compromises that C has built up over the past 50 years. Additionally, it’s easier to code in Hare because, well, the language isn’t trying to screw you up every 10 lines. :^)
Zig and Hare both occupy the same niche as C (i.e., low-level manual memory managed systems language); they both compile to machine code. And yes, they make it a lot easier to write code.
Thanks for the great reply, learned a lot! Gotta say I am way more interested in Hare and Zig now than I was before.
Hopefully they gain traction. :)
This and your later point about not being able to associate methods with struct definitions are variations on the same point, but it’s worth repeating: C has no mechanism for isolating namespaces. A C function is either static (confined to a single compilation unit) or completely global. Most shared library systems also give you a package-local form, but anything that you’re exporting goes in a single flat namespace. This is also true of type and macro definitions. This is terrible for software engineering. Two libraries can easily define different macros with the same name and break compilation units that want to use both.
C++, at least, gives you namespaces for everything except macros.
The lack of type checking is really important here. A systems programming language is used to implement the most critical bits of the system. Type checks are incredibly important here; casting everything via void* has been the source of vast numbers of security vulnerabilities in C codebases. C++ templates avoid this.
This is less of an issue for systems programming, where a large standard library is also a problem because it implies dependencies on large features in the environment. In an embedded system or a kernel, I don’t want a standard library with file I/O. Actually, for most cloud programming I’d like a standard library that doesn’t assume the existence of a local filesystem as well. A bigger problem is that the library is not modular and layered. Rust’s no_std is a good step in the right direction here.
From libc, most errors are not returned; they’re signalled via the return and then stored in a global (now a thread-local) variable called errno. Yay. Option types for returns are really important for maintainable systems programming. C++ now has std::optional and std::variant in the standard library; other languages have union types as first-class citizens.
defer isn’t great either because it doesn’t allow ownership transfer. You really need smart pointer types, and then you hit the limitations of the C type system again (see: no generics, above). C++ and Rust both have a type system that can express smart pointers.
Anonymous functions are only really useful if they can capture things from the surrounding environment. That is only really useful in a language without GC if you have a notion of owning pointers that can manage the capture. A language with smart pointers allows you to implement this; C does not.
True. I’m more saying that defer is the baseline here; without it you need cleanup: labels, gotos, and synchronized function returns. It can get ugly fast.
I disagree, depends on what you’re doing. I’m doing a roguelike in Zig right now, and I use anonymous functions quite extensively for item/weapon/armor/etc. triggers, i.e., where each game object has some unique anonymous functions tied to the object’s fields that can be called on certain events. Having closures would be nice, but honestly in this use-case I didn’t really feel much of a need for it.
Note that C does have “standard” answers to a lot of these.
The macro system is the #1 thing keeping C alive :)
Aside from macro stuff, the typical way to address this is to use a struct of function pointers. So you’d create a wrapper like
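The snippet itself isn’t reproduced above; a minimal sketch of the pattern (hypothetical names, not the poster’s original code) might look like this:
#include <stddef.h>

/* "Interface": a struct of function pointers plus a private-data pointer. */
struct stream_ops {
    size_t (*read)(void *priv, void *buf, size_t len);
    void   (*close)(void *priv);
};

struct stream {
    const struct stream_ops *ops;
    void *priv;                 /* implementation-specific state */
};

/* Wrappers ensure the implementation's own functions are always the ones called. */
static inline size_t stream_read(struct stream *s, void *buf, size_t len)
{
    return s->ops->read(s->priv, buf, len);
}

static inline void stream_close(struct stream *s)
{
    s->ops->close(s->priv);
}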
Note that typically there is a “base class” which either all “subclasses” include as a member (and use offsetof to recover the subclass) or have a void * private data pointer. This doesn’t really escape the problem, however in practice I’ve never run into a bug where the wrong struct/method gets combined. This is because the above pattern ensures that the correct method gets called.
Well, there’s always errno… And if you control the address space you can always use the upper few addresses for error codes. That said, better syntax for multiple return values would probably go a long way.
IIRC gcc has them, but they require executable stacks :)
Agree. I think you can do this with GCC extensions, but some sugar here would be nice.
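The “upper few addresses for error codes” idea mentioned a couple of comments up is roughly what the Linux kernel does with its ERR_PTR/IS_ERR macros; a simplified sketch (lowercase names are mine, not the kernel’s):
/* Reserve the top 4095 address values for encoding a negative errno in a pointer. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long err)          /* err is negative, e.g. -12 for ENOMEM */
{
    return (void *)err;
}

static inline int is_err(const void *ptr)
{
    return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

static inline long ptr_err(const void *ptr)
{
    return (long)ptr;
}
A function can then return either a valid pointer or an encoded error in the same word, at the cost of assuming that address range is never mapped.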
Arguably there should be fixed width types, size_t, intptr_t, and regsize_t. Unfortunately, C lacks the last one, which is typically assumed to be long. Rust, for example, gets this even more wrong and lacks the last two (c.f. the recent post on 129-bit pointers).
IMO you missed the most important part, which is that C strings are (by-and-large) nul-terminated. Having better syntax for carrying a length around with a pointer would go a long way to making string support better.
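For that last point, the usual fix is a “fat pointer”: carry the length alongside the data instead of relying on a terminating NUL. A minimal sketch (hypothetical type, not Hare’s actual string representation):
#include <stddef.h>
#include <string.h>

/* A string that knows its own length; no strlen() scans, embedded NULs allowed. */
struct str {
    char   *data;
    size_t  len;
};

static inline struct str str_from_cstr(char *s)
{
    struct str out = { s, strlen(s) };
    return out;
}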
Even in C’s domain, where C lacks nothing and is fine for what it is, I would criticize C for maybe 5 things, which I would consider the real criticism:
Integer promotion (everything becoming int) and comparison between signed and unsigned yielding unsigned; it should be the opposite: there should be a nat type (for natural numbers, effectively size_t), and comparison between signed and unsigned should yield signed.
char is signed. Nobody likes negative code points.
I’m wary of this same tired argument cropping up again, so I’ll just state it this way: I disagree. Code that invokes undefined behavior is already broken; changing compiler can’t (except perhaps in very particular circumstances, which I don’t think you were referring to) introduce undefined behaviour; it can change the observable behaviour when UB is invoked.
A compiler can’t “disobey the source code” whilst conforming to the language standard. If the source code does something that doesn’t have defined semantics, that’s on the source code, not the compiler.
“It’s easy to accidentally invoke undefined behaviour in C” is a valid criticism, but “C compilers breaks code” is not.
You certainly can in some instances. But sure, for example, if some piece of code dereferences a pointer and the value is set somewhere else, it could be undefined or not depending on whether the pointer is valid at the point it is dereferenced. So code might be “not broken” given certain constraints (eg that the pointer is valid), but not work properly if those constraints are violated, just like code in any language (although in C there’s a good chance the end result is UB, which is potentially more catastrophic).
I’m not saying C is a good language, just that I think this particular criticism is unfair. (Also I think your point 5 is wrong, char can be unsigned, it’s up to the implementation).
Thing is, it certainly feels like the compiler is disobeying the source code. Signed integer overflow? No problem pal, this is x86, that platform will wrap around just fine! Right? Riiight? Oops, nope, and since the compiler pretends UB does not exist, it just deleted a security check that it deemed “dead code”, and now my hard drive has been encrypted by a ransomware that just exploited my vulnerability.
Though I agree with all the facts you laid out, and with the interpretation that UB means the program is already broken even if the generated binary didn’t propagate the error. But Chandler Carruth pretending that UB does not invoke the nasal demons is not far. Let’s not forget that UB means the compiler is allowed to cause your entire hard drive to be formatted, as ridiculous as it may sound. And sometimes it actually happens (as it did so many times with buffer overflow exploits).
Sure, it’s not like the compiler is actually disobeying your source code. But since UB means “all bets are off”, and UB is not always easy to catch, the result is pretty close.
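The deleted-check scenario described above, plus one way to write the check so it stays defined (using the GCC/Clang overflow builtins), looks roughly like this sketch:
#include <stdbool.h>

/* UB version: if 'a + b' overflows, the expression is undefined, and an optimizer
 * that assumes signed overflow never happens may drop the check entirely. */
bool fits_bad(int a, int b)
{
    return a + b >= a;   /* meant as an overflow check for non-negative b, but relies on UB */
}

/* Defined version: __builtin_add_overflow reports overflow without invoking UB. */
bool fits_good(int a, int b, int *sum)
{
    return !__builtin_add_overflow(a, b, sum);
}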
I feel like “disobeying the code” and “not doing what I intended it to do due to the code being wrong” are still two sufficiently different things that it’s worth distinguishing.
Okay, it is worth distinguishing.
But it is also worth noting that C is quite special. This UB business repeatedly violates the principle of least astonishement. Especially the modern interpretation, where compilers systematically assume UB does not exist and any code path that hits UB is considered “dead code”.
The original intent of UB was much closer to implementation defined behaviour. Signed integer overflow was originally UB because some platforms crashed or otherwise went bananas when it occurred. But the expectation was that on platforms that behave reasonably (like x86, that wraps around), we’d get the reasonable behaviour. But then compiler writers (or should I say their lawyers) noticed that strictly speaking, the standard didn’t made that expectation explicit, and in the name of optimisation started to invoke nasal demons even on platforms that could have done the right thing.
Sure the code is wrong. In many cases though, the standard is also wrong.
I agree with some things but not others that you say, but these arguments have been hashed out many times before.
That’s the point I was making. Since we agree on that, and we agree that there are valid criticisms of C as a language (though we may differ on the specifics of those), let’s leave the rest. Peace.
But why not have the compiler reject the code instead of silently compiling it wrong?
It doesn’t compile it wrong. Code with no semantics can’t be compiled incorrectly. You’re making the exact same misrepresentation as in the post above that I responded to originally.
Code with no semantics shouldn’t be able to be compiled at all.
I’d almost agree, though I can think of some cases where such code could exist for a reason (and I’ll bet that such code exists in real code bases). In particular, hairy macro expansions etc which produce code that isn’t even executed (or won’t be executed in the case where it would be UB, at least) in order to make compile-time type-safety checks. IIRC there are a few such things used in the Linux kernel. There are probably plenty of other cases; there’s a lot of C code out there.
In practice though, a lot of code that potentially exhibits UB only does so if certain constraints are violated (eg if a pointer is invalid, or if an integer is too large and will result in overflow at some operation), and the compiler can’t always tell that the constraints necessarily will be violated, so it generates code with the assumption that if the code is executed, then the constraints do hold. So if the larger body of code is wrong - the constraints are violated, that is - the behaviour is undefined.
That’s why it’s good to have a proper macro system that isn’t literally just find and replace.
True, and I’m mostly talking about UB that can be detected at compile time, such as f(++x, ++x).
.Contrary to what people are saying, C is just fine for what it is.
People complain about the std library being tiny, but you basically have the operating system at your fingertips, where C is a first-class citizen.
Then people complain C is not safe; yes, that’s true, but with a set of best practices you can keep things under control.
People complain you don’t have generics; you don’t need them most of the time.
Projects like nginx, SQLite and Redis, not to speak of the Nix world, prove that C is a perfectly fine language. Also, most of the popular Python libraries nowadays are written in C.
Hi! I’d like to introduce you to Fish in a Barrel, a bot which publishes information about security vulnerabilities to Twitter, including statistics on how many of those vulnerabilities are due to memory unsafety. In general, memory unsafety is easy to avoid in languages which do not permit memory-unsafe operations, and nearly impossible to avoid in other languages. Because C is in the latter set, C is a regular and reliable source of security vulnerabilities.
I understand your position; you believe that people are morally obligated to choose “a set of best practices” which limits usage of languages like C to supposedly-safe subsets. However, there are not many interesting subsets of C; at best, avoiding pointer arithmetic and casts is good, but little can be done about the inherent dangers of malloc() and free() (and free() and free() and …). Moreover, why not consider the act of choosing a language to be a practice? Then the choice of C can itself be critiqued as contrary to best practices.
nginx is well-written, but Redis is not. SQLite is not written just in C, but also in several other languages combined, including SQL and TH1 (“test harness one”); this latter language is specifically for testing that SQLite behaves properly. All three have had memory-unsafety bugs. This suggests that even well-written C, or C in combination with other languages, is unsafe.
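As a tiny illustration of how little the language itself pushes back (my own sketch, not taken from any of these codebases):

```c
#include <stdlib.h>

int main(void) {
    char *buf = malloc(16);
    if (buf == NULL)
        return 1;
    free(buf);
    free(buf);   /* double free: compiles without a diagnostic, UB at runtime */
    return 0;
}
```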
Additionally, Nix is written in C++ and package definitions are written in shell. I prefer PyPy to CPython; both are written in a combination of C and Python, with CPython using more C and PyPy using more Python. I’m not sure where you were headed here; this sounds like a popularity-contest argument, but those are not meaningful in discussions about technical issues. Nonetheless, if it’s the only thing that motivates you, then consider this quote from the Google Chrome security team:
I am curious about your claim that Redis is not well-written? I’ve seen other folks online hold it up as an example of a well-written C codebase, at least in terms of readability.
I understand that readable is not the same as secure, but would like to understand where you are coming from on this.
It’s 100% personal opinion.
Ah yes, you can see the safety of high-quality C in practice:
https://nginx.org/en/security_advisories.html https://www.cvedetails.com/vulnerability-list/vendor_id-18560/product_id-47087/Redislabs-Redis.html
Including some fun RCEs, like CVE-2014-0133 or CVE-2016-8339.
I also believe C will still have a place for long time. I know I’m a newbie with it, but making a game with C (using Raylib) has been pretty fun. It’s simple and to the point… And I don’t mind making mistakes really, that’s how I learn the best.
But again it’s cool to see people creating new languages as alternatives.
Here’s a list of ways that Drew says Hare improves over C:
It’s worth reading the whole piece. I only pasted his summary.
This is a deeply philosophically incoherent statement:
First of all, I can only assume this is a list of differentiators, because there are many more implied principles which aren’t mentioned here; that familiar syntax is better, that *nix are the most important OSes, that modularity is important, etc. These are not just axioms for a coming argument; these are the raison d’etre of Hare.
Proceeding on that assumption, we immediately have a problem. The first point makes no sense as a differentiator. There is no language in the world that is built on the premise “don’t trust the programmer” (or, perhaps, no industrial language; an esolang like that would be fascinating). This strikes me as a shot at garbage-collected languages, or perhaps at Rust; the latter being a bit more likely, as the two languages compete in the same space. Either way, it is a rather, ah, hare-brained criticism, since in most languages like this, and certainly in Rust, there are facilities for bypassing the GC (or borrow checker) and doing whatever the heck you want.
The second point is fine, but in combination with the first, it seems… I don’t know, petty? “We’re good enough at programming that we don’t need any compiler telling us what to do, but we recognize that some of you aren’t, so we’re giving you some memory safety tools.” This attitude is reinforced in “Hare’s Advances Compared to C”:
This is absolutely true and I agree with it (though “veneration” is a bit much - what are you, a Confucian?), but when placed in the context of other things DeVault has written, such as in “Rust is not a good C replacement”, it reads like a double standard.
Here, “these problems” are, among others, a lack of portability compared to C, lack of a spec, and a lack of competing implementations. To its credit, Hare has a spec, but I don’t see competing implementations, and it doesn’t even support Windows, let alone the 90s RISC architectures people are complaining about missing in LLVM/Rust. Why is that okay in Hare but not in Rust? It’s almost enough to make one think that DDV just doesn’t like Rust because it opens systems programming up to people he doesn’t deem worthy.
Beyond this, point 4 is just a straight-up shot at programming languages with advanced type systems. “I’m smart enough to write correct code without having the compiler check my work. Why aren’t you?” This is ridiculous; nobody can write code as perfectly and consistently as a computer can check mathematically encoded invariants, and while there is a very important conversation to be had about what the right tradeoff is between flexibility and powerful static analysis, that is not the conversation being had here.
This language is interesting and I’m excited to see it progress, but I sincerely hope this is the end of DDV’s bashing of all languages which are not strictly better than C in every way, and that the elitism on display is toned down somewhat in later revisions of the meta-information around the language.
EDIT: I should talk about why I’m interested. It looks like an expression-oriented C with required initialization, slices, and slicey UTF-8 strings (and UCS-32 types!). A modern C, in other words. That’s valuable, if it can gain traction.
i’ve known about this language for awhile, and it has been a lot of fun to toy around with. i’m excited to see where the project goes.
my only complaint so far (fwiw, i know very little about programming in general) is the seemingly gratuitous use of semi-colons. here’s a snippet that displays what i mean:
to me, this represents the compiler yelling at me about five thousand times before i finally get it right. does anyone know why this might be, or have any insight into whether this syntax is here to stay?
All semicolons are too many semicolons as far as I’m concerned here. The ergonomics alone are a hard pass for me.
I think that’s because Hare strives to have a context-free grammar. Kinda like how Lua disallows lone return statements…?
I’m not related to Hare in any way, but in my experience, the formatting and style stuff like this tends to be laid down in stone right towards the very beginning. Sure enough: https://harelang.org/style/
Does it have more ; than it really needs? Maybe, but it’s the way they want it, so :)
Zig might be more in your style wheelhouse: https://ziglang.org/documentation/master/#Style-Guide
If that’s the official style of Hare, then Drew should enforce it through the compiler—don’t be like Rob Pike and chicken out of enforcing it.
Drew is known for not having a problem stating his opinions and enforcing them, I imagine it’s either still a work in progress or they just haven’t gotten around to it yet.
Random example: sr.ht doesn’t have a www DNS entry, so www.sr.ht doesn’t work at all.
It’s sort of crazy how influenced this is by Go.
Go is in many aspects a better C, but the runtime makes it not a C replacement.
On a related note, I suspect the name “Hare” and the mascot are a nod to Plan 9’s Glenda mascot.
I see it has defer. I would instead be tempted by explicit destructors (aka “higher RAII” or whatever Vale calls it). Defer-style cleanup kind of has the wrong default: it works well for classic resources that the user expects, like memory, files and mutexes, but if you’re making an API around some other resource, it is nice to be able to give the user an object that they can’t simply forget to do something about. If they want to ignore it, they just make a function that consumes it – destructors in this sense are just normal functions.
There is at least one person looking at ways to make forgetting to call a cleanup function a compile error.
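Not Hare, and not Vale’s “higher RAII” either, but here is a rough sketch of the idea in C using a GCC/Clang extension (all names below are made up): marking the producer warn_unused_result at least turns “forgot to hand the object to a consumer” into a compiler warning, where a real linear type system could make it a hard error.

```c
#include <stdio.h>

struct log_handle { FILE *f; };

/* The producer: callers who drop the returned handle on the floor get
 * a -Wunused-result warning. */
__attribute__((warn_unused_result))
static struct log_handle log_open(const char *path) {
    struct log_handle h = { fopen(path, "a") };
    return h;
}

/* The explicit "destructor": just an ordinary function that consumes the handle. */
static void log_close(struct log_handle h) {
    if (h.f != NULL)
        fclose(h.f);
}

int main(void) {
    struct log_handle h = log_open("example.log");
    log_close(h);
    /* log_open("other.log");   <- unused result: a warning, though not an error */
    return 0;
}
```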
You mean to have something like:
?
No, not if you have to remember to call it. Not a special function name either. Forget defer (or maybe keep it for convenience’s sake). Think destructors, except that the calls are explicit – it’s an error to let such an object go out of scope alive.
I find the interpretation of trust in a language context to be a really interesting division. For example, looking at Hare’s first two design principles:
Are generics not implemented because we don’t trust the programmer not to make complex code? I could use the above principles to justify Rust and Haskell levels of compiler logic too.
I’m not saying Hare’s interpretation is wrong and I quite like what it’s trying to do, I just find that those two principles can have wildly varying interpretations.
I read it like:
I am spoiled. I like The Zen of Python as a list of language design goals in an ordered format. I look at Hare and see no strong statement.
The space of languages that take C and improve on its dev experience without going fully different (e.g., C++ and Rust) is getting crowded lately. I can think of Zig, Odin, and V, which are recent languages taking slightly different approaches to C += ε.
Folks interested in this design space may enjoy reading about the design rationale for scsh (A Scheme Shell): https://web.archive.org/web/20081010222846/http://www.scsh.net/docu/scsh-paper/scsh-paper-Z-H-4.html
I’m not sure this is exactly the same design space though unless you define “systems programming” as scripting systems together. In my mind at least, systems programming is more low-level and includes things like kernels, device drivers and low power devices. See also the insistence on not shipping a runtime.
@ddevault has used Go and there are many similarities. But I find some of the differences interesting:
I wonder if it would be more straightforward to translate C programmatically to Hare than to Go or Rust. If one could, with relative ease, improve the safety of existing programs, it could be a big win.