Ooh, memories :-D
Back when I was in high school, I was the guru on writing high-performance TI-basic. There were a lot of arcane and/or vaguely documented tricks that I painstakingly discovered in the process of trying to write a 3D renderer; by the end, I could get a poly every ~20ms or so. (Obviously, this could be done much faster in assembly, but I was limited in my time in front of a PC, but not my time in front of a calculator).
The big ones I remember are:
2*(3+4)
became something like 20% faster if you left off the close parenAns
was almost an order of magnitude faster than any other variable. Presumably Ans was in a fixed memory location, while other variables needed to walk the object directory.I had a hand-written document that I photocopied and handed out to the other calculator nerds with these and another ~15 tricks, but it doesn’t seem that it ever made it online. I think it got left behind when I was escaping from a hostile living situation a couple years ago, but I really don’t know.
At one point I made a solitaire game (complete with fisher-yates shuffle and graphics) as part of a graphical shell, again entirely in Basic. Those programs got so big that I needed to introduce subroutines, which TI Basic lacked; IIRC, you couldn’t even call another program without stopping the program you were running. I ended up implementing that by starting each of the programs with something that checked whether Ans had a magic value in it; if so, it would jump to an appropriate label depending on the “call number” in some variable. I forget how I handled returns, but I suspect that I just used that variable as a link register and let each program have its own set of subroutine libraries.
Sadly, all this happened almost 20 years ago and the only thing that remains is a very cringy thing[1] that “encrypts” strings by encoding them as numbers in a matrix and multiplying that by a key. “Very secure” indeed :-/
[1] https://www.ticalc.org/archives/files/fileinfo/252/25214.html
Conditions are a largely forgotten language feature that solves what I see as the fundamental problem with exceptions.
Fundamentally, exceptions solve the problem that the best place to handle an unusal state is often somewhere up the call stack where that unusual state is identified. Conditions, on the other hand, solve the problem that the best place to handle that unusual state is often down the call stack from the best place to decide how it should be handled.
(FWIW, it’s possible to implement conditions as a library feature in any language with both first-class functions and the ability to arbitrarily unwind the stack; e.g., with exceptions. I’ve done it in Java and Python, though I’ve never managed to figure out how to make the interface remotely acceptable)
I mostly design mounting brackets and panels using OpenSCAD, and I’ve frequently run into the problem that a given piece of geometry can’t be described with just an additive or a subtractive operation. Consider, for example, adding a keystone jack to a panel. You need a protruding part for the keystone jack to click into, but you also need an appropriately-sized hole in the panel for the jack to fit into. (Further, that hole isn’t just a simple cube because keystone jacks rotate as they’re installed).
The pattern that I’ve found is that each type of slot is defined by a module with a “mode” parameter. When that parameter is negative, the module produces the subtractive parts of the geometry; positive produces the additive parts of the geometry, and zero produces a mockup of the part that will go into the slot.
Then, the top-level geometry has the pattern:
module slots(mode) {
// insert all of the connectors here
keystone(mode);
translate([60, 0, 0]) d_mount(mode);
}
difference() {
union() {
// The basic panel shape, centered on the Y axis
translate([0, -15, 0])
cube([200, 30, 5]);
slots(1);
}
slots(-1);
}
I saw this and felt that there were some missed optimization opportunities. So, of course, I had to go in and try them out.
As a baseline, my machine averages 426ns/iteration for the encode function. (I’m ignoring decoding, because, as we’ll see, this is just a special case of encoding)
First, we have some simple Rust-level optimizations: if we pass the blocks into the hamming code functions by value instead of by reference, Rust’s optimizer can keep them in a register instead of writing them back to memory. Further, x86_64 calling conventions leave the first few 64-bit integer arguments to a function in a register, so we can avoid three memory accesses. I’d imagine that a Sufficiently Smart Compiler™ might optimize away the dereference, but based on my benchmarks, Rust isn’t doing that. This alone brings encode down to 364ns/iter, a 16% improvement.
And now we have some algorithmic changes we can do. I make two changes here: First, we spend a lot of time shuffling bits around to put the parity bits at power-of-two bit positions. However, we could just as easily put them in the currently-unused high bits and save nearly all of the shuffling, at the cost of having to keep separate track of “physical” and “virtual” bit numbers during the parity calculation. Second, we can compute all of the parity bits at the same time by simply XORing the virtual bit number of each set bit together. (We also compute the word parity at the same time by always setting the low bit in that virtual bit number):
const DATA_BITS: u64 = 57;
const DATA_MASK: u64 = (1 << 57) - 1;
pub fn encode(mut block: u64) -> u64 {
// We put the parity bits at the top for performance reasons
return (block & DATA_MASK) |
((full_parity(block) as u64) << DATA_BITS);
}
pub fn full_parity(code: u64) -> u8 {
// Bits 0, 1, and 2 of the putative check word are parity bits, so the first bit is logically bit 3
let mut dv = 3;
let mut check = 0;
for i in 0..(DATA_BITS as u8) {
let mut virt_bit = i + dv;
if virt_bit & (virt_bit - 1) == 0 {
virt_bit += 1;
dv += 1;
}
check ^= if code & (1 << i) != 0 { (virt_bit << 1) | 1 } else { 0 };
}
return check;
}
This brings us down to 139.31ns/iter, a 62% improvement. However, we’re not even remotely done yet.
What if we could compute the parity values for multiple bits at once? We’re doing all of these computations in a 64-bit word, but we’re only using 7 bits of it. This seems wasteful, and in fact it is. Observe that for any given bit, the check value for that bit will either be 0 or a bit-specific value. We can then compute the parity bits for the first bit in each byte at the same time, then the second bit in each byte, and so on.
We do this by defining a constant for each bit position:
const PARITY_WIDE: [u64; 8] = [
0x7f6f5f4f3d2d1b07,
0x017161513f2f1d0b,
0x0373635343311f0d,
0x057565554533230f,
0x0977675747352513,
0x1179695949372715,
0x217b6b5b4b392917,
0x417d6d5d4d3b2b19,
];
Then, we mask out the check values that should be included and merge it into our check code:
let mut check = 0;
for i in 0..8 {
let bitset = 0x01010101_01010101 & (code >> i);
let code_part = u64::wrapping_sub(bitset << 8, bitset) & PARITY_WIDE[i];
check ^= code_part;
}
Finally, we combine the 8 check codes in our check word horizontally, the same way that the fast parity function worked:
check ^= check >> 32;
check ^= check >> 16;
check ^= check >> 8;
return (check & 0xFF) as u8;
Using this new full_parity function, our time goes down to 8.5ns/iter, a 93% improvement!
(As an aside, how is this so much faster? Normally, you’d expect that because the loop runs 1/8 of the number of times, it would be a 87.5% improvement. However, we no longer have an unpredictable branch in the inner loop, and that saves a lot of time)
I’m sure there are more tricks that could be used to eke out a bit more performance, but I’m happy with where this stands now :-D
If you’d like to play with my code, I’ve pushed it here: https://github.com/thequux/hamming-code
I have a PCEngines apu4c4 running NixOS as my router, using a custom zone-based firewall script, a pile of OpenVPN tunnels that share routing information using a combination of BGP and OSPF. Within my apartment, I host wifi using Ubiquiti gear (an AC Lite and a nanoHD) with a locally-hosted controller. I see no need to use their cloud offering, and given that my modus operandi is to host everything I can myself, I find it unlikely that I ever will.
For switches, I’ve been using a Netgear GS324T and GS108E depending on how many ports I need; they’re not the most full-featured things I’ve used, but they’re cheap and good enough.
My NAS lives on the other end of one of the VPN tunnels for noise reasons; it’s a used HP DL380e Gen8 server that I’ve packed to the gills with RAM. It’s also running NixOS, and has enough horsepower to handle running a pile of VMs for miscellaneous vendor appliances as well as some publicly accessible services (Matrix, Plex, etc).
All told, it’s simple, reliable, and I can have a full backup of everything necessary to bring back a clean slate on a single floppy disk. If I were doing it again, the only change I might consider is using a custom-built mini PC instead of the APU; it’s not that there’s any problem with the APU, but running nixos-rebuild takes just long enough to be irritating.
I got the impression that the questionnaire is making the assumption that touch typing is better and/or faster than all alternative methods of typing.
It might be, if you’re transcribing or writing, but I find that thinking/coding is not constrained by my typing speed or correction rate (~35 wpm) and my half-learned touch-typing.
I touch type but don’t care about the home row. I just hit the keys I want to without sight being necessary.
Not needing to use your eyes to type helps
The questionnaire defines touch typing as “using all the fingers and thumbs to type without looking at the keyboard.” By this definition, I touch type, because I use all of my fingers and thumbs to type, and I don’t look at the keyboard, even though my resting position bears no resemblance to the standard “home row” technique that’s usually taught. Rather, my fingers simply know where they need to be and I hit each key with whatever finger happens to be closest at the time. My accuracy isn’t brilliant, but I can sustain 60wpm corrected, and at that point, I don’t find myself limited by typing speed.
I didn’t mean to give that impression.
As @Vaelatern and @thequux point out not looking at the keyboard when typing is generally more efficient. From the research I’ve found, the evidence seems to point to the minimising finger movements and setting keys so that common letter clusters can be typed by different hands seems to improve speed, accuracy and comfort. The current default staggered QWERTY layout - does not allow natural hand placement, thus increasing the risks of fatigue and injury. The default key size and spacing was design for less than 6.1% of the worlds population, I’m hoping in the near future everyone will be able to get a keyboard that is a unique fit to their bodies, and thus a pleasure to use :~)
MIPS is everywhere, still. Including in network gear, wireless, IoT, and other embedded applications.
This. While it seems to me that most high-end network gear is slowly migrating towards ARM, MIPS keeps turning up in odd places. I recently dug around in the weird world of handheld video game consoles designed to run emulators, and found this spreadsheet compiled by the fine folks here. I was surprised to see a relatively large number of CPU’s with “XBurst” architecture, which MIPS32 plus some DSP extensions.
I have a friend who recently got an internship at a company to help optimize their AS/400-based database infrastructure, and it looks like the current IBM systems are still backwards-compatible with S/390 programs. So while you might not see s390 much it’s probably not going away quickly.
I believe Alpha, PA-RISC and IA-64 are officially deprecated these days, so nobody is making new ones and nobody seems to want to. To my surprise, it appears that people are still manufacturing SPARC hardware though.
it looks like the current IBM systems are still backwards-compatible with S/390 programs
My understanding is that IBM Z stuff today is extremely compatible with System/360 programs from the mid-’60s.
So while you might not see s390 much it’s probably not going away quickly.
For legacy applications on MVS and friends, yeah, but IBM basically killed 31-bit Linux.
To my surprise, it appears that people are still manufacturing SPARC hardware though.
There’s still a market for legacy Solaris systems.
How frequently are these legacy Solaris systems updated? How frequently are IBM Z systems updated? I heard (might be unsubstantiated) that some mainframes still run 20 year old Perl, even though the OS gets updates.
Depends how much they care; if they do, they’ll keep their ancient application onto newer Solaris on newer hardware (i.e M8).
The 20-year-old-Perl makes me think you’re talking USS on z/OS (aka MVS); that’s a world I know very little of.
IBM i (née AS/400) is all on PowerPC these days. It’s a very different system from s390/mainframe/zOS
Oh, this is very handy! I’ll be switching to the same ISP within a few weeks, and a working reference guide for how to get NixOS (also on a PCEngines APU) will make it easy.
If you’re running a complex routing setup or dealing with multiple VLANs, you might want to look into my zone-based-firewall script. While I still haven’t figured out a good interface for adding port forwarding rules, I’ve been using this code for the last year and it works very nicely.
For my current setup (VyOS + ansible to generate the rules) I just have a set of “custom rules” that don’t easily fit into any abstraction easily. I tried coming up with a good abstraction for this, but it’s so rare for my setup that I didn’t bother…
Part of the the zone firewall is a low-level way to add rules to an nftables firewall from different places, by adding to networking.nftables.tables.<table>.chains.<chain>
. It was important to me to allow for different abstractions or even coexisting abstractions
I’m still working on getting my blog together for it, but I’m doing all of the problems in COBOL on IBM i as a way to learn both COBOL and IBM i. It’s been relatively straightforward so far, but I expect things to get much trickier soon.
Once I get it published, the blog will be available at https://thequux.github.io
That sounds really cool. I hope you manage to get it published. And even if it gets too tricky to keep going, saying why would be interesting.
If you’re not already planning to say so, I’d be interested to hear about the setup that lets you do this, too.
It’s published, along with the writeups for days 1 and 2.
And yes, I do plan a post in a day or two that’s more on how I actually got my hands on one of these mythical beasts (and you can too!)
I hate how all of these threads devolve into a Plan 9 hagiography. No one ever brings up the fact it encourages strings in the most string hostile language and encourages wasteful de/serialization from/into strings, which is hard to make secure or fast, on top of being a poor abstraction.
I can speak for interesting ideas from platforms I’ve used:
$ git
add file.c
commit -m "updated"
push origin master
$
That’s how it is in VMS for essentially any complex command.
A help system worth a damn, so people don’t have to use Stack Overflow as their first line of defense. DCL on VMS has easy to read documentation with examples and drilldown into arguments, instead of infodumping you pages of roff. IBM i has context sensitive help - press F1 over any element on screen (including arguments, keybindings, etc.) and get an explanation of it.
Single-level storage. This pops up in a lot of research/non-Unix systems, but there’s two different implementations; one is the Multics/Domain like system of having files/segments worked with in memory at ephemeral locations (basically like Unix, but all files are mmaped), and single-level storage where the system has a single address space with objects in it having fixed addresses, even across reboots. IBM i is the latter, and probably the most successful example of such a system. You stop thinking of files and things in memory as separate and realize paging breaks the barrier down between them. A pointer to a file and the file itself are the same thing.
To actually make this secure, IBM i actually is a capability system - maybe not quite object capabilities, but you have capabilities that only the kernel can provide you and you can’t forge them. Tagged memory is used to implement this - another example of object-capabilities with tagged memory would be BiiN.
Programs aren’t native code, but stored as machine-neutral bytecode. This allows binary backwards compatibility going back to 1980 on IBM i, but it has precedent with things like Mesa. It also allows for the kernel to reoptimize programs and enforce (the trusted translator is how the capability system is enforced) and improve security after compilation.
You can pass pointers, integers, and floats to programs. Because IBM i has a single address space, buffers from other programs are valid. You don’t need environment variables or temporary files as much. The lines between a program and function call are blurred.
Virtualization as a first-class citizen. VM invented virtualization and blurs the line between virtual machine and process. It’s multi-user, but users get a VM running a single-user operating system. The IPC between VMs/processes (same thing on VM) are things like connecting virtual card decks/punch readers between them. Ever booted Linux from virtual punch cards?
Programs aren’t native code, but stored as machine-neutral bytecode.
This is quite tricky. C code is not architecture neutral after it’s run through the preprocessor, let alone after early bits of the compile pipeline. WebAssembly has recognised this by baking the pointer size into the bytecode and, as a side effect, backing the notion that pointers are integers into the abstract machine. With CHERI, we make pointers into a 128-bit type that is distinct from integers and enforced in hardware. Nothing in a bytecode that allows pointer arithmetic and bakes in a pointer size can take advantage of this.
In the bytecode for i programs (MI), pointers are 128-bit too, and pack a lot of metadata in there. They’re not integers either. The in-kernel translator crunches them down to the machine length. (That way, it went from a 48-bit pointer world to a 64-bit pointer world.)
Realizing this isn’t your point, but I recommend gitsh by George Brocklehurst, if you want to run git as a subshell. Saves me several milliseconds of typing git
each day!
Neat! Glad to know people are independently discovering DCL from first principles every day. (I don’t mean that as backhanded sarcasm either.)
I don’t know if guessing commands is something people should be doing, let alone the operating system supporting
Why not? If the command space is sensibly laid out like in IBM i, where every create command behaves the same way, every “work with” command behaves the same way, the individual commands stop being distinct commands and instead become specializations of a single command. Thus, even though “work with members”, “work with objects”, “work with libraries”, and “work with line descriptions” are implemented as separate commands, they are essentially a single “work with ” command. The result is that instead of having to remember m*n commands like on Unix/Windows/etc, you can remember m actions and n types of thing, and automatically know how to submit your request to the system.
That, and when you do type a command you guessed but aren’t familar with, press F4 for a form of its arguments, and press F1 (help) or F4 (prompt) over each argument you’re unsure about.
I guess I would have stopped once it needed 32-bit inodes, and just created a mount with 32-bit inodes for it to use. Great hack.
I’d reach for a 32-bit Linux container or VM. After briefly googling, it looks like Docker’s ok with 32-bit containers on 64-bit hosts.
I don’t think a container would have helped, though; stat64 existed even on 32-bit Linux and there’s no difference between making a syscall from within a container vs. outside of a container. The underlying filesystem is the only thing that’s relevant here, and you don’t need a full VM just to have your files on a different filesystem.
I wonder if anybody has looked into whether PowerPC’s minimally-documented “tags active” mode can be used to implement return guards. It would involve a single additional instruction in the function prologue to set the tag bit (and likewise in the epilogue to check the tag bit), and for various reasons you’d lose the top 16 bits of your address space, but in return you’d have an effectively unforgeable return address.
I’ve been thinking about something similar for a while now. Working on it—slowly, very slowly, maybe two decades will pass and it’ll still be vapourware;
There is a single ‘blessed’ application runtime for userspace. It is managed and safe. (In the tradition of java, javascript, lua, c#, etc.) This is necessary for some of the later points.
As gary bernhardt points out, this can be as fast as or even faster than running native code directly.
Since everything is running in ring 0, not only are ‘syscalls’ free, but so is task switching.
All objects are opaque. That is:
All objects are transparently synced to disc.
All objects may transparently be shared between threads (modulo permissioning; see below)
All objects may transparently originate in a remote host (in which case changes are not necessarily synced to disc, but are synced to the remote; a la nfs)
Instead of a file system, there is a ‘root’ object, shared between all processes. Its form is arbitrary.
Every ‘thread’ runs in a security domain which is, more or less, a set of permissions (read/write/execute) for every object.
A thread can shed permissions at will, and it can spawn a new thread which has fewer permissions than itself, but never gain permissions. There is no setuid escape hatch.
However, a thread with few permissions can still send messages to a thread with more permissions.
All threads are freely introspectible. They are just objects, associated with a procdir-like object in which all their unshared data reside.
pst (since I’m the guy who has to point these out to everyone each time):
IBM i has a lot of these (object persistence, capabilities, high-level runtime only; older systems didn’t even have unprivileged mode on the CPU), but not all.
Domain has network paging (since that’s how it does storage architecture), but not most of the others.
Phantom does persistence for basically pausable/restartable computation. Weird, but interestingly adjacent.
I need to write a blog post about this!
Interesting! Never encountered that.
Wiki says it’s proprietary and only works with ppc. Is there any way to play with it without shelling out impressive amounts of $$$ to IBM?
If you want your own hardware, you can buy a used IBM Power server for an amount on the order of a few hundred dollars and installation media for a trial is available direct from IBM. While that’ll only work for 70 days before you need to reinstall, back up and restore procedures are fairly straightforward.
If you don’t care about owning the hardware, there’s a public server with free access at https://pub400.com/.
Whichever route you take, you’ll probably want to join ##ibmi on Freenode because you’ll have a lot of questions as you’re getting started.
If you want it to run IBM i, you’re going to need to read a lot of documentation to figure out what to buy, because it’s all proprietary and licensed, and IBM has exactly 0 interest in officially licensing stuff for hobbyists. It also requires special firmware support, and will therefore not run on a Raptor system.
I think the current advice is to aim for a Power 5, 6, or 7 server, because they have a good balance of cost, not needing a ton of specialized stuff to configure, and having licenses fixed to the server. (With older machines, you really want to have a 5250 terminal, which would need to be connected using IBM-proprietary twinax cabling. Newer machines have moved to a model where you rent capacity from IBM on your own hardware.)
I’d browse ebay for “IBM power server” and looking up the specs and license entitlements for each server you see. Given a serial number, you can look up the license entitlements on IBM’s capacity on demand website. For example, my server is an 8233-E8B with serial number 062F6AP. Plugging that into IBM’s website, you see that I have a POD code and a VET code. You can cross reference those codes with this website to see that I have entitlements for 24 cores and PowerVM Enterprise (even though there are only 18 cores in my server, in theory I could add another processor card to add another 6. I’m given to understand that this is risky and may involve needing to contact IBM sales to get your system working again)
You really want something with a PowerVM entitlement, because otherwise you need special IBM disks that are formatted with 520-byte sectors and support the SCSI skip read and skip write commands. You will also need to cross reference your system with the IBM i system map to see what OS versions you can run.
Plan to be watching eBay for a while; while you can find decent machines for €300-500, it’s going to take some time for one to show up.
Also, I’m still relatively new to this whole field; it’s a very good idea to join ##ibmi
on freenode to sanity check any hardware you’re considering buying.
There’s no emulator, and I’m not holding my breath for one any time soon.
Domain is emulated by MAME, and Phantom runs in most virtualization software though.
It’s still in the planning phase, sadly. I only have so much time given it’s one of my many side projects.
You might be interested in this research OS, KeyKOS: http://cap-lore.com/CapTheory/upenn/
It has some of what you’re describing: the transparent persistence, and the fine-grained permissions. I think they tried to make IPC cheap. But it still used virtual memory to isolate processes.
I think it also had sort of… permissions for CPU time. One type of resource/capability that a process holds is a ticket that entitles it to run for some amount of time (or maybe some number of CPU cycles?). I didn’t really understand that part.
Looks interesting. (And, one of its descendants was still alive in 2013.) But, I think anything depending on virtual memory to do permissioning is bound to fail in this regard.
The problem is that IPC can’t just be cheap; it needs to be free.
Writing text to a file should be the same kind of expensive as assigning a value to a variable. Calling a function should be the same kind of expensive as creating a process. (Cache miss, maybe. Mispredict, maybe. Interrupt, full TLB flush, and context switch? No way.)
Otherwise, you end up in an in-between state where you’re discouraged from taking full advantage of (possibly networked) IPC; because even if it’s cheap, it’ll never be as cheap as a direct function call. By making the distinction opaque (and relying on the OS to smooth it over), you get a more unified interface.
One thing I will allow about VM-based security is that it’s much easier to get right. Just look at the exploit list for recent chrome/firefox js engine. Languages like k can be fast when interpreted without JIT, but such languages don’t have wide popularity. Still working on an answer to that. (Perhaps formal verification, a la compcert.)
CPU time permissions are an interesting idea, and one to which I haven’t given very much thought. Nominally, you don’t need time permissions as long as you have preemptive multitasking and can renice naughty processes. But there are other concerns like power usage and device lifetime.
I’ve been imagining a system that’s beautiful. It’s a smalltalk with files, not images, with a very simple model. Everything is IPC. If you are on a Linux with network sockets, that socket is like every other method call, every addition, every syscall.
Let’s talk. I like your ideas, and think you might like this system in my mind.
These sound great until you try and implement any of it, in which case you realise that now every single call might fail and/or simply never return, or return twice, or return to somebody else, or destroy your process entirely.
Not saying it can’t be done, just saying it almost certainly won’t resemble procedural, OO, or functional programming as we know it.
Edit: Dave Ackley is looking into this future, and his vision is about the distance from what we do now as I expect: https://www.youtube.com/user/DaveAckley
Sounds an awful lot like Microsoft Midori. It doesn’t mention transparent object persistence, but much of what you mentioned is there.
This doesn’t solve all of the problems brought up in TFA. The main one is scheduling/autoscale. It is certainly easier—for instance, you can send a function object directly as a message to a thread running on a remote host—but you still have to create some sort of deployment system.
I really want to like OCaml, but it has what I see as a fundamental misfeature, and I can’t tell whether I’m holding it wrong or whether there’s a good reason for it that I haven’t found yet. Specifically, partial application monomorphizes the result type.
For example:
# let id1 = fun a b -> b;;
val id1 : 'a -> 'b -> 'b = <fun>
# let id : 'a -> 'a = id1 "foo";;
val id : '_a -> '_a = <fun>
# id;;
- : '_a -> '_a = <fun>
# id 1;;
- : int = 1
# id;;
- : int -> int = <fun>
In my opinion, the type of id
shouldn’t be monomorphized just because I happened to use it with a specific type. The fact that it does make it feel really awkward to define functions in a point-free style.
What am I missing?
Specifically, partial application monomorphizes the result type.
Partial application only sometimes monomorphizes the result type. This ‘misfeature’ is caused by the relaxed value restriction, which is required to make the type system sound in the presence of mutation. You can read more about the value restriction in the OCaml manual.
The fact that it does make it feel really awkward to define functions in a point-free style.
In my experience, the value restriction is seldom an obstacle to most OCaml code. It’s not particularly common to mix point-free style pure code with code that uses side-effects and mutations—and when it happens, it’s probably a bad idea anyway, in terms of making it difficult to reason about the evaluation order. Plus, OCaml programmers don’t have as much of a desire to make their code point-free as, say, some Haskell folks seem to want to do.
It’s not generally considered good practice to define functions in point-free style in OCaml, so there you have it.
Whether this is due to the value restriction you mention or to the language’s general “aesthetics” is up for debate; personally I’ve never found the value restriction to be anything more than a mild annoyance.
This is interesting, and I think I agree with many arguments when it comes to the reasons java, OCaml, Haskell, Go, etc. haven’t replaced C. However the author cites rust only briefly (and C++ not at all?) and doesn’t really convince me why rust isn’t the replacement he awaits: almost all you can do in C, you can do in rust (using “unsafe”, but that’s what rust is designed for after all), or in C++; you still have a higherer-level, safer (by default — can still write unsafe code where needed), unmanaged language that can speak the C ABI. Some projects have started replacing their C code with rust in an incremental (librsvg, I think? And of course firefox) because rust can speak the C ABI and use foreign memory like a good citizen of the systems world. Even C compilers are written in C++ these days.
To me that’s more “some were meant for no-gc, C ABI speaking, unsafe-able languages” than “some were meant for C”. :-)
Besides Rust, I think Zig, Nim, and D are strong contenders. Nothing against Rust, of course, but I’m not convinced it’s the best C replacement for every use case. It’s good to have options!
Nonetheless, I imagine C will linger on for decades to come, just due to network effects and economics. Legacy codebases, especially low-level ones, often receive little maintenance effort relative to usage, and C code is incredibly widespread.
I love Rust, but I think Zig and D (in the ‘better C’ mode and hopefully their new borrow checker) are closer to the simplicity and low-level functionality of C. Rust is a much nicer C++, with simpler (hah!) semantics and more room to improve. C++ is, unfortunately, a Frankenstein monster of a language that requires a 2000 page manual just to describe all the hidden weird things objects are doing behind your back. Every time I have to re-learn move semantics for a tiny project, I want to throw up.
i was also wondering, while reading the article, how well ada would fit the author’s use case (i’m not at all familiar with the langauge, i’ve just heard it praised as a safe low-level language)
You might want to read this brief follow-up:
Maybe I’m the archetype of a C-programmer not going for Rust. I appreciate Rust and as a Mathematician, I like the idea of hard guarantees that aren’t a given in C. However, Rust annoys me for three main reasons, and these are deal-breakers for me:
To put it shortly: What I like about C is its simplicity and self-reliance. You don’t need Cargo to babysit it, you don’t need dozens of external crates to do basic stuff and it doesn’t get in the way of the package manager. I actually like Rust’s ownership system, but hate almost anything around it.
See, I feel the exact opposite when it comes to Cargo vs. system’s package manager: managing versions of libraries using your system’s package manager is a royal pain in the ass or outright impossible when you have multiple projects requiring different versions of a library. In my experience with C and C++, you’ll end up using CMake or Meson to build exactly the same functionality that Cargo deploys for you, at a much higher cost than just adding one line in a configuration file.
In fact, my biggest gripe with C and C++ is that they still depend on a 3-step build system (preprocessing, compiling, linking) each of which requires you to specify the location of a group of files. I get why having header files was attractive in the 1970s when you counted your computer’s memory in KBs, but it makes writing and maintaining code such a pain in the ass when compared with a modern module system.
The funniest bit is I used to consider all these things as ‘easy to deal with’ when all I did was write C/C++ code 15 years ago. Nowadays, having to switch from Go, Rust or any other language to C for a small project makes me want to cry because I know I’ll spend about 20% of the time managing bullshit that has nothing to do with the code I care about.
Build systems in C give a feeling of craftsmanship. It takes a skill to write a Makefile
that correctly supports parallel builds, exact dependencies, interruptions and cleanups, etc. And so much work into making it work across platforms, and compilers.
And then Cargo just makes it pointless. It’s like you were king’s best messenger trained to ride the fastest stallions, and Cargo’s like “thanks, but we’ve got e-mail”.
LOL, I guess it’s a matter of age. When I first started programming, I’d love all that stuff. I can’t count how many libraries and tools I re-implemented or extended because they didn’t do something exactly the way I wanted it. Or the nights I spent configuring my Linux machine to work just right. Or the CPU time I spent re-encoding all of my MP3 collection to VBR because it’d save 5% of storage.
Now, I learned Cargo for a tiny project and I keep swearing every time I have to start a Python virtualenv because it’s just not easy enough, goddammit!.
This is a fair criticism of C, personally I would love to see a set commonly used data structures added to the C standard library. However, currently in the C world you either write your own or use something like glib, neither of these cases require the equivalent of Cargo.
However, currently in the C world you either write your own or use something like glib, neither of these cases require the equivalent of Cargo.
Neither does using the Rust standard library, which also has a hash table implementation (and many other useful data structures). You can just use std
and compile your project with rustc
.
We’re talking about dependencies in general, not just hash tables. FRIGN’s point is that the Rust standard library is lacking, so you end up needing crates.
But you and FRIGN are complaining about the Rust standard library compared to C. The Rust standard library is much more comprehensive than the C standard library or the C standard library + glib. So, the whole point seems to be void if C is the point of comparison.
If you are comparing to the Java standard library, sure!
But you and FRIGN are complaining about the Rust standard library compared to C.
Not really. The point being made is that a typical Rust application has to download a bunch of stuff from Github (crates), where as a typical C application does not.
That’s just because it’s convenient and most people don’t really care that it happens. But it’s not inherent to the tooling:
$ git clone -b ag/vendor-example https://github.com/BurntSushi/ripgrep
$ cd ripgrep
$ cargo build --release
Other than the initial clone (obviously), nothing should be talking to GitHub or crates.io. You can even do cargo build --release --offline
if you’re paranoid.
I set that up in about 3 minutes. All I did was run cargo vendor
, setup a .cargo/config
to tell it to use the vendor directory, committed everything to a branch and pushed it. Easy peasy. If this were something a lot of people really cared about, you’d see this kind of setup more frequently. But people don’t really care as far as I can tell.
where as a typical C application does not
When was that last time you built a GNU C application? Last time I tried to build GNU grep, its build tooling downloaded a whole bunch of extra goop.
None of the C code I’ve worked on was written by GNU, and most of the C code out in the real world wasn’t written by GNU either. I find it frankly bizarre that you are seriously trying to suggest that GNU’s practices are somehow representative of all projects written in C.
You said “a typical C application.” Now you’re saying “representative” and “what I’ve worked on.”
If the implementation of coreutils for one of the most popular operating systems in history doesn’t constitute what’s “typical,” then I don’t know what does.
Talk about bizarre.
Moreover, you didn’t even bother to respond to the substance of my response, which was to point out that the tooling supports exactly what you want. People just don’t care. Instead, you’ve decided to double down on your own imprecise statement and have continued to shift the goal posts.
and most of the C code out in the real world wasn’t written by GNU either.
^ typical
I don’t have the time or patience to debate semantics though.
As for your other point, see FRIGN’s comment for my response. (It doesn’t matter what’s possible when the reality is random crates get pulled from github repos)
C doesn’t even have a hash table.
Why do you say “even”? There are many hash table implementations in C, with different compromises. It would be untoward if any of them made its way into the base language. There are other things missing in C which are arguably more fundamental (to me) before hash tables. It is only fair if all of these things are kept out of the language, lest the people whose favorite feature has not been included feel alienated by the changes.
but there doesn’t seem to be much push to improve the situation
Definitely not true. There are people working on this and there has been quite a bit of progress:
$ git clone https://github.com/BurntSushi/ripgrep
$ cd ripgrep
$ git checkout 0.4.0
$ time cargo +1.12.0 build --release
real 1:04.05
user 1:51.42
sys 2.282
maxmem 360 MB
faults 736
$ time cargo +1.43.1 build --release
real 19.065
user 2:34.51
sys 3.101
maxmem 740 MB
faults 0
That’s 30% of what it once was a few years ago. Pretty big improvement from my perspective. The compilation time improvements come from all around too. Whether it’s improving the efficiency of parallelism or micro-optimizing rustc itself: here, here, here, here, here, here or here.
People care.
The Rust developers should stop trying to please everybody and come up with standard interfaces that also get shipped with the standard install.
That’s one of std’s primary objectives. It has tons of interfaces in it.
This criticism is just so weird, given that your alternative is C. I mean, if you want the C experience of “simplicity and self-reliance,” then std alone is probably pretty close to sufficient. And if you want the full POSIX experience, bring in libc and code like its C. (Or maybe use a safe interface that somebody else has thoughtfully designed.)
When installing a Rust package, you end up having to be connected to the internet
You do not, at least, no more than you are with a normal Linux distro package manager. This was a hard requirement. Debian for example requires the ability to use Cargo without connecting to the Internet.
and often end up downloading dozens of small crates from some shady GitHub repos.
Yup, the way the crates.io model works means the burden of doing due diligence is placed on each person developing a Rust project. But if you’re fine with the spartan nature of C’s standard library, then you should be just fine using a pretty small set of well established crates that aren’t shady. Happy to see counter examples though!
but I could imagine a similar scenario to leftpad in the future.
The leftpad disaster was specifically caused by someone removing their package from the repository. You can’t do that with crates.io. You can “yank” crates, but they remain available. Yanking a crate just prevents new dependents from being published.
Rust really needs a better standard library so you don’t have to pull in so much stuff from other people.
… like C? o_0
This point is really close to the one before it: Cargo is an interesting system, but really ends up becoming a “monosolution” for Rust setups. Call me old-fashioned, but I like package managers (especially Gentoo’s) and Cargo just works around it.
C has just been in the luxurious position that its package managers have been the default system package managers. Most Linux package managers are effectively a C package managers. Of course, over time packages for other languages have been added, but they have mostly been second-class citizens.
It is logical that Cargo works around those package managers. Most of them are a mismatch for Rust/Go/node.js packages, because they are centered around distributing C libraries, headers, and binaries.
but I could imagine a similar scenario to leftpad in the future.
Rust et al. certainly have a much higher risk, since anyone can upload anything to crates.io. However, I think it is also an illusion that distribution maintainers are actually vetting code. In many cases maintainers will just bump versions and update hashes. Of course, there is some gatekeeping in that distributions usually only provide packages from better-known projects.
Rust really needs a better standard library so you don’t have to pull in so much stuff from other people.
You mean a large standard library like… C?
C has just been in the luxurious position that its package managers have been the default system package managers.
This just isn’t true, if you look at how packages are built for Debian for example you will find that languages such as Python and Perl are just as well supported as C. No, the system package managers are for the most part language agnostic.
This just isn’t true, if you look at how packages are built for Debian for example you will find that languages such as Python and Perl are just as well supported as C.
Most distributions only have a small subset of popular packages and usually only a limited number of versions (if multiple at all).
The fact that most Python development happens in virtual environments with pip
-installed packages, even on personal machines, shows that most package managers and package sets are severely lacking for Python development.
s/Python/most non-C languages/
No, the system package managers are for the most part language agnostic.
Well if you define language agnostic as can dump files in a global namespace, because that’s typically enough for C libraries, sure. However, that does not work for many other languages, for various reasons, such as: no guaranteed ABI stability (so, any change down the chain of dependencies needs to trigger builds of all dependents, but there is no automated way to detect this, because packages are built in isolation), no strong tradition of ABI stability (various downstream users need different versions of a package), etc.
No, most development happens in virtualenv because python packaging is so broken that if you install a package you cannot reliably uninstall it.
If we didn’t have a package manager for each language then the packages maintained by the OS would be more comprehensive, by necessity. Basically having a different packaging system for each programming language was a mistake in my view. I have some hope that Nix will remedy the situation somewhat.
edit: it’s also difficult to reply to your comments if you substantially edit them by adding entirely new sections after posting…
No, most development happens in virtualenv because python packaging is so broken that if you install a package you cannot reliably uninstall it.
I have no idea what you mean here. Can’t you use dpkg/APT or rpm/DNF to uninstall a Python package?
If we didn’t have a package manager for each language then the packages maintained by the OS would be more comprehensive, by necessity.
We are going in circles. Why do you think languages have package managers? Technical reasons: the distribution package managers are too limited to handle what languages need. Social/political reasons: having the distributions as gatekeepers slows down the evolution of language ecosystems.
I have some hope that Nix will remedy the situation somewhat.
Nix (and Guix) can handle this, because it is powerful enough to implement the necessary language-specific packaging logic. In fact, Nix’ buildRustCrate
is more or less an implementation of Cargo in Nix + shell script. It does not use Cargo. Moreover, Nix can handle a lot of the concerns that I mentioned upthread: it can easily handle multiple different versions of a package and ABI-instability. E.g. if in Nix the derivation of say the Rust compiler is updated, all packages of which Rust is a transitive dependency are rebuilt.
As I said, traditional package managers are built for a C world. Not a Rust, Python, Go, or whatever world.
The first problem is a technical one, unless Rust is doing things such that it can’t be compiled efficiently, but the latter two are cultural ones which point up differences in what language designers and implementers are expected to provide then versus now: In short, Rust tries to provide the total system, everything you need to build random Rust code you find online, whereas C doesn’t and never did. Rust is therefore in with JS, as you mention, but also Perl, Python, Ruby, and even Common Lisp now that Quicklisp and ASDF exist.
I was going to “blame” Perl and CPAN for this notion that language implementations should come with package management, but apparently CPAN was made in imitation of CTAN, the Comprehensive TeX Archive Network, so I guess this goes back even further. However, the blame isn’t with the language implementers at all: Packaging stuff is one of those things which has been re-invented so many times it’s bound to be re-invented a few more, simply because nobody can decide on a single way of doing it. Therefore, since language implementers can’t rely on OS package repos to have a rich selection up-to-date library versions, and rightly balk at the idea of making n different OS-specific packages for each version of each library, it’s only natural each language would reinvent that wheel. It makes even more sense when you consider people using old LTS OS releases, which won’t get newer library versions at this point, and consider longstanding practice from the days before OSes tended to have package management at all.
Therefore, since language implementers can’t rely on OS package repos to have a rich selection up-to-date library versions, and rightly balk at the idea of making n different OS-specific packages for each version of each library, it’s only natural each language would reinvent that wheel.
This is right on the mark.
Sorry if this is a bit of a tangent, but I think it is not just a failing of package sets – from the distributor’s perspective it is impossible to package every Rust crate and rust crate version manually – but especially of package managers themselves. There is nothing that prevents a powerful package management system to generate package sets from Cargo.lock
files. But most package managers were not built for generating package definitions programmatically and most package managers do not allow allow installing multiple package versions in parallel (e.g. ndarray 0.11.0
and ndarray 0.12.0
).
Nix shows that this is definitely feasible, e.g. the combo of crate2nix
and buildRustCrate
can create Nix derivations for every dependency in a Cargo.lock
file. It does not use cargo
at all, compiles every crate into a separate Nix store path. As a result, Rust crates are not really different from any other package provided through nixpkgs
.
I am not that familiar with Guix, but I bet it could do the same.
Build times are important, and you’re right, Rust takes a while to compile. Given the choice between waiting for rustc to finish and spending a lot longer debugging a C program after the fact, I choose the former. Or, better yet, use D and get the best of both worlds.
rust can be damn fast to compile, most rust library authors just exercise fairly poor taste in my opinion and tend not to care how bad their build times get. sled.rs compiles in 6 seconds on my laptop, and most other embedded databases take a LOT longer (usually minutes), despite sled being Rust and them being C or C++.
rust is a complex tool, and as such, you need to exercise judgement (which, admittedly, is rare, but that’s no different from anything else). you can avoid unnecessary genericism, proc macros, and trivial dependencies to get compile times that are extremely zippy.
Feel free to reach out if you hit any friction, I’m happy to point folks in the direction they want to go with Rust :)
almost all you can do in C, you can do in rust
As an anecdata, I‘ve immediately recognized the snippet with elf header from the article, because I used one of the same tricks (just memcpying a repr(C) struct) for writing elf files in Rust a couple of months ago.
I though about using rust to implement a byte-code compiler / vm for a gc’d language project, but I assumed that this would require too much fighting to escape rust’s ownership restrictions. Do you have any insight into how well suited rust is for vm implementation? I haven’t used the language much but I’d love to pick it up if I though I could make it work for my needs.
(I see that there’s a python interpreter written in rust, but I’m having trouble locating its gc implementation)
I honestly don’t know, I have never written an interpreter. You probably can fall back on unsafe for some things anyway, and still benefit from the move semantics, sum types, syntax, and friendly error messages. I’m doing a bit of exploring symbolic computations with rust and there are also some design space exploration to be done there.
IMO, Rust is just as good as C and C++ for projects like this, if not better thanks to pattern matching and a focus on safety (which goes far beyond the borrow checker). Don’t be afraid to use raw pointers and NonNull
pointers when they are appropriate.
Also, just saw this GC for Rust on the front page: https://github.com/zesterer/broom/blob/master/README.md
Looks like it’s designed specificly for writing dynamic languages in Rust.
I wrote a ST80 VM in rust just to play around; the result was beautifully simple and didn’t require any unsafe code, though it doesn’t currently have any optimization at all. The result was still reasonably snappy, but I suspect that a big part of that is that the code I was running on it was designed in, well, 1980.
I recently did a simple lisp interpreter in rust. Eventually decided to re-do it in c because of shared mutation and garbage collection.
Rust works well for VM implementation. For GC, you may want to look at https://github.com/zesterer/broom.
I’m gonna go with Qt on this one. I learned it a long time ago (I think it was still at version 2!) and it never really let me down. It’s got very good documentation, and it’s pretty reliable for long-term development. I have projects that have been through three Qt versions (3.x, 4.x, 5.x) and the migration has been pretty painless each time. It’s not on the nimble end of the spectrum and it’s C++, but I found it to be the most productive, even though the widgets library hasn’t been as high on the parent company’s priority list. (They insist that’s not true but actions speak louder than words…). I’ve used it for huge projects (200 KloC+) and it held out great.
I used GTK 2 back in the day, too, and while some bits weren’t exactly enjoyable, it was generally efficient, and it was a pretty safe bet for cross-platform development, and an especially safe bet for Linux and Unix development. I really wanted to like GTK 3. I don’t know if it’s because I’m getting grumpy and impatient, or if there really is something objectively wrong with it, but I didn’t manage to like it, and now I tend to avoid it, both when it comes to writing code that uses it and when it comes to using applications written against it. Also I’m not sure how its cross-platformness is doing these days.
I’ve played with Dear ImGui and I can definitely say I enjoy it. I’ve used it for some pretty small and special-purpose tools (and obviously you get about as much native integration with it as you get with Electron :P) but I definitely had fun with it. I’ve also
I’m also a big fan of QT, and in particular, QtQuick is the single most productive rapid prototyping platform I’ve ever used (beating out even Visual Basic and Electron). The first app I ever wrote with it started out as an excuse to learn Qt Quick, and I had a working, polished app within two weeks.
I really like Qt as well. I recently started building things with PyQt5 and it’s been pretty nice to work with:
+1 for Qt. I was surprised to see Telegram’s desktop client not using Electron, when every popular IM client is using it, and the UI seems much faster, and pleasant to work with. Another advantage is, Qt is available on more platforms than Electron, so if you like to be portable, don’t want to be limited by GNU/Linux, Windows, or macOS, then Qt is a good choice.
Did you intend to continue?
It looks like I did but whatever I wanted to say has long been swapped to the write-only section of my memory :)
Happy with Qt too, but only when keeping the project up to date (and then it’s much easier with small projects). The least progress I’ve ever made as part of a software team was when we had a long-running Qt app where some parts were Qt5-ready, but we were mostly building with Qt4 and even then using 3-to-4 adapters in parts. Not that this isn’t true of other frameworks, but that sticks out as a raw nerve in my memory.
I’ve also used wxwidgets (but long enough ago that I don’t remember much specific, it seemed to work), GNUstep (OK if you don’t use any super-modern Cocoa APIs, where the approach to claiming 100% coverage has been to stub out all of the implementations), and Eclipse RCP which is a real curate’s egg.
Smalltalk is an easy answer. :-)
In Smalltalk, you can replace the contents of any method in any class of the system. Obviously, you’d better not be wrong.
Almost, but not quite: certain methods are actually opcodes (in ST80, arithmetic, ifTrue, and a couple of others), and modern smalltalks will open-code them, so overriding the method will do nothing unless the primitive fails. Even the ST80 blue book interpreter handles ifTrue directly in the bytecode, and the 32 “special” messages with their own opcodes result in directly calling a primitive without message lookup. Now, you’re not likely to want to override integer addition or Object>>class
, but it still bothers me that you can’t.
I’m curios when (and if) will see:
I don’t have the repos (“google3” and “fbcode”) available to me any more.
Are these the actual names of Google’s and Facebook’s monorepos?
I can confirm that Google’s is google3; there was a google2 at least 15 years ago that used make as a build system, and I can only assume that there was a google at some point. No idea about facebook.
An easy search online returned this: https://github.com/angular/google3
so I’d assume so.
I’ve wanted to see a language with a 2-armed while statement for a while.
In C-ish syntax, we have the classic while loop:
while (condition) { things(); }
and the do-while loopdo { things(); } while(condition);
, depending on whether to check the condition at the beginning or the end.However, I’d love to see a combination of the two:
do { some_things(); } while(condition) { other_things();}
, which would have similar semantics towhile(true) { some_things(); if(!condition) break; other_things();}
.The two-armed while has the benefit of making it clearer where the loop condition is tested and satisfying colleagues that are picky about while loops that appear infinite, while nicely unifying the two types of while loop and being completely backwards compatible with traditional C syntax. The only cost is that you’d lose the ability to emit a diagnostic if you forget the
;
after a do-while loop.(for the nitpickers, yes, Forth has this in the form of its
BEGIN WHILE REPEAT
construct. However, Forth tends to be a difficult sell in most development shops)Common lisp has this. E.G.:
This prints a b a b a.