1. 15

    Rust has some nice quality-of-life improvements that make up for the toil of memory management. Even when I don’t strictly need a systems programming language, I still use Rust, because I like sum types, pattern matching, foolproof resource cleanup, and everything-is-an-expression so much.

    So for glue code in GUI apps or CRUD webapps I’d love some Rustified TypeScript, Rustscript on the Golang runtime, or a Rust/Swift hybrid.

    1. 9

      Pretty much all of the ‘quality-of-life improvements’ you mention came to rust by way of ocaml, and are also present in other functional languages like haskell and scala. For webapps, try purescript or possibly elm.

      1. 5

        ocaml won’t be a “higher level rust” until multicore ocaml has matured.

        1.  

          To any other readers: I wouldn’t ignore the language based solely on this, especially since multicore parallelism wasn’t one of the criteria in the top comment. Lwt and async give you concurrency if you want it.

        2.  

          Pedantic nitpicking: resource cleanup (RAII) is a big one, and it’s from C++. Everything else is indeed from ML.

          1.  

            And traits are from haskell, and single ownership is from cyclone.

            But a large part of rust was directly inspired by ocaml.

            1. 5

              I was specifically referring to the list in the top comment. For a fuller list, I would consult the reference.

              Cyclone doesn’t have single ownership/affine types. It has first-class support for arena-based memory management, which is a weaker form of rust’s lifetimes, and is a different feature.

              See http://venge.net/graydon/talks/rust-2012.pdf for a list of influences for both the ownership and borrowing parts.

                1.  

                  I stand corrected, thanks a lot!

        3.  

          Rustified TypeScript

          I write a lot of TypeScript but have only written a few hundred lines of Rust. I’m curious: which features of Rust do you miss when using TypeScript? The obvious one is traits, but I wonder if there’s anything else.

          1.  

            Not the person you replied to, but true sum types, especially Option. It’s much cleaner than undefined, even with all of the nice chaining operators. And not having to worry about exceptions thanks to Result.
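
            You can hand-roll a Result in TypeScript, roughly like the sketch below (parsePort is just an invented example), but nothing in the language stops a callee from throwing instead of returning the error case:

            type Result<T, E> =
                | { ok: true; value: T }
                | { ok: false; error: E };

            // Errors become ordinary values the caller has to inspect.
            function parsePort(s: string): Result<number, string> {
                const n = Number(s);
                return Number.isInteger(n) && n >= 1 && n <= 65535
                    ? { ok: true, value: n }
                    : { ok: false, error: `not a port: ${s}` };
            }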

            1.  

              Ah, I see your point. TypeScript discriminated unions are used frequently to overcome this limitation, but I agree it would be preferable to have proper sum types with pattern-matching.

              type Square = {
                  type: "square";
                  width: number;
                  height: number;
              }

              type Circle = {
                  type: "circle";
                  radius: number;
              }

              type Shape = Square | Circle;
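
              The closest thing to pattern matching is then an exhaustive switch on the discriminant, which narrows the type in each branch. A sketch against the types above (area is an invented example):

              function area(s: Shape): number {
                  switch (s.type) {
                      case "square":
                          return s.width * s.height; // s narrowed to Square
                      case "circle":
                          return Math.PI * s.radius ** 2; // s narrowed to Circle
                      default: {
                          // Fails to compile if a variant is added but not handled.
                          const exhaustive: never = s;
                          return exhaustive;
                      }
                  }
              }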
              
              1.  

                Oh god I miss TypeScript’s sum types so much when I’m writing Rust. When I context switch back to Rust after writing some TypeScript I wish I could do

                type Foo = Bar | Baz;
                

                Eventually I get over it but I find TypeScript so much nicer for describing the types of my programs.

                1.  

                  Oh yes. I wish TypeScript had more Rust, but I also wish Rust had row-level polymorphism.
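
                  (For what it’s worth, TypeScript’s structural generics already give you something row-polymorphism-flavoured; a sketch, with touch as an invented name:)

                  // R may carry any extra fields (the rest of the "row"),
                  // and they survive into the return type.
                  function touch<R extends { id: number }>(r: R): R & { touchedAt: number } {
                      return { ...r, touchedAt: Date.now() };
                  }

                  const t = touch({ id: 1, name: "circle" });
                  t.name; // still known to the type checker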

                  1.  

                    Maybe someday https://github.com/tc39/proposal-pattern-matching will happen. That will be a great day.

                  2.  

                    Yes, you can relax a function that returns Foo | undefined to return Foo only, but in most “maybe” systems you can’t return a naked Foo without the Maybe<Foo> box.
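
                    Concretely (a sketch; Maybe and Some stand in for whatever boxed option type such a system uses):

                    type Foo = { n: number };

                    // A function returning a naked Foo is accepted wherever
                    // Foo | undefined is expected; no wrapping required.
                    declare function takesMaybe(f: () => Foo | undefined): void;
                    takesMaybe((): Foo => ({ n: 1 })); // fine as-is

                    // With a boxed Maybe<Foo>, every return site would need
                    // an explicit Some(foo).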

              2.  

                In that Rust/Swift hybrid, what parts of Rust would you want added to Swift to make the hybrid? I read this article thinking Swift wouldn’t be a bad start.

                1.  

                  In Swift I miss “everything is an expression”. I have mixed feelings about Swift’s special-casing of nullability and exceptions. Terser syntax is nice, but OTOH it’s more magical and less flexible/generalizable than Rust’s enums. Swift inherited from ObjC multiple different ways of handling errors (boolean return codes, NSError** out-parameters, exceptions) — that’d be an obvious candidate for unification if it could start from scratch.

                  And if it was a “smaller Swift”, then I’d prefer a mark-and-sweep GC. With memory management Swift is neither here nor there: you need to be careful about reference cycles and understand lifetimes for bridging with FFI, but it doesn’t have a borrow checker to help.

              1. 6

                Yes, it sucks.

                But I’ll still take it over JSON any day, for defining infrastructure as code.

                I just wish there was a human-workable alternative focused on minimalism. YAML is an unnecessarily complex clusterfuck.

                1. 11

                  HCL is a decent DSL.

                  1. 1

                    HCL

                    Hmm. That one actually looks interesting.

                  2. 7

                    Toml?

                    1. 2

                      Ugh. Not a fan.

                      1. 3

                        Why not?

                    2. 6

                      So, there are three axes I look for in a replacement for either configuration or serialization.

                      One: What everyday problems do you run into?

                      For instance, can you encode every country code without “NO” getting turned into false? Can you include comments? Can you read the damn thing, or is it too much markup? Do common validation tools exist? Have they got useful error messages?

                      Two: What capabilities do you have at the language layer?

                      For instance, can you encode references? How about circular references? What kind of validation is possible? What is the range of types?

                      Three: Does anyone else know about the language? Is it supported in most languages?

                      Unfortunately, so far none of the languages that fare well on questions 1 and 2 also do well on question 3. Dhall (for config) and ADL (Aspect Definition Language, for data serialization) come to mind as “solve 1 and 2 but not 3”.

                      1. 2

                        Dhall

                        Your point three is fixable, and Dhall is great tech. The community has been very responsive IME, to the point where I need to write a blog post about just how abnormally good they are.

                    1. 4

                      it turns out there’s a ~~PhD~~ bachelor thesis on this topic.

                      Edited to correct error per child post

                      1. 3

                        Looks like a bachelor’s thesis, not a Ph.D thesis.

                        Pretty big difference there.

                        1. 1

                          You’re right, my mistake.

                        2. 3

                          Fascinating! At some point I read one of the previous explanations cited in this paper. I’m delighted to see Robertson went one step further and algebraically derived the constant for the optimal post-Newton iteration, producing optimal constants for 64- and 128-bit floats.

                        1. 2

                          Far be it from me to question the value of good defaults, but I don’t think they’re the answer here.

                          Emacs has never been an editor that promised a good experience out of the box; if anything, that’s completely orthogonal to its actual purpose. (That niche already being filled by the likes of vim, vscode, sublime text, etc.) If somebody doesn’t want to create their own editing experience, then there is no reason for them to even consider using emacs in the first place.

                          If anything, the poor defaults nudge you to learn how to fix them and, in the process, learn more about the editor. I probably would never have learnt about early-init.el if not for emacs’s coming with a light theme, hideous splash screen, scrollbar, etc.

                          1. 1

                            I’ve been working on a compiler backend of my own, along similar lines, recently. Very fun, and I’ve learned a lot; would recommend making one.

                            A warning, though: optimizations are a bottomless pit, and you will never be as good as gcc.

                            1. 3

                              That’s true – however, you can get surprisingly far with surprisingly little effort. To quote https://developers.redhat.com/blog/2020/01/20/mir-a-lightweight-jit-compiler-project/:

                              Recently, I did an experiment by switching on only a fast and simple RA and combiner in GCC. There are no options to do this, I needed to modify GCC. (I can provide a patch if somebody is interested.) Compared to hundreds of optimizations in GCC-9.0 with -O2, these two optimizations achieve almost 80% performance on an Intel i7-9700K machine under Fedora Core 29 for real-world programs through SpecCPU, one of the most credible compiler benchmark suites.

                              The place where these modern compilers really tend to win is when your code can be auto-vectorized.

                              1. 2

                                auto-vectorized

                                There was an interesting talk at PLDI this year about using machine learning to auto-vectorize code. It’s not difficult to recognize opportunities for vectorization, but given a piece of vectorizable code, it’s O(big) to choose the best way to vectorize it. Existing solutions are mostly heuristics-based, but their ML solution was able to generate very favourable results compared with the brute-force O(big) solution, in much less time.

                              2. 2

                                A warning, though: optimizations are a bottomless pit, and you will never be as good as gcc.

                                It depends a bit on your goals and your source language.

                                The JavaScriptCore developers abandoned LLVM for their back end in part because they cared about compile latency a lot more than LLVM did (LLVM has improved this a lot with ORCv2 compared to the older JITs). Optimisation is analysis and transformation, and often the transformation part is the easy bit. If your source language expresses things that LLVM or GCC spends a lot of complexity trying to infer, then you may be able to do as well as one of them without too much effort.

                                I used to teach a compiler course where one of the examples was a toy language for 2D cellular automata. Every store in the language was guaranteed non-aliasing and it ran kernels that, at the abstract machine level, were entirely independent. It used LLVM on the back end, but starting from that language you could fairly trivially get close to LLVM’s performance, because everything that you want autovectorisation to infer is statically present in the source language, and most of the loop optimisations can similarly be applied once, statically, given the shape of the grid, rather than generically implemented for all possible loop nests.

                                This project aims to get 70% of LLVM / GCC perf from 10% of the code. That’s probably easy on average. There are a lot of things in LLVM that give a huge perf win for a tiny subset of inputs. If you care about the performance of something that hits one of these on its most critical path, you will see a huge difference. If you care about aggregate performance over a large corpus, very few of these make a very significant overall difference.

                              1. 35

                                Here are functions that do the same as the musl versions in this blog post, but are always constant time: https://git.zx2c4.com/wireguard-tools/tree/src/ctype.h This avoids hitting those large glibc lookup tables (cache timing) as well as musl’s branches. It also happens to be faster than both in some brief benchmarking.
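
                                 The trick translates to any language. Here is a sketch of the branchless range test in TypeScript (inRange and the predicates are invented names, and JS engines make no timing guarantees, so this only illustrates the shape of the idea):

                                 // 1 if lo <= c <= hi, else 0, using only arithmetic and the sign bit.
                                 const inRange = (c: number, lo: number, hi: number): number =>
                                     (((c - lo) | (hi - c)) >>> 31) ^ 1;

                                 const isDigit = (c: number) => inRange(c, 0x30, 0x39);
                                 const isUpper = (c: number) => inRange(c, 0x41, 0x5a);
                                 const isLower = (c: number) => inRange(c, 0x61, 0x7a);

                                 // Bitwise | rather than ||, so neither side short-circuits.
                                 const isAlpha = (c: number) => isUpper(c) | isLower(c);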

                                1. 7

                                  always constant time

                                  Musl and glibc’s versions are both constant time. (Technically, since || is short-circuiting, musl’s version won’t always take the same amount of time, but it’s still O(1).)

                                  Do you mean branchless?

                                  1. 29

                                    zx2c4 means “constant time” in the side-channel avoidance sense, not the complexity theoretic sense. They’re not the same definition.

                                1. 9

                                  I think this section from the GNU Coding Standards explains why the guts of so much GNU software is so damn weird:

                                  If you have a vague recollection of the internals of a Unix program, this does not absolutely mean you can’t write an imitation of it, but do try to organize the imitation internally along different lines, because this is likely to make the details of the Unix version irrelevant and dissimilar to your results.

                                  For example, Unix utilities were generally optimized to minimize memory use; if you go for speed instead, your program will be very different. You could keep the entire input file in memory and scan it there instead of using stdio. Use a smarter algorithm discovered more recently than the Unix program. Eliminate use of temporary files. Do it in one pass instead of two (we did this in the assembler).

                                  Or, on the contrary, emphasize simplicity instead of speed. For some applications, the speed of today’s computers makes simpler algorithms adequate.

                                  Or go for generality. For example, Unix programs often have static tables or fixed-size strings, which make for arbitrary limits; use dynamic allocation instead. Make sure your program handles NULs and other funny characters in the input files. Add a programming language for extensibility and write part of the program in that language.

                                  1. 3

                                    None of this sounds particularly weird, especially:

                                    Make sure your program handles NULs and other funny characters in the input files.

                                    1. 3

                                      The point is that the programs were intentionally somewhat contorted to avoid being suspected of copyright infringement.

                                       Remember, back in the late 80s~early 90s when GNU was getting its start, the only reason anybody used linux over BSD was that the latter were getting (baselessly) sued by AT&T.

                                      1. 3

                                         The GNU tools and utilities predate the Linux kernel by a significant margin.

                                        It’s possible that the legal situation kept BSD back. It’s also possible that it simply wasn’t as fun and rewarding to contribute to BSD as it was to the Linux kernel.

                                      2. 1

                                        I think the bit about going all-out for speed is why you get these weird lookup tables and a grep that tries to munch multiple bytes at a time.

                                    1. 1

                                      no source control system

                                      Why not fossil, which is a 3mb download on windows?

                                      1. 7

                                        Unfortunately history doesn’t point to great outcomes with respect to standards and interoperability. Ten or 20 years ago the big battle was interoperable word processing and spreadsheets, not web browsers.

                                        That issue seems less important now, but it’s still a problem, and it’s still unresolved.

                                        Programmers use text and markdown so they probably don’t feel it, but lots of the world still runs on Word and Excel. Newer companies likely use cloud solutions which ironically don’t have STABLE formats, let alone OPEN ones (and occasionally break).

                                        We still have a bunch of silos and “fake” standards. That is, stuff that’s too complicated for anyone besides a single company to implement. I haven’t really followed this mess, but it appears to be still going on:

                                        https://en.wikipedia.org/wiki/Microsoft_Office_XML_formats

                                        https://en.wikipedia.org/wiki/OpenDocument


                                        I’ll also point out that POSIX shell is way behind the state of the art… There is a ton of stuff that is implemented in bash, ksh, zsh, busybox ash, and Oil that’s not in the standard. I’d estimate you could double the size of the POSIX shell standard with features that are implemented by multiple shells.

                                        It’s a lot of work and I think basically nobody has the time or motivation to argue it out.

                                        I’m still publishing spec tests that show this, and that other shell implementers can use. Example:

                                        https://www.oilshell.org/release/0.8.0/test/spec.wwz/survey/brace-expansion.html

                                        https://www.oilshell.org/release/0.8.0/test/spec.wwz/survey/dbracket.html

                                        All that stuff could be in POSIX. Actually I noticed busybox ash is also copying bash – so bash is a new de facto standard with multiple implementations.

                                        Likewise I’ve documented enhancements for other shells to implement:

                                        https://www.oilshell.org/release/0.8.0/doc/simple-word-eval.html

                                        1. 3

                                          Related article about how not just the POSIX shell but the POSIX operating system APIs have become outdated:

                                          https://www.usenix.org/publications/login/fall2016/atlidakis

                                          https://www.usenix.org/system/files/login/articles/login_fall16_02_atlidakis.pdf

                                           Basically OS X, Android, and Ubuntu are all built on top of POSIX, but it’s not really good enough for modern applications, so they have diverging solutions to the same problems. This can be “working as intended” for a while, but if it goes on for too long, then the lack of interoperability impedes progress.

                                          1. 2

                                            Impediments to progress are sort of relative to some genuinely viable alternative. If there isn’t one, or people aren’t aware of one, we just keep slogging through the tar pits. Those who grow up in the tar pits don’t even notice.

                                            1. 1

                                              iOS as well, obviously, but that’s a different beast to test, so.

                                              1. 3

                                                iOS disallows stuff that’s part of POSIX, like fork, so even if all of the facilities are present, they aren’t part of the public API. So it’s not a POSIX OS.

                                                1. 1

                                                  That’s an even better observation.

                                                  1. 1

                                                    Macos apparently warns you if you do anything after fork except immediately execing.

                                            1. 7

                                              This seems super useful, and I imagine I will refer to it next time I struggle with ALSA. But I really would have appreciated a little context, relating ALSA to OSS, JACK, PulseAudio, and maybe now PipeWire… and some others. Because that is quite a hairball.

                                              1. 9

                                                ALSA = Advanced Linux Sound Architecture. This is the interface provided by the linux kernel to physical sound devices.

                                               ALSA = Advanced Linux Sound Architecture. This is a relatively simple userspace API that lets you interact with physical and virtual sound devices. Linux-specific, built directly on top of the other alsa.

                                               ALSA refers variably to either (or sometimes both!) of the above; TFA discusses the userspace portion of ALSA.

                                               OSS = Open Sound System. This is a userspace API to access sound devices based on special files (/dev/dsp and similar), and was for a long time the primary audio interface on unices and unix-like systems. Older versions of the linux kernel used this, before switching to alsa due to some limitations of the api. FreeBSD, however, still uses oss, having extended the api to eliminate the limitations.

                                               Pulseaudio. This is a userspace daemon that provides an interface to sound devices, and is built on top of the kernel-level alsa (but not the user-level alsa). Has some neat networking features. Its design was modelled after that of apple’s coreaudio. It primarily favours linux but has ports to most other operating systems.

                                                JACK = JACK Audio Connection Kit. Like pulse, this is a userspace daemon that offers sound capabilities. Like pulse, it is quite portable; it can actually run on top of pulse! Its primary aim is to provide reliable latency guarantees to pro audio applications (like DAWs).

                                                Pipewire. Yet another daemon. This one aims to be a generic multimedia server—not just for sound—to provide latency guarantees similarly to jack, and to patch up some issues in pulse. It also provides compatibility with jack and pulse; and is somewhat portable, though afaik not so much as pulse.

                                                1. 3

                                                  Older versions of the linux kernel used this, before switching to alsa due to some limitations of the api. FreeBSD, however, still uses oss, having extended the api to eliminate the limitations.

                                                  Didn’t they switch because of licensing issues? At some point the main OSS author decided to make the software proprietary; FreeBSD continued with the last free version, and Linux decided to make everything better (“better”).

                                                  1. 2

                                                    Like pulse, it is quite portable; it can actually run on top of pulse! Its primary aim is to provide reliable latency guarantees to pro audio applications (like DAWs).

                                                    Just to note, Jack typically runs underneath pulse, and exposes itself as a pulseaudio device.

                                                    1. 1

                                                      FWIW, the other BSDs I believe use a variant of the SunOS 4 audio APIs (at least Net), and OpenBSD has sndio, which is their equivalent of Pulse.

                                                  1. 2

                                                    One point of fairly harmless confusion, when I hit the sentence:

                                                    The main place they are used in this project I’m not clear what project you’re talking about, since I’ve landed on a page on your blog rather than a document in some project specific documentation. Was this part extracted from docs you wrote for something specific? :)

                                                    1. 4

                                                      You pulled the text of your own comment into the quote. I expect you typed this:

                                                      > The main place they are used in this project
                                                      I’m not clear what project you’re talking about, since I’ve landed on a page on your blog rather than a document in some project specific documentation. Was this part extracted from docs you wrote for something specific? :)
                                                      

                                                      You need an extra newline:

                                                      > The main place they are used in this project
                                                      
                                                      I’m not clear what project you’re talking about, since I’ve landed on a page on your blog rather than a document in some project specific documentation. Was this part extracted from docs you wrote for something specific? :)
                                                      
                                                      1. 3

                                                        oof! the perils of writing markdown on a phone

                                                       alas it’s too late to edit it, but thanks

                                                      2. 2

                                                       I extracted it from a discovery project in Rust I used to help me learn Rust. I thought I’d gotten rid of all that. Oops! I’ll go fix that. Thanks!

                                                      1. 1

                                                        The explanation of virtual memory is not quite right. The true state of affairs is more like this.

                                                       The kernel maintains a reference count, for each physical page, of how many virtual mappings there are to it. This allows it to avoid copying when there is only one private reference to the page, and to free the page when there are no more mappings.
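
                                                       A toy model of that bookkeeping (invented names, not kernel code):

                                                       class PhysicalPage {
                                                           refs = 0;
                                                           constructor(public data: Uint8Array) {}
                                                       }

                                                       class Mapping {
                                                           constructor(private page: PhysicalPage) { page.refs++; }

                                                           // fork()-style sharing: both mappings point at one page.
                                                           clone(): Mapping { return new Mapping(this.page); }

                                                           // Copy-on-write: copy only if someone else still maps the page.
                                                           write(i: number, v: number): void {
                                                               if (this.page.refs > 1) {
                                                                   this.page.refs--;
                                                                   this.page = new PhysicalPage(this.page.data.slice());
                                                                   this.page.refs = 1;
                                                               }
                                                               this.page.data[i] = v;
                                                           }

                                                           // A count of 0 is what lets the kernel free the physical page.
                                                           unmap(): void { this.page.refs--; }
                                                       }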

                                                        1. 1

                                                          Initially the array is just zeroes (at least on Linux and macOS; Windows may differ)

                                                         This is the case on windows as well; per the documentation for VirtualAlloc (Windows’ analogue to mmap):

                                                          Memory allocated by this function is automatically initialized to zero

                                                          1. 1

                                                           VirtualAlloc isn’t quite the same as mmap: it doesn’t give you a way of doing CoW mappings. You need MapViewOfFile, as I recall, to get this. Note that Windows does not allow failure at any point other than the mapping, though, so your process will incur commit charge for each mapping, and this may later cause allocation failures if you have exhausted the total amount of memory + swap available, even if you’re actually using a fraction of the total available memory. In contrast, most *NIX systems will happily allow you to generate thousands of CoW mappings of the same page and then run out of memory when it needs to do the copying.

                                                           I haven’t tried it for a while, but there was a fun attack that you could do on the Linux VM subsystem as an unprivileged user: mmap the same page of a file at every other page in your virtual address space. You’d end up exhausting kernel memory for the page tables, but that memory wasn’t accounted to you, and so the OOM killer would kill every process before it tried killing yours to get memory back. In Windows, the commit charge for the page-table pages would be added to your process and this situation wouldn’t occur.

                                                          1. 3

                                                            I know I’m in the minority but if there were a graphical editor for the Mac that advertised how few features it had, I’d prefer it.

                                                           (Is there a graphical editor for the Mac that supports remote editing and gets out of your way? TextMate is the closest I’ve found, but it still likes to boss me around, telling me where to put parentheses and quotes, and it will randomly insist on indenting things in ways that I don’t want… I know I can edit bundles to change that, but I’d love an editor that didn’t have it to begin with…)

                                                            That’s not to say that this isn’t impressive and I will definitely take a look.

                                                            1. 2

                                                              I’m not sure I know what you mean by “remote editing”, and it’s been some years since I used a Mac at work. But SubEthaEdit is (at least, was) a friendly, polished, relatively minimal editor with only one fancy feature set: remote collaboration. It appears to still be supported, and worked pretty well last time I used it, although that was long ago.

                                                              1. 2

                                                                “Remote” in this case means “SSH to remote server, type edit foo in that terminal, and have a window pop up locally.” TextMate, VSCode, Sublime, sam, and a few others can do that. It’s really nice.

                                                                Honestly, I spent the first 39 years of my life using mostly console-only text editors. I just figured if I’m gonna have a Mac, I should get to use a GUI text editor…:)

                                                                And thank you for mentioning SubEthaEdit; it looks very nice. I don’t think it supports remote editing in the style I was imagining though, unfortunately.

                                                                1. 2

                                                                  Oh, I got you now. I just use SSHFS for that, but honestly I don’t use that workflow very often lately, so there may be a better (but still editor-agnostic) way.

                                                                  1. 1

                                                                    SSH to remote server, type edit foo in that terminal, and have a window pop up locally

                                                                    What machinery even exists through which one would do this? How would the remote machine know to somehow tunnel back through your SSH session into your local machine to look for an editor, and then tell that editor to open a new window, and create a new session through which the server and your editor can communicate after you’ve closed your shell?

                                                                    1. 2

                                                                     The way most of them do it is to have SSH set up a forwarded port: the editor listens on one end, the remote utility talks to the other, and commands and data are transferred back and forth. TextMate with its “rmate” script does this, for example, and VSCode and Sublime emulate the rmate mechanism too.
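
                                                                     Roughly, the editor’s end looks like this sketch in TypeScript on Node (52698 is rmate’s conventional port; the protocol parsing is elided):

                                                                     import * as net from "node:net";

                                                                     // Local end of `ssh -R 52698:localhost:52698 myserver`: the remote
                                                                     // rmate helper connects to its own localhost:52698, and SSH carries
                                                                     // the bytes back to this listener sitting next to the editor.
                                                                     const server = net.createServer((socket) => {
                                                                         socket.on("data", (chunk) => {
                                                                             // Parse rmate's "open"/"save" commands and file contents here.
                                                                         });
                                                                     });

                                                                     server.listen(52698, "127.0.0.1");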

                                                                      1. 1

                                                                       TextMate with its “rmate” script does this

                                                                        I’ll look into this, cheers!

                                                                    2. 1

                                                                      Sounds like X forwarding (ssh -X).

                                                                      There are X servers for mac (afaik xquartz is the main one?)

                                                                      1. 1

                                                                        It’s similar to X forwarding, and X forwarding uses it under the hood I think, but it’s its own thing. The SSH protocol supports “channels” multiplexed over a single connection and OpenSSH can create local sockets on either end that it will forward traffic through via the various channels.

                                                                        We use SSH tunnels pretty extensively where I work, for both remote-local tooling interaction and various other bits and bobs.

                                                                        As for X, a friend of mine used XQuartz to get sam up and running on a Mac. Apparently it was quite the adventure.

                                                                      2. 1

                                                                        Yeah, I use Emacs for that. And for everything else, of course. I’d be interested in a different editor, and I’ve tried BBEdit, but lord almighty, it doesn’t support “proper” tab behaviour (always indent, never insert a literal tab character).

                                                                        1. 2

                                                                           I have the opposite problem with emacs. It’s very hard to get it to insert tab characters instead of inserting arbitrary whitespace I didn’t ask for.

                                                                          1. 2

                                                                            Right. I just use tine but, as I said above, if I’m gonna have a Mac I might as well use a graphical editor so I can use the same keyboard shortcuts everywhere and get the benefits of a mouse. I just spend too much time SSH’d into remote systems to make it viable unless the editor supports it natively.

                                                                            1. 2

                                                                              Oh, I meant that I use ange-ftp to do the remote editing thing.

                                                                    1. 4

                                                                       I’ve been thinking about something similar for a while now. Working on it—slowly, very slowly; maybe two decades will pass and it’ll still be vapourware:

                                                                      • There is a single ‘blessed’ application runtime for userspace. It is managed and safe. (In the tradition of java, javascript, lua, c#, etc.) This is necessary for some of the later points.

                                                                        • As gary bernhardt points out, this can be as fast as or even faster than running native code directly.

                                                                        • Since everything is running in ring 0, not only are ‘syscalls’ free, but so is task switching.

                                                                          • There is thus no reason to use coroutines over ‘real’ threads.
                                                                      • All objects are opaque. That is:

                                                                        • All objects are transparently synced to disc.

                                                                           • Thus, the ‘database’ doesn’t need to be something special, and you can form queries directly with code (this doesn’t necessarily scale as well, but it can be an option, and you can use a DSL only for complex queries if necessary)
                                                                        • All objects may transparently be shared between threads (modulo permissioning; see below)

                                                                        • All objects may transparently originate in a remote host (in which case changes are not necessarily synced to disc, but are synced to the remote; a la nfs)

                                                                          • Network-shared objects can also be transparently upgraded to be shared via a distributed consensus system, like raft.
                                                                      • Instead of a file system, there is a ‘root’ object, shared between all processes. Its form is arbitrary.

                                                                       • Every ‘thread’ runs in a security domain which is, more or less, a set of permissions (read/write/execute) for every object. (A minimal sketch follows this list.)

                                                                        • A thread can shed permissions at will, and it can spawn a new thread which has fewer permissions than itself, but never gain permissions. There is no setuid escape hatch.

                                                                        • However, a thread with few permissions can still send messages to a thread with more permissions.

                                                                      • All threads are freely introspectible. They are just objects, associated with a procdir-like object in which all their unshared data reside.
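
                                                                       A minimal sketch of the shed-but-never-gain rule (invented names; real permissions would be per-object, as above):

                                                                       type Perm = "read" | "write" | "execute";

                                                                       class SecurityDomain {
                                                                           constructor(private perms: ReadonlySet<Perm>) {}

                                                                           has(p: Perm): boolean { return this.perms.has(p); }

                                                                           // Spawning intersects with the parent's set: it can only narrow.
                                                                           spawn(requested: Perm[]): SecurityDomain {
                                                                               return new SecurityDomain(
                                                                                   new Set(requested.filter((p) => this.perms.has(p)))
                                                                               );
                                                                           }
                                                                       }

                                                                       const root = new SecurityDomain(new Set<Perm>(["read", "write"]));
                                                                       const child = root.spawn(["read", "execute"]); // "execute" silently dropped
                                                                       child.has("execute"); // false; no setuid escape hatch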

                                                                      1. 3

                                                                        pst (since I’m the guy who has to point these out to everyone each time):

                                                                        • IBM i has a lot of these (object persistence, capabilities, high-level runtime only; older systems didn’t even have unprivileged mode on the CPU), but not all.

                                                                        • Domain has network paging (since that’s how it does storage architecture), but not most of the others.

                                                                        • Phantom does persistence for basically pausable/restartable computation. Weird, but interestingly adjacent.

                                                                        I need to write a blog post about this!

                                                                        1. 2

                                                                          Interesting! Never encountered that.

                                                                          Wiki says it’s proprietary and only works with ppc. Is there any way to play with it without shelling out impressive amounts of $$$ to IBM?

                                                                          1. 3

                                                                                       If you want your own hardware, you can buy a used IBM Power server for an amount on the order of a few hundred dollars, and installation media for a trial is available direct from IBM. While that’ll only work for 70 days before you need to reinstall, backup and restore procedures are fairly straightforward.

                                                                            If you don’t care about owning the hardware, there’s a public server with free access at https://pub400.com/.

                                                                            Whichever route you take, you’ll probably want to join ##ibmi on Freenode because you’ll have a lot of questions as you’re getting started.

                                                                            1. 2

                                                                                         Is there a particular model of Power you recommend? The Talos stuff is way too pricey.

                                                                              1. 2

                                                                                If you want it to run IBM i, you’re going to need to read a lot of documentation to figure out what to buy, because it’s all proprietary and licensed, and IBM has exactly 0 interest in officially licensing stuff for hobbyists. It also requires special firmware support, and will therefore not run on a Raptor system.

                                                                                I think the current advice is to aim for a Power 5, 6, or 7 server, because they have a good balance of cost, not needing a ton of specialized stuff to configure, and having licenses fixed to the server. (With older machines, you really want to have a 5250 terminal, which would need to be connected using IBM-proprietary twinax cabling. Newer machines have moved to a model where you rent capacity from IBM on your own hardware.)

                                                                                           I’d browse eBay for “IBM power server”, looking up the specs and license entitlements for each server you see. Given a serial number, you can look up the license entitlements on IBM’s capacity on demand website. For example, my server is an 8233-E8B with serial number 062F6AP. Plugging that into IBM’s website, you see that I have a POD code and a VET code. You can cross reference those codes with this website to see that I have entitlements for 24 cores and PowerVM Enterprise (even though there are only 18 cores in my server, in theory I could add another processor card to add another 6. I’m given to understand that this is risky and may involve needing to contact IBM sales to get your system working again).

                                                                                You really want something with a PowerVM entitlement, because otherwise you need special IBM disks that are formatted with 520-byte sectors and support the SCSI skip read and skip write commands. You will also need to cross reference your system with the IBM i system map to see what OS versions you can run.

                                                                                Plan to be watching eBay for a while; while you can find decent machines for €300-500, it’s going to take some time for one to show up.

                                                                                Also, I’m still relatively new to this whole field; it’s a very good idea to join ##ibmi on freenode to sanity check any hardware you’re considering buying.

                                                                            2. 1

                                                                              There’s no emulator, and I’m not holding my breath for one any time soon.

                                                                              Domain is emulated by MAME, and Phantom runs in most virtualization software though.

                                                                            3. 2

                                                                                     Hey Calvin, please write a blog post about this.

                                                                              1. 1

                                                                                Please do

                                                                              2. 3

                                                                                I’ve been working on this but with WebAssembly.

                                                                                1. 1

                                                                                  I am curious. Is there source code available?

                                                                                  1. 1

                                                                                    It’s still in the planning phase, sadly. I only have so much time given it’s one of my many side projects.

                                                                                2. 2

                                                                                  Sounds an awful lot like Microsoft Midori. It doesn’t mention transparent object persistence, but much of what you mentioned is there.

                                                                                  1. 1

                                                                                    You might be interested in this research OS, KeyKOS: http://cap-lore.com/CapTheory/upenn/

                                                                                    It has some of what you’re describing: the transparent persistence, and the fine-grained permissions. I think they tried to make IPC cheap. But it still used virtual memory to isolate processes.

                                                                                    I think it also had sort of… permissions for CPU time. One type of resource/capability that a process holds is a ticket that entitles it to run for some amount of time (or maybe some number of CPU cycles?). I didn’t really understand that part.

                                                                                    1. 2

                                                                                      Looks interesting. (And, one of its descendants was still alive in 2013.) But, I think anything depending on virtual memory to do permissioning is bound to fail in this regard.

                                                                                      The problem is that IPC can’t just be cheap; it needs to be free.

                                                                                      Writing text to a file should be the same kind of expensive as assigning a value to a variable. Calling a function should be the same kind of expensive as creating a process. (Cache miss, maybe. Mispredict, maybe. Interrupt, full TLB flush, and context switch? No way.)

                                                                                       Otherwise, you end up in an in-between state where you’re discouraged from taking full advantage of (possibly networked) IPC, because even if it’s cheap, it’ll never be as cheap as a direct function call. By making the distinction invisible (and relying on the OS to smooth it over), you get a more unified interface.
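
                                                                                       For flavour, a sketch of that unified interface (remoteObject and Invoke are invented; a real system would bury this in the OS):

                                                                                       type Invoke = (method: string, args: unknown[]) => unknown;

                                                                                       // Call sites look identical whether the target object is local
                                                                                       // or lives on another host behind some transport.
                                                                                       function remoteObject<T extends object>(invoke: Invoke): T {
                                                                                           return new Proxy({} as T, {
                                                                                               get: (_target, prop) =>
                                                                                                   (...args: unknown[]) => invoke(String(prop), args),
                                                                                           });
                                                                                       }

                                                                                       const counter = remoteObject<{ add(n: number): void }>(
                                                                                           (m, a) => console.log(`would ship ${m}(${a}) over IPC`)
                                                                                       );
                                                                                       counter.add(2); // indistinguishable from a local method call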


                                                                                       One thing I will allow about VM-based security is that it’s much easier to get right. Just look at the exploit list for recent chrome/firefox js engines. Languages like k can be fast when interpreted without JIT, but such languages don’t have wide popularity. Still working on an answer to that. (Perhaps formal verification, a la compcert.)


                                                                                      CPU time permissions are an interesting idea, and one to which I haven’t given very much thought. Nominally, you don’t need time permissions as long as you have preemptive multitasking and can renice naughty processes. But there are other concerns like power usage and device lifetime.

                                                                                      1. 1

                                                                                        I’ve been imagining a system that’s beautiful. It’s a smalltalk with files, not images, with a very simple model. Everything is IPC. If you are on a Linux with network sockets, that socket is like every other method call, every addition, every syscall.

                                                                                        Let’s talk. I like your ideas, and think you might like this system in my mind.

                                                                                        1. 3

                                                                                          These sound great until you try and implement any of it, in which case you realise that now every single call might fail and/or simply never return, or return twice, or return to somebody else, or destroy your process entirely.

                                                                                          Not saying it can’t be done, just saying it almost certainly won’t resemble procedural, OO, or functional programming as we know it.

                                                                                           Edit: Dave Ackley is looking into this future, and his vision is about as distant from what we do now as I expect: https://www.youtube.com/user/DaveAckley

                                                                                          1. 1

                                                                                            You might want to read up on distributed objects from NeXT in the early 90s.

                                                                                      2. 1

                                                                                        This doesn’t solve all of the problems brought up in TFA. The main one is scheduling/autoscale. It is certainly easier—for instance, you can send a function object directly as a message to a thread running on a remote host—but you still have to create some sort of deployment system.

                                                                                        1. 1

                                                                                          (sorry, replied to wrong comment)

                                                                                      1. 1

                                                                                        That links to a nonexistent github wiki page. Perhaps you meant to link here instead?

                                                                                        1. 1
                                                                                          1. 1

                                                                                            oh, I posted a wrong link, but I can’t modify it

                                                                                            1. 1

                                                                                              But I added a new link to fix it.

                                                                                        1. 2

                                                                                          Neat!

                                                                                          Is it possible to play this as a single player game—to practice?

                                                                                          1. 2

                                                                                            Sure thing. You’ll have to provide both bots yourself, though. :) You can find a bunch of example bots here, including all r2con 2020 submissions—practice against these, maybe?

                                                                                          1. 2

                                                                                             Is Robert C. Seacord here? Maybe he can comment on whether fat pointers are on the spec roadmap (also see https://news.ycombinator.com/item?id=22865357).

                                                                                            1. 3

                                                                                               He’s relatively active on twitter; you can ask there.

                                                                                              Unfortunately, however, I suspect that the answer is ‘no’.

                                                                                               As it turns out, it’s quite easy to create your own fat pointer implementation and use it throughout your codebase. (My version is 50 lines, and stb stretchy buffers is slightly smaller (it does less). Neither accounts for automatic arrays, but it would be trivial to add.) What’s needed is boundschecking for regular array accesses, which is to say operator overloading, which is to say ‘probably not’.

                                                                                              However, if what you want is boundschecking, then address sanitizer has you covered.
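
                                                                                               (For readers wondering what the fat pointer buys you: the pointer carries its length, and every access is checked against it. A sketch of the shape in TypeScript, with invented names:)

                                                                                               class FatPtr {
                                                                                                   constructor(private buf: Uint8Array, readonly len: number) {}

                                                                                                   // Every access goes through the carried length.
                                                                                                   at(i: number): number {
                                                                                                       if (i < 0 || i >= this.len) throw new RangeError(`index ${i} out of bounds`);
                                                                                                       return this.buf[i];
                                                                                                   }
                                                                                               }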