This is a mirror of Leif Svalgaard’s unfinished book, AS/400 Machine-Level Programming. I have converted all the original documents from Word 97 format to PDF, and included a version compiled into a single PDF; the individual converted and unconverted documents are in the allchapt directory.
Really fascinating stuff here. So little is still known about some of the innards of the AS/400, or at least so I thought. I’d been wondering if anyone had ever looked at the internals of that tagged memory system… apparently so.
There are a few interesting things that stand out here. It looks like the tagged memory system isn’t used to enforce any security controls at the hardware level; it’s purely informational. User-level machine code literally has to execute an instruction to trap if the tag bit in a pointer isn’t set; remove that instruction, and your security collapses. The security appears to derive from the fact that the code that translates from the “MI” bytecode to Power assembly is trusted, which is very JVM-like. It’s also stated that there’s a normal (micro)kernel running under everything. Reading between the lines, it sounds like this kernel is used for task switching (and probably swapping from disk, given the single-level store design), but all of its threads run using the same page tables.
If my interpretation is correct, this means pretty much everything running on an AS/400 can be thought of as running under a microkernel but all in the same process(!), with JVM-like techniques used to enforce security between tasks.
Makes me think more and more about the potential of using an OS with a JVM-like design prohibiting the use of native code, like the JVM itself, the CLR, WebAssembly (this even just made the rounds: https://github.com/nebulet/nebulet), or alternatively an SFI-based design like NaCl. It would be very interesting to have a CheriBSD-like OS which doesn’t need special hardware, since (see the CHERI paper) it would allow more flexibility in how security controls are structured than a microkernel, without the overheads of a microkernel or the very coarse access control provided by page tables. Things like Singularity and the AS/400 suggest that this could really work.
It started as a hardware-enforcement mechanism, but the customers of most suppliers doing that cared about performance more than security, so they migrated off the hardware mechanisms that worked on a per-operation basis. You can learn more about it by reading the System/38 chapter of this book. Burroughs B5000 and Intel i432 were other commercial offerings with hardware-level security; successors to those concepts are the SAFE and CHERI architectures.
Edit: Writing as I read your comment. I see CheriBSD in it later. :)
“It’s also stated that there’s a normal (micro)kernel running under everything.”
I’m skeptical of that. Even System/38 had around a million lines of high-level code in the OS, and microkernels weren’t fast back then either. I’d guess monolithic even if it went for a modular design. A microkernel would be a pleasant surprise, though.
“Makes me think more and more about the potential of using an OS with a JVM-like design prohibiting the use of native code, like the JVM itself”
That would be JX Operating System.
“alternatively an SFI-based design like NaCl”
CheriBSD looks stronger than that since it enforces POLA on protected primitives. However, Criswell’s group, doing stuff like SVA-OS, had an SFI-like design called KCoFI applied to FreeBSD 9.
I’m pretty generally annoyed with the container ecosystem prevalent today. I don’t want to run an OS or manage containers; I want to upload/push my app to a mainframe in the cloud. To do this you really need mainframe-level isolation: limit access to system resources, limit heap, limit stack, limit cycles per request, essentially. There’s pretty much no language/runtime that allows this level of sandboxing. Some cloud stuff gets close (GAE, Amazon Elastic Beanstalk), but their pricing is per container/instance.
The refrain I get when discussing this is: use containers with cpulimit, x, y, and z. My response is: a container is at minimum one process; how many processes can one machine host, hundreds, thousands? The point is there’s a limit imposed by hardware. A mainframe-like environment would not have this limit, since it could be async: per request, per user account.
Toward that end I’ve been playing around with sandboxing Lua: with a custom allocator you can manage the heap resource, with a debug hook you can manage “cycles”, and the stack is limited as well. I don’t know how I feel about writing “mainframe webapps” in Lua, though.
I’m guessing by “really need” mainframe-level isolation you just mean the feature set. In that case, it is a good feature set that many have tried to approximate in homebrew situations. The last project doing it also ended up exploring Lua. I want to look at your first requirement, though, since I’m not sure it’s necessary.
“My response is: a container is at minimum one process; how many processes can one machine host, hundreds, thousands? The point is there’s a limit imposed by hardware.”
The traditional solution to this was a load-balancing cluster. They were down to microsecond latencies on some network cards even back in the day. You can have the load balancer distributing the requests across machines or a master machine driving the worker machines. You get as much processing as you have hardware.
The times when this doesn’t work are when one application needs more RAM than one system can have, or when even microsecond latencies are unacceptable. This is rarely a problem given the distributed components we have these days. Yet the solution in the past was not mainframes but NUMA machines like SGI’s UV. They run Linux these days, too. Today’s big x86 servers are a lot like older NUMA machines, and there are special-purpose devices that can connect them into NUMA machines. There were also Distributed Shared Memory (DSM) machines that emulated the NUMA model across a low-latency, high-performance cluster. CompSci had a few FOSS-licensed prototypes for doing that on x86.
So, consider load-balancing clusters, Beowulf clusters, NUMA machines, and DSM before “mainframes”, since mainframes have a very specific meaning with associated cost, restrictions, lock-in, etc.
“Toward that end I’ve been playing around with sandboxing Lua”
Although I don’t have original work, I did find the Supple sandbox doing a search. Another approach from high-assurance security was combining a separation kernel or microkernel OS with lightweight runtimes, each component running in its own partition and communicating over IPC-based middleware. The Barracuda Application Server is the only commercial attempt I know of that used a combo of the INTEGRITY RTOS, C-based apps, and Lua-based apps. If they did typical usage, the Lua-based apps would run in a dedicated partition with time and space partitioning. They could’ve bullshitted, though. FOSS options for that include the Muen separation kernel, Genode, and the L4 family. That’s on top of any Linux-based solution you might go with for maturity and ease of use.