1. 2

    I guess real world input is more prone to predictability than input designed to not stabilize.

    1. 3

      I’m not sure what you mean. Before I answer, can you clarify what you mean by “input” here? Input to the benchmarking programs (e.g. CLI arguments), or the benchmark programs as inputs to the VMs?

      1. 2

        I refer to the work put in to make sure the VMs didn’t optimize code away into constants or other such “cheating” optimizations. I suspect most real programs do stumble into readily optimized paths, then don’t change over the course of their running.

        1. 9

          That’s an open question which I wouldn’t be able to answer without doing another (lengthy) experiment.

          But if you were to ask for my gut feeling: yeah, depending on the program, some code paths might be more readily optimisable. But on the other hand I suspect that real programs:

          • Are much larger than your typical CLBG benchmark.
          • Are less deterministic than our (deterministic) benchmarks and the path taken through the CFG is likely to depend upon (e.g.) RNGs seeded with the time, or stuff read from outside the program itself, like the environment and file descriptors, etc.

          I think those factors would negatively impact JIT compilation, and for this reason I suspect that most real-world programs are less likely to stabilise, and when they do, it would take much longer.

          But, like I said, this is pure speculation and my co-authors may even disagree with me on this.

          In any case, more research needed!

          1. 1

            Thank you for your work on this! I especially appreciate the effort you went through to ensure a clean environment for each run.

            1. 1

              Could you create an instrumented build of Chromium and measure this in the wild? Perhaps with the Alexa top 1k? The problem is getting deep application usage, which would most likely require a human to drive the application.

              1. 1

                I guess it’s possible (you could use Selenium to drive the browser).
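
                A rough sketch of the driving side (hypothetical; it assumes the selenium package plus a chromedriver matching the instrumented build, and the paths and URLs are made up):

                from selenium import webdriver

                # Hypothetical sketch: point Selenium at the instrumented build.
                options = webdriver.ChromeOptions()
                options.binary_location = "/path/to/instrumented/chromium"

                urls = ["https://example.com", "https://example.org"]  # stand-in for a top-1k list

                driver = webdriver.Chrome(options=options)
                try:
                    for url in urls:
                        driver.get(url)  # blocks until the page load completes
                        # ...poke the page / collect whatever the instrumentation exposes...
                finally:
                    driver.quit()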

        2. 1

          “prone”?

          1. 1

            Having a tendency.

            1. 1

              “likely or liable to suffer from, do, or experience something unpleasant or regrettable”

              The word is quite negative.

              1. 1

                Mildly negative at most, according to Webster’s.

        1. 9

          It landed, but I still have some follow-up work to get full support for all types of snapshots. The process that does disk I/O starts with the fds preopened, and is chrooted and pledged, which makes opening the base images of multi-disk snapshots hard.

          1. 2

            Thanks for working on this.

            (I’m really hoping that someday I’ll be able to install Debian under vmm(4) from official install media. Currently it doesn’t detect any CD drives, and I’ve not been able to figure out why).

            1. 1

              Because Debian install media lack virtio drivers.

              1. 1

                I wonder if we could persuade them to include them?

                I suppose VirtualBox and QEMU emulate physical CD drives rather than virtio?

          1. 5

            Nice idea.

            Do the papers have to be new, or can old papers be discussed?

            1. 1

              I think each paper has to have a link in the Paperkast. Old or new doesn’t matter.

              1. 1

                This is also a question I had, along with what expectations there are on commenters. I’m a “casual” reader of the primary literature in that I’m not a researcher myself, so I have different expectations and social norms than someone who’s in the community. Do I engage with the conversations at Paperkast, or is it for academics?

                1. 1

                  This is for everybody. However, if the community consists of academics and grad students, there will be technical discussions.

              1. 10

                I’m a bit puzzled why the author seems to think that integer wrap on overflow behaviour has something to do with C and undefined behaviour. The same thing happens with nearly all languages which use the processor’s integer arithmetic, because those semantics are provided by the processor itself. Java, C#, etc. all wrap on overflow. There are some exceptions though - Ada provides the “exception on overflow” semantics the author prefers, but it does come with a significant performance penalty because checking for overflow requires additional instructions after every arithmetic operation.

                The point here is that if you want performant arithmetic it’s all about what the processor is designed to do, not anything to do with the languages. Java defines integer wrap as the language’s standard behaviour but as a result it incurs a performance penalty for integer arithmetic on processors which don’t behave this way. C doesn’t incur this penalty because it basically accepts that overflow works however the processor implements it. And let’s face it: if your program is reliant on the exact semantics of overflowing numbers you’re probably doing it wrong anyway.

                There are some processors which provide interrupts on integer overflow. This eliminates the performance penalty associated with overflow checks if your language is Ada and so you want to trap on overflow. There are other semantics around too - DSP processors often have “clamp on overflow” instead, since that suits the use case better, and old Unisys computers use one’s complement rather than two’s complement, so their overflow behaves slightly differently.
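
                To make the three semantics concrete, here’s a small illustrative sketch of my own (Python integers never overflow, so 32-bit unsigned behaviour is emulated with a mask):

                MASK = 0xFFFFFFFF  # 2^32 - 1

                def add_wrap(a, b):
                    # wrap on overflow: what most processors (and Java, C#) give you
                    return (a + b) & MASK

                def add_clamp(a, b):
                    # clamp/saturate on overflow: common on DSPs
                    return min(a + b, MASK)

                def add_trap(a, b):
                    # trap on overflow: Ada-style, costs a check per operation
                    r = a + b
                    if r > MASK:
                        raise OverflowError("integer overflow")
                    return r

                print(add_wrap(MASK, 1))   # 0
                print(add_clamp(MASK, 1))  # 4294967295
                # add_trap(MASK, 1) raises OverflowError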

                1. 4

                  The performance penalty of “trap on overflow” can be reduced by clever modeling, for example by allowing a delayed trap instead of an immediate trap. As-if Infinitely Ranged is one such model. An immediate trap disallows optimizing a+b-b to a, because if a+b overflows the former traps and the latter doesn’t. A delayed trap allows such an optimization.
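
                  To see why the rewrite is sound under wrapping (my illustration, again emulating 32-bit unsigned arithmetic in Python):

                  import random

                  MASK = 0xFFFFFFFF
                  for _ in range(1000):
                      a, b = random.getrandbits(32), random.getrandbits(32)
                      # a + b - b computed with wrapping at every step...
                      wrapped = (((a + b) & MASK) - b) & MASK
                      assert wrapped == a  # ...always recovers a, so rewriting to `a` is sound
                      # an immediate trap, though, would fire on the intermediate a + b
                      # whenever a + b > MASK, even though the final result is fine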

                  1. 3

                    I’m a bit puzzled why the author seems to think that integer wrap on overflow behaviour has something to do with C and undefined behaviour.

                    You are mixing up the underlying behaviour of the processor with the defined (or undefined) behaviour of the language. Wrap on integer overflow is indeed the natural behaviour of most common processors, but C doesn’t specify it. The post is saying that some people have argued that wrap-on-overflow should be the defined behaviour of the C language, or at least the implementation-defined behaviour implemented by compilers, and then goes on to provide arguments against that. There is a clear example in the post of where the behaviour of a C program doesn’t match that of 2’s complement arithmetic (wrapping).

                    The same thing happens with nearly all languages which use the processor’s integer arithmetic, because those semantics are provided by the processor itself.

                    That’s the point - in C, it doesn’t happen.

                    1. 1

                      I don’t get the point. The advantage of using integer wrap for C on processors that implement integer wrap is that it is high performance, simplifies compilation, has clear semantics, and is the semantics programmers expect. If you want to argue that it should be e.g. trap on overflow, you need to provide a reason more substantive than theoretical compiler optimizations that are shown by hand waving. The argument that it should be “generate code that overflows but pretend you don’t” needs a stronger justification because the resulting semantics are muddy as hell. I’m actually in favor of a debug-mode overflow trap for C but an optimized mode that uses processor semantics.

                      1. 1

                        you need to provide a reason more substantive than theoretical compiler optimizations that are shown by hand waving

                        Read the post, then; there are substantive reasons in it. I’m not engaging with you if you’re going to start by misrepresenting reasoned arguments as “hand waving”.

                        1. 0

                          “However, while in many cases there is no benefit for C, the code generation engines and optimisers in compilers are commonly general and could be used for other languages where the same might not be so generally true; “

                          Ok! You think that’s a substantive argument.

                          1. 1

                            You’re making a straw man. What you quoted is part of a much larger post.

                            1. 1

                              That’s not what “straw man” means.

                              1. 1

                                It means that you’re misrepresenting the argument, which you are. I said that the post contained substantive reasons, you picked a particular part and insinuated that I had claimed that that particular part on its own constituted a substantive reason, which I didn’t. And: you said “If you want to argue that it should be e.g. trap on overflow, you need to provide a reason more substantive than theoretical compiler optimizations that are shown by hand waving”, but optimisations have very little to do with trapping being a better behaviour than wrapping, and I never claimed they did, other than to the limited extent that trapping potentially allows some optimisations which wrapping does not. But that was not the only reason given for trapping being a preferable behaviour; again, you misrepresented the argument.

                    2. 2

                      I’m a bit puzzled why the author seems to think that integer wrap on overflow behaviour has something to do with C and undefined behaviour.

                      They are related, yes. E.g. whilst signed integer overflow is well defined in most individual hardware architectures (usually as a two’s complement wrap), it could vary between architectures, and thus C leaves signed integer overflow undefined.

                      1. 0

                        The whole argument is odd.

                      1. 9

                        wrap comes naturally from hardware implementation of numbers.

                        but certainly a programming language can throw an exception when the carry bit comes on.

                        1. 1

                          wrap comes naturally from hardware implementation of numbers.

                          Correct for most platforms, but different platforms may have different integer implementations (for example some DSPs have saturating integers which do not wrap at all). As I understand it (and OP can verify), the C spec leaves signed integer overflow undefined to allow for the whole range of signed integer wrap/saturation behaviours.

                          It’s true that on the popular architectures signed integers do indeed wrap. So why does the C spec not make that the norm for x86_64 and ARM? Well, because then your C programs wouldn’t be portable.

                          It’s also interesting that unsigned integer overflow is defined to wrap in C, yet unsigned integers can saturate on some platforms! Madness.

                          but certainly a programming language can throw an exception when the carry bit comes on.

                          Well, OP is talking about C…

                          1. 1

                            Is there any well-known PGP alternative other than this? Based on history, I cannot blindly trust code that is written by one human being and is not battle-tested.

                            In any case, props to them for trying to start something. PGP does need to die.

                            1. 7

                              A while ago I found http://minilock.io/, which sounds interesting as a PGP alternative. I haven’t used it myself though.

                              1. 2

                                Its primitives and an executable model were also formally verified by Galois using their SAW tool. Quite interesting.

                              2. 6

                                This is mostly a remix, in that the primitives are copied from other software packages. It’s also designed to be run under very boring conditions: running locally on your laptop, encrypting files that you control, in a manual fashion (an attacker can’t submit 2^## plaintexts and observe the results), etc.

                                Not saying you shouldn’t ever be skeptical about new crypto code, but there is a big difference between this and hobbyist TLS server implementations.

                                1. 5

                                  I’m Enchive’s author. You’ve very accurately captured the situation. I didn’t write any of the crypto primitives. Those parts are mature, popular implementations taken from elsewhere. Enchive is mostly about gluing those libraries together with a user interface.

                                  I was (and, to some extent, still am) nervous about Enchive’s message construction. Unlike the primitives, it doesn’t come from an external source, and it was the first time I’ve ever designed something like that. It’s easy to screw up. Having learned a lot since then, if I was designing it today, I’d do it differently.

                                  As you pointed out, Enchive only runs in the most boring circumstances. This allows for a large margin of error. I’ve intentionally oriented Enchive around this boring, offline archive encryption.

                                  I’d love it if someone smarter and more knowledgeable than me had written a similar tool — e.g. a cleanly implemented, asymmetric archive encryption tool with passphrase-generated keys. I’d just use that instead. But, since that doesn’t exist (as far as I know), I had to do it myself. Plus I’ve become very dissatisfied with the direction GnuPG has taken, and my confidence in it has dropped.

                                  1. 2

                                    I didn’t write any of the crypto primitives

                                    That’s not 100% true; I think you invented the KDF.

                                    1. 1

                                      I did invent the KDF, but it’s nothing more than SHA256 applied over and over on random positions of a large buffer, not really a new primitive.
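
                                      In Python-flavoured pseudocode the shape is roughly this (a toy illustration of the idea only, not Enchive’s actual construction; the buffer size and round count here are made up):

                                      import hashlib

                                      def toy_kdf(passphrase: bytes, buf_size: int = 1 << 20, rounds: int = 100_000) -> bytes:
                                          # Deterministically fill a large buffer from the passphrase.
                                          block = hashlib.sha256(passphrase).digest()
                                          buf = bytearray()
                                          while len(buf) < buf_size:
                                              block = hashlib.sha256(block).digest()
                                              buf += block
                                          # Repeatedly hash pseudo-random windows of the buffer into a running digest.
                                          digest = block
                                          for _ in range(rounds):
                                              pos = int.from_bytes(digest[:8], "little") % (buf_size - 32)
                                              digest = hashlib.sha256(digest + bytes(buf[pos:pos + 32])).digest()
                                          return digest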

                                2. 6

                                  Keybase? Kinda?…

                                  1. 4

                                    It always bothers me when I see the update say it needs over 80 megabytes for something doing crypto. Maybe no problems will show up that leak keys or cause a compromise. That’s a lot of binary, though. I wasn’t giving it my main keypair either. So, I still use GPG to encrypt/decrypt text or zip files I send over untrusted mediums. I use Keybase mostly for extra verification of other people and/or its chat feature.

                                  2. 2

                                    Something based on nacl/libsodium, in a similar vein to signify, would be pretty nice. asignify does apparently use asymmetric encryption via cryptobox, but I believe it is also written/maintained by one person currently.

                                    1. 1

                                      https://github.com/stealth/opmsg is a possible alternative.

                                      Then there was Tedu’s reop experiment: https://www.tedunangst.com/flak/post/reop

                                    1. 2

                                      I found this very readable. I learned last week from another Lobsters post that I had been using monads in Rust without knowing (Option and and_then), and this post helped to reinforce what I had learned.

                                      I think the author downplays exceptions a little too far though. In a language like Java or Python, you could have different exception types for each step (say ScanError, ParseError, …) all subclassing a common base exception (say InterpError), then do (e.g. Python):

                                      def interpret(program):
                                          try:
                                              tokens = scan(program)
                                              ast = parse(tokens)
                                              checked_ast = typecheck(ast)
                                              result = eval(checked_ast)
                                          except InterpError as e:
                                              print(e)
                                      

                                      It may not be as concise as:

                                      fun compile program =
                                        (Success program) >>= scan >>= parse >>= typecheck >>= eval
                                      

                                      but it isn’t painful to look at or understand either.

                                      (I’ll probably be flamed for this!)

                                      1. 1

                                        but it isn’t painful to look at or understand either.

                                        The problem is you can’t safely refactor it. You can’t move your calls around even when they look like they have nothing to do with each other, because changing the order they’re called in changes the control flow. And within the try: block you no longer have any way to tell which functions might error and which functions can’t, which makes it much harder to reason about possible paths through the function.
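
                                         A contrived example of the reordering hazard (hypothetical names):

                                         class InterpError(Exception):
                                             pass

                                         def load_config():
                                             raise InterpError("bad config")

                                         def open_log():
                                             raise InterpError("log dir missing")

                                         try:
                                             cfg = load_config()  # the reported error is "bad config"...
                                             log = open_log()     # ...swap these two lines and it becomes "log dir
                                                                  # missing", though the calls look unrelated
                                         except InterpError as e:
                                             print(e)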

                                      1. 6

                                        Yep, this is how I figured out monads too, but when using Rust! There is more to them though - the laws are important, but it’s sometimes easier to learn them by examples first!

                                        1. 3

                                          Can you show an example where a monad is useful in a Rust program?

                                          (I’m not a functional programmer, and have never knowingly used a monad)

                                          1. 10

                                            I learned about monads via Maybe in Haskell; the equivalent in Rust is called Option.

                                            Option<T> is a type that can hold something or nothing:

                                            enum Option<T> {
                                                None,
                                                Some(T),
                                            }
                                            

                                            Rust doesn’t have null; you use option instead.

                                             Options are a particular instance of the more general Monad concept. Monads have two important operations; Haskell calls them “return” and “bind”. Rust isn’t able to express Monads as a general abstraction, and so doesn’t have particular names for them. For Option<T>, return is the Some constructor, that is,

                                            let x = Option::Some("hello");
                                            

                                             return takes a value of some type T (in this case, a string slice) and creates an Option<T>. So here, x has the type Option<&str>.

                                            bind takes two arguments: something of the monad type, and a function. This function takes something of a type, and returns an instance of the monad type. That’s… not well worded. Let’s look at the code. For Option<T>, bind is called and_then. Here’s how you use it:

                                            let x = Option::Some("Hello");
                                            let y = x.and_then(|arg| Some(format!("{}!!!", arg)));
                                            
                                            println!("{:?}", y);
                                            

                                            this will print Some("Hello!!!"). The trick is this: the function it takes as an argument only gets called if the Option is Some; if it’s None, nothing happens. This lets you compose things together, and reduces boilerplate when doing so. Let’s look at how and_then is defined:

                                            fn and_then<U, F>(self, f: F) -> Option<U> 
                                            where F: FnOnce(T) -> Option<U>
                                            {
                                                match self {
                                                    Some(x) => f(x),
                                                    None => None,
                                                }
                                            }
                                            

                                            So, and_then takes an instance of Option and a function, f. It then matches on the instance, and if it’s Some, calls f passing in the information inside the option. If it’s None, then it’s just propagated.

                                             How is this actually useful? Well, these little patterns form building blocks you can use to easily compose code. With just one and_then call, it’s not that much shorter than the match, but with multiple, it’s much clearer what’s going on. But beyond that, other types are also monads, and therefore have bind and return! Rust’s Result<T, E> type, similar to Haskell’s Either, also has and_then and Ok. So once you learn the and_then pattern, you can apply it across a wide array of types.

                                            Make sense?

                                            1. 3

                                              Make sense?

                                              It absolutely does! I’ve used and_then extensively in my own Rust code, but never known that I was using a monad. Thanks for the explanation Steve.

                                              But there’s one gap in my understanding now. Languages like Haskell need monads to express things with side-effects like IO (right?). What’s unique about a monad that allows the expression of side effects in these languages?

                                              1. 7

                                                No problem!

                                                This is also why Rust “can’t express monads”, we can have instances of individual monads, but can’t express the higher concept of monads themselves. For that, we’d need a way to talk about “the type of a type”, which is another phrasing for “higher minded types”.

                                                So, originally, Haskell didn’t have monads, and IO was done another way. So it’s not required. But, I am about to board a flight, so my answer will have to wait a bit. Maybe someone else will chime in too.

                                                1. 2

                                                  higher minded types

                                                  (Just so others don’t get confused, I think you meant “kinded” here, right?)

                                                  1. 1

                                                    Heh, yes. Thanks.

                                                2. 3

                                                  A monad has the ability to express sequence, which is useful for imperative programming. It’s not unique, e.g. you can write many imperative programs using just monoid, functor, applicative or many other tools.

                                                  The useful function you get out of realising that IO forms a Monad is:

                                                  (>>=) :: IO a -> (a -> IO b) -> IO b
                                                  

                                                  An example of using this function:

                                                  getLine >>= putStrLn
                                                  
                                                  1. 4

                                                     I should say Monad is unique in being able to express that line of code, but there are many imperative programs which don’t need Monad. For example, just Semigroup can be used for things like this:

                                                    putStrLn "Hello" <> putStrLn "World"
                                                    

                                                    Or we could read some stuff in with Applicative:

                                                    data Person = Person { firstName :: String, lastName :: String }
                                                    liftA2 Person getLine getLine
                                                    

                                                    So Monad isn’t about side-effects or imperative programming, it’s just that imperative programming has a useful Monad, among other things.

                                                    1. 2

                                                      You are way ahead of me here and I’m probably starting to look silly, but isn’t expressing sequence in imperative languages trivial?

                                                      For example (Python):

                                                      x = f.readline()
                                                      print(x)
                                                      

                                                      x must be evaluated first because it is an argument of the second line. So sequence falls out of the hat.

                                                      Perhaps in a language like Haskell where you have laziness, you can never be sure if you have guarantees of sequence, and that’s why a monad is more useful in that context? Even then, surely data dependencies somewhat impose an ordering to evaluation?

                                                      For me, the utility of Steve’s and_then example wasn’t only about sequence, it was also about being able to (concisely) stop early if a None arose in the chain. That’s certainly useful.
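
                                                       That early exit is easy to mimic in Python with a small helper (my sketch):

                                                       def and_then(value, f):
                                                           # Python analogue of Rust's Option::and_then: skip f entirely on None
                                                           return None if value is None else f(value)

                                                       def parse_int(s):
                                                           return int(s) if s.isdigit() else None

                                                       def reciprocal(n):
                                                           return 1 / n if n != 0 else None

                                                       print(and_then(and_then("25", parse_int), reciprocal))  # 0.04
                                                       print(and_then(and_then("0", parse_int), reciprocal))   # None (stopped early)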

                                                      1. 2

                                                        but isn’t expressing sequence in imperative languages trivial?

                                                        Yes.

                                                        In Haskell it is too:

                                                        (>>=) :: IO a -> (a -> IO b) -> IO b
                                                        

                                                        But we generalise that function signature to Monad:

                                                        (>>=) :: Monad m => m a -> (a -> m b) -> m b
                                                        

                                                         We don’t have a built-in idea of sequence. We just have functions like these. A generalisation which comes out is Monad. It just gives code reuse.

                                                        1. 1

                                                          Maybe is an instance of a monad, and there are many different kinds of monads. If you think of Maybe as “a monad that uses and_then for sequencing”, then “vanilla” sequencing can be seen as “a monad that uses id for sequencing” (and Promises in JavaScript can be seen as “a monad that uses Promise#flatMap for sequencing”).

                                                          Yes, expressing sequence in eager imperative languages is trivial because you can write statements one after the other. Now imagine a language where you have no statements, and instead everything is expressions. In this expression-only language, you can still express sequence by using data dependencies (you hit this nail right on the head). What would that look like? Probably something like this (in pseudo-JavaScript):

                                                          function (next2) {
                                                            (function (next) {
                                                              next(f.readline())
                                                            })(function (readline_result) {
                                                              next2(print(readline_result))
                                                            })
                                                          }
                                                          

                                                          with additional plumbing so that each following step has access to the variables bound in all steps before it (e.g. by passing a dictionary of in-scope variables). A monad captures the spirit of this, so instead of doing all the plumbing yourself, you choose a specific implementation of >>= that does your plumbing for you. The “vanilla” monad’s (this is not a real thing, I’m just making up this name to mean “plain old imperative sequences”) implementation of >>= just does argument plumbing for you, whereas the Maybe monad’s implementation of >>= also checks whether things are None, and the Promise monad’s implementation of >>= also calls Promise#then and flattens any nested promises for you.

                                                          What’s useful here is the idea that there is this set of data structures (i.e. monads) that capture different meanings of “sequencing”, and that they all have a similar interface (e.g. they have all an implementation of >>= and return with the same signature) so you can write functions that are generic over all of them.
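
                                                           A sketch of that genericity in hypothetical Python (plain functions standing in for >>=, since Python has no Maybe type built in):

                                                           def bind_vanilla(value, f):
                                                               # "plain old imperative sequencing": just pass the value along
                                                               return f(value)

                                                           def bind_maybe(value, f):
                                                               # Maybe-style sequencing: short-circuit on None
                                                               return None if value is None else f(value)

                                                           def pipeline(bind, x):
                                                               # the same program, generic over which notion of sequencing is plugged in
                                                               return bind(bind(x, lambda v: v + 1), lambda v: v * 2)

                                                           print(pipeline(bind_vanilla, 5))   # 12
                                                           print(pipeline(bind_maybe, None))  # None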

                                                          Does that make sense?

                                                      2. 2

                                                         There is a comment below saying it pretty succinctly:

                                                        A monad is basically defined around the idea that we can’t always undo whatever we just did (…)

                                                        To make that concrete, readStuffFromDisk |> IO.andThen (\stuff -> printStuff stuff) - in the function after andThen, the “stuff” is made available to you - the function runs after the side effect happened. You can say it needed specific API and the concept of monads satisfies that API.

                                                        Modelling IO with monads allows you to run functions a -> IO b (take a pure value and do an effectful function on it). Compare that to functions like a -> b (Functor). These wouldn’t cut it - let’s say you’d read a String from the disk - you could then only convert it to another string but not do an additional effect.

                                                         EDIT: I might not have got the wording entirely right. I omitted a part of the type annotation that says the a comes from an effect already. With Functor you could not reuse values that came from an effect; with Monad you can.

                                                1. 7

                                                  I’ve been guilty of trash-talking other projects myself in the past

                                                  Well, the blog is titled “Software is Crap” :)

                                                  [Rust’s] designers made the unfortunate choice of having memory allocation failure cause termination – which is perhaps ok for some applications, but not in general for system programs, and certainly not for init

                                                  Rust can help with not allocating at all (e.g. heapless), and try_reserve is in nightly already.

                                                  Zig though is a language oriented exactly at this: it forces you to manually pick an allocator and handle allocation failure. But it is much younger than Rust, so if you’re worried about Rust “mutating” (FYI, Rust 1.x is stable, as in backwards compatible), it’s way too early to consider Zig (0.x).

                                                  non-Linux OSes are always going to be Rust’s second-class citizens

                                                  Yeah, related to that: Rust’s designers made the unfortunate assumption that OSes don’t break userspace from one release to another, just like Linux. The target extension RFC would solve this.

                                                  Other than that… while the core team is indeed focused on the “big three” (Linux/Windows/Mac), Rust does support many “unusual” targets, including Haiku, Fuchsia, Redox, CloudABI.

                                                  Back to inits and service managers/supervisors:

                                                  There are so many of them, many of them are interesting (I’ve been looking at immortal recently), but they all have one big problem: existing service files/scripts on your system are not written for them. So I usually end up just using FreeBSD’s rc for basic pre-packaged daemons + runit for my own custom stuff.

                                                  The Ideal Service Manager™ should:

                                                  • read existing service definitions from system packages (rc scripts, OpenRC scripts, systemd units, daemontools/runit/s6 style bare shell scripts)
                                                  • prevent the services from daemonizing, somehow (injecting -f into $SERVICE_flags? horrible and evil hacks like LD_PRELOADing a library that overrides libc’s daemon() with a no-op? lol)
                                                  • force the services to log to syslog, somehow (redirect stdout/stderr, but what about daemons that open a custom logfile by default? maybe just let them do that)
                                                  • supervise them like runit does

                                                   I guess instead of preventing forking it could track forking services with cgroups on Linux, and… with path=/ ip4=inherit ip6=inherit sysvmsg=inherit ... jails on FreeBSD? I wish there was a 100% reliable way to make sure any service runs in the foreground.

                                                  1. 5

                                                    Well, the blog is titled “Software is Crap” :)

                                                    Yeah, there is that. I had originally wanted to emulate a humorous style I’d seen elsewhere (the long defunct “bileblog”) which badmouthed things in such an over-the-top fashion that you knew it was humorous; I could never quite get that right and it always seemed like I was just being nasty. Now I just try to provide objective criticism; it’s probably not as entertaining to read, but it’s also less likely to upset people. And of course, I also write about Dinit and occasionally write (hopefully) helpful articles on other topics.

                                                    Rust can help with not allocating at all (e.g. heapless), and try_reserve is in nightly already. Zig though is a language oriented exactly at this:

                                                     heapless probably wouldn’t serve my needs, but things like try_reserve are sorely needed for Rust to be a serious systems language, so I’m glad that’s happening. There are other reasons (perhaps more subjective) that I don’t like Rust - particular aspects of its syntax and semantics bother me - but in general I think the concept of ownership and lifetime as part of type are worthwhile. I have no doubt that good things will come from Rust.

                                                    As for Zig, I need to look at it again. It certainly also has promise; but you’re right that I’d be worried about its stability and future.

                                                    I guess instead of preventing forking it can support tracking forking services with cgroups on Linux, and… with path=/ ip4=inherit ip6=inherit sysvmsg=inherit … jails on FreeBSD? I wish there was a 100% reliable way to make sure any service runs in the foreground.

                                                    Yeah, that’s a fundamental problem. Linux and DragonFlyBSD both have a simple means to prevent re-parenting past a particular process, which is one potential way to solve it (if you are ok with inserting an intermediate process, and really I don’t think that’s a big deal); cgroups/jails as you mention are another; any other option starts to feel pretty hacky (upstart apparently used ptrace to track forks, but that really feels like abuse of the mechanism to me).

                                                    Thanks for your comments.

                                                    1. 5

                                                      Yeah, there is that. I had originally wanted to emulate a humorous style I’d seen elsewhere (the long defunct “bileblog”) which badmouthed things in such an over-the-top fashion that you knew it was humorous; I could never quite get that right and it always seemed like I was just being nasty. Now I just try to provide objective criticism; it’s probably not as entertaining to read, but it’s also less likely to upset people. And of course, I also write about Dinit and occasionally write (hopefully) helpful articles on other topics.

                                                       The problem with it is: that style of humor is so common in the programming world that even a good instance of it is not at all novel. Also, as you say, it’s very hard to get right, even for seasoned comedians, which - no offense - most programmers aren’t.

                                                       heapless probably wouldn’t serve my needs, but things like try_reserve are sorely needed for Rust to be a serious systems language, so I’m glad that’s happening.

                                                      Everyone attaches their own meaning to “systems language”, and adding “serious” feels a bit like moving goalposts. “Ah, yeah, you got the systems part down, but how about serious”. It might not be convenient at all places and I agree that some things are undone, but we’re up against literally decades old languages. We’re definitely serious about getting that issue solved in a foreseeable timeframe.

                                                      Heapless helps in the sense that you can provide your own stuff on top. Even the basic Box type in Rust is not part of libcore, but libstd.

                                                      Servo takes a middle ground of extending Vec with fallible push. (https://github.com/servo/servo/blob/master/components/fallible/lib.rs)

                                                       The thing here is mostly that the stdlib’s collections consider allocation failure an unrecoverable error. For ergonomic reasons, that’s a good pick for a standard library.

                                                      So, it’s perfectly feasible to write your own collection library (or, for example extension) even now.

                                                       Also, here’s a list of notes about what’s needed to make fallible stuff cool in the language proper. I can assure you after attending the All Hands that this is definitely a hot topic, but also a hard one.

                                                      This just as a little bit of context, I’m not trying to convince you.

                                                      I’d be very interested in what your semantic issues with Rust are.

                                                      To add to that, I’m happy that you took a look at the language, even if you came away wanting.

                                                       As for Zig, I need to look at it again. It certainly also has promise; but you’re right that I’d be worried about its stability and future.

                                                      I’m definitely hoping for more “new generation” systems programming languages. I think there is quite some space around and I hope that some of these make it.

                                                      1. 4

                                                        I’d be very interested in what your semantic issues with Rust are.

                                                         A proper answer to that would need me to sit down for an hour (or more) and go through again the material on Rust to remember the issues I had. Some of them aren’t very significant, some of them are definitely subjective. I should qualify: I’ve barely actually used Rust, just looked at it a number of times and had second-hand exposure via friends who’ve been using it extensively. The main thing I can remember off the top of my head that I didn’t like is that you get move semantics by default when passing objects to functions, except when the type implements the Copy trait (in which case you get a copy), so the presence or absence of a trait changes the semantics of an operation. This is subtle and, potentially, confusing (though the error message is pretty direct). I’d rather have a distinction in the function call syntax to specify “I want this parameter moved” vs copied.

                                                        Other things that bother me are lack of exceptions (I realise this was most likely a design decision, just not one that I agree with) and limited metaprogramming (the “hygienic macro” facility, when I looked at it, appeared a bit half-baked; but then, I’m comparing to C++ which has very extensive metaprogramming facilities, even if they have awful syntax).

                                                        I can assure you after attending the All Hands that this is definitely a hot topic, but also a hard one.

                                                        Yep, understood.

                                                        I’m happy that you took a look at the language, even if you came away wanting.

                                                        I’ll be continuing to watch closely. I’m very interested in Rust. I honestly think that some of the ideas it’s brought to the table will change the way future languages are designed.

                                                        1. 3

                                                           …you get move semantics by default when passing objects to functions, except when the type implements the Copy trait (in which case you get a copy), so the presence or absence of a trait changes the semantics of an operation.

                                                           I can definitely understand how that would feel worrying, but in practice it’s not so bad: Rust doesn’t have copy constructors, so the Copy trait means “this type can be safely memcpy()d”. For types that can be cheaply and infinitely duplicated without (heap) allocation, like u32, copy vs. move isn’t that much of a semantic difference.

                                                          The closest thing to C++’s copy constructor is the Clone trait, whose .clone() method will make a separately-allocated copy of the thing. Clone is never automatically invoked by the compiler, so the difference between moving a String versus copying a String is somefunc(my_string) versus somefunc(my_string.clone()).

                                                          lack of exceptions

                                                           As a Python programmer, I’m pretty happy with Rust’s error-handling, especially post-1.0 when the ? early-return operator was added. I feel it’s a very nice balance between C and Go-style error handling, which is explicit to the point of yelling, and Java and Python-style error handling, which is minimal to the point where it’s hard to say what errors might occur where.

                                                          limited metaprogramming

                                                           It depends how much you care about getting your hands dirty. Rust doesn’t have full-scale template metaprogramming like C++, but the hygienic macro system (while limited) is a good start. If you want to go further, Rust’s build system includes a standard and cross-compilation-friendly system for running tasks before your code is compiled, so you can run your code through cpp or xsltproc or m4 or a custom Python script or whatever before the Rust compiler sees it. Lastly, “nightly” builds of the compiler will load arbitrary plugins (“procedural macros”) which will let you do all the crazy metaprogramming you like. Since this involves tight integration with the compiler’s internals, this is not a stable, supported feature, but nevertheless some high-profile Rust libraries like the Rocket web framework are built on it.

                                                      2. 2

                                                        Linux and DragonFlyBSD both have a simple means to prevent re-parenting past a particular process

                                                        Hmm?? This sounds very interesting! Please tell me more about it.

                                                        upstart apparently used ptrace to track forks

                                                        Oh, this made me realize that I can actually use DTrace to track forks!

                                                        1. 4

                                                          Hmm?? This sounds very interesting! Please tell me more about it.

                                                           In Linux:

                                                           prctl(PR_SET_CHILD_SUBREAPER, 1);

                                                           In DragonFlyBSD (and apparently FreeBSD too, I see):

                                                           procctl(P_PID, getpid(), PROC_REAP_ACQUIRE, NULL);

                                                          In both cases this marks the current process as a “reaper” - any child/grandchild process which double-forks or otherwise becomes orphaned will be reparented to this process rather than to init. Dinit uses this already to be able to supervise forking processes, but it still needs to be able to determine the pid (by reading it from a pid file). There’s the possibility though of inserting a per-service supervisor process which can then be used to keep track of all the processes that a particular service generates - although it still doesn’t provide a clean way to terminate them; I think you really do need cgroups or jails for that.
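
                                                           To illustrate the Linux side, a minimal sketch of mine (Python via ctypes; PR_SET_CHILD_SUBREAPER is 36 in linux/prctl.h):

                                                           import ctypes
                                                           import os

                                                           PR_SET_CHILD_SUBREAPER = 36  # from linux/prctl.h

                                                           libc = ctypes.CDLL("libc.so.6", use_errno=True)
                                                           if libc.prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0) != 0:
                                                               err = ctypes.get_errno()
                                                               raise OSError(err, os.strerror(err))

                                                           pid = os.fork()
                                                           if pid == 0:
                                                               if os.fork() == 0:  # grandchild: orphaned once its parent exits below
                                                                   os._exit(0)
                                                               os._exit(0)         # child exits without waiting (a "double fork")

                                                           os.waitpid(pid, 0)      # reap the direct child
                                                           print(os.wait())        # the orphaned grandchild reparents to us, not init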

                                                      3. 2

                                                        [Rust’s] designers made the unfortunate choice of having memory allocation failure cause termination – which is perhaps ok for some applications, but not in general for system programs, and certainly not for init

                                                         Or just run under Linux and have random processes killed by the OOM killer at random times, because that’s so much better than letting a program know the allocation didn’t really succeed twenty minutes ago, when it could have done something about it.

                                                        1. 2

                                                          Agreed, the OOM killer is totally bonkers, but its existence doesn’t justify stopping a program due to a failed allocation.

                                                          1. 3

                                                            its existence doesn’t justify stopping a program due to a failed allocation.

                                                            Yes, especially since overcommit can be turned off, which should largely (if not always - I’m not sure) prevent the OOM killer from acting.

                                                            1. 1

                                                              IIRC overcommit is even off by default in Debian.

                                                            2. 1

                                                              Right. I was saying just let malloc return NULL and let the program deal with it instead of basically lying about whether the allocation succeeded or not. I disable memory overcommit on most of my systems.

                                                              1. 1

                                                                For C I totally agree.

                                                                The Rust equivalent would be:

                                                                let b = Box::new(...);
                                                                

                                                                But Box::new doesn’t return a Result. If allocation fails, the program is terminated.

                                                                And so far we have only really talked about the heap. As far as I can tell you never know if stack allocation succeeded until you get a crash! Even in C. But I suppose once the stack is hosed, so is your program, which may not be true for the heap.

                                                        1. 1

                                                          Looking forward to having filter support. No more shelling out to rspamd (I hope)!

                                                          1. 4

                                                            I’m not clicking that :P

                                                            1. 3
                                                              1. 1

                                                                No viruses… so far.

                                                              1. 1

                                                                Interesting. I’m using Matrix at the moment, but I’ll be keeping an eye on this.

                                                                1. 3

                                                                  The offhand ‘even perl’ in there struck me as unfair. It reminds me that perl is actually pretty fast (specifically at startup, but my recollection was also that it runs quickly):

                                                                  $ time for i in `seq 1 1000`; do perl < /dev/null; done
                                                                  
                                                                  real    0m2.786s
                                                                  user    0m1.337s
                                                                  sys     0m0.686s
                                                                  
                                                                  $ time for i in `seq 1 1000`; do python < /dev/null; done
                                                                  
                                                                  real    0m19.245s
                                                                  user    0m9.329s
                                                                  sys     0m4.860s
                                                                  
                                                                  $ time for i in `seq 1 1000`; do python3 < /dev/null; done
                                                                  
                                                                  real    0m48.840s
                                                                  user    0m30.672s
                                                                  sys     0m7.130s
                                                                  
                                                                  
                                                                  1. 1

                                                                    I can’t comment on how fast Perl is, but you are measuring the time taken to tear down here too.

                                                                    The correct way would be to take the raw monotonic time immediately before invoking the VM, then inside the guest language immediately print it again and take the difference.
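
                                                                     For example, a sketch (it assumes both processes share a comparable monotonic clock, which CLOCK_MONOTONIC on a single machine does):

                                                                     import subprocess
                                                                     import sys
                                                                     import time

                                                                     # Record monotonic time just before spawning the interpreter; the guest
                                                                     # prints its own monotonic time as its very first action. The difference
                                                                     # is pure startup, with teardown excluded.
                                                                     t0 = time.monotonic()
                                                                     out = subprocess.run(
                                                                         [sys.executable, "-c", "import time; print(time.monotonic())"],
                                                                         capture_output=True, text=True, check=True,
                                                                     ).stdout
                                                                     print(f"startup: {(float(out) - t0) * 1000:.1f} ms")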

                                                                     P.S. Wow, Python3 is slower.

                                                                    1. 2

                                                                      but you are measuring the time taken to tear down here too.

                                                                      I guess so? I’m not sure that’s a useful distinction.

                                                                      The people wanting “faster startup” are also wanting “fast teardown”, because otherwise you’re running in some kind of daemon-mode and both times are moot.

                                                                      1. 1

                                                                        The people wanting “faster startup” are also wanting “fast teardown”

                                                                        Yeah, I guess I agree that they should both be fast, but if we were measuring for real, I’d measure them separately.

                                                                        1. 1

                                                                          I’m not sure that’s a useful distinction.

                                                                          If latency matters then it could be. If you’re spawning a process to handle network requests for example then the startup time affects latency but the teardown time doesn’t, unless the load gets too high.

                                                                      2. 1

                                                                       Hah, before I read the comments I did the same thing! My results on a 2015 MBP, with only startup and teardown on an empty script; I included node and ruby also:

                                                                        ~/temp:$ time python2 empty.txt 
                                                                        real    0m0.028s
                                                                        user    0m0.016s
                                                                        sys     0m0.008s
                                                                        
                                                                        ~/temp:$ time python3 empty.txt 
                                                                        real    0m0.042s
                                                                        user    0m0.030s
                                                                        sys     0m0.009s
                                                                        
                                                                        ~/temp:$ time node empty.txt 
                                                                        real    0m0.079s
                                                                        user    0m0.059s
                                                                        sys     0m0.018s
                                                                        
                                                                        ~/temp:$ time perl empty.txt 
                                                                        real    0m0.011s
                                                                        user    0m0.004s
                                                                        sys     0m0.002s
                                                                        
                                                                        ~/temp:$ time ruby empty.txt 
                                                                        real    0m0.096s
                                                                        user    0m0.027s
                                                                        sys     0m0.044s
                                                                        
                                                                        1. 2

                                                                          Ruby can do a bit better if you don’t need gems (and note that the python below is Python 3):

                                                                          $ time for i in $(seq 1 1000); do ruby </dev/null; done
                                                                          
                                                                          real	0m31.612s
                                                                          user	0m27.910s
                                                                          sys	0m3.622s
                                                                          
                                                                          $ time for i in $(seq 1 1000); do ruby --disable-gems </dev/null; done
                                                                          
                                                                          real	0m4.117s
                                                                          user	0m2.848s
                                                                          sys	0m1.271s
                                                                          
                                                                          $ time for i in $(seq 1 1000); do perl </dev/null; done
                                                                          
                                                                          real	0m1.225s
                                                                          user	0m0.920s
                                                                          sys	0m0.294s
                                                                          
                                                                          $ time for i in $(seq 1 1000); do python </dev/null; done
                                                                          
                                                                          real	0m13.216s
                                                                          user	0m10.916s
                                                                          sys	0m2.275s
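
                                                                          (Amortised over the 1000 runs, that works out to roughly 31.6 ms per plain ruby invocation versus 4.1 ms with --disable-gems, with perl at about 1.2 ms and python at 13.2 ms; the shell loop itself adds a little constant overhead to each figure.)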
                                                                          
                                                                          1. 1

                                                                            So as long as python3 is faster than ruby/node, we’re OK…?

                                                                        1. 25

                                                                          This seems a good time to promote a paper our team published last year (sorry to blow my own trumpet :P ): http://soft-dev.org/pubs/html/barrett_bolz-tereick_killick_mount_tratt__virtual_machine_warmup_blows_hot_and_cold_v6/

                                                                          We measured not only the warmup, but also the startup of lots of contemporary JIT compilers.

                                                                          On a quad-core i7-4790 @ 3.6GHz with 32GB of RAM, running Debian 8:

                                                                          • C was the fastest to start up at 0.00075 secs (+/- 0.000029) – surprise!
                                                                          • LuaJIT was the next fastest to start up at 0.00389 secs (+/- 0.000442).
                                                                          • V8 was in 3rd at 0.08727 secs (+/- 0.000239).
                                                                          • The second slowest to start up was HHVM at 0.75270 secs (+/- 0.002056).
                                                                          • The slowest overall to start up was JRubyTruffle (now called TruffleRuby) at 2.66179 secs (+/- 0.011864). This is a Ruby implementation built on GraalVM (plain Java on GraalVM did much better in terms of startup).

                                                                          Table 3 in the linked paper has a full breakdown.

                                                                          The main outcome of the paper was that few of the VMs we benchmarked reliably achieved a steady state of peak performance after 2000 benchmark iterations, and some slowed down over time.
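
                                                                          If you want a feel for the warmup side at home, the crude version is just to time each in-process iteration and look at the shape of the series (this toy sketch is nothing like the controlled Krun setup we actually used, but it shows the idea):

                                                                            # toy warmup probe: per-iteration, in-process timings
                                                                            import time

                                                                            def bench():                        # stand-in for a real benchmark body
                                                                                return sum(i * i for i in range(100_000))

                                                                            timings = []
                                                                            for _ in range(2000):               # the iteration count from the paper
                                                                                t0 = time.perf_counter()
                                                                                bench()
                                                                                timings.append(time.perf_counter() - t0)

                                                                            # Does the series flatten out, oscillate, or get slower over time?
                                                                            first, last = timings[:30], timings[-30:]
                                                                            print(f"mean of first 30 iterations: {sum(first)/len(first):.6f}s")
                                                                            print(f"mean of last 30 iterations:  {sum(last)/len(last):.6f}s")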

                                                                          1. 1

                                                                            I saw a talk about this. Very cool stuff! It is a good antidote to the thrall of benchmarks.

                                                                            1. 1

                                                                              Cool work! You should make that a submission on its own in the morning, in case someone misses it due to a filter. For instance, people who don’t care about Python specifically, which is what the main post is tagged with, would miss it. Just programming, performance, and compiler tags should do. Good news is a lot of people still saw and enjoyed it, per the votes. You definitely deserve an “authored by” submission, though. :)

                                                                              1. 3

                                                                                It was on the lobsters front page about six months ago. https://lobste.rs/s/njsxtv/virtual_machine_warmup_blows_hot_cold

                                                                                It was a very good paper and I personally wouldn’t mind seeing it reposted, but I don’t actually know what the etiquette for that is here.

                                                                                1. 1

                                                                                  I forgot. My bad. I should probably do a search next time.

                                                                            1. 3

                                                                              Well, I’ve been using Rust for about a year now, and this article really only begins to scratch the surface. Error handling in Rust is actually quite involved and requires a fair amount of study when you are new to the language.

                                                                              Take a look at the section on error handling in the Rust book for an idea of what I’m talking about (the first edition of the book explains it better than the second edition, IMHO).

                                                                              The failure crate that the author touched on claims to improve the situation, but I’ve not yet tried it.

                                                                              1. 6

                                                                                I’ve been using Riot for about 2 years now. It shows promise, but has some teething issues:

                                                                                 • The initial server implementation (Synapse, written in Python) is a resource hog. There’s an official effort to reimplement it in Go (Dendrite), but it seems like the team’s time is mostly spent keeping the existing infrastructure running.
                                                                                • E2E key validation is pretty bad. Every device has to verify every other device. As a result, no-one checks the authenticity of devices because it takes too long.
                                                                                1. 1

                                                                                     E2E key validation is pretty bad. Every device has to verify every other device. As a result, no-one checks the authenticity of devices because it takes too long.

                                                                                     You can just ignore this and press send anyway, which makes it as secure as every other E2E service, because manually checking everyone’s keys is way too much work.

                                                                                  1. 2

                                                                                       That’s the equivalent of adding a local exception when hitting an HTTPS website whose key is bogus.

                                                                                       So yes, you’d get encryption, but not authentication. The recipient may not be who they say they are.
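
                                                                                       To make the analogy concrete: authenticating a device boils down to comparing key fingerprints over a channel you already trust. A hypothetical sketch (not Matrix’s actual scheme):

                                                                                         # hypothetical sketch: out-of-band fingerprint comparison
                                                                                         import hashlib

                                                                                         def fingerprint(device_public_key: bytes) -> str:
                                                                                             # a short digest a human can read out over a trusted channel
                                                                                             digest = hashlib.sha256(device_public_key).hexdigest()
                                                                                             return ":".join(digest[i:i + 4] for i in range(0, 16, 4))

                                                                                         # Both parties compute this locally and compare, e.g. over a phone
                                                                                         # call. A match authenticates the key; pressing "send anyway" gives
                                                                                         # you encryption to *some* key, possibly an impostor's.
                                                                                         print(fingerprint(b"\x01" * 32))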

                                                                                    1. 2

                                                                                         I don’t think any of the other IM services have solved multi-device E2E either, but I seem to remember some work where, when you sign in on a new device, you get a popup on your existing device asking if it’s yours and sharing the key.

                                                                                      1. 2

                                                                                        That (or something similar) is what they’ve said they are aiming for I think. A change I welcome!

                                                                                        1. 1

                                                                                             Keybase has an IM solution (the client is pretty bad) that supports multi-device E2E.

                                                                                    2. 1

                                                                                      E2E key validation is pretty bad. Every device has to verify every other device. As a result, no-one checks the authenticity of devices because it takes too long.

                                                                                      This sucks a lot, yeah. Especially with people using throwaway browser sessions.

                                                                                    1. 1

                                                                                       This reminds me that we are still light-years away from the simplicity of hypermedia when embedding software. I understand their objective is to “Run Programs Faster Anywhere”, but once you show the polyglot approach I will ask whether it’s possible to do it inline and mix languages. Very sad to see that it’s not possible (yet).

                                                                                       Also, the examples seem to be missing a “start from python” one. Is that possible? Silly me, there is one.
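
                                                                                       For reference, the kind of thing I was hoping for looks roughly like this under graalpython (I’m going from memory of the GraalVM docs, so treat the exact API as an assumption):

                                                                                         # run under GraalVM's graalpython; API from memory, may differ
                                                                                         import polyglot

                                                                                         # evaluate a snippet of JavaScript from inside a Python program
                                                                                         result = polyglot.eval(string="6 * 7", language="js")
                                                                                         print(result)  # 42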

                                                                                      1. 3

                                                                                         VMs written on Graal/Truffle effectively do this inline, via runtime AST specialisation. The various languages share a common AST representation whereby nodes can be specialised and compiled down to native code.

                                                                                        Cross-language tracing has also been done with RPython: http://soft-dev.org/pubs/html/barrett_bolz_tratt__approaches_to_interpreter_composition/

                                                                                        (disclaimer, I’m one of the authors of that linked paper)
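
                                                                                         To give a flavour of what “nodes specialise themselves” means, here’s a toy sketch in Python (Truffle’s real API is Java; this only illustrates the rewrite-on-execution idea):

                                                                                           # toy self-specialising AST node, illustrative only
                                                                                           class AddNode:
                                                                                               def execute(self, left, right):
                                                                                                   # generic slow path: observe types, then rewrite ourselves
                                                                                                   if isinstance(left, int) and isinstance(right, int):
                                                                                                       self.execute = self._execute_int      # specialise in place
                                                                                                   else:
                                                                                                       self.execute = self._execute_generic
                                                                                                   return self.execute(left, right)

                                                                                               def _execute_int(self, left, right):
                                                                                                   if isinstance(left, int) and isinstance(right, int):
                                                                                                       return left + right                   # fast path
                                                                                                   self.execute = self._execute_generic     # assumption broken
                                                                                                   return self.execute(left, right)

                                                                                               def _execute_generic(self, left, right):
                                                                                                   return left + right                       # catch-all

                                                                                           node = AddNode()
                                                                                           print(node.execute(2, 3))    # first call specialises the node
                                                                                           print(node.execute(4, 5))    # later calls hit the fast path directly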

                                                                                        1. 1

                                                                                           Even the Parrot VM had the same very simple and abstract instructions, and you could run every scripting language on top of it. What I was trying to point out is that mixing different languages in the same source code is still not a thing. I was thinking of the % magic commands available when using ipython, like %R to call some R code.

                                                                                          1. 1

                                                                                            It’s not a widespread thing, no. Probably because it comes with some tricky practical challenges (type conversions, performance, grammar composition, …). Nonetheless it can be done efficiently. The paper I linked allows Python and Prolog code from the same file to be JITted. Later we published a paper showing that we can JIT PHP and Python code from the same file.

                                                                                             The real question though is: is that level of granularity useful? Our PHP/Python system allowed cross-language closures between arbitrarily deeply nested scopes, but we didn’t find any particularly compelling examples of why that might be useful :P

                                                                                      1. [Comment removed by author]

                                                                                        1. 2

                                                                                          Likely some sort of transpiler from one bytecode to another, I would imagine.

                                                                                          1. 1

                                                                                            Thank you for the response.

                                                                                            For the rest of the thread, my deleted comment was “This seems like witchcraft. How does it work?” I deleted it because I thought I was being lazy. (Now I am adding it back in because it’s a pet peeve to see answers to deleted questions.)

                                                                                            Here is an article (I have yet to read) on how the Graal VM works:

                                                                                            http://chrisseaton.com/truffleruby/jokerconf17/

                                                                                            1. 2

                                                                                              There are no bytecodes involved. All of the languages share a common AST representation where nodes can be compiled and specialised if they are frequently executed.

                                                                                        1. 10

                                                                                          Cool that you went and did this! I built @technomancy’s atreus a while back, but don’t actually use it. I should, though…

                                                                                          1. 3

                                                                                            Thanks - that’s a very cool looking keyboard!

                                                                                            1. 1

                                                                                              Why don’t you use it?

                                                                                              1. 4

                                                                                                 The reason I don’t use it is simply that I don’t want to become dependent on it. @technomancy travels everywhere with his, and sets it up on top of his laptop keyboard. I could try that, I suppose, but it seems like a habit that’d be very hard to get into. Above all, I don’t have pain from regular laptop keyboards, so the increased ergonomics haven’t pushed me into it by necessity.

                                                                                                But, now that I’m saying this, I really should give it more of a chance, and try it again… There’s no reason not to, for sure.

                                                                                                1. 3

                                                                                                   I don’t think learning a new keyboard will prevent you from using your laptop keyboard.

                                                                                                   I switch freely between a Maltron 3D and a ThinkPad keyboard. The biggest challenge is learning the new keyboard in the first place (about 2 months for the Maltron).

                                                                                                  1. 1

                                                                                                     You’re right, it doesn’t stop me from using a different keyboard. I spend enough time away from my desk, though, that I feel I’d have to bring it with me to ever get comfortable with it.