1. 3

    There’s a spectrum to how close you want your VM to be to your language. Closer means better performance (e.g. luajit using bytecode tailored to lua 5.1’s semantics), farther means easier support for a variety of languages.

    But farther also means that you’ll eventually end up adding new intermediate representations that better suit a specific language (e.g. Rustc having MIR, Swiftc having SIL…) and then are you really better off than where you started? In the case of LLVM’s IR/GCC’s tree representation you are because you get a ton of optimizations and targets for free, but if the IR is very low level and only has a single backend I’m not so sure.

    I’ll keep an eye on this project, I’m not sure anything usable will come out of it but I’m certain many interesting insights will.

    1. 3

      There’s a spectrum to how close you want your VM to be to your language. Closer means better performance (e.g. luajit using bytecode tailored to lua 5.1’s semantics), farther means easier support for a variety of languages.

      While I largely agree with you, I think that organizational and personal preferences have a lot more to do with VM/language infrastructure choices. Graal.js is competitive with Node, and Oracle, Mozilla, Apple, and Google would spend far fewer engineering resources optimizing a shared runtime than they currently spend on Graal, SpiderMonkey, FTL, and V8.

      But they don’t do that for the same reasons that Google developed Dalvik or V8 or Blink: because they wanted to do their own thing. Google didn’t want to license HotSpot from Sun for Android and they wanted to try different techniques than what was in WebKit. Similarly, I suspect Python, Ruby, and Lua developers enjoy programming in C and don’t want to spend their time in Truffle or RPython, even though it would be very beneficial to their community.

      In the case of LLVM’s IR/GCC’s tree representation you are because you get a ton of optimizations and targets for free, but if the IR is very low level and only has a single backend I’m not so sure.

      I’ll keep an eye on this project, I’m not sure anything usable will come out of it but I’m certain many interesting insights will.

      Any thoughts on MLIR?

      1. 1

        I think that organizational and personal preferences have a lot more to do with VM/language infrastructure choices.

        I think I agree.

        Graal.js is competitive with Node

        While that’s true for hot-loop performance, it isn’t for startup time. Graal would be a very poor choice for CGI-style execution of js scripts.

        Similarly, I suspect Python, Ruby, and Lua developers enjoy programming in C and don’t want to spend their time in Truffle

        Truffle is a very poor choice if startup times matter (can you imagine recompiling your scripts every time you want to run them?). I would wager the same thing about code size, although I don’t have any data to prove that. Truffle being worse at some things is normal - you have to sacrifice some aspects to win on others, but it also means that you can’t have a generic technology that does everything better than a non-generic technology. The non-generic technology can be tailored to a use case.

        Any thoughts on MLIR?

        I know nothing about MLIR - from what I’ve read it’s a machine-learning IR that targets LLVM’s IR. This fits the pattern my original comment described: LLVM was too generic for ML applications and now it makes economic sense to build something else on top that better describes the actual programs and allows better optimizations. To me, MLIR makes sense because it’s higher-level, contrary to SOIL which seems lower-level.

        1. 2

          Graal.js is competitive with Node

          While that’s true for hot-loop performance, it isn’t for startup time. Graal would be a very poor choice for CGI-style execution of js scripts.

          OpenJ9 has experimented with the full range of trade-offs including AOT, AOT+JIT, caching JIT output, and server based JIT (transcript, slides). I can’t find the paper right now, but I know that there was an academic publication exploring a shared Android JIT for phones, which showed a lot of energy savings.

          Similarly, I suspect Python, Ruby, and Lua developers enjoy programming in C and don’t want to spend their time in Truffle

          Truffle is a very poor choice if startup times matter (can you imagine recompiling your scripts every time you want to run them?). I would wager the same thing about code size, although I don’t have any data to prove that. Truffle being worse at some things is normal - you have to sacrifice some aspects to win on others, but it also means that you can’t have a generic technology that does everything better than a non-generic technology. The non-generic technology can be tailored to a use case.

          Interpreted languages in general don’t have great startup times 😉. I know there are some asterisks associated with Graal/Truffle AOT, but Graal/Truffle can perform AOT compilation.

          Any thoughts on MLIR?

          I know nothing about MLIR - from what I’ve read it’s a machine-learning IR that targets LLVM’s IR. This fits the pattern my original comment described: LLVM was too generic for ML applications and now it makes economic sense to build something else on top that better describes the actual programs and allows better optimizations. To me, MLIR makes sense because it’s higher-level, contrary to SOIL which seems lower-level.

          It started in Machine Learning land, but it (now?) stands for Multi-Level Intermediate Representation and is designed to make it easier to apply common optimizations in IRs above the LLVM IR (explainer).

          There will always be room for language specific optimizations, hand-tooled machine code, etc. But on the technical merits, I think the “roll everything yourself” crowd is more wrong than it is right.

      2. 1

        In the case of LLVM’s IR/GCC’s tree representation […] you get a ton of optimizations and targets for free

        Does GCC really have an IR you can inspect (like Clang’s -emit-llvm)?

        1. 3

          Yes, GCC has a higher-level AST named Tree/Generic and a lower-level, SSA-based representation named Gimple. You can see the various stages the program goes through by passing the -fdump-tree-all option to GCC. There’s also a register-level representation named RTL but I don’t remember the flags you need to use in order to dump that.

          1. 3

            I’m not a GCC expert, but I think GCC has two intermediate representations, one called GIMPLE and the other called RTL. I think GIMPLE is a higher-level IR, while RTL is a very low level representation using s-expressions. Not 100% on this.

            You can get a GIMPLE dump with gcc -fdump-tree-gimple

        1. 4

          Preparing to get my life back together, and move back to NYC.

          1. 1

            I’ve been working on two write-ups on Idris. One on proving miscellaneous things in Zermelo-Fraenkel set theory, and the other on implementing type systems in Idris. The type systems start with the untyped lambda calculus, and work up to System F.

            If anyone would find these interesting, please let me know!

            Aside from that, my friend and I have been reflecting on our lives by sharing short stories with each other. We’re doing this as part of a program called the “Crystal-Barkley method”, which aims to help one focus their career path through reflection and storytelling. So far, it’s been rewarding :)

              1. 1

                What takeaways do you have from this paper that proquints could use to improve?

                I only skimmed the paper, but it seems largely focused on naming variables, rather than providing mnemonics for numbers.

                1. 1

                  There are a variety of books on improving recall of information.

                  The basic technique is to associate the information to be learned with information that is already stored in long term memory, e.g., the person who learned to store long sequences of numbers by using his knowledge of record breaking running times.

                  1. 5

                    This isn’t about teaching people how to memorize long strings of digits.
                    It’s about how to replace the strings of digits with something more intrinsically mnemonic.
                    It’s also not about designing memorable names from scratch, which appears to be the topic of the paper you linked.

              1. 18

                This is so great. There’s a nice little easter egg in the HTTP headers

                1. 7

                  One of the best companies to work for. Happy to see the wayfarer reactive framework used this way! And libblonde is also pretty cool.

                1. 2

                  This is done as part of an overall lesson in the value of inventing a new domain-specific testing language for your tests. I was left so confused by this assertion. I would use exactly the same code to demonstrate exactly the opposite lesson! Don’t do this!

                  I disagree, slightly. I think if you have a large number of tests that cover a very specific domain, a DSL might make things easier to read and write. I was reminded of a talk given by Brian Kernighan, where at one point he reviews a tiny DSL that was used to unit-test regexes in AWK.

                  I think the “turnOnLoTempAlarmAtThreshold” example in Clean Code is a little too contrived, and it doesn’t really convey the idea.

                  1. 1

                    Sure, just make sure to test your test(helper)s…

                    Whenever I have seen “clever” and “convenient” test helpers, I have also encountered bugs in them that masked real bugs, because the helpers had zero tests but quite complicated logic.

                    I prefer DAMP tests, and tests that look as clean as the examples in the article: preferably no control structures, or at most one level of them, with repetition and comments wherever they are needed. Such tests are usually easy to understand, while custom DSLs require a lot of knowledge of their internals, because they are rarely designed thoroughly enough to be as contradiction-free as writing (and especially reading) tests requires.

                  1. 31

                    There most certainly is discrimination, but the VC-types will put you through all kinds of mental gymnastics to convince you there isn’t. I’ve seen it first hand, several times over, and denying it is just part of the gaslighting that goes on to try and pretend it doesn’t exist.

                    I also don’t buy the “lying on the beach gasping because they can’t get enough talented people” narrative that Marc is pushing. In most cases companies are looking for people who are a) cheap and b) willing to put up with a lot of BS. They don’t actually want talented free thinkers. They want pod people who will hammer out the code in accordance with the party line. They don’t want innovators, they want button pushers and cogs in the wheel so they can build an assembly line.

                    1. 6

                      […] people who are a) cheap and b) willing to put up with a lot of BS. They don’t actually want talented free thinkers.

                      In my hiring experience, we were always looking for people who were a) cheap and b) talented free thinkers. “willing to put up with a lot of BS” is usually a part of any job description ;).

                    1. 4

                      Where can I find more information about why Plan 9 is amazing, especially how it compares/contrasts to Linux or Unixes?

                      1. 9

                        I found this paper to be a wonderful walkthrough. I highly recommend getting a copy of 9front running, and going through some of the exercises in the paper.

                        It’s very long, but definitely a great way to get a feel for how some of the concepts in Plan 9 are applied.

                        Edit: Since I was reminded how much I like this paper, I decided to submit it as a story.

                        1. 5

                          Some goodies from Plan 9 were ported to *nixes, for example procfs, but unfortunately not all of them (such as Plan 9-like process namespaces).

                          1. 7

                            Unfortunately, the best piece of plan 9 is impossible to port: A unified, interposable way of doing everything, so you don’t have to think about huge numbers of special cases and strange interactions between features. 9p is, more or less, the only way to talk to the OS, and the various interfaces that are exposed over it can be transparently swapped out with namespaces, allowing you to replace, mock out, or redirect across the network any part of the system that you want.

                            1. 4

                              Well, Linux namespaces got 90% of the way there, although they certainly didn’t get their ergonomics.

                            2. 3

                              I just watched the video https://www.youtube.com/watch?v=3d1SHOCCDn0

                              Found the way the presenter explained the core concept very understandable. It is 40 minutes long.

                            1. 6

                              This is neat! I was really hoping to see some of the underlying clojure data structures implemented in rust – but I don’t think that’s one of the author’s goals as of now.

                              For context, Clojure uses some clever persistent data structures for hash maps and vectors. (Here’s a snippet of Rich Hickey giving an overview of some data structures). The implementation here currently implements persistent vectors as Vec<Rc<T>>, and maps are association lists instead.

                              There is a persistent hash trie crate in rust, and there may be other persistent structures that could be used. I’d love to see how this project develops, and if it takes the route of using these data structures.
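
                              To illustrate the difference, here is a minimal sketch of the Vec<Rc<T>> approach. `PVec` and its methods are hypothetical names chosen for illustration, not the project’s actual code. Each `push` copies every pointer (O(n)) but shares the elements themselves, whereas a trie-based persistent vector such as Clojure’s copies only one path through the tree.

                              ```rust
                              use std::rc::Rc;

                              // Hypothetical immutable vector: elements are shared through Rc.
                              pub struct PVec<T> {
                                  items: Vec<Rc<T>>,
                              }

                              impl<T> PVec<T> {
                                  pub fn new() -> Self {
                                      PVec { items: Vec::new() }
                                  }

                                  // Returns a new vector with `value` appended; `self` is left untouched.
                                  // Cloning the inner Vec copies the Rc pointers, not the elements.
                                  pub fn push(&self, value: T) -> Self {
                                      let mut items = self.items.clone();
                                      items.push(Rc::new(value));
                                      PVec { items }
                                  }

                                  pub fn get(&self, i: usize) -> Option<&T> {
                                      self.items.get(i).map(|rc| rc.as_ref())
                                  }

                                  pub fn len(&self) -> usize {
                                      self.items.len()
                                  }

                                  // True if both vectors point at the same allocation for index `i`.
                                  pub fn shares(&self, other: &Self, i: usize) -> bool {
                                      match (self.items.get(i), other.items.get(i)) {
                                          (Some(a), Some(b)) => Rc::ptr_eq(a, b),
                                          _ => false,
                                      }
                                  }
                              }
                              ```

                              Pushing onto an existing `PVec` leaves the old version usable, and `shares` can be used to confirm that the common prefix is not duplicated.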

                              1. 10

                                im (https://docs.rs/im/14.3.0/im/) is a collection of all of those types, in a threadsafe and a non-threadsafe version.

                                1. 3

                                  Hey there, author here. I indeed plan to eventually implement the appropriate efficient data structures, as they are implemented in Clojure – or use a library like im. In fact, everything you see will likely change – I’m taking a very iterative approach to this altogether. That particular alist-implemented map is a tide-me-over map for small datasets, implemented at the last minute, and for now is analogous to Clojure’s PersistentArrayMap, which it will likely be replaced with.

                                  1. 2

                                    Awesome! I’m really excited to hear you’re working on this :)

                                1. 7

                                  I think there’s some open source stuff for certain chips, I haven’t experimented with them at all though

                                  I highly recommend getting a Lattice Icestick to try out! You can buy them for fairly cheap (around $40 - $60 USD). There is an open source toolchain called Yosys, which can be used with icestorm to target the Lattice chips.

                                  I personally find it much nicer to be able to develop in a Unix-like environment (text files, Makefile, etc.), over the massive Altera and Xilinx IDEs.

                                  1. 3

                                    Second this. I bought a Lattice Icestick to play with FPGAs, and I had relatively little trouble setting up the open source toolchain and flashing circuit designs onto the device. I haven’t so far done much beyond the “hello world” of making an LED blink, but getting that far was relatively painless.

                                    1. 2

                                      This is a bit late, but Xilinx ISE 14.7 can be invoked without the IDE. If you instantiate a new DDR controller (from the IDE) it generates an example project in a folder in your workspace, which has some .bat files to build it.

                                      1. 2

                                        Word of warning: the ice sticks tend to be out of stock a lot. If you’re going to end up spending $40-50, check out some of the other ice40 boards. Go for at least an HX8k, since 1k LUTs is not enough to do much. The ice40UP line also has a FOSS toolchain, but the boards are more expensive (around $80).

                                      1. 3

                                        Java is ideal for large enterprise applications

                                        AbstractSingletonProxyFactoryBean

                                        1. 11

                                          Is that actually used outside of Spring internals? I’m not a Java developer but to my understanding Spring is a very old and very complex framework, so you’d expect to see this kind of thing. If that’s so, it’s no worse than a language having Zygohistomorphic prepromorphisms.

                                          1. 4

                                            Is that actually used outside of Spring internals?

                                            To be honest, I have no idea. I’m also not a Java developer. Though, I have worked with some enterprise C++ libraries that have a similar pattern: classes with long names that try to describe some abstract design pattern.

                                            I’m not necessarily saying that an AbstractSingletonProxyFactoryBean is a bad design. If you’re familiar with the design patterns and framework, I’m sure the name “AbstractSingletonProxyFactoryBean” alone provides a lot of context clues to a developer, in the same way that zygohistomorphic prepromorphisms might.

                                            The name itself is pretty funny though – it sounds like a bunch of meaningless buzzwords tacked together. I also got a good laugh out of zygohistomorphic prepromorphisms :).

                                            1. 3

                                              Is that actually used outside of Spring internals?

                                              No, this class isn’t used outside of Spring’s internals. But, yes, this pattern is sadly common in Java codebases.

                                              IMHO AOP is just as bad in Guice.

                                          1. 21

                                            Perhaps one of the biggest things worth mentioning here (other than the standard: Vim, ag, LSP, etc.) is gofmt.

                                            gofmt takes any valid Go source code and formats it. One of the benefits of this is that it eliminates low-value discussions about where to place braces, spaces, and all of that, but another huge benefit is that it’s quite a productivity win.

                                            Before, I would manually muck about to get the indentation, spacing, and all these things correct. Now? I just write if foo == "" { return } on a single line and let gofmt take care of it. This is especially a massive benefit when you copy/paste some code and the indentation is off, or you eliminate/add an if, and so forth. gofmt will just take care of all of that.

                                            goimports does what gofmt does and also takes care of adding and removing imports: type fmt.Println("DEBUG", v), save the file, and it adds import "fmt" automatically (and removes it again when the line is removed).

                                            There are similar tools for other languages; I encourage you to check them out if you haven’t. I never thought I would like it so much until I started using it (“real programmers just get the indentation correct on their own and don’t need a tool!”)

                                            1. 2

                                              Perhaps one of the biggest things worth mentioning here (other than the standard: Vim, ag, LSP, etc.) is gofmt.

                                              Clang-format does the same thing for [Objective-]C[++] codebases. One of the biggest productivity wins is not just using it, but integrating it with the build system and with CI. I’ve recently started doing this with a few projects so that there is a clangformat target that applies the formatting to all source files. CI builds that target on each PR and if it produces a diff then this CI step fails with a message asking the submitter to build with the clangformat target and then update the PR.

                                              This means that no human needs to either look for minor style nits, or fix them, during the code review process. You can’t quite represent my preferred style with clang-format, but it’s close enough. Now code review can focus on the important style issues, such as consistent naming in APIs, rather than the trivial ones.

                                              1. 5

                                                We do an equivalent at work with prettier formatting TypeScript. One thing I’ve found is that it feels much nicer to have auto formatting on save in my text editor as well as in CI.

                                                1. 2

                                                  The unfortunate thing about clang-format is that you can configure the formatter any way you like. At a large company, you can end up with a huge number of repositories and teams, each with different .clang-format configurations. Unless there’s outside pressure to decide which configuration to go with, you might end up in endless debates over which format is The Best.

                                                  Go has the beauty of having One True Format™

                                                  1. 2

                                                    That doesn’t really bother me, as long as it’s integrated into the build system and CI pipeline: I write without paying attention to it and then reformat the code before I commit. It’s a slightly higher cognitive load than having a single format, but the Go style makes several decisions that are the opposite of what you’d get if a cognitive psychologist had been involved in any part of the process, so I prefer having the option to make the right choice sometimes over having to always make the wrong choice.

                                                    That said, in my ideal world, there would be no global layout style: the revision control system would store an AST and it would be up to the reader how to typeset it. Having a per-project clang-format style is pretty close to this: in theory, I can apply my own preferred style on checkout and then the project’s style before I commit a patch. I’d love to see better tooling around this.

                                                    Some years ago, I had a student work on a project called Code Editing in Local Style (CELS), which he presented at EuroLLVM in Paris. It did a lot more trivial reformattings than clang-format (for example, it could have variables declared at the start of a function or at their minimal scope, could handle tabs-for-indent-spaces-for-alignment style, could switch variables of different kinds between conventions like underscore_separated or camelCase, and implemented the TeX line breaking algorithm with weights for badness). I’d love to see that as part of a normal editing workflow.

                                                2. 2

                                                  I absolutely agree. I used those tools so often that I just threw them into my vimrc:

                                                  autocmd VimLeave *.go :!go fmt %
                                                  autocmd VimLeave *.go :!goimports -w %
                                                  
                                                1. 8

                                                  There’s also aseprite: https://github.com/aseprite/aseprite I’m very fond of its pixel UI.

                                                  1. 11

                                                    Aseprite is a nice tool; however, they switched from GPL to a custom EULA some time back. I respect their decision, but I’d rather use GIMP.

                                                    1. 7

                                                      Oh what a shame, I thought it was still GPL’d. Thank you!

                                                    2. 3

                                                      Thank you for the link! I had some fun making a little plant for my lobste.rs avatar :)

                                                      1. 2

                                                        I love it! Very nice

                                                    1. 7

                                                      I’ve been trying to learn more about dependent types and the theory behind them, and I recently discovered this lecture series on HoTT. Though the lectures go a little over my head, I’ve found them great at providing some additional background that I was missing. I thought I’d share!

                                                      The Homotopy Type Theory book is also available, and is open source.

                                                      1. 19

                                                        I’m going to buy Animal Crossing.

                                                        1. 3

                                                          Hope you enjoy it! I’m glad I picked it up.

                                                        1. 2

                                                          For personal servers, I name them after streets that intersect Knickerbocker Ave between Flushing Ave and Cooper St, in New York City.

                                                          I.e. George, Melrose, Jefferson, Troutman, etc.

                                                          For work, we used to name the servers by number. Every time you add a server, you count up by one. Sounds ridiculous, but it’s fairly memorable. Everyone had a little 8.5x11 printout with a diagram showing where each server fit in the architecture, so you could visually see that server s157 handled the client-facing API, for example.

                                                          1. 10

                                                            State machines are kinda a huge pain in any language, as far as I can tell. Either you try to make them checked at compile time, like the first example does, and run into issues, or you do lots of checking at runtime and end up asking yourself things like “how did I get into this state?” and “why is it trying to transition into the wrong one?”, or you make a table of transitions and states and altering/updating it becomes error prone. Annoying for such a fundamental pattern.
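
                                                            As a rough sketch of the “table of transitions” option, assuming a hypothetical three-state job (`State`, `Event`, and `step` are illustrative names, not from any library): every legal transition is one arm of a match, and everything else is rejected at runtime with an error that names the offending pair.

                                                            ```rust
                                                            #[derive(Debug, Clone, Copy, PartialEq, Eq)]
                                                            pub enum State {
                                                                Idle,
                                                                Running,
                                                                Done,
                                                            }

                                                            #[derive(Debug, Clone, Copy, PartialEq, Eq)]
                                                            pub enum Event {
                                                                Start,
                                                                Finish,
                                                                Reset,
                                                            }

                                                            // The transition table, written as a single match.
                                                            // Any (state, event) pair without its own arm falls through to the error case.
                                                            pub fn step(state: State, event: Event) -> Result<State, String> {
                                                                match (state, event) {
                                                                    (State::Idle, Event::Start) => Ok(State::Running),
                                                                    (State::Running, Event::Finish) => Ok(State::Done),
                                                                    (_, Event::Reset) => Ok(State::Idle),
                                                                    (s, e) => Err(format!("invalid transition: {:?} on {:?}", s, e)),
                                                                }
                                                            }
                                                            ```

                                                            The error message at least answers “why is it trying to transition into the wrong one?”, but the check still happens only when the code runs, and the table has to be kept in sync by hand.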

                                                            1. 6

                                                              In Rust you can have both strongly-typed transitions and runtime dynamism. This kind of simulates session types, which Idris is really nice for encoding with less boilerplate.
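
                                                              A minimal sketch of that combination, assuming a hypothetical two-state connection (`Conn`, `Closed`, `Open`, and `AnyConn` are made-up names, not from any crate): the typed API makes calling `send` on a closed connection a compile error, and the enum wrapper covers cases where the state is only known at runtime.

                                                              ```rust
                                                              use std::marker::PhantomData;

                                                              // Hypothetical marker types, one per state.
                                                              pub struct Closed;
                                                              pub struct Open;

                                                              // The state exists only as a type parameter, so it costs nothing at runtime.
                                                              pub struct Conn<S> {
                                                                  log: Vec<String>,
                                                                  _state: PhantomData<S>,
                                                              }

                                                              impl<S> Conn<S> {
                                                                  // Available in every state.
                                                                  pub fn sent(&self) -> usize {
                                                                      self.log.len()
                                                                  }
                                                              }

                                                              impl Conn<Closed> {
                                                                  pub fn new() -> Self {
                                                                      Conn { log: Vec::new(), _state: PhantomData }
                                                                  }

                                                                  // Consumes the closed connection, so the old value cannot be reused.
                                                                  pub fn open(self) -> Conn<Open> {
                                                                      Conn { log: self.log, _state: PhantomData }
                                                                  }
                                                              }

                                                              impl Conn<Open> {
                                                                  // Only defined for the Open state.
                                                                  pub fn send(&mut self, msg: &str) {
                                                                      self.log.push(msg.to_string());
                                                                  }

                                                                  pub fn close(self) -> Conn<Closed> {
                                                                      Conn { log: self.log, _state: PhantomData }
                                                                  }
                                                              }

                                                              // Runtime wrapper for when the current state depends on input.
                                                              pub enum AnyConn {
                                                                  Closed(Conn<Closed>),
                                                                  Open(Conn<Open>),
                                                              }

                                                              impl AnyConn {
                                                                  // Delegates to the typed transitions above.
                                                                  pub fn toggle(self) -> AnyConn {
                                                                      match self {
                                                                          AnyConn::Closed(c) => AnyConn::Open(c.open()),
                                                                          AnyConn::Open(c) => AnyConn::Closed(c.close()),
                                                                      }
                                                                  }

                                                                  pub fn is_open(&self) -> bool {
                                                                      matches!(self, AnyConn::Open(_))
                                                                  }
                                                              }
                                                              ```

                                                              With this shape, `Conn::<Closed>::new().send("x")` does not compile because `send` is not defined for `Conn<Closed>`, while code that must branch on the state at runtime goes through `AnyConn`.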

                                                              1. 4

                                                                State machines are kinda a huge pain in any language

                                                                I’ve found state machines in dependently typed languages to be wicked cool, and much less painful. You can encode the state transitions into the types, which guarantees at compile time that your program is in a given state. The flexibility of the dependent type system allows you to encode state machines of arbitrary complexity.

                                                                For example, in Idris, you can do things like ensure that a file handle is open when you call read:

                                                                -- legal 
                                                                do f <- open "file.txt"
                                                                   content <- read f
                                                                   close f
                                                                
                                                                -- illegal 
                                                                do f <- open "file.txt"
                                                                   close f
                                                                   content <- read f   -- won't compile because the file is closed!
                                                                

                                                                Some documentation on idris state machines, and an example of stateful socket programming

                                                                1. 3

                                                                  I think it’s partially from a mismatch between conceptual model and language semantics, but a lot of it seems to come from people half-assing it, either by generating code that can get into a bad state, or not being explicit on what happens when receiving a message unsupported by the current state, things of that nature. I don’t think any of these issues are fundamental to the model of state machines though, or the semantics of the languages we use. Being more careful about generated code, and probably doing a lot more design-time / compile-time verification of the definitions would make it much more pleasant to use, in which case it would actually be used a lot more often, almost certainly to the improvement of software in general.

                                                                1. 3

                                                                  I looked into doing this once. At work, we had a tool where users could write little boolean expressions to classify bonds (for example something like: highYield = rating < BBB). These expressions got pretty complicated, and versioning them was tricky. I thought that git would be a nice fit.

                                                                  The main problem was replication. We had a strict requirement at the company that all services be distributed over multiple datacenters. With clustered databases, you can get this for free. With git, you need to commit to one repository, and then have some job/hook push out changes to other replicas. This generates a number of small nightmares: which repository is master? What if replication fails? How do you guarantee that commits aren’t lost when switching master?

                                                                  Using git turned out to be a pretty awkward thing to do; using a standard RDBMS seemed much more maintainable.

                                                                  I don’t think I would recommend using git as a database unless:

                                                                  • All (or almost all) the data you’re storing is plain text
                                                                  • It’s a non critical service that can withstand some downtime
                                                                  1. 16

                                                                    Moreover, we don’t live in 1980 anymore. The view and the purpose of computers have changed completely since then.

                                                                    I feel that the view and purpose of computers hasn’t changed too much. We still solve a lot of the same problems, and even work with a lot of the same tools (especially in unixes). How we interact with computers has changed dramatically (see smartphones, the internet), but things on the “system” level are relatively the same. Operating systems try to provide the same higher-level abstractions on processes as they did in 1980.

                                                                    1. 12

                                                                      This reminds me of the source code for the original Bourne shell. Stephen Bourne wanted a more ALGOL-like syntax, so he added macros to convert keywords like BEGIN, END, IF, and FI into braces { and }.

                                                                      Here’s an example file, macro.c, from the archived UNIX v7 source tree.

                                                                      The macros are defined in mac.h. :)

                                                                      1. 4

                                                                        There was an abomination called MORTRAN

                                                                        It was where I learnt the true meaning of “Syntactic Sugar”…..

                                                                        You can sugar your syntax as much as you like, if the semantics sucks, life sucks,