1. 37

  2. 17

    Awesome to see so much progress in Pharo. Pharo is a fork of Squeak, which was pretty much a direct reimplementation of Smalltalk as described in the Smalltalk-80 blue book. Pharo has evolved hugely from that starting point. Most importantly (from my incredibly biased perspective) it is focused on running code that interacts with the world outside of the Smalltalk VM.

    1. 2

      Why doesn’t the website mention this? As I read through the feature list I quickly realized it’s Smalltalk, but I didn’t see the words “Smalltalk” or “Squeak” anywhere. Is there some bad blood?

      1. 19

        From hn:

        The page seems to be going out of its way to not mention the word “Smalltalk”. Does anyone have an idea why that is? Does Smalltalk have such a bad reputation nowadays or has someone trademarked the term?

        Pharo considers itself to be Smalltalk-inspired.

        Maybe the most succinct explanation of what this means comes from their vision document:

        What we want to say is that if there is something that can be improved but does not conform the ANSI Smalltalk, we will do it anyway.

        This really does seem to be a fundamental value within the Pharo community, and it distinguishes them from the rest of the Smalltalk community. When observing the communities around Squeak or the commercial Smalltalks, I often get the impression that they view themselves as a sort of Jedi order that’s trying to preserve an older, better tradition amidst a decadent modern world. The Pharo community, by contrast, actively acknowledges that Smalltalk was not perfect, and is working hard to improve many things about it, including some of the fundamentals. For example, Pharo has taken active steps to make the platform more amenable to source control and, by extension, to collaboration among larger groups of people.

    2. 2

      I love Smalltalk as a language, and would love it if it could be compiled to a small, efficient executable. Why in this day and age is it still necessary to deploy your application with your code, VM, and the whole development environment instead?

      1. 27

        There are several problems with doing this for Smalltalk. The first is that dead-code elimination is basically impossible in a dynamic language with rich reflection support. Just because you don’t use a class now doesn’t mean that you can’t convert a user-provided string to a class name and instantiate it. The same applies for methods. Just because nothing calls a particular method doesn’t mean that you won’t use it via reflection.

        Objective-C was intended as a Smalltalk-like language amenable to ahead-of-time compilation and it suffers from this problem as well. You can call NSClassFromString() and get a class that nothing in your source code references and then use NSSelectorFromString() to get a selector that lets you invoke a method on it that nothing in your source code calls. This makes it very hard to do the ‘small’ bit of a ‘small, efficient executable’. On non-Apple platforms, people typically address this in Objective-C by explicitly removing classes that they don’t need from the standard-library builds but it’s very hard to tell if you’ve removed too many things.
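
        To make the reflection problem concrete, here is a minimal sketch in Python rather than Smalltalk or Objective-C (Python's reflection behaves similarly for this purpose). The class names and the `invoke` helper are hypothetical and exist only for illustration. Nothing in the source refers to `Unused` by name, yet it is still reachable from a run-time string, so a dead-code eliminator cannot safely drop it.

```python
class Component:
    """Hypothetical root class; stands in for the top of a class hierarchy."""


class Greeter(Component):
    def greet(self):
        return "hello"


class Unused(Component):
    # No code below mentions this class or this method by name.
    def secret(self):
        return "still reachable"


def invoke(class_name, method_name):
    """Instantiate a class and call a method, both chosen by run-time strings."""
    # Roughly analogous to NSClassFromString(): find a class from its name.
    classes = {cls.__name__: cls for cls in Component.__subclasses__()}
    receiver = classes[class_name]()
    # Roughly analogous to NSSelectorFromString() followed by a message send.
    return getattr(receiver, method_name)()


# In a real program these strings could come from user input or a config file.
result = invoke("Unused", "secret")
```

        Here `result` is the string returned by `Unused.secret`, even though static analysis of the call graph would find no caller for it.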

        The second problem is the degree of dynamic dispatch made possible by the fact that you’re allowed to add methods at run time and by duck typing. In C++ or Java, for example, it is trivial to implement dynamic dispatch via vtables: every selector is bound to a concrete type and so when you call a.foo() you can trivially transform the foo part into a fixed integer that is bound to the static type of a. Because most classes implement few methods, this ends up being a small number and so you don’t need much memory to store it. In contrast, in Smalltalk or Objective-C, every class must be able to respond to every selector. Logically, this means that you need an N×M matrix with an entry for every (class, selector) pair. If you don’t have reflection that lets you add methods then you can use tricks such as selector colouring to significantly reduce this space. Pony does this and Verona will (and I think Go does as well): you statically know which methods a class implements at compile time and so you can use the same value (vtable index) for any pair of selectors that are not implemented on the same class and rely on the static type system to avoid confusion. This is not possible for Smalltalk because even if you created such a system any new class you create or modify can invalidate it. You can potentially use this system in a JIT’d Smalltalk, because you have a mechanism for invalidating it whenever you need to (I think Animorphic Smalltalk implemented this or something similar, but it transformed the duck typing into explicit interface casts in its IR layer and so could get early notification that a class didn’t implement a method). Without something like this, any AoT environment needs to provide a dispatch mechanism that’s either slow or uses a lot of memory. Apple’s Objective-C runtime, for example, uses a linked list of methods and a small hash table to cache lookups.
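
        A rough sketch of selector colouring, in Python for brevity. This is a simplified greedy version that ignores inheritance, not how Pony or any other compiler actually implements it; the class and selector names are made up. Two selectors may share a table index only if no class implements both, and the last few lines show how adding one method at run time breaks a colouring that was valid at compile time.

```python
import itertools


def colour_selectors(classes):
    """Greedily give each selector a table index (its 'colour')."""
    colours = {}
    all_selectors = sorted({s for sels in classes.values() for s in sels})
    for selector in all_selectors:
        # Indices already used by selectors that share a class with this one.
        taken = {
            colours[other]
            for sels in classes.values() if selector in sels
            for other in sels if other in colours
        }
        colours[selector] = next(i for i in itertools.count() if i not in taken)
    return colours


def is_valid(classes, colours):
    """A colouring is valid if no class has two selectors with the same index."""
    return all(len({colours[s] for s in sels}) == len(sels)
               for sels in classes.values())


def build_table(selectors, colours, width):
    """Per-class dispatch table; a real one would hold function pointers."""
    table = [None] * width
    for s in selectors:
        table[colours[s]] = s
    return table


# Hypothetical classes and the selectors each one implements.
classes = {
    "Point": {"x", "y", "add"},
    "File": {"read", "close"},
    "Socket": {"read", "write", "close"},
}
colours = colour_selectors(classes)
width = max(colours.values()) + 1          # table width, not one slot per selector
tables = {name: build_table(sels, colours, width)
          for name, sels in classes.items()}
valid_before = is_valid(classes, colours)

# Adding a method at run time can collide with an index already in use.
classes["Point"].add("write")
valid_after = is_valid(classes, colours)
```

        With these inputs, six selectors fit in tables three entries wide. Note that the same index means different things in different tables, which is only safe when a static type system guarantees you never send a selector to a class that does not implement it.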

        This is made worse by the fall-back message dispatch mechanisms. In Smalltalk, if a class doesn’t respond to a selector, you can implement #doesNotUnderstand: and this method will be called as fallback. Objective-C has a richer set of these allowing you to do things like rewrite the receiver and then redo normal dispatch or have a fast path for adding the method. All of these add complexity on the dispatch path that in a language such as C++, C#, Java, Pony, or Verona is just `foo->vtable[selector](foo, …)`.
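
        As an illustration of the extra branch this puts on every send, here is a small Python sketch of a dispatch function with a #doesNotUnderstand:-style fallback. The `send` function, the `does_not_understand` hook and the example classes are all hypothetical names chosen for this sketch; it models the control flow only, not any real runtime.

```python
class SmalltalkLikeObject:
    def does_not_understand(self, selector, args):
        # Default fallback: report the failure, as a base class might.
        raise AttributeError(
            f"{type(self).__name__} does not understand {selector}")


def send(receiver, selector, *args):
    """Dispatch a message: normal lookup first, then the fallback hook."""
    method = getattr(type(receiver), selector, None)
    if method is not None:
        return method(receiver, *args)
    # Slow path that a plain vtable call never needs.
    return receiver.does_not_understand(selector, args)


class Account(SmalltalkLikeObject):
    def balance(self):
        return 100


class RecordingProxy(SmalltalkLikeObject):
    """Implements no real methods; forwards everything to a target."""

    def __init__(self, target):
        self.target = target
        self.log = []

    def does_not_understand(self, selector, args):
        self.log.append(selector)
        return send(self.target, selector, *args)


proxy = RecordingProxy(Account())
result = send(proxy, "balance")
```

        The proxy never defines `balance`, so the compiler cannot know at the call site which code will eventually run.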

        The third problem is specific to Smalltalk: the feature that Smalltalk / Pharo users love and implementers hate, #become:. This lets you replace an object with another at run time. I’ve implemented this in an AoT Smalltalk, but it came with some overhead (it modified the receiver to be a proxy and forward every method). To avoid that overhead, you really want to hook it into garbage collection so that you can rewrite the methods. This means you need to bundle a complex GC with the system, which again adds to the size.
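
        A minimal Python sketch of the proxy approach, assuming a one-way replacement (closer in spirit to a forwarding become than to a full two-way swap). The `Ref` class and its `become` method are hypothetical names for this sketch. Every access pays for an extra indirection, which is the overhead mentioned above.

```python
class Ref:
    """Proxy that forwards every attribute lookup to its current target."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for everything
        # except _target and become.
        return getattr(self._target, name)

    def become(self, other):
        # Every existing reference to this Ref now sees the new object.
        self._target = other


class Cat:
    def speak(self):
        return "meow"


class Dog:
    def speak(self):
        return "woof"


pet = Ref(Cat())
alias = pet                  # a second reference held elsewhere in the program
before = alias.speak()
pet.become(Dog())
after = alias.speak()
```

        A GC-integrated implementation would instead update the references themselves, so that no proxy stays on the dispatch path.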

        Finally, as mjn points out, there is the eval problem: any language that provides a mechanism for run-time code generation needs to bundle at least a parser and interpreter and ideally a complete JIT, even if the main program is AoT compiled. This is a problem for AoT JVM and .NET implementations: both provide mechanisms for injecting bytecode and having new classes loaded. This is a problem for size and has a knock-on problem for performance: You can optimise the main program based on whole-program analysis, but you have to be able to undo any of these things if new code is loaded (I think the C# AoT compiler has an option to just not support dynamic code loading). For example, if you do some flow analysis and determine that a particular call site is reachable with a single concrete type, you can inline the method (or, at least, avoid dynamic dispatch). Unfortunately you then load some code that makes this call site reachable with another path and now your optimisation is wrong. This is one of the main cases where JIT environments outperform AoT ones: AoT compilers must be conservative and optimise for what may happen, JIT compilers can be aggressive and optimise for what does happen and then throw the optimisations away and do something different if something else happens. When you have a very dynamic language, an AoT compiler ends up having to make very conservative choices in its optimisations because pretty much any invariant that appears to be true at the start of execution can become untrue later on.
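
        The "optimise for what does happen, then throw it away" idea can be sketched as a monomorphic inline cache. This Python version is only a model of the control flow (a real JIT patches machine code); `CallSite`, `Rect` and `Square` are hypothetical names for this sketch.

```python
class CallSite:
    """One call site that speculates on a single receiver type."""

    def __init__(self, selector):
        self.selector = selector
        self.expected_type = None   # the type we have specialised for
        self.fast_path = None       # the method resolved for that type
        self.deopts = 0             # how often the speculation was discarded

    def call(self, receiver, *args):
        # Guard: cheap check that the speculation still holds.
        if type(receiver) is self.expected_type:
            return self.fast_path(receiver, *args)
        # Speculation failed (or first call): discard it and re-specialise.
        if self.expected_type is not None:
            self.deopts += 1
        self.expected_type = type(receiver)
        self.fast_path = getattr(type(receiver), self.selector)
        return self.fast_path(receiver, *args)


class Rect:
    def __init__(self, w, h):
        self.w, self.h = w, h

    def area(self):
        return self.w * self.h


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


site = CallSite("area")
first = site.call(Rect(2, 3))
second = site.call(Rect(4, 5))      # takes the fast path
deopts_before = site.deopts
third = site.call(Square(2))        # new type arrives: guard fails
deopts_after = site.deopts
```

        An AoT compiler that inlined `Rect.area` at this site without a guard would be wrong as soon as code producing a `Square` was loaded, which is why it has to be conservative.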

        I don’t like it when people talk about a JIT’d language vs an AoT language, because JIT vs AoT is a property of the implementation, but there are some language features that particularly favour one over the other. C, for example, gains very little benefit from JIT compilation (some extra profiling information around which code paths are hot, but little else) and the fact that you can take function pointers in C code means that you may need to add thunks (which hurt performance) if you want to do any of the optimisations that a JIT may do that an AoT compiler can’t. In contrast, generating good code for Smalltalk relies very heavily on being able to observe the running behaviour of a system. A naïve AoT Smalltalk will outperform interpreted Smalltalk and probably outperform a naïve (method-at-a-time) Smalltalk JIT, but a modern trace-based Smalltalk JIT will outperform the best Smalltalk AoT compiler that it’s possible to write.

        1. 4

          Thanks a lot for this comment, it’s very informative.

          1. 3

            You can optimise the main program based on whole-program analysis, but you have to be able to undo any of these things if new code is loaded (I think the C# AoT compiler has an option to just not support dynamic code loading). For example, if you do some flow analysis and determine that a particular call site is reachable with a single concrete type, you can inline the method (or, at least, avoid dynamic dispatch). Unfortunately you then load some code that makes this call site reachable with another path and now your optimisation is wrong.

            One approach to this is CMUCL’s block compilation, recently re-added in SBCL. It lets you mark a section of code where you want whole-program optimization semantics, with specified entry points. Anything not an entry point is fair game for let-conversion, specialization, etc., and the compiler doesn’t have to undo any optimizations on redefinition. Arguably a bit of a hack, but it sped up some code significantly in practice, and is less extreme than the C# option you mention of disabling dynamic code loading entirely.

            I believe the history of this is that people were doing something like that manually for performance, writing large subsections of Lisp programs as one big function with a bunch of local/inner functions. Local/inner functions can’t be redefined separately from their containing function, so you get “whole program” optimization out of intra-function optimization by putting your whole program into one function. Block compilation lets you get similar semantics with more normal looking code.
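
            The same distinction shows up in Python, which makes for a quick sketch (Python here, not Lisp, and the function names are made up). A call to a top-level function is looked up by name on every call, so redefinition changes the caller's behaviour; a function nested inside another cannot be redefined from outside, so the enclosing function is a closed unit that an optimiser could in principle treat as a whole.

```python
def helper(x):
    return x + 1


def uses_global(x):
    # 'helper' is looked up by name at each call, so it can be redefined.
    return helper(x) * 2


def whole_program(x):
    # Local function: not nameable, and so not redefinable, from outside.
    def helper(x):
        return x + 1
    return helper(x) * 2


before = (uses_global(1), whole_program(1))


# Redefine the top-level helper, as you might at a REPL.
def helper(x):
    return x + 100


after = (uses_global(1), whole_program(1))
```

            Only `uses_global` changes its result after the redefinition, so only it would need its optimisations undone.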

          2. 7

            Languages that can do a lot of things dynamically at runtime instead of statically at compile time are notoriously hard to compile to standalone executables, without including the whole dev environment in the executable. For example, if your language allows you to load code from a file and evaluate it, you need essentially the entire development environment available at runtime, because you don’t know in advance what parts of it the loaded-and-eval’d code might need.

            Python and Common Lisp are two other languages where this poses a problem.
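
            A small Python sketch of why this is hard to analyse ahead of time. The `run_plugin` helper and the plugin source are hypothetical; in practice the source would come from a file. The loaded code picks a standard-library module by a name computed at run time, so nothing visible when the main program is packaged says that module is needed.

```python
def run_plugin(source, entry_point):
    """Compile and evaluate source text at run time, then call into it."""
    namespace = {}
    # This needs the parser and compiler to be present in the deployed program.
    exec(compile(source, "<plugin>", "exec"), namespace)
    return namespace[entry_point]()


# Stand-in for code loaded from a file after deployment.
plugin_source = '''
import importlib

def main():
    # The module name is assembled at run time; a packager that only
    # scans the main program never sees the name "json".
    mod = importlib.import_module("js" + "on")
    return mod.dumps({"ok": True})
'''

output = run_plugin(plugin_source, "main")
```

            A tool that strips unused modules from the deployment has no way of knowing about this dependency without running the plugin.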

            1. 4

              The problem doesn’t seem that specific to Smalltalk to me, since we’re using Docker to package up those “small efficient executables” along with all of their dependencies in a cross-linguistic fashion. Frankly, image + VM isn’t that different from Python code + virtual environments or Java jar/war files + servlet/application containers.

            2. 1

              Does anyone have pointers to interesting Pharo code examples? If so, feel free to share :)

              1. 2

                See if https://pharo.org/success.html helps (although the header links of the stories aren’t working).