Threads for tekknolagi

  1. 2

    It would be fun to have a benchmark so that people can do targeted work on emulator perf and see it make the superoptimizer faster. But maybe that’s just my bias as a runtimes optimization person :)

    1. 2

      Also, consider using PyPy–it’s at least 5x faster on my machine at doing the superoptimization.

    1. 3

      Hi sillycross! Ever since your first post in this series I have been eagerly watching the repo. This is such a neat project and I am excited to see where it goes!

      1. 3

        A related project (not mentioned in the readme): https://github.com/HigherOrderCO/HVM

        1. 1

          Oh yeah! HVM is super cool.

        1. 7

          The “OCaml code in the wild” example is spot on. And hilarious.

          1. 7

            FYI, they strip out Google Analytics, which feels a little weird.

            1. 46

              They actually just block all third-party content from loading by telling the browser not to load it via a Content Security Policy. They do not modify what you upload to the site (afaik).

              1. 33

                It’s the only ethical choice for hosting providers.

                1. 11

                  This is also mentioned in the documentation. I guess the target audience are not really using analytics.

                1. 5

                  This is such a nice and thorough walk-through. Props!

                  1. 2

                    Note: I do not mean to discount this research. I am asking why we don’t have this yet.

                    If you write a lexer in a functional style, and you write a parser in at least a little bit of a functional style, can you get this for free? This is all a bit handwavy, but I thought compilers like GHC and maybe others could do stream fusion or loop fusion or something like that. It would be really nice to automatically be able to skip all the overhead in lexing/parsing, or operating over compressed data (reading structures out of a zip file, for example), or …
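
                    To make the shape of the question concrete, here is a rough Python sketch (illustrative only, nothing fused yet) of the streaming style I mean: the parser pulls tokens lazily from a generator lexer, so no token list is ever materialized. The question is whether a compiler could go further and fuse the two loops into one, the way GHC’s stream fusion does for list pipelines.

                    def lex(src):
                        # Toy lexer: emit whitespace-separated tokens one at a time.
                        tok = ""
                        for ch in src:
                            if ch.isspace():
                                if tok:
                                    yield tok
                                    tok = ""
                            else:
                                tok += ch
                        if tok:
                            yield tok

                    def parse_sum(tokens):
                        # Toy parser: fold the lazy token stream without materializing it.
                        return sum(int(tok) for tok in tokens)

                    print(parse_sum(lex("1 2 3 4")))  # 10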

                    1. 2

                      We propose a deterministic variant of Greibach Normal Form that ensures deterministic parsing with a single token of lookahead and makes fusion strikingly simple

                      I don’t expect a generic compiler optimization to produce an effect as specific as what’s in the paper. You generally need to write code in a specific form (a DSL), and use optimizations specific to the DSL.

                      In this case, the lexer and parser must be converted to DGNF, a format they just invented, before their new optimization algorithm can do its work.

                      1. 1

                        In particular, they wouldn’t have seen 2-7x speed improvements like they did, if the compiler was already applying their optimization.

                    1. 3

                      One day, after bringing up enough of a new Python runtime to run a webserver, we saw abysmal performance. A quick bit of profiling showed that string processing was a huge bottleneck. Not super unexpected, but this looked especially bad. We did some investigation and found this commit to be at fault, authored by yours truly. Turns out str.rpartition is used a lot to look at file extensions and parse HTTP requests (IIRC) and my one-liner from early development was murdering performance:

                      class str:
                          # ...
                          def rpartition(self, sep):
                              # Reverse the string and the separator, partition from the left,
                              # then un-reverse the three pieces. Correct, but it copies the
                              # whole string several times over.
                              before, itself, after = self[::-1].partition(sep[::-1])[::-1]
                              return before[::-1], itself[::-1], after[::-1]
                      

                      Rewriting it to not be… that… brought performance to somewhere near normal.
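
                      For the curious, the fix was roughly this shape (a sketch, not necessarily the exact code we shipped; it assumes the runtime already has str.rfind, and the empty-separator check is omitted): scan once from the right and slice, instead of reversing and copying the whole string over and over.

                      class str:
                          # ...
                          def rpartition(self, sep):
                              idx = self.rfind(sep)
                              if idx < 0:
                                  # Separator not found: everything goes in the last slot, matching CPython.
                                  return "", "", self
                              return self[:idx], sep, self[idx + len(sep):]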

                      1. 2

                        Extremely well-known hostnames such as “sri-nic” and “uunet”

                        Certainly not known to me… what were those?

                        1. 2

                          See https://en.wikipedia.org/wiki/InterNIC#SRI at least for the former

                          1. 2

                            Thanks for the shout-out :)

                            1. 18

                              Do you have any more information on the project? This is a bit light.

                              1. 3

                                I haven’t shared the open source project publicly yet, but I plan to later this year.

                                This thread has some example code and a link for more info if you’re interested (some details have changed since): https://twitter.com/haxor/status/1618054900612739073

                                And I wrote a related post about motivations here: https://www.onebigfluke.com/2022/11/the-case-for-dynamic-functional.html

                                1. 18

                                  There is no static type system, so you don’t need to “emulate the compiler” in your head to reason about compilation errors.

                                  Similar to how dynamic languages don’t require you to “emulate the compiler” in your head, purely functional languages don’t require you to “emulate the state machine”.

                                  This is not how I think about static types. They’re a mechanism for allowing me to think less by making a subset of programs impossible. Instead of needing to think about whether s can be “hello” or 7, I know I only have to worry about s being 7 or 8. The compiler error just means I accidentally wrote a program where it is harder to think about the possible states of the program. Needing to reason about the error means I already made a mistake in reasoning about my program, which is the important thing. Fewer errors before the program is run doesn’t mean the mistakes weren’t made.

                                  I am not a zealot; I use dynamically typed languages. But it is for problems where the degree of dynamism inherent in the problem means introducing the ceremony of program-level typing is extra work, not because reading the compiler errors is extra work.

                                  This is very analogous to the benefits of functional languages you point out. By not having mutable globals the program is easier to think about: if s is 7, it is always 7.

                                  Introducing constraints to the set of possible programs makes it easier to reason about our programs.
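
                                  As a toy illustration (hypothetical snippet; any checker such as mypy stands in for “the compiler” here), the annotation is exactly the constraint on possible programs I mean:

                                  s: int = 7
                                  s = 8          # fine: still within the states I have to think about
                                  s = "hello"    # a static checker rejects this before the program ever runs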

                                  1. 4

                                    I appreciate the sentiment of your reply, and I do understand the value of static typing for certain problem domains.

                                    Regarding this:

                                    “making a subset of programs impossible”

                                    How do you know what subset becomes impossible? My claim is you have to think like the compiler to do that. That’s the problem.

                                    I agree there’s value in using types to add clarity through constraints. But there’s a cost for the programmer to do so. Many people find that cost low and it’s easy. Many others — significantly more people in my opinion — find the cost high and it’s confusing.

                                    1. 10

                                      I really like your point about having to master several languages. I’m glad to be rid of a preprocessor, and languages like Zig and Nim are making headway on unifying compile-time and runtime programming. I disagree about the type system, though: it does add complexity, but it’s scalable and, I think, very important for larger codebases.

                                      Ideally the “impossible subset” corresponds to what you already know is incorrect application behavior — that happens a lot of the time, for example declaring a “name” parameter as type “string” and “age” as “number”. Passing a number for the name is nonsense, and passing a string for the age probably means you haven’t parsed numeric input yet, which is a correctness and probably security problem.
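
                                      As a tiny illustration (hypothetical function, Python-style annotations standing in for any typed signature):

                                      def register(name: str, age: int) -> None:
                                          print(f"{name} is {age}")

                                      register("Ada", 36)     # fine
                                      register(36, "Ada")     # nonsense; a checker flags the swapped arguments
                                      register("Ada", "36")   # flagged too: numeric input not parsed yet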

                                      It does get a lot more complicated than this, of course. Most of the time that seems to occur when building abstractions and utilities, like generic containers or algorithms, things that less experienced programmers don’t do often.

                                      In my experience, dynamically-typed languages make it easier to write code, but harder to test, maintain and especially refactor it. I regularly make changes to C++ and Go code, and rely on the type system to either guide a refactoring tool, or at least to produce errors at all the places where I need to fix something.

                                      1. 4

                                        How do you know what subset becomes impossible? My claim is you have to think like the compiler to do that. That’s the problem.

                                        You’re right that you have to “think like the compiler” to be able to describe the impossible programs for it to check, but everybody writing a program has an idea of what they want it to do.

                                        If I don’t have static types and I make the same mistake, I will have to reason about the equivalent runtime error at some point.

                                        I suppose my objection is framing it as “static typing makes it hard to understand the compiler errors.” It is “static typing makes programming harder” (with the debatably-worth-it benefit of making running the program easier). The understandability of the errors is secondary; if there is value, there’s still value even if the error was as shitty as “no.”

                                        But there’s a cost for the programmer to do so. Many people find that cost low and it’s easy. Many others — significantly more people in my opinion — find the cost high and it’s confusing.

                                        I think this is the same for “functionalness”. For example, often I find I’d rather set up a thread local or similar because it is easier to deal with than threading some context argument through everything.

                                        I suppose there is a difference in the sense that being functional is not (as much of) a configurable constraint. It’s more or less on or off.

                                        1. 3

                                          I agree there’s value in using types to add clarity through constraints. But there’s a cost for the programmer to do so. Many people find that cost low and it’s easy. Many others — significantly more people in my opinion — find the cost high and it’s confusing.

                                          I sometimes divide programmers into two categories: the first acknowledge that programming is a form of applied maths; the second went into programming to run away from maths.

                                          It is very difficult for me to relate to the second category. There’s no escaping the fact that our computers ultimately run formal systems, and most of our job is to formalise unclear requirements into an absolutely precise specification (source code), which is then transformed by a formal system (the compiler) into a stream of instructions (object code) that will then be interpreted by some hardware (the CPU, GPU…) with more or less relevant limits & performance characteristics. (It’s obviously a little different if we instead use an interpreter or a JIT VM).

                                          Dynamic type systems mostly allow scared-of-maths people to ignore the mathematical aspects of their programs for a bit longer, until of course they get some runtime error. Worse, they often mistake their should-have-been-a-type-error mistakes for logic errors, and then claim a type system would not have helped them. Because contrary to popular belief, type errors don’t always manifest as such at runtime. Especially when you take advantage of generics & sum types: they make it much easier to “define errors out of existence”, by making sure huge swaths of your data are correct by construction.
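
                                          A small sketch of what I mean by “correct by construction”, using Python 3.10+ syntax as a stand-in for a real sum type (hypothetical names):

                                          from dataclasses import dataclass

                                          @dataclass
                                          class Ok:
                                              value: int

                                          @dataclass
                                          class Err:
                                              message: str

                                          Result = Ok | Err   # a sum type: every Result is one of these two, nothing else

                                          def describe(r: Result) -> str:
                                              match r:
                                                  case Ok(value):
                                                      return f"got {value}"
                                                  case Err(message):
                                                      return f"failed: {message}"

                                          A value that is neither Ok nor Err is simply not representable, so a whole class of “forgot to handle the error” mistakes can’t be expressed.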

                                          And the worst is, I suspect you’re right: it is quite likely most programmers are scared of maths. But I submit maths aren’t the problem. Being scared is. People need to learn.

                                          My claim is you have to think like the compiler to do that.

                                          My claim is that I can just run the compiler and see if it complains. This provides a much tighter feedback loop than having to actually run my code, even if I have a REPL. With a good static type system my compiler is disciplined so I don’t have to be.

                                          1. 6

                                            Saying that people who like dynamic types are “scared of math” is incredibly condescending and also ignorant. I teach formal verification and am writing a book on formal logic in programming, but I also like dynamic types. Lots of pure mathematics research is done with Mathematica, Python, and Magma.

                                            I’m also disappointed but unsurprised that so many people are arguing with a guy for not making the “right choices” in a language about exploring tradeoffs. The whole point is to explore!

                                            1. 3

                                              Obviously people aren’t monoliths, and there will be exceptions (or significant minorities) in any classification.

                                              Nevertheless, I have observed that:

                                              • Many programmers have explicitly taken programming to avoid doing maths.
                                              • Many programmers dispute that programming is applied maths, and some downvote comments saying otherwise.
                                              • The first set is almost perfectly included in the second.

                                              As for dynamic typing, almost systematically, arguments in favour seem to be less rigorous than arguments against. Despite Lisp. So while the set of dynamic typing lovers is not nearly as strongly correlated with “maths are scary”, I do suspect a significant overlap.

                                              While I do use Python for various reasons (available libraries, bignum arithmetic, and popularity among cryptographers (SAGE) being the main ones), dynamic typing has systematically hurt me more than it helped me, and I avoid it like the plague as soon as my programs reach non-trivial sizes.

                                              I could just be ignorant, but despite having engaged in static/dynamic debates with articulate peers, I have yet to see any compelling argument in favour. I mean there’s the classic sound/complete dilemma, but non-crappy systems like F*, or what we see in ML and Haskell, very rarely stopped me from writing a program I really wanted to write. Sure, some useful programs can’t be typed. But for those, most static checking systems have escape hatches. And many programs people think can’t be typed actually can. See Rich Hickey’s transducers for instance: throughout his talk he was dismissively daring static programmers to type them, only to have a Haskell programmer actually do it.

                                              There are of course very good arguments favouring some dynamic language at the expense of some static language, but they never survive a narrowing down to static & dynamic typing in general. The dynamic language may have a better standard library, the static language may have a crappy type system with lots of CVE inducing holes… all ancillary details that have little to do with the core debate. I mean it should be obvious to anyone that Python, Mathematica, and Magma have many advantages that have little to do with their typing discipline.


                                              Back to what I was originally trying to respond to, I don’t understand people who feel like static typing has a high cognitive cost. Something in the way their brain works (or their education) is either missing or alien. And I’m highly sceptical of claims that some people are just wired differently. It must be cultural or come from training.

                                              And to be honest I have an increasingly hard time considering the dynamic and static positions equal. While I reckon dynamic type systems are easier to implement and more approachable, beyond that I have no idea how they help anyone write better programs faster, and I increasingly suspect they do not.

                                              1. 6

                                                Even after trying to justify that you’ve had discussions with “articulate peers” and “could just be ignorant” and this is all your own observations, you immediately double back to declaring that people who prefer dynamic typing are cognitively or culturally defective. That makes it really, really hard to assume you’re having any of these arguments in good faith.

                                                1. 1

                                                  To be honest I only recall one such articulate peer. On Reddit. He was an exception, and you’re the second one that I recall. Most of the time I see poorer arguments strongly suggesting either general or specific ignorance (most of the time they use Java or C++ as the static champion). I’m fully aware of how unsettling and discriminatory the idea is that people who strongly prefer dynamic typing would somehow be less. But from where I stand it doesn’t look that false.

                                                  Except for the exceptions. I’m clearly missing something, though I have yet to be told what.

                                                  Thing is, I suspect there isn’t enough space in a programming forum to satisfactorily settle that debate. I would love to have strong empirical evidence, but I have reasons to believe this would be very hard: if you use real languages there will be too many confounding variables, and if you use a toy language you’ll naturally ignore many of the things both typing disciplines enable. For now I’d settle for a strong argument (or set thereof). If someone has a link that would be much appreciated.

                                                  And no, I don’t have a strong link in favour of static typing either. This is all deeply unsatisfactory.

                                                  1. 5

                                                    There seems to be no conclusive evidence one way or the other: https://danluu.com/empirical-pl/

                                                    1. 3

                                                      Sharing this link is the only correct response to a static/dynamic argument thread.

                                                      1. 1

                                                        I know of — oops I do not, I was confusing it with some other study… Thanks a ton for the link, I’ll take a look.

                                                        Edit: from the abstract there seems to be some evidence of the absence of a big effect, which would be just as huge as evidence of an effect one way or the other.

                                                        Edit 2: just realised this is a list of studies, not just a single study. Even better.

                                            2. 1

                                              How do you know what subset becomes impossible?

                                              Well, it’s the subset of programs which decidably don’t have the desired type signature! Such programs provably aren’t going to implement the desired function.

                                              Let me flip this all around. Suppose that you’re tasked with encoding some function as a subroutine in your code. How do you translate the function’s type to the subroutine’s parameters? Surely there’s an algorithm for it. Similarly, there are algorithms for implementing the various primitive pieces of functions, and the types of each primitive function are embeddable. So, why should we build subroutines out of anything besides well-typed fragments of code?

                                            3. 4

                                              Sure, but I think you’re talking past the argument. It’s a tradeoff. Here is another good post that explains the problem and gives it a good name: biformity.

                                              https://hirrolot.github.io/posts/why-static-languages-suffer-from-complexity

                                              People in the programming language design community strive to make their languages more expressive, with a strong type system, mainly to increase ergonomics by avoiding code duplication in final software; however, the more expressive their languages become, the more abruptly duplication penetrates the language itself.

                                              That’s the issue that explains why separate compile-time languages arise so often in languages like C++ (mentioned in the blog post), Rust (at least 3 different kinds of compile-time metaprogramming), OCaml (many incompatible versions of compile-time metaprogramming), Haskell, etc.

                                              Those languages are not only harder for humans to understand, but for tools as well.

                                              1. 4

                                                The Haskell metaprogramming system that immediately jumps to mind is Template Haskell, which makes a virtue of not introducing a distinct metaprogramming language: you use Haskell for that purpose as well as for the main program.

                                                1. 1

                                                  Yeah, the linked post mentions Template Haskell and gives it some shine, but it also points out other downsides and complexity with Haskell. Again, not saying that types aren’t worth it, just that it’s a tradeoff, and that the tradeoffs look different when applied to different problem domains.

                                                2. 2

                                                  Sure, but I think you’re talking past the argument

                                                  This is probably a fair characterization.

                                                  Those languages are not only harder for humans to understand, but for tools as well

                                                  I am a bit skeptical of this. Certainly C++ is harder for a tool to understand than, say, C, but I would be much less certain about, say, Ruby vs. Haskell.

                                                  Though I suppose it depends on if the tool is operating on the program source or a running instance.

                                              2. 7

                                                One common compelling reason is that dynamic languages like Python only require you to learn a single tool in order to use them well. […] Code that runs at compile/import time follows the same rules as code running at execution time. Instead of a separate templating system, the language supports meta-programming using the same constructs as normal execution. Module importing is built-in, so build systems aren’t necessary.

                                              That’s exactly what Zig is doing with its “comptime” feature: using the same language, while keeping a statically typed and compiled approach.
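
                                              For reference, the Python side of that quote is just ordinary code running at import time (a toy sketch); comptime gives you the same single-language property while staying static:

                                              # Runs once, at import time, under exactly the same rules as runtime code.
                                              OPS = {}

                                              def register(name):
                                                  def deco(fn):
                                                      OPS[name] = fn
                                                      return fn
                                                  return deco

                                              @register("add")
                                              def add(a, b):
                                                  return a + b

                                              @register("mul")
                                              def mul(a, b):
                                                  return a * b

                                              def apply(name, a, b):
                                                  # Runtime code consults the table the import-time code built.
                                                  return OPS[name](a, b)

                                              print(apply("mul", 6, 7))  # 42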

                                                1. 4

                                                  I’m wondering where you feel dynamic functional languages like Clojure and Elixir fall short? I’m particularly optimistic about Elixir as of late since they’re putting a lot of effort in expanding to the data analytics and machine learning space (their NX projects), as well as interactive and literate computing (Livebook and Kino). They are also trying to understand how they could make a gradual type system work. Those all feel like traits that have made Python so successful and I feel like it is a good direction to evolve the Elixir language/ecosystem.

                                                  1. 3

                                                    I think there are a lot of excellent ideas in both Clojure and Elixir!

                                                  With Clojure the practical dependence on the JVM is one huge deal breaker for many people because of licensing concerns. BEAM is better in that regard, but it shares the way VMs require a lot of runtime complexity, which makes them harder to debug and understand (compared to, say, the C ecosystem tools).

                                                    For the languages themselves, simple things like explicit returns are missing, which makes the languages feel difficult to wield, especially for beginners. So enumerating that type of friction would be one way to understand where the languages fall short. Try to recoup some of the language’s strangeness budget.

                                                  2. 2

                                                  I’m guessing the syntax is a pretty regular Lisp, but with newlines and indents making many of the parentheses unnecessary?

                                                    Some things I wish Lisp syntax did better:

                                                    1. More syntactically first-class data types besides lists. Most obviously dictionaries, but classes kind of fit in there too. And lightweight structs (which get kind of modeled as dicts or tuples or objects or whatever in other languages).
                                                    2. If you have structs you need accessors. And maybe that uses the same mechanism as namespaces. Also a Lisp weak point.
                                                  3. Named and default arguments. The Lisp approaches feel like kludges. Smalltalk is kind of an ideal, but secretly just the weirdest naming convention ever. Though maybe it’s not so crazy to imagine Lisp syntax with function names blown out over the call like in Smalltalk.
                                                    1. 1

                                                      Great suggestions thank you! The syntax is trying to avoid parentheses like that for sure. If you have more thoughts like this please send them my way!

                                                      1. 1

                                                        This might be an IDE / LSP implementation detail, but would it be possible to color-code the indentation levels? Similar to how editors color code matching brackets these days. I always have a period of getting used to Python where the whitespace sensitivity disorients me for a while.

                                                        1. 2

                                                          Most editors will show a very lightly shaded vertical line for each indentation level with Python. The same works well for this syntax too. I have seen colored indentation levels (such as https://archive.fosdem.org/2022/schedule/event/lispforeveryone/), but I think it won’t be needed because of the lack of parentheses. It’s the same reason I don’t think it’ll be necessary to use a structural editor like https://calva.io/paredit/

                                                1. 3

                                                  Regarding Handles, I experimented with them, but never got them to work well in the context of Oil. I tried to pick Erik Corry’s brain a little before the second try, and got this interesting feedback:

                                                  The big advantage is no handles, no smart pointers, just plain C++ pointers and references living in the moment :-)

                                                  https://old.reddit.com/r/cpp/comments/v1vkce/a_garbagecollected_heap_in_c_shaped_like_typed/id1b0mo/

                                                  I don’t know exactly what it means :) But they used Handles in a long line of VMs, and seemingly came back around to raw pointers. (I’d be interested in more details.)

                                                  All I know is that I much prefer non-moving GC and raw pointers to the moving GC. It’s not quite as fast, but you can debug it like a normal C++ program, and all your performance tools work!

                                                  It was EXTREMELY disorienting to debug C++ code where objects are flying around. This was exacerbated by the fact that our collector is at the C++ level (collecting structs and classes), but I think it’s pretty much the same issue in a VM.

                                                    And after instrumenting Oil’s workload and writing some R scripts, I have a design for a pool allocator that integrates with the GC (and our separate mark bitmap) and should handle over 75% of allocations. The pool should “approximate” a bump allocator. It may not be as fast, but I think it will be very simple and robust.

                                                  Performance comes from many places, and I think having all the normal C++ tools working is extremely important.

                                                  I don’t really “trust” moving GC in C++ … I think the language has to be changed so that the compiler can be aware of moving pointers.

                                                  Also, lazy sweeping in mark and sweep gives you the O(num live objects) runtime of Cheney, rather than the O(heap size) runtime of eager sweeping. This should matter a lot for many workloads.


                                                  FWIW I stumbled across their most recent GC when writing a blog post back in May:

                                                  https://github.com/toitlang/toit/tree/master/src/third_party/dartino

                                                  https://www.oilshell.org/blog/2022/05/gc-heap.html


                                                    I also found this comment, from a post about Rust and rooting, interesting:

                                                  http://blog.pnkfx.org/blog/2016/01/01/gc-and-rust-part-2-roots-of-the-problem/

                                                  Just to be clear: the joke here is that we are basically suggesting layering our own semi-automated memory management system on top of a third-party automated memory management system. We should be striving to reduce our problems to smaller subproblems, not reproducing them.

                                                  This is the problem I found with handles – I don’t know what safety they really provide. My feeling is that they sit in an awkward spot between static and dynamic. It’s impossible to express handles in C, so obviously they are extraneous in some sense.

                                                  The garbage collector is already a dynamic mechanism for ensuring memory safety at runtime. It ensures there aren’t dangling references.

                                                  It doesn’t really make sense to me to recapitulate that in a Handle mechanism, but just for roots. As far as I know, it’s impossible to write a Handle that enforces correct usage statically. You always end up with a bunch of compromises between performance and safety.

                                                  Again I think the real solution is to not have 500,000 lines of C++ in your language implementation. Like GC, handles are global to the program, because reachability is a global property. Once you start using them, you have to use them everywhere. They’re not modular.

                                                  Also, some people advocated handles, but there isn’t a single design, and I never found a good explanation of the tradeoffs. I think even the “professional” GC implementers are struggling with this issue, and the state of the art keeps changing. Someone should write an overview, but I think nobody is quite satisfied with what they currently have, because of fundamental limitations of C++.


                                                  Regarding 24 bits, the initial version of Oil’s GC had that hard-coded limit, basically to support the separate mark bitmap. But a few weeks ago, with not much effort, @abathur managed to make Oil allocate over 16 Mi objects and crash. So I just changed it to 30 bits :)

                                                    Also reminds me that I have been thinking about serializing heaps for a long time, and the prototype in this post used 24-bit pointers as well!

                                                  https://www.oilshell.org/blog/2017/01/09.html

                                                  That design was more read-only, and I never implemented the graph part – it was a tree. Right now I’m actually back on that horse, experimenting with Python’s pickle as a better format for a graph. It’s a nice format once you get rid of all the insecurity with Python code execution – i.e. making it pure data with lists and dicts, not Python classes and constructors. It’s a stack-based VM with object identity and no control flow.
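
                                                    The object-identity part is what makes pickle usable as a graph format rather than just a tree format; a quick stdlib-only illustration:

                                                    import pickle, pickletools

                                                    shared = ["leaf"]
                                                    graph = {"left": shared, "right": shared}   # two edges to the same node

                                                    data = pickle.dumps(graph)
                                                    pickletools.dis(data)   # stack-machine opcodes; the memo ops preserve sharing

                                                    copy = pickle.loads(data)
                                                    assert copy["left"] is copy["right"]        # identity survives the round trip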

                                                  1. 4

                                                    It’s impossible to express handles in C, so obviously they are extraneous in some sense.

                                                    In C a handle is just a pointer to a pointer, where the second pointer is owned by the memory manager and is updated when the object moves. For example a Window**, which you access like (**w).title.

                                                      Handles like these were ubiquitous in the “classic” MacOS; they existed to counter the problem of heap fragmentation in very small heaps (128KB and smaller!). When the heap filled up, the memory manager would relocate moveable blocks to make room. All kinds of system data was accessed through handles (although strangely, not windows).

                                                    And yeah, every developer at some point fell victim to half-dereferencing a handle into a pointer to the object, and then repeatedly using that pointer. It was faster and cleaner. But if you accidentally called something that could allocate memory, your pointer could become invalid and then you were in for a bad time.

                                                    It kind of sucked and I’m glad those “dumb” handles are gone, but there was nothing about them that C or C++ couldn’t handle.

                                                    [Update: That pun was unintentional!]

                                                    1. 1

                                                      Right, so a couple contributors thought that handles in C++ would provide some kind of static guarantees about correct usage. That didn’t seem possible to me, and never materialized.

                                                      In reality I think you have the possibility for the exact same bugs in C++ as you do in C – that is what I was getting at. Or at least I would like to see someone explain otherwise :)

                                                        To get any more static guarantees in C++ than in C, I think you need your own static analysis, outside what the compiler can provide, e.g. https://firefox-source-docs.mozilla.org/js/HazardAnalysis/index.html

                                                      1. 2

                                                          I think C++ handles that encapsulate the double dereference in an operator-> would be safe if used consistently. The dangerous part is hanging onto the dereferenced pointer, and they wouldn’t do that. Even the optimizer knows it can’t assume the value of that pointer stays the same across a function call.

                                                        1. 3

                                                            I haven’t tried it, but you might be able to use the lifetimebound attribute to mark raw pointers as having a lifetime coupled to an RAII wrapper that pins an object: you wouldn’t expose operator* directly, you’d have a pin() that pinned the object (in the simple case just disabling the GC for its lifetime) and implemented a lifetime-bound operator*. The compiler would complain if the raw pointer outlasted the Pinned.

                                                          1. 1

                                                            It’s not feasible with the collector I’m using. All live objects must be moved, because the entire old heap (often called a “semispace”) gets discarded at the end of collection.

                                                            I know of a recent hybrid collector that generally leaves objects in place but periodically compacts them, and still manages to use a bump allocator. It’s very cool but more complex than I wanted to go.

                                                            1. 1

                                                              I think your collector could just refuse to GC while pinned objects exist and maybe give you some debugging info about where they are. I’d expect correct use of the Pinned wrapper to all be short-lived on-stack objects, so users might not notice.

                                                              1. 1

                                                                Good point; I hadn’t thought about that because currently smol_world doesn’t GC until it can’t satisfy an allocation request, and it has no innate ability to grow the heap, so it’s GC-or-fail.

                                                      2. 1

                                                        Have you looked at integrating with LLVM’s stack map stuff? Google did it for an experiment in Oilpan. I’ll have to go find it when I’m not on my phone but it’s probably on https://bernsteinbear.com/pl-resources/

                                                        1. 2

                                                            Interesting … but my collector can’t use that knowledge to “pin” objects on the stack, because it’s a simple Cheney collector that throws away the entire old heap afterwards. And if it updated the addresses on the stack to point to the moved objects, I’d be afraid it would break optimized code.

                                                          1. 1

                                                              I think you have to use it with the double-dereference Handle/HandleScope setup, but it should give you faster root finding during GC.

                                                    1. 1

                                                      This website crashes on my phone browser :/ Android + Chrome.

                                                      1. 1

                                                          Sorry about that. I’ll try to get an Android phone and fix it. Hope that doesn’t dissuade you from reading the article on another device.

                                                        1. 1

                                                          Oh, no, I will absolutely be reading on another device! I am very interested. I just didn’t have another at the moment.

                                                        2. 1

                                                            Me too: “Aw, snap” after a minute or so of viewing and scrolling.

                                                        1. 3

                                                          Ever since coming across this paper/tutorial many years ago, I’ve been on the lookout for more materials from Abdulaziz Ghuloum, but sadly it seems he has had no online footprint after the incremental compiler stuff was released.

                                                          1. 6

                                                            I think he moved away and started a restaurant, last I looked.

                                                            1. 2

                                                              Jeremy Siek has a course[1] and a book[2] to go with it which seems to build on Ghuloum’s work. Updated to generate code for x86_64 as well.

                                                              [1] https://iucompilercourse.github.io/IU-Fall-2022/ [2] https://github.com/IUCompilerCourse/Essentials-of-Compilation

                                                            1. 1

                                                              This is the result of my team’s collaboration with Brown. The full text is available here (PDF). I’m happy to answer any questions or point you to someone who can answer them better!

                                                              1. 11

                                                                At work, we wrote a static type checker for Python with a gradual and sound type system, added primitive (unboxed) types, and flowed the types through to the JIT. The generated code ends up being faster than Cython code for most (really like 99%) use-cases because there are no more FFI boundaries.

                                                                This is called Static Python, which is part of Cinder. I gave a talk (PDF) at ECOOP 2022 that shows off Static Python a bit (among other things).

                                                                1. 2

                                                                  Really cool work on Cinder, thanks for the presentation link!

                                                                  1. 1

                                                                    Let me know what questions you have :)

                                                                1. 2

                                                                  I really like the website design.

                                                                  1. 1

                                                                    Is it possible to construct the grammar manually with a C API instead of strings?

                                                                    1. 2

                                                                      Not yet, but if you make an issue in GitHub I might look into implementing it.

                                                                      1. 1

                                                                        Cool! I may or may not have time/energy but no pressure. It’s an idle question

                                                                        1. 2

                                                                          Issue created. Will look into fixing it when I have time.

                                                                    1. 2

                                                                      So, if the authors are here: can this be used to make, say, a chat application with no server but agreed upon message order?

                                                                      1. 1

                                                                        I really enjoyed reading about Russ Cox’s virtual machine approach to regex which has a kind of lightweight threads in this manner.
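
                                                                          For anyone who hasn’t read those articles, the core of the idea fits in a few lines of Python (a toy recognizer, hand-compiled for the pattern a+b; the real thing adds thread priority and submatch tracking):

                                                                          CHAR, SPLIT, MATCH = range(3)

                                                                          PROG = [            # a+b
                                                                              (CHAR, "a"),    # 0: consume an 'a'
                                                                              (SPLIT, 0, 2),  # 1: fork a "thread": loop for more 'a's, or move on
                                                                              (CHAR, "b"),    # 2: consume a 'b'
                                                                              (MATCH,),       # 3: accept
                                                                          ]

                                                                          def run(prog, text):
                                                                              def add(threads, pc):
                                                                                  # Follow forks eagerly so the thread set only holds CHAR/MATCH states.
                                                                                  if pc in threads:
                                                                                      return
                                                                                  if prog[pc][0] == SPLIT:
                                                                                      add(threads, prog[pc][1])
                                                                                      add(threads, prog[pc][2])
                                                                                  else:
                                                                                      threads.add(pc)

                                                                              current = set()
                                                                              add(current, 0)
                                                                              for ch in text:   # every surviving thread advances in lockstep, one char at a time
                                                                                  nxt = set()
                                                                                  for pc in current:
                                                                                      op = prog[pc]
                                                                                      if op[0] == CHAR and op[1] == ch:
                                                                                          add(nxt, pc + 1)
                                                                                  current = nxt
                                                                              return any(prog[pc][0] == MATCH for pc in current)

                                                                          assert run(PROG, "aaab") and not run(PROG, "b")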