1. 28
  1.  

  2. 5

    Of course this algorithm can find a spicy integer whose value just happens to correspond to an object’s address, even if that object wouldn’t have been counted as live otherwise. This approach doesn’t compute the minimal live set, but rather a conservative over-approximation. Oh well. In practice this doesn’t seem to be a big deal?

    This section talks about the typical false positive case (“spicy integer”), but I believe there can be false negatives too.

    In other words, I don’t think stack scanning is guaranteed to give you a conservative over-approximation. It can also give you an under-approximation, which leads to use-after-free.

    That is what I gathered from reading this long thread:

    https://github.com/JuliaLang/julia/issues/11714 (expand messages in the middle)

    which I linked in my recent GC post:

    http://www.oilshell.org/blog/2023/01/garbage-collector.html

    For example, one case that was brought up there (in addition to lots of other objections) was of interior pointers to objects. So if you have some object Foo* foo with 10 fields, and the compiler knows that you only use foo->fifth_field in the rest of the code, AFAIK there’s nothing in the C standard that says that it has to ever put a pointer to foo on the stack.

    If it can prove that you don’t use the earlier fields, which seems quite easy, it could easily store an interior pointer instead. And then you will miss a root, or miss scanning some fields, and have a use-after-free. Or you have to figure out a way to find the beginning of the object.

    Likewise for int a[10] and a[5].

    I don’t know if this actually happens, but I wouldn’t want to rely on imprecise scanning, especially if I don’t control the compiler. At least Julia embeds the compiler in the runtime, so they control it to an extent – i.e. in theory they could test if the bug is present in any given build. While other languages with conservative GC have no idea what compiler is laying out the stack they’re scanning.

    Definitely interested in comments from compiler developers!

    1. 7

      Hi, briefly: conservative stack scanning has to allow interior pointers, for the reasons you mention. This is actually a strength, as it lets the compiler emit interior pointers, something that is generally not the case with stack maps. We can generally avoid interior pointers for intraheap edges though.

      With regards to the stack vs registers: you scan the stack within an inner function, where all caller-save registers have already been spilled. Then in the inner function you spill callee-saves either explicitly or via setjmp. Then all roots are on the stack.

      For what it is worth, the JS engine in Safari uses conservative stack scanning. It’s good enough!

      1. 1

        Hm so how do they allow it? It seems like that is another “approximate” / YOLO algorithm in addition to the stack scanning.

        OK I googled for “boehm GC interior pointers” and I see snippets like this (which are probably very old, but also something people still use):

        If ALL_INTERIOR_PTRS is not defined, then stack roots require special treatment. In this case, the normal marking code ignores interior pointers, but GC_push_all_stack explicitly checks for interior pointers and pushes descriptors for target objects.

        A page may be black listed for interior pointer references (GC_add_to_black_list_stack), if it was the target of a near miss from a location that requires interior pointer recognition, e.g. the stack, or the heap if GC_all_interior_pointers is set. In this case, we also avoid allocating large blocks that include this page.

        https://www.hboehm.info/gc/gcdescr.html

        So I guess this is what you were talking about in the post – Boehm GC has more ifdefs than Whippet has code :)

        This sorta confirms why I was scared to use it:

        • the size of the code
        • the number of #ifdefs, and number of tuning parameters
        • the fact that I saw the Nix evaluator was carrying around patches for Boehm GC on Darwin.
        • The issue of “hints” . Several people suggested using Boehm GC for https://www.oilshell.org/ because it’s “drop in”, but that seems to be a fallacy. In reality, you’re supposed to provide a lot of “hints” to it, taking into account the characteristics of your language’s heap. I’d be interested in Guile’s experience on that (although I guess at this point it doesn’t matter that much, since you’re moving away from it!)

        That said, I did see that Mono and other runtimes use conservative roots and precise tracing, so that does seem to be a fairly common design. It does seem hard to avoid when you have a lot of C or C++.

        So then does whippet has its own stack scanner that takes into account interior pointers? I didn’t see that mentioned in the post.

        1. 1

          For what it is worth, the JS engine in Safari uses conservative stack scanning. It’s good enough!

          Also on this, I’d say “good enough” is relative to a problem. Apple “open source” is a fairly different situation than most open source. Safari runs on a limited number of platforms, and Apple is very vertically integrated – the CPU, compiler, and browser can “collaborate” in their world. They break abstraction layers to get performance and treat it more as an embedded system.

          In open source, you don’t control your compiler, and people want to run on different CPUs, and I think this greatly affects conservative GC (although I don’t use it!)

          I didn’t look that deeply into Mono and others, but I suspect they also have some platform limitations (which is of course fine if that’s the problem they want to solve, similar to Apple)

          Also haven’t looked into it deeply, but I imagine that imprecise roots + precise tracing has a lot fewer headaches than imprecise roots + imprecise tracing.

          1. 2

            Safari isn’t the only end product that JavaScriptCore runs in :) WebKit is many many places, from kitchen appliances to set-top boxes to airplane seat-backs, etc etc. Regarding BDW, it’s as portable as can be, I think; consider https://packages.debian.org/search?keywords=libgc1c2, consider that it runs on even old HP-UX, there are wasm ports…

            And did I mention that there is an ongoing project in Chrome for V8 to switch to conservative stack tracing as well?

            I am not saying that conservative tracing of the stack is the best design point, only that it does work. In your case you mention AFAIU that you GC only when the stack is empty; a good tradeoff, similar to the one made in Oilpan. But having the ability to collect with a non-empty stack (instead of having to expand the heap or abort) is a feature that can be added in addition to empty-stack-precise tracing, if you use conservative tracing.

            1. 1

              Yeah I’d agree it has been “observed to work” over a long period … I would guess the bug rate simply falls below the bug rate of the rest of the software that uses it.

              I’d also make the distinction of “has been ported” vs. “portable”, although I’d have to know how many #ifdefs there are for different platforms. Sounds like a lot, but I didn’t count. (I think Oil will only #ifdefs for endian-ness, and that’s it)

              I would agree with the statement about “background anxiety every time a memory leak comes up, for 20 years” … though for me I would also add “background anxiety about use-after-free” :-)

              In the end I think the “real” solution would have to be changing the C/C++ languages to support finding roots, but this seems unlikely now, since there have apparently been many efforts over the years.

              Anyway, it’s great to see posts with “implementer lore” about garbage collection! I haven’t looked into Immix, but I remember that the Inko language had an implementation, though they seem to be trying to move away from GC altogether.

        2. 2

          Interior pointers can be dealt with. With an immix-style heap layout, it is not even that hard. Beyond that, I don’t know guile, but if the pointers exposed by its c interface are opaque, then there would be no way to form an interior pointer in the first place.