1. 20
    1. 7

      [T]he people who did the browser mistook it as an application. They flunked Operating Systems 101.

      – Alan Kay

    2. 4

      Having programs memory-map files works well if all of the data will be needed at some point in the exact format in which it’s currently stored. Unfortunately, many usermode caches hold content that is a transformation of some other content and could be regenerated (e.g., the raw HTML behind most pages is much smaller than the browser’s memory footprint). Forcing that kind of data to disk and back is not optimal. What it wants is a mapping where the kernel can discard data, the application can detect the discard on a later use, and then regenerate it.

      This actually happened once upon a time, but modern kernels have gone in a different direction. You can see the traces in things like GMEM_DISCARDABLE.

      1. 6

        Windows and XNU both have this mechanism - memory that can be discarded under memory pressure - but it’s incredibly hard to use correctly. *NIX systems also typically have MADV_FREE, which is a bit harder to use for this (you must perform a write to detect whether it’s been evicted, and that write marks it as not-safe-to-evict until the next MADV_FREE). In all cases, there’s a complex ordering that you have to get right to ensure that the memory is marked as non-discardable while in use.
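
        The write-to-detect dance described above might look roughly like this (a sketch assuming Linux’s MADV_FREE semantics, where reclaimed anonymous pages read back as zeroes; the cache_page_* helpers and the sentinel byte are my own invention, and the read-then-write pair has a small race window a real implementation would need to think about):

        ```c
        #include <stddef.h>
        #include <sys/mman.h>

        #define SENTINEL 0x5A  /* nonzero marker kept in the first byte of each cached page */

        /* Done with a cached page: let the kernel reclaim it lazily under pressure. */
        static void cache_page_put(void *page, size_t len)
        {
            madvise(page, len, MADV_FREE);
        }

        /* Reuse a cached page. The write cancels any pending MADV_FREE, and the
         * sentinel tells us whether the kernel already discarded the contents
         * (a reclaimed page reads back as zeroes). Returns 1 if intact,
         * 0 if the caller must regenerate the data. */
        static int cache_page_get(unsigned char *page)
        {
            int intact = (page[0] == SENTINEL);
            page[0] = SENTINEL;  /* page is live again: not safe to evict until the next put */
            return intact;
        }
        ```

        A cache built this way never forces data to disk: under pressure the kernel silently drops the pages, and the next cache_page_get notices the zeroed sentinel and triggers regeneration.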

        XNU and Windows (and Android) also provide a form of low-memory notification, which allows you to implement custom policies for discarding memory when you don’t need it. This lets you consume lots of memory while memory is plentiful and then free a load of cache entries that you’re not using when it becomes constrained. .NET uses the Windows mechanism to become a bit more aggressive about GC, and a bunch of Apple’s libraries build on top of their version of this mechanism to discard caches when memory pressure reaches certain levels.

        1. 3

          I added some support for “discardable” address space to Chromium while I was working at Google (circa 2009-10). I don’t recall the details now, but I’m sure my code was Mac-specific … it’s likely other engineers wrote the equivalent code for Windows & Linux.

          The Darwin/Mac API isn’t hard to use IMO. You mmap some address space with a special flag, and there are calls to lock and unlock regions so the pages don’t get discarded while you’re in the middle of accessing them. (The Foundation framework has some higher-level convenience APIs built on this, such as NSCache.)
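
          If I remember right, the lock/unlock mechanism described here is Mach’s purgeable-memory interface. A rough sketch from memory (untested; the exact flags the Chromium code used may well have differed, and note Apple’s “purgable” spelling):

          ```c
          #include <mach/mach.h>

          /* Allocate a purgeable region. */
          static void *purgeable_alloc(size_t size)
          {
              vm_address_t addr = 0;
              kern_return_t kr = vm_allocate(mach_task_self(), &addr, size,
                                             VM_FLAGS_ANYWHERE | VM_FLAGS_PURGABLE);
              return kr == KERN_SUCCESS ? (void *)addr : NULL;
          }

          /* “Unlock”: the kernel may now purge the region under memory pressure. */
          static void purgeable_unlock(void *addr)
          {
              int state = VM_PURGABLE_VOLATILE;
              vm_purgable_control(mach_task_self(), (vm_address_t)addr,
                                  VM_PURGABLE_SET_STATE, &state);
          }

          /* “Lock” before use. The old state comes back through the in/out
           * parameter; VM_PURGABLE_EMPTY means the contents were purged and
           * must be regenerated. Returns 1 if the contents survived. */
          static int purgeable_lock(void *addr)
          {
              int state = VM_PURGABLE_NONVOLATILE;
              vm_purgable_control(mach_task_self(), (vm_address_t)addr,
                                  VM_PURGABLE_SET_STATE, &state);
              return state != VM_PURGABLE_EMPTY;
          }
          ```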

        2. 2

          Post author here. That’s great to know; thanks. I wasn’t aware that such APIs existed and that they were even put to use in Chrome, but based on what you are saying here, it doesn’t seem as if they are widely used due to nonportability and difficulty. Maybe Chrome does this already in some cases, but it definitely isn’t the case for many other apps :)

          Do you have pointers to the APIs you are referring to? I’m finding some stuff online but I’m not sure I got to the right information and would love to read more on it and amend the post.

          1. 2

            For low-memory notification:

            The macOS APIs are exposed by the kernel as an undocumented kqueue event type, which is then wrapped in something public in libdispatch that lets you register blocks to be executed when an event of the right type arrives.
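
            Concretely, the libdispatch wrapper is the memory-pressure dispatch source (a sketch; Darwin-only, and it needs blocks support, e.g. clang -fblocks):

            ```c
            #include <dispatch/dispatch.h>

            /* Register a handler that runs when the kernel signals memory pressure. */
            static void watch_memory_pressure(void)
            {
                dispatch_source_t src = dispatch_source_create(
                    DISPATCH_SOURCE_TYPE_MEMORYPRESSURE, 0,
                    DISPATCH_MEMORYPRESSURE_WARN | DISPATCH_MEMORYPRESSURE_CRITICAL,
                    dispatch_get_main_queue());
                dispatch_source_set_event_handler(src, ^{
                    unsigned long level = dispatch_source_get_data(src);
                    if (level & DISPATCH_MEMORYPRESSURE_CRITICAL) {
                        /* drop everything regenerable */
                    } else {
                        /* trim the least valuable caches */
                    }
                });
                dispatch_resume(src);
            }
            ```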

            The Windows ones are objects created with CreateMemoryResourceNotification that can then be waited on just like any other handle. In my experience, they trigger at somewhat surprising times: for example, I was able to get my desktop into a state where allocations would fail because the kernel reported the commit charge as exhausted, yet the low-memory notification didn’t fire. From what I can tell, the notification is driven by the amount of physical memory in use, whereas allocations can fail once all physical memory + swap is reserved for use but not actually used.
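
            A minimal sketch of the Windows side (documented Win32 API; error handling elided, and a real program would wait on the handle alongside its other events rather than blocking a thread):

            ```c
            #include <windows.h>

            /* Block until the system reports low physical memory, then let the
             * caller trim its caches. */
            static void wait_for_low_memory(void)
            {
                HANDLE h = CreateMemoryResourceNotification(LowMemoryResourceNotification);
                WaitForSingleObject(h, INFINITE);  /* signalled when memory runs low */
                /* ...trim caches here... */
                CloseHandle(h);
            }
            ```

            There is also QueryMemoryResourceNotification for polling the same object instead of waiting on it.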

            The Android ones, as far as I can tell, are only exposed to Java.

            These ones are relatively fresh in my mind because I looked them up for snmalloc a couple of years ago - we get better performance if we never give memory back to the OS until it asks for it. I think XNU actually has the best API for doing this (though it’s undocumented): a variant of MADV_FREE that requires an explicit operation to undo. It’s very cheap because you don’t need to do any of the dirty-bit tracking that MADV_FREE does, so the kernel part can be entirely core-local (possibly acquiring some locks that are almost certainly uncontended).

            The slow path happens only when the OS needs to reclaim the memory, and even that avoids the complex cases of MADV_FREE because you’re just locking the vm_map objects and removing the vm_pages from them, so that the undo operation will get CoW faults of zero pages. The other nice thing about this API (from the perspective of a memory allocator) is that pages returned via it don’t count towards your RSS total, because the kernel can always discard them even if you’ve written to them.
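
            If I’ve understood the XNU variant correctly, the allocator-facing pattern is just a pair of madvise calls (a sketch; MADV_FREE_REUSABLE / MADV_FREE_REUSE are Darwin-only and, as noted, undocumented, so treat the semantics here as my reading of them):

            ```c
            #include <sys/mman.h>

            /* Allocator frees a chunk: hand the pages back lazily. They stop
             * counting toward RSS, and the kernel may reclaim them at any time. */
            static void pages_release(void *p, size_t len)
            {
                madvise(p, len, MADV_FREE_REUSABLE);
            }

            /* Allocator reuses the chunk: explicitly undo the earlier call. Any
             * pages the kernel reclaimed come back as zero-filled on fault. */
            static void pages_reuse(void *p, size_t len)
            {
                madvise(p, len, MADV_FREE_REUSE);
            }
            ```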

            For discardable data, it’s been a much longer time since I looked (probably around 2005ish, when I started looking at adding NSCache to GNUstep). I think macOS has a madvise flag for purgeable memory, and I can’t remember how Windows exposed it. The macOS one probably uses the same underlying mechanism as MADV_FREE_REUSABLE / MADV_FREE_REUSE (I actually wouldn’t be surprised if the flags were the same): once the memory has been marked as discarded, the kernel is free to reclaim it; once it’s marked as reused, the kernel can’t. If the kernel did discard the contents in the middle, then the pages are guaranteed to be full of zeroes. This makes them somewhat clunky to use, because you have to ensure that you store some non-zero data on each page, and if you want to store objects smaller than a page then you need to handle your own refcounting so that the page is locked when any of them are in use. They’re fine for large things like cached files.

          2. 1

            I think NSCache is the easiest way to do it on Apple platforms.

        3. 2

          For what it’s worth, I don’t think low memory notification is sufficient. If you have two programs trying to cache state, unless they both trim in response to the notification in the same way and grow at the same rate afterwards, one of them will crowd out the other. Mechanisms where the kernel discards data based on when it was last used don’t have this problem, since the kernel will end up allocating resources to where the load happens to be.

          That said, the problem with the kernel discarding is it has no idea what the cost of a cache miss is, so it treats all caches as offering the same performance benefit.

      2. 2

        Well, you can make this work, kind of, at a suitably low level. You can allocate a virtual memory area for your cache and, if you cache your application data at page granularity, tell the OS when you’re done with a range of pages using madvise(MADV_FREE). That lets the OS lazily deallocate the pages, and allows you to reuse them if the OS hasn’t gotten around to releasing them yet, without the kernel having to allocate and zero new pages.

    3. 1

      You can imagine having a feedback mechanism between the kernel and applications to reclaim memory. The kernel could say “hey Chrome, I’m running out of memory, so please free some stuff you don’t really need”. Unfortunately, this would not work, because it would require all applications to comply and do the right thing in all cases. A single rogue or buggy application could hoard all the memory, making the situation worse.

      Wouldn’t that just be the same as today, just with another app? The complaint is that Chrome (or another application) hogs memory and is a “bad citizen” on the computer. At least having that method to reclaim memory would mean any good citizens could actually “be good”, and we could pester the writers of “bad citizen” software to clean up their act, since there would now be a method for them to do so. Without it, there is no “solution” for Chrome to be better; it just has to trundle along the same way, or create a worse experience for its users.

      1. 5

        This mechanism exists on Windows, macOS, iOS, and Android. Safari uses it on two of those platforms. It composes well with a policy where the OOM killer favours applications with large RSS: if you are using a lot of memory, receive the notification, and don’t free a lot of memory, then you’re likely to be killed. If you do free memory, then the badly behaved process becomes a higher-priority kill target than you.

        There is some subtlety in getting the notification timing right (which is why XNU provides multiple thresholds). You do want all of the memory to be used, but you don’t want handling the low-memory notification itself to trigger swapping. Often, deallocating a lot of objects requires allocating a small number, so you need to make sure that there’s a little bit of headroom available. This is easier if you’ve got a reasonable number of clean pages in the buffer cache that can be evicted when the application asks for more memory.