1. 46
  1.  

    1. 22

      The .NET release announcements often refer to “peanut butter”, “a thin smearing of overhead across everything”, typically when describing a scattering of little micro-optimizations that cumulatively can improve performance a lot.

      I can’t remember where I first encountered “peanut butter”, but I got the idea that it was based on the analogy that if you build a sandwich with even a very thin layer of peanut butter, the whole thing tastes of peanut butter. “Peanut butter” stands for slowness, and the idea is that if the low levels of your system are even the tiniest bit slow, it can add up to pervasive slowness because the low levels are exercised a lot.

      Anyway, flamegraph.pl --reverse is the option for “show me the peanut butter!”

      1. 8

        https://github.com/KDAB/hotspot

        https://profiler.firefox.com/

        For those of you who, like me, always forget how to use flamegraph.pl.

        1. 6

          Yep, and the “reverse” thing is just a checkbox that updates the view in real time. I’d never have thought this feature would be worth a whole article, as it’s readily available in Hotspot (and in Heaptrack, which does the same but for memory allocations).

          1. 1

            I’ve used flamegraphs but never heard of hotspot, so…

            1. 1

              I tried Hotspot. It segfaulted on the first use, so I didn’t investigate it further. But it looks enticing, I’ll give it a shot again.

              1. 3

                strange, I’ve used it for.. like 7ish years at this point and don’t really recall it segfaulting

                1. 1

                  I also recently tried Hotspot and when I installed it from AUR on Arch Linux, it wouldn’t read perf.data files — it just kept on consuming memory until exhausting my machine.

                  When I tried it on an Ubuntu 24.04 machine, hotspot worked fine with the same perf.data file.

                  So double-check whether you might have a broken version / broken installation :)

                  BTW: I ended up not using hotspot, as for the purpose of annotating source code files with counter values, it did not seem any better than perf report. Would be curious what others are using hotspot for primarily…?

                  1. 2

                    For flame graphs.

                    I also prefer perf report for annotating source.

                    I would like some tool to detect register / stack thrashing, though.

            2. 7

              Also worth pointing out that by default flamegraphs only show on-CPU time (i.e. your application/kernel running code at the time the sample is taken). That is not the whole story: if the application/thread is asleep waiting for something and isn’t running on any of the CPUs when the sample is taken, it won’t show up at all. To see that waiting time you need to use “off-CPU” flamegraphs.

              I once found a literal “sleep” deep in a third-party library that way. It was PAM: it kept loading and unloading the crypto library every time, triggering its initialization code many times, and that code had a ‘sleep’ inside it, as it was too early for pthread_cond to work. More modern Linux distros don’t have this problem anymore since they switched to libxcrypt.

              1. 6

                Flamegraphs can actually help visualise anything where you can produce a weighted frequency for a given stack! You can also do things like trace disk I/O or memory allocations, using the size in bytes as a weight, to get interesting visualisations as well.

                1. 2

                  I never thought about measuring things other than runtime! I’ll have to keep this in mind; stuff like profiling memory allocations seems like it could be really handy.

                  1. 1

                    Is there a good guide for how to build up the flamegraph data with custom metrics like that? I must admit I have relied on “tool spits out stuff for flamegraph.pl consumption” and haven’t thought about it further.

                    1. 0

                      you are everywhere.

                      1. 3

                        I assure you I am not!

                    2. 1

                      Sampling profilers (such as rbspy) do the opposite - visualizing wall time only.
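
On the custom-metrics question above: as far as I know, flamegraph.pl reads a simple "folded" text format, one line per unique stack, with frames joined by `;` (root first) followed by a space and a number. That number can be any weight, not just a sample count. Here is a minimal sketch, assuming you already have (stack, weight) pairs from your own tracing. The function name `fold_weighted_stacks` and the sample events are made up for illustration.

```python
from collections import Counter


def fold_weighted_stacks(events):
    """Collapse (stack, weight) pairs into folded-stack lines.

    `events` is an iterable of (stack, weight) tuples, where `stack` is a
    list of frame names ordered root-first and `weight` is any
    non-negative number (bytes allocated, bytes read, microseconds, ...).
    """
    totals = Counter()
    for stack, weight in events:
        # ';' separates frames in the folded format, so replace it in names.
        key = ";".join(frame.replace(";", ":") for frame in stack)
        totals[key] += int(weight)
    # One line per unique stack: "root;child;leaf <total weight>"
    return [f"{key} {total}" for key, total in sorted(totals.items())]


if __name__ == "__main__":
    # Hypothetical allocation events: (call stack, bytes allocated)
    events = [
        (["main", "load_config", "malloc"], 4096),
        (["main", "parse", "malloc"], 1024),
        (["main", "load_config", "malloc"], 512),
    ]
    for line in fold_weighted_stacks(events):
        print(line)
```

Saving that output to a file and feeding it to flamegraph.pl (optionally with `--countname=bytes` so the tooltips say bytes instead of samples) should give a flamegraph where width means bytes rather than CPU time.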