1. 11

  2. 4

    The SIGSEGV seems better solved by using core dumps and a debugger, right?

    1. 4

      If you have a SIGSEGV and a core dump, yes, just SSH to the machine and boot up GDB.

      If you have two or three SIGSEGVs a day across a fleet of a thousand machines, you want some basic information about those (and all other) errors fed into your monitoring and log analytics, so you can prioritize which of the various rare failures you’re seeing is most urgent. A nice little signal-handler summary as the article describes is much better suited to your existing log analytics than a multi-hundred-megabyte core dump.
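
      For illustration, here is a minimal sketch of what such a handler might look like. It assumes a POSIX system, and the names `format_uint`, `append_str`, `crash_reporter` and `install_crash_reporter` are hypothetical (they are not from the article). It writes a one-line summary to stderr using only async-signal-safe calls, then restores the default action and re-raises so a core dump can still be produced.

      ```c
      #define _POSIX_C_SOURCE 200809L
      #include <signal.h>
      #include <stdint.h>
      #include <string.h>
      #include <unistd.h>

      /* Hypothetical helper: format v in the given base (2..16) without printf,
         which is not async-signal-safe. Returns the number of bytes written. */
      static size_t format_uint(uintptr_t v, unsigned base, char *buf, size_t cap)
      {
          static const char digits[] = "0123456789abcdef";
          char tmp[32];
          size_t n = 0, out = 0;

          do {
              tmp[n++] = digits[v % base];
              v /= base;
          } while (v != 0 && n < sizeof tmp);
          while (n > 0 && out < cap)
              buf[out++] = tmp[--n];      /* digits were produced in reverse */
          return out;
      }

      /* Hypothetical helper: copy a string literal, truncating at cap. */
      static size_t append_str(const char *s, char *buf, size_t cap)
      {
          size_t out = 0;
          while (s[out] != '\0' && out < cap) {
              buf[out] = s[out];
              out++;
          }
          return out;
      }

      static void crash_reporter(int sig, siginfo_t *info, void *ctx)
      {
          char line[96];
          size_t len = 0, cap = sizeof line - 1;   /* keep one byte for '\n' */
          (void)ctx;

          len += append_str("fatal signal=", line + len, cap - len);
          len += format_uint((uintptr_t)sig, 10, line + len, cap - len);
          len += append_str(" fault_addr=0x", line + len, cap - len);
          len += format_uint((uintptr_t)info->si_addr, 16, line + len, cap - len);
          line[len++] = '\n';
          if (write(STDERR_FILENO, line, len) < 0) {
              /* nothing safe to do about a failed write here */
          }

          /* Restore the default action and re-raise, so the process still
             terminates and can leave a core dump for later debugging. */
          signal(sig, SIG_DFL);
          raise(sig);
      }

      /* Returns 0 on success, -1 on failure (as sigaction does). */
      static int install_crash_reporter(void)
      {
          struct sigaction sa;

          memset(&sa, 0, sizeof sa);
          sa.sa_sigaction = crash_reporter;
          sa.sa_flags = SA_SIGINFO;       /* ask for siginfo_t with si_addr */
          sigemptyset(&sa.sa_mask);
          return sigaction(SIGSEGV, &sa, NULL);
      }
      ```

      You would call `install_crash_reporter()` once near the start of `main`. A log shipper can then pick up the one-line summary from stderr, while the core dump remains available for the cases worth a closer look.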

      1. 4

        Maybe. At Joyent they appear to take all core dumps, upload them to their object storage, and analyze them with thoth, so the more core dumps the better. Maybe @jclulow can speak to it.

        Maybe you’re right though, I don’t know, I haven’t experienced that.

      2. 2

        You can also do fairly evil things with SIGSEGV handling…as an exercise, imagine you had a shared address space with other processes on a network, and wanted to implement network shared memory. How might you accomplish that if you can intercept SIGSEGV?
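
        One possible answer to the exercise, as a rough sketch only: map the shared region with no access, catch the resulting SIGSEGV, fill in the faulting page, and return so the instruction is retried. This assumes Linux. All names here (`dsm_init`, `fetch_page`, `on_segv`) are hypothetical, and `fetch_page` merely fills the page with a marker byte where a real system would pull it over the network. Note that `mprotect` is not on the POSIX async-signal-safe list, so calling it from a handler relies on platform behaviour.

        ```c
        #define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS on glibc */
        #include <signal.h>
        #include <stdint.h>
        #include <string.h>
        #include <sys/mman.h>
        #include <unistd.h>

        static unsigned char *region;               /* start of the "shared" range */
        static size_t region_len;
        static size_t page_size;
        static volatile sig_atomic_t faults_served;

        /* Hypothetical stand-in for "fetch this page from its remote owner". */
        static void fetch_page(unsigned char *page, size_t index)
        {
            memset(page, (int)(0xA0 + index), page_size);
        }

        static void on_segv(int sig, siginfo_t *info, void *ctx)
        {
            uintptr_t addr = (uintptr_t)info->si_addr;
            uintptr_t base = (uintptr_t)region;
            unsigned char *page;
            (void)ctx;

            if (region == NULL || addr < base || addr >= base + region_len) {
                /* A genuine crash: restore the default action and return, so the
                   faulting instruction re-executes and kills the process. */
                signal(sig, SIG_DFL);
                return;
            }

            /* Round down to the page boundary, make the page accessible,
               then populate it before the instruction is retried. */
            page = (unsigned char *)(addr & ~(uintptr_t)(page_size - 1));
            if (mprotect(page, page_size, PROT_READ | PROT_WRITE) != 0)
                _exit(1);
            fetch_page(page, (size_t)(page - region) / page_size);
            faults_served++;
        }

        /* Reserve `pages` inaccessible pages and install the handler.
           Returns 0 on success, -1 on failure. */
        static int dsm_init(size_t pages)
        {
            struct sigaction sa;

            page_size = (size_t)sysconf(_SC_PAGESIZE);
            region_len = pages * page_size;
            region = mmap(NULL, region_len, PROT_NONE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (region == MAP_FAILED) {
                region = NULL;
                return -1;
            }

            memset(&sa, 0, sizeof sa);
            sa.sa_sigaction = on_segv;
            sa.sa_flags = SA_SIGINFO;
            sigemptyset(&sa.sa_mask);
            return sigaction(SIGSEGV, &sa, NULL);
        }
        ```

        After `dsm_init(2)`, the first read of `region[0]` faults once, gets its page filled in, and then proceeds as if the memory had always been there. A real implementation would also need write tracking, invalidation, and a coherence protocol, none of which is shown here.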

        1. 1

          Yeah, I know you can do a lot of evil things with SIGSEGV. I think it’s very dangerous to try to do anything given the nature of the error. That’s why I think one should probably let a SIGSEGV do its normal thing and use a debugger, unless they really know what they are doing.

      3. 1

        Wasn’t sigaction() a SysV R4-derived call? I learned UNIX systems programming on System V R3, so I’m pretty sure it wasn’t around. (There were also issues with signal trapping and race conditions, but I’ve succumbed to bitrot and forget the details.)