When I’m searching for something and there’s a link to Julia’s site, I’m always very happy.
This post about scanimage saved me a couple hours just last week.
The level of clarity she manages about any topic she chooses to write about is really something special. It’s extra-interesting that she’s exposing this piece as she’s in the process of writing it. I should save this link and look at it next to the eventual final zine to use as an aspirational example for my own writing.
I’m dealing with one now. My Xlib-based program suddenly stopped working, complaining about a bad window. I’ve tried everything – even undoing all changes since the last working version. No dice. I’m shutting the computer down and going to bed!
I feel like you’ve pointed at a category of bug similar to but distinct from “distributed system” and “external library you don’t understand” and some of the other ones Julia called out:
The bug is due to complected systems and their interactions
The worst problems I’ve ever worked with arise from interactions. Your input tickles a bug in an upstream system, that system’s internal state gets trashed, and then you get timeouts/crashes/wrong answers. Maybe those things expose fragility in the system you care about.
Then, an indeterminate time later – naturally after you’ve started working on fixing your system, but before you’ve connected the dots – a supervisor daemon/watchdog/coworker reboots/applies a data fix/hits the upstream system with a hammer. Now your carefully reduced testcase starts working again.
Yeah.
Totally wasn’t that I was too tired to realize that I didn’t really undo the changes because I apparently simply forgot to run git checkout.
;-)
Currently having one of these “Impossible bugs”, so I can relate…
Dealt with two yesterday…
One due to excessive cleverness. Colleague reused a buffer for multiple unrelated purposes and even shared it, which worked only for subtle, undocumented reasons.
One due to excessive stupidity. Trying to extract common code, only read the first part of a case label “case FOO_BAH_B…:” and failed to see that in one copy it was FOO_BAH_BLAH and in the other FOO_BAH_BLUU (case label names changed to protect the innocent).