Just a word of caution: I tried replacing the mitigation patch (which I had been using for 18 months) with the author’s minimal fix (the first of the 6 patches, which is supposed to fix the issue) and I immediately ran into a deadlock while building Poly/ML 5.9 (3 times, in fact), whereas I had never encountered this bug before, despite having built Poly/ML many times.
This was on a 32-core machine with lots of background compilation going on at the same time (i.e. with a high load).
It could be a coincidence or an unrelated bug, but it’s also possible that the author’s patch might not fix the issue completely.
So there’s a glibc bug that causes deadlocks, a fix (even if partial) has been available for years, and glibc still hasn’t shipped a release with the fix, so it’s up to distros and users to patch the broken glibc code.
Reminds me of how a glibc update broke GNU m4, and GNU didn’t ship a fixed m4 for years, so it was up to distros and users to patch the latest release of GNU m4 to make it compile against the latest version of GNU glibc.
And then a couple of years later GNU shipped another version of glibc which broke m4.
GNU doesn’t strike me as an organisation which places a lot of value on software quality and reliability.
GNU isn’t really one org from most points of view. Many of the projects do their work, especially their technical work, very independently.
This is very true. Back when I was actively working on GNUstep, GCC released a new major version with broken Objective-C support, which completely broke our ability to build any GNUstep components. At the same time, the front page of the FSF web site had a call for contributors to GNUstep, pointing to it as strategically important for the GNU project (the FSF never liked KDE; GNOME was starting to point out that, in spite of the GNU in its name, it had never really been a GNU project; and GNUstep was the only actively maintained GUI toolset in the GNU project and was seeing a lot of interest from new Cocoa developers). The issue on the GCC bug tracker was marked as not a release blocker because Objective-C was a second-tier language for GCC.
This was almost 20 years ago now. The FSF has always presented the GNU project as a monolith. In practice, it’s a loose federation of projects that, on a good day, don’t actively sabotage each other, with a group of people who periodically try to push them in directions that are in direct opposition to what their users want.
Because they are heavily understaffed from what I’ve heard.
That’s very unfair to say. GNU tools have powered countless projects over multiple decades. They aren’t perfect, but to equate a few bugs with not caring about quality at all is so reductive.
Well, the putative fix, while along the right lines, isn’t correct either, so blindly applying it wouldn’t have helped anyone. What they need is someone capable of putting some concerted effort into it, but you can’t just magic such a person up.
I think you could probably get a significant speedup by replacing all of the procedure calls with macros (assuming the level of atomicity isn’t critical here). Might try that tonight.