There’s a lot to agree with in this article, but why place the boundary between precompiled code and libraries downloaded in source form?
Do npm packages count as “editable code underneath that I have to read or write for these symbols to work”? Sure, if you’re downloading code and compiling it, you may have to edit it. It’s a risk. If you’re just hotlinking to a dll or something, that’s different. If you’re using precompiled code, whatever you’ve got, you’ve got.
Precompiled libraries have bugs too, may need workarounds, and can even be patched if you’re serious about it. I’d be more inclined to place the boundary between “code that is so well-tested that you probably don’t need to debug that code specifically” (which may include e.g. SQLite in source-code form, or very well-tested internal libraries) and “all other code” (which includes most commercial code.)
… but you have seen and considered a lot more software development than I have; if you have time, I’d be interested in seeing a bit more about why the “precompiled” vs. “source” distinction is the right place to draw the boundary?
Yeah, this distinction makes no sense to me either. Maybe he’s assuming that anything worthy of being in a DLL must be super solid and well tested? But that would be pretty naive.
I thought it was excluded because you can’t go and read through the source to know what it’s doing. So the risks to your product are still there, but there’s no additional cognitive load because you’ve just gotta take it on faith.
If you took this seriously, open source/free software might be one of the worst things to ever happen to software development. We’d be obliged to shove everything into opaque blobs to have any hope.
I assume the author meant something different, but it’s not quite clear.
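As long as that .DLL isn’t obfuscated. .NET DLLs written in C# that aren’t obfuscated are trivial to decompile between ILSpy and dotPeek, for example.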
The Code Cognitive Load is a great idea. I think. There have been a lot of conversations about what is good code and what is bad code, and this might be a great answer, especially if we can automate it.
I’d love a linter that tells me that I have written something impenetrable and makes me think of Tony Hoare’s principle: “There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.”
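To make that concrete, here is a rough sketch of what such a check could look like for Python. This is purely my own illustration, not the article’s CCL definition: the threshold and the “count distinct names per function” heuristic are arbitrary assumptions.

```python
# Hypothetical sketch: approximate "symbols you must understand" per function
# by counting the distinct identifiers and attribute names used in its body,
# and warn when a function exceeds an (arbitrary) threshold.
import ast
import sys

THRESHOLD = 25  # arbitrary cutoff, purely illustrative

def distinct_names(func: ast.AST) -> set[str]:
    """Collect every distinct identifier and attribute name used in a function."""
    names = set()
    for node in ast.walk(func):
        if isinstance(node, ast.Name):
            names.add(node.id)
        elif isinstance(node, ast.Attribute):
            names.add(node.attr)
    return names

def check(path: str) -> None:
    with open(path, encoding="utf-8") as f:
        tree = ast.parse(f.read(), filename=path)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            count = len(distinct_names(node))
            if count > THRESHOLD:
                print(f"{path}:{node.lineno}: {node.name} references "
                      f"{count} distinct symbols (limit {THRESHOLD})")

if __name__ == "__main__":
    for p in sys.argv[1:]:
        check(p)
```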
That’s a lot of words to say “I’m smarter than other developers” and “your abstractions and dependencies can cause problems”. It would have been good to see some justification for the arbitrary exclusion of pre-compiled dependencies from the risk model, too.
I didn’t see it as claiming to be smarter than others. If anything, the author was claiming to be “smart enough” to realize their own limitations, and imploring others to recognize limitations as well.
I think minimizing the “amount of risk you take on as a programmer looking at any piece of code to manipulate it, whether coding fresh or doing maintenance on existing code” is a great general heuristic.
Here are some immodest extracts from the article:
“All I got was blank stares, and this was from professionals.”
“When I was writing my second book, I had to address this problem because reasoning about code and controlling complexity is the number one problem in software development today. It’s the reason we have buildings full of hundreds of developers wasting a lot of time and money. Our incentives are wrong.”
“Nobody talks about ways to understand you’ve gone too far (except for a few folks like me. Apologies for the shameless plug.)”
False modesty is a bad thing, too. But I’ve met very few programmers (myself included) who are overly modest.
And, sure, it is good to understand the pieces you are using and the risks they might pose. But that applies to almost any work and is pretty obvious.
There’s a divide here between “obviousness” and “actionable understanding”, that is, between:
People who nod their head in agreement to the aphorism as you stated it (almost everyone).
People whose actions align with truly understanding that aphorism and taking it seriously (very few ime).
Much like the statement:
“It’s good to write simple, clear code that other developers can easily understand.”
In my opinion, the toxicity of the “ninja rockstar coder” is so deeply rooted in the software domain that some (my former manager and director, for example) even consider the ability to write (overly) complex code, and the (supposed) ability to understand overly complex code, to be the mark of an expert programmer.
For me, this is more a hallmark of mediocre software engineers (not in the pejorative sense; in the sense of average, but no more). Not bad ones, they are perfectly capable of getting things done, that is not the question, but they are doomed to see most of their complex designs fail in the blinding light of self-satisfaction.
And guess what, they don’t care whether they build overly complex stuff, because they leave the maintenance to others. (Speaking from personal experience.)
Having been a contractor for some years who’s had to pick up and maintain rock-star systems, I find this depressingly accurate. But unfortunately, management rarely cares whether the code is maintainable when the rock star is in full flim-flam mode, promising them the world.
This is going to be less a critique of the article itself, or of the metric it proposes, and more a critique of, let’s say, the “approach to truth” of software developers in general. And as a software developer myself, I’m fully included and guilty of much of the same. If I come off as a dick, I apologize, but text allows for a limited range of emotions, and I’m bad at them even in person.
First, the study quoted has an N of 78, yet calls itself a large-scale study. Sure, compared with some of the studies cited in the article itself, with 10 and 18 participants, 78 is big. I couldn’t think of a specific study to compare it with, so I just typed “large scale psychology” into Google Scholar and clicked around. On the first page I found this study, with 23533 subjects, and this one, with 1895.
Now, to be fair, I haven’t actually read these studies, just skimmed the abstracts, but the point is: quantitative studies in software engineering are not even in their infancy; they are so young they can be legally aborted in 67 countries. We all like the validation of “science says so”, but bleeding-edge research like this is not very trustworthy, and anything it says should be taken with several grains of salt.
I’ll give the author that he did use <rant></rant>, so it’s not like he’s claiming scientific truth, and this sorta compensates for the fact that there’s no link to the study cited in the article (which, when quoting research, is kind of a must, IMHO). But I feel that the rest of the article fails to maintain that humility, which is what I’ll try to address next.
Second, the CCL metric is backed by pretty much nothing. There are absolutely no references to it anywhere other than from the author himself. It has not been tested in any empirical studies, which makes it kinda worse than the studies that the article illustrating the post low-key makes fun of, and it is not even referenced by other famous people in the field, so it lacks even the fallacious support of an appeal to authority (broadly speaking: I’m not familiar with the author, but he looks pretty accomplished, so what he says might have some weight, but it would be more impressive if more people agreed. Martin Fowler doesn’t have scientific evidence for most of what he says, but if you google the terms he uses, he’s not the only one using them).
Experience counts for a lot in software development. I for sure cannot justify with data and scientific studies 99.99% of the decisions I have made in my career, or of the advice I might have given to less experienced developers. But the mashing together of the (admittedly illustrative) study with the authoritative tone of the article seems very representative of the refusal of developers in general to admit to ourselves that, actually, we’re not really sure how to write good software.
Third, and this will be brief because I think it’s better addressed by other comments in this thread: when it comes to actually defining his new metric, the author gives no reasoning for the definitions.
The very basic definition is already kinda subjective: risk of what? But that looks like it could turn into a rabbit hole, so I’ll just give it a pass.
Then there’s the condition:
It’s scoped first by method/function and then by compilation unit. There’s no other scoping (namespace, class, module, etc.)
Why? Why is method/function more important than other scopes? Why are arguably wildly different types of scope grouped at the same level of (un)importance?
The definition itself doesn’t fare better: what is a symbol? How do you define that across languages? What about languages that don’t have exceptions at all? Do we consider only panics, or error codes as well? And then there’s the restriction to “editable code”. There’s some reasoning down the line for it, but it’s just circular: “If you’re just hotlinking to a dll or something, that’s different.” Different how? Why? Is a .jar equivalent to a DLL? What about a library compiled from Cython code? It’s binary, but the source code is pretty much Python. Does it count as editable?
In the end, my points are the following:
What we do ain’t science. There is science trying to understand it, and I’m really glad that it’s growing, but the results are not yet solid enough for us to take them as gospel.
We need to acknowledge that much of what we hold as true is unsubstantiated, the fruit of experience and biased observation. It’s the best we have right now, but we shouldn’t hold on to it too tightly, lest we find ourselves believing things that no longer correspond to reality.
PS: I wanna take exception to this specific paragraph, because I feel it misrepresents not only my work, but that of many excellent co-workers I have had the pleasure of working with:
For some odd reason, discussions on frameworks and complexity always devolve into some version of “Dang kids!” vs. “Talentless luddite!” As many people have pointed out, not only do we not teach or talk about incremental complexity, if you don’t have the appropriate buzzwords in your CV, you don’t get hired.
The bit about buzzwords on CVs certainly rings true, but in my experience complexity is very much discussed in terms of increments, of the price you pay for adding complexity now. We talk about keeping things simple, about “You ain’t gonna need it”, about the dangers and benefits of using third-party dependencies and services. It might not become blog posts and tweets, but those concerns have been present in pretty much every discussion on the projects I have actually worked on. Said concerns haven’t always been heard, mind you, but they have been raised, at least.
An N of 78 is fairly weak, and an N of 10 is a joke. You can flip 10 fair coins and still have a 0.097% chance of them all landing heads (and the same again for all tails); to put that into perspective, major cloud providers promise failure rates lower than the chance of 10 fair coin flips all landing the same way. (Yes, I realize that this is binomially distributed data, not something drawn from a distribution like Student’s t, which may be more appropriate for the variables under consideration.)
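The coin-flip figure is easy to verify; a quick check in Python, assuming nothing beyond fair coins:

```python
# Probability that 10 fair flips all land heads, and that they all land
# the same way (all heads or all tails).
p_all_heads = 0.5 ** 10
p_all_same = 2 * p_all_heads

print(f"all heads: {p_all_heads:.5%}")  # ~0.09766%
print(f"all same:  {p_all_same:.5%}")   # ~0.19531%
```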
S/o to @hwayne for really trying to push this. It’s my biggest pet peeve with technological discussions: a veneer of authority is applied to someone’s opinions and experiences in order to make categorical statements and to condescend to others’ opinions and experiences. The correctness benefits of FP are the archetypal example here, but the metric discussed in this post is more of the same.
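Yes, I wanna be @hwayne when I grow up XD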
I’ve seen linters that raise warnings about cyclomatic complexity in individual functions, and sometimes it seems worthwhile, but I’m not convinced.
One measure of complexity I’d like to see is, on average, how many different files I need to look at to understand what a given line is doing. I’ve read some Golang code that makes pretty heavy use of interfaces that feels like it’d rate highly on that, and (because of loose typing) a lot of Python code takes tracing through many files to understand what an object actually could be - can it be None? Is it always one type? Better check all the callsites…
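A very rough sketch of what that could look like for Python — my own approximation, nothing established: it treats “files I’d need to open” as “distinct imported modules the function’s body references”, which is a big simplification.

```python
# Rough sketch: for each function in a Python file, count how many distinct
# imported names its body references. That loosely approximates "how many
# other files I'd have to open" to follow what the function is doing.
import ast

def imported_roots(tree: ast.Module) -> set[str]:
    """Names bound at module level by import statements."""
    roots = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                roots.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                roots.add(alias.asname or alias.name)
    return roots

def fan_out(path: str) -> dict[str, int]:
    with open(path, encoding="utf-8") as f:
        tree = ast.parse(f.read(), filename=path)
    imports = imported_roots(tree)
    result = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            used = {n.id for n in ast.walk(node)
                    if isinstance(n, ast.Name) and n.id in imports}
            result[node.name] = len(used)
    return result

if __name__ == "__main__":
    import sys
    for name, count in sorted(fan_out(sys.argv[1]).items(), key=lambda kv: -kv[1]):
        print(f"{count:3d}  {name}")
```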
I have seen those too and I think cyclomatic complexity is a bad measure for standard linters.
The reason is that I have seen more than one example where it has been used to increase complexity (in the sense of making it harder to reason about what the code does). You have some switch statement for mapping some values where it’s clear and obvious what’s going on, but your cyclomatic complexity check of course hates that switch statement.
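To make the kind of code concrete (an invented example; Python has no switch, so a chain of elifs plays the same role): a flat mapping is perfectly readable, yet a cyclomatic-complexity check counts every branch and may flag it.

```python
# Illustrative only: a flat mapping written as branches vs. as a dict.
# A cyclomatic-complexity check counts one branch per elif and may flag the
# first version, even though both are equally easy to reason about.

def status_text_branchy(code: int) -> str:
    if code == 200:
        return "OK"
    elif code == 301:
        return "Moved Permanently"
    elif code == 404:
        return "Not Found"
    elif code == 418:
        return "I'm a teapot"
    elif code == 500:
        return "Internal Server Error"
    else:
        return "Unknown"

_STATUS_TEXT = {
    200: "OK",
    301: "Moved Permanently",
    404: "Not Found",
    418: "I'm a teapot",
    500: "Internal Server Error",
}

def status_text_table(code: int) -> str:
    return _STATUS_TEXT.get(code, "Unknown")
```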
The problem here is not the check itself; it makes sense. The problem is that it’s not an exact measure of CCL, and there are cases like the one just mentioned where it’s obviously wrong as a measure of CCL. It very much stands out for that type of code.
However, once you make exceptions, it leads to another problem. Early in my career we started to use linters to clear up some messes. We each took a project with the goal of applying style and linting changes so we could better reason about the code. We finished, and we both had all checks pass. Only days later did I realize that the other person had “fixed” theirs by simply changing all the settings to fit the code, not the other way around.
What I wanna say is that linting where warnings are sometimes okay leads to problems creeping in very quickly: those warnings will be silenced, and then the linter changes from a very useful tool into more of a burden, because you can no longer be sure that you really don’t have that kind of code or problem anywhere.
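As a made-up illustration of how that silencing tends to look in practice (assuming flake8 with the mccabe plugin, where C901 is the “too complex” warning):

```python
# Illustrative only: two common ways a complexity warning gets "fixed"
# without the code actually being simplified.

# 1. The warning is silenced inline instead of the function being split up:
def dispatch(event):  # noqa: C901
    ...

# 2. The project-wide threshold quietly creeps upward in the linter config
#    (e.g. flake8's max-complexity raised from 10 to 40), so the check
#    never fires again and stops meaning anything.
```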
I guess leaky abstractions are a big problem. However, I feel like this metric throws the baby out with the bathwater. Don’t good abstractions effectively reduce cognitive load?