Not a big fan of studies like this tbh but let’s see what we’ve got. Obviously there’s some conflict of interest here as this is a “paper” produced by a vendor that sells a code quality product, but I try not to discount based only on conflicts. Still, it’s notable.
The premise here seems to be that the “copy/paste” vs “moved” metric is a good proxy for code quality, citing a study showing that DRY leads to better quality software. I think that’s questionable (not to me personally, I buy it entirely) but whatever, let’s just grant that DRY leads to better quality software despite there having been a real shift of wind in the dogma on that front (“a little copy paste is better than a little dependency”, etc).
The methodology is to look at Github with a sort of “dupe block detector”. I’m extremely wary of Github-based studies. While Github contains a ton of code, I think it’s an extremely biased sample. Github is often where I push my worst code, often POCs go there, often I’ll not care about DRY or quality, etc. I would have been more curious to see this pruned via things like starred projects (indicating an “actual” project), or otherwise normalized to account for the fact that Github projects are going to be overwhelmingly lower quality: student work, bootcamp projects, portfolios, etc.
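For my own sanity, here’s roughly what I picture a “dupe block detector” doing - a minimal sketch assuming it hashes fixed-size windows of whitespace-normalized lines; the vendor’s actual tooling, thresholds, and language coverage are not something I know:

```python
# Minimal sketch of a naive duplicate-block detector (hypothetical, not the
# vendor's tool): hash every WINDOW-line run of whitespace-normalized code
# and report any hash that shows up in more than one place.
import hashlib
from collections import defaultdict
from pathlib import Path

WINDOW = 6  # assumed minimum clone size, in lines

def normalize(line: str) -> str:
    # Collapse whitespace so formatting-only differences don't hide clones.
    return "".join(line.split())

def duplicate_blocks(root: str):
    seen = defaultdict(list)  # block hash -> [(file, starting line), ...]
    for path in Path(root).rglob("*.py"):
        lines = [normalize(l) for l in path.read_text(errors="ignore").splitlines()]
        for i in range(len(lines) - WINDOW + 1):
            block = "\n".join(lines[i : i + WINDOW])
            if not block.strip():
                continue  # skip windows that are all blank lines
            seen[hashlib.sha1(block.encode()).hexdigest()].append((str(path), i + 1))
    return {h: locs for h, locs in seen.items() if len(locs) > 1}

if __name__ == "__main__":
    for locs in duplicate_blocks(".").values():
        print("possible copy/paste:", locs)
```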
I personally found the “first principle” argument compelling, as someone who is sold on DRY, so I buy the premise here that AI probably makes codebases worse because it lacks the context to say “go use that other function” instead of just duplicating the code inline.
By analyzing 3,113 pairs of co-changed code lines, the researchers observe that “57.1% of all co-changed clones are involved in bugs.”
This is interesting; I am always skeptical of these sorts of studies but I’ll be curious to read that study too. My confirmation bias likes this though and, again, I really buy the first principle approach (and am glad they spelled it out, since I believe that principles are more valuable than studies in this field, currently).
This graph shows that for every 25% increase in the adoption of AI, their model projects a 7.2% decrease in “delivery stability.”
This looks like it could fit almost entirely within the margin of error? I’ll need to read the DORA study to tell.
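Just to get a feel for the size of that claim, a quick back-of-the-envelope extrapolation - assuming, probably wrongly, that the effect is either linear or compounds per 25% step; the report’s actual model may be neither:

```python
# Back-of-the-envelope sizing of "every 25% increase in AI adoption ->
# 7.2% decrease in delivery stability". Purely illustrative; the report's
# model is not necessarily linear or compounding.
per_step_adoption = 0.25
per_step_drop = 0.072

for adoption in (0.25, 0.50, 0.75, 1.00):
    steps = adoption / per_step_adoption
    linear = per_step_drop * steps                  # additive reading
    compounding = 1 - (1 - per_step_drop) ** steps  # multiplicative reading
    print(f"{adoption:.0%} adoption: linear {linear:.1%}, compounding {compounding:.1%}")
```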
Oof, this is a long study. And there are many more referenced; I’m realizing my morning can’t afford this today :(
Couple of thoughts.
I’m surprised by the high quality of this document. I was expecting worse. That “41% increase in bugs” garbage got sooooo much press coverage and it’s one of the worst things I’ve ever seen, absolute trash. This is well written, well justified, and just good stuff. I’m excited to read more. While I’m skeptical of the methodology for these sorts of things, I feel like these authors actually built one of the better “analyze Github” studies, at least so far. I’m hoping there are more details on things like the statistical models used and whatnot later on.
I think these are tractable problems. The obvious one is context size. We can’t currently index large projects into a context and then tell the AI to reuse functions, which means the AI is incentivized to generate new code within its local context vs looking up the existing code. This is something we can solve very easily IMO; it’s one of those “give it a few years and we’ll scale out of this one” problems, I think.
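To make “index the project so the AI reuses functions” concrete, here’s a toy sketch - names and approach are mine, and a real assistant would presumably use embeddings and a vector store rather than keyword overlap:

```python
# Toy sketch of project indexing for reuse (hypothetical, not any real tool):
# extract every function name + docstring in the repo, then rank them by
# naive keyword overlap with the task so the best matches can be stuffed
# into the model's context with a "prefer calling these" instruction.
import ast
from pathlib import Path

def index_functions(root: str):
    index = []  # (location, searchable text)
    for path in Path(root).rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(errors="ignore"))
        except SyntaxError:
            continue
        for node in ast.walk(tree):
            if isinstance(node, ast.FunctionDef):
                doc = ast.get_docstring(node) or ""
                index.append((f"{path}:{node.name}", f"{node.name} {doc}"))
    return index

def candidates_for(task: str, index, top_k: int = 5):
    # Rank indexed functions by word overlap with the task description.
    task_words = set(task.lower().split())
    scored = [
        (len(task_words & set(text.lower().split())), name)
        for name, text in index
    ]
    return [name for score, name in sorted(scored, reverse=True)[:top_k] if score > 0]

# Example: candidates_for("parse the config file", index_functions("."))
# would surface existing config-parsing helpers to include in the prompt
# instead of letting the model write a new one inline.
```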
To the blog,
This raises an important question about finely crafted code. Are new developers actively thinking about improving their craft? I have observed engineers with ambitions to write compilers, design new programming languages, or even rewrite the TCP stack in Rust.
I think this is an interesting observation. A major trend of the last decade, which was barely a thing when I started about 15 or so years ago, is the idea that programming is just a good field to be in - you don’t need to love it or care or whatever. I don’t care at all on a personal level why people program or whether they just want money, but I do think that we all benefit from internal motivation: whether that motivation is money, passion, or some combination, we will be better programmers. If work is thought of as a slog, a thing we do simply to survive, we won’t be. It seems to me, naively (as I tend to work at companies where that attitude is not well represented), that this would lead to newer devs caring more about “get it done” and less about “maintain it”. Financial incentives push you to leave a company after 2-4 years, meaning that your long-term maintenance burden can be forgiven entirely once you leave.
Github is often where I push my worst code, often POCs go there, often I’ll not care about DRY or quality, etc.
Very good point. I have 98 public repos and while they might include a couple (20?) forks, I’d put the amount of “I’ll run that in prod, even just for personal stuff” at maybe 10-20%.