1. 62
    1. 29

      While LLMs definitely seem to be good for common templating (i.e., more efficiently searching and copying from Stack Overflow), I have yet to be convinced they will be a productivity gain beyond that.

      Particularly, to do less programming, you basically have to become a product manager from the 1990s, specifying requirements to a ridiculous level of detail (that never produces the right results despite the massive amount of work).

      AND you then have a bunch of additional debugging and testing.

      Programming is currently much easier.

      And the downside of using it for templating (for new developers in particular) is that it impedes building expertise.

      Not to mention that, as LLMs turn Stack Overflow into a ghost town, and as people need to start licensing the content they’re training on, there’s a non-zero chance that, as tech evolves, the outcomes for the templating won’t get any better.

      1. 4

        I think they will be good for more complex refactorings as well.

        1. 43

          Oh using them for refactoring sounds terrifying… when using these tools, your job as a human becomes to review the code, and my experience is that refactors are by far the hardest kind of change to review. If you additionally can’t trust the entity doing the refactor to not hallucinate solutions which make no sense, and can’t expect to get sane or truthful answers to questions, that sounds truly hellish.

          If language models transform programming jobs into “perform code reviews on refactors done by lying imbeciles”, I quit.

          1. 4

            Some languages like Rust make refactors generally painless.

            1. 6

              Somewhat disagree.

              While any properly typed language makes refactors nice, Rust very nicely adds a whole bunch of safety guarantees that give confidence nothing will break. However, those same guarantees now mean that post-refactor:

              a) That mutable field you needed is now held by a different structure which is used in more than one place via an immutable borrow. There’s no really easy way out here.

              b) (worse) Some lifetime annotations that held before the refactor now don’t make sense anymore, and good luck untangling that.

              1. 4

                They don’t make code reviewing refactors painless.

              2. 2

                I think you have a different understanding compared to what I meant.

                I was rather talking about “mechanical” refactorings. Think new library versions, design/architecture changes etc. And having to review things is already the case for me. I tell IntelliJ to do this and that. Then I have a dozen compile errors and a couple of undesired changes and a couple of places where it’s missing. I have to review all that stuff anyways. The LLM should just do this more efficiently and be able to use context better than IntelliJ does.

                1. 7

                  That’s hardly a “more complex” refactoring though. That’s a dead simple refactoring.

                  1. 1

                    Call it what you want; my point is that even for those kinds of refactorings, the best tools nowadays are still too dumb and most of the burden is still on me.

                    1. 1

                      I wasn’t really disagreeing with you. I just meant that I think your point is more that the tools will be better for simple/mechanical refactoring rather than “complex” ones (which I personally at least would not want to use them for).

                  2. 4

                    In what way is moving to a new library version a “refactor”? Do we use these terms in completely different ways?

                    But if that’s the sort of thing you mean, then yeah, I agree that LLMs could be useful for that. It’d be easy to miss subtle incompatibilities which the LLM didn’t catch but that’s the case for all library upgrades.

                    1. 1

                      Do we use these terms in completely different ways?

                      Looks like it. I didn’t only mean code cleanup and simplification, but also “mindless” changes. I’ve recalibrated my brain to match the group’s definition again though.

              3. 2

                It surprises me how mediocre they are at porting logic between languages, or between versions of a language.

                1. 1

                  LLMs demonstrate some “common sense” – far from perfect yet, but they’re capable of understanding the context of requests and extrapolating requirements.

                  What if LLMs become good managers capable of formalizing vague customer requests?

                  1. 5

                    LLMs demonstrate some “common sense” – far from perfect yet, but they’re capable of understanding the context of requests and extrapolating requirements.

                    In what world? I’ve been using this Codeium thing at work for a little while now and it’s more of a nuisance than it’s worth with its fully “contextual” suggestions.

                    1. 1

                      I mean their more general abilities (the instruction/chat format).

                      Code completion of a single statement is the wrong place to design a whole application, for humans too.

                      1. 1

                        Have you tried using GPT 4 or Claude Opus? They are quite good at basic “common sense.”

                  2. 17

                    I feel like I’m taking crazy pills whenever I read one of these articles. CoPilot saves me so much time on a daily basis. It just automates so much boilerplate away: tests, documentation, switch statements, etc. Yes, it gets things wrong occasionally, but on balance it saves way more time than it costs.

                    1. 62

                      Comments like this always make me wonder: How much boilerplate are you writing and why? I generally see boilerplate as a thing that happens when you’ve built the wrong abstractions. If every user of a framework is writing small variations on the same code, that doesn’t tell me they should all use an LLM to fill in the boilerplate; it tells me that we want some helper APIs that take only the things that differ between the users as arguments.
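
                      To make that concrete, here’s the kind of refactor I have in mind (a made-up sketch; the retry helper and every name in it are invented for illustration): the parts that differ between call sites become arguments to one helper, and the formerly repeated boilerplate lives in exactly one place.

                      ```python
                      # Hypothetical sketch: the copy-pasted retry/backoff boilerplate becomes
                      # one helper, and callers only supply what actually differs.
                      import time
                      from typing import Callable, TypeVar

                      T = TypeVar("T")

                      def with_retries(op: Callable[[], T], attempts: int = 3, delay: float = 0.1) -> T:
                          """Run op, retrying on failure; the formerly duplicated loop lives here once."""
                          for i in range(attempts):
                              try:
                                  return op()
                              except Exception:
                                  if i == attempts - 1:
                                      raise
                                  time.sleep(delay)

                      # Each former boilerplate block collapses to a single call site.
                      calls = {"n": 0}

                      def flaky() -> str:
                          calls["n"] += 1
                          if calls["n"] < 3:
                              raise RuntimeError("transient failure")
                          return "ok"

                      print(with_retries(flaky, attempts=5))  # succeeds on the third attempt
                      ```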

                      1. 30

                        “It should be noted that no ethically-trained software engineer would ever consent to write a DestroyBaghdad procedure. Basic professional ethics would instead require him to write a DestroyCity procedure, to which Baghdad could be given as a parameter.” — Nathaniel Borenstein

                        1. 5

                          Yeah it’s definitely that I don’t know when to add abstractions, not that the tool is useful in some specific circumstances 🙄

                          1. 4

                            You created that perception by choosing such a questionable example. It’s reasonable pushback.

                            1. 1

                              What on earth are you talking about? How could “tests, documentation, and switch statements” possibly be a questionable example? They’re the perfect use-case for automated AI completion.

                          2. 2

                            I’ve found it useful when I want to copy an existing test and tweak it slightly. Sure, maybe I could DRY the tests and extract out common behavior, but if the test is only 10 LoC I find that it’s easier to read the tests without extracting stuff to helpers or shared setup.
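
                            For example (a made-up sketch, not from any real codebase), I’d rather keep two near-duplicate tests like these than fold them into shared helpers:

                            ```python
                            # Hypothetical sketch: two small, near-duplicate tests kept separate on
                            # purpose, so each reads top to bottom without jumping to shared setup.
                            def apply_discount(total: float, code: str) -> float:
                                rates = {"SAVE10": 0.10, "SAVE20": 0.20}
                                return round(total * (1 - rates.get(code, 0.0)), 2)

                            def test_discount_save10():
                                total = apply_discount(100.0, "SAVE10")
                                assert total == 90.0

                            def test_discount_unknown_code_charges_full_price():
                                total = apply_discount(100.0, "BOGUS")
                                assert total == 100.0
                            ```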

                            1. 14

                              That was one of the places where Copilot significantly reduced the amount I typed relative to writing it entirely by hand, but I found it was only a marginal speedup relative to copying and pasting the previous test and tweaking. It got things wrong enough that I had to carefully read the output and make almost as many changes as if I’d copied and pasted.

                              1. 1

                                IME the cumulative marginal savings from each place it was helpful was far, far, far outweighed by one particular test where it used fail instead of error for a method name and it took me a distressingly long time to spot.

                                1. 2

                                  I think I’ve only wasted a cumulative five minutes of debugging test failures caused by Copilot writing almost the right test, but I’m not sure I could claim that it’s actually saved me more than five minutes of typing.

                            2. 2

                              I think the general answer is “a lot”. Once you have a big codebase and several developers, the simplicity you get from NOT building abstractions is often a good thing. Same as not DRYing too much and not making too many small functions to simplify code flow and local changes. Easy-to-maintain code is mostly simple, and reducing “boilerplate”, while great in theory, always means macros or metaprogramming or some other complicated thing in practice.

                              1. [Comment removed by author]

                              2. 12

                                I don’t think you are taking crazy pills! Copilot could totally be saving you time. That’s why I prefaced by saying the kind of project I use Copilot with is atypical.

                                But I also want to say, I once believed Copilot was saving me time too, until I lost access to it and had some time to compare and reflect.

                                1. 4

                                  Programming in Lisp, I rarely have boilerplate, because any repeated code gets abstracted away.

                                  1. 3

                                    I’ve used Copilot for a while and don’t use it anymore. In the end, I found that most boilerplate can be better solved with snippets and awk scripts, as they are more consistent. For example, to generate types from SQL, I have an AWK script that does it for me.
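
                                    As a rough illustration of that kind of generator (mine is an AWK script; this is only a sketch of the idea in Python, with a made-up type mapping and demo table):

                                    ```python
                                    # Hypothetical sketch of a SQL-to-types generator; the column-type
                                    # mapping and the demo table are invented for illustration.
                                    import re

                                    SQL_TO_PY = {"integer": "int", "bigint": "int", "text": "str",
                                                 "boolean": "bool", "real": "float"}

                                    def generate(sql: str) -> str:
                                        out = ["from dataclasses import dataclass", ""]
                                        # Find CREATE TABLE blocks and turn each column into a typed field.
                                        for name, body in re.findall(r"CREATE TABLE (\w+)\s*\((.*?)\);", sql,
                                                                     re.IGNORECASE | re.DOTALL):
                                            out.append("@dataclass")
                                            out.append(f"class {name.capitalize()}:")
                                            for column in body.split(","):
                                                parts = column.split()
                                                if len(parts) >= 2:
                                                    out.append(f"    {parts[0]}: {SQL_TO_PY.get(parts[1].lower(), 'str')}")
                                            out.append("")
                                        return "\n".join(out)

                                    print(generate("CREATE TABLE users (id integer, name text, active boolean);"))
                                    ```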

                                    For lookup, I invested in good offline docs that I can grep; that way I can be sure I’m not trusting hallucinations.

                                    I didn’t think Copilot was useless, but my subscription ran out and I don’t really feel the need to resubscribe; it didn’t add enough.

                                    1. 2

                                      Same here. One of the biggest ways it helps is by giving me more positive momentum. Copilot keeps me thinking forward, offers me an idea of a next step to either accept, adjust, or reject, and in effectively looking up the names and structure of other things (like normal IDE autocomplete but boosted) it keeps me from getting distracted and overfocusing on details.

                                      It does help though that I use (somewhat deliberately) pretty normal mainstream stacks.

                                      1. 2

                                        Ditto. Especially the portion of the article that mentions it being unpredictable. Maybe my usage is biased because I mostly write python and use mainstream libraries, but I feel like I have a very good intuition for what it’s going to be smart enough to complete. It’s also made me realize how uninteresting and rote a lot of code tends to be on a per-function basis.

                                        1. 1

                                          Yeah, I feel like I type too slowly, so sometimes I’ll just let Copilot generate something mostly in line with what I’m thinking and then refine it.

                                        2. 14

                                          For Rust, LLMs aren’t able to generate working code in my codebase at all.

                                          1. 4

                                            It’s possible this says more about Rust than it does about LLMs.

                                            1. 5

                                              No, because I can write working code. It’s just completely different to the LLM’s code.

                                            2. 1

                                              I have questions. :)

                                              1. Exactly what do you mean by working code? The code doesn’t compile? Or the code doesn’t function as intended?
                                              2. Which LLMs are you referring to? GPT 3.5, GPT 4, Claude Haiku, Claude Opus? LLMs vary widely in their capabilities.

                                              I use LLMs extensively in coding. For in-IDE support: GitHub Copilot. For discussions of complex problems ChatGPT 4 and Claude Opus. (Yes, I use both. They have different strengths and weaknesses.)

                                              1. 7
                                                1. It doesn’t compile, even after several tries of the LLM.
                                                2. Copilot & GPT 3.5 (The ones I have free access to)
                                                1. 1

                                                  It doesn’t compile, even after several tries of the LLM.

                                                  When you say “after several tries,” do you mean asking it the same thing repeatedly? Or do you provide it feedback via the compilation errors? I’ve found that LLMs may not get things right the first time, but when there is a feedback mechanism in place, they’ll often make their way towards something that works.

                                                  Copilot & GPT 3.5 (The ones I have free access to)

                                                  Ah. That’ll be the biggest problem. Copilot is pretty good as a more advanced autocomplete, but I’ve found its chat capabilities lacking. And I wouldn’t try using GPT 3.5 for any serious coding task. Unfortunately, the models that are good are behind paywalls. I’d recommend trying either GPT 4 or Claude Opus for a month to see if you find them useful. I definitely derive far more than $20/month in value from them.

                                                  1. 4

                                                    I definitely derive far more than $20/month in value from them.

                                                    Does your employer allow you to expense this cost? Or are you self employed/a freelance worker where increased productivity directly results in earning more?

                                                    As a salaried worker I don’t follow this reasoning, because even if I did find the tool useful and it helped in my day to day work, it would not result in getting paid more, so the subscription would not pay for itself.

                                                    1. 3

                                                      My employer provides GPT 4 and copilot.

                                                      And for my own personal projects I pay for copilot, ChatGPT 4 and Claude Opus because they’re that much of a productivity boost.

                                                      1. 1

                                                        Would you think of it as paying for itself if it gave you a productivity boost that helped you get a promotion?

                                                        You can use GPT-4 for free via Bing, if you can figure out the right incantations - they change often enough that I’m not confident that I have a great link for them right now.

                                                        1. 4

                                                          Would you think of it as paying for itself if it gave you a productivity boost that helped you get a promotion?

                                                          In the abstract, sure, but in practice I would find this difficult to justify, for two reasons: (1) I am skeptical that using LLM tools would provide so much of a productivity/performance boost that it alone would result in more/faster promotions [1], and (2) I suspect it would be difficult to trace any productivity improvements (real or perceived) to using an LLM assistant (the OP did a kind of long-term A/B self-study, so I suppose it could be done).

                                                          [1]: I’ve experimented with both ChatGPT 3.5 and Claude (whichever the free one is) in my normal workflow and both were hit or miss on whether they were useful at all, let alone providing any kind of non-trivial productivity multiplier.

                                                          EDIT: I should add that LLMs for me have limited usefulness in my day job because I cannot send any of my company’s source code to a 3rd party server. So any time that I want to use an LLM, I have to first distill my question into some abstract, generalized form. I suspect that if I were able to query an LLM on the actual code I would probably be able to find more use out of it.

                                                          1. 4

                                                            I am at point now where I will confidently estimate that I am getting a 2-5x productivity boost from LLMs for the time I spend typing code into a computer - which is only about 15% of my work, but still material.

                                                            But… I’ve been using them extensively for over two years, and as such I have developed a very strong intuition as to how to get the best results out of them. That learning curve is actually really steep, but few people ever talk about them as tools that are genuinely difficult to use.

                                                            I have extensive notes on how I use them here: https://simonwillison.net/series/using-llms/

                                                            1. 1

                                                              You have indicated a couple of times that you have built some form of intuition on how to query LLMs to get the result you want. I found some of your tutorial very instructive but if you would be up for it, I think it would be amazing to do this on a live-stream for a small project. It would really give people an opportunity to ask “I want to add feature X, how do you ask for it?”.

                                                              1. 1

                                                                I have extensive notes on how I use them here: https://simonwillison.net/series/using-llms/

                                                                Thanks, I will take a look!

                                                        2. 2
                                                          • I provide it with the errors, but sometimes it doesn’t change the code or switches back and forth between two wrong versions of it.

                                                          • I can’t afford paying for AI tools. (I get Copilot through GitHub Education.)

                                                          1. 1

                                                            I see. You may just need to wait until the more capable models become available through educational licenses.

                                                            I will say that I expect the benefits of these models may be less to CS students than to folks with lots of experience. The experience helps me immediately discard incorrect suggestions without having to think about it much.

                                                    2. 1

                                                      Have you tried https://codeium.com/ ? It has some rust capabilities, maybe not at the free level.

                                                      1. 3

                                                        I have, on the Enterprise level, and it’s still not useful.

                                                        1. 1

                                                          That’s unfortunate.

                                                          I don’t really know Rust and I was able to get it to fix a small bug, though someone who actually knew Rust came back with a better fix.

                                                        2. 3

                                                          I’ve tried Gemini, Copilot and GPT-4

                                                          None of them can create working Rust code beyond the most trivial examples.

                                                      2. 8

                                                        My friend told me that he uses Copilot daily, and when meaningful results start to appear, he interprets it as a sign that he’s working at the wrong layer of abstraction, writing something that should be either configuration or auto-generated. So it’s not a junior dev with an attitude, but rather a robot demonstrating what you should not invest your time into.

                                                        1. 1

                                                          I really like that approach. I’d add a few more things:

                                                          • If an LLM is bad at guessing my function names, that’s a signal that I may not have used a consistent naming convention.
                                                          • If an LLM puts arguments in the wrong order, that’s a signal that I may not have a consistent argument order.

                                                          But if an LLM is producing entire functions, that’s a good sign that I’ve written almost the same thing somewhere else and I haven’t built good abstractions.

                                                        2. 8

                                                          I think the code a programmer writes is better quality when they don’t use an LLM.

                                                          1. 8

                                                            There’s one exception: I’ve found Copilot is pretty good at helping me write decent doc comments. There’s often repetition between these (I don’t want people to have to read docs of multiple functions to understand one) and I am much better at glancing at prose and telling if it’s correct than I am at code, so I spot nonsense outputs far faster. It took me ages to fix a bug where copilot got a 1 and 0 the wrong way around in a network device driver but it takes almost no time to delete a comment that contains nonsense.

                                                            1. 4

                                                              Agreed. IME it’s fine for boilerplate templates for getting started with a new project (it’s stackexchange/README copy-paste on steroids – not the thing I burnt most of my time on, but whatever, I’m fine with using it for that). But the annoying thing is that Copilot-generated code that’s part of a larger project looks ok at first glance. It’s well formatted, has descriptive looking variable names, nice looking docstrings, and short functions. But as soon as I start tracing through the logic I wonder why the programmer made so many arbitrary breaks in the abstractions and unnecessary indirection before I realize it was AI-generated.

                                                              LLMs recognize the syntax-level “visual” aspects of good code, but can’t recognize the deeper patterns of what abstractions make sense (arguably because of wider context about the problem domain or business that there’s no way of feeding it). Sure, the programmer is supposed to guide the LLM, but when writing code with the goal of minimizing how many lines are added to a codebase, the density of decisions being made is at least one per line (barring line noise like brackets and stuff which editors have already completed for us automatically for years). At that point, coaxing an LLM to output the exact code desired isn’t really gaining anything.

                                                              Even worse, nice looking code with crappy abstractions and tons of indirection to hide the badness is the most painful type of code to work with (also very common in Java codebases, unfortunately). The code is actually bad everywhere and it’s hard to see where the real complexity of the problem lives or follow the control flow of the program.

                                                            2. 7

                                                              It sounds (but it’s hard to actually judge from the text in the article) like the OP was spending cycles to “prompt copilot” into the code they wanted it to generate? That would indeed be slow and waste a lot of cycles. I’m sure that wasn’t the only manner in which they used copilot, of course, but I can see how that would appear to be a productivity boon (“Look how much code it saved me from writing!”) only to realize later that the cycles spent on prompting+reviewing+ultimately not accepting suggestions balances out just writing the code yourself.

                                                              However, I have to push back a little bit. If you use Copilot as a purely autocomplete/snippet extension, i.e. no “Copilot pauses,” no “comment prompting,” no “structuring code for a prompt,” etc., and just let Copilot suggest what it will, either accepting it or not at first glance and moving on, it’s a HUGE productivity booster (at least in boilerplate-heavy or verbose languages/frameworks). This means sometimes Copilot is too slow and misses a chance to save me from some boilerplate (“oh well”), or more often than not it makes the suggestion to complete the boilerplate I was in the middle of (“yay”).

                                                              Conversely, it also means I don’t miss Copilot that much when I can’t/don’t use it, because it’s effectively just suggesting the boilerplate I was already typing out, so it’s as if the inference was a little too slow; oh well.

                                                              For me it’s just a typing speed tool, not a thinking tool.

                                                              YMMV obviously.

                                                              1. 1

                                                                However, I have to push back a little bit. If you use Copilot as a purely autocomplete/snippet extension, i.e. no “Copilot pauses,” no “comment prompting,” no “structuring code for a prompt,” etc., and just let Copilot suggest what it will, either accepting it or not at first glance and moving on.

                                                                I’ve been doing this. I’ve got Rider integrated with the official Copilot plugin (business version, so no snooping or learning from my code)

                                                                I have a bunch of SQL in the project directory. I went to my C# project and started writing code to add a trigger I had in the plain SQL files using Entity Framework. The fucker grabbed the exact SQL snippet from the project as an autocompletion.

                                                                The context-awareness of LLM autocomplete is unmatched compared to anything before.

                                                              2. 7

                                                                I have a hunch that people’s experience with Copilot varies wildly depending on the programming languages they use.

                                                                I mainly write Python and JavaScript - two of the most widely used languages with the most available training data. Copilot flies with those.

                                                                When I’ve tried it with Rust it’s been quite a bit less helpful, and more likely to produce incorrect code.

                                                                1. 2

                                                                  Ditto on the Rust experience - most LLMs I’ve tried seem to mess up the basics or generate code that’s too simple to solve the problem I’m asking it to solve. Though my prompting may be the issue.

                                                                  1. 2

                                                                    It’s utter shit for Rust. Proving that not even computers can understand that crap :D

                                                                    Works like a charm for Python, JS/TS, Go and C# though in my experience. In many cases I can just prompt it for a project idea and pretty much copy & paste to completion.

                                                                    (Built a small Go program to grab an RSS feed with links to files and save the linked files locally, took me like 10-15 minutes with the free Gemini version)

                                                                    1. 6

                                                                      Proving that not even computers can understand that crap :D

                                                                      That you think computers “understand” your prompt is a big mark against using these LLM tools for junior devs.

                                                                      1. 1

                                                                        It’s just easier to use humanised terms for these things. Of course it doesn’t “understand” in the way we humans “understand” things.

                                                                        But whatever language model magic it’s doing, it still can’t produce Rust code that compiles past a few very tiny snippets. Either Rust is too complex for LLMs with all the symbols and correctness requirements or it moves so fast as a language that a 6-12 month old LLM training set is already completely invalid.

                                                                        1. 1

                                                                          Humans don’t understand things either! See this discussion for details and examples.

                                                                          1. 2

                                                                            I’m honestly just not sure what part of this link you’re referring to or think supports the very extreme claim that “Humans don’t understand things.” The only thing anyone in that thread seems to agree on is that what understanding means is complicated and hard to define. I don’t think that proves humans don’t understand things; it proves that if humans do understand things we don’t have consensus on a formal description of that process.

                                                                    2. 1

                                                                      I’d say it’s more about what people usually do with those languages, rather than the languages themselves. For example, with JavaScript you are most likely doing something that’s UI adjacent (front/backend of website, electron apps, etc.). And certain tasks might be easier for LLM than others.

                                                                    3. 6

                                                                      So far where I’ve found LLMs to be the most helpful is substituting for documentation and code searches, not directly writing code. There are also no legal issues just asking for information. It’s been faster sometimes to ask the LLM about the Linux kernel source code than do a regex search and read-through, though maybe I just need a faster laptop.

                                                                      1. 5

                                                                        I kind of suspect that inline code completion is, in many cases, the wrong layer for LLM-driven IDE integration – it both gives the LLM too little context to output anything useful and is expensive in terms of interaction. Intuitively, it’s the right layer – after all, token completion is literally how LLMs work – but in practice it just doesn’t work that well. As the author discovered, it’s both slow and often unpredictable.

                                                                        What I did get a productivity boost from was at the other end of the spectrum – using e.g. ChatGPT to write methods or even whole classes, top-down, by gradually refining them, so that you can build context as you go, rather than having to meticulously specify all of it in one go. The latter approach is exhausting; the robot doesn’t understand anything from what you say, so no matter how strict and careful you are, it’s still going to get it wrong, and the difference between right and wrong is often in apparently useless variations like writing “various inputs” instead of “arbitrary inputs”.

                                                                        I don’t use this for contract work for now, as I’m not sure about its legal status. But it’s definitely made a difference in how much I can do on my own projects in the one hour or so that I can now scrape in the evening. I used this to port an old program of mine from TkInter to Pyside, for example; what would’ve easily been a couple of hours’ worth of browsing reference manuals ended up amounting to about 45 minutes of steering ChatGPT through generating boilerplate UI code that I could plug existing UI-independent code into.

                                                                        It seems to me that LLMs are currently like Maxwell’s demon’s intern. Their hearts are in the right place but they still can’t tell tokens apart very well, so they’re bound to let the wrong one through pretty often. Any application that depends on LLMs being meticulously right is bound to be disappointing for now – basically, if you have an LLM generate your next line of code while writing a method, and it’s wrong, you can pretty much throw its output away. Whereas applications that depend on the demon mostly letting the right kind of tokens in are going to work a little better, as long as you can sort out the wrong ones.

                                                                        I tried Copilot, too, and it was… meh, kind of interesting for some types of work, but 19 USD/mo feels like about an order of magnitude too expensive not so much for what it can do, but for how well it can do it. Things like ChatGPT, on the other hand, even on their free tier, are way more useful. (And if you’ve been deprived of nerd contact for long enough, like me, they’re also a lot more fun to work with if you fiddle with the prompt a little :-D)

                                                                        1. 3

                                                                          Yeah, LLMs need a lot of guidance and nudging. Copilot completely missed that subtlety.

                                                                          They are also very flawed at emulating thinking. E.g. every token you give them, or that they produce, biases subsequent output.

                                                                          Thus when I wrote https://chatcraft.org it was partially out of frustration at not being able to abort/branch at dead-end chats (e.g. it’s super helpful to edit and delete both the AI’s and my own messages) and partially out of frustration with the walled garden of what I was and wasn’t allowed to do in a proprietary UI.

                                                                        2. 3

                                                                          My free access to Copilot must’ve happened on about the same timeframe as yours. I also only used it for side projects, and was not tempted to pay for it when it went away. (To be clear: I do pay for tools that I use in my side projects if I find them to be reasonably priced and useful. If it were more helpful, I’d have paid $10 per month to maintain access to it.)

                                                                          Unlike your project, mine should be the kind that plays to Copilot’s strengths. I mostly write Python, and do plenty of web stuff lately.

                                                                          When copilot was right, it was great. It did result in better doc strings a lot of the time. But on the whole, especially after not having it for a couple weeks, I now believe I spent more time cleaning up after it than it ever saved me.

                                                                          I never tried its chat interface, though. Maybe that would’ve been better? I do find Claude helpful that way. Describing a function or a class in that chat interface often quickly eliminates a bunch of boilerplate, and telling it things like “Now generate tests for that using pytest.” and “Parameterize test_x” or “Turn that into a fixture” has drastically cut down the amount of time I spend writing tests. For one recent library, that improvement allowed enough testing to fit in my timebox that I was willing to publish it to pypi.
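
                                                                          For instance, prompts like those tend to get me something along these lines (a made-up sketch; slugify and the fixture are invented for illustration):

                                                                          ```python
                                                                          # Hypothetical sketch of what "parameterize" and "turn that
                                                                          # into a fixture" prompts tend to produce.
                                                                          import pytest

                                                                          def slugify(text: str) -> str:
                                                                              return "-".join(text.lower().split())

                                                                          @pytest.fixture
                                                                          def sample_titles():
                                                                              # Shared inputs, pulled out into a fixture.
                                                                              return ["Hello World", "  Extra   Spaces "]

                                                                          @pytest.mark.parametrize(
                                                                              "raw,expected",
                                                                              [
                                                                                  ("Hello World", "hello-world"),
                                                                                  ("  Extra   Spaces ", "extra-spaces"),
                                                                                  ("already-a-slug", "already-a-slug"),
                                                                              ],
                                                                          )
                                                                          def test_slugify(raw, expected):
                                                                              assert slugify(raw) == expected

                                                                          def test_slugify_is_idempotent(sample_titles):
                                                                              for title in sample_titles:
                                                                                  assert slugify(slugify(title)) == slugify(title)
                                                                          ```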

                                                                          1. 2

                                                                            I’ve been using copilot for around a year now and have gone for days or weeks at a time without it, and this largely mirrors my experience. The main difference being that a lot of my code has boilerplate and copilot is able to greatly reduce the time I spend writing it. I’ve tended to avoid copilot in code that’s less boilerplate heavy as it tends to make more work for me.

                                                                            A couple of examples from my workflow:

                                                                            I occasionally write JS/TS and have been writing it on and off for over a decade. I try to avoid using Copilot because it often gets parens in the wrong spot (yes, it’s written bad if statements for me) or just generally doesn’t write any faster than I do. When it writes bad code, I often don’t know until I run it and it doesn’t behave as expected.

                                                                            I primarily write Elm. I love using Copilot with Elm because it’s faster at filling in the boilerplate and Elm is fast at finding errors & excellent at telling me how to fix them.

                                                                            So I guess Copilot is helpful when it’s used alongside a language & tooling that can easily correct it.

                                                                            1. 4

                                                                              It’s almost like Copilot is the one writing some of these blog posts/articles and comments here as well.

                                                                              1. 1

                                                                                Don’t know about Copilot but Perplexity is for sure helpful when I’m in unfamiliar domains (a lot) and get stuck.

                                                                                1. 1

                                                                                  I feel like GPT4 has been more helpful as a coding tool than Copilot.

                                                                                  1. 1

                                                                                    I find no benefit from GitHub Copilot, but a huge one from ChatGPT, a tool that is specifically not built into your code editor. It’s akin to a digital rubber ducky: breaking my problem down into a couple of sentences that balance what information should go in (without overwhelming ChatGPT with too much, which could lead it down a wrong decision path) and pasting in just the relevant code pieces reframes my own challenges into new perspectives I wouldn’t have thought about. Getting solution suggestions to apply or reject is the cherry on top.

                                                                                    That being said, it really depends on programming language, more so than I thought. For graphics programming and GLSL it’s consistently stupid with the most basic of challenges.

                                                                                    1. 2

                                                                                      I feel the same way. I don’t use GPT all day long but I dip in and occasionally find enough value for 20 bucks a month (for now).

                                                                                      But for the current LLMs, the editor is the wrong interface. At least editors as they now exist. My gut tells me there’s some UX that’s yet to be invented that really cracks this stuff wide open. I hope I’m retired before then. I love to code 😊