1. 59
  1. 29

    Well written; these were exactly my thoughts when I read this. We don’t need faster programmers. We need more thorough programmers.

    Software could be so much better (and faster) if the market valued quality software more highly than “more features”

    1. 9

      We don’t need faster programmers. We need more thorough programmers.

      That’s just a “kids these days…” complaint. Programmers have always been fast and sloppy and bugs get ironed out over time. We don’t need more thorough programmers, like we don’t need more sturdy furniture. Having IKEA furniture is amazing.

      1. 12

        Source code is a blueprint. IKEA spends a lot of time getting their blueprints right. Imagine if every IKEA furniture set had several blueprint bugs in it that you had to work around.

        1. 5

          We’re already close though. We have mature operating systems, language runtimes, and frameworks. Going forward I see the same thing happening to programming that happens to carpentry or cars now. A small set of engineers develop a design (blueprint) and come up with lists of materials. From there, technicians guide the creation of the actual design. Repairs are performed by contractors or other field workers. Likewise, a select few will work on the design for frameworks, operating systems, security, IPC, language runtimes, important libraries, and other core aspects of software. From there we’ll have implementors gluing libraries together for common tasks. Then we’ll have sysadmins or field programmers that actually take these solutions and customize/maintain them for use.

          1. 7

            I think we’re already completely there in some cases. You don’t need to hire any technical people at all if you want to set up a fully functioning online store for your small business. Back in the day, you would have needed a dev team and your own sysadmins, no other options.

            1. 1

              I see the same thing happening to programming that happens to carpentry or cars now. […] From there we’ll have implementors gluing libraries together for common tasks.

              Wasn’t this the spiel from the 4GL advocates in the 80s?

              1. 2

                Wasn’t this the spiel from the 4GL advocates in the 80s?

                No, it was the spiel of OOP/OOAD advocates in the 80s. Think “software IC”.

          2. 1

            Maybe, maybe not. I just figured that if I work more thoroughly, I reach my goals quicker, as I have less work to do and rewrite my code less often. Skipping error handling might seem appealing at first, as I reach my goal earlier, but the price is that either I or someone else has to fix it sooner or later.

            Also, mistakes or mere inefficiencies in software nowadays have a huge impact, because software is so widespread.

            One nice example I like to make:

            The Wikimedia Foundation got 21,035,450,914 page views last month [0]. So if we optimize that web server by a single instruction per page view, assuming the CPU runs at 4 GHz with perfectly optimized code at 1.2 instructions per cycle, we can shave off 4.382 seconds of CPU time per month. Assuming Wikipedia runs average servers [1], this means we shave off 1.034 watt-hours of energy per month. At an energy price of 13.24 euro cents per kWh [2], a single extra instruction per page view costs us roughly 0.013 euro cents per month.

            Now imagine you can make the software run 1% faster, say 48,000,000 fewer instructions per page view: that is suddenly €6,240 in savings per month. For a 1% overall speedup!
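
            That back-of-the-envelope arithmetic can be sketched out directly; note the ~850 W server draw is my assumption, chosen so the figures line up with [1]:

            ```javascript
            // Back-of-the-envelope cost of one wasted instruction per page view.
            // Constants come from the comment above; serverWatts (~850 W) is an
            // assumed average server draw, inferred from [1].
            const pageViews = 21_035_450_914; // Wikimedia page views per month [0]
            const clockHz = 4e9;              // 4 GHz CPU
            const ipc = 1.2;                  // instructions per cycle
            const serverWatts = 850;          // assumed average server draw [1]
            const centsPerKWh = 13.24;        // EU industrial electricity price [2]

            const cpuSeconds = pageViews / (clockHz * ipc);         // ~4.38 s/month
            const wattHours = (cpuSeconds * serverWatts) / 3600;    // ~1.03 Wh/month
            const centsPerInstr = (wattHours / 1000) * centsPerKWh; // ~0.014 ct/month

            // A 1% speedup (48,000,000 fewer instructions per page view in the
            // comment's scenario) scales this to thousands of euros per month.
            const eurosPerMonth = (centsPerInstr * 48_000_000) / 100;
            ```

            With unrounded intermediate values this lands nearer €6,600 than €6,240, since the comment rounds 0.0137 down to 0.013 euro cents before scaling up.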

            High-quality software is not only pleasant for the user. It also saves the planet by wasting less energy and goes easy on your wallet.

            So maybe

            Programmers have always been fast and sloppy and bugs get ironed out over time. We don’t need more thorough programmers,

            this should change, for the greater good of everyone.

            [0] https://stats.wikimedia.org/#/all-projects/reading/total-page-views/normal|table|2-year|~total|monthly
            [1] https://www.zdnet.com/article/toolkit-calculate-datacenter-server-power-usage/
            [2] https://www.statista.com/statistics/1046605/industry-electricity-prices-european-union-country/

          3. 9

            Software could be so much better (and faster) if the market valued quality software more highly than “more features”

            The problem is there just aren’t enough people for that. That’s basically been the problem for the last 30+ years. It’s actually better than it used to be; there was a time not so long ago when everyone who could sum up numbers in Excel was a programmer and anyone who knew how to defrag their C:\ drive was a sysadmin.

            Yesterday I wanted to generate a random string in JavaScript; I knew Math.random() isn’t truly random and wanted to know if there’s something better out there. The Stack Overflow question is dominated by Math.random() in more variations than you’d think possible (not all equally good, I might add). This makes sense, because for a long time this was the only way to get any kind of randomness in client-side JS. It also mentions the newer window.crypto API in some answers, which is what I ended up using.
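
            For what it’s worth, the crypto-based version ends up quite short. A minimal sketch, assuming an environment where `crypto.getRandomValues` is available as a global (browsers, and recent Node); the function name and alphabet are just illustrative choices:

            ```javascript
            // Random alphanumeric string backed by the Web Crypto API instead
            // of Math.random(). Assumes a global `crypto` (browsers, Node 19+).
            function randomString(length) {
              const alphabet =
                "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
              const bytes = new Uint8Array(length);
              crypto.getRandomValues(bytes); // cryptographically strong source
              // Mapping bytes with % has a slight modulo bias (256 % 62 != 0);
              // fine for identifiers, not for keys needing exact entropy.
              return Array.from(bytes, (b) => alphabet[b % alphabet.length]).join("");
            }
            ```

            This is exactly the kind of caveat (the modulo bias) that a bare autocomplete snippet would never surface.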

            I can make that judgment call, but I’m not an ML algorithm. And while on Stack Overflow I can add context, caveats, involved trade-offs, offer different solutions, etc., with an “autocomplete code snippet” that’s a lot more limited. And especially as a novice or less experienced programmer, you wouldn’t necessarily know a good snippet from a bad one: “it seems to work”, and without the context a Stack Overflow answer has, you just don’t know. Stack Overflow (and related sites) are more than just “gimme teh codez”; they’re also teaching moments.

            Ideally, there would be some senior programmer to correct them. In reality, due to the limited number of people, this often doesn’t happen.

            We’ll have to wait and see how well it turns out in practice, but I’m worried for an even greater proliferation of programmers who can’t really program but instead just manage to clobber something together by trial-and-error. Guess we’ll have to suffer through even more ridiculous interviews to separate the wheat from the chaff in the future…

            1. 2

              We’ll have to wait and see how well it turns out in practice, but I’m worried for an even greater proliferation of programmers who can’t really program

              I don’t see this as a problem. More mediocre programmers available doesn’t lower the bar for places that need skilled programmers. Lobste.rs commenters often talk of the death of the open web for example. If this makes programming more accessible, isn’t that better for the open web?

            2. 6

              We don’t need faster programmers. We need more thorough programmers.

              Maybe we need more than programmers and should aim to deserve the title of software engineers. Writing code should be the equivalent of nailing wood: whether you use a hammer or an AI-assisted nail gun shouldn’t matter much if you are building a structure that can’t hold the weight it is designed for, or can’t deal with a single plank breaking or rotting.

              1. 6

                We don’t need faster programmers. We need more thorough programmers.

                Not for everything, but given we spend so much time debugging and fixing things, thoroughness is usually faster.

                1. 6

                  Slow is smooth and smooth is fast.

              2. 8

                I do wonder what the licensing implications are of “using” code snippets harvested from GitHub in, for example, proprietary code.

                1. 5

                  The problem is addressed on Copilot’s website under FAQ:

                  Why was GitHub Copilot trained on data from publicly available sources?

                  Training machine learning models on publicly available data is considered fair use across the machine learning community. […]

                  Who owns the code GitHub Copilot helps me write?

                  GitHub Copilot is a tool, like a compiler or a pen. The suggestions GitHub Copilot generates, and the code you write with its help, belong to you, and you are responsible for it. We recommend that you carefully test, review, and vet the code, as you would with any code you write yourself.

                  Does GitHub Copilot recite code from the training set?

                  GitHub Copilot is a code synthesizer, not a search engine: the vast majority of the code that it suggests is uniquely generated and has never been seen before. We found that about 0.1% of the time, the suggestion may contain some snippets that are verbatim from the training set. Here is an in-depth study on the model’s behavior. Many of these cases happen when you don’t provide sufficient context (in particular, when editing an empty file), or when there is a common, perhaps even universal, solution to the problem. We are building an origin tracker to help detect the rare instances of code that is repeated from the training set, to help you make good real-time decisions about GitHub Copilot’s suggestions.

                  So, GitHub appears to consider this to be “fair use”. I am always impressed by what counts as “fair use” in the USA. Too bad European copyright law does not have a “fair use” exemption. But even in the USA, as we have seen with Oracle and Google, the question of what constitutes “fair use” can take quite a long time to answer.

                  And if copyright questions weren’t interesting enough for you, you can also tackle the issue from the data protection angle:

                  Does GitHub Copilot ever output personal data?

                  Because GitHub Copilot was trained on publicly available code, its training set included public personal data included in that code. From our internal testing, we found it to be extremely rare that GitHub Copilot suggestions included personal data verbatim from the training set. In some cases, the model will suggest what appears to be personal data – email addresses, phone numbers, access keys, etc. – but is actually made-up information synthesized from patterns in training data. For the technical preview, we have implemented a rudimentary filter that blocks emails when shown in standard formats, but it’s still possible to get the model to suggest this sort of content if you try hard enough.

                  It’s not clear yet if/how personal data within AI systems is to be treated under GDPR, but if a system outputs real personal data, it surely is governed by it. Synthesised data on the output side should be okay, though.

                  1. 2

                    I don’t see how it can be considered fair use.

                    1. 1

                      Ooh, imagine a class-action suit on behalf of everyone contributing to GPL-licensed code.

                  2. 5

                    Somewhere a lawyer living with a software engineer is dreaming of astronomical billable hours.

                  3. 6

                    That was an excellent read. I’d even expand on the remark about lots of small open-source repositories having “one developer and no eyeballs”: with Copilot, a lot of new code will have zero developers and no eyeballs. Code will be committed that has been written by no one and cursorily reviewed by one person at best.

                    1. 5

                      While technically impressive, Copilot seems to me like a solution looking for a problem. If all you have is GPT, then every problem looks like a completion problem with a prompt.

                      On the flip side, there is already so much code in the world today; someone, someday, needs to read it, if it’s relevant. People forget that the ratio between writing and reading code is maybe 1:9. I feel writing code is not the bottleneck.

                      1. 5

                        I’ve been using TabNine. Most of the time I feel like it is saving me typing, especially when it predicts repetitive stuff.

                        const thisThing = getThisThing()
                        // it guesses this next line
                        const thatThing = getThatThing()

                        Most of the time I feel like it’s an assistant, a copilot. I missed it when they went non-free, so I asked work to buy me a license for a year. Not great in that respect. I hope Copilot is great. I hope I get an AI pair one day.

                        1. 5

                          The fatal Boeing 737 MAX8 crash involving Ethiopian Airlines in 2019 was the result of AI gone wrong.

                          This seems incorrect. From the linked article:

                          Though the Boeing 737 MAX 8 does not include elements that might be considered in the AI bailiwick per se

                          1. 1

                            Thanks. I changed the link to the Wikipedia page describing the MCAS system. While the author of the article does not consider it AI, I do.

                            1. 4

                              Not sure why Wikipedia makes it any more correct. There is no reference whatsoever to AI on the page. Saying that the 737 MAX 8 accident was caused by AI, followed by mentions of black boxes and learning systems in the context of an article about some deep-learning tech, is misleading. It could be interpreted as if a trained model were part of the MAX 8 crashes, which is not the case.

                              But in any case, the 737 MAX 8 could have been running a trained AI and it wouldn’t matter. Saying the plane crashed because of this black box is a very shallow analysis of the issue. Reading the MAX 8 analyses shows that the issues were much more systemic: lapses in process, training, and safety. A similar conclusion can be drawn from the analysis of another deadly software issue, the Therac-25. If anything, this should be a counter-argument to your point. Software is going to have bugs, whether written by a human or an AI, but that is not an excuse for having critical failures.

                              1. 3

                                That’s a fair criticism of linking the 737 MAX 8. But I still consider it AI. The fact that GPT-3 uses a deep learning system isn’t material. Both GPT-3 and the MCAS system seem to be black boxes as far as the users are concerned.

                                I don’t believe it’s a counter-argument to my point because my point is that Copilot is a systemic risk. It also doesn’t require any sort of training process, safety checking, etc.

                                My point was that people are bad at interacting with black boxes. If the black boxes are good, then people will be less careful with them, and therefore when mistakes are made, they will be bad. Copilot being good is worse than if it required the programmer to make changes to its suggestions every time, because complacent programmers will let bugs slip in. With copy-pasting code from Stack Overflow, you usually have to make changes to the software and therefore read it more carefully. If Copilot is really good and its output compiles more often than not, then programmers will be less compelled to read the code.

                                In fact, Copilot requires a skill that isn’t taught at all: reading code! Reading code is the most underdeveloped skill programmers have.

                                1. 5

                                  You’re right about interaction with black boxes, but you can’t call every black box an A.I. MCAS is presumably implemented as fuzzy logic, and it couldn’t have been a black box to its authors. The fact that it wasn’t well documented for users doesn’t make it A.I.

                                  A.I. is a loose term, but I think most people will agree it takes more than a bunch of if statements, and it doesn’t mean just any decision taken by a computer.

                                  1. 2

                                    The term A.I. is thrown around loosely these days. The artificial in Artificial Intelligence is about constructing an artifice that acts intelligently. It’s about making a man-made process that responds in a good way. MCAS fits the bill: it is meant to react intelligently when sensors tell the system the plane is going outside the norm. Of course, you might not be impressed with an A.I. unless it performs as well as or better than a human. But there is only one artificial mechanism that currently performs better than a human, and it’s called science. Science is the best A.I. we have. As Alan Kay says, “Science is a better scientist than any scientist”.

                                    1. 4

                                      So do you consider PID controllers AI? Where does the line get drawn?

                                      1. 4

                                        You’re just losing your audience by using your definition of A.I. stretched beyond usefulness. A thermostat with if temperature < desired { heat() } behaves “intelligently”. If that’s an A.I., then the term has no useful meaning distinguishing it from just basic code. AFAIK MCAS wasn’t much smarter than a thermostat, except it used angle of attack as input, and nose down as output.

                                        1. 2

                                          That’s the issue with the definition of AI. The AI effect keeps moving the goalpost. Certainly someone 300 years ago would consider a thermostat intelligent, if not magical. I joke that if we ever get an AI as smart as people, we will discount it and say “yeah, well people aren’t really that smart anyway!”

                                          I’m just trying to counter the AI effect ;-)

                                          1. 3

                                            There’s a fundamental difference between software that follows comprehensible rules divinable from its source code and software which does what it does based on an opaque neural net trained on so much data that its trainers (and even the people who designed the way to train it) have literally no idea why it did what it did. There are serious ethical, political and societal questions raised by the application of this latter type which will have drastic effects on real people’s lives. Deliberately blurring that distinction seems dangerous to me.

                                            1. 2

                                              We are building systems that interact dynamically with the real world. Whether we are building systems knowingly with if/else or whether the if/else is encoded in a neural net, the ethical outcome is the same. We have a responsibility in making systems that interact well with people. Neural nets make that harder.

                                              Example, it would be unethical for me to put in a line of code “if woman then lower credit score”. It would also be unethical for me to release a neural net which does the same thing. What’s worse about the second one is simply me, the engineer, not knowing what it’s doing.

                                              But the person interacting with the system doesn’t know how I built it. They just know it’s acting in a really bad way. To the person interacting with the system, whether it’s a neural net or an if/else, it’s a black box either way (unless they have the source code, which is why free software is so important).

                                              What a system built with neural nets does is increase the unpredictability of the system. It’s a more fragile system. But our ethical responsibilities don’t change when we use neural nets. They just make building good systems harder.

                                              1. 1

                                                OK, sure, of course the ethical responsibilities are the same, and of course using software whose decision tree is illegible to humans makes shouldering them harder, if not impossible. To me that means not only “the risk we take by having neural nets make these decisions is greater than with software based on explicable rules”, but also “so don’t blur the two!” - but for some reason that I can’t fathom, you seem to be trying to blur them on purpose, and I honestly can’t make out what point you’re trying to support by doing so. Thing is that’s probably because it’s all got out of context now, lots of people chipping in on the same thread with different points, so maybe let’s just let it pass. Wish you well :-)

                                                1. 2

                                                  My whole original post was how dangerous Copilot is and that it will help propagate bugs into new software. I’m not trying to blur the two in that way because that distinction wasn’t even made. Everyone else is making some distinction between AI and not AI.

                                                  But if you read my post carefully, it isn’t about neural nets or machines that think like people at all, but computer-human interaction. People here aren’t satisfied with my definition of A.I. So we are just having a discussion about that.

                                                  Even with a thermostat, if it doesn’t let you know the target temperature, it could cook your cat alive while you are away. That point of interaction is critical for the safety of the system. But everyone keeps focusing on AI.

                                                  I mean, it’s still an interesting discussion to talk about “What is AI” I guess.

                                            2. 2

                                            I know the common A.I. definition keeps shifting towards things we can’t do well yet, which is even more problematic, because it widens the gap between what your readers understand as A.I. and what you wanted A.I. to mean. By “countering” it all the way back to ELIZA, you’ve just failed to communicate and distracted from the article’s subject.

                                              1. 1

                                                This would be true if I was just talking to a community of regular folks. But this is a community of professionals who should understand this stuff. I think the issue today is human-machine interaction is downplayed. Maybe people here forgot they are building machines that interact with people, animals, plants, and the real world? All programming is a form of making an AI if you really think about it. A user should expect software to respond intelligently to them and as far as I know, all software is artificial.

                                          2. 1

                                            MCAS is just an extension of control systems found on airplanes, cars, and many other vehicles today. Most aircraft, even general aviation aircraft, use fly-by-wire to move their control surfaces. Is that AI? Is an older aircraft’s trim wheel an AI? Where do we draw the line? Is a thermostat’s PID an AI? An automatic transmission car’s torque converter? An induction stove’s voltage controller?

                                            1. 2

                                              Ants have something as simple as a PID controller for finding food, maybe even simpler. Ants walk around randomly and when they find food, they return to their colony laying down a chemical trail. When another ant happens upon a chemical trail, they will follow it and if they find food, they will reinforce the trail.

                                            It’s clear each ant has a form of intelligent behavior (though simple), and the collection of ants produces what’s called an ambient network of intelligent objects. It’s a natural intelligence. Ants adapt to a dynamic environment. That’s what intelligence is! (Fukada). In fact, an artificial version of this intelligence, called ant-colony optimization, is used in optimization problems that are otherwise intractable.

                                            A simple thing like a PID controller is a form of artificial intelligence. A collection of PID controllers is even more intelligent. PID controllers themselves require tuning. Certainly people consider a perceptron to be in the category of AI. A PID controller can be implemented as a multi-layer network (most functions can be). So if that’s the case, why isn’t a PID controller, which is tunable (has constants that need to be tuned) like weights in a network, AI? It’s even possible our own brain neurons have PID-like mechanisms, because our brain anticipates (predicts) values and corrects its model (learns) in a feedback loop.
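
                                            To make the comparison concrete, a discrete-time PID controller really is just a few lines of tunable arithmetic. A minimal sketch (the gains and the toy plant are illustrative, not tuned for any real system):

                                            ```javascript
                                            // Minimal discrete-time PID controller. kp, ki, kd are the tunable
                                            // "weights" (like network weights, they need tuning); dt is the step.
                                            function makePid(kp, ki, kd, dt) {
                                              let integral = 0;
                                              let prevError = 0;
                                              return function step(setpoint, measured) {
                                                const error = setpoint - measured;
                                                integral += error * dt;                      // accumulate past error
                                                const derivative = (error - prevError) / dt; // anticipate the trend
                                                prevError = error;
                                                return kp * error + ki * integral + kd * derivative;
                                              };
                                            }

                                            // Toy plant: a value that moves by the control output each step.
                                            const pid = makePid(0.8, 0.2, 0.1, 1.0);
                                            let value = 0;
                                            for (let i = 0; i < 50; i++) {
                                              value += pid(10, value); // drive value toward the setpoint of 10
                                            }
                                            ```

                                            The point isn’t that this is “smart”; it’s that it predicts and corrects in a feedback loop with tunable constants, which is exactly the resemblance to learned weights described above.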

                                            The problem is, people have this idea that intelligence is what humans do. If simple ants are intelligent, and a collection of ants is even more intelligent, then certainly man-made processes that are similar are a form of artificial intelligence.

                                            AI does not require neural nets and deep learning, folks.

                                              1. 2

                                              A simple thing like a PID controller is a form of artificial intelligence. A collection of PID controllers is even more intelligent. PID controllers themselves require tuning. Certainly people consider a perceptron to be in the category of AI. A PID controller can be implemented as a multi-layer network (most functions can be). So if that’s the case, why isn’t a PID controller, which is tunable (has constants that need to be tuned) like weights in a network, AI?

                                                I understand this. I’m well aware of how all of these systems work; they’re all quite deeply intertwined mathematically, and yes the definition of “AI” as used in popular imagination is fuzzy.

                                              AI does not require neural nets and deep learning, folks.

                                                I’m having trouble understanding the argument here and this feels a little like moving the goalposts. Is your opposition to Copilot that it’s using “intelligence”, like the PID you described? Is it about being an unexplainable black box, also like a PID, but unlike many other forms of “intelligence”, say a decision tree? Or is your opposition about neural nets and deep learning, specific forms of AI?

                                                More importantly, why is this the line at which there is opposition? What makes the line you’re discussing here the important line to stop intelligence at and not earlier or later on this intelligence curve?

                                                1. 2

                                                My argument is less about AI and more about human-machine interaction where the machines are really good. My opposition is not to AI. It’s not “AI is bad, we should not use AI”. My opposition is that Copilot is too good, and will therefore create a lot of complacency. And Copilot learned not only the good stuff (algos, models, semantics) but also the bad stuff (bugs), which it will happily include in your auto-completed code.

                                                  My argument is that people are really lazy and bad when it comes to monitoring (in this case, code reviewing) intelligent systems.

                                                  And with programming it’s even worse, because the monitoring is basically a code review. And programmers are not taught to read code well. It’s not a skill that is rigorously taught. Many even hate it.

                                2. 5

                                  Since “Copilot isn’t magic and will perform worse than a human coder on average,” I wonder what the ideal language to use with Copilot is?

                                  Something simple like Go, easier to read, understand, and debug? Or something complex like Haskell, with more static checking?

                                  Edit: Also see this other tool https://lobste.rs/s/5qzbbq/wingman_for_haskell_focus_on_important

                                  1. 4

                                    I’m thinking golang is probably the sweet spot for something like this, because it’s so repetitive.

                                    1. 1

                                      This is a very good question. Presumably each line of code in Haskell carries more information than a line in Go. Since the latent space is likely the same for either language (the model transforms code into the same latent space), I would guess Haskell would win, because more information is encoded in that latent space. Though if that space is small and both fill it easily (the model handles only a small amount of context), then it won’t matter.

                                    2. 4

                                      My intuition was that if this is trained on public code, it will end up being slightly worse than an average programmer. Will be curious to try it out. It would probably be good for remembering the structure of APIs, more like a faster version of a quick-reminder Google search.

                                      1. 3

                                        I think if this AI ever becomes self-aware, it will be disgusted when it finds its true origin: a million monkeys at a million typewriters.

                                        1. 2


                                          1. 1

                                            Makes me want to spam GitHub with really, really bad buggy code. Imagine building a code generator that makes code that compiles but doesn’t do anything. Piss off that self-aware AI even more.

                                            1. 2

                                              You could call it an adversarial code attack. This is already happening with images.