1. 53

  2. 23

    I do love chatGPT but what I guess I’ve not seen yet is people pointing out it “plagiarizes” in the same sense as those art generation bots that caught a similar hype wave a month or two ago. Ask it how you would use Z3 to check two firewalls for equivalence, for example, and it spits out a slightly (although competently!) modified version of the blog post that I wrote on the topic: https://ahelwer.ca/post/2018-02-13-z3-firewall/
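
    For anyone who hasn’t seen that post: the core idea is a satisfiability check, and a minimal sketch with the z3 Python bindings looks roughly like the following. The two toy rule sets here are made up for illustration, not taken from the post.

      from z3 import BitVec, Or, And, Solver, sat

      # Model a packet as symbolic header fields (a toy subset).
      src_ip = BitVec("src_ip", 32)
      dst_port = BitVec("dst_port", 16)

      # Two "firewalls", each a predicate over the packet: True = accept.
      # These rule sets are invented for the example.
      def firewall_a(ip, port):
          return Or(port == 443, And(ip == 0x0A000001, port == 22))

      def firewall_b(ip, port):
          return Or(port == 443, port == 22)

      # Equivalence check: ask Z3 for a packet the two firewalls disagree on.
      s = Solver()
      s.add(firewall_a(src_ip, dst_port) != firewall_b(src_ip, dst_port))

      if s.check() == sat:
          print("Not equivalent; counterexample:", s.model())
      else:
          print("Equivalent for all packets")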

    I guess since my blog is licensed under CC-BY-SA this is a violation of that license (and all the material it obviously read from Wikipedia) but I find it difficult to be too mad about it. I’ll probably start being mad when they use all this appropriated human endeavor to get it to convince you to buy certain products or hold certain political views.

    1. 14

      The same way that Copilot “plagiarizes.”

      Cue the anthropomorphising of software by claiming it has “learned” the same way as a human.

      1. 11

        I suspect that language models are very good plagiarists because every word, sentence, paragraph travels through the following high level translation: words => concepts, concepts => words. All the words that map to the same concepts across a large scale internet / wikipedia / stackoverflow / etc crawl act as votes, of a kind, for those concepts and their tangled web of relationships to other concepts. Then these very, very good natural language generation algorithms can take the concepts back down to words again, in fluent sentences and paragraphs. But the words generated by chatGPT won’t be the exact words read/plagiarized by chatGPT.

        Think of it this way. Let’s say you read 3 essays describing the themes of The Great Gatsby. Then, two hours later, someone asked you, “write an essay describing the themes of the Great Gatsby”, but you no longer had access to those 3 essays you read. You’d probably write a plausible essay describing the themes of The Great Gatsby, even if you yourself had never read The Great Gatsby. Were you plagiarizing the 3 essays you read a few hours ago when you did it? Now imagine the same thing, but where it is not 3 essays but millions, and it’s not relying on human memory/brain, but on a very purpose-built set of content indexing algorithms and a very sophisticated natural language generator.

        One way I’ve been thinking about it is that the algo has, in a sense, turned plagiarism into an art form. The algorithm’s goal is to not get caught merely returning a source document that has the answer. It also knows how to synthesize information from many different source documents. As neat a trick as this is, it is hard for me to see this not creating a long-term seed vs leech problem for internet content.

        1. 6

          I like your take.

          One way I’ve been thinking about it is that the algo has, in a sense, turned plagiarism into an art form.

          It has always been an art form: On Bullshit. Skimming source material and quickly coming up with an intelligent-sounding essay that you didn’t believe – this was the skill in college, and you had to be smart to do it well. It is the skill of mimicking a way of talking, a way of thinking… at bottom, the skill of understanding the markers and heuristics that others will use to judge your fluency.

          I feel this is the skill ChatGPT possesses in abundance. Importantly, it is a different skill than true fluency (as the parent article attests).

        2. 4

          I do love chatGPT but what I guess I’ve not seen yet is people pointing out it “plagiarizes” in the same sense as those art generation bots that caught a similar hype wave a month or two ago.

          Woah, I haven’t heard of this. Do you mean like the training set is plagiarized, or that it actually outputs plagiarized works?

          EDIT: I mean the art generation bots.

          1. 4

            The remainder of that paragraph is an answer to your question.

            1. 3

              … No it isn’t?

              1. 2

                Oh, you were asking about the art.

                Yes the training set clearly includes modern artists who did not license their art for inclusion in the dataset, and you can ask it to imitate their style.

                Getting an exact reproduction is unlikely because of the way the information is compressed.

                1. 1

                  Is imitation the same as plagiarism? I do not think it is.

                  Can you name specific artists whose work was included in those training datasets without appropriate license? I know Greg Rutkowski is one of the more imitated artists, but much of his work is uploaded to ArtStation which has a rather broad reuse license. Greg is also very good about adding alt-text which helps train the models.

                  Your Content may be shared with third parties, for example, on social media sites to promote your content on the Site, and may be available for purchase through the Marketplace. Accordingly, you hereby grant royalty-free, perpetual, world-wide, licences (the “Licences”) to Epic and our service providers to copy, modify, reformat and distribute Your Content

                  Although Emad claims that artist style imitation has more to do with the CLIP language model than the LAION image dataset.

          2. 2

            or hold certain political views.

            I do wonder if something like chatGPT would develop a coherent political agenda, or if it would just regurgitate whatever was in its training data. Probably the latter.

            1. 5

              Developing a coherent political agenda that isn’t a regurgitation of its training data is a very hard task for a human.

          3. 14

            I do wonder if this is going to end up being like self-driving cars, where “an AI illustration with the correct number of fingers and normal looking eyes” is always just five years away…

            1. 3

              Self-driving cars are always 5 years away because of regulatory hurdles. Governments don’t want to have to deal with totally rebuilding the road laws from scratch, and many issues like liability are very much in question. People have grown accustomed to the fact that many people die in traffic accidents every year, so there is no outcry, but one person being killed by a self-driving car will make headlines all over the world and the regulators know it. Better to let thousands of people die in a politically acceptable way than to save most of them and then deal with a public outcry over the few failures.

              None of that really applies to chat or art bots.

              1. 18

                I don’t think better-than-human driving has been demonstrated yet. Until then the “regulatory hurdle” is about not murdering (more) pedestrians by beta software.

                Better-than-human safety has been claimed, but stats like “crashes per distance driven” are biased by autopilot being mostly used to drive in a straight line on a pedestrian-free and cyclist-free highway, and not navigating road crossings where most crashes happen.

                These systems disengage in difficult conditions, so for self-driving stats the easy roads count towards self-driving succeeding, and the difficult roads count towards humans failing.

                1. 9

                  Regulation is part of it, but it’s not the whole story. There have been plenty of pilot programs, and most cars sold now will have lane assist and whatnot. It’s just really hard to drive in a chaotic city environment.

                  1. 4

                    Better to let thousands of people die in a politically acceptable way than to save most of them and then deal with a public outcry over the few failures.

                    Ironically it goes both ways. It’s easier for regulators to approve self-driving cars than to mandate breathalyzers in every car, which would cut the number of traffic fatalities by an order of magnitude. A lot of the remaining accidents happen in bad driving conditions, where self-driving cars do much worse than humans.

                2. 10

                  Thanks for taking the time to pick at these. I’m getting a little sick of it dominating my feeds, but it’s still great to see more critical breakdowns of what it’s getting right and wrong.

                  Was chatting w/ a friend about the ~math screenshots floating around yesterday and it occurred to me that this might be understandable as emergent savant-ish behavior caused by leaning really hard into a limited subset of vaguely neural/human qualities.

                  A really interesting experiment in what you might get if you could set most of a person aside, and 100x a few bits.

                  I think it underscores what an absolutely incredible bootstrap written language is, and I’m most-intrigued by the prospect that creative research will be able to leverage these models to better understand ourselves (i.e., the role language does and can/should play in how we learn, how we mis-learn, etc.).

                  1. 8

                    I think you’re still misunderstanding. Just like Stable Diffusion, ChatGPT has creativity. It has arguably superhuman creativity. It’s endlessly creative. What it isn’t is discerning and reflective. It thinks, but it doesn’t think about what it is thinking. It is, in other words, unconscious.

                    These justifications resemble what a programmer may reply when set a programming challenge in a dream.

                    1. 10

                      I think consciousness in this context is pretty much always a meaningless buzzword.

                      1. 2

                        I find it frustrating that people, even while understandably avoiding giving a definition of consciousness, are also unwilling to give a general sense of what they mean by consciousness, since it’s pretty clearly a multivalent term. Conscious could refer to being awake vs. being asleep, or being a human vs. being a monkey, or being a human adult vs. being a human baby, or being an unreflective human vs. being an introspective human, or having active awareness of a stimulus vs. having a stimulus without active awareness of it, or … In any event, it’s worth clarifying the ballpark of what you’re talking about before getting into a big debate about it.

                        1. 2

                          (In this case, I mean your latter two definitions.)

                      2. 7

                        What do you mean by the following terms:

                        • creativity
                        • discerning
                        • reflective
                        • thinking
                        • unconscious
                        1. 7
                          • creativity

                          ChatGPT can, given a prompt, create novel output that is constrained to be functional and even good

                          • discerning

                          ChatGPT cannot decide, on its own while answering, that its output is inadequate and must be fixed. It can sometimes decide this given prompting, but not reliably.

                          • reflective

                          ChatGPT cannot independently notice that it is giving an answer with a certain property, like errors. It cannot interrupt itself answering and add ongoing corrections. It does not gather information on its own answer on its own; it doesn’t “hear what it’s saying.” (It does, of course, literally hear what it’s saying, it simply doesn’t gather information on it to the same depth we do.)

                          • thinking

                          The thing humans do with their brains

                          • conscious

                          A combination of reflectivity and self-awareness; having a space in which you symbolically model objects, including yourself.

                          1. 3

                            Thank you!

                            In this rubric, I’ll argue that ChatGPT is more discerning than you give it credit. The capacity to be creative and to discern are one and the same.

                            A random number generator hooked up to print a page of ASCII text is very novel, but the chances of a functional or good output are close to nil. Meanwhile you or I, using the same random number generator, can create a novel, functional, and maybe even good output almost every time. I’d say the difference is a capacity to discern the quality of our output, through a decision-making process. n.b. AFAIK there’s no evidence to suggest that human minds and ChatGPT’s mechanisms to discern are in any way similar. (most charitably I’d compare a hawk to a Cessna)

                            I’ll also argue that ChatGPT is even less reflective than you described. It can’t “interrupt” because it doesn’t generate text linearly; the UI showing words typed out is intentionally misleading. It doesn’t even “literally” hear (read) what it’s saying. I want to point out we’ve anthropomorphised two evaluations of the same function and stepped into the classic sci-fi question of whether your clone is also you; “it” doesn’t review its output. Later evaluations are made with the prompts and outputs from the earlier evaluations and a new prompt following— so “it” only “hears” itself when literally prompted.

                            Lastly, I argue it’s also more reflective than you give it credit! ChatGPT very literally has higher-dimensional symbolic models. But it’s modelling text. You and I both model text (probably not as well as ChatGPT) as a higher-dimensional concept built upon all of our senses.

                            1. 3

                              A random number generator hooked up to print a page of ASCII text is very novel, but the chances of a functional or good output are close to nil.

                              Yes, hence the constraint of quality. There’s an axis between noise and exact reproduction, and creativity lies somewhere between.

                              AFAIK there’s no evidence to suggest that human minds and ChatGPT’s mechanisms to discern are in anyway similar.

                              There is a minimal constraint of similarity in that ChatGPT has to predict human token completions; the function it learns to (in theory) predict is human cognition.

                              You and I both model text (probably not as well as ChatGPT) as a higher dimensional concept built upon all of our senses.

                              Not sure this is actually true. Text co-activates our senses, but I don’t know if it’s composed of it. I think human text parsing may be its own thing, cognitively.

                              I’ll also argue that ChatGPT is even less reflective than you described. It can’t “interrupt” because it doesn’t generate text linearly…

                              Hang on this is new to me. ChatGPT doesn’t generate text linearly? Every other GPT generates text linearly, is it doing something else here?

                              1. 2

                                There’s an axis between noise and exact reproduction, and creativity lies somewhere between.

                                Does creativity arise from the quality of discernment?

                                […] the function [ChatGPT] learns to (in theory) predict is human cognition.

                                AIUI it predicts words, as trained from a very large corpus (~the Internet).

                                Why would it (in theory) predict human cognition? Human cognition involves so much more than words.

                                Text co-activates our senses, but I don’t know if it’s composed of it.

                                You don’t think we (roughly) have a shared concept of text? We can both manipulate text at the symbolic level. Children play word games like: taking turns adding one word at a time to make a sentence, reordering letters to form a word, reordering words to form a sentence, sorting words by various forms, speaking new “languages” through simple symbolic rules like Pig Latin or Double Talk, making secret codes via substitution and rotation ciphers, … and so much more.

                                Maybe I’ve misunderstood and you’re saying language is innate and doesn’t arise from our senses. Chomsky would approve. I … don’t.

                                ChatGPT doesn’t generate text linearly? Every other GPT generates text linearly, is it doing something else here?

                                The GPT model gives a probability distribution for the next token. That’s linear.

                                But there are many decoding strategies. Many aren’t linear at all, using techniques like backtracking and non-determinism. Those algorithms’ goal is to use GPT to produce a “good” body of text. A lot of the quality in a chatbot like ChatGPT comes from the decoding strategy.
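
                                To make that concrete, here’s a toy sketch (not ChatGPT’s actual decoder, which isn’t public) contrasting greedy decoding with beam search, one common non-greedy strategy. The “model” is a hard-coded table standing in for a real next-token distribution:

                                  from math import log

                                  # Toy "model": next-token probabilities depend only on the last token.
                                  MODEL = {
                                      "<s>": {"the": 0.55, "a": 0.45},
                                      "the": {"cat": 0.40, "dog": 0.35, "<end>": 0.25},
                                      "a":   {"dog": 0.90, "<end>": 0.10},
                                      "cat": {"<end>": 1.0},
                                      "dog": {"<end>": 1.0},
                                  }

                                  def greedy(tok="<s>"):
                                      # Strictly linear: commit to the single most likely token each step.
                                      out = []
                                      while tok != "<end>":
                                          tok = max(MODEL[tok], key=MODEL[tok].get)
                                          out.append(tok)
                                      return out

                                  def beam(width=2):
                                      # Keep several candidates alive; pick the best finished sequence.
                                      beams, done = [(["<s>"], 0.0)], []
                                      while beams:
                                          step = []
                                          for seq, score in beams:
                                              for tok, p in MODEL[seq[-1]].items():
                                                  cand = (seq + [tok], score + log(p))
                                                  (done if tok == "<end>" else step).append(cand)
                                          beams = sorted(step, key=lambda c: c[1], reverse=True)[:width]
                                      return max(done, key=lambda c: c[1])[0][1:]

                                  print(greedy())  # ['the', 'cat', '<end>'] - locally best choices
                                  print(beam())    # ['a', 'dog', '<end>']   - globally more likely sequence

                                Real decoders add sampling, temperature, top-p and so on; the point is just that the emitted text isn’t always the word-by-word greedy choice.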

                                1. 1

                                  Does creativity arise from the quality of discernment?

                                  I don’t know. Sounds right though.

                                  Why would it (in theory) predict human cognition? Human cognition involves so much more than words.

                                  I’m not aware of any element of human cognition that does not causally connect fairly directly to the composition of text. The presumption is that with enough data, human cognition is the most compact compression of our textual corpus, as it is the primary generator.

                                  You don’t think we (roughly) have a shared concept of text? We can both manipulate text at the symbolic level. …

                                  Right, but this concept is accessible from the arrangement of the symbols. There is a case to be made that an AI can’t really get access to senses - qualia, redness - from just seeing people talking about “redness”, though I don’t really buy it myself. But a quality of human language that isn’t accessible from sequences of textual tokens would be surprising on a whole other level.

                                  Maybe I’ve misunderstood and you’re saying language is innate and doesn’t arise from our senses.

                                  Personally I think language is sort of innate, in that our brains are predisposed to pick up “something like” language; I think it occupies a special position in our cognition. But it’s certainly linked to sensory experiences very closely.

                                  But there are many decoding strategies. Many aren’t linear at all, using techniques like backtracking and non-determinism.

                                  Sure, but AIUI, if you’re looking at an output, every token that got actually generated was overwhelmingly causally determined by the tokens before it. The point is that the model can’t do cognitive work without generating tokens.

                                  We can think in advance without saying anything. GPT must generate at least one “syllable” per mental step. IMO, this explains about half of its weirdness as a general reasoner.

                                  1. 2

                                    Thank you for this exchange, it’s been delightful!

                                    Let me see if I can paraphrase your views:

                                    (Please prefix “human” before most of the nouns in the following)

                                    1. Cognition can be well represented in the composition of text
                                    2. Language can be well represented in text
                                    3. (Presumably?) Experience can be well represented in language
                                    4. Therefore, given enough inputs (representing experience) and outputs (representing cognition), we can train a function to think

                                    Right now:

                                    • GPT can create novel content (creativity), but it lacks meta-cognition like self-criticism
                                    1. 1

                                      Sounds right. I would be careful with 1: it’s not that cognition has an innate correspondence to text, so much as that the text corpus that GPT is trained on happens to closely depend on human cognition. In other words, if you changed aspects of human cognition, our textual output would also change, in a way that allows recovery of sufficient mechanical details to be predictive.

                                      It’s easy to say here: “but what about the parts I don’t talk about?” I doubt there is such a thing as a universally unremarked human feature, though many important features go undercommented - consider how we only in the past decade cottoned on to the fact that many humans have no internal narration or visualization. It just never came up in broad awareness.

                                      I think the “right now” is difficult. GPT can obviously criticize itself, when prompted, but it seems to have difficulty motivating itself to criticize itself - it lacks any intentional stance. Which makes sense, because it’s a token predictor, not an agent - however, even when told to predict agents, it does a poor job of this. (I have never seen GPT correct itself in mid-generation unprompted.) This may suggest that self-correction, or generally intentional stances, is among the worst-represented features of human cognition in the corpus.

                                      <tinfoil hat>Alternately, it suggests most humans are not actually conscious, which is why features that require consciousness are poorly trained for.</tinfoil>

                                      1. 4

                                        Alternately, it suggests most humans are not actually conscious, which is why features that require consciousness are poorly trained for.

                                        Or that human consciousness is not easily expressed in text found in large amounts on the open internet.

                                        1. 3

                                          It’s one thing to be able to encode experience in textual tokens; it’s another thing to actually do so. I lightly suspect that, for the most part, we don’t.

                                          Moreover, written text is a tiny subset of textual tokens. I strongly suspect we’ve encoded only a biased fraction of the human experience in text. If a trained function achieves cognition, I doubt it will resemble human cognition.

                                          Moving back:

                                          if you’re looking at an output, every token that got actually generated was overwhelmingly causally determined by the tokens before it. The point is that the model can’t do cognitive work without generating tokens.

                                          We can think in advance without saying anything. GPT must generate at least one “syllable” per mental step. IMO, this explains about half of its weirdness as a general reasoner.

                                          By this definition, what process isn’t linear? It’s like you’ve pointed at the Arrow of Time.

                                          Moreover, I strongly suspect you’re discounting the importance of the decoding strategy. The character of its output is dramatically changed by the choice of algorithm. GPT is incoherent with a least-likely-word decoding strategy.

                                          I once met a journalist who told me that human thought is in linear narrative form— stories. Similarly, I keep meeting programmers who tell me human thought is in linear tokens— text. Neither matches my own personal experience or observations of having a multitude of simultaneous coherent thoughts that must wait to be ordered, refined, merged, and edited before making it out my mouth or on to paper. 🪦 all the feelings that never made it out because there were no words.

                                          <tinfoil hat>Alternately, it suggests most humans are not actually conscious, which is why features that require consciousness are poorly trained for.</tinfoil>

                                          I strongly suspect we’re rarely conscious; it’s an expensive activity.

                                          Also: I think qualia are BS.

                                          Also: we don’t have free will.

                                          😉

                                          1. 2

                                            Also: we don’t have free will.

                                            How so?

                                            1. 2

                                              Similar to @FeepingCreature, I think the commonplace concept is useful.

                                              What’s the evidence for the philosophical concept? The case against has causal determinism, post-hoc rationalisation, left-brain interpreter, and so on.

                                              1. 1

                                                As I noted to FC, the majority of philosophers are compatibilists, so “the philosophical concept” is either the compatibilist concept or some theory-neutral description. Neither of these seem to lend themselves to supporting the statement “We don’t have free will”.

                                                1. 2

                                                  You’ve replied to me arguing against his statement re: philosophers. Don’t confuse us.

                                                  I subscribe to neither compatibilist free will nor compatibilist morality.

                                                  1. 2

                                                    I replied with reference to his statement because you began by saying you had a similar view. The “commonplace concept” that you approved of is just the compatibilist concept, and most philosophers hold to that concept of free will, so I’m not sure where you diverge and come to the conclusion that we don’t have free will.

                                                    Compatibilism isn’t a moral theory, so “compatibilist morality” doesn’t refer to anything.

                                                    1. 2

                                                      I imagine you’re not clear because you’ve asked no questions to follow-up on the reasons I gave.

                                                      Compatibilism isn’t a moral theory, so “compatibilist morality” doesn’t refer to anything.

                                                      Free will is a moral theory. And AIUI Compatibilism is a definition of free will that both retains moral responsibility and is compatible with determinism.

                                            2. 1

                                              By this definition, what process isn’t linear? It’s like you’ve pointed at the Arrow of Time.

                                              Yes, but GPT is more than linear, it’s steady. Humans can pause and think; we can crank our cognition purely internally. It may be possible to extend this concept to GPT, but it’s not how it works by default.

                                              Moreover, I strongly suspect you’re discounting the importance of the decoding strategy. The character of its output is dramatically changed by the choice of algorithm. GPT is incoherent with a least-likely-word decoding strategy.

                                              I discount the decoding strategy primarily because it does not participate in training. I think whatever decoding strategy our eventual AGI uses, it will have to be at least somewhat under its active control.

                                              I strongly suspect we’re rarely conscious; it’s an expensive activity.

                                              Also: I think qualia are BS.

                                              Also: we don’t have free will.

                                              I actually sort of agree! I think the mainstream philosophical views on qualia and free will are largely hogwash. But I do consider myself a compatibilist: I think free will as a commonsense concept has value, even though philosophical freedom does not. Similarly, though I don’t believe in epiphenomenalism, I do think that sensory data is processed in something like tokens in the brain, for which the word qualia seems appropriate.

                                              And yes, I do also think consciousness is overrated. But I believe GPT nicely demonstrates that while consciousness is employed by the brain in limited quantities, it is nonetheless load-bearing.

                                              1. 3

                                                I think the mainstream philosophical views on qualia and free will are largely hogwash. But I do consider myself a compatibilist: I think free will as a commonsense concept has value, even though philosophical freedom does not.

                                                This is a confusing statement. The majority of philosophers are compatibilists, so “the mainstream philosophical views on free will” is compatibilism and “philosophical freedom” is identified by most philosophers as compatibilist free will. Presumably you do not mean to call your own view “hogwash” or deny that it has value as a concept?

                                                1. 2

                                                  Is it the majority now? That’s good to hear.

                                                  It’s still sort of like saying that “the majority of climate scientists believe that the Earth is warming.” You’d hope for more than that, especially given that the concept of philosophical free will is outright incoherent.

                                                  1. 2

                                                    I don’t think I agree with you that libertarianism about free will is incoherent, but that would be a bit far off topic to get into.

                                                2. 2

                                                  Humans can pause and think; we can crank our cognition purely internally.

                                                  We pause our cognition? Then how do we decide to unpause our cognition?

                                                  I discount the decoding strategy primarily because it does not participate in training. […] I do think that sensory data is processed in something like tokens in the brain

                                                  GPT doesn’t use tokens in its processing; it’s trained on embeddings in a vector space. The embedding function is a model too; the result is not a simple sequence of word vectors.

                                                  Why would we use tokens? We’re trained on neuron activations from a variety of sensory organs…

                                                  Now I strongly suspect you’re discounting the encoding strategy too! 😝

                                                  1. 1

                                                    We pause our cognition? Then how do we decide to unpause our cognition?

                                                    We can pause our speech. GPT can not. It has to “talk out loud” at a rate of one token per thought.

                                                    GPT doesn’t use tokens in its processing; it’s trained on embeddings in a vector space. The embedding function is a model too, the result is not a simple sequence of word vectors.

                                                    Yeah but aiui each embedding still corresponds 1:1 to a BPE input token, just in a different space.

                                                    1. 2

                                                      We can pause our speech. GPT can not. It has to “talk out loud” at a rate of one token per thought.

                                                      It also applies roughly the same amount of effort for each thought. Perhaps that’s why it seems so inconsistently intelligent. Some questions are well within its capabilities, and some questions require extra cycles. Perhaps it would seem more intelligent if it was allowed to say “umm” and “hmm.”

                                                      […] aiui each embedding still corresponds 1:1 to a BPE input token […]

                                                      My point is BPEs are a subset encoding of text. Gwern describes some of the impacts of BPEs on GPT-3’s cognitive capabilities.

                                                      IMHO it’s a big call to think that each massively lossy encoding we’ve discussed (experience -> language -> writing -> tokens -> embedding) is comprehensive enough to result in a function with anything resembling human cognitive capabilities. (n.b. though I do very lightly suspect down this path lies some form of cognition!)
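
                                                      To make the “writing -> tokens” step concrete, here is a tiny sketch assuming OpenAI’s tiktoken library and its GPT-2 BPE vocabulary: an uncommon word gets chopped into sub-word pieces, and the model never sees individual characters (part of Gwern’s point):

                                                        import tiktoken  # assumes the tiktoken tokenizer package

                                                        enc = tiktoken.get_encoding("gpt2")
                                                        ids = enc.encode("Anthropomorphising software")
                                                        print(ids)                             # a short list of token ids
                                                        print([enc.decode([i]) for i in ids])  # sub-word pieces the model sees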

                                                      1. 2

                                                        It also applies roughly the same amount of effort for each thought. Perhaps that’s why it seems so inconsistently intelligent. Some questions are well within its capabilities, and some questions require extra cycles. Perhaps it would seem more intelligent if it was allowed to say “umm” and “hmm.”

                                                        This is also, I believe, why you can make GPT significantly smarter by appending “Let’s explain our reasoning step by step.” to the prompt.

                                                        (”Chain of Thought”, or as I call it, “Thought”…)

                                                        1. 2
                            2. 5

                              Could someone explain to me why ChatGPT has had hella more explosive popularity than the first release of GPT3?… It’s the same stuff - I just tried it and it’s really no different than GPT3 from day 1. I’m very confused.

                              1. 6

                                A few reasons:

                                1. As far as I understand, ChatGPT is different in that it’s trained on more conversation than regular GPT is? But don’t quote me on that.
                                2. The chat format is significantly more accessible and easy to understand.
                                3. Natural hype cycles, where the “v2” (or what the public perceives to be a v2) of an exciting thing receives a lot more attention than even the original thing.
                                1. 5

                                  For me it was the accessibility. I was always pretty skeptical and did not expect AI to be so good that it would fool me into thinking it was a conscious thing. Even the Dall-E 2 and Stable-Diffusion examples, while impressive, seemed to be mostly about an experienced user getting nine subpar results and one hit with a lot of prompt “hacking”.

                                  Playing with ChatGPT turned that around, because of how simply and naturally the conversation flowed from the wildest idea that popped into my head. If you read carefully, you still see that it is not a conscious thing, but if you don’t, you get easily fooled. This is a very powerful experience, and you won’t get it just by reading blogs.

                                  In contrast: I started looking for similar open source projects and found KoboldAI. I cloned the repository, ran the command, spent a long time listening to my CPU fan, and then I got a page full of options and buttons that I don’t understand at all. So I gave up on that for the time being.

                                  It could be that the work KoboldAI did is just as impressive as the work of OpenAI. But it will not give the same experience to the casually interested.

                                2. 3

                                  Perhaps prompt engineering was more important than it first seemed, at least for applying text transformers to practical problems. I don’t think that we have a full explanation of how this particular GPT is prompted yet; but if we do, it would be interesting to try it out on GPT-2 or other publicly-available models.
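
                                  A sketch of what that experiment could look like, assuming the Hugging Face transformers library and the small public GPT-2 checkpoint; the assistant-style prompt wrapper is just an illustrative guess, not a known ChatGPT prompt:

                                    from transformers import pipeline, set_seed

                                    # Small public GPT-2 model - nothing ChatGPT-specific here.
                                    generator = pipeline("text-generation", model="gpt2")
                                    set_seed(0)

                                    # Guessed "assistant-style" framing around the user's question.
                                    prompt = (
                                        "The following is a conversation with a helpful assistant.\n"
                                        "User: Explain what a firewall does, in one sentence.\n"
                                        "Assistant:"
                                    )
                                    out = generator(prompt, max_length=80, num_return_sequences=1)
                                    print(out[0]["generated_text"])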

                                3. 5

                                  I think the model isn’t large enough to memorize anything precisely enough to be caught plagiarizing.

                                  I’ve played with the trick of convincing it it’s a Linux terminal which can “run” curl https://… and for several URLs it printed HTML that was plausible content for the site, but the details were entirely made up. For the URL of my personal homepage, it knew my name, it knew I’m a programmer from Poland, but hallucinated HTML markup with a bland site structure that is nothing like my actual page.


                                  Also, there will be complaints about it once it becomes popular enough to start replacing web searches and page visits. Sites that have provided this info will lose visitors and ad revenue.

                                  1. 3

                                    I think the model isn’t large enough to memorize anything precisely enough to be caught plagiarizing

                                    I’m not sure why you think that. Neural networks are lossy compression engines. Lossless compression on text easily achieves a two order of magnitude compression ratio over large corpora, and the achievable compression ratio increases with corpus size because there is a lot of redundancy in text. Lossy compression can easily achieve a couple of orders of magnitude better compression than lossless on a lot of classes of input. It still counts as plagiarism (or copyright infringement) if you substitute a few synonyms, and that’s the sort of thing that would significantly increase compression ratios.

                                    1. 1

                                      You’ve paraphrased what I was trying to say, so in turn I’m not sure what you’re not sure about :)

                                  2. 4

                                    I think this is pretty instructive - if we’re polling a person’s understanding, they can get pretty far by just mad-libs regurgitation of stuff they’ve read.

                                    1. 4

                                      I hope that one of the lessons to come out of this process is that a lot of modern development environments are incredibly verbose. A programming language needs to be unambiguous and concise. Natural language is terrible for programming because it is ambiguous and verbose, yet a lot of the examples I’ve seen of ChatGPT are able to take a prose description and expand it into a huge amount of code. This tells me that the information density of the code is far lower than it should be.

                                      1. 2

                                        Are they, though? I can agree that they are verbose and that there can definitely be more automatic disambiguation in languages like Haskell. But incredibly?

                                        GPT is taking the easy way out. It does not simplify; it makes the easy cases easier and the hard ones harder. I have yet to see any evidence of simplification. But there is a lot of evidence of applying implicit context at about the Stack Overflow level.

                                        1. 2

                                          I think a lot of it comes down to having good defaults. If you can achieve good results by copying common code from 10 projects into a new one, then that shouldn’t be boilerplate that everyone adds, it should be the default configuration that you specify deltas to.

                                          1. 1

                                            I am still not convinced. The learning defaults are sometimes not the production defaults. It’s then up to the library author to optimize either for learning (which I prefer, and you probably do as well) or for production.

                                      2. 3

                                        There’s a surfeit of humans, web pages, books, etc. capable of being confidently wrong, and sometimes right w/ zero citations. I think the cost of running this AI BS machine will eventually come down to the point where it will fit in a sort of Billy the Bass box and be able to be purchased at Spencer’s Gifts for us to ask questions of and get fun answers back. But hopefully between now and then a ton of copyright and trademark and other disputes will put an end to training on “everything” only to add more BS to the pile for the next model to train on.

                                        1. 2

                                          The hype has certainly been intense. I think there are other applications than simply a search-engine helper though, for example I think chatGPT could probably be trained to take over most call center jobs, and for summarising things or rewording them for different levels of understanding, like in tutorials and guides. In any use case the types of errors mentioned here would have to be taken into account.

                                          I think what got programmers excited/scared is that it seems to point to the possibility that we could reach a point where code is written automatically in the not too distant future, not that we are already there. Tools like chatGPT, copilot, and full-featured IDEs that catch not only syntax errors but also simple logical errors and code style issues could be combined to make a pretty powerful programming assistant. Add to this the possibility of creating unit tests in an automated way to catch AI brain-farts, or alternatively, using test-driven development where humans write the tests and the AI simply fills in all the code. It seems that working in the industry is likely to change because of AI development in the next few years.

                                          1. 10

                                            I think chatGPT could probably be trained to take over most call center jobs

                                            You’ve discovered a way to make service lines even more hateful

                                            1. 7

                                              I, for one, am looking forward to the systems responsible for refunds being vulnerable to prompt injection attacks.

                                              1. 2

                                                Now the smooth talking Nigerian Prince can get your funds without even asking you.

                                              2. 2

                                                Or, possibly, make chatbots less hateful?

                                              3. 5

                                                I don’t think ChatGPT on its own is actually useful for things involving a human in a loop where the AI has to reason about the conversation: determining whether statements are true/false, being able to understand not just sentiment, but the reason for that sentiment (e.g. the caller is upset because of X, even though they called about Y), and importantly, being able to perform tasks without another human in the loop.

                                                ChatGPT can only be described as “understanding” its inputs/outputs in the loosest sense IMO - it is almost magical how well it seems to mimic “real” understanding, but you can pretty quickly get it to confabulate, and very assertively tell you completely incorrect information, or generate code that is completely wrong. Without metacognition - the ability to learn from interactions, and to reason about inputs/outputs and whether the output truly satisfies the input (for example, if asking for an explanation of some real property of the world, that the output is factually correct, or at least explains its confidence or lack thereof) - I just don’t think ChatGPT as it is will be useful as much more than an assistant whose work you have to always double check.

                                                I get the impression that people think we’re on the cusp of solving those issues, but from what I’ve gathered after poking around a bit, we’re just as far away from a solution as we were a few years ago. If you go read the papers behind these models, it is openly acknowledged that we don’t actually understand why the models work the way they do. That is a really significant problem IMO if we want to solve some of the problems that have been brought up with ChatGPT.

                                                I haven’t found much on the topic yet, but I’m curious if there are ongoing efforts to design a model that works in tandem with a knowledge base of some kind, so that it can not only learn new facts which it feeds back into itself, but also form a concept of reality, fact vs opinion, confidence in the accuracy of its output, and the ability to explain how it derived its output. That probably doesn’t really work with models like GPT-3 that probabilistically generate a string of output text, since there isn’t any evaluation of what the text actually means; but it feels like there has to be some way to take the kind of pseudo-understanding that ChatGPT derives and extract a model that can think about why ChatGPT derived that output.

                                                1. 2

                                                  I was just playing with it. It is totally amazing! I presented a library that it had never seen before and it was able to pull it apart and understand it. After I explained it, it wrote a decent Go library. It was aware of very specific edge cases (though sometimes wrongly applied), but its awareness of the edge cases in the general context blew me away.

                                                  I then fed it the README, and it started documenting the appropriate code sections with comments from the README. I’m amazed.