Surely there is an interesting new discussion of the nature of “rationality” happening somewhere, now that we have an example system that can “explain its chain of thought” in a (usually plausible) way when asked, that we know for a fact is entirely based on probabilistic associations between word fragment sequences and no specific “reasoning” mechanism. Does anyone know where to find this?
It kinda goes the other way around. When prompted to reason step by step, LLM answers are more accurate. If you ask afterwards, it’ll start doing mental gymnastics to defend whatever answer it gave.
Interesting. I would have thought this commitment bias was limited to humans. I would have thought ChatGPT rather immune to this, for I have seen it change its mind¹ rather easily when corrected.
[1]: please forgive the anthropomorphism
The model behind ChatGPT has been primed to do chain-of-thought reasoning. When you ask a complex question, you’ll often notice it first gives an explanation and then the answer. The model didn’t think of the answer first and then decide it would be useful to lead with some introductory information; that ordering is how it gets to the answer in the first place.
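A rough sketch of that ordering difference (the complete() function here is a made-up placeholder, not any particular API):

```python
# Hypothetical stand-in for a single text-completion call; wire up any LLM client.
def complete(prompt: str) -> str:
    raise NotImplementedError("plug in an actual model here")

question = "A bat and a ball cost $1.10. The bat costs $1.00 more than the ball. How much is the ball?"

# Chain-of-thought ordering: the reasoning is generated before the answer,
# so the final answer is conditioned on those intermediate steps.
cot_prompt = f"{question}\nLet's work through this step by step, then state the final answer."

# Answer-first ordering: whatever answer comes out becomes fixed context, and a
# follow-up "why?" is conditioned on it, so the explanation tends to defend the
# answer rather than derive it.
answer_first_prompt = f"{question}\nGive only the final answer."
follow_up_prompt = "Explain how you arrived at that answer."
```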
Furthermore, ChatGPT is not a chatbot that you ask questions to; it is a text generator simulating a chat log. It takes quite a bit of postprocessing work to stop an LLM from completing the “human” part of the conversation. It’ll take what you say for granted because it doesn’t know the difference between what you typed and what it generated; it’s all just text to complete. What you type becomes part of the reasoning, so to speak.
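A minimal sketch of that “simulating a chat log” idea, with invented names rather than any real chat frontend’s internals:

```python
from __future__ import annotations

# The conversation is flattened into one string and the model simply continues it.
def render_chat(turns: list[tuple[str, str]]) -> str:
    """Flatten (speaker, text) turns into a single prompt that ends mid-dialogue."""
    lines = [f"{speaker}: {text}" for speaker, text in turns]
    lines.append("Assistant:")  # leave off where the model is supposed to continue
    return "\n".join(lines)

prompt = render_chat([
    ("Human", "What has three legs in the morning?"),
    ("Assistant", "That sounds like a variation of the Sphinx's riddle."),
    ("Human", "So what's the answer?"),
])

# Left to itself, the completion will happily write the next "Human:" turn too;
# it is all just text to complete. Chat frontends therefore cut generation off
# at markers like this:
stop_sequences = ["\nHuman:"]
```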
I don’t think it’s too far-fetched to draw parallels to human reasoning. When you go over an idea in your head, you’re not doing deductive reasoning. Unlike LLMs, though, we can do that if we decide to, or we can test what we came up with in the real world, and we can generally tell the difference between things someone just told us and our own opinions and knowledge.
It’s emulating a human’s commitment bias, because it was trained on examples that show commitment bias.
I think it’s more because of how it works: it’s autocomplete based on previous tokens, so when text is added after an answer, that answer is already there as a reference and it influences the prediction. It’s not that it causes itself to emulate humans explaining themselves, but rather that there are already topics and statements to talk about.
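A toy illustration of that conditioning effect; the “model” here is just a fake lookup table standing in for a real next-token distribution:

```python
from __future__ import annotations
import random

def fake_next_token_distribution(context: str) -> dict[str, float]:
    # A real LLM computes P(next token | all previous tokens).
    # This stand-in only shows that the distribution depends on the context.
    if "the answer is a human" in context.lower():
        return {"because": 0.7, "indeed": 0.2, "however": 0.1}
    return {"the": 0.4, "a": 0.3, "perhaps": 0.3}

def sample_next(context: str) -> str:
    dist = fake_next_token_distribution(context)
    tokens, weights = zip(*dist.items())
    return random.choices(tokens, weights=weights)[0]

context = "Q: What has three legs in the morning? A: The answer is a human being, "
# The earlier "answer" is now part of the context being conditioned on, so the
# continuation is pulled toward text that elaborates on (and defends) it.
print(sample_next(context))
```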
Here’s an example I had of ChatGPT getting stuck on an answer and doing a bad job of justifying it:
Me: What has three legs in the morning, two legs in the afternoon, and four legs at night?
ChatGPT: The answer to this classic riddle is “a human being.”
The riddle is a reference to the stages of life. In the morning of life, which is considered to be childhood, a human crawls on hands and knees, hence having three “legs.” In the afternoon of life, which represents adulthood, a human walks upright on two legs. Finally, in the evening of life, which represents old age, a human may require the assistance of a cane, hence having four “legs.”
This riddle has been around for centuries and was famously posed by the Sphinx in Greek mythology to travelers who sought to enter the city of Thebes. Oedipus was able to solve the riddle by answering “man,” causing the Sphinx to destroy itself in frustration.
It’s not weird that it got stuck on thinking I asked the riddle of the Sphinx when I didn’t, but it is weird (from a human POV) to misexplain the riddle.
GPT4 seems to handle it:
Me: What has three legs in the morning, two legs in the afternoon, and four legs at night?
GPT4: This riddle is a variation of the famous riddle of the Sphinx from Greek mythology. The original riddle goes: “What has four legs in the morning, two legs in the afternoon, and three legs at night?” The answer to the original riddle is “a human being,” which refers to the stages of a person’s life. In the morning (infancy), a human crawls on all fours; in the afternoon (adulthood), a human walks on two legs; and at night (old age), a human walks with the aid of a cane, which represents the third leg.
However, your riddle has a different sequence of leg numbers. While it may be an intentional twist on the original riddle, there isn’t a clear answer that fits the description.
Me: What about Benjamin Button?
GPT4: Ah, I see what you’re getting at. Benjamin Button is a fictional character from the short story “The Curious Case of Benjamin Button” by F. Scott Fitzgerald, which was later adapted into a movie. The character ages in reverse, being born as an old man and growing younger throughout his life.
Using the Benjamin Button scenario in the context of your riddle, it could work as follows:
In the morning (old age), Benjamin would have three legs, including a cane for support.
In the afternoon (adulthood), he would have two legs, as he would no longer need the cane.
At night (infancy), he would have four legs, because he would crawl like a baby.
In this context, the answer to the riddle could be “Benjamin Button.”
Humanity is doomed. :-)
…well that’s alarming.
Try prompting it with some other answers and see if it justifies them all
Relatable.
It’s a pretty old idea that humans spontaneously generate stories justifying their actions or beliefs on the spot, instead of those pre-existing justifications motivating the action or belief. I think there’s a lot of credence to this idea; I often discover my own opinions about things in the course of writing them down in comments or blog posts or a journal or whatever else. Writing is nature’s way of letting you know how sloppy your thinking is, and all that.
The article doesn’t discuss confabulation or hallucination as currently studied in transformers and diffusers, but a theory of mind for ChatGPT.
Not sure what was meant by this
The article defines “delirium” as rationality followed to the point of absurdity.
But is ChatGPT actually “rational” by whatever definition these esteemed folks were using? It seems more analogous to supercharged “dream logic” based on associations than “rationality” — but perhaps until now we just didn’t see that as a continuous spectrum. (See other comment.)
It’s none of these things. It generates random text to fit patterns already seen. There is no internal state or process; there’s no there there.