Just this week, the New York Times launched a landmark lawsuit against OpenAI and Microsoft over this issue. The 69-page PDF is genuinely worth reading—especially the first few pages, which lay out the issues in a way that’s surprisingly easy to follow.
Thanks for the link. The previous things people had sent me about this were all links to the NYT itself, which is paywalled. I’m especially curious about what the court will make of misattribution. Falsely claiming NYT said something, in a way that can damage their reputation, seems like the textbook definition of libel.
The decision in relation to paragraph 8 is also going to be very interesting. I don’t see how putting something in an LLM is any different from putting it in a lossily compressed database; MS disagrees, and it will be interesting to see how the court decides.
This action seeks to hold them responsible for the billions of dollars in statutory and actual damages that they owe for the unlawful copying and use of The Times’s uniquely valuable works.
Interestingly, MS was one of the companies that pushed for the $7,500-per-incident statutory damages for wilful infringement. Given the precedents set by the RIAA, it would be possible to claim that each ChatGPT response that reproduced NYT content deserves a separate $7,500 fine. That would bankrupt Microsoft.
Exhibit J is particularly damning. An MP3 from a CD or an MPEG-4 video from a DVD has more loss than the NYT → GPT-4 transformation, and there’s a lot of precedent that these count as derived works for the purposes of copyright infringement.
The OpenAI Defendants consist of a web of interrelated Delaware entities.
I believe that’s legalese for ‘these people are incredibly dodgy’.
My non-lawyer understanding is that libel requires intent to cause damages. Being wrong is not libel. It’s possible that there’s some exception for negligence, but that’s a whole different case to be made.
The OpenAI Defendants consist of a web of interrelated Delaware entities.
Not really.
a) Everyone incorporates in Delaware
b) OpenAI is structured in an atypical way, kinda like Mozilla but a bit more complex. That’s the nature of a not-for-profit with a for-profit partner.
My non-lawyer understanding is that libel requires intent to cause damages. Being wrong is not libel.
I am also not a lawyer, but I believe that this changes somewhat after you were notified that the claims are both incorrect and damaging. In this case NYT notified MS some months ago (as per the filings) that things were incorrectly attributed to them in a way that caused reputational damage. MS continued to make these claims publicly. Whether the initial claim was intended to cause damage or not, they had been notified that the claims were causing damage and continued to make them.
Maybe. I’m sort of split on this.
It’s a computer program, not a human. The intent of the programmers is to have ChatGPT be accurate - that they fail to do so at all times is just an issue of programs being complex, not negligence or malice.
ChatGPT tends to clarify that it can’t provide 100% accurate information or total recall.
I’m not sure it’s within the fair expectation of ChatGPT (as in, the expectation that an average user would have) that when it makes a citation, that citation is strictly true.
I’m not sure what the right call here is, let alone the legal one.
I think there are three different categories here:
You do a thing and, via an error, you cause harm. This may count as negligence, but it often avoids liability.
You do a thing knowing that it will cause harm, you are notified of the harm, and you keep doing it. This is normally the kind of criminal negligence that leads to big compensation claims being paid out.
You do something for the explicit purpose of causing harm. This is where you start getting assault / battery / murder charges or other criminal indictments.
I don’t think anyone is claiming that the GPTs are in the third category, but I think NYT can make a pretty solid case that they’re in the second.
I don’t see why a user would be expected to believe that citations are inaccurate. The first time I used Bing Chat, I tried asking it for the difference between CHERI capabilities and a RISC-V PMP. It actually gave a pretty good reply (cribbed from something I’d written, so I’m biased to like it) but it also provided me with three citations about PMPs. All of these were to articles that talked about Project Management Professionals, not Physical Memory Protection units (the prose associated with the citations talked about the latter). These were presented as formal citations, with the names of the sources (Forbes, Gartner, and so on) listed at the bottom. There was nothing in the UI that said ‘check citations, they may be complete bullshit’; they were presented as authorities supporting the claims made by the system.
Now, it would be completely fine if, instead of presenting them as citations, it said: ‘Here are some articles that appear to be on a similar topic that you might want to read for further information’. Then a normal reader would be expected to treat them much like search results and go and read them to see if they really did support the claims. They weren’t presented that way, though: they were presented in such a way that the prose generated by ChatGPT would be read as a summary of the articles, and the names of the publishers of the articles would lend weight to the value of the ChatGPT output.
To make a correct attribution, ChatGPT needs accurate NYT copyrighted materials. Total recall means a perfect reproduction of all NYT copyrighted materials. So to avoid misattribution, ChatGPT should absolutely abstain from any mention of NYT at all costs.
I’m not sure what NYT wants, but definitely they wouldn’t want complete absence from ChatGPT output. That would be like asking Google to delist NYT from search results.
Then the last legal question is whether the disclaimer “ChatGPT can make mistakes. Consider checking important information.” is enough.
Complete absence is effectively exactly what they’re asking for in the lawsuit: https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec2023.pdf
On page 68 their “prayer for relief” includes ordering the destruction of all GPT or other LLM models and training sets that incorporate Times Works.
but definitely they wouldn’t want complete absence from ChatGPT output
What benefit is there for the NYT in being included in ChatGPT output, as opposed to their content’s inclusion being a benefit to the creators of ChatGPT?
If it were technically possible and legally enforceable, I would demand the removal of all my online content from all LLM datasets. That would include everything I’ve written in my blogs, my photographs, comments here and on Reddit, etc., as well as on any other services.
There’s a trademark claim towards the end of the document which looks to me like it ties into their complaint about outputting a false piece of information and then crediting it to The NY Times.
Great piece! Any thoughts on non-LLM generative AI? (I am personally finding the image generation to be of extraordinarily limited utility – suffering from absolutely wild hallucinations, impossible to fine-tune, etc. – but I also haven’t engaged with it very seriously.) Also, any thoughts on retrieval augmented generation? Just as a user, it feels like that has been a big breakthrough with respect to reducing the hallucinations of LLMs; is that correct?
I decided not to write about image generators, mainly so I could get my post out before the end of 2023!
2023 was the year Midjourney went genuinely photorealistic - the whole Balenciaga Pope thing back in March - and they’ve released increasingly capable models since then. Still really hard to get exactly what you want out of them, but you can get a photorealistic image of pretty much anything now.
In Stable Diffusion world the thing where you can generate more than one image per second on a consumer GPU is pretty wild. I’m fascinated by the tools which let you draw a sketch and have it turned into a prompt-driven image “live” as you edit it: https://fedi.simonwillison.net/@simon/111489351875265358
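If you want to try that locally, here’s a minimal sketch using Hugging Face diffusers - the checkpoint name and settings are my assumptions (one of the distilled single-step “turbo” models), since that single-step trick is what makes more-than-one-image-per-second generation possible on consumer hardware:

```python
# Rough sketch: single-step image generation with a distilled "turbo" model.
# The checkpoint and parameters are assumptions, not taken from the comment
# above; adjust them for whatever model you actually use.
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

# One denoising step and no classifier-free guidance - this is what makes it
# fast enough to regenerate the output "live" as the prompt or sketch changes.
image = pipe(
    prompt="a watercolour painting of a pelican riding a bicycle",
    num_inference_steps=1,
    guidance_scale=0.0,
).images[0]
image.save("pelican.png")
```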
As for RAG: it’s a really big deal, I should have included it in my roundup. It’s still hard to do it really well, but getting a basic version up and running is practically the “hello world” of working with LLMs. My best result so far has been this one, which is a Bash one-liner! https://til.simonwillison.net/llms/embed-paragraphs#user-content-answering-a-question
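For anyone who wants to see the shape of that pattern without the one-liner, here’s a rough Python sketch of basic RAG - the model names and the tiny in-memory “index” are assumptions for illustration, not what the linked one-liner actually uses:

```python
# Rough sketch of retrieval-augmented generation: embed some paragraphs,
# find the ones closest to the question, and paste them into the prompt.
# Model names are assumptions; swap in whatever embedding/chat models you use.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

paragraphs = [
    "CHERI capabilities are hardware-enforced, unforgeable references to memory.",
    "RISC-V PMP defines a small number of physical memory regions with access rules.",
    # ...the rest of your corpus...
]

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-ada-002", input=texts)
    return [item.embedding for item in resp.data]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / ((sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5))

corpus_vectors = embed(paragraphs)

def answer(question, k=2):
    qvec = embed([question])[0]
    ranked = sorted(
        zip(paragraphs, corpus_vectors),
        key=lambda pair: cosine(qvec, pair[1]),
        reverse=True,
    )
    context = "\n\n".join(p for p, _ in ranked[:k])
    prompt = f"Answer using only this context:\n\n{context}\n\nQuestion: {question}"
    resp = client.chat.completions.create(
        model="gpt-4", messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

print(answer("How do CHERI capabilities differ from RISC-V PMP?"))
```

A real setup would chunk documents, persist the embeddings, and use a proper vector index, but the retrieve-then-prompt loop is the whole trick.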
That’s awesome – thank you, and thank you more generally for all of the exploration and writing you’ve done on LLMs and AI more broadly: it’s a tremendous service for the practitioner looking to get past the hype (and hysteria)!
so… shouldn’t this be “stuff we figured out about LLMs in 2023”? As much as I enjoyed reading the blog, I am always sad that the field is now reduced almost solely to LLMs.
Yeah, I’d originally planned to include notes about other generative AI stuff - image generation, music, video - but I ran out of time in 2023 and published what I had.
Just as a side note, I want to say that your blog is the best source of technical AI information on LLMs I’ve seen in the wild.
Oxide and Friends Topic/Guest?
Absolutely! If @simonw is up for it, we would love to have him!
Closing the loop: Open Source LLMs with Simon Willison (on Oxide and Friends)
Minor typo here
Thanks, fixed!
Can’t wait for the corresponding article next year: “Stuff AI figured out about us in 2024”
I have generally enjoyed your posts about AI, especially the prompt engineering ones. Thank you for them!