My guess is that if there are no threat models or security models for LLMs, there is no way to intentionally design a model that is robust and safe. And LLMs (or deep learning in general) are model-free, in the sense that there is no theory describing what an LLM does: it fits inputs to outputs without being given a model of how they relate (via the universal function approximating property of neural networks). Can anyone more specialized in security comment on that?
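To make the "model-free" point concrete, here's a minimal sketch of my own (not from the comment or the article): a tiny one-hidden-layer network fits input/output pairs sampled from sin(x) purely by gradient descent, with no model of the underlying relationship supplied anywhere in the code. The data, layer sizes, and learning rate are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data: the network only ever sees (x, y) pairs.
x = rng.uniform(-np.pi, np.pi, size=(256, 1))
y = np.sin(x)

# One hidden layer of 32 tanh units -- the universal approximation
# theorem says such a layer can fit any continuous function on a
# bounded interval to arbitrary accuracy, given enough units.
W1 = rng.normal(scale=0.5, size=(1, 32))
b1 = np.zeros(32)
W2 = rng.normal(scale=0.5, size=(32, 1))
b2 = np.zeros(1)

lr = 0.05
for step in range(5000):
    # Forward pass.
    h = np.tanh(x @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - y

    # Backward pass: plain gradient descent on mean squared error.
    grad_pred = 2 * err / len(x)
    grad_W2 = h.T @ grad_pred
    grad_b2 = grad_pred.sum(axis=0)
    grad_h = grad_pred @ W2.T
    grad_pre = grad_h * (1 - h ** 2)
    grad_W1 = x.T @ grad_pre
    grad_b1 = grad_pre.sum(axis=0)

    W1 -= lr * grad_W1
    b1 -= lr * grad_b1
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2

# The fit is purely empirical: nothing in the weights "knows" this is sine.
test = np.array([[0.5]])
print(float(np.tanh(test @ W1 + b1) @ W2 + b2), np.sin(0.5))
```

The relevance to the security question is that the learned behavior lives entirely in those fitted weights, so there is no separate specification you can audit against a threat model.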
A couple of things that don’t work for me about that.
Here’s the text that is linked to there:
As of March 1st, 2023, we retain your API data for 30 days but no longer use your data sent via the API to improve our models.
This policy only covers the API! Users of the ChatGPT end-user interface (including the brand new ChatGPT iOS app) are not covered by this policy - their input can still be used “to improve our models”.
But that’s my second note here: “fine-tune their models” and “improve our models” are not necessarily the same thing.
OpenAI have always been extremely vague on what “to improve our models” actually means. I’m personally pretty sure this doesn’t mean that input to ChatGPT is directly added to the core training set, or is even directly used for fine-tuning - OpenAI have strong incentives not to accidentally allow garbage data or PII to make it through to the models like that.
I don’t think it’s a 100% certainty that ChatGPT input is poisoning the models in the way described by the article.
But as the article points out… they’re infuriatingly opaque about this. So maybe that data IS being used in this way!
I imagine that the high cost of these models is probably part of the reason why they're being so secretive. If it takes a billion dollars to make a model, they're very hesitant to show the world what they're actually doing. I'm not in favor of the secrecy, but I do see the commercial reality of why they do this.
I’m also reminded of a story that Asimov wrote about an early generation of robots, where they were so limited that fences had to be put around them to keep people from messing them up. I think today's AI models are similar: they're very fragile, and the marketing folks are afraid of that fragility being exposed.