Hmm. So as a quick experiment, I asked GPT-3 to write a tutorial for SPIRE, an open source project I work on. It spit out the actual tutorial for SPIRE, almost word-for-word from our website. This surprised me for a few reasons: I didn’t think GPT-3 included random websites in its corpus; we’re a pretty small project that wouldn’t be at the top of anyone’s list of things to memorize; and I thought GPT-3 took some measures to prevent memorizing documents wholesale, but apparently those did not work here.
I kind of suspect that the output of GPT is basically what the “Stochastic Parrots” paper describes: https://dl.acm.org/doi/pdf/10.1145/3442188.3445922
You can already feel this creeping into knowledge on the network at large, and the way model-generated botspam is about to pollute the available technical documentation is a neat encapsulation of how terrifying this is for the near-term future.
We’re like little babies being afraid of Russian disinformation now. Wait until corporations pay GPT-N to spew out content until there’s nothing left on the internet but snack food copy. We’ll miss the quaint old days of handmade propaganda.