1. 4

  2. 1

    It’s only a “visualization” in that it’s a slide show, where the slides have a lot of annoying and unnecessary animation. The basics of text prediction models get ’splained. We learn how $4.6M got poured through some GPUs. Some relatively trivial pre- and post-processing steps are peeled off the big black box… oh look, there are some layers in there… big numbers are thrown around… and then the whole presentation is abruptly over before anything profound has been shown. Although, to be fair, the author does promise at the top to update his unfinished page “over the next few days”.

    But if you click through to his Illustrated GPT-2, there’s some more substance. And since GPT-3 is just a scaled-up GPT-2 with a lot more training data (and perhaps some algorithmic tweaks, but that’s hard to say since the code will not be available), that’s probably the one to read.

    1. 1

      Training was estimated to take 355 GPU-years and cost $4.6M.
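      Those two figures together imply a price per GPU-hour; a quick back-of-the-envelope check (assuming the $4.6M estimate covers compute only):

      ```python
      # Back-of-the-envelope: implied price per GPU-hour from the
      # 355 GPU-year / $4.6M estimate. Assumes the $4.6M is compute
      # cost only; both inputs are rough estimates, not official figures.
      gpu_years = 355
      total_cost_usd = 4.6e6

      gpu_hours = gpu_years * 365 * 24
      usd_per_gpu_hour = total_cost_usd / gpu_hours
      print(f"{gpu_hours:,.0f} GPU-hours, ~${usd_per_gpu_hour:.2f}/GPU-hour")
      # ~3.1M GPU-hours at roughly $1.48/GPU-hour
      ```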

      A dataset of 300 billion tokens of text was used to generate training examples for the model.
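      For anyone wondering what “generate training examples” means here: the standard way to turn a token stream into next-token-prediction examples is a sliding window. A toy sketch (illustrative only, not OpenAI’s actual pipeline; real context windows are around 2048 tokens):

      ```python
      # Toy sketch: turn a token stream into (context, next-token)
      # training pairs with a sliding window. The context length of 4
      # is for illustration; GPT-3 uses contexts of ~2048 tokens.
      def make_examples(tokens, context_len=4):
          examples = []
          for i in range(len(tokens) - context_len):
              context = tokens[i : i + context_len]
              target = tokens[i + context_len]  # the token to predict
              examples.append((context, target))
          return examples

      toks = ["the", "cat", "sat", "on", "the", "mat"]
      for ctx, tgt in make_examples(toks):
          print(ctx, "->", tgt)
      # (['the', 'cat', 'sat', 'on'], 'the') and
      # (['cat', 'sat', 'on', 'the'], 'mat')
      ```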

      And how much have they paid to the authors of those texts?

      1. 1

        I’d say it’s fair use. They’re not using the exact text and passing it off as their own. It’s more about extracting “patterns” from the texts.

        It’s just Google ngrams on a much larger scale.
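        For comparison, the n-gram version of “extracting patterns” is literally just counting. A toy sketch (bigrams over a hypothetical sentence; Google’s ngram corpus is the same idea over millions of books):

        ```python
        # Toy n-gram "pattern extraction": count adjacent word pairs,
        # then predict the most frequent follower. GPT-style models
        # replace these raw counts with learned, context-dependent
        # probabilities, but the "patterns from text" idea is similar.
        from collections import Counter

        text = "the cat sat on the mat and the cat slept"
        words = text.split()
        bigrams = Counter(zip(words, words[1:]))

        def most_likely_next(word):
            followers = {b: c for b, c in bigrams.items() if b[0] == word}
            return max(followers, key=followers.get)[1] if followers else None

        print(most_likely_next("the"))  # "cat" (follows "the" twice, "mat" once)
        ```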

        1. 1

          I expect that it is legal. But the question is whether it is ethical, and whether it should be legal. IMHO it should be legal only if you are doing free software and thus returning the results to the authors of the texts you used. But we are starting a political discussion now, so I will save this topic for my blog instead :-)