I really feel a bit for Shutterstock here. Their previews are used to train countless models without their involvement, and depending on the model, their watermarks even bleed through into the generated media. The reign of stock footage is clearly coming to an end with the recent AI progress, but Shutterstock being used as a stepping stone for said progress without getting anything in return leaves me with this weird emotional concoction of empathy towards a faceless company.
I wonder if this is the point at which we get the real “is training fair use” discussion. Shutterstock is big enough to go against Meta on this one, and they have some good reasons to. For what it’s worth, while the dataset of their videos was collected under an exception for non-commercial research work in the UK, it’s likely that many aren’t using it in that way.
I found and submitted this article which was removed because it was seen as “business news”, but to me it does look like the big tech companies are using academics as cut-outs to get around any objections to fair use:
I generally understand what you mean, but I absolutely can’t feel sorry for Shutterstock specifically because of how they treat their creators.
Also, I think this is a questionable practice not only towards $STOCK_PLATFORM but, above all, towards the authors.
Shutterstock hosts the content but the content itself is contributed by Shutterstock users. Shutterstock might have contributed the captions that apparently were helpful in training the model.