Video is effectively volumetric data. If you’ve ever looked at a still image from standard definition interlaced television, you’ve probably been shocked at how low quality it seems, because your brain is building a much higher detail picture by combining multiple frames. I’d expect the same thing from a neural net. Processing one frame will update state, the next frame will add to that, and so on. Features that are consistent across frames will reinforce activations, and you’ll end up with internal state for the network that is, effectively, an error-corrected view of the scene. Even if you’re downsampling the input frames, the high level of redundancy between frames will give you a lot more detail than any single frame contains.
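This isn’t how a real video model fuses frames, but the error-correction intuition can be shown with a toy sketch: treat the “internal state” as nothing more than a running mean over noisy captures of a static scene, and the per-pixel error shrinks roughly with the square root of the number of frames.

```python
import numpy as np

rng = np.random.default_rng(0)

# A static "scene": one low-resolution ground-truth frame.
scene = rng.uniform(0.0, 1.0, size=(48, 64))

# Simulate 30 noisy captures of the same scene (independent sensor
# noise on each frame), standing in for redundant video frames.
frames = scene + rng.normal(0.0, 0.2, size=(30, 48, 64))

# Accumulate frames one at a time, the way a stateful model might:
# a running mean is the simplest possible "internal state".
state = np.zeros_like(scene)
for i, frame in enumerate(frames, start=1):
    state += (frame - state) / i  # incremental mean update

single_err = np.abs(frames[0] - scene).mean()
fused_err = np.abs(state - scene).mean()
print(f"error from one frame:   {single_err:.3f}")
print(f"error after 30 frames:  {fused_err:.3f}")
```

The fused estimate ends up several times more accurate than any individual frame, which is the redundancy-as-error-correction effect described above, just in its crudest form.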
We don’t know what Google’s models are trained on, but I’d be shocked if it didn’t include the dataset from Google’s book search, so it almost certainly has the names and authors of the books and possibly even cover photos of a bunch of editions in the training set. Matching on that should give high accuracy.
That said, I do want this. I use LibraryThing to remember which books I have, but periodically find ones I’ve missed when scanning. I’d love to just wave a camera at my bookshelves and have a proper index (especially one telling me which shelf each book is on, which I can update when I move things around by just waving the camera around). Most books without ISBNs and bar codes are not indexed, because I am lazy, but there’s no reason this couldn’t do it.
Now if only it could fix the data in LibraryThing, where often the title that it fetches includes the marketing tag line, the authors are not properly normalised, and so on.
It is a bit crazy to imagine that Google can take in videos, pull location metadata, scan the contents, compare it to information it has built up about you, see if the video was taken at your home, analyze the physical stuff you have in your home, add all of that to the profile it keeps on you, and then target you with ads. They may not do that for legal or logistical reasons, but it seems like they have all of the individual pieces.
First I was irritated at the title because video isn’t an app. Then I realized video is an application of this, in a different but perfectly valid sense of “application”.
Same, that site has been around forever. There’s more than (Amazon-owned) Goodreads in the world.
Now I want @simonw to make me dinner and cocktails.