I like it but I notice the last commit was 9 months ago while it bills itself as early alpha. Does anyone know if it’s still being developed?
This is from the same developer that brought us cryptocat (circa 2011) and used to argue angrily when security issues were pointed out. He eventually came around on that front IIRC, but as a result of my experience reporting early cryptocat bugs to him, I would treat this more as a source of ideas than as any kind of implementation that should be trusted for any reason. Obviously people grow, and I hope my assessment is a gross under-estimation, but it’s probably a healthy approach to anything unproven, regardless of provenance.
Cool, but the next wave of ad blockers will need a completely novel approach once SSAI (server-side ad insertion) takes off, unless we all just collectively reject ad-monetized video content.
DAI (Google’s SSAI solution) is already in what amounts to a prerelease for larger customers
Could you explain quickly what SSAI is?
Sure! I will limit my explanation to the bounds of HLS (HTTP Live Streaming) since the concept is the same for both HLS & DASH (Dynamic Adaptive Streaming over HTTP) and these are the two most important ABR (Adaptive Bitrate) content types.
Here are some more resources:
Who controls the SSAI server? Would that be Google in this case, with the content made available to them by the company that owns the page where the video will be displayed? So does that mean that in order to host an ad network that uses SSAI you basically have to proxy all traffic for your customers?
It seems weird to do the ads on the server since (as I understand it) advertisers don’t trust content providers not to cheat, and that’s why ads are fetched on the client from separate servers (which can then be blocked with relative ease).
Maybe I just totally don’t understand what’s happening here.
Advertisers don’t trust content providers in general not to cheat.
However, Google has been caught at ‘can’t-believe-it’s-not-cheating’ multiple times with no impact, and it took years for it (e.g. putting brands next to KKK videos) to catch up with them on YouTube.
I suspect YT could pull it off and tell advertisers that’s the new deal.
I think the next wave, already here, really, are service-specific user agents. Instead of cutting out the advertising, they cut out the content and make a new frame for it.
These take many different forms including websites (archive.is, youtube downloader sites), scripts (youtube-dl), binary apps (Frost, AlienBlue, NewPipe).
As @whjms noted, unless they are patching the manifest files on the fly to undo the SSAI (possible, but would lead to another type of whack-a-mole) it doesn’t matter how you are showing the content
Wouldn’t newpipe still have to display the SSAI ads, since the ads are dynamically inserted into the video?
Does SSAI get to track you across the web? TBH, I don’t care about ads themselves, especially in video (that last bit may be because I just don’t watch all that much video). What aggravates me is the whole surveillance aspect of most current online advertising. By my read, SSAI should neuter the ability to track you across different sites. I’m set to call that flawless victory, if ad supported content is forced to resort to something that can’t track me.
They still build it to involve tracking, with JS and cookies and whatnot that all happens before the video stream is requested. I believe if all of that is blocked, you still get ads, just not “retargeted” ones.
My old employer, a big player in the video space, has been doing SSAI for a few years now.
I never worked in that directly, because I find it gross, but I suspect you could detect differences in encoding between the “content” and “ad” segments.
That sounds like it would be fun to make. I suspect you’re right, and I would not be surprised if the differences are huge and glaring. On podcasts, which I listen to much more frequently than I watch online video, the differences are often audible. I can detect the ad spots by ear in many cases, just because the artifacts change when they cut over.
I bet that you don’t even need to look at the data, per se. My guess is that the primary method for all of this is HLS, where you have a top-level (text) manifest file that lists the different renditions, and each of those URLs points to another manifest that lists the actual video segment URLs. If I were building SSAI without an eye towards adblockers, I would splice the content and the ads at that second manifest level, so the URLs would suddenly switch over from one URL pattern to another. I believe the manifest also includes the timestamps and segment lengths, so you should be able to detect a partial segment just before you switch from content to ad.
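To make the idea above concrete, here is a rough Python sketch of spotting the point where a second-level (media) playlist switches from one URL pattern to another. The playlist text, hosts and paths are all invented for illustration, and "URL pattern" is crudely approximated as host plus directory; a real SSAI setup may well serve ads from the same prefix as the content, in which case this finds nothing.

```python
from urllib.parse import urlparse
import posixpath

# Hypothetical media playlist; hosts and paths are made up for this example.
SAMPLE_PLAYLIST = """\
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXTINF:6.0,
https://cdn.example.com/show/720p/seg001.ts
#EXTINF:3.2,
https://cdn.example.com/show/720p/seg002.ts
#EXTINF:6.0,
https://ads.example.net/creative/abc/seg001.ts
#EXTINF:6.0,
https://ads.example.net/creative/abc/seg002.ts
#EXTINF:6.0,
https://cdn.example.com/show/720p/seg003.ts
"""


def parse_segments(playlist_text):
    """Return (duration_seconds, url) pairs from a media playlist."""
    segments = []
    duration = None
    for line in playlist_text.splitlines():
        line = line.strip()
        if line.startswith("#EXTINF:"):
            # Format is "#EXTINF:<duration>,<optional title>".
            duration = float(line[len("#EXTINF:"):].split(",", 1)[0])
        elif line and not line.startswith("#"):
            # Any non-empty, non-tag line is a segment URI.
            segments.append((duration, line))
            duration = None
    return segments


def url_pattern(url):
    """Host plus directory: a crude stand-in for 'URL pattern'."""
    parts = urlparse(url)
    return parts.netloc + posixpath.dirname(parts.path)


def find_switches(segments):
    """Indices of segments whose URL pattern differs from the previous one."""
    switches = []
    for i in range(1, len(segments)):
        if url_pattern(segments[i][1]) != url_pattern(segments[i - 1][1]):
            switches.append(i)
    return switches


segments = parse_segments(SAMPLE_PLAYLIST)
print(find_switches(segments))
```

In this made-up playlist the segment just before the first switch is 3.2 seconds long while its neighbours are 6.0, which is the kind of partial segment mentioned above.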
It’s possible that they’re instead delivering it all as one MP4 stream, but that seems out of favor these days. Or they could do HLS but have segments that bridge the gap from content to ad, but that might involve re-transcoding, and if it didn’t… well, you might see something interesting with keyframes or something, I suppose? I don’t think they’d bother with that anyhow, since it sounds more complicated.
I think most of it is currently based around #EXT-X-DISCONTINUITY declarations
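For anyone who wants to poke at this, a minimal Python sketch of splitting a media playlist into runs at each #EXT-X-DISCONTINUITY tag follows. The playlist and segment names are hypothetical. Note the tag only says the next segment's encoding or timestamps may differ from the previous one; it does not say which run is the ad, so treating a run as an ad is a guess on top of this.

```python
# Hypothetical media playlist with relative segment URIs.
SAMPLE_PLAYLIST = """\
#EXTM3U
#EXT-X-TARGETDURATION:6
#EXTINF:6.0,
c1.ts
#EXTINF:6.0,
c2.ts
#EXT-X-DISCONTINUITY
#EXTINF:6.0,
a1.ts
#EXT-X-DISCONTINUITY
#EXTINF:6.0,
c3.ts
"""


def split_on_discontinuity(playlist_text):
    """Group segment URIs into runs separated by #EXT-X-DISCONTINUITY."""
    runs = [[]]
    for line in playlist_text.splitlines():
        line = line.strip()
        if line == "#EXT-X-DISCONTINUITY":
            # Start a new run, unless the current one is still empty.
            if runs[-1]:
                runs.append([])
        elif line and not line.startswith("#"):
            runs[-1].append(line)
    # Drop a trailing empty run if the playlist ended on the tag.
    return [run for run in runs if run]


print(split_on_discontinuity(SAMPLE_PLAYLIST))
```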
It’s already taken off. Quite a few of the youtube videos I watch–maybe as many as 50%–are sponsored by an audiobook company or a learning-video company.
The only solution I can think of to this is a crowd-sourced database of video timestamps to skip between; this is an impossible-to-complete task which grows ever larger, and it’s open to abuse.
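The player side of such a database is the easy part. Here is a small Python sketch, assuming the database hands back a list of (start, end) second ranges per video; that shape and the function names are my own invention, not any existing extension's API. It merges overlapping submissions and works out where to seek to. It does nothing about the abuse problem.

```python
def merge_ranges(ranges):
    """Merge overlapping or touching (start, end) ranges, sorted by start."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def next_position(position, skip_ranges):
    """Return where playback should continue from, given skip ranges."""
    for start, end in merge_ranges(skip_ranges):
        if start <= position < end:
            return end
    return position


# Hypothetical submissions from different users for one video, in seconds.
submissions = [(60, 90), (85, 120), (300, 330)]
print(merge_ranges(submissions))
print(next_position(70, submissions))
```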
There’s a machine learning model that was trained to skip sponsorship sections, too. Personally, though, I’m not so bothered if they were picked by the creator and the creator is getting paid directly and reasonably well for it.
The leading extension that blocks sponsorships relies on user-submitted times, what’s this machine learning driven one you’ve mentioned? Actually pretty curious about this, I’ve been planning to build an ad-blocker for the TV!
It was a recurrent neural net trained on the automatic video transcriptions: Reddit thread (and very good intro video); repo.
Reminded me of Ad Muncher!