I don’t know how it works when GPU compositing is used (which I think has to be supported as a fallback for some scenarios). But at least when hardware display controller planes are used for video compositing, with the data coming straight from the video decode block, there is some kind of DRM mechanism that encrypts the decoded video frames. The telltale is a decryption key selection field in the layer configuration for the display controller.
I don’t know how much of this is hardware-managed via SEP and how much just relies on secure boot… as far as I know, downgrading system security breaks or limits stuff like Netflix so at least some of it seems to rely on the integrity of macOS as a whole.
Honestly, it’s probably in large part compositor policy. When you ask for a screenshot, that has to come from the compositor. If there’s a video overlay plane involved, the compositor is supposed to GPU-composite it back to merge it with the framebuffer (to create a visually complete screenshot). If it’s a DRM video, it just doesn’t do that (even if it does have some mechanism to decrypt and composite it, which it probably does since it needs to support full GPU compositing for fallback reasons). In the opposite case, when a DRM video is already being GPU composited, it would just re-composite without it, leaving that area black. Apple can rely on the compositor not being evil or backdoored because they have their whole codesigning and entitlement system, so you couldn’t just replace it with a custom compositor: it wouldn’t have the permissions it needs to bypass DRM.
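A toy model of that policy, purely as illustration: the `Surface` class, the `protected` flag and `composite` are made up for this sketch and have nothing to do with Apple’s actual compositor. Each surface is a rectangle with a single fill value, and a screenshot pass simply paints black where a protected surface would go.

```python
from dataclasses import dataclass

BLACK = 0

@dataclass
class Surface:
    x: int
    y: int
    width: int
    height: int
    pixel: int               # single fill value standing in for real image data
    protected: bool = False  # hypothetical flag: surface carries DRM video

def composite(surfaces, width, height, for_screenshot=False):
    """Paint surfaces back to front into a width x height frame."""
    frame = [[BLACK] * width for _ in range(height)]
    for s in surfaces:
        # Policy: in a screenshot, a protected surface's area is left black.
        value = BLACK if (for_screenshot and s.protected) else s.pixel
        for row in range(s.y, min(s.y + s.height, height)):
            for col in range(s.x, min(s.x + s.width, width)):
                frame[row][col] = value
    return frame

desktop = Surface(0, 0, 4, 3, pixel=1)
video = Surface(1, 1, 2, 1, pixel=9, protected=True)

on_screen = composite([desktop, video], 4, 3)
screenshot = composite([desktop, video], 4, 3, for_screenshot=True)
```

The same two surfaces give a full picture for display and a black hole for the screenshot, without any change to the video data itself. The decision lives entirely in the compositor, which is why it has to be trusted.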
For this all to work, apps have to use the DRM stuff (FairPlay) and request decoding DRMed video into separate encrypted buffers, which are sent to the compositor as separate display surfaces (instead of apps doing their own compositing within their own window). It’s the same mechanism that allows for offloading final compositing to the display controller, which saves power regardless of DRM. It’s the reason why Safari does not wake up the GPU at all when playing video (DRM or not), since the frames are sent straight from the video decoder to the display controller as a separate plane. The GPU is only powered on when something else in the UI other than the video needs to update.
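As a rough sketch of why that saves power (the `Layer` type and `gpu_needed` are invented names, and the real scheduling is surely more involved): if the only layer that changed is one the display controller scans out directly, nothing the GPU composites has changed, so it can stay off.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    on_hardware_plane: bool  # scanned out directly by the display controller
    dirty: bool              # content changed since the last frame

def gpu_needed(layers):
    """The GPU only has to run if a layer it composites has changed."""
    return any(layer.dirty and not layer.on_hardware_plane for layer in layers)

video = Layer("video", on_hardware_plane=True, dirty=True)
ui = Layer("browser UI", on_hardware_plane=False, dirty=False)

video_only = gpu_needed([video, ui])   # new video frame, UI untouched

ui.dirty = True
with_ui_update = gpu_needed([video, ui])  # something in the UI changed too
```

Nothing in this check depends on whether the video is DRM-protected, which matches the point that the plane offload helps regardless of DRM.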
If you use Asahi Linux then there is no hardware/trusted DRM and no trusted compositor and you can just screenshot whatever you want. You need some manual setup for Netflix to work and it’s limited to 1080p since it’s a software Widevine implementation.
Happily, this gives non-DRM content a competitive advantage in our very diverse and free media marketplaces. It’s not irony, it’s basic economics! And, ok, some sarcasm too… but just a little.
I’m surprised it works on Windows, because the kernel docs suggest it shouldn’t. DRM’d media is sent to the GPU as an encrypted stream, with the key securely exchanged between the GPU and the server. It’s decrypted as a special kind of texture and you can’t (at least in theory) copy that back; you can only composite it into frames that are sent to the display. The connection between the display and GPU is also end-to-end encrypted, though I believe this is completely broken.
My understanding of Widevine was that it required this trusted path to play HD content and would downgrade to SD if it didn’t exist.
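To make that fallback concrete, here is a minimal sketch under stated assumptions: the 720-line cutoff for "HD", the rendition ladder and the `pick_rendition` name are all placeholders of mine, and the real limits are set per service and not spelled out in this thread.

```python
HD_MIN_HEIGHT = 720  # assumption: "HD" means 720 lines or more

def pick_rendition(heights, trusted_path):
    """Return the tallest rendition the client is allowed to play, or None."""
    # Without a trusted path, only sub-HD renditions are allowed.
    allowed = [h for h in heights if trusted_path or h < HD_MIN_HEIGHT]
    return max(allowed) if allowed else None

ladder = [480, 720, 1080, 2160]  # placeholder rendition heights

with_path = pick_rendition(ladder, trusted_path=True)
without_path = pick_rendition(ladder, trusted_path=False)
```

The point is only that the server side picks from a ladder based on what the client can prove about its playback path; the player never gets the higher renditions to begin with.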
No one is going to create bootleg copies of DRM-protected video one screenshotted still frame at a time — and even if they tried, they’d be capturing only the images, not the sound
If you have a path that goes from GPU texture back to the CPU, then you can feed this straight back into something that recompresses the video and saves it. And I don’t know why you’d think this wouldn’t give you sound: the secure path for the sound usually goes the same way, but most things also support sound via other paths because headphones typically don’t support the secure path. It’s trivial to write an Audio Unit for macOS that presents as an output device and writes audio to a file (several exist, I think there’s even an Apple-provided sample that does). That just leaves you having to synchronise the audio and video streams.
I’m pretty sure that what Gruber is describing is basically just “hardware acceleration is not being enabled on many Windows systems”, but because he has his own little narrative in his head he goes on about how somehow the Windows graphics stack must be less integrated. Windows is the primary platform for so much of this stuff!
I would discount this entire article’s technical contents and instead find some other source for finding out why this is the case.
Well, it depends on the type of acceleration we’re speaking of. But I’ve tried forcing hardware acceleration on video decode, on rather new hardware, and honestly you’d be surprised how often it failed. It was actually shockingly unreliable. I’m fairly certain it’s significantly worse if you extend your view to older hardware and other vendors.
I’m also fairly sure, judging by people’s complaints, that throwing variable refresh rate, higher bit depths and hardware-accelerated scheduling into the mix has resulted in neither flagship reliability nor flagship performance.
It can be the primary platform but this doesn’t mean it’s good or always does what it should or promises it’ll do.
I think it means: enabling the feature to screenshot DRM-protected media would not by itself enable piracy, since people would not use screenshots to pirate media a frame at a time.
What you are saying reads like “one technical implementation of allowing screenshots would enable piracy.” I trust that you’re probably right, but that doesn’t contradict the point that people would not use that UI affordance itself for piracy.
No one would use screenshots for piracy because all the DRM is already cracked. Every 4K Netflix, Disney, etc. show is already on piracy websites, and they’re not even re-encoded from the video output or anything, it’s straight up the original H.264 or H.265 video stream. Same with Blu-rays.
Yup, if you go through GitHub there are several reverse-engineered implementations of Widevine, which let you decrypt the video stream itself with no need to re-encode. That moves the hard part to getting the key. The lower security ones are fairly easy to get, since you can just root an Android device (and possibly even get them from Google’s official emulator? At least it supports playing Widevine video!). The higher security ones are hardcoded into secure enclaves on the GPU/CPU/video decoder, but clearly people have found ways to extract them. Those no-name TV streaming boxes don’t exactly have a good track record of security, so if I were to guess, that’s where they’re getting the keys.
Still, no point blocking screenshots - pirates are already able to decrypt the video file itself which is way better than reencoding.
Those no-name TV streaming boxes usually use the vendor’s recommended way to do it, which is mostly secure, but it’s not super-unusual for provisioning data never to be deleted off the filesystem, even on big brand devices.
The bigger issue with the DRM ecosystem is that all it takes is for one secure enclave implementation to be cracked, and they have a near infinite series of keys to use. Do it on a popular device, and Google can’t revoke the entire series either.
Personally, I’m willing to bet the currently used L1 keys have come off Tegra-based devices, since they have a compromised boot chain through the RCM exploit, as made famous by the Nintendo Switch.
https://developer.chrome.com/docs/chromium/videong Here is a link to what Chrome is doing, which provides a bit more insight into why screenshots outside of the DRM framework end up black-screening anyway. Basically, Chrome uses the GPU to pipe the video directly to the screen, sidestepping the OS compositor.
I’m not entirely sure what the technical answer to this is, but on MacOS, it seemingly involves the GPU and video decoding hardware.
Illuminating!
I recall Marcan who worked on Asahi posted a very technical account of how this all actually works, but since he deleted his Mastodon account I can no longer find it. IIRC it works using the fact that the screen output comes directly from the GPU, and its memory is not directly readable from userspace apps like the screenshot or screen recording tool. Streaming service DRM takes advantage of this by getting the bits directly (to the extent possible) into GPU memory where they can’t be copied by userspace tools. I believe this actually involves stream decryption performed on the GPU or a Trusted Execution Environment, which is why GPU firmware remains closed-source even as their drivers are open-source.
All streaming services implement a variety of quality fallbacks, so if you watch Netflix on Linux in the browser for example you’ll only get 1080p streams at max. In order to watch their 4k content you need a fully locked-down system. The term to search for here is “widevine DRM”.
All of this is to say that the “black screen” you get from screenshots isn’t from DRM swooping in and painting black over the video in the screenshot tool, but because it’s really all the screenshot tool can see. If you try this on a desktop where the video is in a floating window, there will be a black square there, because that’s what userspace sends to the GPU to be rendered, and the video is only inserted afterwards, on the GPU.
I don’t remember anything like that from marcan… and there are no secrets in GPU firmware, I’m pretty sure. I don’t even know how you’d implement DRM in the GPU, we haven’t seen any hints of that. In fact the GPU and the display controller are separate hardware and all framebuffers are in main RAM (it’s all unified memory), so there is no way to “directly” output anything from the GPU in a way that can’t be read by the CPU. The GPU firmware is closed source because everything is closed source, Apple didn’t open source anything. It’s not encrypted or secured though, you can just download it from Apple’s CDN and disassemble it. It’s not possible to replace the firmware but that’s not for DRM reasons, it’s due to system security design, and even if it were possible I don’t think there’s any interest in developing open source firmware since it would be a huge amount of work for little gain.
What does exist is some kind of DRM support in the display controller and video decode blocks, I think (see my top level comment).
the ironic side effect of this is people can’t share memes/screenshots of shows, which is free marketing/virality for new shows
Wait wait wait is this,,, checks URL, oh, lmao. Yeah Gruber is useless there’s literally no point in ever reading a single word he says.
Before DoD Speed Ripper, early DVD rips hooked into a less-than-protected PowerDVD player and really did just screenshot every frame.
I’ve found this is so frustrating as a creative person, when wanting to collect inspiration etc. :(
Must have been thinking of someone else. Thanks for the explanation!
Were you thinking of retr0id’s article on the subject? https://www.da.vidbuchanan.co.uk/blog/netflix-on-asahi.html