My favorite thing about eBPF is that they took code from BSD, made it incompatible with BSD, and are now pushing it as a standard.
What parts aren’t compatible? My understanding is that it’s a pure extension.
Linux is usually pretty good about compatibility.
It’s entirely incompatible. As one example, BSD BPF jumps have both true and false targets, while Linux BPF jumps have only a true target and fall through if false. The Linux kernel documentation covers the details.
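(For the curious, here is roughly how the two encodings compare, as defined in the Linux UAPI headers <linux/filter.h> and <linux/bpf.h>; the comments are mine:)

    #include <linux/types.h>

    /* Classic (cBPF) instruction: every conditional jump encodes BOTH targets. */
    struct sock_filter {
        __u16 code; /* opcode */
        __u8  jt;   /* offset to jump to if the condition is true */
        __u8  jf;   /* offset to jump to if the condition is false */
        __u32 k;    /* generic immediate/operand field */
    };

    /* eBPF instruction: a single signed offset, taken only when the
     * condition is true; execution falls through otherwise. */
    struct bpf_insn {
        __u8  code;      /* opcode */
        __u8  dst_reg:4; /* destination register (R0-R10) */
        __u8  src_reg:4; /* source register */
        __s16 off;       /* signed jump offset (true target only) */
        __s32 imm;       /* signed immediate */
    };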
That’s a good page. It seems to say that there’s a BPF to eBPF translator. I was thinking they implemented a BPF runtime and an eBPF runtime, which is a very Linux thing to do, but that doesn’t appear to be the case.
BPF is a general-purpose RISC instruction set. Not every register and every instruction is used during translation from original BPF to eBPF.
If it actually does that translation, and does it well, I wouldn’t call it incompatible.
It also mentions a good reason for that translator – to make it use the capabilities of modern machines! E.g., more registers and bigger registers.
Eh, Rosetta translates from x86 to ARM, but it would be a stretch to call ARM compatible with x86, let alone ARM being a pure extension of x86. I highlighted the jump change because it makes clear that translating BPF to eBPF instruction by instruction is not feasible.
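To illustrate (a hand-written sketch, not the kernel’s actual output – the real translation lives in the kernel’s bpf_convert_filter()): one classic conditional jump, which encodes both targets, has to become two eBPF instructions, so every later offset shifts.

    #include <linux/bpf.h>

    /* Sketch only: a classic "JEQ #0x800, jt=1, jf=3" carries both targets
     * in one instruction. eBPF can only jump on true, so the translator
     * must emit a second, unconditional jump for the false case. */
    static const struct bpf_insn expanded[] = {
        /* if (R0 == 0x800) goto +1;  -- the old jt target */
        { .code = 0x15 /* BPF_JMP|BPF_JEQ|BPF_K */, .off = 1, .imm = 0x800 },
        /* goto +3;                   -- the old jf target, now its own insn */
        { .code = 0x05 /* BPF_JMP|BPF_JA */, .off = 3 },
        /* ...and every jump offset after this point must be recomputed. */
    };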
I think the closest analogy is ARM32 and ARM64. Unlike x86_64, which is indeed an extension of x86, ARM64 is not an extension of ARM32 and is not compatible with it, although the inspiration is clear. eBPF is clearly inspired by BPF, and it even keeps all the BPF ALU opcodes exactly the same, but that’s it.
Regardless of what you call it, I think the original post is snark that doesn’t add any real information.
If you can run existing BPF programs in Linux via translation (which is transparent?), and you have a faster and more capable engine for new programs, that seems pretty great.
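For reference, the classic user-facing API is unchanged; the kernel docs give an example along these lines (a trivial “accept everything” filter attached with SO_ATTACH_FILTER, which the kernel converts to eBPF internally before running it):

    #include <sys/socket.h>
    #include <linux/filter.h>

    /* Minimal classic BPF program: return -1, i.e. accept the whole packet. */
    static struct sock_filter insns[] = {
        { .code = BPF_RET | BPF_K, .k = 0xffffffff },
    };

    static int attach_classic_filter(int sock_fd)
    {
        struct sock_fprog prog = {
            .len    = sizeof(insns) / sizeof(insns[0]),
            .filter = insns,
        };
        /* The kernel accepts the classic program as-is; the eBPF
         * translation is invisible to userspace. */
        return setsockopt(sock_fd, SOL_SOCKET, SO_ATTACH_FILTER,
                          &prog, sizeof(prog));
    }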
Overall what the page describes seems pretty well motivated. It’s not just breaking things because of ignorance. (There was some argument that epoll() failed to learn from prior art, etc.; splice() and cgroups seem to have a bunch of mistakes.)
It’s often true that Linux makes an implementation-defined mess, but that is what standardization is intended to solve … You can’t really have standardization until you have a bunch of competing implementations. POSIX came after the Unix wars, not before, etc.
“to encourage NVMe vendors to support BPF offloading”
Didn’t expect to see that. According to this paper, some people have already tried some weird bypasses of the normal kernel storage stack for performance boosts. Wonder how well you can actually offload stuff to the NVMe device here. This would probably also increase the “fsync isn’t actually fsync” problems; the storage device could lie about even more things.
A lot of deployments have had to disable eBPF because it turns out to be an amazing way of injecting gadgets into the kernel that can use transient-execution information leaks in the hardware to leak kernel secrets to an unprivileged attacker. I wonder what exciting security holes will be added by eBPF offload. My guess is that violations of tenant isolation on hardware accelerators will be the topic of a load of adversarial security research in a few years’ time.
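For anyone who hasn’t seen it, the underlying pattern is the textbook Spectre-v1 bounds-check bypass; a hypothetical sketch in plain C (not a real eBPF program, and all names are illustrative):

    #include <stddef.h>
    #include <stdint.h>

    static uint8_t data[16];
    static volatile size_t data_len = sizeof(data);

    void gadget(size_t idx, const uint8_t *probe_array)
    {
        if (idx < data_len) {              /* branch can be mispredicted */
            uint8_t byte = data[idx];      /* speculative out-of-bounds load
                                              when idx >= data_len */
            (void)probe_array[byte * 64];  /* encodes the byte into cache
                                              state, recoverable by timing */
        }
    }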
Not that I’m happy about it, but it feels like information leaks have to be in a different class than other vulnerabilities.
It’s a lost cause in web browsers, and I haven’t followed what happened with CPU caches, but that also seems intractable…
In general, I largely agree, but in the absence of something like CHERI it’s very hard to avoid using secrets to build other security features, and if you can leak secrets from one security context into another, that’s a huge problem. By allowing userspace to inject gadgets into the kernel, you break any security that depends on the kernel protecting secrets or keys (e.g. KTLS, disk encryption, and so on) and any userspace thing that uses the kernel’s secure key storage interfaces. Oh, and because Linux still has a direct map (unless the work to change that landed while I wasn’t looking), you can also exfiltrate keys stored in any other process.