I’ve used that recently to explain latency. It’s odd how even people with a CS background sometimes act as if you could beat the speed of light. In this case it was about network latency: someone claimed that, by going through a third-party service, a Europe/US connection could be made faster than light travelling the direct path. Never mind that there won’t be a direct connection, least of all through a third-party service, and that we weren’t talking about light in a vacuum but light in fiber, and so on.
At the same time it always astonishes me how good the network infrastructure is and how low typical network latency has become. I know it took a lot of effort, but I also think it’s an achievement that’s mentioned far too rarely, given that the low-latency exchange of information is such a cornerstone for so many applications.
It’s odd, even people with CS background sometimes act like you can beat the speed of light
You can’t beat the speed of light, but you can often cheat. If something is not going to change, you can cache it and pretend that you’re fetching it from the distant thing. Pretty much every layer in a modern system does this a lot, beating lightspeed limitations by shortening the distance. When that fails, you guess. Everything from instruction fetch to player positions in online games does some variation of this: pick a plausible value based on what you’ve seen so far, assume it’s correct, and retry if it isn’t. If you’re 10ms away, at worst your predictions will be 10ms out of sync, and that’s easy to correct without people noticing. It works best if the thing you’re guessing is not the data but the location of the data, so you can speculatively fetch the data. There’s still a lightspeed (or, often, much slower) delay between the fetch and the response, but the fetch happens in reality before it happens logically.
Between caching to reduce the distance and guessing to move the request back in time, you can often beat lightspeed limitations of a naïve system. Nothing is a causality violation, but the user doesn’t care how much you cheat within the laws of physics, as long as the right answer comes out at the end.
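A minimal sketch of that guess-and-correct idea for the player-position case (the numbers and function names are made up for illustration, not any particular engine’s API):

    # Client-side prediction ("dead reckoning"): extrapolate from the last
    # authoritative state, then blend toward the truth when it arrives late.
    from dataclasses import dataclass

    @dataclass
    class Snapshot:
        t: float    # server timestamp, seconds
        x: float    # position
        vx: float   # velocity

    def predict(last: Snapshot, now: float) -> float:
        # Assume the player kept moving at the last known velocity.
        return last.x + last.vx * (now - last.t)

    def correct(predicted: float, authoritative: float, alpha: float = 0.3) -> float:
        # Ease toward the real value instead of snapping, so the fix-up is
        # barely visible; at ~10 ms of lag the error is tiny anyway.
        return predicted + alpha * (authoritative - predicted)

    last = Snapshot(t=0.000, x=10.0, vx=5.0)
    guess = predict(last, now=0.010)   # render this immediately
    truth = 10.048                     # authoritative update, 10 ms late
    print(guess, truth, correct(guess, truth))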
Yet neither caching nor prediction will do anything to your latency.
That’s the thing. Everyone knows about caching, and everyone knows you can build algorithms that “predict” things, sometimes in a really broad sense of the word prediction. There are caching algorithms that seem almost magical when you first learn about them (ARC, etc.). Yet none of these things will raise the speed of light.
And that’s exactly what people tend to have a problem with. Everyone knows light can go around the world about 7.5 times per second, but in many situations nobody breaks that down into milliseconds for a given distance.
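To make that concrete, a back-of-the-envelope sketch (the ~6,200 km Frankfurt–New York distance and the two-thirds-of-c speed in fiber are rough assumptions):

    # Lower bound on Europe/US latency from the physics alone.
    c_vacuum_km_s = 299_792                # speed of light in vacuum, km/s
    c_fiber_km_s = c_vacuum_km_s * 2 / 3   # roughly what fiber manages
    distance_km = 6_200                    # rough great-circle Frankfurt-New York

    one_way_ms = distance_km / c_fiber_km_s * 1000
    print(f"one way: {one_way_ms:.0f} ms, round trip: {2 * one_way_ms:.0f} ms")
    # ~31 ms one way, ~62 ms round trip -- before any routers, queues, or detours.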
I think that sometimes computer science has wandered a bit too far from physics. That also makes sense, because basically the main benefit of using a computer over ordinary mechanics is that you don’t have to deal with physics and with everything being some endlessly long decimal value; you “just” apply logic.
But it so often leads to people forgetting that their product ends up in a world of physics, a world where things go wrong. That’s especially true for things like IoT and computers embedded in more classical tools, but something in the “cloud” can also physically burn down.
And while it’s not physics, I think that’s another area where it would be great to invest more reasoning. It’s sad how often the caching strategy is just throwing something into Redis or a cache library when, for example, an HTTP library may already cache things itself and serve them far faster than a TCP connection can even be established. And speaking of Redis over TCP connections: it’s astounding how often people are surprised to find that their project runs so much faster when they switch over to good old Unix sockets.
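For the Redis point, a minimal sketch with the redis-py client (the socket path is an assumption; it depends entirely on how the server is configured):

    import redis

    # Over TCP on loopback: every request crosses the kernel's network stack,
    # and a fresh connection also pays the TCP handshake first.
    r_tcp = redis.Redis(host="127.0.0.1", port=6379)

    # Over a Unix domain socket: same server, same commands, no TCP at all.
    # Needs something like "unixsocket /var/run/redis/redis.sock" in redis.conf.
    r_unix = redis.Redis(unix_socket_path="/var/run/redis/redis.sock")

    r_unix.set("answer", 42)
    print(r_unix.get("answer"))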
In other words: people get bitten far too often by lacking a basic understanding of the layer below, whether that layer is HTTP, TCP, the CPU and memory, the OS’s capabilities, physics, or something else. Properties will leak through one way or another; there is only so much that abstractions can protect you from, whether the abstraction is the OS, some cloud API, or something else.
And I think Grace Hopper’s piece of wire is the prime example of that. You will hit limits. And while most people are used to these being some form of optimization problem, sometimes these limits are hard: sometimes they are algorithmic and sometimes they are physical.
Knowing your limits is important. :)
I really wouldn’t call caching and predicting cheating physics. You never do that; these are different domains. Having something closer is basically making use of physics and respecting its limit, and predicting things is algorithms. That’s what I meant by saying that neither caching nor prediction will do anything to latency.
What I mean is that information going from A to B has its limit (like you say, causality). If A is somewhere else (because of a cache), that’s still true. And if you guess, that’s something that isn’t transferring information at all (guessing is part of caching anyway; it’s what distinguishes one caching algorithm from another).
But I admit that’s basically a question of phrasing. However, I think it’s important to make that distinction, because it prevents people from getting the wrong picture in their heads. And it can help to have that distinction in mind when developing algorithms, like the thought that starts with “I can’t do X, but I can do…”. That’s why I’ve always disliked it when people call things breaking, bending, or cheating physics, when it’s really just applying an understanding of it. It’s similar to when people say that an algorithm is better than the theoretical limit. That’s not what you’re doing.
Downloading a video from YouTube and playing it from your computer rather than streaming it over the internet isn’t cheating physics. Just as extrapolation, which is what a lot of prediction does at its core, isn’t defying physics or maths either.
Yet neither caching nor prediction will do anything to your latency.
Absolutely untrue. They will not do anything about your tail latency (except possibly make it worse by doing some extra compute on the path of the request). They will dramatically improve your average-case latency. On a typical computer, DRAM latency is on the order of a couple of hundred cycles, and a big chunk of that is the physics of sending the message out to the DRAM chip. With CXL, it can be on the order of a thousand cycles. The latency of fetching from the L1 cache is single-digit cycles. For typical workloads, you hit in L1 over 90% of the time. Your average latency (mean, median, or mode) improves by one to two orders of magnitude. Your worst-case latency is unchanged (maybe a few percent bigger, because you have to check that you’re missing in all levels of cache before you fetch).
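As a rough sketch of that arithmetic (the cycle counts echo the figures above; the hit rates are illustrative):

    def avg_latency(hit_rate, hit_cycles=4, miss_cycles=300):
        # Expected cost per access: hits come from L1, misses go to DRAM.
        return hit_rate * hit_cycles + (1 - hit_rate) * miss_cycles

    for hr in (0.90, 0.99, 0.999):
        print(f"hit rate {hr:.1%}: ~{avg_latency(hr):.1f} cycles on average")
    # 90% -> ~33.6, 99% -> ~7.0, 99.9% -> ~4.3 cycles on average,
    # versus ~300 cycles every time with no cache; the worst case (a miss)
    # is still ~300 cycles, which is the tail-latency point.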
Similarly, if you’re walking through memory in a predictable access pattern and your prefetcher is working, you will access memory at L1 latency, because the data will have been streamed into L1 ahead of you. When you get to the end of the predictable pattern, your worst case hits again, but your average has improved dramatically.
Both have given you a huge decrease in latency. Saying that they haven’t, because you ignore all latency except tail latency, would imply computers running at under 1% of their current speed.
Of course, in a distributed system, overall throughput may depend on the tail latency of individual nodes.
Yet neither caching nor prediction will do anything to your latency.
Absolutely untrue.
No it’s not.
They will dramatically improve your average-case latency. On a typical computer, DRAM latency is on the order of a couple of hundred cycles, and a big chunk of that is the physics of sending the message out to the DRAM chip. With CXL, it can be on the order of a thousand cycles. The latency of fetching from the L1 cache is single-digit cycles
Yes, exactly as you write: the latency of DRAM vs the latency of your L1 cache. Two latencies. Your DRAM latency isn’t any faster; you’ve just switched over to the L1 cache’s latency. Each latency on its own is still the same.
Downloading a video doesn’t mean you decreased your latency to YouTube. You simply swapped the latency to YouTube for the latency to wherever your file now lives.
And for prediction it’s similar. Forgetting that it’s a prediction, and not actually lower latency (and by extension forgetting physics and the physical world, like a computer turning off or a packet getting physically lost), means that people end up having issues with things like rollbacks.
EDIT:
Again: I am talking about the notion of physical limits and how they are not the same as the limits of algorithms (prediction, caching, or just approaching a problem from a different angle), which people with a CS background usually understand well. I’ve written networked applications that cache and that predict, and I know how things like video games do these things. None of them cheats physics.
There is an odd twist where “dead reckoning” (forward prediction and correction) meets causality, used in some emulation gaming scenarios to lower perceived latency. It requires enough snapshotting support to perform whole-system rollbacks, as in virtualization, plus the ability to step forward at a rate proportional to the size of your rollback window.
Basically, on key/button input you roll back by the number of frames of lag that your display path introduces compared to the emulated target. You then inject the input events as if they had happened when you thought they did, fast-forward back to the “current” state, and you get much closer to the ‘feel’ of older display systems, where things could at times respond per scanline. There are some seemingly unavoidable glitches in audio (the sound effect for bullets fired etc. will be clipped, but with a little dynamic resampling and filtering you can sort of get away with it).
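A sketch of that loop; save_state/load_state/run_frame are placeholders for whatever the emulator core actually exposes, not RetroArch’s real implementation:

    # "Run-ahead": hide the display pipeline's lag by re-simulating the last
    # few frames as if the new input had arrived back then.
    RUNAHEAD_FRAMES = 2      # frames of lag the display path adds (assumption)
    history = []             # snapshots taken at the start of recent frames

    def every_frame(core, pad_state):
        history.append(core.save_state())
        del history[:-(RUNAHEAD_FRAMES + 1)]       # keep a small rollback window
        if pad_state.changed and len(history) > RUNAHEAD_FRAMES:
            core.load_state(history[0])            # roll back N frames
            for _ in range(RUNAHEAD_FRAMES):       # replay them as if the press
                core.run_frame(pad_state)          # had happened back then
        core.run_frame(pad_state)                  # the "current" frame
        core.present_frame()                       # the shown frame already
                                                   # reflects the button press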
I’ve used the same tactic in debugger-aided fuzzing to start at a deeper state, with great success: use a breakpoint as your steady state, scrape the locals to get a crude type model, generate new test cases (your causal counterfactuals), project them over the locals, step forward n instructions, roll back, inject a new case, and so on.
What you write sounds very interesting, but I don’t understand most of it. (It sounds like you are very knowledgeable about this, but have built up a specialized vocabulary, either your own or that of a community I’m not part of. Some things that are pre-existing concepts to you are unknown to me. Also, the writing is just hard to follow; your first and last paragraphs are one sentence each.)
If you were motivated, I would be up for a clearer explanation of your ideas, one that does not assume too much shared background from the reader. Maybe with some links for further information on specific ideas, some examples, etc.
Concretely, in your last paragraph for example:
- this is my first time hearing about debugger-aided fuzzing, and I’m not sure what it is (I know about debuggers and about fuzzing)
- I’m not sure exactly what you call a “state” (deeper state, steady state); I suppose it is the program state, but it could be something else
- I suspect that “locals” means “local variables” and “type model” indeed means “reconstruct type information about the current state of the program”, but I’m not sure why that is related to the discussion, so I cannot validate my guess
- no idea what “causal counterfactuals” are
- not sure what “project over the locals” means
- unsure about how to interpret the rest (“inject a new test case”?)
For most of these questions I could make guesses, maybe one or two hypotheses about what you meant. But there isn’t enough context to validate or invalidate those guesses. It feels like I would need to do a 2^N brute-force search to work out which possible meanings of the whole sentence make sense. I would be interested in a lower-effort way to understand your explanation.
The latency-compensation tactic can be found in marketing material from the RetroArch project, where they call it ‘RunAhead’, but there is a mountain of work (hundreds of forum threads and little to no structured publication) to dig through to build enough intuition.
The closest public example I have for the fuzzing method would be https://jamchamb.net/projects/fuzzydolphin though the one I hinted at was part of a larger startup / project that was terminated shortly before early release (thanks Covid). There are some dusty papers from VMware folks that I can’t find right now that were also poking around in this general area.
For the mindset on debugging, execution, and state, there is a less dense write-up at systemicsoftwaredebugging.com (albeit dated), in chapter 3 on Principal Debugging. I’m assuming from a quick glance that you come from more of a programming-language-theory background than my default of reverse engineering and offensive security. Think of your source code as a causal model (“Do A, then B, then C”), possibly with some amount of counterfactuals (“If not D then E, otherwise F”). Compilation, linking, loading and so on, all the way down to the hardware, add nuance that this model doesn’t cover. Debuggers (https://gu.outerproduct.net/debug.html) let us explore, experiment, and reason about the entire chain from some analyst-guided vantage point (breakpoints, watches, etc.).
This can double as the ‘harness’ stage for fuzz testing, with rollback simply acting as a big optimization. From here, ‘locals’ are symbols within the hierarchy of function arguments, scopes, the supporting runtime, etc. that somehow direct execution down one causal path, meaning that we could modify them to explore an alternate but plausible reality. To prune the uninteresting ones (flipping bits in pointers isn’t a great idea), you can use debugger leftovers (or, if you come from an interactive disassembler like IDA Pro, your annotation database) to generate a grammar for synthesising test cases, or combine with other methods like https://www.fuzzingbook.org/html/ConcolicFuzzer.html, and bias with your experience of the code.
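To make the loop above concrete, here is a sketch with an entirely hypothetical debugger/snapshot API (run_until, snapshot, restore, read_locals, write_local and step_instructions don’t refer to any real tool):

    import random

    def fuzz_from_breakpoint(dbg, n_cases=1000):
        dbg.run_until("parse_packet")        # reach the deeper "steady state"
        base = dbg.snapshot()                # whole-process snapshot

        # Scrape the locals once for a crude model of what is worth mutating.
        model = {name: val for name, val in dbg.read_locals().items()
                 if isinstance(val, (int, bytes))}   # skip raw pointers

        for _ in range(n_cases):
            dbg.restore(base)                        # rollback beats re-running
            for name, val in model.items():
                dbg.write_local(name, mutate(val))   # inject the counterfactual
            if dbg.step_instructions(10_000):        # bounded forward step;
                dbg.save_reproducer()                # True means it crashed

    def mutate(value):
        if isinstance(value, int):
            return value ^ (1 << random.randrange(32))
        return bytes(b ^ random.randrange(256) if random.random() < 0.05 else b
                     for b in value)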
FWIW I’ve fired off an email to the FOIA liaison requesting clarification as to how an individual or organization might make digitization/playback equipment and/or services available to the archive, such that this and/or future requests may be serviced.
The FOIA liaison responded, surprisingly quickly, that after a brief investigation it was determined that any coordination or assistance was beyond their scope of purview and that I should contact the NSA directly. So…..
Write a paper letter to the NSA CC’d to your own representative and senators (along with maybe the same from New York and Virginia where Hopper lived) explaining how the American public is being deprived of historically significant recordings from a foundational expert in her field because the NSA cannot and will not take the steps necessary to keep this and other media from mouldering away in their archives.
It sounds as though the issue is that the NSA can’t review the video, and therefore neither knows- nor apparently cares- whether it is a candidate for being released.
Has anybody else ever come across the NSA’s (mis)use of “responsive” in any other context?
She REALLY wanted to be in the navy. It seems like logistics became an important part of her work, which is understandable for the navy. But it was the navy she wanted to be a part of, not that particular work.
I’m curious what the standards were that she produced… specific language standards, or general standards on how languages should function?
It is perhaps notable that her page doesn’t have a “Personal life” section.
Later, yeah. I was looking at her beginnings, when she joined the Navy in her thirties after being turned down a couple of times, and then turned down other opportunities in order to stay with the Navy. But ultimately the feeling was mutual between them.
(They also brought her back from forced retirement, so it’s kind of the bureaucracy unwinding at least one retirement)
That is strange! It seems she does have (at least one) biography and someone found sources for this article which has a lot more of her personal background.
Her explanation of the nanosecond is something I often think about when it comes to optimisation: https://www.youtube.com/watch?v=9eyFDBPk4Yw
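For reference, the arithmetic behind her prop (the ~30 cm of wire she handed out as “a nanosecond”):

    c_m_per_s = 299_792_458          # speed of light in vacuum
    nanosecond = 1e-9
    print(c_m_per_s * nanosecond * 100, "cm")   # ≈ 29.98 cm, about 11.8 inches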
She would later give away small packets of ground pepper labelled “Picoseconds”.
Thanks for doing this! Is there a way we, the community, can also apply some friendly pressure and advocacy?
“We aren’t required to get new equipment to make information available”
Your organization lost its access to its own archives and you are using that as an excuse to not honour a FOIA request?
The lecture is now available!
https://lobste.rs/s/la6upp/capt_grace_hopper_on_future
https://www.youtube.com/watch?v=_bP14OzIJWI
Anybody know how she got to be a rear admiral?
Wikipedia does
I got the impression that the navy REALLY wanted her to be in the navy, and kept giving her promotions as an incentive to un-retire.