As described above, the ocaml-tls library forms the core…
The interface between OCaml and C is defined by another library…
The scope grows quite fast here, doesn’t it? C interop plus a ‘pure OCaml’ TLS implementation that is not as widely tested and additionally calls into C code itself, as stated here:
For arbitrary precision integers needed in asymmetric cryptography, we rely on zarith, which wraps libgmp. As underlying byte array structure we use cstruct (which uses OCaml Bigarray as storage).
From the PDF:
Since OCaml programs do not manipulate addresses, collection and compaction are not generally visible to a program.
I disagree. Garbage collection takes time, an attacker can observe timings & patterns in the application based on the GC impact.
I am personally disappointed that the researchers didn’t try to evaluate possible timing attacks or attacks on OCaml’s runtime itself. Considering that just recently a Pornhub bug bounty yielded $20k, and it consisted of:
We have found two use-after-free vulnerabilities in PHP’s garbage collection algorithm.
Those vulnerabilities were remotely exploitable over PHP’s unserialize function.
OCaml might shield you from buffer overflows, manual memory management, etc. I don’t think though that you can simply ignore the amount of code that is involved in achieving the goal and the various possible side-channel attacks on it.
“I don’t think though that you can simply ignore the amount of code that is involved in achieving the goal ”
They don’t. It’s why they’re using a language that provably reduces defects instead of one that adds them in common operations. ;)
“I disagree. Garbage collection takes time, an attacker can observe timings & patterns in the application based on the GC impact.”
Regarding leaks, I wrote the same thing in response to Freenet using Java. Orange Book B3 and A1-class systems are what’s designed to handle the kind of opponents they’re thinking about. Back in the ’80s to early ’90s, they required covert channel analysis to root out any storage- or timing-related leaks in a system. Time, audit, and/or try to mask the ones that had to stay. With Kemmerer’s Shared Resource Matrix, even amateurs in colleges were able to find many covert channels. That’s with software designed for easy analysis, with manual resource management, and mapped fairly close to the metal of the CPU.
Now, there are suddenly people who think writing the stuff in a complex, GC’d language that obscures all of that is a good idea. It’s not. You can use safe, systems languages for it, even with a mix of manual or GC’d memory for various components. It’s just that covert channel analysis is still necessary to show whatever scheme is in use doesn’t leak the secrets. The work goes up with GC due to the extra complexity of the situation. The easiest method is probably a form of reference counting. I was about to speculate that a dedicated background thread with concurrent GC could handle it by mutating only when the protocol engine wasn’t processing secrets. Then, as usual, I Googled first to find the MLS crowd had come up with a few ways to do garbage collection in leak-resistant systems. (shrugs) Still, it’s extra complexity that should be avoided given methods exist for safe memory management (or problem detection) without GC.
The trick is either not to touch secrets with the GC looking over your shoulder, or to explicitly use blinding. In this case, symmetric encryption is handed off to AES-NI through a small C shim, with OCaml carrying the keys as opaque structures, never exposing them to GC’d computation. And in the public key case, there is explicit blinding in place, to counter whatever GMP+GC might leak.
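The blinding countermeasure mentioned above is just modular arithmetic, so it can be sketched with toy numbers. A minimal illustration in Python, using textbook-RSA parameters (p=61, q=53) purely for demonstration; this is not real crypto code, and the function name is made up:

```python
# Toy sketch of RSA blinding, the countermeasure described above.
# The private-key exponentiation operates on c * r^e instead of c, so any
# timing/GC-correlated leak is decorrelated from the actual ciphertext.
import secrets
from math import gcd

n, e, d = 3233, 17, 2753  # textbook toy parameters: p=61, q=53

def decrypt_blinded(c: int) -> int:
    # Pick a random blinding factor r coprime to n.
    while True:
        r = secrets.randbelow(n - 2) + 2
        if gcd(r, n) == 1:
            break
    c_blind = (c * pow(r, e, n)) % n      # blind: multiply by r^e
    m_blind = pow(c_blind, d, n)          # private-key op sees only the blinded value
    return (m_blind * pow(r, -1, n)) % n  # unblind: multiply by r^-1 mod n

m = 65
c = pow(m, e, n)  # public-key encryption of m
assert decrypt_blinded(c) == m
```

The exponentiation itself is unchanged; only its input is randomized per call, which is why blinding helps even when the leak (GMP internals, GC pauses) is not under the implementer’s control.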
This is just about the C binding layer. You should check out the writeup about the actual library being bound. It was not made oblivious to the potential timing side channels introduced by the GC, although it does assume that the runtime has no glaring bugs. It might have. But the OCaml implementation is just not on the same level as PHP, in any sense.
“Since OCaml programs do not manipulate addresses, collection and compaction are not generally visible to a program.”
“I disagree. Garbage collection takes time, an attacker can observe timings & patterns in the application based on the GC impact.”
I believe that what they meant here was that collection and compaction are not something the developer explicitly writes code for. (“…visible in a program”, perhaps.)
“I disagree. Garbage collection takes time, an attacker can observe timings & patterns in the application based on the GC impact.”
I’m not sure if it applies in this case, but one thing OCaml has going for it is that the GC can only kick in on allocations, so if they allocate all of their data up front before the algorithm runs, they can avoid any GC interaction.
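The pre-allocation pattern described above is language-agnostic: allocate the working buffers once, then have the hot path only mutate them in place. A rough sketch (in Python, purely to illustrate the shape of the pattern; OCaml’s minor-heap specifics don’t carry over, and the names here are made up):

```python
# Illustrating "allocate up front, mutate in place": the loop body below
# writes into a pre-allocated buffer instead of creating new objects, so a
# collector that only triggers on allocation never runs mid-computation.
BLOCK = 16
buf = bytearray(BLOCK)  # allocated once, before the "algorithm" runs

def xor_into(buf: bytearray, block: bytes, key: bytes) -> None:
    # In-place XOR of block with key -- no allocations inside the loop.
    for i in range(BLOCK):
        buf[i] = block[i] ^ key[i]

key = bytes(range(BLOCK))
xor_into(buf, bytes(BLOCK), key)  # XOR of zeros with key yields key
assert bytes(buf) == key
```

Whether this fully removes GC interaction in a real TLS stack is another question, since parsing and I/O tend to allocate anyway; the point is only that the technique exists.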
I really wish this paper talked a little bit more about the tradeoffs of one vs the other. How does the OCaml implementation compare on memory usage, GC pause time / latency, throughput, or other important factors? What kind of bugs do you find when you fuzz both libraries?