Haskell is a product of the UK academic circles [..]
For a long time, OCaml was a product of the French academic circles. There was a gap of about 10 years when it saw almost no development, except for what the Coq project needed. Uninterested academics are the reason its libraries are all over the place, and it started picking up steam only a few years back.
For example, you would expect a standardized string type in your language of choice.
Speaking of text, OCaml’s string is a byte array and most closely matches Haskell’s strict ByteString. Oh, you want Unicode with that? Excellent – there are three small ecosystems of Unicode-processing libraries for that!
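A quick illustration of those byte-array semantics (a minimal sketch; the string literal is just UTF-8 “héllo” written out in byte escapes):

```ocaml
(* OCaml's built-in string is an immutable byte array: String.length
   counts bytes, not characters or code points *)
let s = "h\xc3\xa9llo"  (* "héllo": the é occupies two bytes in UTF-8 *)

let () =
  assert (String.length s = 6);  (* 6 bytes, although only 5 characters *)
  assert (s.[1] = '\xc3')        (* indexing lands on a raw UTF-8 byte *)
```

Getting an actual character count requires one of the third-party Unicode libraries the comment alludes to; the standard library stays byte-oriented.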
Some Haskell libraries/concepts just can’t be easily expressed with the same level of generality in OCaml. For example, the absence of typeclasses and HKTs means that you can’t as easily reason about “an arbitrary functor”.
While the first part is true, especially with value promotion, both languages contain Fω. It happens on the module level in OCaml. You can parameterize functors with functors [in OCaml] just like you can parameterize types and values with type constructors [in Haskell].
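That module-level Fω can be sketched with a higher-order functor (a minimal, hypothetical example; all module names here are made up for illustration):

```ocaml
(* A signature playing the role of a Haskell constraint on a type *)
module type SHOW = sig
  type t
  val default : t
  val show : t -> string
end

(* The type of functors SHOW -> SHOW: a "type constructor" one level up *)
module type TRANS = functor (X : SHOW) -> SHOW

(* A functor parameterized by another functor, much like a Haskell type
   parameterized by a type constructor f :: * -> * *)
module Twice (G : TRANS) (X : SHOW) = G (G (X))

(* A concrete SHOW -> SHOW functor: wrap the type in a list *)
module ListShow (X : SHOW) = struct
  type t = X.t list
  let default = [ X.default ]
  let show xs = "[" ^ String.concat "; " (List.map X.show xs) ^ "]"
end

module IntShow = struct
  type t = int
  let default = 0
  let show = string_of_int
end

(* Twice (ListShow) (IntShow) wraps int in a list twice *)
module LL = Twice (ListShow) (IntShow)
let () = print_endline (LL.show LL.default)  (* prints [[0]] *)
```

`Twice` corresponds roughly to a Haskell type like `newtype Twice f a = Twice (f (f a))`, except the abstraction lives in the module language rather than the type language.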
Meanwhile, searching for the truth is such a big thing in Haskell that every self-respecting programmer will write his/her own effect system based on the latest and coolest papers.
In an interesting twist, in my experience there is a higher proportion of PL PhDs in the OCaml crowd – although the absolute numbers might be comparable. Haskell has somehow escaped the lab and crazed the internets. OCaml is often used by people who play with type theory in Coq instead.
This eagerness to play with the machinery is probably the reason that Haskell stuff is often massively over-engineered.
I suspect that you could throw this exact comment at any post that even tangentially concerns programming languages :)
Talking about details:
Lisp does not have a good way to scope methods and fields by an object (I suspect that the dot-syntax is the greatest invention of OOP ever, ergonomically);
Lisp as a language (or any of its core tooling) does not track the history of source code changes; the image-based approach of Smalltalk is more friendly to keeping history;
Lisp does not define any special tooling for interaction with the programmer (although REPLs are definitely a Lisp invention); SLIME-style connections to a running program grew organically later; Lisp has the ability to re-run different parts of code in the same context, but does not have the ability to track dependent parts of code.
Lisp does not track refactoring and copy-paste in any way. Even Java tooling is much better at automatic refactoring.
Also static types, algebraic data types, monadic computation expressions, Hindley-Milner type inference, type providers, no nulls by default.
But yes in a very twisted sense every language is a lisp, in the same way that every language is a kind of forth. Amazing what you can do once you disregard most things.
All these are nice things to have, but they still exist on the code level, not on the level of interaction with a programmer. The article talks about things specific to interaction of a language/environment with a programmer.
Although typed holes and type-driven development are an interesting new development in the ergonomics of programming languages, yes.
P.S. In case you are offended by my “calling” strongly statically-typed languages “Lisps”, that was not my intention: my point was that for any programming language on earth a Lisp aficionado can find an obscure research dialect of Lisp that had prototypes of some concepts from that language (or something which is not, but looks similar enough for them).
Not offended, I just felt that it was an overly broad characterization of programming experience. For the record, I don’t dislike Lisp in any way, and there is Typed Racket. You can recreate pretty much any language feature in any Lisp, but doing so eventually approaches creating an entirely new language. While you can do this in any language, I’ll concede that it’s vastly easier to do with a Lisp or an ML.
[..] for any programming language on earth a Lisp aficionado can find an obscure research dialect of Lisp that had prototypes of some concepts from that language
Interestingly, the Lisp prototype of most of the above features was… ML.
Rather than treating programs as syntactic expressions, we should treat programs as results of a series of interactions that were used to create the program. Those interactions include writing code, but also refactoring, copy and paste or running a bit of program in REPL or a notebook system.
Every single word is related to one Lisp dialect or another.
It’s the regular way to develop Lisp programs.
You write a function, play around with it in the REPL, and make sure it works or fails.
When you only use one language, concepts that apply to many, many languages appear to apply only to your language. That’s the only way I can assume you could possibly conflate an ML with a Lisp.
Don’t you, as a Ruby developer, do most of your initial development in a REPL before saving the structures that work? This is a really common pattern with all scripting languages (and many non-interpreted languages that nevertheless have a REPL).
Long answer: the REPL is not integrated with the code editor. You cannot tell your editor to run this particular chunk of code.
But let’s assume you can integrate the Ruby REPL with your code editor. I cannot imagine how you would run the particular method of some particular class you want to play around with. You have to evaluate whole classes. But let’s assume it’s okay to evaluate a whole class to run one method. What about dependencies?
For example, you are writing a project MVP with Rails. Each time you want to test your super lightweight and simple class, you have to load every single dependency, since you cannot attach to the running Ruby process.
And I’m not even talking about global immutability, which will add a lot of headache as well.
Ohh, you’re a Rails developer. OK, I understand now – having a web server and a web browser in the way makes it hard to do anything iteratively in a REPL.
It’s pretty common, with scripting languages, to load all supporting modules into the REPL, experiment, and either export command history or serialize/prettyprint definitions (if your language stores original source) to apply changes. Image-based environments (like many implementations of smalltalk) will keep your changes persistent for you & you don’t actually need to dump code unless you’re doing a non-image release. All notebook-based systems (from mathematica to jupyter) are variations on the interactive-REPL model. In other words, you don’t need a lisp machine to work this way: substantial amounts of forth, python, julia, and R are developed like this (to choose an arbitrary smattering of very popular languages), along with practically all shell scripts.
Vim & emacs can spawn arbitrary interactive applications and copy arbitrary buffers to them, & no doubt ship with much more featureful integrations with particular languages; I don’t have much familiarity with alternative editors, though I’d be shocked that anybody would call something a ‘code editor’ that couldn’t integrate a REPL.
I think similar things can be done in scala. I mostly mean to say that a web stack represents multiple complicated and largely-inaccessible layers that aren’t terribly well-suited to REPL use, half of which are stuck across a network link on an enormously complex third-party sandboxed VM. Editing HTTP handlers on live servlets is of limited utility when you’re generating code in three different languages & fighting a cache.
As described above, the ocaml-tls library forms the core…
The interface between OCaml and C is defined by another library…
The scope grows quite fast here, doesn’t it? C interop, plus a ‘pure OCaml’ TLS implementation that is not as widely tested and additionally calls into C code itself, as stated here:
For arbitrary precision integers needed in asymmetric cryptography, we rely on zarith, which wraps libgmp. As underlying byte array structure we use cstruct (which uses OCaml Bigarray as storage).
From the PDF:
Since OCaml programs do not manipulate addresses, collection and compaction are not generally visible to a program.
I disagree. Garbage collection takes time, an attacker can observe timings & patterns in the application based on the GC impact.
I am personally disappointed that the researchers didn’t try to evaluate possible timing attacks or attacks on OCaml’s runtime itself. Consider that just recently a Pornhub bug bounty yielded $20k, and it consisted of:
We have found two use-after-free vulnerabilities in PHP’s garbage collection algorithm.
Those vulnerabilities were remotely exploitable over PHP’s unserialize function.
OCaml might shield you from buffer overflows, manual memory management etc. I don’t think though that you can simply ignore the amount of code that is involved in achieving the goal and various possible side channel attacks on it.
“I don’t think though that you can simply ignore the amount of code that is involved in achieving the goal”
They don’t. It’s why they’re using a language that provably reduces defects instead of one that adds them in common operations. ;)
“I disagree. Garbage collection takes time, an attacker can observe timings & patterns in the application based on the GC impact.”
Regarding leaks, I wrote the same thing in response to Freenet using Java. Orange Book B3- and A1-class systems are what was designed to handle the kind of opponents they’re thinking about. Back in the ’80s to early ’90s, they required covert channel analysis to root out any storage- or timing-related leaks in a system: time, audit, and/or try to mask the ones that had to stay. With Kemmerer’s Shared Resource Matrix, even amateurs in colleges were able to find many covert channels. And that’s with software designed for easy analysis, with manual resource management, mapped fairly close to the metal of the CPU.
Now there are suddenly people who think writing this stuff in a complex, GC’d language that obscures all of that is a good idea. It’s not. You can use safe systems languages for it, even with a mix of manual and GC’d memory for various components. It’s just that covert channel analysis is still necessary to show that whatever scheme is in use doesn’t leak the secrets. The work goes up with GC due to the extra complexity of the situation. The easiest method is probably a form of reference counting. I was about to speculate that a dedicated background thread with concurrent GC could handle it by mutating only when the protocol engine wasn’t processing secrets. Then, as usual, I Googled first to find that the MLS crowd came up with a few ways to do garbage collection in leak-resistant systems. (shrugs) Still, it’s extra complexity that should be avoided, given that methods exist for safe memory management (or problem detection) without GC.
The trick is either not to touch secrets with the GC looking over your shoulder, or explicitly use blinding. In this case, symmetric encryption is handed off to AESNI through a small C shim, with OCaml carrying the keys as opaque structures, never exposing them to GC’d computation. And in the public key case, there is explicit blinding in place, to counter whatever GMP+GC might leak.
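The blinding idea for the public-key half can be sketched generically. This is a toy with machine ints standing in for bignums, not the library’s actual code; a real implementation would use zarith/libgmp and constant-time primitives:

```ocaml
(* square-and-multiply modular exponentiation (toy; not constant-time;
   assumes b < m so intermediate products fit in an int) *)
let rec powm b e m =
  if e = 0 then 1
  else
    let h = powm b (e / 2) m in
    let h2 = h * h mod m in
    if e mod 2 = 0 then h2 else h2 * b mod m

(* modular inverse via the extended Euclidean algorithm *)
let rec egcd a b =
  if b = 0 then (a, 1, 0)
  else
    let g, x, y = egcd b (a mod b) in
    (g, y, x - (a / b) * y)

let invert r m =
  let _, x, _ = egcd r m in
  ((x mod m) + m) mod m

(* RSA blinding: never exponentiate the attacker-visible input directly.
   Blind it with a random r, do the secret-key operation, then unblind:
   (m * r^e)^d * r^(-1) = m^d (mod n), because r^(e*d) = r (mod n). *)
let blinded_rsa ~n ~e ~d ~r m =
  let m' = m * powm r e n mod n in   (* blind with random r *)
  let s' = powm m' d n in            (* private-key exponentiation *)
  s' * invert r n mod n              (* unblind *)
```

With the textbook parameters n = 3233, e = 17, d = 2753, `blinded_rsa` produces the same result as a direct `powm m d n`, but the timing of the secret exponentiation no longer correlates with the raw input.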
This is just about the C binding layer. You should check out the writeup about the actual library being bound. It was not written in ignorance of the potential timing side channels introduced by the GC, although it does assume that the runtime has no glaring bugs. It might have some. But the OCaml implementation is just not on the same level as PHP, in any sense.
Since OCaml programs do not manipulate addresses, collection and compaction are not generally visible to a program.
I disagree. Garbage collection takes time, an attacker can observe timings & patterns in the application based on the GC impact.
I believe that what they meant here was, collection and compaction are not something the developer explicitly writes code for. (“…visible in a program.”, perhaps)
I disagree. Garbage collection takes time, an attacker can observe timings & patterns in the application based on the GC impact.
I’m not sure if it applies in this case, but one thing OCaml has going for it is that the GC can only kick in on allocations, so if they have allocated all of their data up front before the algorithm runs, they can avoid any GC interaction.
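A sketch of that style (hypothetical code, not from ocaml-tls): allocate the working buffer once, then keep the hot loop free of allocations, so there is no point at which the GC can run mid-computation.

```ocaml
(* all allocation happens here, before the sensitive computation *)
let buf = Bytes.of_string "attack at dawn!"

(* the hot loop: Bytes.get/set and int operations allocate nothing, so
   the OCaml GC, which only triggers at allocation points, cannot run
   inside this function *)
let xor_in_place key =
  for i = 0 to Bytes.length buf - 1 do
    Bytes.set buf i (Char.chr (Char.code (Bytes.get buf i) lxor key))
  done

let () =
  xor_in_place 0x5A;
  xor_in_place 0x5A;  (* XORing twice with the same key restores the input *)
  assert (Bytes.to_string buf = "attack at dawn!")
```

The flip side is that any incidental allocation (a closure, a boxed float, a tuple) reopens the door to a collection, so this discipline has to be checked carefully, e.g. by inspecting the generated code.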
I really wish this paper talked a little bit more about the tradeoffs of one vs the other. How does the OCaml implementation compare on memory usage, GC pause time / latency, throughput, or other important factors? What kind of bugs do you find when you fuzz both libraries?
avoid https://github.com/hannesm/jackline/commit/0607ae0977faf92c7c4bff6c769df15b019a2daa
I am pretty sure that this will just return the string “/bin/rm -rf /” as a response to version queries. Nothing to be afraid of.
Huh. How’d you find that? And how long was that in there? The commit seems to have been made in the very early days of the project.
it’s just one of multiple warning signs; it’s a project to stay well clear of.
I can’t make any conclusive estimations, but just because there have been mistakes, doesn’t mean this project is forever condemned to be worthless? What’s your general argument?
An rm -rf / put in by the main developer is a bit more than a mistake, I’d say!
Which multiple warning signs?
He put the software under a racist license until GitHub made him change it or leave the platform.
Source?
That does not answer my question. Which a) multiple; and b) warning signs?
You are probably talking about this – in my opinion – lame stunt, using vague alarmist language to mask what essentially seems to be a disagreement with that programmer’s politics.
Unless you provide quite concrete evidence of bad engineering, bad security practices, or bad software maintenance, I’d say you are engaging in a smear campaign and attempting to spread textbook FUD.
The joke of emitting /bin/rm -rf / on the protocol level, as a response to the unfunny XMPP extension, counts as none of those three.
Protip: clone the project, then check whether the commit is still there. Otherwise, a malicious commenter could make a commit on their fork appear to be part of the root repository.
In this case, the commit actually is there:
but you need to make sure…
If it’s any good it might be a good candidate for a fork :P.
Isn’t that reinventing lisp?
My bad that I used the word “LISP” instead of “Lisp dialect” or even “Clojure” (as a good example of a modern Lisp dialect).
How does this relate to LISP?
Well, I’m a Ruby developer, far from being a pro Lisp-dialect developer.
Quick answer is no.
Clojure has a really good story for working with HTTP servers at the REPL. It’s very common to start a server and redefine an HTTP handler function.
The multithreadedness of the JVM is awesome in this regard.
Yeah, that object-oriented focus gets in the way, I get that. Lisp is also not the only functional programming language, though.
Also applies to OCaml, F#, Python, Ruby.
Edit: lol the article is about F#. That’s what I get for not reading the article I guess.
Vim! With a little script that creates a mount namespace, mounts an encrypted directory with gocryptfs, and either spawns a shell or opens vim on a markdown file with passwords.
For syncing, I carry the laptop in a bag.
Is the bag cross platform?
That bag is very portable.