What bothered me about the paper is that it seemed to assert security without justifying it. Here are the five arguments given for why functional programming is more secure.
Functions with Parameters and Results: The author seems to confuse a strong type system (“All procedures … distinguish incoming values from outgoing values”) with purity (“a program always produces the same answer given the same inputs regardless of the order of evaluation of its expressions”). The former is a good idea (at least we think it is) and has been spreading to other languages, including imperative ones. The latter is unique to functional programming languages (but not all of them).
Binding of Parameters: Contains the same claims and information as the first argument.
Recursive Calls: Asserted as a security feature without elaboration (“This provides secure programming with a proof technique based firmly on mathematical induction”), recursive calls not exclusive to functional programming.
Referential Transparency: Contains the same claims and information as the first argument.
Functions as First Class Values: Asserted as a security feature without elaboration (“This permits secure programming to create a program that models mathematical composition”), higher-order functions not exclusive to functional programming.
There’s four core ideas: strong type systems are good for security, recursion is good for security, higher-order functions are good for security, and pure functions are good for security. All of these can be done in imperative languages, too, so the argument isn’t so much “these things are good” as “these replace bad things”: weak/dynamic typing, loops, impure functions.
The paper doesn’t present any data backing these assertions up. It makes sense to me that those would be less secure, but I don’t trust my common sense. Empirical research can often contradict what makes sense. For example, there’s no consistent evidence that static typing is better than dynamic typing. That doesn’t mean, of course, that are pure functions are more secure than impure ones, but I’d love to see data confirming that, as well as showing how big the difference is. The paper doesn’t provide that though, just asserts that the difference is there and the difference is worth it.
On a sidenote, this is why I’m really excited about Python’s new gradual typing: we’ll potentially have a large set of typed and untyped code in the same language to compare.
Here’d be the questions I think would be worth exploring further: are functional languages, in practice, more secure than imperative ones? If so, how much of that is purity and how much is specific features, like option types, that can easily be ported to imperative languages?
[Comment removed by author]
If you have any good studies to recommend, I’d love to read them. I found myself influenced a lot by the papers in Making Software and Never Work In Theory and am always interested in reading more on the subjects. Probably worth making a separate thread for that, actually; I’ll get to work on that.
Re locality, I agree that I’ve found nonlocal effects to be really frustrating. I’m just not sure if the marginal locality gain of pure functions are worth the marginal difficulty versus, say, the marginal locality gain and investment of removing GOTOs and pointers. There’s different degrees of locality, and “no mutations” is more extreme than “no loops” is more extreme than “no direct memory management”.
That only is a reasonable question, of course, if there other improvements out there that don’t depend on pure functions. For example, I’m enthusiastic about model checking, specifically TLA+, which has been really helpful in reducing bugs. I’m guessing there’s other imperative wins out there, too.
Cleanroom methodology consistently produced low-defect software with a lightweight, formal method plus strong iteration and review processes. It was used with imperative languages but functional style is ideal for it.
Like the PL/S language, IBM kept the specifics of how to do Cleanroom semi-secret as a perceived advantage. They shared Cleanroom to other firms only via a paid, training program. The initial studies on it were IBM’s developers who did way better. Usually on first try. Those are paywalled at ACM & IEEE. So, I needed an article that explained the specifics without paying IBM and showed similar results on test group without paying ACM/IEEE. Given those criteria, I think a resource on students that matches industrial experience was a good choice here. :)
Far as the methods, high-assurance field constantly tested various assurance activities to at least produce a pile of anecdotes on what was effective where. They applied every method they could find. Made their stuff pretty thorough. My meta-analysis, also a draft of a high-assurance guide, is still my highest-impact comment here:
Hope you enjoy that one as it took some brainpower to produce. More effort on what to leave off & summarize than what to add. Harder for me to do that.
Damn. I was worried about that. Appreciate your reply on it.
Far as timing, this is especially true. There’s ongoing work in CompSci on handling it at high-level with proofs and such. For now, I notice it’s usually a small part of the code like crypto algorithms. I agree that’s terrible example for Haskell. The solution, given it’s just an occasional component, would be to code those in CRYPTOL, COGENT, or SPARK. Their annotations and/or backends would be modified to enforce the timing requirement in synthesized code. On language level, prior work like Flow Caml just encoded the information flow rules into the language or specs so automated tools can spot any illegal flows. There’s at least two prototypes in academia for doing that with “temporal, security policies.” I’m sure these could be extended to do any high-level analysis of information flow for something like Haskell.
Well, Ocaml was already used in safety-critical for Esterel’s SCADE code generator. They said the compiler made it very easy to do source-to-object code verification. The next use was MirageOS’s TLS stack that includes a lot of pure code in it. Really clean vs C-based TLS. There’s also a verified compiler for SML, CakeML. Original version of QuickCheck ran on ML, too. Plus things like Meta-ML and Flow ML. That ecosystem is strong for making reference implementations of high-assurance software. Some like Ocaml good for medium-assurance, production releases. Not high-assurance yet, though, without manual intervention.