1. 34
  1.  

  2. [Comment removed by author]

    1. 9

      The OP and that comment inspired me to submit this:

      https://lobste.rs/s/xmga3q/secure_programming_using_functional

Need people who understand both how software exploits work and functional programming to review it and tell me whether it’s an accurate summary of benefits worth following up on. I agree with a few on the list that I understand, which makes it worth submitting.

      re high-assurance

Welcome! It should be easier than ever given the tooling. Just don’t expect industry or FOSS acceptance of anything. My current recommendation is to develop with a high-assurance model but make the deliverable whatever language/platform is popular. In systems, that will have to be C or C++. The good news is there are at least three toolkits IIRC for doing that well. :) As far as learning proving goes, I also found a recommendation for a beginner’s resource that’s incremental and low-level enough that you understand the “magic” behind the easier provers. It’s called The Little Prover, done in Little Schemer style, with that book being all the background you need per reviews. I plan to pick it up later.

Also, do the projects incrementally on top of a medium-assurance, decent-quality project, so you get immediate benefits from regular assurance plus extra assurance over time. That approach was invented for A1-class systems, given they were taking 2-5 years apiece to build using older tools. People are throwing together analyses like that in TLA+ and Coq these days in days to months. It’s still good advice, though. And maximize reusability. Don’t do a secure chat protocol: do a link encryptor whose protocol can handle text, audio, and video. Don’t do a C-specific compiler: do C-to-intermediate, intermediate-to-intermediate, and intermediate-to-ASM stages where the intermediate is something like QBE or LLVM. And so on, so your hard work done once keeps paying off with minimal rework.

      Those are your quick tips for the day. God-speed to ya! :)

      1. 6

With the caveat that I didn’t dig into it all real deeply and may be mistaken here – I was under the impression that a significant fraction (majority?) of the IoT-device vulnerabilities exploited in recent DDoS incidents (or at least the Dyn one) were things like default (or easily guessable) passwords rather than code-correctness problems. I don’t think any amount of memory-safety or type-system guarantees is going to help if admin/admin on an open port 23 gets you a root shell…

        1. 3

It usually starts with either a memory-safety bug, an interface error with malicious input, or a default password still in place. The first is countered with a memory-safe language. The second can be countered with design by contract a la Ada. The third is developer or ops stupidity that should be countered with emails and lawsuits till they knock it off.

Note that this isn’t based on empirical data. It’s just a few categories that were extremely common when I looked into things.

          1. 2

            In this case it starts with telnet servers open to the internet, with default passwords.

            1. 2

Yeah, I put that in the third category of pure apathy needing legal liability. An ancient risk, too.

        2. 7

          I agree.

          At this point, performance is just a boring economic problem for most applications. If the language is 10% slower, we can buy more computing power. In such a case, performance issues are fixable whereas correctness problems are not. Moreover, incorrect or insecure code can cause problems more than a decade later.

          The other thing to keep in mind is that, to a large degree, functional programming languages “are slower” because less effort has been invested in making the tools fast. We have orders of magnitude more person-hours invested in making C-style languages fast. Even then, it’s only in highly optimized code that the performance discrepancy is notable and more meaningful than differences in programmer ability and effort. Excellent C or C++ code can be very fast, but average-case corporate C++ is very slow.

          Of course, correctness can be achieved in low-level languages: I’d bet that NASA uses more C than Haskell. It just takes a lot of effort to do so, and most companies would never budget the resources necessary for it (whether that involves using C or Haskell). The selling point of Haskell is that it’s possible to have a very high level of correctness with only moderate effort.

          1. 7

            Fundamentally, NASA’s beef with Haskell would be unpredictable latency, not throughput.

            1. 4

              You can’t use the full language. Atom and FAUST might be useful, though. If extracting to C, maybe also Galois' Ivory or Tower. As in, we can use Haskell but just a subset with code extraction.

              1. 1

                You can’t use the full language.

                What specific parts would cause problems? Garbage collection (unpredictable latency)?

                I assumed that C was still preferred because, even though application code can be of very high quality in Haskell, you still have to rely on a runtime, and while that runtime is very good it is still much larger than a system that we can formally prove. I could be wrong here; that was just my understanding.

                Ultimately, Haskell’s selling point is that you can achieve very high quality with only moderate effort. You can also achieve very high quality with very high effort in C. In an environment like NASA or JPL where correctness really matters, very high effort isn’t a dealbreaker but expected.

                1. 3

Determinism is the big thing in real-time systems. I mean, down to how long a low-level operation takes. Even in between interrupts in some cases. The Haskell operations are just too non-deterministic. The runtime isn’t real-time. The compiler isn’t designed for real-time. Lazy evaluation itself would hurt in places where you’d have to force strictness. If you could get Haskell good for real-time, you’d practically have turned it into a different language, like the Java people did with their enhanced subsets of Java and custom JVMs. That’s why they’re embedding other stuff in it.

                2. 0

                  If only they had designed less language.

                  1. 4

Yes, what a shame they didn’t cripple their language to be suitable to the specialized needs of real-time systems. If a garbage collector qualifies as “too much language” for you, there’s always Rust. Or ATS.

                    1. 3

                      Did you mean a simplified version of Haskell like PreScheme was for standard Scheme? If so, then yeah it would be nice to have at least one like that.

The House OS people were doing at least non-real-time kernel programming with the Programatica project and the Habit language. They got hired into greater projects. People could pick up their work and then turn it into something deterministic.

                      1. 1

                        Yes, I meant something like that. The things that make Haskell somewhat unpredictable performance-wise are few in number – GADTs never hurt anyone.

                3. 11

                  We have orders of magnitude more person-hours invested in making C-style languages fast

                  This isn’t really the only issue. Even a compiler that does essentially no optimization like tcc still can perform extremely well because certain algorithms are simply more efficient with mutation.
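
As a concrete illustration (a minimal sketch of my own, not from the post): a byte histogram is one of those algorithms that’s natural with in-place updates. Even in Haskell you’d reach for controlled mutation via the ST monad here, because a purely functional version pays for extra allocation:

import Data.Array.ST (newArray, readArray, writeArray, runSTUArray)
import Data.Array.Unboxed (UArray)

-- Count occurrences with in-place writes inside runSTUArray.
-- Assumes every input value is in [0, 255].
histogram :: [Int] -> UArray Int Int
histogram xs = runSTUArray $ do
  counts <- newArray (0, 255) 0
  mapM_ (\x -> readArray counts x >>= writeArray counts x . (+ 1)) xs
  return counts

The mutation stays local: runSTUArray freezes the array, so callers still see a pure function.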

                  1. [Comment removed by author]

                    1. 2

Okay, so there must be even more reasons for them being slow, then.

                      1. 3

                        What makes you so confident that it isn’t just a case of how much effort has been put into the compilers?

                        1. 3

There is less abstraction over the underlying machine semantics, so there is less work to do. I think if you look at existing compilers, even immature compilers for C-style languages can have performance on par with languages like Clojure. There is just less overhead; it’s that simple.

A “sufficiently advanced compiler” that can make up the difference is something that is often promised. Maybe it is possible, but the gains in performance from LLVM or GCC optimizations on C/C++ programs are being vastly overstated as an excuse for slower functional languages.

                          All that being said, I think for many problems the performance hit is absolutely worth it.

                          1. 2

There is less abstraction over the underlying machine semantics, so there is less work to do. I think if you look at existing compilers, even immature compilers for C-style languages can have performance on par with languages like Clojure

                            It is true that naive compilers will compile idiomatic C to more efficient machine code than idiomatic Haskell. I don’t think this necessarily proves anything about non-naive compilers. E.g. the idiomatic C usually specifies the evaluation order a lot more fully than the idiomatic Haskell, which is an advantage for a naive compiler (because the programmer usually has a decent idea about what a sensible evaluation order is), but can turn into a disadvantage if the compiler is advanced enough to pick a better evaluation order than the programmer would.
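
To make the evaluation-order point concrete, a tiny hedged example: in Haskell, demand drives evaluation, so the consumer decides how much work the producer does, with no explicit control flow:

-- Only the first five calls to `expensive` ever run; laziness
-- leaves the evaluation order and amount up to the implementation.
expensive :: Int -> Int
expensive n = n * n  -- stand-in for genuinely costly work

firstFive :: [Int]
firstFive = take 5 (map expensive [1 ..])

The idiomatic C version would have to fuse the loop and the cutoff by hand; here that choice is left to the compiler and runtime.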

                            1. 1

Have you seen the work on using Coq to write proofs about and generate machine code?

                              http://research.microsoft.com/en-us/um/people/nick/coqasm.pdf

Or the work to generate efficient and verified implementations of elliptic-curve cryptography, which gives competition to handcrafted code?

                              https://people.csail.mit.edu/jgross/personal-website/papers/2017-fiat-crypto-pldi-draft.pdf

                              1. 5

                                Competition? “We come within a factor of 6 of the running time of the handcrafted world-champion implementations.”

                          2. 2

                            Here are some of those reasons:

• Overusing runtime-tagged data, laziness, and objects with semantically relevant physical identities.
                            • Linked data structures with lots of very small nodes.
                            • Language designs that don’t offer enough room for eliminating overheads via static analysis.
                    2. 3

Correctness (in functional-programming terms) doesn’t have much to do with the IoT DDoS attacks. You can write insecure software that is “correct”. I would argue most of the security vulnerabilities that have been exploited to hijack IoT devices would not have been prevented by using functional programming; they are just a result of total laziness and failure to care about security at all. If you don’t care about security, functional programming isn’t going to force you to be secure. You can still do stupid shit like rely on an insecure server connection, etc.

                    3. 4

Others already have the bigger issue of correctness covered here, but I want to point out some of the smaller inaccuracies in the post.

• GC overhead for parallel programs: this is off-base since most threaded runtimes will keep per-thread generations to avoid exactly this overhead. Values will go into a shared heap only after a few generations or when transferred between threads. Both of these are uncommon cases. In the extreme case, Erlang will copy everything during thread communication so locality is always guaranteed. The GC can thus also be thread-local, and there you have part of Erlang’s scalability.
• He claims an SO answer is advising people to turn off garbage collection when what it really does is mention card marking, a GC optimization.
                      • most functional programming languages… lack the killer combo of: reified generics, value types and a fast GC write barrier

• Reified generics aren’t necessary for efficiency here; type families in Haskell allow the sort of data-representation optimization he’s wanting. WRT value types, GHC has unboxed tuples/floats/integers if you need them (a minimal sketch follows this list). The write barrier could be a legitimate issue, though if your hash table’s full of unboxed types I wouldn’t think the write barrier would be hit at all.
                      • “Smug weenies” - the author’s just being an asshole here. If you spend your time doing PL research, you have zero obligation to break it down into bite-sized pieces for the benefit of people looking to make a quick buck. Also, the insinuation without detail that he’s got the Lispers all figured out is more smugness than I’ve ever actually heard out of a Lisp weenie.
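
On the value-types point above, a minimal sketch of what GHC gives you (my example, not the author’s):

-- Strict fields plus UNPACK store the two Doubles inline in the
-- constructor: a flat, C-like layout with no pointers or boxes.
data Point = Point
  { px :: {-# UNPACK #-} !Double
  , py :: {-# UNPACK #-} !Double
  }

norm :: Point -> Double
norm (Point x y) = sqrt (x * x + y * y)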

                      So the author makes a few good points, but a) he makes a few more wrong points, and b) all of them are related to performance, which isn’t really FP’s big advantage.

Also, speaking of performance, consider our current architectures: we have imperative CPUs that have begotten imperative languages, and vice versa, for decades’ worth of optimization. The quality of a CPU architecture these days is basically judged on how fast it runs C code. So yeah, imperative languages are faster in this environment. Congratulations. There have been a few older and more recent attempts at FP architecture, and such things will probably get more attention in the not-too-distant future as single-core performance stagnates, but for now fast functional code basically means faking imperative code, which (surprise!) is slower than purpose-written imperative code.

It’s a similar thing for garbage collectors: Lisp machines had hardware-assisted GCs, something no x86 will ever have in a million years. We can already write GCs that come close to manual memory management perf-wise on modern systems, so imagine the speed we could get with hardware support. My guess is FPGAs will help here: hardware designers are understandably wary of adding a bunch of complexity for an “untested” use-case, but with FPGAs each language implementation can have exactly the hardware offloading it needs.

                      1. 1

Maybe someone who knows can educate me on monads. As far as I can tell, monads are a terribly cumbersome way of emulating the use of lists or structures to carry computational state:

div(a,b) = return (b != 0 ? (“ok”, a/b) : (“bad”, nil))

Hooray. Now what?

                        1. 3

                          There’s no shortcut, which is why monad tutorials tend to fail.

                          I taught a course on Haskell, linked here. It might help.

A monad is “just a type class”, so once you’re fluent in type classes there’s still a lot of groundwork, but it’s not that hard.
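
For reference, in modern GHC (post 7.10, under the Applicative superclass) the heart of the class is just:

class Applicative m => Monad m where
  return :: a -> m a
  (>>=)  :: m a -> (a -> m b) -> m b

Everything else (do-notation, the laws, the zoo of instances) hangs off those two operations.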

                          1. -1

                            That’s hard to believe. I can describe all sorts of complex CS and mathematics concepts concisely without requiring immersion in the field or that the questioner become an expert. After reading multiple papers, my impression is that starting from the peculiar requirement that we write programs without state, one encodes state in a list in order to preserve the fiction and then wraps it in some not particularly relevant category theory to justify it.

                            1. 1

So, the State type of two parameters, State s a, is a wrapper around s -> (a, s). The type s is the state and a is the value produced by the computation. So a State s a is a computation that interacts with and depends on an initial state (of type s) and produces a result of type a along with a possibly updated state.

                              For the natural monad of s -> (a, s) you end up with:

return a = \s -> (a, s)
(>>=) sa f = \s -> let (a, s1) = sa s in (f a) s1

                              These correspond to “make no changes to s but return a” and “perform stateful computation sa, feed its result into f to generate the next stateful computation, and perform it”. The bind gives you sequencing.

                              The IO monad is similar, but without an observable state, because IO can interact with anything that the machine can interact with. However, conceptually IO a ~= World -> (a, World).

                              There are other monads (e.g. the list monad) that correspond to other computational contexts. In the case of [] (list) it’s nondeterministic choice or multiple return; in the case of Maybe it’s a computation that might fail.

                              There’s a learning curve, but I like this stuff a lot because it gives you the ability to know, from your types, which parts of your code can have state and which ones can’t.
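
To make that concrete, here’s a minimal self-contained State (it mirrors, but isn’t identical to, Control.Monad.State from mtl), plus a toy counter:

newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s1) = g s in (f a, s1)

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in  (h a, s2)

instance Monad (State s) where
  State g >>= f = State $ \s ->
    let (a, s1) = g s
    in  runState (f a) s1

-- A toy stateful computation: read the counter, then bump it.
tick :: State Int Int
tick = State $ \n -> (n, n + 1)

-- runState three 0 == ((0, 1, 2), 3)
three :: State Int (Int, Int, Int)
three = do
  a <- tick
  b <- tick
  c <- tick
  return (a, b, c)

Note there’s no mutation anywhere; the bind above threads the state through each step for you.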

                              1. 0

Thanks for the explanation. I don’t see what you get from this that is illuminating, but perhaps I’ll reread Wadler and try to write up my response in a little more detail.

                                1. 2

“Railway oriented programming” is a good article that explains monadic error handling practically.

                                  1. -1

What I get from that is: if your functions return both a computed value and a condition code, you can define methods of composition that look only at the condition codes. Thus:

F(f,g)(x) = if f(x) = (error, y) for some y, return (error, y); else with f(x) = (ok, y), return g(y)

                                    1. 3

                                      Yes, that is the general idea; however, you don’t have to put the handling of the error code into the flow of logic. You can express your logic step by step with no competing concern, then at the end “unwrap” the type as necessary. Your actual “work” ends up looking very clean, and the handling of the error condition is out of band. Using Maybe versus using null in Java is one of my favorite examples; in pseudocode:

                                      // with maybe
                                      x = something_that_returns_a_maybe()
                                      y = x.foo()
                                      z = y.bar()
                                      
                                      return match z {
                                        Nothing => "nothing was returned!"
                                        Some(n) => "I got {n} as a result"
                                      }
                                      
                                      // with null
                                      x = something_that_returns_a_null()
                                      if (x == null) return "I got a null!"
                                      y = x.foo()
                                      if (y == null) return "I got a null!"
                                      z = y.bar()
                                      if (z == null) return "I got a null!"
                                      return "I got {z} as a result"
                                      

                                      This avoidance of interleaving is one reason that monads have been called “programmable semicolons,” although I hate that and all other metaphors.
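
For comparison, here is the Maybe pipeline above in actual Haskell, where the three names are hypothetical stand-ins:

somethingThatReturnsAMaybe :: Maybe Int
somethingThatReturnsAMaybe = Just 1

foo, bar :: Int -> Maybe Int
foo = Just . (+ 1)  -- hypothetical step that could have failed
bar = Just . (* 2)  -- likewise

result :: String
result =
  case somethingThatReturnsAMaybe >>= foo >>= bar of
    Nothing -> "nothing was returned!"
    Just n  -> "I got " ++ show n ++ " as a result"

Each (>>=) stands in for one of the out-of-band null checks from the second version.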

                        2. 0

                          My question is why you would want to pretend that programs don’t modify or have state. That seems like attempting to do welding without heating metal. Hot metal can cause problems and requires skill, but it’s kind of the point.

                          1. 5

I’m only speaking for myself, of course, but I don’t pretend that programs don’t mutate state when I write code in a purely functional language. I know the compiler will do that behind the scenes. Having immutable state in the programming model is about eliminating some classes of bugs and making code easier to reason about.

                            I’m also really happy about having virtual memory even though I’m only “pretending” that I have large, contiguous blocks of memory in a suspiciously big memory space.

                            (Juggling [memory blocks] can cause problems and requires skill. ;)

                            1. 1

Good point, but all abstractions are tradeoffs. Even virtual memory is an abstraction with a cost. There is a compelling argument in favor of being able to mark functions or calculations as side-effect free or purely functional, but it seems to me that e.g. trying to make I/O “functional” involves a lot of effort and complexity and no real advantage.

                              1. 3

                                This is an area of active research, so I don’t expect that monad transformer stacks are the last word in handling effects. That said, in order to mark something as “side effect free” or “pure” requires that you have a way of encoding what is impure. That is, you can’t just say: “This is a pure function!” without knowing what “pure” means. Monads and monad transformer stacks are essentially ways of saying: “This thing handles impurity of a particular kind.” For example, Maybe handles the impurity of a result that may or may not happen; Reader handles the impurity of a configuration environment; Either handles the impurity of two potential branches of execution (i.e. errors).
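
A hedged sketch of that framing, one small definition per “kind of impurity” (Reader here is the one from mtl):

import Control.Monad.Reader (Reader, ask, runReader)

-- Maybe: a result that may simply be absent.
lookupPort :: [(String, Int)] -> Maybe Int
lookupPort = lookup "port"

-- Either: failure that carries an explanation.
parsePort :: String -> Either String Int
parsePort s = case reads s of
  [(n, "")] -> Right n
  _         -> Left ("bad port: " ++ s)

-- Reader: dependence on a configuration environment.
-- runReader greeting "world" == "hello, world"
greeting :: Reader String String
greeting = do
  name <- ask
  return ("hello, " ++ name)

None of these is the last word, as said above, but each one names its impurity in the type.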

                                1. 1

                                  I’m restricting myself to “pure function” as “no side effects”. I have never understood the utility of monads - they seem to be complex ways of describing mundane operations.

                                  1. 4

                                    That’s what I’m saying, though: a “side effect” has to be codified somehow. If it isn’t captured in the type, then it looks fine to the compiler; printf returns an int, after all, so how is it to be marked as a side effect? That is the utility of monads. Maybe (or Result in Rust) is the most obvious example: your type is literally saying “Something may or may not come back,” and you then must deal with that either by passing that uncertainty “up the chain” or by dealing with the uncertainty at the point of origin. This is encoding the side effect of an uncertain return which, in more typical languages, is represented by null (None, nil, …).

                                    1. 1

If your mechanism for capturing function semantics is restricted to the type or signature, you have to encode all sorts of things in the types. Alternatively, you could just have a keyword, as in D, for example. I’d like a compiler directive in C to designate a function as “pure” in that sense - it would allow many optimizations and perhaps discourage C compiler writers from coming up with stupid optimizations based on undefined behavior.

Ok, I didn’t know that GCC/Clang both support “pure” attributes. So there.