Threads for craigstuntz

  1. 2

    So, how does it compare with Codon?

    1. 1

      One difference is that, AFAICS, Codon compiles to LLVM IR, whereas Mojo compiles to MLIR

    1. 3

      This is a nice debugging yarn, but I don’t see what’s “impossible” here?

      1. 6

Impossible because it required unreleased knowledge to fix. Privileged arcana. If I were faced with those renders I’d be scratching my head and assuming I need to software-composite multiple frames to work around the issue.

        I like the sections about finding details from talks and seeing what Metal does whilst it’s being debugged.

        1. 5

          Unfortunately, WWDC slides are the de facto documentation for many Apple APIs.

      1. 3

        I would add assembly (and various IRs such as LLVM IR, .NET IL, and Java bytecode). You probably don’t need to write it, but it’s useful to be able to read it.

        1. 1

Is there something in assembly that you wouldn’t get in the ALGOL family at this point? I agree that it’s useful to be able to read it, and it forces you to learn how things like the stack and the heap are implemented, but is there a mental structure of computation that you learn from it that’s unique?

          1. 4

Assembly is a GOTO soup in a way that ALGOLs aren’t, which is a mode of thinking that most ALGOL languages, outside of BASIC, don’t really help you learn.

            1. 1

              Essentially, how to program without structured programming. At one point in time I would have said “the model of computation used by your CPU,” although that’s less true nowadays. Still the closest thing conveniently available to us, though.

          1. 22

            “…those who would give up essential safety to purchase a little temporary performance deserve neither performance nor safety.” -Benjamin Franklin, I think?

            1. 2

              I think you’ve got it backwards. The Ben Franklin quote is: Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety.

              1. 1

                On purpose!

              2. 2

                You inverted the intent of the original line.

              1. 15

                I moved to Feedly when Google Reader shut down and it was close enough that I’ve stuck with it.

                1. 3

                  Same, and I don’t want it to disappear in the same way, so I’ve been a paying user ever since.

                1. 7

                  The Dragon Book is amongst the worst introductions to compilers, generally recommended by people who already know about the material contained within (in which case it can be a good reference) or people who haven’t read it at all. Better choices include Programming Language Concepts, by Peter Sestoft, and Modern Compiler Implementation in ML, by Andrew Appel.

                  1. 4

For my undergrad compilers class we read Introduction to Compilers and Language Design, which I really enjoyed. Free, too!

                    1. 3

                      I agree with your recommendation of Modern Compiler Implementation in ML (the “Tiger Book”) over the Dragon Book.

                      But, I still found Tiger Book confusing at times, because it mentions so many design alternatives. In my project group we really wanted to do first-class functions, so a lot of the alternatives the book presents were not valid. It was hard for me to see how all the pieces should fit together.

                      Later I read Compiling with Continuations, also by Appel, and really enjoyed it. It felt a lot more cohesive because it explains the design of one particular compiler. And the language it compiles (SML) has the cool features I was interested in. Maybe it was also easier just because I read it second. :)

                      1. 2

                        Are there specific complaints you have about “The Dragon Book”?

It was one of the textbooks for my undergrad compilers class a long time ago, and I’ve used it as a reference several times since then, and haven’t had any complaints about it. It doesn’t cover the bleeding edge of compiler development, but it covers the basics of grammars, lexers, syntax analysis and all of that pretty well, I thought.

                        It does seem a little out of place under “CS Fundamentals.”

                        1. 3

                          It aims to be comprehensive rather than a good intro, and it regularly does things like make passing reference to concepts which are explained later in the book. So if you already know the material it’s fine, but if you’re a n00b you’re confused.

                          1. 3

I don’t think it’s fair to criticise it as a bad intro book because it’s not trying to be one. The Preface says it’s aimed at readers finishing up an undergrad CS education who have experience with multiple programming languages. At my university, at least, a class covering the basics of finite automata, regular expressions, etc. was a prerequisite for taking the classes using this book.

                        2. 2

It belongs in the category of best-known unread books in my circle :)

                        1. 5

                          A shift in perspective that I enjoy is that it’s not that the world is as it is and equality is broken, but instead that picking an equality tells us what the world is. Your choice of equality nails down the finest distinction you’re allowed to make.

                          So, practically speaking, tossing away access to reference equality is a necessary part of abstracting away our notion of PLs operating on a computer atop a pool of memory. If we retain it, we can make distinctions which violate that model of the world.

                          Often the trouble is that IEEE floats are designed such that they cannot be abstracted from their memory representation. On the one hand, NaN != NaN. On the other hand, Floats are supposed to represent Real numbers when you throw away the memory model, but Real numbers don’t support computable equality at all!

So IEEE floating point is pretty incompatible with a language model that tries to hide memory representations. But it’s critical for fast math. Now, the correctness of your language has to do not just with equality but with how it allows these two incompatible worlds to live together as harmoniously as possible.
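To make that concrete, here’s a quick Haskell sketch (any language with IEEE doubles behaves the same; Data.Map is just one ordered-map victim):

import qualified Data.Map.Strict as Map

main :: IO ()
main = do
  let nan = 0 / 0 :: Double
  print (nan == nan)                       -- False: == on floats is not reflexive
  let m = Map.fromList [(nan, "found me")]
  print (Map.lookup nan m)                 -- Nothing: the NaN key is unfindable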

                          1. 2

                            Your choice of equality nails down the finest distinction you’re allowed to make.

                            I love this way of putting it!

                          1. 3

                            I solved this problem a while ago and wrote some notes on it.

                            “Equality & Identity”: Overview, Problems, Solution, Fixing Haskell, Implementation in Dora


                            One mistake in the article though:

                            You want the developer to see === and think “reference equality,” not “more equals signs is better.”

No, you don’t want this. You want === to stand for identity, such that it can have a definition that works consistently across all types, reference and value. Both sane handling of floating point values and the ability to freely change types from reference types to value types (and back) depend on it.

Some minor nitpicks:

                            programming languages should make it simple to create types where equality comparison is disabled …

                            It’s probably easier to pick sane defaults and stop the runtime from implementing random methods “for convenience”.

                            … they should use this feature in their standard library where needed, such as on floating point types

                            Not really seeing the reason for this.


                            Overall a nice article with an entertaining description of an interesting problem!

                            1. 2

                              Your “fixing Haskell” article does not fix anything. The actual fix is to make Float and Double not implement the Eq type class, period.

                              One thing Standard ML does right is using different functions for comparing eqtypes (e.g. int, bool, tuples of eqtypes, etc.) and floating points (i.e. real). When you use the former, you can expect all the usual mathematical properties of equality to hold. When you use the latter, you know what you are getting into.

                              1. 1

                                Your “fixing Haskell” article does not fix anything.

I think the article does a good job of explaining why you’re wrong.

                                The actual fix is to make Float and Double not implement the Eq type class, period.

                                Why not just drop floating point values altogether, “language design is hard, let’s go shopping”-style?

                                When you use the former, you can expect all the usual mathematical properties of equality to hold. When you use the latter, you know what you are getting into.

                                That’s exactly what happens – you specify the properties you need:

                                If you require identity and the guarantees it provides, e. g. for keys of maps, simply demand that the type implements it.

class HashMap[K: Identity + Hash, V] { ... }
                                // Double implements Identity
                                let map = HashMap[Double, String]()
                                map.put(Double.NaN, "found me")
                                // Lookup works, even for NaN:
                                assert(map.get(Double.NaN) == Some("found me"))
                                
                                1. 1

I think the article does a good job of explaining why you’re wrong.

                                  Your article literally asks for the Eq class to expose two distinct equality testing functions. What can be more broken than that? I am having a lot of trouble imagining it.

                                  If you require identity and the guarantees it provides, e. g. for keys of maps, simply demand that the type implements it.

Using a type class to constrain the key type of a map is in itself broken. Suppose for the sake of the argument that you have a type that can be totally ordered in two different ways, both useful. (I could have also said “hashed in two different ways, both useful”, but that is less realistic.) It is perfectly reasonable to want to construct two distinct ordered map types, one for each total order. With a type class constraint on the map’s key type, you cannot do this, because type classes must have canonical instances. (Or you could destroy canonicity at your own peril, as Scala does.)

                                  The correct solution is to do exactly what ML does:

                                  signature ORDERED =
                                  sig
                                      type key
                                      val compare : key * key -> order
                                  end
                                  
                                  functor TreeMap (Key : ORDERED) :> MAP
                                      where type key = Key.key =
                                  struct
                                      (* ... *)
                                  end
                                  

                                  The map abstraction is not parameterized just by the key type but rather by the whole ordered type structure, of which the naked key type is but one component. This is exactly how things should be.

                                  Applying this lesson to your hash map example, we have

                                  signature HASHABLE =
                                  sig
                                      type key
                                      val equal : key * key -> bool           (* not necessarily ML's built-in equality! *)
                                      val hash : key -> int
                                  end
                                  
                                  functor HashMap (Key : HASHABLE) :> MAP
                                      where type key = Key.key =
                                  struct
                                      (* ... *)
                                  end
                                  

                                  EDIT: Fixed formatting.

                                  1. 1

                                    Your article literally asks for the Eq class to expose two distinct equality testing functions. What can be more broken than that? I am having a lot of trouble imagining it.

                                    As mentioned in the article, you could also introduce a new trait for it.

                                    I think the SML approach is well-known to those who thought about non-canonical instances in Haskell and Scala. (Rust seems to enforce uniqueness of instances, so that’s another interesting data point.)

The core question is whether this feature is worth the price one would pay for it – I wouldn’t say so; that’s why I’m approaching this by splitting things into better-defined typeclasses to relieve the pressure of needing multiple instances of the same typeclass even for trivial things.

For instance, for numbers you don’t need to engage in crazy contortions anymore (like in Haskell) to swap behavior between floats that compare nicely in the domain (NaN != NaN) and floats that can be put into and retrieved from data structures.

                                    1. 2

                                      I think the SML approach is well-known to those who thought about non-canonical instances in Haskell and Scala.

                                      Haskell’s designers know very well the importance of having canonical instances for type classes to work correctly, as well as the inherent conflict between canonicity of instances and modularity.

                                      And yet Haskell has the -XIncoherentInstances extension.

The core question is whether this feature is worth the price one would pay for it

                                      The payoff is huge. The type checker will stop you immediately if you conflate two structures constructed from different instances on the same type. For example, suppose you have two ordered sets and want to merge them. The obvious O(n) algorithm only works correctly if both ordered sets are ordered using the same total order on the element type. But Scala will silently allow you to use the wrong order on one of the sets, due to the way implicits work when used as type class instances.

                                      that’s why I’m approaching this by splitting things into better-defined typeclasses

The problem with your “better-defined” type classes is that many of them will support similar operations. In fact, similar enough that you will want to define the same algorithms on them. But, since your “better-defined” type classes are not literally the same as each other, as far as the type checker cares, you still need to reimplement the same algorithms over and over, which defeats the point of using type classes in the first place.

                                      Another way to work around this issue is to use newtype wrappers to mediate the conversion between instances. But that is disgusting in its own way. For example, before Applicative became a superclass of Monad, Haskell used to have a WrappedMonad newtype as a contrived way to get an Applicative instance from a Monad one.

                                      1. 1

                                        This comment was pitched at exactly my level of type knowledge - thanks!

                                        1. 1

                                          And yet Haskell has the -XIncoherentInstances extension.

                                          … which you don’t even need. GHC allows defining incoherent instances without it.

                                          For example, suppose you have two ordered sets and want to merge them. The obvious O(n) algorithm only works correctly if both ordered sets are ordered using the same total order on the element type.

                                          Scala supports this with path-dependent types, but the syntactical and mental overhead never felt worth it.

                                          Anyway, I wasn’t aware that SML supported applicative functors (which I think are a requirement to make this work) – when did this change?

                                          A conservative approximation could also be done at the value-level – if typeclass instances were first-class values (but I don’t think that’s worth it).

But, since your “better-defined” type classes are not literally the same as each other, as far as the type checker cares, you still need to reimplement the same algorithms over and over, which defeats the point of using type classes in the first place.

Agreed. I just think that’s vastly preferable to people using one (or a few) typeclasses as random dumping grounds until the typeclass has no laws left (see Eq, Ord in Haskell).

                                          Another way to work around this issue is to use newtype wrappers to mediate the conversion between instances. But that is disgusting in its own way.

Ick, that’s far uglier than the “recommended” approach in Haskell, which is wrapping the data type and reimplementing the typeclass instances (e. g. TotalDouble wrapping Double to implement Eq and Ord differently from Double).
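For readers who haven’t seen it, a minimal sketch of that wrapper pattern (TotalDouble is the hypothetical name from above; sorting NaN last is just one of several reasonable conventions):

newtype TotalDouble = TotalDouble Double

instance Eq TotalDouble where
  TotalDouble x == TotalDouble y =
    x == y || (isNaN x && isNaN y)   -- unlike IEEE ==, NaN equals NaN here

instance Ord TotalDouble where
  compare (TotalDouble x) (TotalDouble y)
    | isNaN x && isNaN y = EQ
    | isNaN x            = GT        -- sort NaN after everything else
    | isNaN y            = LT
    | otherwise          = compare x y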

                                          1. 1

                                            Scala supports this with path-dependent types, but the syntactical and mental overhead never felt worth it.

                                            Scala’s troubles arise from the desire to make everything first-class. “Objects with type members are basically first class modules, right?” It sounds good until you realize it does not interact well with type inference. Knowing what to make second-class is an art.

                                            Anyway, I wasn’t aware that SML supported applicative functors (which I think are a requirement to make this work) – when did this change?

                                            Standard ML does not have applicative functors. Making applicative functors work correctly requires a notion of “pure functor body”, which no ML dialect has as far as I can tell. (OCaml’s applicative functors create broken abstract data types when their bodies are impure.)

                                            In any case, using applicative functors is not the only way to make this work. Another way is to arrange your code so that you do not need to apply the same functor to the same module twice. Unfortunately, this requires a way of thinking that is not very compatible with how either object-oriented or functional programmers design their libraries.

                                            I have two cardinal rules for library design.

                                            Do not expose higher-order interfaces when first-order ones will do.

                                            As a counterexample, consider the Haskell API for maps:

                                            empty :: Map k a
                                            lookup :: Ord k => k -> Map k a -> Maybe a
                                            insert :: Ord k => k -> a -> Map k a -> Map k a
                                            delete :: Ord k => k -> Map k a -> Map k a
                                            {- more functions -}
                                            

                                            This is higher-order than necessary. Essentially, lookup, insert and delete are functors that are applied to an Ord structure every time they are called. Only from the canonicity of Ord instances can you argue that the effect of these multiple applications is equivalent to a single application of a functor that produces a map type whose key type is monomorphic.

                                            Have you gained anything? Strictly speaking, yes. You have gained the ability to tell your users that the implementation code for inserting into a Map Int a and into a Map String a is really the same.

                                            Is this something the user cares about? IMO, no. If I am working with maps with 64-bit integer keys, I do not care whether the underlying implementation is a generic red-black tree (which would work just as well with any other key type) or a radix tree (which requires lexicographically ordered bit vectors as keys).

                                            The right interface, as previously stated, has an abstract but monomorphic key type:

                                            type Key
                                            type Map a
                                            
                                            empty :: Map a
                                            lookup :: Key -> Map a -> Maybe a
                                            insert :: Key -> a -> Map a -> Map a
                                            delete :: Key -> Map a -> Map a
                                            

                                            Do not enforce two unrelated sets of invariants in the same module.

                                            According to most standard libraries, the obvious way to implement ordered maps is something like:

                                            functor TreeMap (Key : ORDERED) :> MAP
                                                where type key = Key.key =
                                            struct
                                                (* hard-coded implementation of red-black trees or what have you *)
                                            end
                                            

                                            Never mind that you have to reimplement red-black trees from scratch if you want to implement sets as well. This design is bad for a more fundamental reason: you cannot debug the red-black tree implementation without tearing down the map abstraction!

                                            The problem here is that TreeMap enforces two unrelated sets of invariants:

                                            1. Trees are correctly balanced.
                                            2. The in-order traversal of a tree produces a strictly ascending sequence of keys.

                                            The first invariant actually does not depend on the key type, so it can be factored out into a separate module:

                                            signature SEARCH_TREE =
                                            sig
                                                type 'a tree
                                                (* Manipulate self-balancing search trees using zippers.
                                                 * The user must know how to look for a given element. *)
                                            end
                                            
                                            structure RedBlackTree :> SEARCH_TREE = (* ... *)
                                            

                                            Once we have a working search tree implementation, we can finally implement maps:

                                            functor TreeMap
                                                ( structure Key : ORDERED
                                                  structure Tree : SEARCH_TREE ) :> MAP where type key = Key.key =
                                            struct
                                                type key = Key.key
                                                type 'a map = (key * 'a) Tree.tree
                                                (* ... *)
                                            end
                                            

                                            The same lesson can be applied to hash maps:

                                            signature TABLE =
                                            sig
                                                type 'a table
                                                (* Operations on a fixed size table, including collision resolution.
                                                 * The user must know where to start looking for a given element. *)
                                            end
                                            
                                            structure LinearProbing :> TABLE = (* ... *)
                                            structure QuadraticProbing :> TABLE = (* ... *)
                                            
                                            functor HashMap
                                                ( structure Key : HASHABLE
                                                  structure Table : TABLE ) :> MUTABLE_MAP where type key = Key.key =
                                            struct
                                                type key = Key.key
                                                type 'a map = (key * 'a ref) Table.table ref
                                                (* ... *)
                                            end
                                            
                                2. 1

I’m getting a bit confused by the way you contrast ‘equality’ and ‘identity’. To me they mean the same thing, and your article is suggesting === for equality and == for equivalence. This feels in line with your point, because there are many different ways to define equivalence for any particular type.

                                  1. 2

                                    Yes – the naming is largely driven by the desire to not have two things (where differences are important) behind two names that are quite similar.

                                  2. 1

                                    I really like your series; I’ll update my post to have a reference to it.

It’s not clear to me that having === mean two different things for value type and reference types is an improvement (it’s also not clear to me that it isn’t an improvement, either; I’m not sure). It would be interesting to study this in a human-factors evaluation of a programming language (like the Quorum language) to see what makes the most intuitive sense to developers. Certainly what you propose is at least lawful, which is an improvement over the current state of things!

                                    My reasoning for not wanting equality on floats is the same as SML’s for not having it, which I cite in the article. A reasonable comparison on a float requires an epsilon value, which doesn’t fit into the == operator usage.

                                    FYI, the Implementation in Dora post seems to be empty.

                                    Thanks!

                                    1. 2

                                      I really like your series

                                      Thanks, I appreciate it! :-)

                                      It’s not clear to me that having === mean two different things for value type and reference types is an improvement

My argument is that it stands for the exact same thing in both reference and value types: “is this thing here identical to that thing over there?” Sure, we know this as “reference equality” on reference types, but it’s just as crucial on value types. Some may call it “ability to find your floats in your collections again” equality.

                                      Here is another article on how == vs. === relates to the various orderings the IEEE754 spec defines: Comparing & Sorting

                                      A reasonable comparison on a float requires an epsilon value, which doesn’t fit into the == operator usage.

I disagree on that. 80% of the comparisons on floating point values are with “magic values” like 0.0, 1.0, NaN etc. that were not exposed to any kind of arithmetic computation before, and therefore cannot foil the comparison.

                                      FYI, the Implementation in Dora post seems to be empty.

                                      Yes, I left out the link, because I haven’t written it yet. :-)

Here is a unit test that might demonstrate the semantics.

                                      1. 2

                                        I would argue that if you have a comparison which is only good for “magic values” then what you really have is a bunch of useful functions: isNaN, isZero, etc. I will note that == NaN is going to fail anyway!

                                        1. 1

                                          I will note that == NaN is going to fail anyway!

                                          Which is exactly where === comes in. :-)

                                  1. 8

                                    Have you read Bill Kent’s essay on this? I think you’d really like it.

                                    The choice of syntax is partially due to heritage: F# is based on ML, which is based on math, and JavaScript syntax is based on Java -> C -> Algol -> FORTRAN.

This is incorrect. Algol does not derive from FORTRAN. Additionally, neither Algol nor FORTRAN follow C’s style of equality and assignment. Algol uses := for assignment and = for equality, while FORTRAN uses = and .EQ.. C actually gets its style from BCPL, which got its own style from a deliberate simplification of CPL. I wrote a bit more about this here.

                                    Also, ML has mutable assignments with :=.

                                    1. 3

                                      Thanks for the correction; I’ll update the post.

                                      No, I hadn’t seen that essay; thanks!

                                      Edit: This chart indicates otherwise? It’s a minor point in the article, but I’m interested in the truth. Why do you say “Algol does not derive from FORTRAN?”

                                      1. 4

                                        Edit: This chart indicates otherwise? It’s a minor point in the article, but I’m interested in the truth. Why do you say “Algol does not derive from FORTRAN?”

                                        Oop, I could be completely wrong here! I’d have to go and review all my notes on that. This is all stuff I’m now pulling out of my butt:

                                        In Favor:

                                        • John Backus worked on both
                                        • Everybody knew about Fortran at the time

                                        Against:

                                        • None of the Algol material I could dig up mentioned FORTRAN
                                        • I haven’t found any “language cognates” in Algol that could have come from FORTRAN
                                        1. 3

                                          I suspect the truth is somewhere in between. Lots of languages influenced Algol, but a straight line from FORTRAN may be overstating the facts.

                                        2. 4

                                          Fortran originally just had .EQ., .NE., .GT., etc. Support for = came later.

                                          Fortran and Algol coevolved to some degree, so they cannot be placed in a tree.

                                          1. 3

                                            I think ALGOL derived from FORTRAN about as much as any other language [edit: ..that existed at the time]. It would depend if we’re talking ALGOL 60 specifically, or 58 (arguably closer to FORTRAN), or the whole “family”.

                                            The last page of The Early Development of Programming Languages sums it up really well.

                                          2. 2

                                            Also ALGOL is based heavily on mathematics.

                                            1. 2

                                              Have you read Bill Kent’s essay on this?

                                              I think we need to give you the “Bill Kent Stan Account” hat. Not that I’m complaining; I’ve liked what I’ve read.

                                              1. 4

                                                This is the nicest thing anyone’s ever said to me

                                                1. 1

                                                  Hey, at least it’s a Twitter display name!

                                            1. 5

                                              Wow, this reads as a great list of “reasons why I’m glad I don’t use language X”.

                                              But as I hinted above, there are certainly cases where structural equality does not make sense. One example is with languages which support mutation of variables, which is most of them.

                                              I don’t think this is correct; it’s mutation of data structures which makes structural equality problematic. That’s different from mutation of variables, which doesn’t interfere with equality at all.

                                              One of the first lessons we learn when becoming a programmer is that there are two different concepts which we both call “equals.” One is assignment, the other is testing equality.

                                              Also there are languages like Prolog and Erlang in which = simultaneously binds and asserts equality!

                                              1. 3

                                                Your point regarding data structures vs. variables is subtle but fair.

                                                Will have to look into Prolog and Erlang; I don’t know either language well.

                                                1. 3

                                                  Thanks for posting; I really enjoyed the article.

                                              1. 3

                                                Another great resource on this topic: The Left Hand of Equals by Noble, Black, Bruce, Homer, and Miller

                                                1. 1

                                                  Thank you! I’ve updated the post with a link to the paper.

                                                1. 8

The thing that’s “8x more expensive” is API Gateway, which you can use with or without serverless and isn’t strictly required for serverless. But it does indeed seem to be not the best deal!

                                                  1. 3

You need API Gateway to terminate TLS, no? Almost anything real should be using TLS.

                                                    1. 7

                                                      No - AWS supports ALB <-> Lambda. Pop a free ACM certificate on there and you are good to go.

                                                      1. 3

                                                        Eh, pricing on ALB is complex - no per-request charges, so maybe it’s negligible, but not free.

                                                  1. 4

                                                    The opening comments - particularly about print/parse round trips etc. - suggest a link between fuzzing and property-based testing that I’d love to see explored more. I know that a fuzzer based on Haskell QuickCheck exists but haven’t played with it.
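For anyone who hasn’t seen the QuickCheck style, a print/parse round-trip property looks roughly like this (a toy sketch; the Expr type and its derived Show/Read stand in for a real printer and parser):

import Test.QuickCheck

data Expr = Lit Int | Add Expr Expr
  deriving (Eq, Show, Read)

instance Arbitrary Expr where
  arbitrary = sized gen
    where
      gen 0 = Lit <$> arbitrary
      gen n = oneof [ Lit <$> arbitrary
                    , Add <$> gen (n `div` 2) <*> gen (n `div` 2) ]

-- The round-trip property: parsing what we printed recovers the value.
prop_roundTrip :: Expr -> Bool
prop_roundTrip e = read (show e) == e

main :: IO ()
main = quickCheck prop_roundTrip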

                                                    1. 4

Properties are specifications: what your program is supposed to do. Other names include models and contracts. The code itself is how you attempted to do it. Tests generated from them naturally check the how against the what. Finally, you or your tools can convert each property to a runtime check in the code before fuzzing it. That takes you right to the point of failure.

Design-by-Contract, contract-based test generation, and fuzzing with contracts as runtime checks is a combo that should work in just about any language. Add static/dynamic analysis with low false positives if your language has them. Run this stuff overnight to get more CPU time fuzzing without dragging down the performance of your system while you use it.
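As a sketch of that conversion step, here is what a property turned into a runtime check might look like (the sorting contract is illustrative; assert comes from GHC’s Control.Exception):

import Control.Exception (assert)
import Data.List (sort)

-- The contract "output is sorted and the length is preserved"
-- becomes a check that a fuzzer can drive inputs into.
sortChecked :: [Int] -> [Int]
sortChecked xs = assert (isSorted ys && length ys == length xs) ys
  where
    ys = sort xs    -- stand-in for the implementation under test
    isSorted zs = and (zipWith (<=) zs (drop 1 zs))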

                                                      1. 2

There are a couple of papers on Targeted PBT, essentially adding argMax semantics to (at least an Erlang) QuickCheck lib. One can say “test this property using this somewhat non-trivial generator, and also try to maximize code coverage, as this may help the generation of interesting values”. This is exactly what I did in this proof of concept [1]. It indeed finds counterexamples faster than the non-maximizing code. In this PoC the non-maximizing version often doesn’t find anything at all.

I have discovered a passion for this technology and (plug!) am building what will essentially be a language-agnostic PBT/fuzzing tool and hopefully SaaS at [2]!

                                                        [1] https://github.com/fenollp/coverage_targeted_property_testing

                                                        [2] https://github.com/FuzzyMonkeyCo/monkey

                                                        1. 1

                                                          The way I use the terms, the link is quite simple: both are instances of automated tests with generated input data, but with property based testing, there is a relatively strong oracle, whereas with fuzzing, the oracle is limited to “did it crash?”

                                                          This might be slightly different to how the author here uses the terms, though.

                                                          1. 4

Your point about the oracle is the biggest difference; I would expand on that: property-based testing can give you statistical guarantees, which means it tries to sample your program’s input space according to some pre-defined probability distribution. It doesn’t particularly care about things like coverage either (and as far as I understand it, property-based testing should not use feedback, though the lines are blurring [1]).

Fuzzing, on the other hand, does not particularly care about statistical guarantees (not that you can’t add them, but typically it is not done). All it cares about is “can I exercise interesting code that is likely to invoke interesting behaviors?” So, while we use coverage as feedback for fuzzing, it is OK to leave aside parts of the program that are not interesting enough.

At the end of the day, I would say the similarity is that both are test-generation tools (a category which also includes things like Randoop and EvoSuite, which are neither fuzzers nor property checkers).

                                                            [1] ArbitCheck: A Highly Automated Property-Based Testing Tool for Java

                                                            1. 3

I used afl fuzzing to find bugs in math libraries, see e.g. [1] (i.e. things like “divide input a by b with two different libraries, see if the results match, otherwise throw an assert error”). So you can get the “strong oracle” with fuzzing. I guess you can’t really draw a sharp line between “fuzzing” and “property-based testing”; it’s just different levels of test conditions. I.e. “doesn’t crash” is also a “property” you can test for.

                                                              [1] https://www.mozilla.org/en-US/security/advisories/mfsa2016-07/
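That differential setup translates directly into a property; a hedged sketch with two stand-in implementations rather than real math libraries:

import Test.QuickCheck

-- Two implementations of the same function serve as each other's oracle.
sumFoldr, sumStrict :: [Int] -> Int
sumFoldr  = foldr (+) 0
sumStrict = sum

prop_implementationsAgree :: [Int] -> Bool
prop_implementationsAgree xs = sumFoldr xs == sumStrict xs

main :: IO ()
main = quickCheck prop_implementationsAgree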

                                                              1. 2

                                                                The original twitter thread where he solicited ideas about how to write fuzzable code had a conversation about how PBT and fuzzing relate: https://twitter.com/mgambogi/status/1154913054389178369.

                                                                1. 1

                                                                  Fuzzing does not limit the oracle to “did it crash?” Other oracles (address sanitizers, for example) are quite common.

There’s obviously some overlap between fuzzing and property-based testing, but:

• Fuzzing tends to work on the whole application, or a substantial part of it, at once; PBT is typically limited to a single function, although both are useful at either scope.

• Fuzzing tends to run for weeks on multiple CPUs, whereas PBT tends to run alongside unit tests, quickly.

• Fuzzing (often!) tends to use profile guidance, whereas PBT does not.

                                                              1. 29

                                                                Replace JS for your favorite language:

                                                                https://0.30000000000000004.com/

JS just happens to be everyone’s favorite punching bag nowadays ;)
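For example, the same surprise in Haskell (the page above collects it for dozens of languages):

main :: IO ()
main = do
  print (0.1 + 0.2 :: Double)            -- 0.30000000000000004
  print (0.1 + 0.2 == (0.3 :: Double))   -- False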

                                                                1. 10

                                                                  Most other languages I know of let you specify that a number is an integer ;)

                                                                  EDIT: wow this was a very bad brain fart, and I’m leaving it here as a testament to how quickly people will pile on on Javascript without thinking. Sorry everyone.

                                                                  1. 17

                                                                    0.1 + 0.2 doesn’t work well for integer math either.

                                                                    1. 2

Multiply them by enough zeros to turn them into integers. Do the math in integer form under integer rules (1 + 2). Reverse the process to get float results.

I’ve done that before, a long time ago. I can’t remember why I did. Maybe just to dodge floats’ problems during number crunching. I know people’s reactions to it were funny.
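A minimal sketch of that scale-to-integers trick, assuming a fixed six decimal places of precision:

-- Scale decimals to integers, add under integer rules, scale back.
scale :: Integer
scale = 10 ^ 6

toFixed :: Double -> Integer
toFixed x = round (x * fromIntegral scale)

fromFixed :: Integer -> Double
fromFixed n = fromIntegral n / fromIntegral scale

main :: IO ()
main = print (fromFixed (toFixed 0.1 + toFixed 0.2))   -- 0.3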

                                                                      1. 2

                                                                        I’m going to point out that COBOL handles this tricky calculation “unsurprisingly” by default. We should all switch from JavaScript to COBOL.

                                                                      2. 2

                                                                        Upvoting because the edit is a good example of self-awareness and humility.

                                                                      3. 5

                                                                        the author does state:

                                                                        This behaviour is not unique to JavaScript. It is seen in every programming language where doubles are available, including C, C++, C#, Erlang, Java, Python and Rust.

                                                                      1. 1

                                                                        I did stop reading when I reached the common mis-attribution of who did the bulk of the work on the Apollo code.

                                                                        1. 4

                                                                          What’s the misattribution in question?

                                                                          1. 3

Author here: would love to hear more on this? Fact-checking always appreciated. If you’re referring to Margaret Hamilton’s contribution, the article states her position as leading the team, which does not directly mean that she wrote the code in question. Fair enough; I’ll make a note to add the actual author of the critical pieces as well.

                                                                            1. 1

I have to get a good book on the Apollo code (I bet someone here has a recommendation), but I recall that Hamilton was elevated to a leadership position at a more mature stage of the project. Once you put a name to a project, you are making it about a personality (rather than the code), and the question of fair attribution comes up.

                                                                              1. 3

I think “mis-attribution” is an overstatement. But it’s true that Hamilton did not singlehandedly write the AGC code (nor did anyone; it was a team effort). The single person who most deserves credit for the fault tolerance feature is probably MIT’s Hal Laning. He designed the executive and waitlist systems. Hamilton deserves as much credit as anyone, but I wish it was expressed that she was a member of a team rather than a one-person effort.

                                                                                1. 1

                                                                                  @craigstuntz thanks for that reference! Is there a good book on the Apollo automation engineering efforts? I’m interested in both the hardware and software and not just the AGC but the whole of it.

                                                                                  1. 2

                                                                                    I can tell you about some books I’ve actually read; these might not be the best for your specific interests. “How Apollo Flew to the Moon,” by David Woods. But it doesn’t go into the AGC in minute detail. “The Saturn V F-1 Engine: Powering Apollo into History,” by Anthony Young, is good, but doesn’t cover controls at all. There are a couple of books specifically about the AGC which I haven’t read.

                                                                            2. 3

                                                                              You could just give him the specifics. Then he can update the article. I’m curious what yours are since I’ve read several different versions of the story.

                                                                              1. 2

This story and the story of Rosalind Franklin are both curious to me. We could start a project - ideally from primary sources - to do some archaeology. For DNA, all I can think of is papers. For the Apollo code it has to be early documentation - all US Gov docs have lists of contributors and responsible people.

                                                                                1. 5

I was asking because I did it before to address the possibility that a bunch of people did the work and she just put her name on it. She seems to have done so on behalf of her team. So, I started saying “Hamilton’s team.” The problem occurs enough that I started doing that in general.

Now, as to Apollo, I did have some kind of paper that was specific. I faintly recall it saying they were responsible for one or two key modules but they did all the verification. They were the best at making robust software. So, if I’m remembering right, the paper talked like they were handed the code other people wrote to find and fix the errors on top of their own modules. That’s impressive. So are the stacks of spec and code in assembly in the picture. The “Lessons from Apollo” article here has a description of the environment and mindset that led to that as well. Later, she and another woman spent their whole careers developing CASE tools (HOS and 001 Toolkit) for designing systems with no interface errors that generated code for you. It was like binary trees, Prolog, and Occam combined, which is why usability and performance sucked despite it succeeding on robustness and generality.

                                                                                  So, that’s what I learned when I dug into it. If I get the papers again, I’ll send you the one with the attributions. No promises that I’m going to do another deep dive into that soon, though.

                                                                                  1. 3

That would be a very interesting project indeed! Part of the beauty, and definitely one of the main reasons the Apollo program was so successful IMHO, is specifically the way the work and communication were organized. To this day the project stands as a testament to how such a large-scale project should be carried out effectively. While I’m not privy to the inner workings of NASA, there seems to be evidence that some of the organizational systems were lost later and this led to severe inefficiencies. It’s a pity, but luckily it offers us a wealth of learning opportunities.

                                                                                    1. 2

                                                                                      On Hamilton’s side, it seemed like they mostly just let people do their work their own way. The work was also highly-valued and interesting. So, they came up with innovative solutions to their problems. This doesn’t usually happen in process-heavy or micro-managing environments.

                                                                            1. 8

                                                                              Turn off JS then? Isn’t this what a modern browser is by definition? A tool that executes arbitrary code from URLs I throw at it?

                                                                              1. 7

I am one of those developers who surfs the web with “javascript.options.wasm = false” and NoScript to block just about 99% of all websites from running any Javascript on my home machine unless I explicitly turn it on. I’ve also worked on various networks where Javascript is just plain turned off and can’t be turned on by regular users. I’ve heard some, sadly confidential, war stories that have led to these policies. They are similar in nature to what the author states in his Medium post.

If you want to run something, run it on your servers and get off my laptop, phone, TV or even production machines. Those are mine, and if your website can’t handle it, then your website is simply terrible from a user-experience viewpoint, dreadfully inefficient, and doomed to come back to haunt you when you are already in a bind because of an entirely different customer or issue. As a consequence of this way of thinking, a few web-driven systems I wrote more than a decade ago are still live and going strong without a single security incident and without any performance issues, while at the same time reaping the benefits of the better hardware they’ve been migrated to over the years.

                                                                                Therefore it is still my firm belief that a browser is primarily a tool to display content from random URLs I throw at it and not an application platform which executes code from the URLs thrown at it.

                                                                                1. 3

                                                                                  That’s a fine and valid viewpoint to have, and you are more than welcome to disable JS. But as a person who wants to use the web as an application platform, are you suggesting that browsers should neglect people like myself? I don’t really understand what your complaint is.

                                                                                  1. 2

                                                                                    But as a person who wants to use the web as an application platform, are you suggesting that browsers should neglect people like myself?

                                                                                    I don’t think so. But using Web Applications should be opt-in, not opt-out.

                                                                                    1. 3

                                                                                      Exactly.

There are just too many issues with JavaScript-based web applications. For example: performance (technical and non-technical). Accessibility (blind people perceive your site through a 1x40 or 2x80 Braille-character display matrix, so essentially half a line or 2 lines on a terminal). Usability (see Gmail’s pop-out feature, which is missing from by far most modern web applications; you get it almost for free if you just see the web as a fancy document-delivery/viewing system). Our social status as developers as perceived by the masses: they think that everything is broken, slow and unstable, not because they can make a logical argument, but because they “feel” (in multiple ways) that it is so. And many more.

                                                                                      However the author’s focus is on security. I totally get where the author is coming from with his “The web is still a weapon”-posts. If I put off my developer-goggles and look through a user’s eyes it sure feels like it is all designed to be used as one. He can definitely state his case in a better way, although I think that showing that you can interact with an intranet through a third-party javascript makes the underlying problems, and therefore the message too, very clear.

                                                                                      It also aligns with the CIA’s Timeless tips for sabotage which you can read on that link.

                                                                                      We should think about this very carefully, despite the emotionally inflammatory speech which often accompanies these types of discussions.

                                                                                      1. 1

                                                                                        He can definitely state his case in a better way

                                                                                        I sincerely welcome suggestions.

                                                                                  2. 1

By the same stretch of logic you could claim any limited subset of functionality is the only thing computers should do in the name of varying forms of “security.”

                                                                                    perhaps something like: “The computer is a tool for doing computation not displaying things to me and potentially warping my view of reality with incorrect information or emotionally inflammatory speech. This is why I have removed any form of internet connectivity.”

                                                                                  3. 7

                                                                                    This is not a bug and it’s not RCE. JavaScript and headers are red herrings here. If you request some URL from a server, you’re going to receive what that server chooses to send you, with or without a browser. There’s a risk in that to be sure, but it’s true by design.

                                                                                    1. 3

                                                                                      Turn off your network and you should eliminate the threat. Turn your computer off completely for a safer mitigation.

                                                                                    1. 2

                                                                                      Here’s another intro from Adam Langley which helped me a lot.

                                                                                      1. 1

                                                                                        Shoot, I really had my hopes up, here. Thought it was going to be about my favorite VAX editor.

                                                                                        1. 6

                                                                                          My favorite approach to this problem is in TLA+, and it always finds the solution.

                                                                                          1. 4

                                                                                            And here’s the Alloy way of doing it!

                                                                                            1. 2

                                                                                              Both of these are awesome! :) Thanks for the links!