1. 37
  1. 11

    There’s a great blog post by Conal Elliott about the false belief that “everything is a function” in Haskell:

    There a belief about Haskell that keeps popping up in chat rooms and mailing lists — one that I’ve been puzzling over for a while. One expression of the belief is “everything is a function” in Haskell.

    Of course, there are all of these non-functions that need to be accounted for, including integers, booleans, tuples, and lists. What about them? A recurring answer is that such things are “functions of no arguments” or functions of a one-element type or “constant functions”.

    I wonder about how beliefs form, spread, and solidify, and so I asked around about how people came to this notion and how they managed to hold onto it. I had a few conjectures in mind, which I kept to myself to avoid biasing people’s responses. Of the responses I got, some were as I’d imagined, and some were quite surprising to me, revealing some of my blind spots about others’ thinking and about conversation dynamics.


    1. 10

      It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures.

      —Alan Perlis

      1. 1

        I don’t understand this. Does it mean he doesn’t like typing? By switching to stringly-typed programming, I can easily reduce my whole app to one single data structure.

      2. 6

        Funny I was literally just writing down this kind of list for a presentation I had to make for one of my courses.

        This is what I wrote:

        • CPU :: Everything is a memory address
        • Unix :: Everything is a stream
        • Lisp :: Everything is a symbol (as in ‘symbolic computation’)
        • Erlang :: Everything is an actor
        • Prolog :: Everything is a relation
        • Haskell :: Everything is a function
        • Mathematics :: Everything is a set
        1. 10

          Lisp :: Everything is a symbol (as in ‘symbolic computation’)

          In lisp, everything is a cons cell, an atom (‘symbol’), or nil.

          CPU :: Everything is a memory address

          On a CPU, there are usually two or three types of operands (depending on if RISC or CISC): registers, immediates, and (only on CISC) memory operands.

          Memory operands are composed of a combination of registers and immediates. The point being that you take a value which is nominally an integer and treat it as a memory address in a specific context. So I don’t think it’s very useful to say that everything is a memory address.

          Unix :: Everything is a stream

          Streams are a very important part of unix, but they’re not the totality. In the unix programming language, functions are called ‘processes’ and data comes from ‘files’ (though “stream,” as you call it, is perhaps the clearer word). Data and data structures are (arguably) more important than the transformations we define on them, but the process model is an indisputably an integral part of unix.

          Haskell :: Everything is a function

          In untyped lambda calculus, everything is a function. Haskell has types, as well as laziness and boatloads of other junk.

          Mathematics :: Everything is a set

          I think this one is the truest, if perhaps also the most literal.

          1. 3

            Nice, thanks for diving in. I knew some were inaccurate as I wrote them but got the general idea across, I should have added tcl’s ‘everything is a string’, I think that holds up pretty well.

            1. 1
              Mathematics :: Everything is a set

              (or a category or a type)

              (although at least type theory really rejects the idea that everything is X, by explicitly saying that ⊤ is different from ⊥, and ne’er the twain shall meet)

              (I don’t know about category theory)

            2. 5
              • APL :: Everything is an array
              1. 3

                Smalltalk :: Everything is an object

                Pascal :: Everything is a procedure

                Plan 9 :: Everything is a file system

                1. 1

                  Mathematics :: Everything is a set

                  And all sets are categories.

                  1. 1

                    Matter and pattern is what the universe is made of, these words are etymologically related to the words ‘mother’ and ‘father’ for a reason.

                    Math represents one with elements (in sets) and the other with morphisms (in categories). One implies the other as we know, so the universe is built on circular reasoning :)

                    1. 4

                      Your etymology is highly suspect, as is all reasoning derived from it.

                      1. 0

                        You don’t like to have fun? I think this is a nice perspective and it has given me something so I figured I’d share it..

                        I agree that it’s all ‘highly suspect’ - I wouldn’t base any important decisions on such “reasoning” ;)

                        But at least for ‘pattern’ we have ‘patron’ as a root which comes from ‘father’, with such old words it’s hard to make any strong claims.

                        1. 1

                          Nothing you say is accurate.

                          1. 1

                            Nothing you say is substantial.

                2. 6

                  You could turn this around and say: Emacs is an everything.

                  1. 3

                    Python integers are full Python objects, which is extremely wasteful of memory compared to compact representations

                    I’m not familiar with the implementation of Python, but I have a hard time believing that’s true, at least for typical integers. Most dynamic languages use pointer tagging, so most ints fit inside a pointer and don’t allocate memory. This goes back at least to Smalltalk-76, and probably all the way to LISP.

                    Really big ints (Python has bignums, right?) would have to go on the heap. Floats might or might not; NaN tagging lets you store doubles inside a pointer (JS does this.)

                    1. 16

                      Depressingly it is 100% true. Where to start for more info: https://docs.python.org/3/c-api/long.html

                      How to verify it yourself:

                      >>> import sys
                      >>> sys.getsizeof(3)

                      Not sure whether the lesson here is “never underestimate how bad something can be”, or “never underestimate how well a bad idea can work”. Maybe both.

                      1. 2

                        (late reply)

                        The “bloat” in Python’s objects has annoyed me for a long time, to the point where I’ve been writing a statically-typed Python to C++ translator for quite awhile now… [1]

                        But the lesson I would take is that system design is a balance. You can consider two worlds:

                        1. One where Python has a really compact representation of ints, but it messed up the object model (PyObject and PyTypeObject, etc.).
                        2. The current world, where NumPy and Pandas are built on top of Python’s object model. This is by no means perfect, but it’s better than what other languages have done. (Arguably it’s even better than R, which has some of these concepts built in! At least R is even more shockingly slow than Python.)

                        I think the second world is better. Python made a tradeoff of the specific representation of ints for generality in its object system.

                        Obviously it’s possible to imagine a world where you have the general object model and the int special case, but that leads to even more tradeoffs!

                        And someone would criticize you for the other decision anyway. People always criticize Java for having the “wart” of non-object arrays, when in fact that’s an escape hatch for writing high performance Java.

                        Guy Steele’s famous talk is about operator overloading (and value types), and Python got that “right” more than other languages did:


                        Despite the fact that I never use operator overloading myself, and I rarely use libraries that use it, there is no denying it became useful in Python. Operator overloading is heavily tied to Python’s single dispatch object model.

                        Also, as a separate topic, the other reason for the bloat is the use of ref counting + a cycle detector. People always tear that down as a bad decision.

                        Well Fabrice Bellard just used the exact same technique in a brand new JavaScript interpreter.

                        The hidden tradeoff I just learned about is that this technique lets you avoid juggling stack roots in C code:


                        Reference counting is used to free objects automatically and deterministically. A separate cycle removal pass is done when the allocated memory becomes too large. The cycle removal algorithm only uses the reference counts and the object content, so no explicit garbage collection roots need to be manipulated in the C code.

                        That is actually mentioned in the blog post below as the cause of the slowdown in Oil’s parser, i.e. the StackRoots object in every function.

                        Related: SpiderMonkey really suffered for this: https://old.reddit.com/r/ProgrammingLanguages/comments/ivyaw8/clawing_our_way_back_to_precision_spidermonkey/

                        [1] http://www.oilshell.org/release/0.8.5/benchmarks.wwz/mycpp-examples/

                        and http://www.oilshell.org/blog/2020/11/fixes-and-updates.html

                        1. 1

                          But the lesson I would take is that system design is a balance. You can consider two worlds…

                          That’s valid, but I think that Lisps (and JS) demonstrate you can have the best of both worlds with a tagged pointer representation. If you can cram numbers into tagged pointers, then you can still have a uniform representation but your number handling gets way better. Basically your object model can stay the same, the inner guts of it just become more complicated in the name of performance. Whether it’s worth that complexity is, of course, a matter of opinion, but there’s plenty of languages out there that demonstrate you can do it and they work pretty well.

                          …But then I suppose you do end up with a distinction between value and reference types which permeates your whole language, and can come with its own pitfalls.

                          1. 1

                            It might be possible to come up with a better tradeoff than CPython did, but Lisps and JS aren’t it.

                            JS doesn’t have NumPy and will never have it. It doesn’t have operator overloading, which NumPy entirely rests on.

                            Lisps don’t have operators to begin with. Rather than having dispatch on argument types, they force you to come up with new names for everything. You need a different name for clearing a list vs. clearing a dict, e.g. d.clear() or L.clear() in Python.

                            Lisps are also built on “concretions”, which Clojure fixes with its protocols, which is essentially the same the iterable protocol that’s in Python. In fact the Racket designers said that one of the things they would like to improve was exactly this – Racket should not deal with “concretions” like cons cells, but rather protocols.

                            So Python is slower because of the heavy dispatch mechanism, but I’d argue it makes the language much better, more powerful, and more popular. To me most Lisps sounds good in theory, but when you try to write big apps in them, it feels “old” because of things like this. (And I tried starting Oil in femtolisp.)

                            Again, Java does have the special case for non-object arrays, but people always complain about that. It has some surprising consequences.

                            BTW the bloat in the 28 bytes is 4 things:

                            • forward and back pointers for cycle detection (this is the biggest thing, 16 bytes on a 64-bit machine, and as mentioned has a huge tradeoff)
                            • the ref count
                            • the PyTypeObject pointer – this is what we’re talking about. It would be interesting to consider if you could have this without the GC bloat, which are significantly bigger. Maybe.
                            • the actual integer values, which transparently expands to big integers

                            That’s actually more than 28 bytes… I guess they may not be counting the cycle detection parts.

                            I do agree that integers don’t really need methods. They have a couple in CPython, but IMO they are better implemented as functions, e.g. see dir(42). You could technically disallow the operator overloading for one type, but it would make the code a lot messier, and it is already quite messy. Again, there could be a better tradeoff, but I’m not sure what it is.

                            1. 2

                              Lisps don’t have operators to begin with. Rather than having dispatch on argument types, they force you to come up with new names for everything. You need a different name for clearing a list vs. clearing a dict, e.g. d.clear() or L.clear() in Python.

                              Common Lisp’s object system supports argument dispatch, method combinations, multimethods, and many other things.
                              Here is an example from Practical Common Lisp.

                              (defgeneric beat (drum stick)
                                 "Produce a sound by hitting the given drum with the given stick."))
                              (defmethod beat ((drum snare-drum) (stick wooden-drumstick)) ...)
                              (defmethod beat ((drum snare-drum) (stick brush)) ...)
                              (defmethod beat ((drum snare-drum) (stick soft-mallet)) ...)
                              (defmethod beat ((drum tom-tom) (stick wooden-drumstick)) ...)
                              (defmethod beat ((drum tom-tom) (stick brush)) ...)
                              (defmethod beat ((drum tom-tom) (stick soft-mallet)) ...)
                              1. 1

                                Thanks for the interesting response! It sounds like you’re putting together the object representation and dynamic dispatch, and it seems to me like those things can be related, but don’t have to be. If you have an everything-is-an-object-on-the-heap representation then life is simple, everything is a pointer to some object on the heap, and that object has a vtable glued to it that makes dynamic dispatch and such easy. Then operator overloading can be represented by methods. It’s nice and easy.

                                But if you have a tagged pointer representation, seems like there’s nothing preventing you from having a vtable for each type you squish into the pointer, and checking what your type is to figure out which vtable to refer to. I suppose that puts your numbers small but your method dispatches slightly more expensive, which may not be a good tradeoff.

                                I thought that Lua functioned this way, but now that I double check it appears to special case numbers and bools; calling 1 + 2 doesn’t call any metamethods, but calling foo + bar will call getmetatable(foo)._add(bar) if it exists. Today I learned, I guess!

                                1. 1

                                  Yeah you can definitely do something different than CPython, like Lua does, but I guess my point is that there are surprising consequences of that.

                                  I will also note that none of the languages that are implemented with pointer tagging have big integers by default.

                                  Lua and JS don’t even have ints and floats! That alone makes Python a significantly better language IMO. You need both ints and floats for many programs.

                                  Likewise, all the small/fast Lisps like femtolisp don’t have big integers as their default integer type.

                                  This is basically retreating my comment “Python is an amazing achievement” … I definitely got frustrated with Python’s slowness, but in trying to change some of those decisions with “OPy/OVM”, I learned why they were there, or at least I learned what the tradeoffs were.



                                  I would also I say I learned that Python is significantly better than its contemporaries in the vector of (language design, language implementation). That is, if you compare it to Lua, JavaScript, Ruby, Perl, PHP, and essentially all Lisps, and even Tcl, you won’t find any instance with both a better design and better implementation. [1]

                                  You could argue Ruby is better designed in some ways (which is true, although I’d say it’s worse in more ways), but the implementation is worse than CPython. Likewise JS has faster implementations, but the language is significantly worse.

                                  That’s not the “common wisdom”, since there are definitely faults with Python’s design and implementation, but after spending several years working with this type of code, I strongly believe that.

                                  Basically, Python got popular on its merits… it is actually a better language for many of applications. Or you can say as some people do “it’s the second best language for every problem” :)

                                  [1] e.g. The thing I found funny about Racket is that up until recently, it’s just a big pile of C code, exactly like CPython, but not even as polished.

                        2. 5

                          Alan Kay explicitly said that he copied this from Lisp. It’s a really old technique.

                          I was quite surprised that even optimised Python implementations don’t do it. There was a Python JIT (whose name completely escapes me) that was briefly fashionable about 5 years ago. I had an MPhil student add pointer tagging to it and it gave a massive performance improvement (5% in the worst case, doubling the performance in some things).

                          1. 4

                            CPython is not known for its performance. I expect that pypy does pointer tagging; cpython does not.

                            (Though cpython does intern the integers in range of [-5, 255].)

                          2. 2

                            It’s important to remember that these things aren’t necessarily mutually exclusive, even where they are asserted of the same kind of high level ‘thing’. For instance, in Ruby everything is an object, but everything is also an expression. However, those statements actually assert something about two different categories of ‘things’. The things that are objects are all values. The things that are expressions are all syntactically valid strings of Ruby.

                            1. 2

                              Forth: everything is a word

                              1. 1

                                I think this thinking while useful at times, can send programmers down some pointless or even damaging rabbit holes. Often I’ve seen programmers create or refactor code to make everything an X. Sometimes it helps cohesiveness, but other times it just makes things more confusing - things you thought were an X don’t really represent well as an X and are a bad abstraction.

                                1. 8

                                  RPC is one of those counterexamples. “Everything on the network is a function [or an object]” can get you in a lot of trouble when those functions/objects turn out to have very unusual properties, like enormously high latency, or a tendency to fail with unexpected exceptions during network outages.

                                  When working on iChat at Apple, a co-worker and I once spent a while tracking down a weird performance problem; it turned out to be caused by an innocent looking object (an NSURL) secretly being a DistributedObjects proxy, so every time it was asked for its scheme or path the caller had to wait for an IPC call to another process where the real URL lived. That was happening in some inner loops but we hadn’t paid attention because it was obvious that nothing slow was going on in that code..l