1. 21
    1. 9

      Another interesting example of monoid is sorted lists, with [] as identity and “sorted” merge as the operation. Depending on the reduction strategy, reduce over this monoid can yield sorting algorithms such as mergesort and timsort.

      1. 6

        The all([]) being true one is a vacuous truth. It does make sense, if you think of it just being “does every element of this list satisfy this predicate?” where of course the answer is yes, every element does - because there are no elements.

        I kinda vacillate in my opinion on vacuous truth - sometimes it’s useful, sometimes it’s annoying and checks succeed that ideally would have failed. For example, in TLA+ it’s possible to accidentally define a spec with zero valid behaviors. This means every liveness property will evaluate to true, because you’re basically asking “for every behavior generated by this spec, does this liveness property hold?” and the answer will be yes for the same reason that all([]) is true. But actually there’s just a bug and so you might be led to believe your liveness properties hold when really they do not.

        This is one of many reasons why Lamport says to “be suspicious of success”, and it’s good practice to occasionally check your spec with some property that should be obviously false, just to make sure it’s caught and nothing weird is going on.

        1. 2

          Some tangential math: proofs about John Conway’s surreal numbers often have a vacuous truth as the base case. Every surreal number is a pair of sets, each of which contains nothing but other numbers. A pair of the empty set is a number (zero), because the empty set doesn’t contain any non-numbers. Proofs of the form “a number has property X, if all the members of its sets have that property” are always true because eventually you get to zero and the empty set, for which it is (vacuously) true that every member has any property you want.

          1. 2

            Could be fixed if it took a nonempty collection which eliminates these edge cases while forcing downstream to consider the problem & what solution makes sense for them

            1. 7

              I’d argue that having the behavior be consistent with widely accepted mathematical/logical set theory isn’t something which needs to be fixed. If the downstream needs to special case empty set behavior it seems appropriate that they do so without introducing a deviation only to create a forcing factor.

          2. 3

            I believe max forms a monoid over R∪{−∞}, and python, with the math import, has float('-inf'). I wonder why max doesn’t return that? To avoid a dep on math, or to avoid extending the set of possible values? Because that imo is the natural answer, and what J does, eg:

               >./ ''
            __
            

            The above is a max reduce over the empty list, and __ is negative inf in J.

            1. 9

              I wonder why max doesn’t return that?

              Because min/max don’t exclusively work with numbers. So they take a default and you can byo fallback instead.

              1. 1

                Ah, fair enough. Still, couldn’t it detect if it was an iterable of numbers and choose a suitable neg inf default when none was provided?

                1. 5

                  No, Python is dynamically typed, a list is a list, it does not indicate what payloads it has. Some more specific collections might (numpy arrays) but those would likely provide their own min/max handling those cases (and possibly taking advantage of the collection’s implementation details to be higher performing).

                  A second issue is that this is an unnecessary and error prone inconsistency, and such subtle inconsistencies are generally footguns.

                  1. 1

                    A second issue is that this is an unnecessary and error prone inconsistency, and such subtle inconsistencies are generally footguns.

                    Imo, the current behavior is the inconsistency, but I can see arguments both ways. Mainly, that returning neg inf makes the return type inconsistent, or at least more complex. For most typical use cases I’m imagining though it’s the behavior I’d want, as it avoids having to write special case code everywhere there might be an empty list.

                  2. 2

                    An implementation of Python could detect such, but Python as a language, doesn’t have a way of achieving such knowledge, and I think adding it would be non-trivial. While CPython is popular, some of the other implementations do have a fair chunk of market share in their areas too, so I’d assume that would deter against making implementation-specific behavior for such a key function.

                    The fun of dynamic typed languages.

                2. 4

                  max(l) in l is a useful property to have. That likely outweighs the issues with max([]) being an error.

                  1. 2

                    max(l) in l is a useful property to have.

                    Interesting point I hadn’t considered. As noted in the article, the other natural invariant we’d like is:

                    max(xs) = max(max(xs), max([]))
                    

                    which the error behavior breaks. So now you have a battle between useful invariants. I still think the above one is more important, but your point and the other responses here have convinced me the issue is subtle.

                    1. 2

                      at least it’s relatively easy to define max_prime from max that returns some sentinel for “nothing there” and lets you have the invariant if you wanted it.

                      Kind of hard to talk about these invariants in the abstract though, because it’s dependent on what you’re doing. Like if you want to write up some monoid-spanning code, then yeah you’ll want max_prime. If you want to do multicore max’ing then you likely want max(xs + ys) = max(max(xs), max(ys))… and there max_prime might be useful. etc etc.

                      Though I do think that a universe where max([]) returns None would cause some downstream difficulties in Python given we have both heterogenous lists, and very open systems for things like comparisons, making it hard to have a pre-defined None to return that would not cause other issues. But that’s a Python problem.

                  2. 3

                    You don’t need the math import for float("-inf"), float is a builtin. -math.inf requires a math import, but they both return the same value. You can also use hacks like -1e999 which evaluates to -inf also (due to fp64 rounding).

                  3. 2

                    getting a non-Haskell language to automatically determine that you’re reducing over a monoid is more trouble than it’s worth

                    Many languages incorporate ad-hoc mechanisms (which is a bit unfortunate). In common lisp and raku, reducing over an empty sequence will invoke the reducing function with zero arguments. In j and other apls, there is typically a hard-coded table of identities for primitive functions, with no recourse for user-defined functions (you just get an error). This is something I have been planning to properly fix for j (along with much more exciting but related stuff, like annotating a function as associative so reductions can be parallelised), but have yet to get around to.

                    1. 4

                      To be fair, Haskell solution is also suboptimal. Haskell encodes the identity in the typeclass instance, but the most common monoid (Int) has 6 different plausible instances of Monoid (+, *, ^, &, min, max).

                      Haskell solves this with type wrappers, but type wrappers are often just as much typing as specifying the identity manually (and they require extra machinery in the compiler for efficiency reason).

                      1. 5

                        I’m not sure why that’s suboptimal. You wouldn’t want it to guess the monoidal context of your Int <>. If you’re worried about typing too many characters (which I don’t think is a justified worry), you can use the sum and prod functions in Haskell instead of mconcat . map Sum.

                        1. 3

                          True—this is a general problem with typeclasses. There were some proposed solutions—under the name ‘modular implicits’, I believe?

                          1. 1

                            Modular implicits, if I understand them correctly, work by selecting whatever module[1] instance of Monoid is in scope. Any ambiguity would result in a compile-time error, which the caller can resolve by specifying explicitely the module.

                            So I’d say it is better compared to Haskell, as it gets rid of type wrappers, but there would still be wrappers for ambiguous cases. I.e. this cannot be made to work

                            hypotheticalFoldr (+) myIntList
                            

                            you’d need to write something like this

                            mconcat {MonoidSum} myIntList 
                            

                            where MonoidSum is the name of the monoid instance.

                            Also, the modular implicits paper mentions that they would require extra annotations compared to Haskell typeclasses, so not necessarily a win overall)

                            [1]: Modular implicits have been proposed in OCaml, I guess the Haskell equivalent would be a typeclass instance.

                            1. 1

                              mconcat {MonoidSum} myIntList

                              Explicit dictionary passing is exactly how typeclass constraints are implemented in the GHC ‘Core’ phase : https://serokell.io/blog/haskell-to-core

                              I’d say that the newtype wrapper + typeclass approach of giving the same algebraic structure to different concrete types is a very principled way of letting the computer people and the math people talk to each other. Leaving aside the debates on surface syntax which are quite uninteresting at the end of the day.

                          2. 1

                            min/max are not monoid instances since they don’t have mempty. That’s why maximum/minimum from Data.Foldable1 should now be preferred.

                            They are not partial unlike the ones in Data.List.

                            1. 3

                              Notice that I mentioned Int, not Integer. From the Data.Int docs:

                              A fixed-precision integer type with at least the range [-2^29 .. 2^29-1]. The exact range for a given implementation can be determined by using minBound and maxBound from the Bounded class.

                              You can implement mempty as maxBound (respectively minBound).

                          3. 1

                            Do you work for jsoftware, or just planning to contribute?

                            1. 6

                              ‘Jsoftware’ doesn’t have employees—it hasn’t the money for them, and it’s not clear to me that it ever did. There is a loose agglomeration of people who do things j-related in more or less official capacity, and I suppose I am one of them. I have a commit bit and have done various things (most notably, I had a major hand in the concurrency system and the now-not-insane error messages, as well as a few fun integer-factor speedups).

                              1. 2

                                ‘Jsoftware’ doesn’t have employees—it hasn’t the money for them, and it’s not clear to me that it ever did.

                                Interesting, I wasn’t aware of that. I knew it wasn’t big, but had assumed it was chugging along as a small business selling commercial licenses and what not.

                                Thanks for your work on the error messages, I’ve noticed the improvements in the recent versions.

                            2. 1

                              Aside: I was wondering how [max] in Raku worked over an empty lists, -Inf gets returned, how that works for the type system? I think Inf and -Inf are special cased to always be more and less than anything else. Fun!

                              1. 1

                                Reject modernity; embrace tradition: all([]) == false