1. 35
  1.  

  2. 6

    The natural numbers also form a monoid under multiplication as well:

    1 x 1 = 2

    What?

    1. 1

      Just a typo.

      1. 1

        just off by 45 degrees ;)

        1. 10

          Corrected (I shall be remembered as the person who wrote a math textbook and made a mistake in calculating 1 * 1)

          1. 2

            Eh, one of the old answer guides to Spivak’s Calculus has 2 - (-2) = 0 in it, so they literally got 2+2 wrong.

            1. 1

              Relax, everyone makes mistakes

        2. 5

          I’m not sure the “white as identity in paint mixing” is particularly good as an analogy. If you mix white paint with blue paint, you’ll usually get a lighter shade of blue. A colourless thinner or water might be slightly closer to an identity, although it will still change the blue paint. Paint mixing looks more like a semigroup than a monoid.

          1. 3

            This is a good nuance. We do want to think of both additive and subtractive color models as having a monoidal operation, but we aren’t guaranteed to have a workable unit just because we have all of the rest of a monoidal relationship. White and black only work as monoidal units when we are mixing abstract colors on a computer screen, and that ultimately isn’t because white and black are absolute endpoints in color modelling, but because we have a maximum and minimum brightness value on our display hardware.

            1. 2

              Good nuance indeed. Added a little acknowledgement, but I think I wouldn’t go so far as to remove the part altogether - my example is not meant to depict real-world color mixing (the mixed colors are not really the ones that you would get if you mix the ingredients), just a schematic representation of it. Whether or not it is accurate depends on interpretation (as does any other “application” of a mathematical law).

          2. 3

            I’m surprised you left “closure” as a group property until the very end. It’s one of the fundamental rules.

            1. 2

              It’s fundamental, but only if you view groups from category-theoretic perspective, e.g. it is not mentioned in the canonical group definition.

              1. 3

                Where are you getting the canonical group definition? I just checked Serge Lang’s Undergraduate Algebra and it’s mentioned in the first paragraph on groups:

                A group G is a set, together with a rule (called a law of composition) which to each pair of elements x, y in G associates an element denoted by xy in G, having the following properties.

                1. 1

                  Well, yeah my definition is roughly the same:

                  A monoid is defined by a collection (set) of elements and an operation that allows us to combine two element and produce a third one of the same kind.

                  What I meant to say is that “closure” is not listed explicitly as a property of the operation.

                  1. 2

                    What I meant to say is that “closure” is not listed explicitly as a property of the operation.

                    It is, though. It’s why, for example, subgroups are defined as being a closed subset of the group.

                    1. 1

                      .I haven’t gotten to subgroups in my article, I actually haven’t covered much material, as a whole.

            2. 3

              A tangent, not to detract from the article: this reminded me that I am really interested in examples of using such things in production with positive effects.

              I understand what monoids are and how to use them. I’ve literally never seen: “Look, we turned this into a monoid and as a result we uncovered bugs, had fewer bugs, solved another issue, made it easier to implement X, …”. If it would be in any language not Haskell that would be even more awesome.

              I need compelling examples as nucleation points and without those I just don’t ‘get’ things. You can call it a failure of imagination if you like: that doesn’t change that actually using category theory in my coding practices isn’t something I understand how to do, but I feel is something I could learn with the right nudge.

              1. 5

                You’re asking for soup from a stone, in some sense; category theory doesn’t give us libraries, it gives us an understanding of innate structure.

                To give two examples from capability theory, let us consider an object graph in a heap. An object will merely be a collection of references to other objects, allowing duplicates and cycles. Also, some objects may be the same even though they have multiple copies on the heap, and some objects may be transparent forwarders who hold a single reference and should be considered the same as their referent. The first example is that, with a bit of reflection, this is all of the data of a hypergraph with path equivalence, and hypergraphs with those equivalences are categories! Every capability object graph has an underlying category, and if we can evolve the graph over time, then this gives us a sequence of functors.

                The second example comes when we want to imagine that we have multiple objects in multiple heaps, and we want to send messages from one heap to another. We may want an operation, traditionally called join, which can take two references, assert that they refer to the same object, and unify them into one single reference. Note that, in any heap, we should not just be able to join references, but also create fresh new objects which, by definition, are (trivially) the same as themselves. This sounds like a monoid. Also, CRDTs are commutative monoids. This leads to an application of the Eckmann–Hilton argument which suggests that we can find high-level CRDT effects within existing inter-heap effects.

                1. 1

                  hypergraphs with those equivalences are categories

                  And so you know that you can perform certain operations on them that conserve certain properties. So that’s good to know, but now what? You can define mempty and mappend functions and get rid of a bit of boilerplate in operations on the hypergraphs, but it seems to me the interesting thing is realizing they are a category in the first place and knowing what law-conserving operations you can perform. The consequences of that on how you implement your operations seem relatively trivial to me. Actually casting it into a Monoid and implementing the mempty and mappend functions doesn’t seem to be the important/necessary thing here?

                2. 4

                  Consider the kinds of algorithms you write where you start with a collection of items and you want to perform some mapping on each item and then combine the results. For example, you want to add one to each integer and sum them, or convert integers to strings and concatenate them, or map integers to lists of integers and then flatten the resulting list.

                  Each of these examples can be implemented using the function foldMap in Haskell. Let’s look at the signature:

                  λ> :t foldMap
                  foldMap :: (Foldable t, Monoid m) => (a -> m) -> t a -> m
                  

                  foldMap asks for a function from any type a to some type m which is a Monoid. It also asks for some Foldable of that type a (we’ll just use lists). It returns a value of type m by applying your mapping function to each element and then “combining” the results. How does it know how to combine them? The Monoid instance for each type tells it how. Let’s try it out.

                  λ> foldMap show [1, 2, 3, 4]
                  "1234"
                  

                  Here we convert an integer to a string, and then concatenate the results for produce a single string.

                  λ> foldMap (\i -> [i, i + 1]) [1, 2, 3, 4]
                  [1,2,2,3,3,4,4,5]
                  

                  Now we effectively do a “flat map”, producing a list per initial item and then concatenating all of those lists together.

                  λ> import Data.Monoid (Sum (..))             
                  λ> foldMap Sum [1, 2, 3, 4]
                  Sum {getSum = 10}
                  

                  This time, we sum up the numbers in a list. Note that we had to explicitly wrap our integers in a new type Sum. Why? Because summing of integers is only one of several valid Monoid instances for integers. Here’s another:

                  λ> import Data.Monoid (Product (..))
                  λ> foldMap Product [1, 2, 3, 4]
                  Product {getProduct = 24}
                  

                  The power of the Monoid abstraction has allowed us to implement all these seemingly different algorithms with a single function, foldMap. This also means that any new data type I create that has a valid Monoid instance will be able to take advantage of this function. I hope this helps!

                  1. 1

                    This also means that any new data type I create that has a valid Monoid instance will be able to take advantage of this function. I hope this helps!

                    Thank you for your effort, but unfortunately it doesn’t. Your comment explains the abstraction and what you can achieve with it. What I’m trying to understand is when this occurs in practice.

                    As far as I am aware I don’t regularly find myself foldmapping collections of values. I don’t find myself applying the same function to different datastructures. I don’t find myself applying different functions to the same datastructure, where those functions contain boilerplate or other abstractable details. But I may be wrong and just not recognizing it.

                    Say you’re building Lobsters, Shopify, Stripe, Gmail, … Where would you expect to encounter a use case for defining a Monoid? How does doing that make the code better?

                    1. 2

                      Say you’re building Lobsters, Shopify, Stripe, Gmail, … Where would you expect to encounter a use case for defining a Monoid? How does doing that make the code better?

                      I think this same question could justifiably be asked about patterns like map and fold. The challenge in directly answering the question is that the scope of the Monoid (or map, or fold) pattern is quite granular, while the problem being solved is stated quite broadly. Having Monoid as one of your foundational tools changes the way you construct software to solve broad problems, but it is hard to say “in designing Lobsters, you’d use a Monoid here” just as it is hard to say “in designing Lobsters, you’d use a fold here”. Being specific about use cases for a fine-grained pattern in a very broad (and possibly vague) problem space is tricky. I can’t tell you exactly where you’d use a fold in building Lobsters, but I know I wouldn’t want to built it without the capabilities fold provides.

                      1. 1

                        The difference I see is that I, and I assume most others, have a good feel for when to use them, because I use them all the time, without thinking about monoids, categories, laws, etc. You just don’t (need to) identify and turn datastructures (explicitly) into monoids to use them and the benefit of doing so seems negligible in most cases.

                        Would it be wrong to say that these abstractions are useful if you find yourself applying the same function to multiple datastructures or applying a bunch of different functions with ‘reusable’ implementation details to the same datastructure, in addition to it being useful to recognize a datastructure as forming a certain category, meaning it satisfies certain conservation laws under certain operations, but that that is all there is to it?

                  2. 3
                    • If you know your operation is monoidal, you can distribute it across multiple nodes and combine the results with their neighbours as they arrive.

                    • If you know your operation is commutative, you can distribute across multiple nodes and combine the results as soon as they come in.

                    • If you know your operation forms a group, you can recover from mistakes without restarting from scratch (by applying enough inverses that you correct your mistake).

                    You can write all of these abstractions in terms of a monoid/abelian monoid/group interface and let the caller choose the precise structure that he or she wants.

                    An extremely useful example is key-value maps. Haskell’s monoidal-containers library provides Monoid instances for maps if the values have a Monoid instance:

                    instance (Ord k, Semigroup m) => Semigroup (MonoidalMap k m) where
                      MonoidalMap p <> MonoidalMap q = MonoidalMap (Map.unionWith (<>) p q)
                    instance (Ord k, Monoid m) => Monoid (MonoidalMap k m) where
                      mempty = MonoitalMap Map.empty
                    

                    From this, we can recover a huge array of useful map-combining operations:

                    • Keep the first: MonoidalMap k (First a)
                    • Keep the last, like the left one provides defaults: MonoidalMap k (Last a)
                    • Build a table of running counts, and sum them together: MonoidalMap k (Sum Int)

                    I imagine there’d be a way of expressing this polymorphism in imperative languages.

                    1. 2

                      I feel a similar way. I’d like to apply category theory to produce better code but I don’t really get how to do it in general. Should I try to turn my abstractions into categories, and how does that result in better code? Does this question even make sense?

                      1. 1

                        There was this really awesome article about the benefits of maths that I would have linked as a response to your questions, but I couldn’t find it. The point was that maths is not so much about specific improvements of your practice, as it is about broadening your perception, so the benefits come secondary and are not directly related to your knowledge i.e. you might be already using what you learned without knowing it.

                        I have this joke that mathematics is a scam, and that it only works because people get smart while studying it.

                        1. 1

                          Now assume someone has already studied it and has already gotten pretty smart, but doesn’t see how turning any of the datastructures they encounter in their working life into e.g. a monoid is going to help them.

                      2. 2

                        (Starting a new comment because I think the discussion will be different)

                        Another interesting about Monoids is that the product of two monoids is also a monoid. This means that doing things like producing statistics over some data set can be broken into: split the data set into chunks, compute all the statistics you want to that dataset, say, average of column A, word count across column B, the hash of the values in column C, and then you can combine those results from all chunks trivially.

                        calculateStats :: 
                          Document [Col "A" Double, Col "B" Text, Col "C" Base64Binary] -- made up, just to give some imaginary CSV a type
                          -> (Average, WordCount, MonoidalHash MD5)
                        

                        with Average, WordCount and MonoidalHash MD5 having an instance for Monoid themselves, then the type (Average, WordCount, MonoidalHash MD5) is also a monoid. We can now take our thousands of CSVs and compute these statistics individually across a cluster of machine, and then combine them. (I’m not sure the hash of all Base64 encoded strings in some CSVs is particularly useful, but you can do it - perhaps a bloom filter would be a better example, since they are also monoidal, by combining them using OR)

                        1. 1
                          • If you know your operation is monoidal, you can distribute it across multiple nodes and combine the results with their neighbours as they arrive.

                          • If you know your operation is commutative, you can distribute across multiple nodes and combine the results as soon as they come in.

                          • If you know your operation forms a group, you can recover from mistakes without restarting from scratch (by applying enough inverses that you correct your mistake).

                          You can write all of these abstractions in terms of a monoid/abelian monoid/group interface and let the caller choose the precise structure that he or she wants.

                          An extremely useful example is key-value maps. Haskell’s monoidal-containers library provides Monoid instances for maps if the values have a Monoid instance:

                          instance (Ord k, Semigroup m) => Semigroup (MonoidalMap k m) where
                            MonoidalMap p <> MonoidalMap q = MonoidalMap (Map.unionWith (<>) p q)
                          instance (Ord k, Monoid m) => Monoid (MonoidalMap k m) where
                            mempty = MonoitalMap Map.empty
                          

                          From this, we can recover a huge array of useful map-combining operations:

                          • Keep the first: MonoidalMap k (First a)
                          • Keep the last, like the left one provides defaults: MonoidalMap k (Last a)
                          • Build a table of running counts, and sum them together: MonoidalMap k (Sum Int)

                          I imagine there’d be a way of expressing this polymorphism in imperative languages.

                          1. 1

                            One particularly interesting use case is parsing utf-8 text in parallel - you can take an arbitrary large byte string, split it into chunks, and parse each chunk into a monoidal type which represents whether that chunk a) contains any invalid bytes, b) whether the beginning of the text appears to split the middle of a code point, and c) whether the end of the chunk appears to split a code point. If you now have chunk parsing results A and B, the monodical operation will look at these to see:

                            a) Are both chunks valid utf-8? If so, we can say that the combination of both chunks are also valid

                            b) Does the end of A and the beginning of B say they thing this is a split code point? If so, can we combine the fragments and produce a valid combined parse for both chunks, otherwise we produce an invalid parse

                            c) does either chunk say it contains an invalid parse? Then the combination is also an invalid parse (we could collect the locations of these if you wanted)

                            Now imagine you have 20TB of what you think is utf-8 text, you can distribute arbitrary chunks among many machines, each off which can split their chunk into multiple fragments, and we can combine all the results together as long as we keep track which chunks each result comes from.

                            This same idea can be applied too many real world applications (though some end up being extremely difficult or the state machine becomes insane; IIRC CSV is very difficult to do this with because it’s hard to tell if you’re inside a string or not). I believe JSON parson may also be amenable to this.

                          2. 1

                            I think that for the octagon example, you want to rotate it 360°/8 = 45°, not 30°?

                            1. 1

                              Corrected, thanks for pointing that out.

                            2. 1

                              Nice illustration, but could you elaborate on the visualization of groups as the rotation of a triangle? All previous examples (i.e. colorful balls, numbers, true/false) present morphisms are functions that take two objects and return a new one, but rotation as a function only takes one argument (the triangle in some position). What does each triangle in the diagrams refer to, an object/element of a category/monoid/group (the position), or a morphism (the rotation operation)?

                              The concept of abelian groups also confuses me: if “abelian” means f(a, b) == f(b, a), how then could a function with only one argument be abelian?

                              1. 3

                                The essay is mixing two concepts here. The group is rotations of a certain degree, which was represented by the triangles. The elements of the set are degrees and the operation is adding them: ie 120 + 120 = 240, 240 + 120 = 0. To actually apply them to a triangle, you need to introduce the concept of a group action, which turns the elements of a group into transformations of a set.

                                1. 2

                                  The answer of the first question is somewhat hidden in this section:

                                  In order to form the correct intuition about monoids, try to avoid thinking of the elements in the set as objects, but instead think of them as actions. For example, when thinking about numbers don’t think of them as quantities (as in two apples, two oranges etc.), but as operations, (e.g. as the action of adding one to a given quantity). In other words, don’t think of the element by itself, but only think of it together with the operation (in this case addition).

                                  So it’s always about the actions, it is just that with numbers, we often think of + as the action and of 1 as the object, but the action is really there is just an action +1. Another complication comes from the fact that when you see a triangle, you are probably thinking about a specific triangle, as opposed to of triangles in general, which is the intention.

                                  As for the second question - not every function can be abelian, the word “abelian” can only be applied to functions which take two arguments of the same type and return a result of yet the same type - these are called operations.

                                  Let me know if that makes sense for you.

                                  1. 1

                                    Thanks for your response, but somehow it makes me more confused!

                                    So it’s always about the actions, it is just that with numbers, we often think of + as the action and of 1 as the object, but the action is really there is just an action +1.

                                    This sentence and the quote above suggest that each triangle in the diagram refers to the operation of rotating a triangle by some degree. So far so good.

                                    Another complication comes from the fact that when you see a triangle, you are probably thinking about a specific triangle, as opposed to of triangles in general, which is the intention.

                                    Well… so the intention of the diagram is to represent triangles in general, as opposed to the rotation of triangles in general?

                                    As for the second question - not every function can be abelian, the word “abelian” can only be applied to functions which take two arguments of the same type and return a result of yet the same type - these are called operations.

                                    You mentioned that the group of rotations of a general triangle is abelian. Meanwhile, the word “abelian” doesn’t even apply to the rotation of a triangle, since it only takes one argument (the degree to rotate) and thus doesn’t satisfy the definition of an operation. What the hell? :0

                                    1. 2

                                      Ignore the second remark, your first insight is the correct one - the diagrams are supposed to 100% represent the rotations of triangles, (and not the triangles themselves) and it is the rotations (and not the objects themselves) that are the arguments to the group operation.

                                      But the rotation itself is not the group operation. Rather the group operation is a function that takes two rotations and returns a third rotation which is equivalent to the other two. This function would just perform the two rotations one after the other.

                                      And because rotations can be applied in any order without changing the end result, we say that this operation is abelian.

                                      1. 1

                                        That makes sense now! I just realized that the morphism of rotation has associativity, and that is why we can combine two rotations into one without changing their semantics.