Threads for jdegoes

    1. 2

      Please do!

      1. 1

        Propose one. :)

      1. 1

        Most of these definitions seem pretty good, but I have some critique of a few of them.

        An abstraction is a precise description of the ways in which different data types share common structure.

        Hm, that definition captures only a very narrow sense of the word. Functional programming is based on lambda calculus, and in lambda calculus the word abstraction means lambda abstraction (i.e., what most people would informally call a function). Of course, the word abstraction can also mean many other things, even within functional programming.
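To make the lambda-calculus sense of the word concrete (a trivial Haskell rendering; the name inc is mine, not the glossary's):

    -- A lambda abstraction: binds x and yields x + 1. Most languages
    -- would informally call this an anonymous function.
    inc :: Int -> Int
    inc = \x -> x + 1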

        An algebra is a set of objects together with a set of operations on those objects.

        I think it’s more common for object, in a mathematical sense, to refer to an entire group (which is more consistent with its usage in category theory), not the elements of a group. The word elements is more appropriate for a group because a group is a set with some structure, and a set contains elements.
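To illustrate the elements-vs-operations point in code (a minimal Haskell sketch; Semigroup' is a made-up name to avoid the standard class): an algebra pairs a carrier, whose elements are the values of a type, with operations on that carrier.

    -- A carrier type a together with one binary operation on its elements.
    class Semigroup' a where
      combine :: a -> a -> a

    -- The integers under addition form one such algebra.
    instance Semigroup' Int where
      combine = (+)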

        Functional Effect

Interesting. I’ve never heard this term outside of this blog post. In papers about monads and algebraic effects, computational effect is the term I usually encounter.

        To map over a polymorphic data type is to change one type parameter into another type, by specifying a way to transform values of that type. For example, a list of integers can be mapped into a list of strings, by specifying a way to transform an integer into a string. To be well-defined, mapping over any polymorphic data type with an identity function must yield something that is equal to the original value.

I’d really expect to see at least a passing mention of the word functor here, if only to give the reader a keyword to Google. It’s interesting that only one of the functor laws is mentioned here (the identity law, but not the composition law). Seems arbitrary?
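For reference, here are both laws as usually stated in Haskell, with a small self-contained check on the list functor (my example, not the post’s):

    -- Functor laws:
    --   identity:     fmap id == id
    --   composition:  fmap (g . f) == fmap g . fmap f
    main :: IO ()
    main = do
      let xs = [1, 2, 3] :: [Int]
      print (fmap id xs == xs)                       -- identity
      print (fmap (show . (+ 1)) xs
               == (fmap show . fmap (+ 1)) xs)       -- composition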

        Sometimes called generics, parametric polymorphism is a feature of some programming languages that allows universally quantifying a function or a data type over one or more type parameters. Such polymorphic functions and polymorphic data types are said to be parameterized by those type parameters. Parametric polymorphism allows the creation of generic code, which works across many different data types; and also the creation of generic data types (like collections).

        This is good, but it misses the most important point about parametric polymorphism: parametricity. This property distinguishes parametric polymorphism from ad hoc polymorphism.
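Concretely, parametricity lets you derive facts about a function from its type alone. A standard illustration (mine, not the glossary’s):

    -- A total function of this type can only be the identity: it knows
    -- nothing about a, so it cannot inspect or fabricate values of it.
    f :: a -> a
    f x = x

    -- Any function of this type can only rearrange, drop, or duplicate
    -- elements; it can never examine them. Ad hoc (overloaded)
    -- polymorphism offers no such guarantee.
    g :: [a] -> [a]
    g = reverse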

        Universal quantification asserts that a statement or type holds for all possible values of a given variable.

It’s a bit awkward to talk about a type “holding” for values. This makes a little more sense in the context of Curry-Howard, but even then it’s still fairly awkward exposition. I would expect this entry to make at least a passing mention of parametric polymorphism, which also makes an appearance in this glossary.

        A universal type is a type that is universally quantified over. In programming languages, all type parameters of functions and data types are universal types.

        Hm, I’m still getting this feeling that the glossary contains many related (or even synonymous) terms without linking them together in a helpful way.

        1. 1

          Thank you for your feedback, I incorporated much of it into a revision!

I didn’t change abstraction since I am deliberately using a narrower definition, a functional analogue of how the word is used in mainstream programming languages like Java. (Any definition that considers the C preprocessor’s #define to be a form of abstraction is neither useful in everyday parlance nor related at all to how mainstream programming language communities use the term.)

        1. 6

          You know, it’d be really nice if this didn’t immediately hit a login wall. :|

          1. 2

            Sorry about that. The usability has suffered since we switched to Dryfta for managing the event (hopefully the other perks will make up for it).

            1. 1

              Maybe the link can be changed to the main home page instead?

              I know I won’t be submitting a paper, but I’m still interested in other information, like seeing when and where it’s held.

          1. 5

            This post seems to be arguing for two things that are actually unrelated to conditional expressions, which are unavoidable:

            1. The type signature of a function with multiple parameters that share the same non-descriptive type (bool, int, etc) can be made more descriptive and less error prone (i.e., wrong argument order) by introducing new types.

2. You can make a function more general purpose by parameterizing it with functions, in the quintessential functional style used by functions like map or filter.

            I think introducing a lot of tiny, single-purpose, two-valued sum types like in (1) is ugly from a code aesthetics point of view. What we’re really after here are compiler-checked names for arguments, which can be achieved using records (example in ML):

            match : string -> { ignoreCase : bool, globalMatch : bool } -> string -> bool
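
For what it’s worth, here is the same idea in Haskell (a sketch with hypothetical names): record fields give compiler-checked argument names at the call site.

    import Data.Char (toLower)
    import Data.List (isInfixOf)

    data MatchOpts = MatchOpts { ignoreCase :: Bool, globalMatch :: Bool }

    -- Toy substring matcher: the field names, not argument order, carry
    -- the meaning. globalMatch doesn't affect a Bool result, so it is
    -- ignored here.
    match :: String -> MatchOpts -> String -> Bool
    match pat opts s = norm pat `isInfixOf` norm s
      where norm = if ignoreCase opts then map toLower else id

    -- Call site: match "ab" (MatchOpts { ignoreCase = True, globalMatch = False }) "xAByz"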
            

My problem with the overall argument about “removing conditionals” is in the real-world example provided, where a function is basically just split into two pieces, and the caller is assumed to pass a constant that selects either the piece that performs a dry run or the real deal. What if that choice depends on a dynamic value, or some other more complicated situation? The conditional would just move somewhere else:

            if dryRunCmdOption() then publish dryRunOptions else publish forRealOptions
            

            I don’t think breaking functions into pieces to avoid a conditional expression is a rational design choice.

            1. 6

              The conditional would just move somewhere else…

              That sums up my read of this article as well. The author seems to be making an argument that either ignores Conway’s Law or underhandedly exploits it by trying to move decisions from the callee to the caller. Any code that has value is going to need to make decisions of some form, and the author hints at this in the aside “now the function match has been simplified so much, it no longer needs to exist, but in general that won’t happen.”

              They might be trying to push the decision making off to some other team as you describe, but what’s to stop that team from passing the buck further? It seems that you would end up pushing everything to the “business” people who write configurations, which leaves you with code that works perfectly but doesn’t do anything, and pathologically complex configurations that never quite work but hold all the cards. Which, I guess, is exactly what happens in shitty enterprise-ware.

              The point of decision is where the work happens, by definition. There are better and worse places to put that depending on context, but pretending it can be entirely shoved-off is not realistic.

              1. 1

Introducing sum types for things like booleans does, in my experience, reduce errors; but I wasn’t arguing for that. I was arguing for one step beyond that: eliminating the sum type (boolean or otherwise) by extracting the effect of the branch into a lambda, which the caller passes to the callee.
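A minimal sketch of the refactoring I mean, with hypothetical names (not the article’s code):

    type Doc = String

    -- Before: the callee branches on a flag the caller had to encode.
    publishFlag :: Bool -> Doc -> IO ()
    publishFlag dryRun doc = if dryRun then preview doc else send doc

    -- After: the caller passes the effect of the branch directly, and
    -- the callee makes no decision at all.
    publish :: (Doc -> IO ()) -> Doc -> IO ()
    publish action doc = action doc

    preview, send :: Doc -> IO ()
    preview doc = putStrLn ("DRY RUN: " ++ doc)
    send    doc = putStrLn ("PUBLISHED: " ++ doc)

Callers then write publish preview doc or publish send doc, and the boolean disappears entirely.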

              1. 12

                This is a useful technique but I think the author goes too far when advocating for universal replacement of conditionals with lambdas.

                In some cases it could be an improvement to usability and reliability, in others it isn’t.

                In the real world example given, I would argue the refactored version is worse to use and you really aren’t gaining anything. The argument that lambdas are better than conditionals seems to be based on the idea that the conditionals could be faulty and the caller doesn’t really know what’s going on. But if you’re dealing with pre-made lambdas supplied by the same library as the method you’re calling, is that really fixing either of those problems? If you don’t trust the method to have a conditional that works or does what you think it will, why would you trust that the lambdas are going to work and do what you think they will? Just because they in theory don’t have conditionals themselves? Flow control bugs are only one class of bug.

In the example given, the option to dry run or not is now buried inside an options value instead of being clearly set on the method call. That seems like a loss of code and usage clarity. And now the logic of doing a dry run or doing a publish is separated into two lambdas instead of living in the actual publish method. All around it seems like a loss.

Back when OO was really gaining popularity in the early 90s, some people learned the wrong lessons from it and advocated for bad practices. I think the same thing is happening with functional programming now.

                Lambdas are extremely useful and awesome tools. But replacing all conditionals with them is not an improvement or a good use of them, IMHO.

                1. 4

The issue is much larger than just booleans. Fundamentally, it’s about taking semantics, packing them into data values (booleans), feeding them into functions, and unpacking the data values back into semantics (code).

                  Instead of developing bit protocols to serialize intentions, why not just propagate our intentions directly into the code?

                  Indeed, you do that by pulling out those semantics (blocks of code) from the site of the conditional (inside the callee) and into the caller. The callee then doesn’t need to make any decisions, and neither callee nor caller are required to serialize intentions to and from bits (or other data values).

Coding in the if/then/else style is pervasive, possibly because there existed a point in time when programming languages did not let you easily or performantly “propagate intentions”. But every time we engage in that style of programming, we are packing our intentions into bits at the caller site and unpacking them at the callee site, and the use of an intra-program serialization protocol for semantics can, and clearly does, lead to lots of bugs.

                  Others may have different preferences, but as for me, I prefer to propagate intentions directly into the code; and to remove control from a callee to make the callee simpler, easier to reason about, and easier to test.

                1. 4

ArangoDB is a very interesting piece of technology. I’m not so certain I buy the multi-model aspect of it, although there’s a tendency for databases to adopt more and more paradigms (which began in the relational world with XML column types, for example). Rather, the really interesting piece of ArangoDB is Foxx, which lets you build and deploy JavaScript-based microservices that expose data-driven REST APIs. This type of technology could play a key role in serverless architectures and MBaaS.

                  Where Foxx needs the most help right now is in whole-API versioning, so you can push new APIs as atomic changes (which can be rolled back) while still supporting the old APIs, and somehow integrate that with schema migration. Because honestly, if Foxx doesn’t figure that out, it will have all the well-known maintenance problems of stored procedures, which will keep people from trying the approach.

                  1. 5

                    I’ll be there, naturally. :) I also recommend the above Haskell training by Chris & Julie for those who are hesitant to jump into FP. Yeah, FP really is for everyone, even if you don’t know or even hate math. :)

                    1. 15

                      This guy seems overly emotionally invested in the internals of MongoDB.

I find Multicorn and UDFs to be excellent extension mechanisms for PostgreSQL. Whatever gets the job done in the fewest lines.

                      1. 18

A quick reading suggests that his company’s complementary product to MongoDB is being threatened by Mongo’s cheerful repackaging of Postgres; that may have something to do with it.

                        1. 13

                          I’m the author, and you’re right, I’m definitely not unbiased!

                          I have three main biases that I can see: first, I didn’t like the one-sided partner experience I felt at my day job; second, I was a strong proponent for MongoDB to release an Apache 2-licensed BI connector that leveraged open source work I contribute to (which does 100% in-database analytics); and third, I co-founded an open source company based on the premise that relational analytics aren’t a good fit for NoSQL databases.

So yeah, I’m definitely biased. I try not to let those biases cloud my judgement, but I’m no Vulcan.

                          I would have a different opinion of the connector if (a) they had been 100% upfront about the Postgres database and the (severe) limitations of the approach, rather than pounding their chest and omitting the Postgres connection; OR (b) they had released their own connector (even proprietary) that properly dealt with the problem (native support for nested data structures, 100% pushdown, etc.).

                          They didn’t do either. Which means I can’t get behind their decision. Others may come to very different conclusions, which is fine by me. Agree to disagree. :)

                          1. 2

                            Gotcha gotcha, good luck to you sir. :)

                            Out of curiosity–what do you mean by 100% pushdown?

                            1. 5

                              Thanks for that! And sorry for the jargon.

                              By 100% pushdown, I mean that every query is translated into operations on the target system that run entirely inside the database. Without pushdown, you end up sucking data out of the database, and relocating it into another system which actually executes the query.

The whole analytics via PostgreSQL via FDW via Multicorn via MongoDB route ends up pulling ALL the data out of ALL the referenced collections for nearly ANY possible query (!).

                              Which only works if the collections are really, really small. :)

                              1. 3

                                Predicate pushdown is a more common name for the concept, which makes its meaning more obvious. You push predicates down the levels of abstraction closer to the data. Applying predicates reduces result set size, so the sooner you apply them, the less data you have to transfer around to other systems.

                                But you can also push down other operations. In addition to what @jdegoes said, this shows up a lot in big data type stuff. For example, MapReduce can be done in strictly Map / Shuffle / Reduce phases, but it’s (almost always) better to run the reduce locally on each map node before shuffling the map results over the network.
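A toy word count makes the local-reduce idea concrete (a sketch, not any particular framework’s API):

    import qualified Data.Map.Strict as M

    -- Each "map node" reduces its own partition before the shuffle, so
    -- only one (word, count) pair per distinct word crosses the network
    -- instead of one pair per occurrence.
    localReduce :: [String] -> M.Map String Int
    localReduce ws = M.fromListWith (+) [(w, 1) | w <- ws]

    -- The final reduce merges the pre-reduced partial results.
    globalReduce :: [M.Map String Int] -> M.Map String Int
    globalReduce = M.unionsWith (+)

    main :: IO ()
    main = print (globalReduce (map localReduce partitions))
      where partitions = [["a", "b", "a"], ["b", "b", "c"]]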

                        1. 3

                          Wantrepreneurship, even geared towards technical folks, belongs over at HN or elsewhere.

                          1. 3

                            I have to agree. The fact that we don’t have a tag for it ought to be a pretty good clue. :)

                            1. 2

                              Fair enough. :) I do think, though, that every engineer can benefit from a perspective on what life is like on the other side. It’s too easy to demonize or dismiss the “business people” just because their perspective is different.

                              1. 1

                                Agreed - empathy and understanding are really important.

                          1. 3

                            I was one of the people who worked at Precog. Happy to answer questions.

                            1. 2

On point #6, how do you know you’re staying focused while still being responsive? Did you, as a team, ever figure out a rule of thumb?

                              1. 4

I can tell you what I think we did wrong. While we didn’t have many users, which was most of the time:

                                1. We spent weeks on features which only one customer had requested and showed interest in.
                                2. Our developer evangelist spent a lot of time writing the integration code for our customers. I remember one time this went as far as writing unrelated JavaScript for their website. We badly wanted to keep customers.

                                There were times we should probably have just said no. Even if that meant losing one of the very few customers, which would have seemed devastating, it would have allowed us to work on what would have provided more value to more potential customers.

                                1. 1

                                  Been in this position way too many times over the years (though lessons were usually learned quickly after the first time).

                                2. 3

                                  I think now I would ask myself, “Is this work we need to make a sale?” (if so, say no!), or “Is this work that will make it impossible for our segment to even consider using anything else for our use case?” (if so, do it!).

To some extent, it’s about not being so desperate for customers that you’re willing to go every which way just to close a few deals (see puffnfresh’s answer below).