1. 11

    You might also find the explanation of Bloom filters in “Cache Efficient Bloom Filters for Shared Memory Machines” helpful, particularly the description of how to make Bloom filters grow dynamically as they start to get too full.

    My C property-based testing library, theft, uses a dynamic blocked Bloom filter to check whether it has already run a property test with a particular combination of arguments. This eliminates a lot of redundant activity, and also helps track how often duplicates are getting generated.

    1. 4

      I know silentbicycle knows this, but I’m working on a set of bloom filters written in Rust. I already have an implementation of a Blocked Bloom Filter based on the paper linked above.

    1. 6

      sh -c “curl https://babushka.me/up


      1. 3

        I mean you can read the file by going to the url. It’s not particularly complex and pretty well documented.

        1. 3

          It’s clear people keep doing this for a reason, and fighting it isn’t working. Perhaps it should be embraced by popularizing a new tool that downloads a script, checks its signature, and then runs it.

          1. 3


            A friend wrote this recently.

          2. 1
          1. 22

            The whole message: http://marc.info/?l=linux-kernel&m=137390362508794&w=2

            Using humor to advocate for a poor managerial style can still be advocating for that style. Humor gives the speaker a deniable position, since they can later claim they were “just joking around”. It is easy to read this humor as consent, or at least an implication, that this particular managerial style won’t be policed.

            Sarah is correct. That behavior is unprofessional. Obviously.

            1. 14

              While what you say about managerial styles is true in the abstract, it isn’t applicable in this particular case. When Linus says, “The guy is a freakish giant. He should scare you. He might squish you without ever even noticing,” he is not deniably advocating that kernel contributors should use “violence”, as Sarah said. We can infer this from the fact that there are no known cases in the last 24 years of one kernel contributor being physically crushed by another, nor of Linus physically crushing anybody, so this is not a management style that is currently in use in the project, nor one that Linus has experience with or an inclination for. Rather, Linus is making fun of Greg’s unusually tall body, using humor and countersignaling to decrease everyone’s level of discomfort with Linus and Ingo substantively criticizing Greg’s conduct, and also symbolically reasserting his dominance over Greg. (You’ll note that Greg didn’t respond by making fun of Linus.)

              When Linus says, “You may need to learn to shout at people,” he is not deniably advocating that kernel contributors should use “physical intimidation”, as Sarah said. We can infer this from the fact that the subject at hand is people emailing patches to Greg from around the planet, which he normally answers by email, just like Linus and Ingo. It is deeply implausible that Linus might mean that Greg should respond to a patch by buying plane tickets, showing up at somebody’s office, and literally shouting at them in order to physically intimidate them. Rather, Linus and Ingo are advocating that Greg should be more critical of patches submitted to him, metaphorically “shouting” by responding to them more frequently with critical emails.

              I recognize that you may be skeptical of the above analysis, but if so, I think you should take my word for it that you lack some basic skills necessary to understand what is happening in this mail thread. Continuing to argue that Sarah was right on these points will reduce your credibility in general. In fact, your best available move by far is to claim that you didn’t mean that Sarah was correct, perhaps because you were joking or because you weren’t talking about the obviously false claims of hers I picked apart in the above paragraphs.

              Perhaps we should leave professional behavior to the clergy, judges, prostitutes, physicians, professors — actual professionals, whose role as often as not has been to institutionalize and defend forms of oppression — and instead try to behave like ideal hackers, recognizing the spark of brilliance in any work of ingenuity that comes before us, even if the author triggers some stereotype you are afflicted with — whether it’s women, or Indian people, or Chinese people, or autistic people, or insane people, or transgender people, or whatever, that you’re biased against. It’s not just laudable altruism; it’s the only way to not look like the idiot Postgres committer who argued against Meredith Patterson’s proposed contribution on the basis that she was a woman and therefore couldn’t possibly do it.

              1. 9

                I’d like to back you up here. I was chatting with someone who worked in a particular division of Microsoft quite some years ago - late 90s, I think - and he described the management style in a neighboring division as “fear and intimidation”. The specific example he gave is of a high-level executive who literally picked up his (large, CRT) monitor and threw it in rage at the wall to make his point. The executive did not get reprimanded for this. That is a physical intimidation move.

                A management culture which intertwines with physical intimidation makes actionable threats (and plausibly carries them out on occasion). A management style such as Torvalds’s focuses on bluster, which is more impressive in the barking than the biting (although I find his legendary emails unpleasant myself and don’t care for bluster), and bluster is so well understood to be practically harmless that there is case law around it (although I can’t find it myself with a minute of Googling).

                1. 3

                  Humor is ambiguous. If someone feels singled out, threatened, or hurt by the behavior of authority figures, can we blame them for calling out the behavior? Or leaving the community outright? How many smart hackers get turned away from the community because the effort needed to participate is too great because of natural human emotional responses?

                  My point was not to highlight the intent of Linus' actions, but to offer another way of understanding the damage they can cause. And I think that damage is obvious.

                  1. 4

                    Gina Likins’s talk at ApacheCon 2015, “How to Thoroughly Insult and Offend People in Open Source” (video), highlights some of the research that shows the damage that poor behaviour can do to a community.

                  2. -6

                    No record of actual crushing perhaps, but certainly of violence. Remember ReiserFS?

                    1. 13

                      Yeah. It’s scary how there were no consequences for that, and the guy continues to happily code within the kernel community.

                      No, wait. I’m lying. He’s rotting in jail, and even if he weren’t, he’d be an outcast. It’s almost like the kernel community doesn’t condone actual violence.

                      1. 8

                        I don’t mean to imply that Linux kernel contributors have never engaged in violence, just that there is no evidence that they have done so as a strategy for managing contributions to the kernel. Hans Reiser is perhaps the most extreme example of this: since he planned and executed the cold-blooded murder of Nina Reiser, he was clearly able and willing to use physical violence in general, but I have never heard a suggestion that he ever used physical violence to manage kernel contributions. And he had the opportunity! He had physically co-located kernel contributors who worked for him as employees!

                  1. 8

                    This post is great. The comments on it are a great example of why comments should go away.

                    1. 8

                      The Haskell “space leak” issue is less of one than it’s often made out to be. It’s something the community addresses early on (by pointing out that, e.g., foldl is pedagogical but shouldn’t actually be used) because Haskell is a high-level language where the community cares a lot about performance. Other languages have just as many gotchas and you can write code with terrible performance in any language, even C.

                      Haskell’s lazy utilities and its strict utilities are generally of high quality and, for the pieces that are janky or have unexpected behavior (e.g. lazy I/O) there are libraries (like pipes) that make the problem a non-issue for most cases. There isn’t really an issue there. Where there is debate is whether Haskell should be lazy by default and strict by opt-in (bang patterns) or vice versa. It’s an interface-level discussion but there’s no problem with the tool.
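
                      To make the foldl point concrete, here is a minimal sketch. The names sumLazy, sumStrict, and sumBang are made up for illustration. GHC's strictness analysis at -O can often rescue the lazy version, so the difference is most visible in GHCi or in unoptimized builds.

                      ```haskell
                      {-# LANGUAGE BangPatterns #-}

                      import Data.List (foldl')

                      -- Lazy left fold: the accumulator is not forced, so without
                      -- optimization it can grow into a chain of (+) thunks as long
                      -- as the list before anything is added up.
                      sumLazy :: [Int] -> Int
                      sumLazy = foldl (+) 0

                      -- Strict left fold: the accumulator is forced at every step.
                      sumStrict :: [Int] -> Int
                      sumStrict = foldl' (+) 0

                      -- The same strictness by opt-in, using a bang pattern on the
                      -- accumulator of a hand-written loop.
                      sumBang :: [Int] -> Int
                      sumBang = go 0
                        where
                          go !acc []       = acc
                          go !acc (x : xs) = go (acc + x) xs
                      ```

                      All three compute the same result; they differ only in when the additions happen.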

                      1. 17

                        Haskell’s laziness is a trap. In common cases it certainly doesn’t bite people, but nobody understands or can explain its performance implications when one hits an uncommon case.

                        I had two different medium-sized personal projects (maybe 10k lines each) which got fairly far before it became clear that performance was several orders of magnitude worse than it should have been, and made them impossible to finish. Heap and time profiles implicated laziness, related to thunks for >>=, but in the end no amount of INLINE annotations, strictness annotations, or deepseq'ing made any difference; >>= simply wasn’t being evaluated promptly. It was actually churning several gigabytes of heap space every second of execution, which is not automatically abnormal given that the heap profiler includes immediate reuse as a new allocation, but in this case it was “real”, and the CPU was pinned as well.

                        Several years after I’d given up, I figured out why - I’d designed both projects around free monads (such a pretty abstraction). These are “well known to cause performance issues” according to some papers I’ve since seen on how to improve them, but nobody I asked for advice thought of that at the time. Experts in the Haskell community are pretty accessible to intermediate users, but the experts were as lost as I was.

                        For the record, by the way, I’ve heard lots of people claim that the issue is fixed. I don’t believe that to be the case in general, although the inlining is much more aggressive within a module than it was back then, which is a nice improvement. If you have a monad that isn’t fully specified in the module where it’s used, you’re still going to have a bad time; you’d need to flag every definition using it for cross-module inlining, and that doesn’t happen by default since it, you know, completely defeats separate compilation.

                        This isn’t really an accident; it falls out directly from what it means to have a monad, and what it means that it’s free. I’ll leave out a comparison of the type signatures for monadic vs. applicative bind, and of algebraic effects, but it’s highly instructive to study those. Monads are a powerful abstraction; unfortunately, powerful abstractions are hard to compile.

                        This is certainly a case of several things being to blame; it’s clear to me now that monads are the wrong abstraction for using an abstract interface across module boundaries, and this is a lot of why I encourage people to work on making algebraic effects more useful.

                        But without laziness, what I was doing wouldn’t have compiled at all. I would have preferred that.

                        1. 9

                          I’ve used Haskell for 6+ years at this point. I’m always personally bothered when I see criticisms of Haskell that I’m unable to understand. I feel like my uses of Haskell have stuck to the more “normal” side of the language. I rarely use heavily research-y features or many language extensions. This being the case, I’ve also never run into any serious performance issues.

                          But, I’m not sure if what was described above could be considered “normal” or not. Could someone please break the critique down a bit more for me? I can’t use what’s there to change my style or make better decisions.

                          1. 11

                            Okay, so. :) I can give an example of free monads from my old code if anyone really wants one, but it’s probably unnecessary: here’s a decent explanation. At 6+ years, you have probably used them without knowing the name. http://www.haskellforall.com/2012/06/you-could-have-invented-free-monads.html

                            As you can see, at least some people encourage the pattern, and I do consider it tempting. Notice that this particular blog post makes no mention of performance topics or of organizing code into modules. Similarly, I didn’t think about that as relevant, and it bit me: The issue I hit arises when you leave a monad free in one file, specifying only a typeclass it implements, and leave the decision of which monad is used up to a caller in another file.

                            I would characterize what I was doing as a naive attempt to do something that “should” work, but that isn’t a traditional pattern. At the time, I had enough experience to be dangerous, you know? Also, I didn’t even think of them as a novel technique I was trying out; at the point I was at, this seemed like a natural design. I’ll be curious whether others agree.

                            I’m using free monads primarily as an example of a performance issue that would have been easier to diagnose in a strict language. What happened is that I forced the compiler into a pathological case which it nonetheless supported, then I couldn’t figure out why.

                            The pathology was a failure in inlining, not directly related to laziness. It resulted in numerous calls via a typeclass dictionary that should, instead, have been expanded to the specific implementation. Specifically, these were the calls to >>=. Since they hadn’t been eliminated by inlining, like any other call they wound up being lazily-evaluated. I’m not even sure adding ! would have been valid syntax, but it would have had to be added to every statement of every do-block.

                            I actually have to retract my earlier statement that this wouldn’t have compiled without laziness; it’s clear to me now that it would. What laziness did here was obfuscate the failure. This is surprisingly similar to the classic foldl issue, where each node in a large data structure winds up with a thunk, all of which have to be constructed before any can be evaluated. The difference is that here, the nodes are the statements of my program. It’s running as if it were an interpreter chewing an AST, which is the point of a free monad, but it’s supposed to perform decently. :)
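
                            For anyone who wants to see the “interpreter chewing an AST” shape without following the link, here is a minimal hand-rolled sketch. It is not my old code; the Cmd instruction set and the names say, program, and run are invented for illustration.

                            ```haskell
                            {-# LANGUAGE DeriveFunctor #-}

                            -- A free monad: a program is either a finished value or one
                            -- instruction whose continuation holds the rest of the program.
                            data Free f a = Pure a | Free (f (Free f a))

                            instance Functor f => Functor (Free f) where
                              fmap g (Pure a)  = Pure (g a)
                              fmap g (Free fa) = Free (fmap (fmap g) fa)

                            instance Functor f => Applicative (Free f) where
                              pure = Pure
                              Pure g  <*> x = fmap g x
                              Free fg <*> x = Free (fmap (<*> x) fg)

                            instance Functor f => Monad (Free f) where
                              Pure a  >>= k = k a
                              -- Bind walks down the tree built so far to reach its leaves.
                              Free fa >>= k = Free (fmap (>>= k) fa)

                            -- Hypothetical instruction set with a single command.
                            data Cmd next = Say String next
                              deriving Functor

                            say :: String -> Free Cmd ()
                            say s = Free (Say s (Pure ()))

                            -- The do-block only builds a data structure; nothing runs yet.
                            program :: Free Cmd ()
                            program = do
                              say "hello"
                              say "world"

                            -- A pure interpreter: collect the messages instead of printing.
                            run :: Free Cmd a -> [String]
                            run (Pure _)            = []
                            run (Free (Say s next)) = s : run next
                            ```

                            Every statement of the do-block becomes a node that run later has to walk, which is the sense in which the statements of my program were the nodes.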

                            Now some details on Haskell performance debugging. Mostly, my complaint is that I can’t think how I could have known. In a strict language, a time profile would have noticed a large constant factor of slowdown, and attributed it to each individual statement being slower than it should have been. In theory, I could have looked at the profiler’s call hierarchy, and noticed that the expensive statements were always within functions using a monad they didn’t completely specify. I wouldn’t call it easy to debug (though the expense would have been a lot smaller, and I might not even have bothered), but the data I needed would have been available and organized in a useful way.

                            In Haskell, the call stack is fairly meaningless since lazy thunks are “called” by the function that wants their final output. In profiling mode only, there’s a separate cost-center stack which does in fact attribute allocations to something sort-of akin to what the call hierarchy would look like in a strict language. I suppose that in theory, that data could have prompted the same insight as I suggested above. The tools don’t output any sort of cost-center hierarchy, just a flat list, so I would have had to sort it quite creatively. (There is a hierarchy tracked at runtime, but it’s only unwound for exception reporting.)

                            One standard approach I tried to separate efficient callers from inefficient callees was ignoring “total time” and looking only at “self time”. This didn’t help, because the self-time was all spent in >>=, which is a bizarre place to see an expense, and I couldn’t distinguish which parent callers were producing it to see what they had in common.

                            Also, this expense got worse with code length (I want to say it was exponential, can anybody sanity-check that?), so my attempts to write reduced programs showed less of a problem the more I reduced them, no matter what I took out.

                            In closing, I’d like to clarify that I think the ability to have free monads in Haskell programs is a good thing; there are times when they’re appropriate, especially if they don’t cross module boundaries. I certainly wouldn’t want a type system that went out of its way to preclude them. I do blame laziness for the architectural decisions that make it hard to track issues down - on the rare occasions they occur.

                            1. 4

                              Were you using Free/FreeT or MonadFree?

                              “I would characterize what I was doing as a naive attempt to do something that “should” work, but that isn’t a traditional pattern. At the time, I had enough experience to be dangerous, you know? Also, I didn’t even think of them as a novel technique I was trying out; at the point I was at, this seemed like a natural design. I’ll be curious whether others agree.”

                              “seemed like a natural design”

                              Not if you’re worried about performance, no.

                              “The pathology was a failure in inlining, not directly related to laziness.”

                              This can happen just as easily in a strict language with ML modules. In fact, most ML compilers don’t inline anything so if you abstract any of your code you’re taking a performance hit all of the time rather than some of the time. You can enforce inlining with Haskell typeclasses. (lens takes advantage of inlining extensively for performance)
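
                              A rough sketch of what enforcing inlining can look like, using a made-up overloaded function called countdown. INLINABLE asks GHC to keep the definition available to other modules, and SPECIALIZE requests a copy at a known type. Whether either one fires is up to the optimizer, so treat this as an illustration rather than a guarantee.

                              ```haskell
                              -- Hypothetical helper: runs 'step' for n, n-1, ..., 1 in any monad.
                              -- Left alone, each (>>) here may be a call through the Monad
                              -- dictionary that the caller passes in.
                              countdown :: Monad m => (Int -> m ()) -> Int -> m ()
                              countdown step n
                                | n <= 0    = pure ()
                                | otherwise = step n >> countdown step (n - 1)
                              -- Expose the definition so callers in other modules can
                              -- specialise it to their concrete monad.
                              {-# INLINABLE countdown #-}
                              -- Request a dictionary-free copy for a monad we expect to use.
                              {-# SPECIALIZE countdown :: (Int -> IO ()) -> Int -> IO () #-}
                              ```

                              The pragmas do not change what the function computes; they only affect how calls to it are compiled.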

                              “It’s running as if it were an interpreter chewing an AST, which is the point of a free monad, but it’s supposed to perform decently. :)”

                              I don’t think Free is representative of something you’d expect to be a fast solution for the problem of staging out representations/transformations from their interpretations. If someone or some resource made representations along those lines to you, please point them out so they can be corrected.

                              With this in mind: https://ro-che.info/articles/2014-06-14-extensible-effects-failed

                              The recommendation I make when someone needs to abstract out their effects in a manner like what you seemed to have wanted is that they do an mtl-style typeclass and make “real” and “test” instances for the effects they want.
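
                              As a minimal sketch of that recommendation: the MonadLog class, the TestLog monad, and greet below are all invented for illustration, and the writer is hand-written so the example does not depend on mtl itself.

                              ```haskell
                              -- Hypothetical effect: the only thing this code may do is emit
                              -- log lines.
                              class Monad m => MonadLog m where
                                logMsg :: String -> m ()

                              -- "Real" instance: write to stdout.
                              instance MonadLog IO where
                                logMsg = putStrLn

                              -- "Test" instance: a small writer monad that collects the lines.
                              newtype TestLog a = TestLog { runTestLog :: ([String], a) }

                              instance Functor TestLog where
                                fmap f (TestLog (w, a)) = TestLog (w, f a)

                              instance Applicative TestLog where
                                pure a = TestLog ([], a)
                                TestLog (w1, f) <*> TestLog (w2, a) = TestLog (w1 ++ w2, f a)

                              instance Monad TestLog where
                                TestLog (w1, a) >>= k =
                                  let TestLog (w2, b) = k a in TestLog (w1 ++ w2, b)

                              instance MonadLog TestLog where
                                logMsg s = TestLog ([s], ())

                              -- Business logic is written once, against the class only.
                              greet :: MonadLog m => String -> m Int
                              greet name = do
                                logMsg ("hello, " ++ name)
                                logMsg "done"
                                pure (length name)
                              ```

                              Running greet in IO prints the lines, while runTestLog (greet "bob") returns them alongside the result so a test can inspect them.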

                              In general, anything Kmett does, you want to look at if you’re writing performance sensitive Haskell. You’ll notice the mtl style in projects of his like quine.

                              Further initial encodings intended to provide “algebraic effects” such as Eff are going to be, if anything, worse than Free for performance. mtl-style does what people generally have wanted from something like Eff, but is much faster and more reliably so to boot.

                              I don’t really know exactly what you were doing or what went wrong, but these are my impressions based on what I’ve read so far and what I’ve seen happen most commonly with others in the past.

                              “it’s clear to me now that monads are the wrong abstraction for using an abstract interface across module boundaries”

                              I’m not sure how that follows and it seems ill-posed as written. Could you elaborate?

                              If you wanted a fully abstract interface enforcing module boundaries then I don’t know how you could expect cross-module inlining to occur. (cf. what I said about ML modules, except for MLton which is whole-program optimizing) What are you asking for here?

                              What follows is more general advice, mostly derived from observations while helping people. Not necessarily targeted at you, though I hope at least some of it may help.

                              Mostly people get fast Haskell code by not being fancy, avoiding creating more values than necessary, inlining where appropriate, avoiding boxing where appropriate, applying non-strictness and strictness where appropriate, etc. People like Don Stewart have written extensively about this.

                              Generally speaking, you do not use the same techniques to optimize Haskell code that you would in other languages. The cost center/stack stuff is getting better (GHC has DWARF now), but really it comes down to reading Core and seeing if it matches what you expected. GHC Core output is quite explicit (but still readable if you use the stripping flags) and it’ll generally tell you what’s going to happen. This wiki page has a section on inlining, and the section that follows it talks about reading Core.
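
                              One way to get at Core with the stripping flags is to set them per file, roughly as below. The file name Sum.hs and the function total are assumptions for the example. Compiled with optimization on (for instance ghc -O2 Sum.hs), this should write the simplified Core to a dump file next to the source instead of the terminal.

                              ```haskell
                              -- Assumed file name: Sum.hs.
                              -- -ddump-simpl prints the Core left after the simplifier has run,
                              -- -ddump-to-file sends it to a file, and the -dsuppress-* flags
                              -- strip most of the annotations that make Core hard to read.
                              {-# OPTIONS_GHC -ddump-simpl -ddump-to-file -dsuppress-all -dsuppress-uniques #-}
                              module Main (main) where

                              import Data.List (foldl')

                              -- Something small enough that its Core is easy to compare
                              -- against what you expected.
                              total :: Int -> Int
                              total n = foldl' (+) 0 [1 .. n]

                              main :: IO ()
                              main = print (total 100)
                              ```

                              The thing to look for in the dump is whether total ended up as a tight loop over unboxed integers or as a chain of calls.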

                              One of the more frequent bits of advice I give learners is to resist the urge to write code they don’t understand - that includes operational considerations. I find Haskell easier to reason about in non-trivial applications, but I don’t blow my complexity budget on ‘nice’ before I’ve dealt with ‘need’. Most of the smart people I’ve seen burn out with Haskell fell into this trap at one point or another.

                              When you’re levering up and learning/experimenting with new techniques, it’s really valuable to ask questions and get help when something doesn’t go the way you’d like. In my experience, the Haskell community has been very gracious with their time and helpful - especially if you single someone out and ask them specifically for help because you’ve seen them create, use, or write about something you’re having difficulty with. Usually people are pleased you noticed and eager to help by the time they finish reading the intro of your email. Did you reach out to anyone specific for help when you were having these issues?

                              1. 6

                                There’s a lot to reply to there, so here’s a short answer for the parts that are easy to write.

                                Glad to have your opinion that it wasn’t natural. :) Oh well.

                                I wasn’t using anyone else’s monads or transformers; these were my own. So that did divert a lot of my efforts because it seemed likely that their implementations, though trivial, were somehow the issue.

                                The interesting thing about algebraic effects here is that, although it may be worse for performance in some absolute sense, it doesn’t hit this issue because it’s specifying a specific monad, and the implementation of >>= doesn’t depend on which handlers are used. The actual side-effect invocations will be explicit calls, but bind should inline properly, as it does for most cases.

                                Yes, sorry, the “abstract interface across module boundaries” does need substantial explanation, both of what I mean and of why I think that. That’s a long answer; maybe later if I have time.

                                I’m very impressed to hear that GHC implements DWARF now, that should definitely help a lot.

                                I do encourage anyone who needs to understand Haskell performance to learn to read Core. At the time, I could barely parse it. I’d recommend that people learn sooner rather than later, so that they have a sense for what it should look like in a program that performs properly, as a basis for comparison when they have one that doesn’t. :)

                                As a woman, I can’t say I find the Haskell community particularly welcoming. That didn’t apply at the time though because everyone involved perceived me as a man, and they were definitely helpful with many different sorts of advice. I would have to dig through years-old chat logs to figure out who (I doubt I sent email about it; I still have no idea what list would be appropriate for a performance issue not specific to any package), but everyone went above and beyond.

                                1. 4

                                  “As a woman, I can’t say I find the Haskell community particularly welcoming.”

                                  This deserves a dedicated side discussion probably, but why? (I’ve heard more than one person make this complaint, although TBH I’ve never seen any behavior first-hand that wasn’t welcoming. As a white male, I don’t see a lot of the bad behavior in tech.) What should people like me be thinking about in order to make the Haskell community more welcoming?

                                  I actually think that “non-traditional” programmers (second-career, 35+, women) are (a) on the whole much better than the young “rock stars”, and (b) a great target audience for Haskell. High-reliability programming (e.g. the stuff that JPL does) is typically 30-50% female, and there’s a reason for that. If I were starting my own company, I’d focus on the “non-traditional” programmers who are overlooked by the imbecilic, anti-intellectual, ageist VC-funded startup culture of Scrum, open-plan offices, and shoddy engineering.

                                  As I turn into an old programmer myself (I’m 32) I’m starting to take an interest in the question of how to fight for the do-it-right engineering culture, preferably without moving 100% into management.

                                  1. 9

                                    I agree with your remarks, and have respect for you generally, so I guess I’ll answer. I haven’t talked about this publicly before. Part of what leaves a bad taste in my mouth is that a problem which anyone who wants to can cause with very little effort has taken me a year or so of sober analysis to explain. I don’t often have the energy to give these explanations; leaving is simpler, and it’s what I did, more or less.

                                    There’s no way to proceed without telling the story from my perspective. I see no reason to give names, and would prefer not to for my own safety. As far as I know, the person in question has forgotten I exist, and that’s my strong preference.

                                    There was an isolated incident on one of the IRC side-channels which I tried to let go at the time, but over the months that followed I suddenly noticed a lot of behavior like it (directed at other people, and not from any single person) which had previously gone over my head. Contemplating the nature of the response and what I could meaningfully ask for showed me that, while admins care, nothing is really going to change.

                                    I suspect the admin who fielded it would be surprised to hear I had a problem with the response, by the way. It wasn’t until a few months later that I understood it was inadequate, and I couldn’t think of anything that would help other than me leaving. By that point I felt less interested in the channels anyway.

                                    Obviously, though this happened on IRC, all the same people are on many of the mailing lists. I haven’t left those, and won’t.

                                    So there was a person who, though apparently we’d both been in the channels concurrently for many years, I don’t recall ever speaking to until a week before the event; I hadn’t spent a lot of time there since switching to an obviously-feminine username until around the time this happened. We only ever had two very brief conversations. I don’t actually know this person’s gender, although from the behavior, and from the fact that as far as I know they knew nothing about me other than my gender, they’re very likely a man.

                                    In both conversations, I had asked for debugging help from the channel at large, and the other party responded by challenging basic assumptions. I don’t remember the details; they made statements which in retrospect were taunts, and I fell for it by responding by giving contextual knowledge which demonstrated that I was familiar with the area in which they were questioning me. This is a standard dominance game which many engineers practice without ever thinking about it, and which these days strikes me as extraordinarily obnoxious; I regret every time I’ve ever played along with it. I don’t believe anyone else present perceived it as a dominance game, because it’s socially accepted behavior that people don’t really even notice unless they’re one of the parties to it.

                                    What differed this time that took it from mildly unpleasant but soon-forgotten social ritual to something serious is that it iterated much further, challenging ever more basic knowledge. It went through enough iterations that it became clear there was no level of knowledge which would satisfy this person that I had a right to ask for help from the channel (not even from them specifically), nor any conciliatory utterance which would stop the ritual. I’m sure if an admin hadn’t asked in-channel for us both to stop, I would have eventually been asked to pass a Turing test.

                                    I left the channel for a few hours and put it behind me. Later that evening, I was surprised to hear from the admin again, who said he’d just finished listening to the other person’s side for a few hours, and he could see the situation was complicated and was not looking forward to trying to adjudicate it but felt obligated to spend as much time listening to me. I said I was astonished by that, had no idea who the other person was other than the past week (it’s a large channel), and had nothing to add.

                                    The resolution was that, via the admin, we agreed to put each other on /ignore, which I did with great reluctance - in twenty years of IRC, that’s the only time I ever have, because it’s a completely ineffective problem-solving technique. I expected that would be the end of it, but of course it wasn’t; a couple weeks later, I logged in from another machine which didn’t have the /ignore, and inadvertently responded to something that person had said as part of a conversation with several other people. They immediately accused me of violating… I have no idea what. There was extreme hostility. Writing this, it occurs to me that if they’d done their part, they wouldn’t have seen my response anyway!

                                    The thing here is, I’ve had straight-out stalkers. I recognized the same psychology here. My best strategy is to hope that they find someone else to fixate on; any disciplinary action is going to be taken as a challenge, as is any request for distance.

                                    In fact, people who behave this way very frequently regard any interaction at all, however negative, as a victory for them, because they consider it to be a form of flirting. I’ve had people who I’ve only ever spoken to on opposite sides of a public, civil, but vociferous argument about whether women are people, contact me months later referring to the friendship they believe we have.

                                    So by the time this happened, I knew that what I wanted was to avoid drawing this person’s attention; I suspected if I brought it up again with an admin, I’d be asked to talk through it, which would probably gain me several more years of this (don’t laugh at “years” - it’s happened to me; I was out in other places long before I was out in technical venues), as well as a strong risk of being followed to other places online, which, as had already been mentioned, would have been viewed as irrelevant to the channel. The admin had made it clear in the earlier conversation that they considered this unprecedented behavior on the other person’s part, and that since they’d been there as many years as I had, they would prefer a solution where nobody had to leave.

                                    And, of course, any action would at best only help me, and only with this one person, but I witnessed other people receive similar behavior on a daily basis. I’d just never thought about it before.

                                    So I left.

                                    1. 6

                                      Thanks for writing this. It’s thought-provoking. If you ever write more on the subject I’d be interested to read it.

                                      1. 4

                                        Good to have the feedback, thank you :)

                              2. 2

                                Thank you very much for the explanation. This helped me properly contextualize things. I really appreciate it. :)

                                1. 1

                                  I’m glad it helped. :)

                            2. 2

                              I don’t intend to argue or invalidate what you’re saying, because you obviously understand Haskell and were burned by a deep implementation bug, but what I’m not convinced of is that there aren’t issues like this lurking deep in other languages, that hit you if you use advanced features and trigger an implementation bug. Scala and Java have worse than this, despite having larger communities.

                              Whether Haskell should be lazy or strict by default is another issue that I’m not prepared to argue, because plenty of people know this territory better than I do.

                              1. 3

                                Similarly, I don’t disagree with any of that. :)

                                1. 2

                                  I don’t think Scala does have worse than this. I’ve hit bugs in the compiler, bugs in the bytecode generator, bugs in the virtual machine, and surprising performance issues. But I’ve always had the sense that the community takes them seriously, that everyone knows that what happens at runtime matters and that tools for providing more insight there are valuable.

                                  I don’t feel so confident that those people exist for Haskell, and I see more dismissal of these things (not that there aren’t people who dismiss them in Scala-land). Could be the difference is no deeper than who speaks loudest in each community.

                                  1. 2

                                    Ed Kmett said it far better than I can on Reddit.

                                    To me, it seems that some of these are intrinsic to a community that (and not for bad reasons) has to care more about JVM compatibility than about advanced monadic programming. It’s not that Scala is bad (it isn’t) or that its community doesn’t care (it does). The objective function just seems to be different. Haskell is a community that is obsessed with quality but very, very small as a result of that.

                                    1. 6

                                      I’ve had many of those problems, in production codebases. They are (well, most of them) real problems. But I don’t think they’re as bad a “trap” as Irene’s issue. They’re all kind of “O(n)”: when type inference doesn’t work it’s tedious to write the type explicitly, but it’s not hard, and it’s self-contained. Free monads are kind of crap in Scala but it’s all up front. Fusion isn’t going to mysteriously start failing when you make some subtle, seemingly unrelated change because there isn’t any fusion (the one exception I can think of is TCO, which I do think should only be applied to explicitly @tailrec methods).
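
                                      A minimal sketch of the @tailrec point, where the object and method names are made up for illustration. The annotation makes the compiler reject the method if it cannot turn the recursion into a loop, so the optimization cannot silently stop applying after a later edit:

                                      ```scala
                                      import scala.annotation.tailrec

                                      // Hypothetical example object; nothing here comes from the thread.
                                      object TailrecSketch {
                                        // The recursive call is the last thing the method does, so the
                                        // compiler accepts the annotation and compiles this to a loop.
                                        // Changing the body to `n + sumTo(n - 1)` would be a compile
                                        // error, not a stack overflow discovered at runtime.
                                        @tailrec
                                        def sumTo(n: Long, acc: Long = 0L): Long =
                                          if (n <= 0) acc else sumTo(n - 1, acc + n)

                                        def main(args: Array[String]): Unit =
                                          println(sumTo(1000000L)) // prints 500000500000
                                      }
                                      ```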

                                      So while it’s conceivable on any given project that I’ll hit a point where Scala becomes not worth it, at least I’ll see it coming. I’m confident that I’ll never be surprised by dramatic performance changes. I’m confident that any problem that does occur will be local, and that I’ll be able to debug it top-down: the runtime of f <*> g is the runtime of f plus that of g (which, sure, becomes less true as we start passing functions around, but at least those cases are explicit).

                                      Rightly or wrongly, I don’t have that confidence in Haskell. Once a project reaches a certain size I’m terrified that I’m going to add some seemingly-innocuous feature that will suddenly hit a performance wall, and I won’t be able to understand what the problem is or how to fix it. And I get the sense that a vocal faction (at least) thinks runtime tools are irrelevant, that any effects you care about should be in the type system, the whole “failure is not an option so we won’t prepare for it” ethos. But the difference between an integer and a long chain of thunks that will ultimately evaluate to an integer can be an important effect - and it’s not captured in the type system.

                                      1. 2

                                        That’s indeed a fascinating post. None of the things it mentions are that same kind of impossible debugging situation, so it doesn’t contradict lmm’s comment, but it scares me away from Scala. But it’s also astonishing what a very different category of complaint it is from my issues with Haskell.

                                        (I do have some relatively abstract points about Haskell’s exotic extensions, and the habit of making exotic extensions in the first place, but I don’t think anything like that should scare anyone away from a language, just tempt them to make a better one.)

                                        Since you can pass any dictionary anywhere to any implicit you can’t rely on the canonicity of anything. If you make a Map or Set using an ordering, you can’t be sure you’ll get the same ordering back when you come to do a lookup later. This means you can’t safely do hedge unions/merges in their containers. It also means that much of scalaz is lying to itself and hoping you’ll pass back the same dictionary every time.

                                        Ouch. I’m a theoretical fan of implicit dictionaries instead of type classes, after a lot of thought about the latter and how they need to be overridden sufficiently often that the cross-module scope thing they do in Haskell is a very bad mechanism, but this is a pretty serious design issue.
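
                                        To make the quoted problem concrete, here is a rough sketch rather than anything from scalaz or the standard collections: `insert` and `contains` are hypothetical helpers that treat a sorted Vector as a set. Nothing ties the Ordering used to build the structure to the one used to query it:

                                        ```scala
                                        // Hypothetical example object; the helpers are illustrative only.
                                        object CoherenceSketch {
                                          // Insert x so that xs stays sorted according to `ord`.
                                          def insert[A](xs: Vector[A], x: A)(implicit ord: Ordering[A]): Vector[A] = {
                                            val (lo, hi) = xs.span(ord.lt(_, x))
                                            if (hi.headOption.exists(ord.equiv(_, x))) xs else (lo :+ x) ++ hi
                                          }

                                          // Binary search that trusts xs to be sorted according to `ord`.
                                          def contains[A](xs: Vector[A], x: A)(implicit ord: Ordering[A]): Boolean = {
                                            @annotation.tailrec
                                            def go(lo: Int, hi: Int): Boolean =
                                              if (lo >= hi) false
                                              else {
                                                val mid = lo + (hi - lo) / 2
                                                val c = ord.compare(x, xs(mid))
                                                if (c == 0) true
                                                else if (c < 0) go(lo, mid)
                                                else go(mid + 1, hi)
                                              }
                                            go(0, xs.length)
                                          }

                                          val descending: Ordering[Int] = Ordering.Int.reverse

                                          // Built with the descending instance: Vector(5, 4, 3, 2, 1).
                                          val built: Vector[Int] =
                                            List(1, 2, 3, 4, 5).foldLeft(Vector.empty[Int])((acc, x) => insert(acc, x)(descending))

                                          def main(args: Array[String]): Unit = {
                                            println(contains(built, 1)(descending))   // true: same instance as at build time
                                            println(contains(built, 1)(Ordering.Int)) // false: 1 is present but not found
                                          }
                                        }
                                        ```

                                        The second lookup compiles without complaint and quietly gives the wrong answer, which is the “hoping you’ll pass back the same dictionary” situation.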

                                        1. 1

                                          People have raised quite a few issues with his claim here, and I would argue that it misrepresents the situation quite fundamentally:

                                          • Regarding coherence of typeclass instances: Haskell promises a lot, but GHC fails to deliver almost completely. Non-contrived and simple programs which violate these uniqueness guarantees compile without any warning in GHC, and are able to break all assumptions associated with typeclass coherence, including corrupting data structures at runtime.

                                          • Scala doesn’t promise coherence of typeclass instances in the first place, and therefore doesn’t fail to deliver the guarantees–unlike GHC. I claim this is a fundamentally better position to be in.

                                          • Nevertheless, you can either check whether you received the same instance or use a different approach with path-dependent types, and then do hedge unions/merges. Depending on the definition of “sameness” you have to balance desirable guarantees with associated restrictions. This is not unique to Scala, but a standard instance of the decade-old generative/applicative/… functor problem in ML.

                                          • If you look at the evolution of languages at a larger scale, it seems that newer languages with typeclasses are all adopting approaches similar to Scala, and none of them try to replicate what Haskell tried.

                                          • The issues with Haskell’s typeclasses are–despite claims to the contrary–neither a simple bug in GHC (where just nobody found some time to fix it), nor “fixed” by disallowing orphans completely. It’s a really fundamental issue for which nobody has found a practical solution yet.

                                          I’m not sure whether Kmett’s claims are doing the Haskell community a favor, given that most people want to compile and run Haskell programs on a computer, not execute their code in their heads based on some idealized reading of the spec.

                              1. 2

                                I like single-header unit test frameworks for C. If you’re interested in something with a few more features, I absolutely love Greatest: https://github.com/silentbicycle/greatest.