1. 5

    Another great/thorough writeup. And kudos to the dgraph team (and all of the other previous jepsen customers), for putting their money where their mouth is to have their work publicly torture tested like this.

    @aphyr, 3 questions if you don’t mind the tangent:

    • Any chance on a FoundationDB analysis?
    • Are there a lot of jepsen users out in the wild? I imagine some bigger places have engineers that have a need/capability of using it within their own organization, but I don’t know what kind of feedback (if any) they give you.
    • Any thoughts on integrating jepsen with more formal methods? Any engineering pros/cons would be insightful.


    1. 7

      Thank you! To answer your questions…

      1. There’s no FDB analysis planned; they haven’t approached me, and since I just moved and took a couple months off, I really need to focus on taking paying gigs and rebuilding funds. My next client is all lined up though, and I should have more results to show in winter. :)

      2. It’s hard to say! I get PRs from maybe 5 active orgs, and I know of… maybe a dozen orgs who use it independently? I’ve also trained… maybe 150 people in writing Jepsen tests, but I don’t necessarily know whether those folks went on to use Jepsen at their orgs internally, adapted the techniques to their own test suites, or moved on to other things. I think the techniques are more important than the tool itself, so even if folks aren’t using Jepsen itself, I’m happy that they’re doing more testing, fault injection, and generative testing!

      3. I have sort of a “part of this balanced breakfast” take on Jepsen–it exists on a spectrum of correctness methods: normal proofs, machine-checked proofs, model checking, simulation testing, the usual unit & integration tests, Jepsen-style tests, internal self-checks, production telemetry, fault injection/chaos engineering, and user reports. In the early design phase, you want provable algorithms, but complexity might force you to give up on machine-checked proofs and move to model checking; model-checking covers weird parts of the state space but isn’t exhaustive, so it’ll miss some things. The map is never the territory, so we need simulations and tests for individual code components and the system as a whole, to verify that each piece and the abstraction boundaries between them hold up correctly. As you move to bigger tests, you cover more system interactions, but the state space generally explodes: larger tests explore less of the state space. Jepsen’s at the far end of that testing continuum, looking at all the interactions of a real production system, but only over short, experimentally accessible trajectories–a simulation test, like FDB does, is going to cover a lot more ground in the core algorithm, but may not catch bugs at the simulation layer itself or in untested components, e.g. a weird interaction between the filesystem and database which wouldn’t arise in an in-memory test. And Jepsen is specifically constrained to simple, testable workloads; it’s never gonna hit the data or request volumes, or query diversity, that real users will push at the system–that’s why we need user reports, telemetry from production, self-checks, etc.

      There’s a lot of “formal methods” in Jepsen; every test encodes, more or less explicitly, an abstract model of the system being evaluated. We take a range of approaches for performance and coverage reasons, so some actually involve walking graphs of state spaces, and others are just checking for hand-proved invariants. Developing new and faster checkers is a great place to apply your formal methods knowledge, if you’re looking to contribute!
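
      A minimal sketch of the "checking for hand-proved invariants" style of checker mentioned above, written in Python for illustration only. Jepsen itself is written in Clojure, and the `Op` record and `check_set` function here are hypothetical names, not part of Jepsen's API. The assumed invariant is the one for a simple set workload: every acknowledged add must appear in the final read, and the final read must not contain anything that was never attempted.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Op:
    """One completed operation in a recorded history (hypothetical shape)."""
    type: str   # "ok" (acknowledged), "fail" (definitely failed), "info" (unknown)
    f: str      # "add" or "read"
    value: Any  # the element added, or the list of elements read


def check_set(history: List[Op]) -> Dict[str, Any]:
    """Check the set invariant against the last successful read."""
    acked = {op.value for op in history if op.f == "add" and op.type == "ok"}
    # Indeterminate ("info") adds may or may not have taken effect, so they
    # are allowed to appear in the final read but are not required to.
    attempted = {op.value for op in history if op.f == "add"}
    reads = [op for op in history if op.f == "read" and op.type == "ok"]
    if not reads:
        # Without a successful read there is nothing to check against.
        return {"valid": None, "lost": set(), "unexpected": set()}
    final = set(reads[-1].value)
    lost = acked - final            # acknowledged but missing
    unexpected = final - attempted  # present but never written
    return {"valid": not lost and not unexpected,
            "lost": lost,
            "unexpected": unexpected}


history = [
    Op("ok", "add", 1),
    Op("ok", "add", 2),
    Op("info", "add", 3),    # timed out: outcome unknown
    Op("ok", "read", [1, 3]),
]
result = check_set(history)
print(result)
```

      A checker like this runs in time linear in the history, which is the performance argument for invariant checks; a checker that walks the state space of a model can detect more subtle anomalies at a much higher cost.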

      1. 1

        Re FoundationDB: I wanted to see that, too. He had a good reason for not doing it, described here. In that thread, he said he hasn’t tested it because “their testing appears to be waaaay more rigorous than mine.” Still might be good for independent replication, though. Plenty of scientific papers look like they have a lot of rigor until you find that they missed this, used that incorrect algorithm, or just made stuff up for fame or fortune.

        I say that as someone who was wowed by FoundationDB. Hopefully, Jepsen just confirms it was as good as it appeared. If not, people get to fix any problems he finds. It’s an all-win scenario unless he finds a problem they can’t fix somehow.

        1. 3

          That was based on a phone conversation I had with one of the FDB team members–they were doing a bunch of tests, like hardware faults and simulation tests, that weren’t really feasible for Jepsen because a.) I didn’t have custom hardware, and b.) simulation testing has to be built into the database code itself, and Jepsen takes a black-box approach. FDB also spun up their own Jepsen test, but I can’t tell you how deeply they explored there.

          Then FDB got eaten by Apple, and fell off my radar–but I’m happy it’s re-emerging now! We don’t have any plans to work together right now, and I’ve got my hands full with other clients, but I’d be happy to work on FDB tests in the future. :-)

      1. 7

        Massive kudos to this guy for not putting up with this SJW madness. I wish him all the best!

        We at suckless are heavily opposed to codes of conduct and discriminatory organizations of any shape or form.

        1. 11

          Suckless takes a similarly principled stand against runtime config files.

          1. 8

            How does suckless oppose discrimination?

            1. 13

              It’s very simple. Any non-technological matters during software development move the software away from its ideal form. Thus, to make your software suck less, you only take the best developers no matter what race, gender, heritage, etc. these persons have.

              We do not believe in equal status (i.e. e.g. forcibly obtaining a 50/50 gender ratio), as this immediately leads to discrimination. We do however strongly believe in equal rights, naturally. You also naturally cannot have both.

              1. 94

                Any non-technological matters during software development move the software away from its ideal form.

                Suckless makes a window manager: a part of a computer that human beings, with all their rich and varying abilities and perspectives, interact with constantly. Your choices of defaults and customization options have direct impact on those humans.

                For example, color schemes determine whether color-blind people are able to quickly scan active vs inactive options and understand information hierarchy. Font sizes and contrast ratios can make the interface readable, difficult, or completely unusable for visually impaired people. The sizes of click targets, double-click timeouts, and drag thresholds impact usability for those with motor difficulties. Default choices of interface, configuration, and documentation language embed the project in a particular English-speaking context, and the extent to which your team supports internationalization can limit, or expand, your user base.
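
                To make the contrast-ratio point concrete, here is a minimal Python sketch of the contrast formula defined in WCAG 2.x (relative luminance of sRGB colors, then the ratio of the lighter to the darker value, each offset by 0.05). The function names are illustrative assumptions and do not come from any suckless project or existing library; 4.5:1 is the WCAG AA threshold for normal-size text.

```python
from typing import Tuple

RGB = Tuple[int, int, int]  # 8-bit sRGB channels, 0..255


def _linearize(channel_8bit: int) -> float:
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: RGB) -> float:
    """Relative luminance: 0.0 for black, 1.0 for white."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: RGB, bg: RGB) -> float:
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0."""
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa(fg: RGB, bg: RGB) -> bool:
    """True if the pair meets the WCAG AA threshold for normal text."""
    return contrast_ratio(fg, bg) >= 4.5


black, white = (0, 0, 0), (255, 255, 255)
print(round(contrast_ratio(black, white), 2))
print(meets_aa(black, white))
```

                A default color scheme could be run through a check like this for each foreground/background pair it defines, which turns "readable for visually impaired people" into something a project can test.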

                With limited time and resources, you will have to make tradeoffs in your code, documentation, and community about which people your software is supportive and hostile towards. These are inherently political decisions which cannot be avoided. This is not to say that your particular choices are wrong. It’s just that you are already engaged in “non-technical”, political work, because you, like everyone else here, are making a tool for human beings. The choice to minimize the thought you put into those decisions does not erase the decisions themselves.

                At the community development level, your intentional and forced choices around language, schedule, pronouns, and even technical terminology can make contributors from varying backgrounds feel welcome or unwelcome, or render the community inaccessible entirely. These too are political choices. Your post above is one of them.

                There is, unfortunately, no such thing as a truly neutral stance on inclusion. Consider: you wish to take only the best developers, and yet your post has already discouraged good engineers from working on your project. Doubtless it has encouraged other engineers (who may be quite skilled!) with a similar political view to your own; those who believe, for instance, that current minority representation in tech is justified, representing the best engineers available, and that efforts to change those ratios are inherently discriminatory and unjust.

                Policies have impact. Consider yours.

                1. 7

                  I don’t know if that was your goal, but this is one of the best arguments for positive discrimination I’ve read. Thanks for posting it, and also thanks for noting that all decisions have some inherent politics whether we like it or not.

                  Unfortunately there is simply no solution: positive discrimination is opposed to meritocracy. Forced ratios are definitely an unethical tool, as they are a form of discrimination. However, this unethical tool brings us to a greater good, which is a final product that incorporates diversity in its design and accommodates more users, which is a desirable goal in itself, for the reasons you explained.

                  1. 4

                    color schemes determine whether color-blind people are able to quickly scan active vs inactive options and understand information hierarchy. Font sizes and contrast ratios can make the interface readable, difficult, or completely unusable for visually impaired people. The sizes of click targets, double-click timeouts, and drag thresholds

                    Let me see if I understand what you’re saying. Are you claiming that when color schemes, font sizes and drag thresholds are chosen that that is a political decision? I think that many people would find that quite a remarkable claim.

                    1. 3

                      It’s impossible to not be political. You can be “the status quo is great and I don’t want to discuss it”, but that’s political. The open source “movement” started off political - with a strong point of view on how software economics should be changed. In particular, if you say a CoC that bans people from being abusive is unacceptable, you are making a political statement and a moral statement.

                      1. 3

                        It’s impossible to not be political

                        Could I ask you to clarify in what sense you are using the word “political”?

                        Merriam-Webster (for example) suggests several different meanings that capture ranges of activity of quite different sizes. For example, I’m sure it’s possible to act in a way which does not impinge upon “the art or science of government” but perhaps every (public) action impinges upon “the total complex of relations between people living in society”.

                        In what sense did you use that term?

                        1. 4

                          Let’s start off with a note about honesty. FRIGN begins by telling us “We do not believe in equal status (i.e. e.g. forcibly obtaining a 50/50 gender ratio)” as if someone were proposing the use of force to produce a 50/50 gender ratio, and we all know that wasn’t proposed by anyone. There’s no way to discuss this properly if people are going to raise false issues like that. What comments like FRIGN’s indicate is an unwillingness to have an open and honest conversation. The same bogus rhetoric is at the heart of Damore’s memo: he claims to be in favor of equal rights and just against a mythical demand for 50/50 gender equality so that he can oppose obviously ineffective affirmative action programs at Google where 80% of technical staff are male (Damore’s misappropriation of science is similarly based on an objection to a position that nobody ever argued).

                          The next point is that some people are objecting that a CoC and a minority outreach program are “political”. That’s true, but it involves the use of the more general meaning of “political” which the Collins dictionary provides as “the complex or aggregate of relationships of people in society, esp those relationships involving authority or power”. If we are using that definition, of course a CoC and a minority outreach program are political, but opposition to a CoC and a minority outreach program fits the definition as well. If you have an opinion one way or another, your opinion is political. You can’t sensibly use this wide definition of political to label the effort to adopt a CoC and to recruit more minorities and then turn around and claim your opposition to those is somehow not political. So that’s what I mean by “it is impossible to not be political”. The question is a political question and those who try to claim the high ground of being objective, disinterested, non-political for their side of the question are not being straightforward (perhaps it’s just that they are not being straightforward with themselves).

                          1. 3

                            I agree that a CoC, a minority outreach program, and opposition to a CoC all impinge upon “the complex or aggregate of relationships of people in society, esp those relationships involving authority or power”.

                            Would you also agree that there is a popular ideological political movement in favour of CoCs (some combination of the feminist, civil rights and social justice movements)? Perhaps there is also a popular ideological movement against CoCs (some combination of MRAs and the alt right). Are you also claiming that if one claims a “neutral” stance on CoCs one is de facto supporting one of these ideologies?

                            1. 3

                              I’m not sure it is possible to have a neutral stance. In fact, I doubt it.

                              1. 1

                                Interesting! Do you also doubt it is possible to take any action that is neutral with regard to a political ideology?

                                1. 3

                                  You are introducing something different. I don’t think you have to line up with one “side” or another, but you can’t avoid being a participant.

                                  1. 1

                                    You said “It’s impossible to not be political” so I’m trying to understand what you mean by that. So far I’m not clear whether you think every action is political. I’d appreciate it if you’d clarify your position.

                                    1. 2

                                      I’m making a very concrete assertion, which I sense does not fit into your schema. My assertion is that there is no neutrality on workplace equality and inclusion for anyone involved in the workplace. Anyone who, for example, participates in an open source development effort has a position on whether efforts should be made to make it more inclusive even if that position is “this is not important enough for me to express an opinion.”

                                      1. 1

                                        Thank you for clarifying. When you originally said “It’s impossible to not be political” I got the wrong impression.

                                        Do you also hold the same point of view when it comes to roughly comparable statements in other spheres? For example ‘Anyone who eats has a position on vegetarianism even if that position is “this is not important enough for me to express an opinion.”’?

                    2. 1

                      You’ve been quoted by LWN: https://lwn.net/Articles/753709/

                    3. 11

                      AKA shut up and hack? :)

                      1. 1

                        The suckless development process has no non-technical discussions?

                        How are the best developers identified?

                        1. 8

                          just curious, why would you need to identify the best developers? Wouldn’t the quality of their code speak for that?

                          1. 5

                            I also fail to see what the reasoning is. Just send your code, get the non technical discussions out.

                            1. -1

                              Apparently, quoting @FRIGN from above, “to make your software suck less.”

                            2. 8

                              How are the best developers identified?

                              I think this is a totally reasonable question, and one I’d like to see the answer to, if for no other reason than it might help those of us on other projects find more objective metrics to help track progress with.

                              Do you all at suckless use something like:

                              • defect rate
                              • lines of code/feature shipped
                              • execution time
                              • space in memory, space in storage

                              Like, what metrics do you use?

                              1. 7

                                You know, suckless is not a big company and the metrics that can be applied are more of a heuristic. A good developer is somebody who e.g. supplies a patch with a bug report, provides feedback to commits, makes contributions to the projects, thinks his commits through and doesn’t break stuff too often and does not personally identify with their code (i.e. is not butthurt when it’s not merged).

                                What needs to be stressed here is that the metric “lines of code” is completely off. There are horrible programmers who spit out lots of code and excellent ones who over time drop more lines than they add. Especially the latter group is very present among us and thus the LOC-metric will only give false results. Same with execution time, you find that when not enough time is spent on a problem you end up solving it wrong, in the worst case having to start all over.

                          2. 5

                            By being very diverse and doing fackelmärsche of course. https://suckless.org/conferences/2017/

                            1. 3

                              @FRIGN What’s the purpose of this “torchlight hike” in the context of producing code that sucks less? Don’t you see that the activities you choose to have during your conferences are a cultural stance, and because of that, can be perceived as exclusive by programmers that don’t recognize themselves in these activities?

                              1. 0

                                I get your point, but must honestly say that your argument sadly aligns with the ever-excluding and self-segregating destructful nature of cultural marxism. By eating food together at the conferences, do we exclude anorexics that might otherwise be willing to attend such a conference? I don’t drink any alcohol and never have. Still, it was not a problem when we went to a local Braukeller and some people drank alcohol and others like myself didn’t.

                                The fundamental point, I think, is that one can never fully and analytically claim that a certain process is completely unaffected by something else. If we dive down into these details we would then move on and say that the different choices of clothing, hairstyle, means of travel and means of accommodation all affect the coding process at suckless. This can be taken further and further with no limit, as we all know about the butterfly effect. At some point it is just not measurable any more.

                                If you ask me, this is a gross overstretching of what I said. There are quite a lot of people who do not attend the conferences but still work together with us on projects during that time. What really matters is that we e.g. do not ignore patches from these people or give them less relevance than those of others. To pick the example up: The torchlight hike did not affect any coding decision in a direct way, but it really bonded the team further together and was a very nice memory of this conference that I and the others are very fond of from what I’ve heard. On top of that, during the hike we were able to philosophize about some new projects of which some have become a reality. The net-gain of this event thus was positive.

                                In classical philosophy, there are two main trains of thought when it comes to evaluating actions: Deontology and Teleology. Deontology measures the action itself and its ethical value, completely ignoring the higher goal in the process. Teleology is the opposite, evaluating actions only as means to reach a goal, completely ignoring the value of the action itself. The best approach obviously should be in between. However, there is a much more important lesson that can be taken from here: When evaluating a decision, one needs to realize what one is measuring and what is unimportant for the decision. What I meant is that to reach the goal of software perfection, the gender and other attributes of the submitters do not matter. So even though we here at suckless have a goal, we are not teleologists, as we just ignore the factors that do not matter for coding.

                                It is an ethical question which norms you apply to a decision.

                                If we look at organizations like Outreachy, one might be mistaken to think that they are deontologists, striving to improve processes. However, after closer inspection it becomes clear that this is not the case and they are actually working towards a certain goal, increasing the number of trans and minority people in such communities. No matter how you think about this goal, it makes one thing clear: When you are working towards such a goal and also do not ignore irrelevant factors in your norms (and they in fact do by not ignoring e.g. race and gender), you quickly end up discriminating against people.

                                I hope this clears this up a bit, but as a short sentence, what can be taken from here is: When discussing ethical matters, it’s always important to make clear which norms are applied.

                                1. 2


                                  I’m not going to wade into anything else on this, but I’d like to just take a second and let you know that, while you may not mean it in this way, the phrase “cultural marxism” is very, very often used as a stand-in for “jews”. Some links for the record:


                                  https://newrepublic.com/article/144317/trumps-racism-myth-cultural-marxism https://www.smh.com.au/world/cultural-marxism--the-ultimate-postfactual-dog-whistle-20171102-gzd7lq.html

                                  1. 3

                                    It’s not my fault that some idiots don’t understand this term or its critical analysis. Cultural marxism, as the term implies, is the classical theory of marxism applied to culture. It has nothing to do with jews directly, it’s just an idea. If you know any better term to describe it, please let me know.

                                    Anyway, in the philosophical realms it’s known as ‘Critical Theory’, which originated in the Frankfurt School. However, nobody knows this term.

                                    Unless a better term is found, I disregard your argument and won’t accept your attempt to limit language of perfectly acceptable words to describe an idea. At the end of the day, terminology must be found that adequately describes what a certain idea is, and I see no reason why this should be wrong.

                                    Regarding the torch hike: Yes, marching with torches was abused by the NSDAP as a means of political rallying. However, at least in Germany, it is a much older and deeper-reaching tradition that dates back hundreds of years.

                                    1. 0

                                      You have amply demonstrated that you don’t know anything about the topic. You could start with the decent Wikipedia article. https://en.wikipedia.org/wiki/Frankfurt_School

                                    2. 2

                                      Wow, uh, kind of a weird red flag that pointing this out is getting seriously downvoted. I picked these links pretty quickly, so anybody who comes along later, reads this, and wonders how serious this is: do yourself a favor and do an image search, and see how many memes feature the Star of David, the greedy merchant, the world-strangling octopus, or any number of other openly anti-semitic images. It’s not hidden, it’s not coy. If you’re tossing “cultural marxism” around you’re either willfully ignoring this or blatantly playing along. It’s not a thing in the world. There are no leftists (at all) who call themselves “cultural marxists”, and in fact there is a sizeable faction of marxists who are openly disdainful of any marxism that eschews political struggle. The New Republic article linked above goes into this; Perry Anderson’s “Considerations on Western Marxism”, a well known, well regarded text across a number of marxist subsects, is explicitly based on this. Anyway, enjoy contributing to a climate of increasing hostility toward jews. Good stuff.

                                      edit: have some fun with this https://www.google.com/search?q=cultural+marxism&client=firefox-b&source=lnms&tbm=isch&sa=X&ved=0ahUKEwjz2tWrhvnaAhUJ7YMKHVgcCccQ_AUIDCgD&biw=1247&bih=510#imgrc=_

                                      1. 1

                                        The term ‘Cultural Marxism’ describes very well what it is, and not all leftists are cultural marxists. The classical theory of marxism, roughly speaking, is to think of society as being split in two camps, the Proletariat and the Bourgeoisie, eternally involved in a struggle, where the former is discriminated against and oppressed by the latter.

                                        Cultural Marxism applies these ideas to society. In the Frankfurt School it was called ‘Critical Theory’, calling people out to question everything that was deemed a cultural norm. What it essentially led to was a search for oppressors and oppressed, and we reached the point where e.g. the patriarchy oppressed women, white people oppressed minorities, christians oppressed muslims and other religions, and so forth. You get the idea. Before you start rallying again about how I target jews or something, please note that up to this point in this comment, I have just described what cultural marxism is and have not evaluated or criticized it in any way, because this here is the wrong platform for that.

                                        What you should keep in mind is that the nature of cultural marxism is to never be in a stable position. There will always be the hunt for the next oppressor and oppressed, which in the long run will destroy this entire movement from the inside. It was friendly advice from my side not to indulge in this separatory logic, but of course I understand your reasoning to the fullest.

                                        Just as a side note: I did not see you getting ‘seriously’ downvoted. What do you mean?

                                        1. 2

                                          It’s uncommon to find such a well-put explanation; thanks for that.

                                          There will always be the hunt for the next oppressor and oppressed, which in the long run will destroy this entire movement from the inside.

                                          If the movement runs out of good targets (and falls apart because they can’t agree on new ones), wouldn’t that imply that it will self-destruct only after it succeeds in its goals? That doesn’t sound like a bad thing.

                                          1. 1

                                            I’m glad you liked my explanation. :)

                                            That is a very interesting idea, thanks for bringing this thought up! It’s a matter dependent on many different factors, I suppose. It might fall apart due to not being able to agree on new targets or when everybody has become a target, but it is a very theoretical question which one of these outcomes applies here.

                                          2. 1

                                            Did you actually read any of the links I posted? Specifically the New Republic and SPLC links? I don’t know how else to say this, and you pretty much sidestepped what I said the first time, so I’ll try to reiterate it: There is no such thing as “Cultural Marxism”. At all. It’s not a descriptive category that any marxist actually self-applies or applies to other marxists. I’m fully aware of the Frankfurt School, Adorno, Horkheimer, etc. I’ve read some of them and many, many of their contemporaries from Germany, people like Karl Mannheim. I read marxist publications every day, from here in the states and from Europe. I’m a member of an explicitly marxist political party here in the states. I can’t emphasize this enough: “cultural marxism” isn’t real and is roughly on par with “FEMA camps”, “HAARP rays” and shape-shifting lizard jews, meaning it’s a far, far right-wing paranoid fantasy used to wall off people from other people and from an actual understanding of the material conditions of their world. I also didn’t say that you were “targeting jews”; in fact I specifically pointed out that I wasn’t saying this. That being said, if you use a phrase that has its origins in anti-semitic polemics and is used explicitly and overwhelmingly by anti-semites, then that is on you. (Did you take a look at the linked image search? Does that sort of thing not give you pause?) To say that you “just described what cultural marxism is” is also inaccurate; you absolutely used it in a descriptive way:

                                            I get your point, but must honestly say that your argument sadly aligns with the ever-excluding and self-segregating destructful nature of cultural marxism.

                                            White supremacist organizing is experiencing an enormous upsurge, not only here in the states but in Europe as well. From Le Pen to AfD to SVO in Austria and on and on. These people are not interested in polite conversation and they’re not using “cultural marxism” as a category to illuminate political opponents; it’s meant to denigrate and isolate, ironically, given that’s exactly what Neo Nazis and white supremacists here in the states accuse left wingers and “SJWs” of doing.

                                            I appreciate that you’re discussing this peacefully, but I’m going to bow out of this thread unless you’re interested enough to take some time and read the links.

                                            FWIW these also dismantle the trope and point out pretty much exactly what I’m saying around anti-semitism: https://www.vice.com/en_us/article/78mnny/unwrapping-the-conspiracy-theory-that-drives-the-alt-right https://www.theguardian.com/commentisfree/2016/feb/22/chris-uhlmann-should-mind-his-language-on-cultural-marxism

                                            1. 2

                                              I took some more time to read it up and from what I could see, I found that indeed cultural marxism has become more of a political slogan rather than a normal theoretical term in the USA.

                                              Here in Germany the term “Kulturmarxismus” is much less politically charged from what I can see and thus I was surprised to get this response after I just had “translated” this term into English. It might be a lesson to first get some background on how this might be perceived internationally, however, it is a gigantic task for every term that might come around to you.

                                              So to reiterate my question, what term could be better used instead? :)

                                              1. 1

                                                Interesting that it has a different grounding/connotation in Germany, but then again I’m not surprised, since that’s where it’s supposed to have originated. I’ll reread your other posts and come up with a response that’s fair. Thanks for taking the time to read those links.

                                            2. 1

                                              Generally people who use “cultural marxism” as a pejorative are sloganeering. The idea of an “eternal struggle” is completely foreign to any kind of marxism, which is based on a theory that classes come out of the historical process and disappear due to the historical process. Marxism claims that the proletariat and bourgeoisie are temporary divisions that arise from a certain type of economic organization. Whatever one thinks of that idea, your characterization of Marxism is like describing baseball as a game involving pucks and ice. Your summary of “cultural marxism” is even worse. Maybe take a class or read a decent book.

                                2. 17

                                  I’m not going to remove this because you’re making a public statement for suckless, but please don’t characterize positions you disagree with as madness. That kind of hyperbole generally just leads to unproductive fights.

                                  1. 9

                                    Please don’t remove anything unless it’s particularly vulgar…

                                    1. [Comment removed by author]

                                      1. 3

                                        hey that’s my account you’re talking about!

                                    2. -1

                                      Removing differing viewpoints? It is precisely this kind of behavior that maddens people who complain about SJW, who (the SJW) seem unable to take any discussion beyond calling their opponent’s position “evil”, “alt-right”, “neo-nazi”, or, if they are exceptionally well-spoken, “mad”.

                                      1. 14

                                        No, removing abuse and hyperbole that acts as flamebait regardless of the political opinions expressed. So far I’ve removed one post and hope not to remove more.

                                        1. 2

                                          It’s hard for me to see a reason to remove things when we have the voting system in place. Neither is perfect, but one is at your sole discretion whereas the other is the aggregate opinion of the users.

                                          1. 21

                                            Voting isn’t a replacement of moderation. It helps highlight and reward good comments and it can punish bad comments, but it’s not sufficient for running a community. I’m trying to head off places where people give up on argument and just try to hurt or tar the people they disagree with because it doesn’t lead to a good community. Lobsters is a very good place for discussing computing and I haven’t seen that in communities this size with hands-off moderation (but I’d love counter-examples to learn from!) From a quick query, we’ve had comments from 727 unique users in the last 30 days and there’s around 15k unique IPs in the logs per weekday, so people are constantly interacting with the others who don’t know their background, don’t share history, can’t recognize in-jokes, simply don’t have reason to trust when messages are ambiguous, let alone provocative. Friendly teasing like “ah yeah, you would think that” or “lol php sucks” that’s rewarding bonding in a small, familiar group hurts in a big one because even if the recipient gets the joke and laughs along or brushes it off as harmless, it’s read by thousands of people who don’t or can’t.

                                            1. 2

                                              Lobsters is a very good place for discussing computing and I haven’t seen that in communities this size with hands-off moderation

                                              I support your position on the sub-topic, but even my Trial you linked to shows a bit otherwise on just this point. This site has more flexible, hands-off moderation than many I’ve seen with this much political dispute. Even in that link, we saw an amount of honesty, civility, and compromise I don’t usually see. There’s been quite a bit better results in this thread than usual elsewhere. There seems to be enough community closeness despite our size that people are recognizing each other’s positions a bit. Instead of comments, you can actually see it more by what’s not said, since it’s prior ground we’ve covered. The others are learning as discussion furthers. Then, there’s the stuff we don’t want, which seems to be basically what those individuals are intending in a way that has nothing to do with the site’s size.

                                              So, I support you getting rid of just pure abuse, trolling, sockpuppeting, etc. I don’t think we’ve hit the full weaknesses and limited vision of large sites yet despite our increase in comments and views. We’re still doing a lot better than average. We’re still doing it with minimal intervention on things like politics relative to what I’ve seen elsewhere. I think we can keep at current moderation strategy for now because of that. For now.

                                              Just wanted to say that in the middle of all this.

                                              1. 0

                                                Voting isn’t a replacement of moderation. It helps highlight and reward good comments and it can punish bad comments, but it’s not sufficient for running a community.

                                                I’m not sure if I see why it’s not a good replacement. To me, I see voting as distributed moderation and the “real” moderation is automatically hiding (not removing) comments when they fall below a threshold.

                                                I’m trying to head off places where people give up on argument and just try to hurt or tar the people they disagree with because it doesn’t lead to a good community.

                                                I think this method relies on an accurate crystal ball where you can foresee people’s actions and to an extent, the reactions of the people reading the comments.

                                                I’d have to question what you mean by “a good community”, it seems like it’s just a place where everyone agrees with what you agree with and those that disagree aren’t heard because it risks offending those that do agree.

                                                I think the best discussions on here are because we have many people with wide and varied opinions and backgrounds. The good comes from understanding what someone else is saying, not excluding them from the discussion. The only places I see that warranted is where someone has said something purposely and undeniably vile.

                                                1. 8

                                                  The automatic hiding of low-scoring comments is also a “sole discretion” thing; jcs added it and I tweaked it a few months ago. The codebase enforces a lot of one moderator’s ideas of what’s good for a community in a hands-off way and the desire to do that motivated its creation.

                                                  I strongly agree that a community where everyone agrees with the moderator would be a bad one, even if I am that moderator. It’s tremendously rewarding to understand why other people see things differently, if for no other reason than the selfish reason that one can’t learn or correct mistakes if one never sees things one doesn’t already agree with.

                                                  I think the crystal ball for foreseeing problems is experience, from many years of reading and participating in communities as they thrive or fail. I think it’s possible to recognize and intervene earlier than the really vile stuff because I’ve seen it work and I’ve seen its absence fail. I keep asking for examples of excellent large communities without active moderators because I haven’t seen those, and after a couple decades and a few hundred communities I see the anthropic principle at work: they don’t exist because they self-destruct, sink into constant vileness, or add moderation. At best they maintain signal-to-noise ratios far below that of Lobsters, where the thoughtful commentary is crowded out by trolling, running jokes, ignorance, and plain low-quality comments because it doesn’t seem worth anyone’s while to care when posting.

                                                  But moderation is not a panacea in and of itself. Without good experience, judgment, and temper a bad moderator swiftly destroys a community, and this is a very common way communities fail. If it helps any, the author of the comment I removed agrees that it wasn’t done to suppress their opinion.

                                                  1. 1

                                                    The benefit I see from moderation being part of the codebase is that it’s public, predictable and repeatable (in terms of reliability). When you take moderation decisions into your own discretion many of these virtues are lost.

                                                    As for experience, I think that’s tricky because it can easily lead you to making the same mistake twice. It’s also made of your personal experiences and you’re using that to curate the discussion of other people, I would caution that it’s another method of controlling dialog (perhaps subconsciously) to what you find acceptable, not necessarily what’s best for everyone.

                                                    1. 3

                                                      The benefit I see from moderation being part of the codebase is that it’s public, predictable and repeatable (in terms of reliability). When you take moderation decisions into your own discretion many of these virtues are lost.

                                                      Most of them go into the Moderation Log. I’ve been watching it since the jcs days since it’s what folks are supposed to do in a transparent, accountable system. Gotta put effort in. I haven’t seen much of anything that bothered me. The bans and deletes I’ve been able to follow @pushcx doing were for trolling, alleged sockpuppeting, and vicious flamewarring. There were some I couldn’t see, where I’d rather the resource go off the front page than get deleted, so someone looking at logs could see it for whatever it was. Nonetheless, his actions in the thread about me, the general admining, and what I’ve seen in moderation have been mostly good. A few were really good, like highlighting the best examples of good character on the site. I think he’s the only one I’ve seen do that on a forum in a while.

                                                      You have little to worry about with him in my opinion at the moment. Do keep an eye on the comments and log if you’re concerned. Scrape them into version storage if concerned about deletions. What goes on here is pretty public. Relax or worry as much as you want. I’m more relaxed than worried. :)

                                                      1. 3

                                                        Yeah, I agree on the pitfalls of experience. As SeanTAllen noted in a separate branch of this thread a minute ago, there’s “but you didn’t say” and other wiggle room; I think that’s where automatic moderation falls down and human judgment is required. Voting has its own downsides like fads, groupthink, using them to disagree (which is all over this thread), in-jokes, a drifting definition of topicality, all the parallels to the behaviors of political rhetoric, etc. Lobsters has never been voting only and I don’t see a compelling reason to change that. jcs’s involvement in the site was steadily declining so I’m certainly more actively moderating, but I don’t see that as a change in character. I guess what it comes down to is that I agree with you about what successful communities do and don’t look like, but I haven’t seen one that works on the model you’ve outlined and I don’t see that kind of fundamental change as a risk worth taking.

                                            2. 1

                                              So FRIGN writes to oppose “SWJ madness”, and you chime in to complain that “SWJ” calls opponents “mad”. Are you calling FRIGN “SWJ” or what? It’s kind of hard to discern your point in that cloud of grievance.

                                              1. 1

                                                “SJW” for “social justice warrior.”

                                                @COCK is sarcastically non-replying because you typo’ed.

                                                1. 2

                                                  Not exactly, I was sarcastically non-replying because I assumed he was intentionally misunderstanding me. I assumed this because I didn’t see any ambiguity in my answer. On later inspection I noticed the ambiguity so I gave an actual reply:


                                                  1. 1

                                                    The interesting thing is how people agreeing with Mr. cock pile on the insults against the people who they complain are insulting them by forcing them to sign on to codes of conduct which prohibit insults. It’s almost as if there was a good reason for those codes.

                                                    1. 1

                                                      I doubt the irony is lost on anyone supporting a CoC.

                                                  2. -1

                                                    Yes, I’m calling FRIGN a “SWJ”.

                                                    1. -1

                                                      Yes, well, one sympathizes with your plight.

                                                      1. 2

                                                        Ah now I see the ambiguity: “people who complain about SJW, who…” the “who” referred to the “SJW”, not the “people”

                                                  3. 1

                                                    The only comment that was removed was against FRIGN’s point of view. Nobody is removing differing points of view, just enforcing civil discussion.

                                                2. [Comment removed by author]

                                                  1. 4

                                                    “We at suckless are heavily opposed to code of conducts and discriminatory organizations of any shape or form.”

                                                  2. 4

                                                    It’s responses like yours that really make the case for codes of conduct.

                                                    1. 2

                                                      Are you speaking for the group or is that your own opinion? Knowing that the group aligns itself with that position would certainly make me not interested in working with it or contributing.

                                                      1. 6

                                                        To be fair, suckless is not well-organised enough to be a group that can have a single opinion to be spoken for.

                                                        That said, FRIGN is a prominent contributor and from what I’ve seen most contributors are heavily on the side of “the code will speak for itself”.

                                                    1. 1

                                                      I have what feels like a really stupid question:

                                                      A careful investigation of the raw history reveals that just prior to the read of 4, process 98 attempted to write 4, and received a failure code :unavailable:

                                                      Isn’t it always legal for any DB client to claim that an operation failed when it actually succeeded? e.g. the op may have succeeded but the acknowledgement from the DB server got lost. I feel like I missed something obvious.

                                                      1. 4

                                                        Yes, this is a subtle question! Two-generals implies that we cannot determine whether or not a message was received in a finite number of messages. This means that any request-response pattern has three possible outcomes: a definite success, a definite failure, or an indeterminate result (e.g. crash, timeout, …) which could be either successful or not; we can’t say. Jepsen is careful to treat each of these cases appropriately. In this particular case, the database returned a definite failure error, rather than an indeterminate one. When databases say something definitely didn’t happen, we hold them to it.

                                                        1. 1

                                                          Thank you! ❤

                                                          So if it had returned an error code for which the documentation said “your update might have been applied despite this error. We can’t tell for sure” then you would have made the linearizability testing treat that as indeterminate rather than success or failure, and that history would have been considered legal?

                                                          1. 2

                                                            Yes, exactly. By default, Jepsen treats all errors as indeterminate ones, and we allow both possibilities. You have to explicitly tell Jepsen that a certain error is a definite failure, and then it won’t consider that operation as one that could have happened, when it goes to check for correctness. :)
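                                                            A minimal Python sketch of that three-outcome idea, for a single register. This is not Jepsen's or Knossos's code: `classify`, `Op`, `linearizable`, the timestamps, and the choice of `unavailable` as the only definite failure are all assumptions for illustration, and the search is brute force. Definite failures are dropped from the history, while indeterminate writes are tried both ways (applied at any point after invocation, or never applied).

```python
from dataclasses import dataclass, replace
from itertools import permutations
import math

# Hypothetical: error codes the database documents as "definitely did not happen".
DEFINITE_FAILURES = {"unavailable"}

def classify(error):
    """Map a client-side error (or None) to an outcome: ok, fail, or info."""
    if error is None:
        return "ok"
    # Anything not explicitly listed is treated as indeterminate.
    return "fail" if error in DEFINITE_FAILURES else "info"

@dataclass(frozen=True)
class Op:
    f: str        # "write" or "read"
    value: int
    start: float  # invocation time
    end: float    # completion time
    outcome: str  # "ok", "fail", or "info"

def respects_real_time(order):
    # If b completed before a was invoked, b may not be ordered after a.
    return not any(b.end < a.start
                   for i, a in enumerate(order) for b in order[i + 1:])

def legal_register(order, initial):
    # Replay the candidate order against a simple read/write register.
    state = initial
    for op in order:
        if op.f == "write":
            state = op.value
        elif op.value != state:
            return False
    return True

def linearizable(history, initial=None):
    # Definite failures are simply never considered.
    definite = [op for op in history if op.outcome == "ok"]
    # Indeterminate writes may take effect any time after invocation, or never.
    maybe = [replace(op, end=math.inf) for op in history
             if op.outcome == "info" and op.f == "write"]
    for mask in range(2 ** len(maybe)):
        chosen = definite + [op for i, op in enumerate(maybe) if mask >> i & 1]
        if any(respects_real_time(o) and legal_register(o, initial)
               for o in permutations(chosen)):
            return True
    return False

def history_with(error):
    # write 3 succeeds, write 4 ends with `error`, then a read observes 4.
    return [Op("write", 3, 0, 1, "ok"),
            Op("write", 4, 2, 3, classify(error)),
            Op("read", 4, 4, 5, "ok")]
```

                                                            With these assumptions, `linearizable(history_with("unavailable"))` is false, because the read of 4 has no write that could explain it, while `linearizable(history_with("timeout"))` is true, because the indeterminate write is allowed to have happened.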

                                                            1. 1

                                                              Thank you. ♥

                                                      1. 4

                                                        Great writeup as always @aphyr!

                                                        Out of curiosity, what sort of pricing do you give for this sort of work, and is it per-time-period or per-product or per-defect found or what?

                                                        1. 9

                                                          Thank you.

                                                          I usually charge for dedicated weeks of time, and we keep going as long as the client feels the work is fruitful, but it varies. Some work I try to do for free, or with a pre-arranged rate in installments.

                                                          1. 1

                                                            Have you done CockroachDB yet?

                                                            1. 1

                                                              Yep! Review is here: https://jepsen.io/analyses/cockroachdb-beta-20160829 (full disclosure: I work at cockroach)

                                                        1. 31

                                                          Author here. To be clear, all the stories in this series are jokes, not endorsements. Engineering interviews are a complex problem and I won’t begin to discuss them here, except to say that there are people whose entire lives involve studying and improving teams of humans, they’re called organizational psychologists, and it might be worth hiring some.

                                                          Since folks have expressed incredulity that these techniques are at all involved in modern programming jobs, I should note that I have had to implement trees, sorting algorithms, and parsers in my professional code. For instance, I wrote the clj-antlr parser bindings to support Riemann’s query parser, which is chock full of tree transforms, on account of it being a compiler for search predicates. Knossos includes a hand-rolled binary search over… what are essentially packed and bitmasked structs, masquerading as a JVM int array. There’s also some graph neighborhood stuff in there that allows us to memoize arbitrary function calls over objects into pointer-chasing through densely packed int arrays. Gretchen implements a number of simplification algorithms over trees of boolean expressions, as well as Tseitin-expansion to reduce predicates to conjunctive normal form, which are required to turn transaction dependency graphs into constraint- and SAT-solver-friendly expressions. This is, like, the third or fourth time I’ve tackled this problem.

                                                          I don’t… think I’ve done a regex algorithm yet, but my former coworker Zach Tellman has done quite a few DFAs and NFAs in his time; Automat comes to mind. I’m sure it’s only a matter of time before I wind up doing it too.

                                                          My experience is, of course, not representative. :)

                                                          1. 1

                                                            Why is it not representative? Seems like problems any experienced person may encounter in their travels. Enjoyed the story BTW.

                                                            1. 3

                                                              That’s the hope, but I suspect not everyone gets the chance to venture out into the interesting waters of comp sci.

                                                              1. 3

                                                                I’d guess because the median programmer is writing business code and never writes library code in their entire career.

                                                                1. 1

                                                                  Come on, that’s rock-star stuff.

                                                                  1. 1

                                                                    It’s true, it’s obviously great work. My point is that aphyr IS representative of experienced people. Experienced people … experience a wide variety of problems.

                                                              1. 18

                                                                This is something that the rest of the team and I have been working on for more than a year. I think open source sustainability is going to be one of the big issues the tech community needs to face in the next ten years, in all communities, but particularly in niche ones like Clojure. Ruby Together has a really good model, so we copied it and applied it to the Clojure community (with some tweaks). Happy to answer any questions people have.

                                                                1. 17

                                                                  Thank you for putting this together–all of you. I’m signing up Jepsen as a corporate sponsor right now.

                                                                1. 1

                                                                  For testing byzantine faults, it’s important to keep results actionable. https://github.com/madthanu/alice does a nice job of introducing realistic filesystem semantics for crash testing. You may need to run it on the ubuntu version they recommend, as their patched strace is a little out of date. It’s a pretty simple tool to use other than that!

                                                                  1. 4

                                                                    Jepsen is a distributed systems test first and foremost, but yes, for single-node faults, tools like Alice are nice. I actually spent a week or so on a research project involving filesystem-level faults, but it didn’t produce useful results within the time I had available.

                                                                    1. 2

                                                                      Recovery correctness has a ton of implications for distributed systems. It’s vital for leader election in broadcast-replicated systems with progress-related invariants, like what raft needs to enforce. I’ve also come across several popular (purportedly linearizable) distributed databases that will do things like start serving before their recovery process completes, returning stale reads just behind the in-progress recovery scan of the WAL. You’ll find gold if you look.

                                                                      1. 5

                                                                        Wow thank you, that sounds like a really interesting research opportunity!

                                                                  1. 4

                                                                    If you’d like to see an example of how these numbers play out in practice, take a look at Zach Tellman’s benchmarks for Bifurcan (https://github.com/lacuna/bifurcan/blob/master/doc/benchmarks.md), which compare Java’s ArrayList to Clojure’s vectors (which, like Scala’s vectors, are 32-way hash array mapped tries), and also Java’s HashMap and HashSet to Clojure’s hashmaps. You can also see how some of the decisions in his linear and persistent collections (List vs LinearList, Map vs LinearMap, etc) affect performance.

                                                                    1. 2

                                                                      Out of curiosity, when you evaluated Knossos performance, were you testing using histories with crashed clients, or happy-path histories where client operations succeed relatively quickly? Knossos makes some choices which optimize for the former, and I think the P-compositionality paper focuses on the latter, but it’s been a few years since I’ve been down in those papers’ guts. May need to revisit those assumptions if they were slower for your workload.

                                                                      1. 3

                                                                        Hi aphyr!

                                                                        To make the comparison more fair to Knossos, I tested histories where you can’t take advantage of P-compositionality. (In short, P-compositionality is when a history is linearizable iff all subhistories in a partitioned history are linearizable - e.g. with a map, you can partition by keys and check the subhistories independently, and that’s a lot faster.)
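                                                                        A rough Python sketch of that key-partitioning idea, assuming each operation is a dict with a `key` field. `partition_by_key`, `check_partitioned`, and the `check_one` callback are hypothetical names for illustration, not Porcupine's or Knossos's API.

```python
from collections import defaultdict

def partition_by_key(history):
    """Split a history of operations into one subhistory per key.

    Each operation is assumed to be a dict with a "key" field;
    the relative order of operations within a key is preserved.
    """
    parts = defaultdict(list)
    for op in history:
        parts[op["key"]].append(op)
    return dict(parts)

def check_partitioned(history, check_one):
    """Check every per-key subhistory independently.

    `check_one` is any single-object linearizability checker. The whole
    history is accepted only if every subhistory is accepted.
    """
    return all(check_one(sub) for sub in partition_by_key(history).values())

example = [
    {"key": "a", "f": "write", "value": 1},
    {"key": "b", "f": "write", "value": 2},
    {"key": "a", "f": "read", "value": 1},
]
```

                                                                        Each subhistory is much smaller than the full history, and the subhistories could also be checked in parallel since they share no state.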

                                                                        I used test data from Jepsen etcd tests: https://github.com/anishathalye/porcupine/blob/master/test_data/jepsen

                                                                        Here’s the quick-and-dirty benchmarking code I used: https://gist.github.com/anishathalye/a315a31d57cad6013f57d2eb262443f5 (basically, just timing knossos.core/linearizable-prefix).

                                                                        Even where Knossos is slow, e.g. etcd_002.log and etcd_099.log (timed out after > 2 days), Porcupine seems to do fine, taking hundreds of milliseconds on a single core to check the histories.

                                                                        Out of the ~100 tests, filtering for ones that Knossos finished in < 1 hour, we have a speedup of 1000x on Knossos’s fastest test (etcd_016.log) and a speedup of 20,000x on Knossos’s slowest test (etcd_040.log). And for the ones that timed out (because I didn’t want to run the tests for way too long), e.g. (etcd_099.log), Porcupine had a speedup of > 10^6.

                                                                        I haven’t had time to look into Knossos’s implementation in detail and figure out exactly where Porcupine’s speedups are coming from, but that would be cool to do at some point. Jepsen/Knossos are obviously a way more complete solution for testing distributed systems, and it would be cool to speed up the linearizability checking aspect.

                                                                        1. 2

                                                                          Ohhhhhh! Yeah, you’re using the original algorithm–that’s definitely slower. Try (knossos.linear/analysis model history) instead–that’s based on the JIT-linearization algorithm from Lowe’s paper, plus some additional optimizations–instead of performing union-find for compaction, we pre-compile the state space into a graph of pointer arrays, which turns the search into immutable pointer-chasing instead of running model code. There are… certain cases where the knossos.core algorithm is preferable (it offers better parallelism) but linear should be significantly faster. Still not good though; I’d like to sit down and figure out some alternative strategies.
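                                                                          To illustrate the pre-compilation idea only (this is a toy sketch, not the actual Knossos code), here is a Python version that explores a model's reachable states once and stores the transitions in plain integer arrays. `precompile`, `run`, `register_step`, and the `INCONSISTENT` sentinel are hypothetical names, and it assumes the set of operations is known up front and the reachable state space is finite.

```python
INCONSISTENT = object()  # sentinel: the model rejects this operation

def register_step(state, op):
    """Model code for a read/write register; op is a (function, value) pair."""
    f, value = op
    if f == "write":
        return value
    # A read is only legal if it observes the current state.
    return state if state == value else INCONSISTENT

def precompile(initial, ops, step):
    """Explore all reachable states and build an integer transition table.

    table[s][o] is the id of the state reached from state s by ops[o],
    or -1 if the model rejects that transition.
    """
    ids = {initial: 0}
    states = [initial]
    table = []
    i = 0
    while i < len(states):
        row = []
        for op in ops:
            nxt = step(states[i], op)
            if nxt is INCONSISTENT:
                row.append(-1)
                continue
            if nxt not in ids:
                ids[nxt] = len(states)
                states.append(nxt)
            row.append(ids[nxt])
        table.append(row)
        i += 1
    return states, table

def run(table, op_indices, start=0):
    """Follow precompiled transitions; no model code runs here."""
    s = start
    for o in op_indices:
        s = table[s][o]
        if s < 0:
            return -1
    return s

OPS = [("write", 1), ("write", 2), ("read", 1), ("read", 2)]
STATES, TABLE = precompile(0, OPS, register_step)
```

                                                                          After `precompile`, a search only indexes into `TABLE`, so trying a candidate ordering costs one array lookup per operation rather than a call into the model.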

                                                                          And yeah, as you note, we don’t do P-compositionality in Knossos–that’s handled by Jepsen, which performs the decomposition into independent keys for maps, sets, etc, then hands individual histories to Knossos. Would be nice to fold into Knossos later though!

                                                                          Last thing, if you wind up packaging this for distribution, I’d like to offer a hook in Jepsen so we can pass histories to it. If you’d like to define some sort of serialization format (JSON, tsv, EDN, protobufs, etc) for passing histories in and getting analysis results back, I can wrap that up in a Jepsen checker as an alternative strategy. :)

                                                                          1. 2

                                                                            Oops, I didn’t know that. I redid the benchmarking with (knossos.linear/analysis model history), running Knossos on 6 cores and Porcupine on 1 core.

                                                                            The benchmark results did change: Knossos completed every check significantly faster. On some tests, the new algorithm performed significantly better: e.g. on etcd_040.log, Porcupine has a speedup of 278x, as opposed to a speedup of 20,000x when comparing against the original algorithm (knossos.core/linearizable-prefix).

                                                                            Porcupine still ran faster on all tests, though; the following is a summary of the results (over the ~100 Jepsen etcd tests):

                                                                            Min speedup:    8.1x     on etcd_002.log
                                                                            Max speedup:    21,219x  on etcd_067.log
                                                                            Median speedup: 1000x

                                                                            Ooh, that sounds cool! I’ll probably end up packaging this for distribution in a couple weeks, and I’ll definitely reach out to you once I have an API for communicating with Porcupine.

                                                                      1. 3

                                                                        The motherboard is wonky and refuses to find half the disks on boot. You can crash the box by using certain USB ports. We have a complicated relationship.

                                                                        I’m curious about this. You can buy a stable, reliable computer with 48 cores and 128GB of ECC RAM completely off the shelf — Dell/HP/etc sell rackmount or tower servers with these configurations.

                                                                        Is using a really powerful but unreliable desktop computer a net productivity advantage relative to using a small reliable desktop + SSHing into a big reliable server? I appreciate that remote debugging is often not as nice as local debugging, but on the other hand remote debugging has some nice side benefits like the fact that the UI that you’re using doesn’t go unresponsive when the machine gets loaded.

                                                                        I can totally sympathise if it turns out that the root cause of this was just that Kyle just really really wanted a really overpowered computer out of sheer nerdlust.

                                                                        1. 8

                                                                          sheer nerdlust

                                                                          I know it’s a time-honored tradition to armchair-architect strangers’ technical decisions without regard for the privileges of context or experience, but fuck it, I’ll bite.

                                                                          I used to rebuild and rack servers for a living, and considered that here, but ultimately decided I wanted a workstation.

                                                                          It’s quieter; I didn’t feel like having a screaming banshee 2u sitting in my tiny SF bedroom. It means owning one computer instead of two, which is cheaper, takes up less space, and cuts down on my time spent doing stupid sysadmin stuff. It’s also way less of a pain in the ass to work with than the janky-ass combination of remote filesystems, SSH tunnels, rsync hacks, X forwarding, and Yourkit injection that I have to use with remote Jepsen clusters.

                                                                          1. 6

                                                                            Thank you for replying! I’m sorry if I came off as insulting. Edit: I apologise for insulting you. That was not my intention.

                                                                            It’s quieter

                                                                            I appreciate the noise issue. I’m used to 1U servers being awful for it because the fans are small so the noise is high-pitched, haven’t gotten my hands on 2U or up to see if they’re much quieter. I thought tower servers were supposed to be no worse than desktops in this regard? Since they’re not that different and can use similarly huge fans?

                                                                            janky-ass combination of remote filesystems, SSH tunnels, rsync hacks, X forwarding, and Yourkit injection

                                                                            Ouch. Good point, avoiding that mess is worth a lot of effort.

                                                                          2. 1

                                                                            My guess is that it’s some kind of whitebox. I’ve never had good luck with them; there’s always some kind of jank. I replaced my whitebox server with an SFF business desktop and it’s been far more stable.

                                                                          1. 14

                                                                            VoltDB doesn’t have a whole lot to sell to me, really. That said, if I understand the chain of events properly, VoltDB sponsored this Jepsen post (and all the research that went into it), took those findings and started fixing the problems unearthed. That’s an admirable commitment to both data safety and openness, and it means I’ll consider VoltDB preferably over competitors should I ever need that feature set.

                                                                            1. 28

                                                                              Yep, you’re understanding correctly. Like RethinkDB, VoltDB approached me for help in testing their systems, and funded the research. I found initial cases pretty quickly, deeper problems over the next month, and worked with their team for the next month or so to create more stringent tests and evaluate proposed fixes–VoltDB ported some of these test scenarios to their internal test suite, and is integrating Jepsen into their testing cycle now. That work culminated in the release of 6.4 last week. You can read more about how I handle sponsored research, and see the full set of bugs we uncovered on VoltDB’s issue tracker.

                                                                              1. 1

                                                                                There are some points where I just don’t understand you (GC are not partitions in the CAP proof, and it’s not new: see cap faq point 16 http://henryr.github.io/cap-faq/). You can push it into it (for example by putting some deadline constraints in the definition of availability), but it’s not really CAP anymore.

                                                                                It’s easier to choose an algorithm which is safe in the general case of asynchronous networks,

                                                                                Yeah this one is interesting. I agree on the algo choice (there is no other choice anyway), but many errors (typically not flushing the writes to the disk) are visible only with kill -9 or node failures. You have to test independently. Partition tolerance does not mean fault tolerance.

                                                                                1. 4

                                                                                  Partitions in the G&L proof are any pattern of message loss. Any garbage collection period which exceeds the duration an application will wait for a message before considering it lost (e.g. tcp socket timeouts, application-level timeouts) will get you the same behavior.
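
                                                                                  A minimal sketch of that equivalence, assuming a toy model in which a paused receiver simply delays its reply until the pause ends. The function names, the fixed one-way latency, and the pause list are all hypothetical.

```python
def reply_time(sent_at, latency, pauses):
    """Time at which the sender sees a reply.

    `pauses` is a list of (start, end) intervals during which the
    receiver is stopped (e.g. by a GC pause) and cannot process messages.
    """
    arrives = sent_at + latency
    for start, end in sorted(pauses):
        if start <= arrives < end:
            arrives = end  # the message waits until the receiver resumes
    return arrives + latency


def considered_lost(sent_at, latency, pauses, timeout):
    """True if the sender gives up before the reply comes back."""
    return reply_time(sent_at, latency, pauses) - sent_at > timeout


# A 5 s pause against a 1 s timeout looks exactly like a dropped message.
long_pause = considered_lost(0.0, 0.01, [(0.0, 5.0)], timeout=1.0)
# A 0.5 s pause is only extra latency.
short_pause = considered_lost(0.0, 0.01, [(0.0, 0.5)], timeout=1.0)
```

                                                                                  From the sender's side there is no way to tell the first case apart from a severed link, which is why the proof's notion of a partition covers it.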

                                                                                  1. 1

                                                                                    Yep, that’s what I call pushing it into it (application cannot wait forever => there are deadline constraints). CAP applies in this case (i.e. you really have to choose between consistency & availability).

                                                                                    GCs are still a little bit simpler than a true network partition, because the process stops working for everybody. Agreed, there will be some nasty race conditions at the end. But you don’t lose a full rack when there is a GC. It’s a much nicer kind of fault, one node at a time. In a big data system, if you lose a rack you have to replicate the data again for safety. With a GC you do not need that: the node will come back to life before you start to replicate the 2 TB of data it was storing (not to mention the case w/ a rack!).

                                                                                    I do agree with you on the asynchronous part: the algorithm you need to choose when the network is asynchronous will help a lot with partitions and with GC. But you need to test both.

                                                                                    1. 4

                                                                                      It’s a much nicer kind of fault, one node at a time

                                                                                      GC is a notorious cause of distributed systems data-loss, typically where it allows two nodes to believe they’re the primary at the same time. Take a look at the Elasticsearch mailing lists sometime for all kinds of horror stories around high load, memory pressure, or GC causing inconsistency.
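
                                                                                      A toy illustration of the two-primaries failure, assuming a simple time-based lease. This is a hypothetical model, not how Elasticsearch or any particular system elects a primary.

```python
class Node:
    """A node that considers itself primary while its lease is valid."""

    def __init__(self, name):
        self.name = name
        self.lease_expires = 0.0

    def acquire(self, now, lease_seconds):
        self.lease_expires = now + lease_seconds

    def lease_valid_at(self, check_time):
        """Result of the node's own lease check at `check_time`."""
        return check_time < self.lease_expires


a, b = Node("a"), Node("b")

# t=0: a takes a 10 s lease.
a.acquire(now=0.0, lease_seconds=10.0)

# t=9: a checks its lease, sees it is valid, then pauses for GC until t=15.
a_thinks_primary = a.lease_valid_at(9.0)

# t=11: a's lease has lapsed, so b takes over and starts accepting writes.
b.acquire(now=11.0, lease_seconds=10.0)
b_thinks_primary = b.lease_valid_at(12.0)

# t=15: a resumes and acts on its stale check from t=9.
both_primary = a_thinks_primary and b_thinks_primary
```

                                                                                      The pause falls between checking the lease and acting on it, so only one node stopped, yet two nodes accept writes as primary.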

                                                                                      1. 1

                                                                                        I’m not sure if the size of the fault is necessarily relevant for this discussion, either.

                                                                                        1. 1

                                                                                          Agreed again, but node failures and network partitions will add a few other horror stories.

                                                                                          I mean, I would expect a software vendor to say

                                                                                          • We have tested dirty node crashes, no data loss

                                                                                          • We have tested GC. No data loss, no performance issue if the GC is contained within 10s.

                                                                                          • We have not tested network partitions. Per our design it should be fine (we’re aiming at AP: availability during partition), but it’s still an edge case.

                                                                                          Rather than: “we’re partition tolerant of course.”

                                                                                          And for a system like ES (for example), the design for availability under network partition could be something with partial results and so on (harvest & yield). Not that easy to do (I don’t know ES).

                                                                                          1. 4

                                                                                            Agreed again, but node failures and network partitions will add a few other horror stories.

                                                                                            Absolutely agreed. The reason I mention GC in this response is because you’ve argued that LANs won’t partition. Even if LAN fabric were totally reliable, I’m trying to remind people that partition tolerance is about message delivery, not just what we traditionally consider “the network”.

                                                                                            And for a system like ES (for example), the design for availability under network partition could be something with partial results and so on (harvest & yield).

                                                                                            Gilbert & Lynch, footnote 4: “Brewer originally only required almost all requests to receive a response. As allowing probabilistic availability does not change the result when arbitrary failures occur, for simplicity we are requiring 100% availability.”

                                                                                            1. 1

                                                                                              Absolutely agreed. The reason I mention GC in this response is because you’ve argued that LANs won’t partition.

                                                                                              I doubt I said something like this :-) But yeah, for sure the whole post is only about network partitions. I will update the post to make this clear.

                                                                                              1. 1

                                                                                                “CA exists, and is described as acceptable for systems running on LAN” “Stonebraker was considering small systems of high range servers on a LAN” “it is also fine to assume CA on a LAN”

                                                                                                1. 1

                                                                                                  None of those are mine (lynch/stonebraker/brewer)

                                                                                                  Both Stonebraker and Brewer consider (quoting Brewer but Stonebraker said exactly the same thing) “CA should mean that the probability of a partition is far less than that of other systemic failures”, so even if they think that CA is acceptable in some specific cases on a LAN that does not mean they think that LANs won’t partition.

                                                                                        2. 1

                                                                                          GC are still a little bit simpler than a true network partition… It’s a much nicer kind of fault, one node at a time

                                                                                          This is usually the case. However, I’ve also seen the back pressure resulting from a GC cause other nodes to become loaded. The other nodes then started a GC. Now there was a feedback loop and the entire system ended up falling over.

                                                                                          The system could have been configured better… but that’s kind of the same point about experiencing partitions on a LAN. It’s not cost-effective and you’re still going to miss something.

                                                                                  1. 20

                                                                                    People seem not to realize (or not to want to realize) that CAP is a mathematical theorem which is rigidly true in every circumstance and cannot be talked around or ignored. It’s simply a fact. In the presence of network partitions, your system will sacrifice consistency or availability (or, in a no-doubt worryingly large number of cases, both); the most you can do is pick. (It is a safe bet, by the way, that availability is the wrong choice [edit: to keep; that was entirely unclear, sorry].)

                                                                                    (As an amusing-to-me aside, CA is what every system should be in the absence of partitions. If your system cannot deliver both consistency and availability even when everything is going right, it is worse than useless.)

                                                                                    1. 14

                                                                                      As an amusing-to-me aside, CA is what every system should be in the absence of partitions.

                                                                                      Sometimes, for latency reasons, you might want lower consistency even when the network is fully connected. Figuring out when you want that, though… maybe an open research problem, haha.

                                                                                      1. 2

                                                                                        As an amusing-to-me aside, CA is what every system should be in the absence of partitions.

                                                                                        As aphyr said, you may want to do this for latency reasons. For example, there’s PA/EL in Abadi’s PACELC taxonomy. Many traditional architectures with replication trees of relational databases offer these tradeoffs, as do many “quorum”-based database systems.

                                                                                        Along with latency, there’s also scale. Again, with relational databases it’s fairly common to have async replicated read replicas take read-only traffic, or to have multi-master configurations with async replication. These systems choose A over C entirely for performance reasons, and may not actually intend to be available under partitions. In fact, many choose a “bounded staleness” model, where they stop returning data after some known staleness, which is not possible to achieve with full availability under partitions. These kinds of systems - very sensible systems - are neither C (linearizable) or A (fully available) under partitions.
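
                                                                                        A small sketch of the “bounded staleness” idea, assuming a read replica that tracks when it last heard from its primary. The class and method names are hypothetical.

```python
class StaleReadError(Exception):
    """Raised when the replica is too far behind to serve reads."""


class Replica:
    def __init__(self, max_staleness):
        self.max_staleness = max_staleness
        self.data = {}
        self.last_sync = None  # time of the last replication update

    def apply_replication(self, data, now):
        """Install a snapshot received asynchronously from the primary."""
        self.data = dict(data)
        self.last_sync = now

    def read(self, key, now):
        """Serve a possibly stale read, but only within the bound."""
        if self.last_sync is None or now - self.last_sync > self.max_staleness:
            raise StaleReadError("replica exceeded its staleness bound")
        return self.data.get(key)


replica = Replica(max_staleness=5.0)
replica.apply_replication({"k": 1}, now=0.0)
fresh_enough = replica.read("k", now=3.0)  # within the bound, so served
```

                                                                                        Once replication stops for longer than the bound, for example during a partition, every read fails. The replica is not linearizable while connected and not available while partitioned.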

                                                                                        1. 2

                                                                                          This is true. Actually, the extreme strength of the notion of consistency Brewer used (that is, linearizability) is a point that can be used to argue against the conclusions of the CAP theorem, because depending on the data model, many systems can be meaningfully consistent without full linearizability.

                                                                                          I’m not aware of any work to prove (or disprove) the CAP theorem for different notions of consistency, though I would conjecture that the lower bound on consistency possible while maintaining availability is uselessly low.

                                                                                          1. 4

                                                                                            I’m not aware of any work to prove (or disprove) the CAP theorem for different notions of consistency

                                                                                            I suggest http://www.vldb.org/pvldb/vol7/p181-bailis.pdf, which includes a handy summary of impossibility results for various consistency models.

                                                                                          2. 1

                                                                                            I can’t remember the name well enough to find it on Google, but MSR had an interesting paper trying to figure this out to some degree. Pileaus or something.

                                                                                            1. 1

                                                                                              This? http://research.microsoft.com/en-us/projects/capcloud/default.aspx

                                                                                              (looks like your memory is totally CA.)

                                                                                              1. 1

                                                                                                Yep, that’s it! That only does these adaptive things on reads.

                                                                                        1. 6

                                                                                          This article is making me really nervous but I know I don’t have the distributed chops to prove it wrong.

                                                                                          I’ll say this: when an author starts talking about probabilistic interpretations of the theorem and going on about cost-benefit analysis (seriously, why are we worked up about poor “administration interfaces” here?!) my BS needle starts twitching. And when they do that when an impossibility proof exists that shows element availability and atomic consistency are not both possible, it starts swinging around madly.

                                                                                          The article reads like an awful lot of language lawyering around fairly well understood concepts, but I’m not sure what the motivations of the author are.

                                                                                          1. 6

                                                                                            Heh… Sigh. It reads like an attempt to illuminate, but a bad one. That seems worthwhile if it were shorter and clearer; I don’t think the concepts are actually all that well understood, unfortunately. At a previous job, after two months of arguing that Riak was the wrong choice for the company, I finally got through:

                                                                                            Me: “What exactly is the benefit you’re hoping for from using a distributed technology? Uptime? Preventing data loss?”

                                                                                            Them: “Yes, both of those.”

                                                                                            Me: “Those are mutually-exclusive in our situation.”

                                                                                            Them: “Oh… Maybe something else would be okay.”

                                                                                            (And no, they aren’t inherently mutually exclusive, but the data was peculiar and merging later, after resolving a partition, wasn’t an option. I can’t go into it.)

                                                                                            I definitely don’t want that to be read as an insult to the intelligence of the person involved; they were quite competent. It’s just that databases are a subject not all engineers actually know very much about, and distributed ones are a rather new technology in the scheme of things.

                                                                                            It’s worth noting that not all distributed systems are databases, too, of course!

                                                                                            1. 5

                                                                                              That’s not what the impossibility proof says–he references that paper.

                                                                                              “In 2002, Seth Gilbert and Nancy Lynch publish the CAP proof. CA exists, and is described as acceptable for systems running on LAN.”

                                                                                              “If there are no partitions, it is clearly possible to provide atomic, available data. In fact, the centralized algorithm described in Section 3.2.1 meets these requirements. Systems that run on intranets and LANs are an example of these types of algorithms” [0]

                                                                                              I don’t think CAP is very well understood. I think folks end up very confused about what consistent means, and what partition-tolerant means.

                                                                                              I think this is pretty well researched. I’m not sure why cost-benefit analysis makes you nervous.

                                                                                              1. 4

                                                                                                James Hamilton of AWS says it best, I think:

                                                                                                Mike also notes that network partitions are fairly rare. I could quibble a bit on this one. Network partitions should be rare but net gear continues to cause more issues than it should. Networking configuration errors, black holes, dropped packets, and brownouts, remain a popular discussion point in post mortems industry-wide.

                                                                                                Gilbert & Lynch’s implicit assertion is that LANs are reliable and partition free; I can buy this in theory but does this happen in practice? When Microsoft performed a large analysis of failures in their data centers, they found frequent loss occurring that was only partially mitigated by network redundancy.

                                                                                                But either way you make a fair point: CA models aren’t strictly precluded by that proof. I’m just not certain I’ve seen a network that is trustworthy enough to preclude partitions.

                                                                                                1. 5

                                                                                                  Network partitions are not even remotely rare, honestly. LANs are actually worse culprits than the Internet, but both do happen.

                                                                                                  You already cited one of the better sources for it, but mostly I believe this because I’ve been told it by network engineers who I respect a lot.

                                                                                                  1. 6

                                                                                                    Even if network partitions were rare, I’ll tell you what aren’t (for most people): garbage collections. What I did not like about this post is it, over and over again, just talks about network partitions and the actual networking hardware. But weird application-specific things happen as well that appear to be unresponsive for longer than some timeout value and these are part of the ‘P’ as well.

                                                                                                    In reality, I think CAP is too cute to go away but not actually adequate in talking about these things in detail. PACELC makes the trade-offs much clearer.

                                                                                                    1. 4

                                                                                                      LANs are actually worse culprits than the Internet

                                                                                                      Funny you mention that: over the past few days I’ve been fighting an issue with our internal network that has resulted in massive packet loss internally (>50% loss in some spikes), and ~0.5% to the Internet. That’s probably why this article raised my eyebrows - it’s my personal bugbear for the week.

                                                                                                      The culprit seems to have been a software update to a Palo Alto device that stopped playing nice with certain Cisco switches… plug the two of them together and mumble mumble spanning tree mumble loops mumble. The network guys start talking and my eyes glaze over. But all I know is that I’ve learned the hard way to not trust the network - and when a proof exists that the network must be reliable in order to have CA systems, well…

                                                                                                      1. 1

                                                                                                        Heh - my sympathies.

                                                                                                  2. 3

                                                                                                    I think some of the confusion comes from describing all node failures as network partitions. In reality “true” network partitions are rare enough (lasting in durations long enough to matter to humans), but nodes failing due to hardware failure, operational mistakes, non-uniform utilization across the system, and faulty software deploys are sometimes overlooked in this context.

                                                                                                    i like the comment above “It’s worth noting that not all distributed systems are databases, too, of course!”, but i think this is also a matter of perspective. most useful systems contain state, isn’t twitter.com as a network service a distributed database? kind of neat to think about

                                                                                                  3. 3

                                                                                                    It’s not clear to me that the distinction the author makes between a CA and a CP system exists. He uses ZooKeeper as an example of a CP system, but the minority side of a network partition in ZooKeeper cannot make progress, just like his CA example. In reality, CP seems to be a matter of degree, not a boolean, to me. Why does a CP system that handles 0 failures have to be different than one that handles 2f-1?

                                                                                                    1. 1

                                                                                                      When the system availability is zero (not available at all) after a partition, you can claim both CP and CA (that’s the overlap between CP/CA).

                                                                                                      There are two corner cases when the system is not available at all:

                                                                                                      • the system does not even restart after the partition. You can claim CP theoretically. The proof’s definitions don’t prevent this formally. But it makes little sense in practice.

                                                                                                      • the system restarts after the partition and remains consistent. Both CP and CA are ok.

                                                                                                      But ZooKeeper is not concerned by these corner cases, because it is partly available during the partition.

                                                                                                      1. 9

                                                                                                        No, you can’t: a system which is not available during a partition does not satisfy A, and cannot be called CA. If you could claim both CA and CP you would have disproved CAP.

                                                                                                        1. 2

                                                                                                          CA means: I have a magical network without partitions. If my network is not that magical in the end, I will be CP/AP and more likely in a very bad state, not fully available and not fully consistent.

                                                                                                          1. 7

                                                                                                            I’m responding to “When the system availability is zero (not available at all) after a partition, you can claim both CP and CA”. Please re-read Gilbert & Lynch’s definition of A: you cannot claim CA if you refuse to satisfy requests during a partition.

                                                                                                            1. 3

                                                                                                              But those magic networks do not exist, so how can a CA system exist?

                                                                                                              1. 1

                                                                                                                :-) It exists until there is a partition. Then the most probable exit is to manually restore the system state, 2PC with heuristic resolution being an example.

                                                                                                                Or, if you build a system for machine learning: 20 nodes with GPU, 2 days of calculation per run. If there is a network partition during these two days you throw away the work in progress, fix the partition and start the calculation process again. I don’t see myself waiting for the implementation/testing of partition tolerance for such a system. I will put it in production even if I know that a network partition will break it apart.

                                                                                                                1. 2

                                                                                                                  That system is still CP. You are tolerating the notion of partitions, and in the case of a partition you sacrifice A (fail to fulfill a request–a job in this case) and restart the entire system for the sake of C.

                                                                                                                  1. 1

                                                                                                                    It exists until there is a partition.

                                                                                                                    If a system reacts to a partition by sacrificing availability - as it must, and you haven’t demonstrated differently - how can you claim it is CA?

                                                                                                                    If there is a network partition during these two days you throw away the work in progress, fix the partition and start the calculation process again. I don’t see myself waiting for the implementation/testing of partition tolerance for such a system. I will put it in production even if I know that a network partition will break it apart.

                                                                                                                    I feel like I’m in bizarro world.

                                                                                                                    1. 1

                                                                                                                      If a system reacts to a partition by sacrificing availability - as it must, and you haven’t demonstrated differently - how can you claim it is CA?

                                                                                                                      If the system sacrifices availability (it could also be consistency, or both), then there is an overlap between CA and CP. That’s what Daniel Abadi said 5 years ago: “What does “not tolerant” mean? In practice, it means that they lose availability if there is a partition. Hence CP and CA are essentially identical.”

                                                                                                                      The key point is that forfeiting partitions does not mean they won’t happen. To quote Brewer (in 2012) “CA should mean that the probability of a partition is far less than that of other systemic failures”

                                                                                                                      That’s why there is an overlap. I can choose CA when the probability of a partition is far less than that of other systemic failures, but I could still have a partition. And if I have a partition I will be either inconsistent, or unavailable, or both, and I may also have broken some of my system invariants.

                                                                                                                      I’m sure it does not help you as I’m just repeating my post, and this part is only a repetition of something that was said previously by others :-(

                                                                                                                      Trying differently: maybe the difficulty in understanding this is that you have:

                                                                                                                      • CAP as a theorem: you have to choose between consistency and availability during a partition. There are 3 options here:

                                                                                                                        • full consistency (the CP category)

                                                                                                                        • full availability (the AP category)

                                                                                                                        • not consistent but only partial availability (not one of the CAP categories, but possible in practice, typically 2PC with heuristic resolutions: all cross-partition operations will fail).

                                                                                                                      • CAP as a classification tool with 3 options: AP/CP/CA. These are a description of the system. CA means you forfeited partition tolerance, i.e. a partition is a major issue for the system you build.

                                                                                                                      And, in case there is any doubt: most systems should not forfeit partitions. I always mention 2PC/heuristic because it is a production-proven exception.

                                                                                                            2. 1

                                                                                                              Could you rephrase your statement? I am having trouble parsing what you have said.

                                                                                                              1. 1

                                                                                                                the cr went away. let me edit.

                                                                                                              2. 1

                                                                                                                If we take your second case - as it’s the only real case worth discussing, as you note :-) - how can you claim the system is available?

                                                                                                                The system is CA under a clean network until time n when the network partitions. The partition clears up after m ticks. So from [1, n) and (m, inf) the system is CA, but from [n, m] it is unavailable. Can we really say the system maintains availability? That feels odd to me.

                                                                                                                Maybe it makes more sense to discuss this in terms of PACELC - a system in your second case has PC behavior; in the presence of a partition it’s better to die hard than give a potentially inconsistent answer.

                                                                                                                Having said all of this, my distributed systems skills are far below those of the commentators here, so please point out any obvious missteps.

                                                                                                                1. 1

                                                                                                                  CA is forfeiting partition tolerance (that’s how it was described by Eric Brewer in 2000). So if a partition occurs it’s out of the operating range, you can forfeit consistency and/or availability. It’s an easy way out of the partition tolerance debate ;-). But an honest one: it clearly says that the network is critical for the system.

                                                                                                                  Maybe it makes more sense to discuss this in terms of PACELC - a system in your second case has PC behavior;

                                                                                                                  Yep, it works: Daniel Abadi solved the overlap by merging CA and CP (“What does “not tolerant” mean? In practice, it means that they lose availability if there is a partition. Hence CP and CA are essentially identical.”) It’s not totally true (a CA system can lose its consistency if there is a partition, like 2PC w/ heuristic resolutions), but it’s a totally valid choice. If you make the same choice as Daniel in CAP, you choose CP for the system 2 above. CA says “take care of your network and read the documentation before it is too late”.

                                                                                                            3. 3

                                                                                                              seriously, why are we worked up about poor “administration interfaces” here

                                                                                                              :-) Because I’ve seen a lot of systems where the downtime/data corruptions were caused mainly by: 1) software bugs 2) human errors.

                                                                                                              I also think that a lot of people take partition tolerance for granted (i.e. “this system is widely deployed in production, so it is partition tolerant as I’m sure everybody has network issues all the time, so I can deploy it safely myself w/o thinking too much about the network”). Many systems are not partition tolerant (whatever they say). That’s why Aphyr’s tests crash them (dataloss, lost counters,…), even if they are deployed in production.

                                                                                                              It does not mean they have no value. It’s a matter of priority. See Aphyr’s post on ES: imho they should plan partition tolerance and implement crash tolerance immediately, for example, instead of trying to do both at the same time.

                                                                                                              I prefer a true “secure your network” rather than a false “of course we’re partition tolerant, CAP says anything else is impossible” statement (with extra points for “we’re not consistent so we’re available”).

                                                                                                              1. 3

                                                                                                                CAP tells you that you can’t have both C and A when a partition happens. Most people take that to mean you must choose one or the other and have a CP or AP system. But it’s worth remembering that you do have the option of making sure that partitions never[1] happen - either by making the system non-distributed or by making the communications reliable enough. And for some use cases that might be the correct approach.

                                                                                                                [1] In a probabilistic sense - you can’t ensure that a network partition never happens, but nor can you ensure that you won’t lose all the nodes of your distributed system simultaneously. Any system will have an acceptable level of risk of total failure; it’s possible to lower the probability of a network partition to the point where “any network partition is a total system failure” is an acceptable risk.

                                                                                                                1. 2

                                                                                                                  I think it’s important to modify your statement a bit. What you have to do is ensure that in the face of a partition you remain consistent then try your darnedest to reduce the frequency of partitions. The distinction being you have control over what happens during a partition but not control over a partition happening.

                                                                                                                  1. 4

                                                                                                                    you have control over what happens during a partition but not control over a partition happening.

                                                                                                                    I don’t think that this sharp distinction exists. You don’t have absolute control over what happens during a partition - to take an extreme example, the CPU you’re running on might have a microcode bug that means it executes different instructions from the one you intended. And you do have control - to the extent that you have control of anything - over the things that cause network partitions; you can construct your network (or pay people to construct your network) so as to mitigate the risks. It is absolutely possible to construct a network which won’t suffer partitions (or rather, in which partitions are less likely than simultaneous hardware failures on your nodes) if you’re willing to spend enough money to do so (this is rarely a smart choice, but it could be).

                                                                                                                    1. 2

                                                                                                                      I do not think byzantine faults really matter for this discussion, they are a whole other issue to partitions. But I do not think your response invalidates my point at all. Partitions are something that happens to you, how your program handles them is something you do.

                                                                                                              1. 9

                                                                                                                Implicitly ignoring nil is the worst thing to do; it’s the reason that, when something goes wrong in JavaScript, you spend hours figuring out “why was that nil?” At least with a NullPointerException you find out relatively soon (though ideally you would find out before that, at the thing that was going to return null).

                                                                                                                Most languages know better than to allow nilness to pass silently, just as they know better than to have asynchronism happen silently. The whole point of monads is to allow you to do this kind of thing in a low-overhead, but still explicit way.

                                                                                                                1. 3

                                                                                                                  The behavior described in the article didn’t seem very “monadic” to me either. Swallowing errors appears to be the opposite of what Monads are about - treating side effects explicitly. That said, I can definitely see the benefit of this behavior in certain cases, for example when it matters more to make something work than to make it correct.

                                                                                                                  1. 3

                                                                                                                    Objective-C’s bottom propagation seems to work really well; perhaps it’s a cultural and not a technical distinction?

                                                                                                                    1. 2

                                                                                                                      You’re absolutely right that implicit handling of nil can cause errors to be detected further from their causes–but the same is true of currying, which allows type errors to arise from widely separated parts of a program. Different languages have different habits around these kinds of convenience/safety tradeoffs.

                                                                                                                      FWIW, I find Clojure’s nil handling generally more convenient than a hindrance–functions which define nil often serve as their own base case in recursive algorithms, for example, which reduces the need for explicit branching. This is especially helpful when building up maps! And while I agree that Clojure’s default nil behaviors can make it easier to make mistakes, core.typed is quite effective at preventing NPEs when you want that degree of safety.

                                                                                                                      1. 4

                                                                                                                        You’re absolutely right that implicit handling of nil can cause errors to be detected further from their causes–but the same is true of currying, which allows type errors to arise from widely separated parts of a program.

                                                                                                                        Are you equating runtime errors and type errors? I’ve not seen currying create confusing type errors. Confusing runtime errors, yes definitely, but if you have types - no.

                                                                                                                        Source-sink distances in GHC type errors are generally quite good, whereas I’ve seen things like Clojure vectors being functions create mind-bending source-sink distances in errors.

                                                                                                                        1. 1

                                                                                                                          I think the way Swift handles this is a pretty happy medium. Values aren’t implicitly nullable, and nil propagation is explicit rather than implicit. This means that:

                                                                                                                          • When you’re okay with nil propagating, you can very easily just chain optional values by doing something like:

                                                                                                                            let myValue : SomeType? = myObject?.doSomething()?.value as? SomeType

                                                                                                                            And myValue would be nil if myObject is nil. Useful in the cases when something like that is acceptable.

                                                                                                                          • But if you absolutely need the value to not be nil, then you throw in an explicit optional unwrapping and you get crashes if it is nil:

                                                                                                                            let myValue : SomeType = myObject!.doSomething().value as! SomeType

                                                                                                                            And in this case myValue would either be non-nil or you’d get an exception.

                                                                                                                      1. 5

                                                                                                                        In my own code, if I ever have methods return self, it is because I am trying to implement some sort of chainable API. This pattern is popular when constructing objects in Rust, and is often called the “Builder” pattern. That is, instead of implementing a constructor that takes 5 different parameters like this:

                                                                                                                        let mut my_obj = MyObj::new("some string", 1, SOME_ENUM, true, false);

                                                                                                                        You use the “Builder” pattern to make it much more clear (also more verbose, but hey, “tradeoffs”):

                                                                                                                        let my_obj = MyObj::new("some string")

                                                                                                                        The nice thing about this in Rust is that you can keep the mutability confined to object construction. After the object gets assigned to my_obj, it is considered immutable (you would have to write let mut my_obj to change this behavior).
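
                                                                                                                        A minimal sketch of such a builder, for illustration only: the setter names (priority, mode, foo, bar) and the Mode enum are hypothetical and not taken from any real MyObj API. Each setter takes self by value and returns it, so the caller never needs a mut binding:

                                                                                                                        ```rust
                                                                                                                        // Hypothetical enum standing in for SOME_ENUM.
                                                                                                                        #[derive(Debug, Clone, Copy, PartialEq)]
                                                                                                                        enum Mode {
                                                                                                                            Normal,
                                                                                                                            Fast,
                                                                                                                        }

                                                                                                                        #[derive(Debug, PartialEq)]
                                                                                                                        struct MyObj {
                                                                                                                            name: String,
                                                                                                                            priority: u32,
                                                                                                                            mode: Mode,
                                                                                                                            foo: bool,
                                                                                                                            bar: bool,
                                                                                                                        }

                                                                                                                        impl MyObj {
                                                                                                                            // Required argument goes in the constructor; the rest get defaults.
                                                                                                                            fn new(name: &str) -> Self {
                                                                                                                                MyObj { name: name.to_string(), priority: 0, mode: Mode::Normal, foo: false, bar: false }
                                                                                                                            }

                                                                                                                            // Each setter consumes the value, changes one field, and hands it back.
                                                                                                                            fn priority(mut self, p: u32) -> Self { self.priority = p; self }
                                                                                                                            fn mode(mut self, m: Mode) -> Self { self.mode = m; self }
                                                                                                                            fn foo(mut self, v: bool) -> Self { self.foo = v; self }
                                                                                                                            fn bar(mut self, v: bool) -> Self { self.bar = v; self }
                                                                                                                        }

                                                                                                                        fn main() {
                                                                                                                            // No `mut` here: mutation is confined to the setter bodies.
                                                                                                                            let my_obj = MyObj::new("some string")
                                                                                                                                .priority(1)
                                                                                                                                .mode(Mode::Fast)
                                                                                                                                .foo(true)
                                                                                                                                .bar(false);
                                                                                                                            println!("{:?}", my_obj);
                                                                                                                        }
                                                                                                                        ```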

                                                                                                                        1. 19

                                                                                                                          I like builders and have written APIs that provide builder patterns, but I really prefer option maps where the language makes it possible. For instance:

                                                                                                                          let my_obj = MyObj::New("some string",
                                                                                                                                                  {:priority 1
                                                                                                                                                   :mode     SOME_ENUM
                                                                                                                                                   :foo?     true
                                                                                                                                                   :bar?     false})
                                                                                                                          1. Option maps are usually shorter in languages with map literals.
                                                                                                                          2. Option maps are data structures, not code. They’re easier to store and read from files. You can put them in databases or exchange them across the network. Over and over again I see boilerplate code that sucks in JSON and calls a builder fun for each key. This is silly.
                                                                                                                          3. Builders in most languages (perhaps not Rust!) require an explicit freeze/build operation because they’re, well, mutable. Or you let people clobber them whenever, I guess. :-/
                                                                                                                          4. Option maps compose better. You can write functions that transform the map, or add default values, etc, and call a downstream function. Composing builders requires yielding the builder back to the caller via a continuation, block, fun, etc.
                                                                                                                          5. Option maps are obviously order-independent; builder APIs are explicitly mutating the builder, which means the order of options can matter. This makes composition in builders less reliable.

                                                                                                                          Why not use option maps everywhere? I suspect it has to do with type systems. Most languages only have unityped maps where any key is allowed, but options usually have fixed names and specific but heterogenous types. The option map above has booleans, integers, and enums, for example.

                                                                                                                          In languages like Java, it’s impossible to specify type constraints like “This map has a :foo? key which must be a boolean, and has a :mode key that can only be one of these three values”. Using a builder with explicit type signatures for each function lets you statically verify that the caller is using the correct keys and providing values of the appropriate type.^

                                                                                                                          Of course, all this goes out the window when folks start reading config files at runtime, because you can’t statically verify the config file, so type errors will appear at runtime anyway, but you can certainly get some static benefit wherever the configuration is directly embedded in the code.

                                                                                                                          ^Know what a heterogenous map is in Java? It’s an Object! From this perspective, builders are just really verbose option maps with static types.
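
                                                                                                                          For comparison, one way to get some option-map ergonomics while keeping static types in Rust is a plain options struct with defaults and struct-update syntax. This is only a sketch; the Options struct, its fields, the Mode enum, and with_high_priority are made-up names for illustration:

                                                                                                                          ```rust
                                                                                                                          // Hypothetical enum standing in for SOME_ENUM.
                                                                                                                          #[derive(Debug, Clone, Copy, PartialEq)]
                                                                                                                          enum Mode {
                                                                                                                              Normal,
                                                                                                                              Strict,
                                                                                                                          }

                                                                                                                          // The "option map": fixed keys, each with its own static type.
                                                                                                                          #[derive(Debug, Clone, PartialEq)]
                                                                                                                          struct Options {
                                                                                                                              priority: u32,
                                                                                                                              mode: Mode,
                                                                                                                              foo: bool,
                                                                                                                              bar: bool,
                                                                                                                          }

                                                                                                                          impl Default for Options {
                                                                                                                              fn default() -> Self {
                                                                                                                                  Options { priority: 0, mode: Mode::Normal, foo: false, bar: false }
                                                                                                                              }
                                                                                                                          }

                                                                                                                          // Options are plain data, so a function can transform them and
                                                                                                                          // return a new value for a downstream caller.
                                                                                                                          fn with_high_priority(opts: Options) -> Options {
                                                                                                                              Options { priority: 10, ..opts }
                                                                                                                          }

                                                                                                                          fn main() {
                                                                                                                              // Only non-default keys are spelled out; their order is irrelevant.
                                                                                                                              let opts = Options { mode: Mode::Strict, foo: true, ..Options::default() };
                                                                                                                              let boosted = with_high_priority(opts.clone());
                                                                                                                              println!("{:?}\n{:?}", opts, boosted);
                                                                                                                          }
                                                                                                                          ```

                                                                                                                          The compiler rejects unknown keys and wrongly typed values, which is the static check a builder provides, while the value itself stays an ordinary data structure that can be cloned, compared, and passed around.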

                                                                                                                          1. 1

                                                                                                                            I agree that it’s a shame to sacrifice the composability of option maps. I would prefer a builder API which is sugar on top of merging option maps, with an easy way of getting to the option maps when I want that composability.

                                                                                                                            You can also have builder APIs which are pure and return new objects instead of self. In Rubyland, ActiveRecord’s Relation API is like this, which is great because intermediary results can be shared:

                                                                                                                            posts = Post.where(user_id: 1).order("published_at DESC")
                                                                                                                            latest_posts = posts.limit(10)
                                                                                                                            favourites = posts.where(favourite: true)

                                                                                                                            This provides one part of the composability you get from option maps, but not all of it. Unfortunately I don’t think the ActiveRecord::Relation API is built in a way that lets you build up relations with option maps when you want.

                                                                                                                          2. 6

                                                                                                                            I’d argue that named parameters solve the opaque-list-of-arguments problem with much less complexity.

                                                                                                                            1. 6

                                                                                                                              As with all things, it depends on what kind of complexity. It increases complexity in the language for a decrease in complexity in your code. This may or may not be worth it, depending on how much you increase complexity in the language.

                                                                                                                              1. 4

                                                                                                                                less complexity

                                                                                                                                Questionable; have you seen Python’s positional/named parameter assignment rules? Granted, they’d be much simplified by killing the *args and **kwargs constructs, but at a language level, named parameters are definitely more complicated. On the other hand, they do make life somewhat simpler for language users. It’s a tradeoff.

                                                                                                                                Regardless, I think either is a perfectly acceptable solution to the problem.

                                                                                                                            1. -1

                                                                                                                              The article is very western-centric. Most of the world doesn’t do Daylight Saving Time anymore. Japan doesn’t, China doesn’t, Kazakhstan doesn’t, Russia doesn’t, even the US state of Arizona doesn’t.

                                                                                                                              Gosh, just look at Wikipedia, they do have a nice visual map up there – pretty much the whole world doesn’t do DST anymore! I’d say, might as well keep your clock at Beijing Time, no need to bother with UTC.

                                                                                                                              1. 19

                                                                                                                                The article does not recommend that one use DST; it simply advises that if one finds oneself operating servers in those locales (and I assure you there are a nontrivial number of servers in the US), it’s a good idea to avoid the hassle of DST.

                                                                                                                                Picking an arbitrary offset (e.g. Beijing time) works, but it does tend to complicate your time arithmetic somewhat. You must:

                                                                                                                                1. Ensure every server, regardless of its location, uses Beijing time. Ensure that new sysadmins are made aware of this choice and don’t assume it’s a bug. Ensure they don’t subsequently “fix” the problem by changing the server’s timezone to UTC without consulting the Powers That Be.
                                                                                                                                2. Ensure that every piece of software which assumes UTC as its default regardless of system timezone (databases, external APIs, etc) is configured to use Beijing time or is wrapped by an appropriate translation layer. Write tests to verify times roundtrip correctly. This is a great idea regardless of what TZ you choose. :)
                                                                                                                                3. Make sure to localize any use of a date library to Beijing time instead of accepting their default UTC behavior.
                                                                                                                                4. If you’re exchanging time information in, say, milliseconds since the unix epoch, and not a format that makes it clear what TZ you’re in, prepare for a fun time tracking down silent data corruption. Customers may or may not read your documentation. Customers may or may not implement your documentation correctly. If you do all internal time work in Beijing time, but expose UTC to customers, remember to always add the correct offsets exactly once and in the right direction.
                                                                                                                                5. If you exchange time information in a date format (e.g. ISO8601, W3C schema) that nominally includes TZ information and accidentally omit the TZ because someone decided a quick little strftime was sufficient for a one-off script, and subsequently interpret Beijing times as UTC times, prepare for all kinds of fun.
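To make pitfall 5 concrete, here is a minimal Ruby sketch using only the standard `time` library. The instant is arbitrary and the variable names are made up for illustration; the point is only that a timestamp with an explicit offset round-trips, while one formatted without an offset and later read as UTC is silently off by eight hours.

```ruby
require 'time'

# An arbitrary instant, chosen only for illustration.
instant = Time.utc(2014, 3, 9, 12, 0, 0)

# The same instant as shown on a Beijing wall clock (UTC+08:00).
beijing = instant.getlocal('+08:00')

# Safe: ISO 8601 with an explicit offset parses back to the same instant.
with_offset = beijing.iso8601
round_trip  = Time.iso8601(with_offset)

# Unsafe: a quick strftime that drops the offset...
naive = beijing.strftime('%Y-%m-%dT%H:%M:%S')
# ...later read back by code that assumes the string is UTC.
misread = Time.iso8601(naive + 'Z')

# How far the misread timestamp is from the real instant, in hours.
skew_hours = (misread - instant) / 3600.0
```

Nothing raises here, which is what makes the mistake hard to track down: `misread` is a perfectly valid time, just not the right one.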

                                                                                                                                I’m sure I speak for everyone here when I assure you that I have never made any of these mistakes personally and they have never cost me weeks of my life. ;-)

                                                                                                                              1. 6

                                                                                                                                I think extrapolating from “my program got slower with threads (and there was an easy fix)” to “most programs get slower with threads” is quite the leap.

                                                                                                                                1. 10

                                                                                                                                  I think the point is more: “it is easy to get multi-threading wrong and hurt performance in ways you may not expect if you’re unfamiliar with how multi-threading works.”

                                                                                                                                  1. 10

                                                                                                                                    Multi-threaded programs can, and very often do, run much more slowly than the equivalent single-threaded program.

                                                                                                                                    The point that I was trying to make is that Amdahl’s law gives us the wrong intuition about the performance of multi-threaded programs. The worst case of Amdahl’s law is a wash: the multi-threaded code runs in the same time as the equivalent single-threaded code. Unfortunately, that doesn’t match reality. In the real world, poorly written (or poorly optimized) multi-threaded code runs slower than the equivalent single-threaded program.
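For reference, Amdahl's law fits in a few lines of Ruby. The function name `amdahl_speedup` is made up here. The formula has no term for contention or context switches, so its predicted speedup can never drop below 1.0.

```ruby
# Amdahl's law: predicted speedup on n workers when a fraction p
# (between 0.0 and 1.0) of the work can run in parallel.
# The function name is illustrative.
def amdahl_speedup(p, n)
  1.0 / ((1.0 - p) + p / n.to_f)
end

fully_serial    = amdahl_speedup(0.0, 16)   # nothing parallelizes
mostly_parallel = amdahl_speedup(0.95, 16)  # 95% parallelizes

# The smallest prediction over a range of parallel fractions.
worst_prediction = [0.0, 0.25, 0.5, 0.75, 1.0].map { |p| amdahl_speedup(p, 16) }.min
```

A measured speedup below 1.0 therefore comes from costs the model leaves out, not from the serial fraction it describes.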

                                                                                                                                    That doesn’t mean that threads are bad, just that they aren’t a magic ointment that makes all programs faster. If there is contention, especially if it’s contention that requires a context switch, they can make code go slower. Sometimes shockingly so.

                                                                                                                                    The second thing I was trying to talk about is how modern Linux has some very cool tools for tracking down these kinds of performance problems. perf (or perf-events) is extremely powerful, and combines a lot of what you get from profilers and strace into one package with much less runtime overhead. In addition, its ability to do system-wide profiling is very handy for cross-process interactions. Many other operating systems have equivalent, and some better, tools.

                                                                                                                                    1. 6

                                                                                                                                      In the real world, poorly written (or poorly optimized) multi-threaded code runs slower than the equivalent single-threaded program.

                                                                                                                                      I’ve done a lot of concurrent programming over the last four years, and this has almost never been my experience working with Erlang, Clojure, and java.util.concurrent, but YMMV I guess. I tend to see sublinear scaling owing to unexpected locks in the stdlib (hi Integer.parseInt), or known synchronization points like interning and queue contention, but I don’t think I’ve ever hit a real-world computational problem where I haven’t been able to get a ~4-10x speedup out of a 16-core box by slapping a threadpool on it. Usually takes a profiler to get to that near-linear domain though.

                                                                                                                                      1. 2

                                                                                                                                        It was harder to get useful behavior out of multithreading in the bad old C/C++ days where there was heavy reliance on out-of-process locks. People know how to do things better than lock-spaghetti now.

                                                                                                                                  1. 2

                                                                                                                                    Isn’t this just a fancy ircd?

                                                                                                                                    1. 9

                                                                                                                                      Excuse me but how is this better than Hadoop?

                                                                                                                                      1. 1

                                                                                                                                        best comment on lobsters to date

                                                                                                                                      2. 8

                                                                                                                                        I suppose you could look at it that way, but I think there is more value than just that…

                                                                                                                                        The main point is accessibility to non-technical team members, given the better UX of tools like this. Even being a developer myself, I don’t think I’d like IRC much without IRCCloud these days.

                                                                                                                                        With Slack and similar services, there are also lots of integrations into other services, like GitHub, etc. You could set up bots on IRC for these things, but these services make it more accessible.

                                                                                                                                        There are many other competitors in this space:

                                                                                                                                        • FlowDock
                                                                                                                                        • CampFire
                                                                                                                                        • HipChat

                                                                                                                                        I don’t use Slack myself, but have used FlowDock at a previous company.

                                                                                                                                        1. 4

                                                                                                                                          With a decent mobile client and ‘session synchronization’ by default.

                                                                                                                                          1. 2

                                                                                                                                            not to mention a built-in bouncer

                                                                                                                                          2. 3

                                                                                                                                            I was of the same opinion initially, but just started using this on my team for the last week and the integrations/sync/device support are pretty good – combining a lot of what we were using IRC and Google Hangouts separately for.

                                                                                                                                          1. 27

                                                                                                                                            Whether exactly-once delivery is possible depends a lot on what you mean by “message”, “exactly-once”, and “delivery”.

                                                                                                                                            First, ‘atomic broadcast’ IS possible. It is practically possible (and done frequently) to build a distributed system where multiple nodes process the same messages in the same order, all exactly once. This is behind the classic “distributed state machine” approach to building fault-tolerant systems. See http://dl.acm.org/citation.cfm?id=98167 for a classic paper in the area and http://dl.acm.org/citation.cfm?doid=1041680.1041682 for more information on the general topic. In short: building a totally ordered stream of messages and delivering them in the same order to multiple nodes is not only possible, but done frequently.
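A minimal sketch of the replicated state machine idea, assuming a toy state (one integer) and two made-up operations. It shows only why a single agreed order is enough for replicas to agree. The hard part, building that order under failures, is not modeled.

```ruby
# A deterministic state machine: the next state depends only on the
# current state and the operation. Operation names are illustrative.
def apply(state, op)
  kind, arg = op
  case kind
  when :add    then state + arg
  when :double then state * 2
  else raise ArgumentError, "unknown op #{kind.inspect}"
  end
end

# One totally ordered log, as an atomic broadcast layer would deliver it.
log = [[:add, 3], [:double], [:add, 1]]

# Three replicas each apply every entry exactly once, in log order.
replicas = Array.new(3) { log.reduce(0) { |state, op| apply(state, op) } }

# A replica that saw the same entries in a different order diverges.
out_of_order = log.reverse.reduce(0) { |state, op| apply(state, op) }
```

All three replicas end at 7, while the reordered one ends at 5. That divergence is why the ordering guarantee matters.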

                                                                                                                                            So far so good, but there are two big caveats here. One is that getting this right requires significant coordination, which comes with both availability and latency costs. The second is that it’s not really what people mean when they say “message delivery”. Most people mean that each message gets delivered once to one consumer, which does something with that message that has side effects. That becomes trickier, because we need to start talking about failure tolerance.

                                                                                                                                            Consider the system where the queue hands the message off to the consumer, and does a handshake that makes both agree that the message has been handed off. Now, the consumer goes off and does something with that packet that has side effects: it changes the world in some way. Finally, the consumer once again runs a protocol which makes both it and the queue agree that the message has been processed. What happens when the consumer fails?

                                                                                                                                            apy’s post gives two of the possibilities for handling that case. There are others, but they aren’t any better.

                                                                                                                                            The core problem here is that exactly once delivery is fundamentally at odds with fault tolerance. Exactly-once delivery and processing fundamentally requires that the act of processing, and hence knowledge about the fact the processing happened, is kept at just one place in the system. If that one place fails, the system needs to reconstruct that fact, but has no way to do so. It then needs to decide between re-delivering the message (and possibly having it processed multiple times) or dropping the message (and possibly having it never processed).

                                                                                                                                            Ok, so it’s impossible. Where does that leave us? It should be pretty obvious to you that many real-world systems rely on exactly-once processing of tasks and messages. How can they do that if it’s impossible?

                                                                                                                                            Think about Bob, who runs a pizza shop with online ordering. When people order from Bob, their orders go into Bob’s persistent queue. Each of Bob’s workers takes a pizza order off the queue, bakes it, delivers it, and goes back to the queue. Occasionally one of Bob’s workers gets bored and leaves early, in which case Bob gives the order to a different worker. Sometimes, this means that multiple pizzas arrive at the customer’s house (and never fewer than one pizza). On arriving, the pizza delivery guy asks the home owner if they have received a pizza with that order ID before. If the home owner says yes, the pizza guy takes the duplicate pie with him. If not, he leaves the pie. Each home owner gets exactly one pie, and everybody is happy.

                                                                                                                                            Short version without pizza: exactly-once delivery is impossible. Exactly-once processing of messages is possible if the processing can be made idempotent.
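The pizza protocol can be sketched in Ruby. The `Customer` class and the order data are invented for illustration. The sketch also assumes that the customer's memory of order IDs survives failures, which a real system would have to arrange.

```ruby
require 'set'

# The customer's door: remembers which order IDs it has already accepted.
class Customer
  attr_reader :pies

  def initialize
    @seen = Set.new
    @pies = []
  end

  # Idempotent: a repeat delivery of the same order ID is turned away.
  def deliver(order_id, pie)
    # Set#add? returns nil when the ID was already present.
    return :duplicate unless @seen.add?(order_id)

    @pies << pie
    :accepted
  end
end

customer = Customer.new

# At-least-once delivery: order 42 arrives twice because a worker left
# early and the order was handed to someone else.
attempts = [[42, 'margherita'], [42, 'margherita'], [43, 'funghi']]
results  = attempts.map { |order_id, pie| customer.deliver(order_id, pie) }
```

Delivery happened three times, but the customer ends up with one pie per order.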

                                                                                                                                            1. 16

                                                                                                                                              I think mjb is exactly right. To expand on this a bit:

                                                                                                                                              Implementations of distributed replicated state machines in the literature generally assume that operations, once received by a node, are atomically and durably logged to disk. Real disks are not so reliable, which often entails some degree of log replaying, where operations are journaled before being applied to some state machine and applied again to recover from a checkpoint in the event of failure. Moreover, running an operation on multiple replicas is assumed to be safe: if the operation does something like “Lower Gertrude one meter deeper into the volcano”, executing it on one versus three replicas could mean the difference between a successful sampling expedition and a very unhappy geologist.

                                                                                                                                              Both of these constraints lead us to a hand-wavy notion that operations must be in some sense idempotent in the real world, and on each replica, they have to transform a state deterministically. These properties are key to crash recovery, but not all functions satisfy these properties.

                                                                                                                                              What people generally mean by “exactly once delivery” of a message is something like “This function will be invoked atomically and exactly once.” But we know this property is not, in general, satisfiable. Consider:

                                                                                                                                              def lower
                                                                                                                                                run_winch :counterclockwise, 1
                                                                                                                                              end

                                                                                                                                              Now imagine trying to call this function once on a single node, and knowing if we crash, whether the lowering has or has not occurred:

                                                                                                                                              log :lowered

                                                                                                                                              If we crash between logging and lowering, Gertrude remains a meter too high to grab her sample. What if, instead, we try

                                                                                                                                              log :lowered

                                                                                                                                              Now if a crash occurs between lowering and logging, the computer thinks that Gertrude still needs to go a meter deeper, even though she’s now at the correct altitude. The geologist and winch engineer wind up having a tense conversation punctuated by the odor of Gertrude’s burned boots. They decide instead to augment the lowering process with some extra information:

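                                                                                                                                              def lower(height_above_lava)
                                                                                                                                                current_height = gertrude.rangefinder.height
                                                                                                                                                run_winch :counterclockwise, current_height - height_above_lava if current_height > height_above_lava # only the remaining distance
                                                                                                                                              end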

                                                                                                                                              Together, they’ve modified the task itself so that it is idempotent. Notice that they had to couple the idempotency of this function to the state it manipulates–e.g. Gertrude’s sensor package–so that it is safe to call multiple times.
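A toy simulation of the difference, with a made-up `Rig` class standing in for the winch and rangefinder (there is no real winch API here). It replays each operation twice, as crash recovery might, and compares where Gertrude ends up.

```ruby
# A toy stand-in for Gertrude's rig: one height, changed only by the winch.
# Everything here is illustrative.
class Rig
  attr_reader :height

  def initialize(height)
    @height = height
  end

  # Non-idempotent: every call lowers one more meter.
  def lower_relative
    @height -= 1
  end

  # Idempotent: measure first, then pay out only the remaining distance.
  def lower_to(height_above_lava)
    remaining = @height - height_above_lava
    @height -= remaining if remaining > 0
  end
end

replayed = Rig.new(10)
2.times { replayed.lower_relative }  # recovery replays the operation
relative_height = replayed.height    # one meter deeper than intended

targeted = Rig.new(10)
2.times { targeted.lower_to(9) }     # replaying is harmless
absolute_height = targeted.height
```

The relative version ends at 8 meters and the targeted version at 9, so only the second is safe to replay.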

                                                                                                                                              tl;dr: In any message queue, or any transactional system in general, we cannot provide atomicity guarantees for arbitrary side-effecting functions. Those functions must be carefully designed to allow for repeated invocation, because repeated delivery is implicit in crash recovery where some state may be lost.

                                                                                                                                              1. 1

                                                                                                                                                I think this covers the case of the subscriber client pretty well. Does this also cover the case of the broker also (meaning that we assume that a queue is a side-effecting data structure)?

                                                                                                                                                1. 3

                                                                                                                                                  It’s not clear to me that you can think about a broker without clients as a meaningful message queue.

                                                                                                                                              2. 1

                                                                                                                                                So exactly-once is possible if there isn’t an in-order requirement? (I assume that’s what you mean by requiring idempotence)

                                                                                                                                                1. 5

                                                                                                                                                  No, perhaps I explained poorly.

                                                                                                                                                  Talking about idempotence was trying to explain how systems typically get around the problem of exactly-once being impossible. Basically, you embrace the fact that you can’t apply every operation exactly once, so you design for at-least-once. If you make your operations idempotent, then their effects end up being applied exactly once.

                                                                                                                                                  The goal of systems designed like this is delivery at-least-once (for completeness) and approximately-once (for efficiency). Idempotence then gives exactly-once effects at the cost of a little bit of efficiency. Obviously the challenge is designing operations that are idempotent (and commutative and associative if conditions require it).
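As a small illustration, set union is idempotent, commutative, and associative, so applying the same updates twice or in a different order leaves the same state. Plain counter increments are not idempotent and double-count on redelivery. The data below is invented for the example.

```ruby
require 'set'

# Each update adds one element to a shared set.
updates = [Set[1], Set[2], Set[3]]

# Merge a sequence of updates into one state by set union.
merge = ->(seq) { seq.reduce(Set.new) { |state, update| state | update } }

in_order   = merge.call(updates)
duplicated = merge.call(updates + updates)  # every update delivered twice
reordered  = merge.call(updates.reverse)    # delivered in a different order

# Contrast: increments are not idempotent, so duplicates change the result.
increments = [1, 1, 1]
once       = increments.sum
twice      = (increments + increments).sum
```

All three merged sets are equal, while the counter reads 3 after one delivery of each increment and 6 after two.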

                                                                                                                                                2. 1

                                                                                                                                                  Exactly-once delivery and processing fundamentally requires that the act of processing, and hence knowledge about the fact the processing happened, is kept at just one place in the system

                                                                                                                                                  Let me expand this, tell me if I’m wrong.

                                                                                                                                                  My first instinct is to say “No it doesn’t because the processor (client) sends an ACK back to the broker”. But the problem with my statement would be that the ACK may not arrive.

                                                                                                                                                  For instance, the server that I send the ACK to dies right after sending me the message so it doesn’t receive my ACK. The failover server picks up where the other server left off and resends the message, in which case I get the message delivered a second time even though I’ve already successfully processed it.

                                                                                                                                                  My rebuttal is “what if the load balancer has knowledge of which servers have a replica of the session and can smartly choose one of them to route the traffic to?”. This took me a while to figure out but I eventually realized that there’s still a gap between when the server becomes unavailable and when the cluster (and LB) realize that it’s unavailable. So there’s still plenty of time where my ACK will get dropped without the failover server recognizing it. Again, the problem here is that a network failure appears as an unresponsive server, but so does a long garbage collection cycle (or many other normal, naturally occurring tasks).
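The lost-ACK scenario can be sketched as a toy broker loop. All names are illustrative, and the network is reduced to a single predicate that decides whether the ACK for a given attempt is dropped.

```ruby
# A toy broker that redelivers until it sees an ACK. `ack_lost` decides
# whether the ACK for a given attempt (numbered from 1) is dropped.
# Returns the number of delivery attempts made.
def deliver_until_acked(message, consumer, ack_lost)
  attempt = 0
  loop do
    attempt += 1
    consumer.call(message)                        # consumer processes it
    return attempt unless ack_lost.call(attempt)  # ACK arrived: done
    # The ACK never arrived. The broker cannot tell whether processing
    # happened, so it delivers again.
  end
end

processed = []
consumer  = ->(msg) { processed << msg }

# The first ACK is lost, e.g. the server died right after sending.
attempts = deliver_until_acked('order-42', consumer, ->(n) { n == 1 })
```

The broker makes two attempts and the consumer processes `order-42` twice, even though nothing went wrong on the consumer's side.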

                                                                                                                                                  1. 2

                                                                                                                                                    In order to solve exactly-once delivery by adding another node…first you must solve exactly-once delivery. Just, inductively, if it’s impossible to solve with N nodes, it will be impossible to solve with N + 1 nodes.

                                                                                                                                                    EDIT: Also note that this problem is not just queues, it’s any communication. Exactly-once HTTP requests are impossible as well.