Threads for dallas

    1. 1

      nice post =)

    2. 2

      depending on when it would be I could attend / maybe provide a space.

      1. 30

        This response isn’t a denial. I think folks should notice that. He goes out of his way to diminish the author without actually denying it. He’s mad about the post, not that he’s falsely accused, because the post is true.

        This response is also filled with red flags:

        • Saying Tom didn’t want money and then saying all future donations will be split with Tom is a weird contradiction.
        • The quote “work on the advancement of requests” seems like a way to differentiate between maintenance (which he wasn’t really doing while others were) and “advancement” (which is whatever he’s doing).
        • Including the news that the library will be changing its backend sounds like one of those sudden, made-up decisions people announce to make their accuser seem unqualified. How interesting the timing on that!
        • Talking about the small set of “real collaborators”, which excludes someone he explicitly says he was collaborating with, is gaslight-y.
        • And “just don’t fucking work with me” has such a long history of being said by people who really did awful things and don’t want to admit it.

        1. 3

          Saying Tom didn’t want money and then saying all future donations will be split with Tom is a weird contradiction

          But Tom is not njs.

        2. 1

          Including the news that the library will changing its backend sounds like one of those sudden, made-up decisions people do try and make their accuser seem unqualified.

          You make some good points. In terms of timing, I feel like this was mentioned ahead of PyCon on an episode of Talk Python, but I was only half-listening to that the first time.

      2. 24

        All that being said, I’m not sure why this person feels the need to attack my character, including curating a list of quotes (what?) from “collaborators”.

        I’d just like to point out that Kenneth has lists of quotes…about himself…on his website.

        1. 0

          Kenneth has lists of quotes…about himself…on his website.

          While I’m in no way defending Kenneth or his actions, mocking someone for stating their opinions (in quote or any form) on their own website is not in the spirit of engineering or science. If you feel the need to be petty, please find another place to dunk on people.

          1. 6

            I was highlighting the irony of the journal entry expressing incredulity about nj’s inclusion of a list of quotes from collaborators, as KR knows all about including quotes from “collaborators” (or sycophants, everyone can make up their own mind).

            As far as “dunking on people”, I’d suggest that you are the one who is attempting to do so, with your virtue signalling and calling me petty.

      3. 3

        It always amazes me (and scares me) how differently people perceive reality (if that is even an achievable thing) and how the same situation can be read completely differently by two different brains. It is super scary to me. In this case I believe neither of them had anything malicious going on, and still, both of them have a completely different grasp of the situation.

        1. 9

          This is why I think the original article was bad form. I know neither of the people involved. I’ve never even heard of them. I wouldn’t know who to believe even if I knew them.

          Tag this one as “call out culture.” If there’s something to be done, it should probably be done within that community and with discretion, precisely because there are two sides to every story and people are biased toward the first/best expositor regardless of whatever actually happened.

          I think it would be great if the mods banned personal call out articles on this basis. And, again, I know neither of these people. I’m not in the Python community.

          1. 10

            it should probably be done within that community and with discretion, precisely because there are two sides to every story and people are biased toward the first, best expositor.

            I don’t disagree in general, but how do you do that in the context of an open source community? There is no real central authority, and people can essentially just do what they want.

            1. 18

              I actually can contextualize Nathaniel’s post with my own interactions with reitz (which were a lot less involved), and they verify my impression.

              So I am glad Nathaniel posted this. It helps me steer clear of unproductive conflicts in the future.

              A few helpful and engaged members of the python community have signaled that it matches some of their observations.

              If you keep such things private and secret, it is hard to go through with community actions (like removing someone from boards, etc.). If you make it public discourse, people complain about character assassination or whatever. At the end of the day I believe in a victim’s right to discuss their case publicly if they want to.

        2. 2

          I do think that’s true, but I think that one of the author’s central points, and part of the reason I posted this, is that it’s important to be aware that when money is involved there is a whole different level of accountability that comes into play.

          This is why the legal system exists. This is why scrupulously detailed contracts arbitrated by lawyers exist.

          Moreover, this is why foundations like the PSF exist - they handle the ‘dirty’ work of distributing money in a way that’s free of legal entanglement and less likely to engender this kind of misunderstanding.

    3. 9

      this is truly sad, but it also proves how important projects like the internet archive are for humanity.

      1. 12

        The question is, who should pay for those archives?

        Sweden has a long-standing law that each book published here has to provide a copy to KB (the Royal Library). In the 1920s, with the rise of radio and movies, this law was expanded to sound and video too.

        All this stuff is a massive boon to researchers and part of our cultural heritage. But it comes at a not insignificant cost.

        1. 3

          I would say that is a question, not the question. =)

          I would also argue that governments are probably not the right people/entities to be archiving raw content for free and public use, as history has shown us time and again why that doesn’t work. In my limited dealings with the internet archive, it seems as though they are funded to a point where the mission is alive and well, although I’m sure they would say more would be better.

          1. 9

            I would also argue that governments are probably not the right people/entities to be archiving raw content for free and public use as history has shown us time and again why that doesn’t work.

            how so?

            1. 6

              I’m also confused by that comment. I always generally thought that the Library of Congress was fairly successful. Unless GP was speaking about spans of multiple millennia, in which case I doubt a company dedicated to preserving anything would outlive most nation states.

            2. 2

              I can’t speak for dallas, but a historical precedent that supports his position is the decline of the Library of Alexandria. The Nazi book burnings are a more intentional example.

              That said, distributed & immutable archival of documents could make it so that it doesn’t matter as much who the archiving entities are.

              1. 4

                This is an argument for distribution, not necessarily against state management. Most larger countries have rules similar to Sweden, which practically means that all media published in multiple states are automatically archived multiple times.

                Considering that archiving is a task that most government organisations have the most experience and practice in, I’m hard pressed to throw that idea out. There’s quite a high bar to reach to even archive better than the most underfunded of government orgs.

              2. 1

                I can’t speak for dallas, but a historical precedent that supports his position is the decline of the Library of Alexandria. The Nazi book burnings are a more intentional example.

                I really don’t know what you mean. This is an argument that governments are not the best entities to archive and preserve works?

          2. 1

            OK conceded, a question.

            Maybe ISPs could just pledge a small percentage of revenue to fund the Internet Archive, or have a formal agreement to donate hardware and bandwidth. But ultimately, it’s going to be a hard corporate decision to pay to host some punk band’s 10-year-old songs about vomit.

            1. 2

              yah, that’s not a bad idea. =)

      2. 7

        When music piracy was thriving, even the most obscure albums were distributed to multiple nodes and were readily available on various networks. This included music that was initially released by (usually underground) artists for free. In the age of streaming services, most people have deleted their old collections or never migrated them from old PCs.

        Moreover, as streaming services grow in popularity, I observe a loss of interest in obscure/rare music. Now many people only want the “new, fresh and trendy”: not only pop music made by large corporations, but indie and even outright underground projects too, though only those that by random factors reached popularity on the internet, and that popularity lasts for a very short time. Myspace, being a streaming service (if I understand it right; I never figured out how to use it, with its arcane UI), could be a large company now, but missed this opportunity, mostly due to horrible UX and lack of focus.

    4. 6

      Am I the only one who happily uses both Macs and GNU/Linux computers?

      1. 10

        Nope, one MUST use one or the other, never both, or even a third at the same time. Can only ever be either/or!

        /s obviously but most of these posts are starting to get tiring

        1. 2

          It is not like there are no benefits to using a single platform. For example I use Linux, NixOS specifically, just so I can configure everything–from kernel to user packages to development projects–in a declarative fashion. Good luck doing that with OSX (which I use solely for building iOS projects).

          1. 1

            I use nix on OS X; it’s not that different, to be honest. Not sure I agree with this overall sentiment, even as a NixOS and macOS (amongst others) user.

          2. 1

            nix + nix-darwin?

      2. 2

        I live out of a suitcase, so I don’t want to carry two machines around with me. I also don’t want to run two operating systems, because disk space is at a premium already on my MacBook Air. I have to garbage collect my Nix store and rebuild my Docker build slave (I need to compile for Linux arch) every week.

        I don’t need any MacOS GUI programs, but I do want the It Just Works thing when it comes to running the webcam (I don’t want to carry an external one), audio, battery life, etc.

        I could switch away from Apple hardware but so far I still think it’s best in class, although this is less true now that they’ve done away with scissor switches and MagSafe.

        1. 3

          With Nix and Docker on OS X it’s kind of like you’re running a few operating systems already. How many copies of the GNU userland are on your drive right now? 😜

          1. 1

            That’s true, it’s a sort of hybrid, but only because I’m not yet confident I can have the best hardware for me (basically a penultimate-generation MacBook Air, maybe with more disk space and memory) running NixOS with all the other conveniences working.

        2. 2

          I think I’m one of the most diehard pro-Linux and anti-Mac developers (just shy of the line between pragmatism and zealotry) I know, but if I had to live with one machine and one machine only, I guess I wouldn’t think twice about choosing a MacBook Pro. (I’ve been on Windows for games all my life, and have used Linux casually since 1998 and for work, full-time, for over 10 years.)

          1. 3

            I use MacOS personally (2013 MBP) and Windows at work. Linux in a VPS.

            I’m seriously considering going Windows for personal stuff when I need to update my computer.

            Why? Because while I love the Apple hardware, the latest iterations don’t look compelling to me. KB issues, no magsafe, no SD-reader (I do a lot of photography), lack of USB-A ports, and higher price. And I really don’t use any Apple-made software (apart from Terminal.app). It’s Chrome, Lightroom, VLC… all of which are basically cross-platform. I don’t feel I’ll miss much from OSX, and it will be much easier to game in Windows too.

            1. 2

              I tried switching to Windows from Mac at home on account of games, but my use case is really centered around audio. Turns out Windows is still in 3rd place for handling external audio devices (alongside more general web-browsing sound), so I’m probably going to take the NUC and sell it or Linuxify it, and get a Mac mini. I use the *nix side of the Mac (and Linux) constantly at work. Windows doesn’t have that side, and its reliability, although better in Win 10, isn’t quite where I need it to be when I just want to hit record.

      3. 1

        As an anecdote, I use three computers: a Mac, a Windows machine and a Linux box. Recently, I was going to give a talk at a meetup and I had two demos to show regarding p2p in the browser. Unfortunately, I did a poor job of testing them prior to the event day. I didn’t test my old demo, just the new one (my thought was: I didn’t change the old stuff, it should just work). Well, it didn’t. In the end, one demo worked only on the Mac and the other only on Windows. I gave the talk with two computers on the stand and switched from one to the other… sometimes using two computers is ok :-)

    5. 9

      because it forces you to use a phone number; the author works for facebook and recommends against GPG; they did not want people to use the F-Droid free app store; and they are against federating with people hosting their own servers

      1. 8

        Who works for Facebook? Not Moxie, as far as I know.

      2. 5

        moxie used to work for twitter, but no longer.

    6. 2

      this is awesome!

      1. 1

        Thanks for the kind words, glad you like it!

    7. -1

      it looks like the link is broken =(

      1. 1

        I just got to the page fine.

        1. 2

          weird it failed for me initially and i just assumed the link was bad, sorry for the confusion.

    8. -1

      [Title] /proc/<pid>/stat is broken

      This sounds serious! Is the content of the pseudo-file associating incorrect PIDs or parent PIDs to processes?

      Let’s continue…

      Documentation (as in, man proc) tells us to parse this file using the scanf family, even providing the proper escape codes - which are subtly wrong.

      So it’s a documentation issue…

      When including a space character in the executable name, the %s escape will not read all of the executable name, breaking all subsequent reads

      I have literally never encountered an executable with a space in the name, although it’s perfectly legal from a file name perspective. (I’ve been a Linux user since 1998).

      The only reasonable way to do this with the current layout of the stats file would be to read all of the file and scan it from the end […]

      So… let’s do this instead?

      The proper fix (aside from introducing the above function) however should probably be to either sanitize the executable name before exposing it to /proc/<pid>/stat […]

      Sounds reasonable to me.

      […], or move it to be the last parameter in the file.

      Thus breaking all existing implementations that rely on the documentation in man proc. But I guess it can be done in some backwardly compatible way?

      This problem could potentially be used to feed process-controlled data to all tools relying on reading /proc/<pid>/stat

      I can’t really parse this. Do you mean “affect” instead of “used”?

      In conclusion: I can’t see any evidence of the functionality of this proc pseudo-file being “broken”. You have encountered an edge case (an executable name with a whitespace character in it). You’ve even suggested a workaround (scan from the end). If you had formulated this post as “here’s a workaround for this edge case” I believe you would have made a stronger case.
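      For what it’s worth, the scan-from-the-end workaround is small enough to sketch. This is a hypothetical Python version (the article and man page discuss C’s scanf; the fields after comm are abbreviated here):

      ```python
      def parse_stat(data: str):
          """Parse the contents of /proc/<pid>/stat robustly.

          The comm field (2) is wrapped in parentheses and may itself
          contain spaces or ')' characters, so locate it via the LAST
          ')' in the data instead of splitting naively on whitespace.
          """
          pid, rest = data.split(" ", 1)
          open_paren = rest.index("(")
          close_paren = rest.rindex(")")           # scan from the end
          comm = rest[open_paren + 1:close_paren]
          fields = rest[close_paren + 1:].split()  # state, ppid, pgrp, ...
          return int(pid), comm, fields

      # A comm containing a space, like tmux's "tmux: client":
      pid, comm, fields = parse_stat("2972 (tmux: client) S 2964 2972 2964 0 -1")
      ```

      A naive `data.split()` on the same line would hand you `"(tmux:"` as the comm and shift every later field by one.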

      1. 5

        I have literally never encountered an executable with a space in the name

        Well, tmux does this, for example. But my primary concern is not “has it ever happened to me?” but “if it happens, what will my code do?”. As this is a silent failure (as in, the recommended method fails in a non-obvious way without indicating failure), most implementations take no action to guard against it. That, in my mind, counts as broken, and the least thing to do is to fix the documentation. Or expose single parameters in files instead of a huge conglomeration with parsing issues. Or… see above.

        So… let’s do this instead?

        I do, but only after I got sceptical while reading the documentation, ran some tests and had my hunch confirmed. Then I checked and found others making that very mistake.

        Thus breaking all existing implementations that rely on the documentation in man proc. But I guess it can be done in some backwardly compatible way?

        No, I don’t think so - except for introducing single-value files (and leaving /proc/<pid>/stat as it is).

        This problem could potentially be used to feed process-controlled data to all tools relying on reading /proc/<pid>/stat

        I can’t really parse this. Do you mean “affect” instead of “used”?

        Admittedly, English is not my first language, I do however think that sentence parses just fine. The discussed problem (which is present in several implementations based on the documentation), can potentially be used to inject data (controlled by the process, instead of the kernel) into third-party software.

        In conclusion: I can’t see any evidence of the functionality of this proc pseudo-file being “broken”.

        That depends on your view of broken - if erroneous documentation affecting close to all software relying on it with a silent failure does not sound broken to you, I guess it is not.

        You have encountered an edge case (an executable name with a whitespace character in it).

        I actually did not encounter it per se, I just noticed the possibility for it. But it is an undocumented edge case.

        You’ve even suggested a workaround (scan from the end).

        I believe that is good form.

        If you had formulated this post as “here’s a workaround for this edge case” I believe you would have made a stronger case.

        Maybe, but as we can see by the examples of recent vulnerabilities, you’ll need a catchy name and a logo to really get attention, so in my book I’m OK.

        1. 1

          Thanks for taking the time to answer the questions I have raised.

          The discussed problem (which is present in several implementations based on the documentation), can potentially be used to inject data (controlled by the process, instead of the kernel) into third-party software.

          Much clearer, thanks.

          On the use of “broken”

          I’m maybe extra sensitive to this as I work in support for a commercial software application. For both legal and SLA[1] reasons, we require our customers to be precise in their communication about the issues they face.

          [1] Service level agreement

          1. 1

            Followup: can you give a specific example of how tmux does this? I checked the running instances of that application on my machine and only found the single word tmux in the output of stat files of the PIDs returned by pgrep.

            1. 2

              On my Debian 9 machine, when starting a tmux host session, the corresponding /proc/<pid>/stat file contains:

              2972 (tmux: client) S 2964 2972 2964 […]

      2. 3

        “Thus breaking all existing implementations that rely on the documentation in man proc. But I guess it can be done in some backwardly compatible way?”

        I will never get the 100ms it took to read this sentence back….

        1. 1

          I dunno, maybe just duplicate the information at the end of the current format, in the author’s preferred format, and delimited by some character not otherwise part of the spec.

          It’s not trivial, though.

          That was my point.

      3. 1

        this was clearly overlooked when the api was designed; nobody is parsing that file from the end, and nobody is supposed to

        1. -1

          What was overlooked? That executables can have whitespace in their names?

          I can agree that this section of the manpage can be wrong (http://man7.org/linux/man-pages/man5/proc.5.html, search for stat):

          (2) comm  %s
              The filename of the executable, in parentheses.
              This is visible whether or not the executable is
              swapped out.
          

          From the manpage of scanf:

          s: Matches a sequence of non-white-space characters; the next
              pointer must be a pointer to the initial element of a
              character array that is long enough to hold the input sequence
              and the terminating null byte ('\0'), which is added
              automatically.  The input string stops at white space or at
              the maximum field width, whichever occurs first.
          

          So it’s clear no provision was made for executables having whitespace in them.

          This issue can be simply avoided by not allowing whitespace in executable names, and by reporting such occurrences as a bug.

          1. 8

            This issue can be simply avoided by not allowing whitespace in executable names, and by reporting such occurrences as a bug

            Ahhh, the Systemd approach to input validation!

            Seriously, if the system allows running executables with whitespace in their names, and your program is meant to work with such a system, then it needs to work with executables with whitespace in their names.

            I agree somewhat with the OP - the interface is badly thought out. But it’s a general problem: trying to pass structured data between kernel and userspace in plain-text format is, IMO, a bad idea. (I’d rather a binary format. You have the length of the string encoded in 4 bytes, then the string itself. Simple, easy to deal with. No weird corner cases).
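            A minimal sketch of that length-prefixed idea, in Python rather than kernel C and purely illustrative:

            ```python
            import struct

            def encode_field(s: bytes) -> bytes:
                # 4-byte little-endian length prefix, then the raw bytes.
                # No delimiter means spaces and newlines in the name
                # need no special handling.
                return struct.pack("<I", len(s)) + s

            def decode_field(buf: bytes, offset: int = 0):
                (length,) = struct.unpack_from("<I", buf, offset)
                start = offset + 4
                return buf[start:start + length], start + length

            # An executable name containing both a space and a newline:
            name, end = decode_field(encode_field(b"evil name\nls"))
            ```

            The reader never inspects the payload for terminators, which is exactly why the corner cases disappear.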

            1. 1

              I agree it’s a bug.

              However, there’s a strong convention that executables do not have whitespace in them, at least in Linux/Unix.[1]

              If you don’t adhere to this convention, and you stumble across a consequence to this, does this mean that a format that’s been around as long as the proc system is literally broken? That’s where I reacted.

              As far as I know, nothing crashes when you start an executable with whitespace in it. The proc filesystem isn’t corrupted.

              One part of it is slightly harder to parse using C.

              That’s my take, I’m happy to be enlightened further.

              I also agree that exposing these kind of structures as plain text is arguably … optimistic, and prone to edge cases. (By the way, isn’t one of the criticisms of systemd that it has an internal binary format?).

              [1] note I’m just going from personal observation here, it’s possible there’s a subset of Linux applications that are perfectly fine with whitespace in the executable name.

              1. 3

                I agree with most of what you just said, but I myself didn’t take “broken” to mean anything beyond “has a problem due to lack of forethought”. Maybe I’m just getting used to people exaggerating complaints (heck I’m surely guilty of it myself from time to time).

                It’s true that we basically never see executables with a space (or various other characters) in their names, but it can be pretty frustrating when tools stop working or don’t work properly when something slightly unusual happens. I could easily see a new-to-linux person creating just such an executable because they “didn’t know better” and suffering as a result because other programs on their system don’t correctly handle it. In the worst case, this sort of problem (though not necessarily this exact problem) can lead to security issues.

                Yes, it’s possible to correctly handle /proc/xxx/stat in the presence of executables with spaces in the name, but it’s almost certain that some programs are going to come into existence which don’t do so correctly. The format actually lends itself to this mistake - and that’s what’s “broken” about it. That’s my take, anyway.

                1. 2

                  Thanks for this thoughtful response. I believe you and I are in agreement.

                  Looking at this from a slightly more unusual perspective, how does the Linux system handle executables with (non-whitespace) Unicode characters?

                  1. 3

                    Well, I’m no expert on unicode, but I believe for the most part Linux (the kernel) treats filenames as strings of bytes, not strings of characters. The difference is subtle - unless you happen to be writing text in a language that uses characters not found in the ASCII range. However, UTF-8 encoding will (I think) never cause any bytes in the ASCII range (0-127) to appear as part of a multi-byte encoded character, so you can’t get spurious spaces or newlines or other control characters even if you treat UTF-8 encoded text as ASCII. For that reason, it poses less of a problem for things like /proc/xxx/stat and the like.

                    Of course filenames being byte sequences comes with its own set of problems, including that it’s hard to know which encoding should be used to display filenames (I believe many command line tools use the locale’s default encoding, and that’s nearly always UTF-8 these days) and that a filename can potentially contain an invalid encoding. Then of course there’s the fact that Unicode has multiple ways of encoding the exact same text, so in theory you could get two “identical” filenames in one directory (different byte sequences, same character sequence, or at least same visible representation). Unicode seems like a big mess to me, but I guess the problem it’s trying to solve is not an easy one.

                    (minor edit: UTF-8 doesn’t allow 0-127 as part of a multi-byte encoded character. Of course they can appear as regular characters, equivalent to the ASCII).
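                    That property is easy to sanity-check with a quick sketch: every byte of a multi-byte UTF-8 sequence has the high bit set.

                    ```python
                    # Multi-byte UTF-8 sequences use only bytes >= 0x80
                    # (lead bytes 0xC2-0xF4, continuation bytes 0x80-0xBF),
                    # so encoded text can never introduce spurious ASCII
                    # spaces, newlines, or slashes.
                    for ch in "ą漢🙂":
                        encoded = ch.encode("utf-8")
                        assert len(encoded) > 1
                        assert all(b >= 0x80 for b in encoded)
                    ```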

                  2. 1
                    ~ ❯ cd .local/bin
                    ~/.l/bin ❯ cat > ą << EOF
                    > #!/usr/bin/env sh
                    > echo ą
                    > EOF
                    ~/.l/bin ❯ chmod +x ą 
                    ~/.l/bin ❯ ./ą
                    ą
                    
              2. 2

                If you don’t adhere to this convention, and you stumble across a consequence to this, does this mean that a format that’s been around as long as the proc system is literally broken?

                Yes; the proc system’s format has been broken (well, misleadingly-documented) the whole time.

                As you note, using pure text to represent this is a problem. I don’t recommend an internal, poorly-documented binary format either: canonical S-expressions have a textual representation but can still contain binary data:

                (this is a canonical s-expression)
                (so "is this")
                (and so |aXMgdGhpcw==|)
                

                An example stat might be:

                (stat
                  (pid 123456)
                  (command "evil\nls")
                  (state running)
                  (ppid 123455)
                  (pgrp 6)
                  (session 1)
                  (tty 2 3)
                  (flags 4567)
                  (min-fault 16)
                  …)
                

                Or, if you really cared about concision:

                (123456 "evil\nls" R 123455 6 1 16361 4567 16 …)
                
          2. 3

            nobody is parsing that file from the end

            As an example the Python Prometheus client library uses this file, and allows for this.

    9. 8

      The core statement (in my view):

      When it comes to strategy, planning and so forth, they’ll go to the suit. Even though the engineer might be much better suited to discuss these topics.

      So the problem is the estimation

      P(Strategy | Suit) > P(Strategy | Tech)
      

      Humans are pretty good at picking up on regularities, so assuming that this is indeed a consistent pattern, there must be a reason for this judgement.

      • Assuming that most strategy people do wear suits, I’d guess that P(Suit | Strategy) is pretty large.
      • Assuming that most strategy people are not tech guys, I’d guess that P(Tech | Strategy) is pretty small.

      i.e.

      P(Suit | Strategy) >>  P(Tech | Strategy)
      

      then via Bayes:

      • P(Strategy | Suit) = P(Suit | Strategy) P(Strategy) / P(Suit)
      • P(Strategy | Tech) = P(Tech | Strategy) P(Strategy) / P(Tech)

      Now, assuming that there are fewer managers than engineers, i.e.

      P(Suit) <= P(Tech)
      

      we immediately get the “problematic” estimation above:

          P(Strategy | Suit) 
      =   P(Suit | Strategy) P(Strategy) / P(Suit) 
      >   P(Tech | Strategy) P(Strategy) / P(Suit) 
      >=  P(Tech | Strategy) P(Strategy) / P(Tech) 
      =   P(Strategy | Tech)
      

      In this case, we could say that:

      • There’s not enough engineers && suits
      • There’s too many suits doing strategy
      • There’s not enough engineers doing strategy
      • There should be more suits in tech companies (???)

      Modulo sloppy reasoning and faulty assumptions :)
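      Plugging in made-up numbers (all of these are assumptions, chosen only to sanity-check the chain of inequalities above):

      ```python
      # Hypothetical priors: fewer suits than techies, strategy is rare.
      p_strategy = 0.2
      p_suit, p_tech = 0.3, 0.7
      # Hypothetical likelihoods: P(Suit|Strategy) >> P(Tech|Strategy).
      p_suit_given_strategy = 0.8
      p_tech_given_strategy = 0.2

      # Bayes' rule, as in the derivation above.
      p_strategy_given_suit = p_suit_given_strategy * p_strategy / p_suit
      p_strategy_given_tech = p_tech_given_strategy * p_strategy / p_tech

      assert p_strategy_given_suit > p_strategy_given_tech
      ```

      With these numbers the posteriors come out around 0.53 versus 0.06, so the “go to the suit” estimation follows from the assumptions.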

      1. -4

        this is your first comment?

    10. 3

      Go measure the cost of real threads on a modern linux kernel, compare to Golang, rethink some of the beliefs you have been parroting from Golang dogma. It’s an antipattern to spin up 10k goroutines anyway.

      Erlang has a much better situation, but still, once you’re getting into the realm of high-throughput you shouldn’t be erlanging.

      Real threads are cheap enough for 99.9% of workloads, and incur a lot less CPU overhead for steady state execution under high concurrency. Don’t drink the kool-aid.
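      One crude way to measure it yourself (Python here; CPython threads are real kernel threads, though the numbers will of course differ from goroutines or raw pthreads in C):

      ```python
      import threading
      import time

      N = 1000  # number of kernel threads to spawn

      start = time.perf_counter()
      threads = [threading.Thread(target=lambda: None) for _ in range(N)]
      for t in threads:
          t.start()
      for t in threads:
          t.join()
      elapsed = time.perf_counter() - start

      print(f"spawned and joined {N} threads in {elapsed:.3f}s")
      ```

      On a modern Linux box this typically finishes in well under a second, which is the point: thread creation is rarely the bottleneck for mostly-idle workloads.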

      1. 3

        this is something i often see people ranting about in absolute contexts, which drives me crazy, although i have observed that high frequency thread creation/churn can be an issue for truly cpu bound workloads. i would say most people’s workloads are largely idle or IO bound, so having thousands of largely unused threads is nbd.

        i wish i had a nickel for every compute instance w/ a daily max cpu usage of less than 60%.

    11. 1

      this is a very exciting project (even if it is in erlang) and what a wonderful readme!

      1. 5

        (even if it is in erlang)

        Why such a hater?!

        1. 2

          I’m not! But I do think that erlang has a high barrier for potential contributors, it’s a great quirky language and I’m excited to test minuteman/learn more. =)

          1. 1

            You might be right, but I can’t think of a better language to have done Lashup in. In retrospect, I might have rather done Minuteman in C, but it’s not like C is much friendlier to contributors. :(.

    12. 6

      well done!

    13. 6

      This article is making me really nervous but I know I don’t have the distributed chops to prove it wrong.

      I’ll say this: when an author starts talking about probabilistic interpretations of the theorem and going on about cost-benefit analysis (seriously, why are we worked up about poor “administration interfaces” here?!) my BS needle starts twitching. And when they do that when an impossibility proof exists that shows element availability and atomic consistency are not both possible, it starts swinging around madly.

      The article reads like an awful lot of language lawyering around fairly well understood concepts, but I’m not sure what the motivations of the author are.

      1. 6

        Heh… Sigh. It reads like an attempt to illuminate, but a bad one. That seems worthwhile if it were shorter and clearer; I don’t think the concepts are actually all that well understood, unfortunately. At a previous job, after two months of arguing that Riak was the wrong choice for the company, I finally got through:

        Me: “What exactly is the benefit you’re hoping for from using a distributed technology? Uptime? Preventing data loss?”

        Them: “Yes, both of those.”

        Me: “Those are mutually-exclusive in our situation.”

        Them: “Oh… Maybe something else would be okay.”

        (And no, they aren’t inherently mutually exclusive, but the data was peculiar and merging later, after resolving a partition, wasn’t an option. I can’t go into it.)

        I definitely don’t want that to be read as an insult to the intelligence of the person involved; they were quite competent. It’s just that databases are a subject not all engineers actually know very much about, and distributed ones are a rather new technology in the scheme of things.

        It’s worth noting that not all distributed systems are databases, too, of course!

      2. 5

        That’s not what the impossibility proof says–he references that paper.

        “In 2002, Seth Gilbert and Nancy Lynch publish the CAP proof. CA exists, and is described as acceptable for systems running on LAN.”

        “If there are no partitions, it is clearly possible to provide atomic, available data. In fact, the centralized algorithm described in Section 3.2.1 meets these requirements. Systems that run on intranets and LANs are an example of these types of algorithms” [0]

        I don’t think CAP is very well understood. I think folks end up very confused about what consistent means, and what partition-tolerant means.

        I think this is pretty well researched. I’m not sure why cost-benefit analysis makes you nervous.

        1. 4

          James Hamilton of AWS says it best, I think:

          Mike also notes that network partitions are fairly rare. I could quibble a bit on this one. Network partitions should be rare but net gear continues to cause more issues than it should. Networking configuration errors, black holes, dropped packets, and brownouts, remain a popular discussion point in post mortems industry-wide.

          Gilbert & Lynch’s implicit assertion is that LANs are reliable and partition free; I can buy this in theory but does this happen in practice? When Microsoft performed a large analysis of failures in their data centers, they found frequent loss occurring that was only partially mitigated by network redundancy.

          But either way you make a fair point: CA models aren’t strictly precluded by that proof. I’m just not certain I’ve seen a network that is trustworthy enough to preclude partitions.

          1. 5

            Network partitions are not even remotely rare, honestly. LANs are actually worse culprits than the Internet, but both do happen.

            You already cited one of the better sources for it, but mostly I believe this because I’ve been told it by network engineers who I respect a lot.

            1. 6

              Even if network partitions were rare, I’ll tell you what aren’t (for most people): garbage collections. What I did not like about this post is that, over and over again, it just talks about network partitions and the actual networking hardware. But weird application-specific things happen as well that make a node appear unresponsive for longer than some timeout value, and these are part of the ‘P’ as well.

              In reality, I think CAP is too cute to go away but not actually adequate in talking about these things in detail. PACELC makes the trade-offs much clearer.

            2. 4

              LANs are actually worse culprits than the Internet

              Funny you mention that: over the past few days I’ve been fighting an issue with our internal network that has resulted in massive packet loss internally (>50% loss in some spikes), and ~0.5% to the Internet. That’s probably why this article raised my eyebrows - it’s my personal bugbear for the week.

              The culprit seems to have been a software update to a Palo Alto device that stopped playing nice with certain Cisco switches… plug the two of them together and mumble mumble spanning tree mumble loops mumble. The network guys start talking and my eyes glaze over. But all I know is that I’ve learned the hard way to not trust the network - and when a proof exists that the network must be reliable in order to have CA systems, well…

        2. 3

          I think some of the confusion comes from describing all node failures as network partitions. In reality “true” network partitions are rare enough (at least ones lasting long enough to matter to humans), but nodes failing due to hardware failure, operational mistakes, non-uniform utilization across the system, and faulty software deploys are sometimes overlooked in this context.

          i like the comment above “It’s worth noting that not all distributed systems are databases, too, of course!”, but i think this is also a matter of perspective. most useful systems contain state, isn’t twitter.com as a network service a distributed database? kind of neat to think about

      3. 3

        It’s not clear to me that the distinction the author makes between a CA and a CP system exists. He uses ZooKeeper as an example of a CP system, but the minority side of a network partition in ZooKeeper cannot make progress, just like his CA example. In reality, CP seems to be a matter of degree, not a boolean, to me. Why does a CP system that handles 0 failures have to be different from one that handles 2f-1?

        1. 1

          When the system availability is zero (not available at all) after a partition, you can claim both CP and CA (that’s the overlap between CP/CA).

          There are two corner cases when the system is not available at all:

          • the system does not even restart after the partition. You can claim CP theoretically. The proof’s definitions don’t prevent this formally. But it makes little sense in practice.

          • the system restarts after the partition and remains consistent. Both CP and CA are ok.

          But ZooKeeper is not concerned by these corner cases, because it is partly available during the partition.

          1. 9

            No, you can’t: a system which is not available during a partition does not satisfy A, and cannot be called CA. If you could claim both CA and CP you would have disproved CAP.

            1. 2

              CA means: I have a magical network without partitions. If my network turns out not to be that magical, I will be CP/AP and more likely in a very bad state: not fully available and not fully consistent.

              1. 7

                I’m responding to “When the system availability is zero (not available at all) after a partition, you can claim both CP and CA”. Please re-read Gilbert & Lynch’s definition of A: you cannot claim CA if you refuse to satisfy requests during a partition.

              2. 3

                But those magic networks do not exist, so how can a CA system exist?

                1. 1

                  :-) It exists until there is a partition. Then the most probable exit is to manually restore the system state, 2PC with heuristic resolution being an example.

                  Or, if you build a system for machine learning: 20 nodes with GPU, 2 days of calculation per run. If there is a network partition during these two days you throw away the work in progress, fix the partition and start the calculation process again. I don’t see myself waiting for the implementation/testing of partition tolerance for such a system. I will put it in production even if I know that a network partition will break it apart.

                  1. 2

                    That system is still CP. You are tolerating the notion of partitions, and in the case of a partition you sacrifice A (fail to fulfill a request–a job in this case) and restart the entire system for the sake of C.

                  2. 1

                    It exists until there is a partition.

                    If a system reacts to a partition by sacrificing availability - as it must, and you haven’t demonstrated differently - how can you claim it is CA?

                    If there is a network partition during these two days you throw away the work in progress, fix the partition and start the calculation process again. I don’t see myself waiting for the implementation/testing of partition tolerance for such a system. I will put it in production even if I know that a network partition will break it apart.

                    I feel like I’m in bizarro world.

                    1. 1

                      If a system reacts to a partition by sacrificing availability - as it must, and you haven’t demonstrated differently - how can you claim it is CA?

                      If the system sacrifices availability (it could also be consistency, or both), then there is an overlap between CA and CP. That’s what Daniel Abadi said 5 years ago: “What does “not tolerant” mean? In practice, it means that they lose availability if there is a partition. Hence CP and CA are essentially identical.”

                      The key point is that forfeiting partitions does not mean they won’t happen. To quote Brewer (in 2012) “CA should mean that the probability of a partition is far less than that of other systemic failures”

                      That’s why there is an overlap. I can choose CA when the probability of a partition is far less than that of other systemic failures, but I could still have a partition. And if I have a partition I will be either not consistent, or not available, or both, and I may also have broken some of my system invariants.

                      I’m sure it does not help you as I’m just repeating my post, and this part is only a repetition of something that was said previously by others :-(

                      Trying differently, maybe the issue to understand this is that you have:

                      • CAP as a theorem: you have to choose between consistency and availability during a partition. There are 3 options here:

                        • full consistency (the CP category)

                        • full availability (the AP category)

                        • not consistent but only partial availability (not one of the CAP categories, but possible in practice, typically 2PC with heuristic resolutions: all cross-partition operations will fail).

                      • CAP as a classification tool with 3 options: AP/CP/CA. These are descriptions of the system. CA means you forfeited partition tolerance, i.e. a partition would be a major issue for the system you build.

                      And, in case there is any doubt: most systems should not forfeit partition tolerance. I always mention 2PC/heuristic because it is a production-proven exception.

          2. 1

            Could you rephrase your statement? I am having trouble parsing what you have said.

            1. 1

              the cr went away. let me edit.

          3. 1

            If we take your second case - as it’s the only real case worth discussing, as you note :-) - how can you claim the system is available?

            The system is CA under a clean network until time n when the network partitions. The partition clears up after m ticks. So from [1, n) and (m, inf) the system is CA, but from [n, m] it is unavailable. Can we really say the system maintains availability? That feels odd to me.

            Maybe it makes more sense to discuss this in terms of PACELC - a system in your second case has PC behavior; in the presence of a partition it’s better to die hard than give a potentially inconsistent answer.

            Having said all of this, my distributed systems skills are far below those of the commentators here, so please point out any obvious missteps.

            1. 1

              CA is forfeiting partition tolerance (that’s how it was described by Eric Brewer in 2000). So if a partition occurs it’s out of the operating range, you can forfeit consistency and/or availability. It’s an easy way out of the partition tolerance debate ;-). But an honest one: it clearly says that the network is critical for the system.

              Maybe it makes more sense to discuss this in terms of PACELC - a system in your second case has PC behavior;

                Yep, it works. Daniel Abadi solved the overlap by merging CA and CP (“What does “not tolerant” mean? In practice, it means that they lose availability if there is a partition. Hence CP and CA are essentially identical.”). It’s not totally true (a CA system can lose its consistency if there is a partition, like 2PC w/ heuristic resolutions), but it’s a totally valid choice. If you make the same choice as Daniel in CAP, you choose CP for system 2 above. CA says “take care of your network and read the documentation before it is too late”.

      4. 3

        seriously, why are we worked up about poor “administration interfaces” here

        :-) Because I’ve seen a lot of systems where the downtime/data corruption was caused mainly by: 1) software bugs 2) human errors.

        I also think that a lot of people take partition tolerance for granted (i.e. “this system is widely deployed in production, so it is partition tolerant, as I’m sure everybody has network issues all the time, so I can deploy it safely myself w/o thinking too much about the network”). Many systems are not partition tolerant (whatever they claim). That’s why Aphyr’s tests crash them (data loss, lost counters, …), even if they are deployed in production.

        It does not mean they have no value. It’s a matter of priority. See Aphyr’s post on ES: imho they should plan for partition tolerance and implement crash tolerance immediately, for example, instead of trying to do both at the same time.

        I prefer a true “secure your network” rather than a false “of course we’re partition tolerant, CAP says anything else is impossible” statement (with extra points for “we’re not consistent so we’re available”).

      5. 3

        CAP tells you that you can’t have both C and A when a partition happens. Most people take that to mean you must choose one or the other and have a CP or AP system. But it’s worth remembering that you do have the option of making sure that partitions never[1] happen - either by making the system non-distributed or by making the communications reliable enough. And for some use cases that might be the correct approach.

        [1] In a probabilistic sense - you can’t ensure that a network partition never happens, but nor can you ensure that you won’t lose all the nodes of your distributed system simultaneously. Any system will have an acceptable level of risk of total failure; it’s possible to lower the probability of a network partition to the point where “any network partition is a total system failure” is an acceptable risk.
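        To make that footnote concrete, here’s a toy comparison (Python; the probabilities are hypothetical, and the independence assumption is doing a lot of work):

```python
def p_total_node_loss(p_node: float, n: int) -> float:
    """Probability that all n nodes fail in the same window,
    assuming node failures are independent (a strong assumption)."""
    return p_node ** n

# Hypothetical per-window probabilities, purely illustrative:
p_node_failure = 1e-3   # a single node dying in a given window
p_partition = 1e-10     # a partition on a heavily over-engineered network

n = 3
# If a partition is less likely than losing every node at once, then
# "any network partition is a total system failure" may be an acceptable risk.
print(p_partition < p_total_node_loss(p_node_failure, n))  # True
```

        With cheaper networking, p_partition climbs by orders of magnitude and the comparison flips, which is why this is rarely the smart choice.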

        1. 2

          I think it’s important to modify your statement a bit. What you have to do is ensure that in the face of a partition you remain consistent then try your darnedest to reduce the frequency of partitions. The distinction being you have control over what happens during a partition but not control over a partition happening.

          1. 4

            you have control over what happens during a partition but not control over a partition happening.

            I don’t think that this sharp distinction exists. You don’t have absolute control over what happens during a partition - to take an extreme example, the CPU you’re running on might have a microcode bug that means it executes different instructions from the one you intended. And you do have control - to the extent that you have control of anything - over the things that cause network partitions; you can construct your network (or pay people to construct your network) so as to mitigate the risks. It is absolutely possible to construct a network which won’t suffer partitions (or rather, in which partitions are less likely than simultaneous hardware failures on your nodes) if you’re willing to spend enough money to do so (this is rarely a smart choice, but it could be).

            1. 2

              I do not think byzantine faults really matter for this discussion; they are a whole other issue from partitions. But I do not think your response invalidates my point at all. Partitions are something that happens to you; how your program handles them is something you do.

    14. 3

      Building an etcd mesos framework so that it’s easier to run HA kubernetes clusters on top of mesos (or anything else that has made the decision to use etcd over zk or the mesos replicated log.) I’ve got it handling failover of the scheduler or the mesos master, and seamless recovery when up to (N-1)/2 etcd instances fail. Etcd mutates its membership by invoking RAFT itself, which is why it’s a little more interesting to recover from more than (N-1)/2 failures. Next I’ll be adding periodic backups to HDFS/S3/etc… so that the cluster can be restored in catastrophic scenarios. If up to N-1 failures occur, it will be able to just dump the current snapshot and spawn a fresh cluster using that as the seed.
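      For reference, the (N-1)/2 figure is just majority-quorum arithmetic; a small sketch (Python, illustrative only):

```python
def max_tolerable_failures(n: int) -> int:
    """A majority-quorum cluster (e.g. RAFT, as used by etcd) of n members
    needs a majority alive to make progress, so it tolerates
    floor((n-1)/2) member failures."""
    return (n - 1) // 2

for n in (1, 3, 5, 7):
    print(f"{n}-member cluster tolerates {max_tolerable_failures(n)} failure(s)")
```

      Which is why a 3-member etcd cluster rides out one lost member, while losing more than that needs the snapshot/reseed recovery path described above.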

      1. 1

        is it possible to run mesos under etcd instead of zk yet? the jira looks like it’s still open, but i would be willing to try a branch if it’s something that’s semi-functional

        1. 1

          It’s in progress currently, but I’m not sure how far along it is. There is also a desire to make mesos self-sufficient by handling its own leader election, but zk is going to be the most reliable option for the time being.

    15. 4

      saying that anything needs to die in a fire is a great way to not have your argument taken seriously

    16. 4

      Storm 0.9.2 + Kafka 0.7 + Cassandra 2.1.0-rc5 + Elasticsearch 1.3 cluster is now up-and-running in production, running against around 3,000 web traffic requests per second. Time to test it in more detail and make it fast!

      1. 1

        are these 3000 real requests per second? or is that just in benchmarks?

        1. 1

          real requests per second

      2. 1

        why deploy kafka 0.7 instead of 0.8.1?

        1. 1

          we want to upgrade to 0.8.1, but we currently use a Python driver we wrote for 0.7 and are in the midst of merging its functionality with an open source driver for 0.8.1

    17. 2

      What are you trying to create another lobster clone for?

      1. 11

        I’d assume they want to use it for something other than tech news.

      2. 4

        I, for example, have opened a lobster clone for Russian developers https://develop.re/

        1. 2

          that’s a great domain name

      3. 3

        We have a small dev community in Hawaii, and we deployed our own lobster clone because we wanted a place to share and discuss links/events. We also wanted something we could customize a bit (so that ruled out reddit, also reddit is generally too public for the type of discussion we wanted), so rather than NIH another link-sharing site we started with the lobsters codebase.