Threads for aphyr

  1. 3

    I’ve written a half-dozen libraries which had to deal with paginated APIs, and it’s always a little awkward threading the state through lazy-seq properly–this will be a welcome addition!
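
    The awkward part described here is carrying the "next page" token from one request to the next while still exposing a lazy sequence. A minimal sketch of that pattern, written in Python rather than Clojure; `paginate`, `fake_fetch`, and `FAKE_PAGES` are hypothetical names invented for illustration, not any library's API:

```python
from typing import Callable, Iterator, List, Optional, Tuple

# A page is (items, token for the next page, or None when there are no more).
Page = Tuple[List[str], Optional[str]]


def paginate(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Lazily yield items, fetching the next page only when it is needed."""
    token = None  # None asks for the first page
    while True:
        items, token = fetch_page(token)
        yield from items
        if token is None:
            return


# Hypothetical in-memory stand-in for a paginated API.
FAKE_PAGES = {
    None: (["a", "b"], "p2"),
    "p2": (["c"], "p3"),
    "p3": (["d", "e"], None),
}
calls = []  # records which tokens were requested


def fake_fetch(token: Optional[str]) -> Page:
    calls.append(token)
    return FAKE_PAGES[token]


if __name__ == "__main__":
    print(list(paginate(fake_fetch)))
```

    The generator keeps the token in a local variable, so callers never see it, and a consumer that stops early never triggers the remaining requests.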

    1. 5

      Fast or elegant. Writing Java in Clojure for higher performance is meh. This is what we get when trying to use too high a level of abstraction. It is really good that we have a simple way to observe performance and fix the bottlenecks though.

      1. 7

        The reality is that idiomatic Clojure without any optimizations is fast enough for the vast majority of situations. Clojure is already significantly faster than a lot of popular languages like Ruby or Python out of the box. So the number of times you actually need to do these kinds of optimizations is pretty small in most domains where Clojure is used. However, having the ability to tweak performance to this level is really handy when you do end up in a scenario where that’s needed. I’m very much a fan of writing code with maintainability in mind as the primary concern, and optimizing things on a case-by-case basis as the need arises. Clojure is hands down one of the best languages I’ve used in this regard.

        1. 1

          Clojure is already significantly faster than a lot of popular languages like Ruby or Python out of the box.

          Any sources, if you don’t mind my asking?

          1. 4

            Totally anecdotal, but my rewrite of the Riemann monitoring system from Ruby to Clojure improved throughput by something like two orders of magnitude right off the bat, and that was the first thing I ever wrote in Clojure. With a little work, we went from ~700 events/sec (in ruby) to ~10 million events/sec per node. Not an apples-to-apples comparison there–that’s years of optimization, batching and network tricks, and different hardware, but like… coming from MRI, the JVM was a game changer.

            1. 2

              Here’s a post from AppsFlyer discussing a significant performance boost when moving from Python to Clojure.

          2. 1

            I completely agree with you. My only problem is that it is very hard to sell Clojure to people, both clients and co-workers. Clojure and F# are the two default languages I use a lot for writing code for myself. Unfortunately teammates and clients force me to use Python, TS, C#, etc.

            1. 3

              My team’s been using Clojure for close to a decade now, and what I observed over that time is that developers experienced in imperative style often have a hard time getting into FP. Yet, when my team hires students who have little to no programming experience, they’re able to pick up Clojure very quickly and tend to love the language.

              I think the big problem is with mismatched expectations. People build up a set of transferable skills that allow them to quickly move from one language to another as long as those languages are within the same family. If you learn a language like Ruby then you can quickly pick up Python, Js, or Java. You’ll have to learn a bit of syntax and libraries, but how you approach code structure and solve problems remains largely the same.

              However, using a language like Clojure or F# requires a whole new set of skills that you wouldn’t pick up working with imperative languages. People tend to confuse this with FP being inherently more difficult as opposed to just being different.

        1. 4

          A small shout-out to the push-sum family of gossip systems, which you can use to obtain asynchronous, exponential-fast convergence for things like rate-limiting counters, quantiles, etc. https://manishyadav.dev/blog/gossip-push-sum-protocols
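
          To make the idea concrete, here is a small simulation of push-sum averaging. It is a sketch under simplifying assumptions: rounds are synchronous and messages are never lost, whereas the systems mentioned above are asynchronous. The function name `push_sum` and its parameters are made up for this example.

```python
import random
from typing import List, Sequence


def push_sum(values: Sequence[float], rounds: int, seed: int = 0) -> List[float]:
    """Simulate push-sum gossip; return each node's estimate of the average.

    Every node holds a pair (s, w). Each round it keeps half of the pair
    and sends the other half to one peer chosen uniformly at random.
    The ratio s / w at every node converges to the global average.
    """
    rng = random.Random(seed)
    n = len(values)
    s = [float(v) for v in values]  # running sums
    w = [1.0] * n                   # running weights
    for _ in range(rounds):
        new_s = [0.0] * n
        new_w = [0.0] * n
        for i in range(n):
            j = rng.randrange(n)  # target peer (may be the node itself)
            half_s, half_w = s[i] / 2, w[i] / 2
            new_s[i] += half_s    # keep one half locally
            new_w[i] += half_w
            new_s[j] += half_s    # push the other half to the peer
            new_w[j] += half_w
        s, w = new_s, new_w
    return [si / wi for si, wi in zip(s, w)]


if __name__ == "__main__":
    print(push_sum([10, 20, 30, 40], rounds=100))
```

          Because every transfer moves equal fractions of `s` and `w`, the totals are conserved, which is why the per-node ratios settle on the true average rather than drifting.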

          1. 6

            One nice thing to model is “identity” of data, i.e. that two structures with the same values are different somehow. Object-oriented languages (and procedural languages with some effort) provide intrinsic “identity” by using references to structures; the memory address is used to form identity. Some functional programmers, particularly Clojure programmers, said the equivalent of “fuck it, we don’t need identity, cause that doesn’t model time. With pure functions, one can model time.”

            I don’t really understand this critique. It feels like there’s two separate things here: one is the question of an equivalence class for “equal but not identical” objects, and another is modeling and control of state and side effects. For the first, Clojure can compare memory addresses just like Java can with ==: (identical? [:a] [:a]) returns false because the two vectors have different memory addresses. For the second, Clojure has a rich set of reference types (what Clojure calls “identity”) for explicitly modeling changing state, including vars for the global environment, promises, atoms, and STM refs for changing data, and defrecord and deftype, in addition to all the JVM container types (e.g. AtomicReference) for Weirder Stuff. It’s not at all a “purely” functional approach–you can do that, but I think the default Clojure path leans a lot more towards eagerness and mutability rather than, say, an effects system or monadic transformers.
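
            The two ideas can be illustrated outside Clojure as well. The following Python sketch is an analogy only: `==` versus `is` stands in for value equality versus `identical?`, and the hypothetical `Atom` class is a loose imitation of a reference type that holds a succession of immutable values, not a port of Clojure’s atom.

```python
import threading
from typing import Any, Callable


class Atom:
    """A mutable reference whose contents are replaced, never edited in place."""

    def __init__(self, value: Any) -> None:
        self._value = value
        self._lock = threading.Lock()

    def deref(self) -> Any:
        return self._value

    def swap(self, f: Callable[..., Any], *args: Any) -> Any:
        # Compute the new value from the old one and install it atomically.
        with self._lock:
            self._value = f(self._value, *args)
            return self._value


# 1. Equal but not identical: same contents, different objects in memory.
a = [1, 2]
b = [1, 2]
equal_values = a == b  # True: the contents match
same_object = a is b   # False: two distinct list objects

# 2. State over time: the reference changes, the old value does not.
state = Atom((1, 2))
snapshot = state.deref()
state.swap(lambda t, x: t + (x,), 3)

if __name__ == "__main__":
    print(equal_values, same_object, snapshot, state.deref())
```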

            1. 3

              I sometimes wonder if the benefits of immutable data structures aren’t overshadowed by the added complexity needed to use them efficiently in practice.

              Unless I’m misunderstanding, a Zipper is basically an iterator that makes traversing and updating immutable trees more efficient? Does anybody have real life examples of how these are used? Or maybe an example comparing a language with mutable data structures and Clojure with Zippers?

              1. 5

                I think it might be better to think of a zipper as something that gives you suspendable/resumable iteration. One place I find them useful is when building UIs where you need to step through a list of data. At work I use a zipper on a page where I have a list of items that are related to the main entity a user is viewing. The UI lets a user step back and forward through these one at a time with next/previous buttons. A zipper gives me all of these operations in O(1) time, and also lets me query the state for “is there a previous page? is there a next page?”.

                This is doable if you maintain a list of items along with the length of the list and some 0 <= i < length xs, but now I think all the operations I want become harder to obviously see, and also harder to obviously verify I got things correct (e.g., did I remember to increment i? Did I bounds check?)
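
                A minimal sketch of such a list zipper, in Python and with hypothetical names (`ListZipper`, `from_items`), assuming the items on either side of the focus are kept as immutable linked lists so that each step is O(1):

```python
from dataclasses import dataclass
from typing import Any, Iterable, Optional


@dataclass(frozen=True)
class ListZipper:
    """An immutable cursor over a non-empty sequence.

    `left` and `right` are cons-style linked lists, (head, tail) or None,
    each ordered nearest-to-focus first, so stepping is O(1).
    """
    left: Optional[tuple]
    focus: Any
    right: Optional[tuple]

    @classmethod
    def from_items(cls, items: Iterable[Any]) -> "ListZipper":
        items = list(items)
        if not items:
            raise ValueError("need at least one item")
        right = None
        for item in reversed(items[1:]):
            right = (item, right)
        return cls(None, items[0], right)

    def has_prev(self) -> bool:
        return self.left is not None

    def has_next(self) -> bool:
        return self.right is not None

    def next(self) -> "ListZipper":
        if self.right is None:
            return self  # already at the end: stay put
        head, tail = self.right
        return ListZipper((self.focus, self.left), head, tail)

    def prev(self) -> "ListZipper":
        if self.left is None:
            return self  # already at the start: stay put
        head, tail = self.left
        return ListZipper(tail, head, (self.focus, self.right))


if __name__ == "__main__":
    z = ListZipper.from_items(["a", "b", "c"]).next()
    print(z.focus, z.has_prev(), z.has_next())
```

                There is no index to increment and no bounds check to forget: the shape of the structure itself says whether a previous or next item exists.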

                1. 3

                  Thank you! It’s interesting to hear a real life use case, and I hadn’t considered the suspendable/resumable aspect. I can see that being a real benefit when iterating over a complicated structure.

                  Is the interface extendable for user defined types? One gripe I have with Common Lisp is that the built-in iteration with (loop) isn’t easily extended for user defined types. The popular package ‘iterate’ allows it, but it’s an extra dependency to pull in.

                  I suppose one day I should just learn Clojure.

                  1. 4

                    Yes, the interface is extensible–zippers basically take a few functions which define the children of a node, a root, and how to create new nodes. This makes them really convenient when you want to do ad-hoc modification of data structures that weren’t necessarily intended for traversal–say, for example, an HTML document, an SVG, a b*tree, a web API, or ad-hoc tree representations: (node child1 (child2 subchild1 subchild2)) or {:node "foo" :children [{:node "bar"} ...]}–you can lift them into a zipper, traverse or transform them in some way, then spit back a new structure without anyone being the wiser. If your trees are code, this is a nice way to write compiler optimizations and complex macros–you could, for instance, statically apply DeMorgan’s laws to simplify deeply nested boolean expressions in a complex conditional.

                    This is super helpful for complex transformations like the kind you might do with XSLT/XPATH on XML documents: I could use a zipper to find instances of (e.g.) an image followed by a paragraph (at any level of nesting) within a <div> with a particular class, then rewrite that HTML to, I dunno, add some special tags and wrap both photo and following paragraph in a single div. Really handy for transforming, say, the results of a Markdown processor if the output it gives you isn’t quite right for the context you want to display it in.

                    Of course you can do all this mutably too, but zippers give you two big advantages. One is that they allow you to extend tree traversal and manipulation over arbitrary structures that weren’t necessarily intended for it. Second, because the zippers themselves are immutable, you get some nice atomic properties: if I have a node with two children a and b, I can compute a new a' and b' and replace them atomically based on both a and b without having to create temporary variables or (potentially) deep copies. You get arbitrary rollback, it plays nicely with compare-and-set, etc.

                    Not always the right solution–of course their performance isn’t going to be as good as direct traversal–but they’re a surprisingly useful tool. :-)
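
                    As a rough illustration of "a few functions which define the children of a node and how to create new nodes", here is a small tree zipper in Python. It is a sketch, not the clojure.zip API: the names `Zipper`, `Loc`, `Path`, and the three callbacks are assumptions made for this example, and the tree shape is the dict form mentioned above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence


@dataclass(frozen=True)
class Path:
    lefts: tuple                    # siblings to the left of the focus
    parent_node: Any                # the original parent node
    parent_path: Optional["Path"]   # how to get back up from the parent
    rights: tuple                   # siblings to the right of the focus


@dataclass(frozen=True)
class Loc:
    node: Any
    path: Optional[Path]            # None at the root


class Zipper:
    """Navigation over any tree, described by three caller-supplied functions."""

    def __init__(self,
                 is_branch: Callable[[Any], bool],
                 children: Callable[[Any], Sequence[Any]],
                 make_node: Callable[[Any, tuple], Any]) -> None:
        self.is_branch = is_branch
        self.children = children
        self.make_node = make_node

    def down(self, loc: Loc) -> Optional[Loc]:
        if not self.is_branch(loc.node):
            return None
        kids = tuple(self.children(loc.node))
        if not kids:
            return None
        return Loc(kids[0], Path((), loc.node, loc.path, kids[1:]))

    def right(self, loc: Loc) -> Optional[Loc]:
        p = loc.path
        if p is None or not p.rights:
            return None
        return Loc(p.rights[0],
                   Path(p.lefts + (loc.node,), p.parent_node,
                        p.parent_path, p.rights[1:]))

    def up(self, loc: Loc) -> Optional[Loc]:
        p = loc.path
        if p is None:
            return None
        kids = p.lefts + (loc.node,) + p.rights
        # Build a fresh parent; the original tree is never modified.
        return Loc(self.make_node(p.parent_node, kids), p.parent_path)

    def replace(self, loc: Loc, node: Any) -> Loc:
        return Loc(node, loc.path)

    def root(self, loc: Loc) -> Any:
        while loc.path is not None:
            loc = self.up(loc)
        return loc.node


# Lift an ad-hoc dict tree into a zipper.
dict_zipper = Zipper(
    is_branch=lambda n: "children" in n,
    children=lambda n: n["children"],
    make_node=lambda n, kids: {**n, "children": list(kids)},
)

tree = {"node": "foo", "children": [{"node": "bar"}, {"node": "baz"}]}
loc = dict_zipper.right(dict_zipper.down(Loc(tree, None)))
new_tree = dict_zipper.root(dict_zipper.replace(loc, {"node": "qux"}))

if __name__ == "__main__":
    print(new_tree)
```

                    Swapping in different callbacks is all it takes to walk nested lists, parsed HTML, or any other tree-shaped data, and every edit yields a new tree while the old one stays valid for rollback.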

              1. 9

                I have to admit, I did raise an eyebrow at the idea that Tigerbeetle’s correctness relies on well-behaved wall clocks during VM migration–in a database which also explicitly claims strict serializability. Viewstamped replication is an established consensus protocol, so that’s not a red flag per se, but we know from Lamport 2002 that consensus requires (in general) at least two network delays. However, Tigerbeetle claims “We also can’t afford to shut down only because of a partial network outage”, which raises some questions! Do they intend to provide total availability? We know it’s impossible to provide both strict serializability and total availability in an asynchronous network, but in a semi-sync model, maybe they can get away with something close, ala CockroachDB, where inconsistency is limited to a narrow window (depending on clock behavior) before nodes shut themselves down. Or maybe they’re only aiming for majority-available (which is what I’d expect from VR–it’s been a decade or so since I’ve read the paper and I only dimly remember the details), in which case… it’s fine, and the clocks are… just an optimization to improve liveness? I dunno, I’ve got questions here!

                1. 11

                  Thanks for the awesome questions, @aphyr!

                  The clocks are… just an optimization for the financial domain of the state machine, not for the consensus protocol.

                  As per the post, under “Why does TigerBeetle need clock fault-tolerance” we explain that “the timestamps of our financial transactions must be accurate and comparable across different financial systems”. This has to do with financial regulation around auditing in some jurisdictions and not with total order in a distributed systems context.

                  As per the talk linked to in the post, we simply need a mechanism to know when PTP or NTP is broken so we can shut down as an extra safety mechanism, that’s all. Detecting when the clock sync service is broken (specifically, an unaligned partition so that the hierarchical clock sync doesn’t work, but the TigerBeetle cluster is still up and running) is in no way required by TigerBeetle for strict serializability, it’s pure defense-in-depth for the financial domain to avoid running into bad financial timestamps.

                  We also tried to make it very clear in the talk itself that leader leases are “dangerous, something we would never recommend”. And on the home page, right up front (because it’s important to us) we have: “No stale reads”. If you ever do a Jepsen report on TigerBeetle, this is something I guarantee you will never find! :)

                  You might also recall that I touched on this in our original email discussion on the subject back on the 2nd July, when I suggested that CLOCK_BOOTTIME might be better for Jepsen to recommend over CLOCK_MONOTONIC going forward, and where I wrote:

                  “If you’re curious why we were looking into all of this for TigerBeetle: We don’t plan to risk stale reads by doing anything dangerous like leader leases, but we do want a fault-tolerant clock with upper and lower bounds because […] these monotonic transaction timestamps are used […] for timing out financial two-phase commit payments that need to be rolled back if another bank’s payments system fails (we can’t lock people’s liquidity for years on end).”

                  You can imagine TigerBeetle’s state machine (not the consensus protocol itself) as processing two-phase payments that look a lot like a credit card transaction with a two-phase auth/accept flow, so you want the leader to timestamp roughly close enough to true time, so that the financial transaction either goes through or ultimately gets rolled back after roughly N seconds.

                  Hope that clarifies everything (and gets you excited about TigerBeetle’s safety)!

                  P.S. We’re launching a $20K consensus challenge in early September. All the early invite details are over here and I hope to see you at the live event, where we’ll have back-to-back interviews with Brian Oki and James Cowling: https://www.tigerbeetle.com/20k-challenge

                  1. 3

                    This has to do with financial regulation around auditing in some jurisdictions

                    Aha, thank you, that makes it clear. You’re still looking at potentially multi-second clock errors, right? It’s just that they’re limited to the window when the VM is paused, instead of persisting indefinitely?

                    that leader leases are “dangerous, something we would never recommend”

                    You know, I’m delighted you say this, because this is something I’ve discussed with other vendors and have encountered strong resistance to. Many of my clients have either felt the hazard didn’t exist or that it’s sufficiently improbable to worry about. I still don’t have any quantitative evidence to point to which suggests how frequently clock desync and/or skew occurs, or what kinds of user-facing impact resulted. I think CockroachDB choosing to up their default clock skew threshold to 250 (500?) ms was suggestive that something was going wrong in customer environments, and I’ve had private discussions to the effect of “yeah, we routinely see 100+ ms offsets in a single datacenter”, but that’s not something I can really cite. If y’all have done any sort of measurement to this effect, I’d love to read about it.

                    in our original email discussion

                    My terrible memory strikes again! Yes, I do recall this now, thank you. :-)

                1. 19

                  What a mess. I founded #riemann for discussion and support of a monitoring system about ten years ago. Tentatively moved to libera after the whole mess last week, but because so much extant infrastructure mentions the Freenode channel, I left a note in the topic that we’d moved. Freenode nuked the #riemann channel; it now forwards to ##riemann, and the topic is gone. This has got to be a huge hassle for projects with logging and bot integration.

                  1. 3

                    Worth mentioning: this might work some of the time, but in an asynchronous (e.g. real) network, it could be unsafe. The described locking scheme does not actually ensure no two processes hold the lock at the same time. Even if it did, it would not ensure that side effects, like writing to block storage, would be safe. Martin Kleppmann has a terrific overview of why “distributed locks” generally don’t do what people think, and what to do instead: https://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html

                    1. 1

                      Thanks for this feedback. Actually my design already takes this problem into account.

                      Kleppmann warns that a process can freeze for an arbitrary amount of time due to garbage collection, network problems, CPU starvation, etc. I mitigate this in my design through two ways:

                      1. By recommending a long TTL, on the order of minutes. Kleppmann cited a problem at GitHub where packets were delayed for 90 seconds. The default TTL I recommend is 5 minutes.
                      2. By refreshing the lock from time to time, and by checking for its health periodically. The refresh and health-check intervals must be sufficiently short as to prevent the very problems described by Kleppmann.

                      It’s described in section “Long-running operations”.

                      Kleppmann proposes the use of fencing tokens, which indeed guarantees that such problems don’t occur, but it requires sufficient support by all systems. My design doesn’t provide as strong of a guarantee, but we can make it arbitrarily approach 100% safety by configuring the various timing settings (TTL, refresh interval).
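
                      For readers unfamiliar with fencing tokens, the idea can be sketched as follows. This is an illustrative toy in Python, not the design under discussion: `LockService`, `FencedStore`, and `StaleTokenError` are hypothetical names, and lease expiry is simulated simply by handing out a second token.

```python
class StaleTokenError(Exception):
    """Raised when a write carries a token older than one already seen."""


class LockService:
    """Hands out a strictly increasing token with every lock grant."""

    def __init__(self) -> None:
        self._counter = 0

    def acquire(self) -> int:
        self._counter += 1
        return self._counter


class FencedStore:
    """Storage that rejects writes from holders of superseded locks."""

    def __init__(self) -> None:
        self.highest_token = 0
        self.data = {}

    def write(self, token: int, key: str, value: str) -> None:
        if token < self.highest_token:
            raise StaleTokenError(
                f"token {token} is older than {self.highest_token}")
        self.highest_token = token
        self.data[key] = value


locks = LockService()
store = FencedStore()

token_a = locks.acquire()   # client A gets the lock, then pauses (e.g. GC)
token_b = locks.acquire()   # A's lease expires; client B gets the lock
store.write(token_b, "file", "written by B")

try:
    # Client A wakes up, still believing it holds the lock.
    store.write(token_a, "file", "written by A")
    a_was_rejected = False
except StaleTokenError:
    a_was_rejected = True

if __name__ == "__main__":
    print(a_was_rejected, store.data)
```

                      The check lives in the storage system rather than in the client, which is why it holds no matter how long the paused client sleeps; that is also the "sufficient support by all systems" cost mentioned above.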

                    1. 4

                      Since 2017 the interest for Clojure dropped significantly, almost to zero, to a 2008 level (the language was created in 2007): https://trends.google.com/trends/explore?date=all&q=%2Fm%2F03yb8hb

                      This sounds scary. No one would invest in such a curve.

                      More, the founders / the company behind Clojure were bought up last year by a bank. We all know what this means in other areas.

                      And so on.

                      I started learning Clojure a month ago, and these are the doubts in the back of my mind.

                      1. 10

                        The trends chart for Apache Spark shows interest in that technology near a 5-year low and trending downward. Interest in SQL, Java, and JavaScript has been on a downward trajectory for 15+ years, and is currently at a level very near Clojure’s.

                        Would it be fair to call it scary to invest in those technologies?

                        1. 8

                          Since 2017 the interest for Clojure dropped significantly, almost to zero, to a 2008 level (the language was created in 2007): https://trends.google.com/trends/explore?date=all&q=%2Fm%2F03yb8hb

                          This sounds scary. No one would invest in such a curve.

                          If you think that’s scary, wait ’til you see the same graph for Java!

                          https://trends.google.com/trends/explore?date=all&q=%2Fm%2F07sbkfb

                          Or C#!

                          https://trends.google.com/trends/explore?date=all&q=%2Fm%2F07657k

                          Or JavaScript!

                          https://trends.google.com/trends/explore?date=all&q=%2Fm%2F02p97

                          Or C!

                          https://trends.google.com/trends/explore?date=all&q=%2Fm%2F01t6b

                          Or C++!

                          https://trends.google.com/trends/explore?date=all&q=%2Fm%2F0jgqg

                          I used to think Google Trends correlated with language popularity, but these are pretty strong counterexamples.

                          1. 6

                            Right, G Trends shows even React is in a serious downward spiral.

                            Point taken, thanks for everybody clarifying this.

                          2. 5

                            More, the founders / the company behind Clojure were bought up last year by a bank. We all know what this means in other areas.

                            I don’t. What’s the concern about being owned by a bank?

                            1. 3

                              More, the founders / the company behind Clojure were bought up last year by a bank. We all know what this means in other areas.

                              This also happened to Elixir and it seems to be doing fine?

                              1. 3

                                Google Trends isn’t really a useful metric. What’s more interesting is that there are more and more companies using Clojure commercially. For example, we had the Clojure/north conference in Toronto, where lots of people presented from companies that are built entirely on a Clojure stack. There are lots of new companies popping up doing innovative stuff with Clojure every year, Roam Research being a good example.

                                The communities on Slack, Reddit, and Clojureverse are very active, and constantly growing.

                                There are now projects like Clojurists Together for funding the open source ecosystem and ensuring that it’s sustainable. Incidentally, one of the first things that happened after Cognitect was bought by Nubank was that they started funding open source developers on GitHub.

                                Clojure is a mature language, with a large ecosystem around it, and a very active community. It’s not a hype driven language, but it’s very much sustainable and has been for many years now.

                                1. 1

                                  Definitely, I choose Clojure/Script as an alternative to JavaScript web dev due to all the above.

                                  However I still don’t feel safe, because of the language popularity. For example on the 2020 Stack Overflow Dev Survey (https://insights.stackoverflow.com/survey/2020#technology) Clojure didn’t hit the list. A presence there would be reassuring.

                                  I see Clojure as a one-way path: take a deep breath, go down into the rabbit hole (yes, learning Clojure is not simple at all, Clojure is unlike others) and never look back.

                                  1. 2

                                    This seems like a pretty limited perspective… Learn more languages and you’ll see that Clojure is easier to learn (and better to use) than most if not all.

                                    If the syntax, style, or ideas seem foreign, then all the better! You can write (C, Lisp, Cobol) in any language, and learning the pros and cons of each style is never time wasted.

                                    1. 1

                                      Clojure is the 9th language I’m learning.

                                      So far I find it as strange as Assembly, and functional programming as big a shift as when I transitioned from procedural programming (C) to object-oriented programming (C++).

                                      This makes one cautious.

                                      For example, with React there was no question about learning it or investing in it. It was the solution to a problem I had been waiting ages to see solved.

                                      With Clojure I can’t really see that clear a path. Functional programming, for example, is solved elsewhere more thoroughly and in a simpler way (https://github.com/MostlyAdequate/mostly-adequate-guide).

                                      That’s why language popularity would be a good indicator whether to adopt it, or not.

                                      However, on HN, the comments on this same article are more alarming: https://news.ycombinator.com/item?id=27054839

                                      It seems to explain why the language’s popularity is dropping. Clojure starts as a nice promise, then problems arise, and people flock away.

                                      1. 13

                                        Been writing Clojure professionally for a little over nine years, both on teams of hundreds and as a solo engineer. I can’t speak to popularity, but Clojure has been (and remains!) an exceedingly nice language choice for long-running services and desktop applications. It combines a well-designed standard library, reasonable performance, an emphasis on immutability and concurrency-safety without being dogmatic about evaluation semantics, just the right amount of syntax, excellent JVM interop, and access to a huge swath of Clojure and other JVM libraries. It’s a generally mature, stable language which solves data-oriented problems well. The major drawbacks, in my view, are a lack of static typing (though I’ve used spec, schema, and core.typed to some avail here), unnecessarily unhelpful error messages, slow (~5 to ~20s) startup times, and that working with JVM primitives efficiently is more difficult than one might like. And, of course the usual JVM drawbacks: garbage collection, for instance.

                                        None of this is really new per se. I wouldn’t worry too much about popularity metrics or library churn rate–one of the nice things about Clojure is that it’s fairly stable, and libraries from a decade ago tend to work basically unchanged. After Scala, that was a breath of fresh air for me. What I’d ask, were I in your position, is whether the language’s ergonomics, libraries, and speed are suitable for the kind of work you’re trying to do.

                                    2. 2

                                      My team’s been using Clojure for around a decade now, and things have only been getting better all around in that time. I think the most important part is that there are a lot of companies using it nowadays as their core platform. There is a lot of commercial interest in keeping the language and its ecosystem alive and active. I don’t think Clojure will ever get big like Java or Python, but I really don’t see it going away either at this point.

                                      It’s also worth noting that Clojure can be sustainable with a smaller community because it piggy backs on JVM and Js runtimes. We have access to entire ecosystems from these platforms, and can seamlessly leverage all the work from the broader community.

                                  2. 2

                                    I’m not so sure about Google trends as real data point… but there seems to be less buzz, but people are still using it.. and I don’t think there ever was a real hype.

                                    I had noticed that my personal interest had diminished a bit, and when most of the people from the IRC channel migrated to Slack I didn’t join them. Stuff still seems to get regular updates, and as just a casual user no Clojure release really excited or disturbed me - that could be because I’d never used it to its full potential (likely) or because they were just iterating in small steps and not being revolutionary (also likely). I don’t think I’ve had to make meaningful changes over the years to the codebases I started between 2011 and 2013, and they run on the latest Clojure version…

                                  1. 15

                                    All that aside - I see no way to argue for excluding RMS on the basis of his beliefs (ie. he is not progressive-orthodox enough) without also losing a big chunk of the rest of the world as well.

                                    I don’t think the author seems to be aware that the people fighting do not care about losing the rest of the world.

                                    They do not care about driving out engineers that don’t share their politics, they do not care about driving out engineers that just want to ignore politics, and they certainly do not care that by picking their sides as they have they’ve grouped a lot of good people in with genuine bigots and assholes of the highest caliber.

                                    They will burn down free software to save it from itself, willfully ignorant of the stripmining of the ecosystem by small and large companies alike as long as they pay the correct lip-service.

                                    To me it is curious that someone can champion excluding people over their heterodox beliefs, while simultaneously shouting things like the below; perhaps some irony overload here:

                                    Again, the author is trying to apply logic to actions and a rabid progressivism that have, at best, a distant relationship with rationality. They don’t even care about obvious fact-checking and games-of-telephone, because it doesn’t matter once they’ve whipped themselves into a frenzy. They cannot be reasoned with and the attempt isn’t worth the effort or risk; the juice isn’t worth the squeeze.

                                    We used to be focused on liberty & freedom - I miss that.

                                    That boat sailed for a number of reasons. I miss it too, buddy. :(

                                    And yet, a big talking point is “hey, there were never any good old days”–because people have to rewrite the past to support the actions in the present to get to a desired future. Old as time.

                                    If this statement was focused on leaders, it could be characterized as installing a new glass-ceiling for any who are not progressive-orthodox.

                                    Ding ding ding, we have a winner. A huge dimension of all of this is a power grab–as long as people like RMS are around, you can’t be the new RMS, so he must be destroyed. As long as there is an FSF, you can’t be the new FSF, so you must seek to destroy the FSF. As long as there are people in power, you can’t be in power, so you must destroy those people in power–and right now, politically, what are the best tools for that? What rhetoric is in vogue?

                                    This is obvious to any student of history or revolution.

                                    (a cynical/troll-ey point would be to make the same observation about systemd/linux, but that’s neither here nor there.)

                                    ~

                                    I feel bad for this author, and I feel worse that I think he and I are probably sharing the same lot. To anybody else who feels similarly: just go away. Walk away from this culture war stuff, give wide berth to both heretic and inquisitors, and don’t try to understand the madness–just treat it as such. Go write code, go make things, and let the passions of the time deaden and pass. There’s nothing here for you but argument and suffering. Don’t risk your job, your livelihood, or your friendships.

                                    Return after the storm and build anew.

                                    1. 22

                                      Sock, remember when you found out that I was disappointed in your conduct on lobste.rs and you took that seriously?

                                      I’ve been reading your comments for something like seven years now and I remain consistently disheartened. Engaging with you is the missing stair of this community, and I’m not going to follow up on this thread, but I want you to know that your oh-so-civil, “trying to do better”, “just raising questions”, “why won’t you be rational”, socially regressive trolling is a big part of why I don’t spend more time here, and why I caution people who ask me for invites to lobste.rs.

                                      You’re still doing it, and I sincerely wish you’d stop.

                                      1. 4

                                        Nice link. I don’t know if you meant to highlight them, but there are some gems, like:

                                        Pandering to injustice and impotent outrage evokes strong reactions and such posts can be easily tailored to match the overall views of the hivemind. None of these posts actually tend to elevate the discussion or reveal new truths, but people will almost always upvote them more than they downvote them. And that’s the source of their toxicity: shitposts do at least as well as quality posts, they don’t increase the signal of the community’s nominal area of discussion, and they are very easy to crank out even by idiots.

                                        1. 0

                                          Engaging with you is the missing stair of this community, and I’m not going to follow up on this thread, but I want you to know that your oh-so-civil, “trying to do better”, “just raising questions”, “why won’t you be rational”, socially regressive trolling is a big part of why I don’t spend more time here, and why I caution people who ask me for invites to lobste.rs.

                                          @friendlysock “Missing stair” is a political shibboleth of the same political faction that finds RMS unacceptable. It’s a metaphorical way of calling you an abuser, just like how they call RMS an abuser. The same goes for mocking your civility when commenting, or labeling your words as trolling. These are political attacks, and I would encourage you to view them as such, rather than doubting yourself and worrying that you should change your behavior.

                                        2. 13

                                          They will burn down free software to save it from itself, willfully ignorant of the stripmining of the ecosystem by small and large companies alike as long as they pay the correct lip-service.

                                          You are a dramatic individual. I’ve really tried to give you a chance, but I just can’t take any more of your whining. Seriously, you’re doing the same thing you accuse others of doing. The world isn’t ending, moral panics have existed for thousands of years.

                                          Walk away from this culture war stuff, give wide berth to both heretic and inquisitors, and don’t try to understand the madness–just treat it as such.

                                          Please follow your own advice. Seriously.

                                          1. 6

                                            The world isn’t ending, moral panics have existed for thousands of years.

                                            Well, yes; and they have also caused great amounts of harm and hurt to innocent individuals. Shrugging it off as “oh, these kind of moral panics have happened before” doesn’t strike me as a very strong rebuke. We could substitute “moral panics” in your quote with all sorts of things (theft, murder, war, rape) and have exactly the same argument.

                                            (I don’t disagree that ’sock is perhaps a bit overly dramatic though).

                                            1. 5

                                              You are a dramatic individual. I’ve really tried to give you a chance, but I just can’t take any more of your whining. Seriously, you’re doing the same thing you accuse others of doing. The world isn’t ending, moral panics have existed for thousands of years.

                                              That doesn’t make moral panics good, or unworthy of political opposition. I’m not sure if @friendlysock’s dramatic rhetoric is maximally politically-effective, but he’s not wrong to express the thought.

                                            2. 1

                                              What exactly do you think “culture war” means in our society? What do you think that you’re defending?

                                              To be the child who points out that the emperor is naked: Christianity and the entire family of Abrahamic religions rest upon mistruths, propaganda, ahistorical claims, and a large amount of what we’d now call human-rights violations and war crimes. What do you actually think you mean when you say that you’re “sharing the same lot” as the author?

                                              They will burn down free software to save it from itself

                                              Sorry, but to be blunt: Do you actually produce Free Software? It’s cool if you don’t, but you shouldn’t expect pigs to take chickens seriously.

                                              1. 5

                                                Look, I’m about as atheist as they come but going on about historical war crimes when a religious person wants to start a broader conversation about inclusivity is extremely inappropriate. This isn’t /r/atheism.

                                                1. 4

                                                  Do you actually produce Free Software?

                                                  I do, and I keep those activities quite divorced from Lobsters.

                                              1. 9

                                                The key part,

                                                Not only can a Mastodon user communicate with users on different servers on Mastodon, perhaps more importantly this user can also communicate e.g with a Friendica (macroblogging) user or a Pleroma user. These are totally different networks that all support ActivityPub. But this is even taken a step further where that same Mastodon user can follow his favourite PeerTube channel or someone that shares great photos on Pixelfed. This is like you were able to follow someone with your Twitter account on YouTube or Instagram. This also means that this Mastodon user can comment or like the PeerTube video from his/her Mastodon user interface. This is the true power of ActivityPub!


                                                There is also Tribes which provides a custom-hosted version (I run mine here).

                                                See https://jointhefedi.com if you want to quickly try out the Fediverse.

                                                1. 10

                                                  In all fairness, Mastodon has one of the least spec compliant ActivityPub implementations out there. It gets stumped by a lot of valid payloads generated by other services, to the point that implementing Mastodon’s quirks is mandatory if one wants to do development for the fediverse.

                                                  1. 8

                                                    Maybe an unpopular opinion, but without Mastodon ActivityPub would be living the life it was living before, used by dozens of nerds.

                                                    Of course that’s not a proper discussion point to some, you may or may not like its ideas and technical features, but to me it was kinda useless when it was only identi.ca and statusnet and whatnot. I’m saying this as someone who was pretty involved in many FLOSS projects at the time. Utterly useless. It was Twitter if you wanted a thing like this and 90% happened on mailing lists and IRC anyway.

                                                    1. 6

                                                      Oh, I fully agree that Mastodon is overall a force for good in the Fediverse, at least in the fact that it made it popular with the non technical crowds, but I still wish they would work harder at some things related to ActivityPub compliance. Probably my own service will not be super compatible with it, as it skirts webfinger - something that Mastodon can’t do user discovery without. :(

                                                      1. 2

                                                        I didn’t look into it very deeply, so can’t comment if they made some shortcuts for time to market, or enable stuff that would’ve been hard to do, or just because they were careless or simply didn’t care…

                                                        1. 3

                                                          From my perspective they’re prioritizing the features that make them a better micro blog platform over the features that make them a better ActivityPub one.

                                                          I would like to say that being the major player in this niche they should take their responsibilities in this regard more seriously, but in the end they work on what they enjoy more, and that’s absolutely fine.

                                                    2. 5

                                                      Agreed. I am present on a few Mastodon instances but my personal instance is Honk which is a very opinionated and pure (I guess?) ActivityPub server/client/thing

                                                    3. 22

                                                      See https://jointhefedi.com

                                                      The servers recommended on that page are some of the most notorious in the fediverse, notable for hosting bigoted shitheads and having nazi-friendly moderation policies.

                                                      If you sign up on them, you will find yourself blocked by basically all fediverse instances with active and competent moderators.

                                                      1. 8

                                                        If you sign up on them, you will find yourself blocked by basically all fediverse instances with active and competent moderators.

                                                        This was one of the reasons I stopped using the Fediverse. I don’t like the concept of full-on instance-bans to begin with (something like warnings for out-going actions and filtering for unrequested ingoing actions would be more appropriate). I’m not sure if federation necessarily has to lead to fragmentation, but some people seem to accept it as a necessary tool and don’t care if anyone has a different opinion. In my case I wanted to hear what people on the spinster server had to say, but it was blocked on the instance I was on (ironically this made me go out of my way to listen to the points of radical feminists, which I don’t think was the intention).

                                                        Part of the problem with Mastodon specifically is that it has inherited a lot of the worst Twitter-culture by presenting itself as “Twitter with better moderation”, while paradoxically decentralisation is usually understood as a means to avoid being shut down by a central authority. Then again, it all ties into more fundamental issues with the Fediverse and how it presents itself as “each server is its own community”, while at the same time I don’t care about what server another person is using. The only thing I am interested in is the moderation policy and how well they administer the server.

                                                        The part of the Fediverse I still remain hopeful for is Peertube.

                                                        1. 11

                                                          I’m not sure if federation necessarily has to lead to fragmentation, but some people seem to accept it as a necessary tool and don’t care if anyone has a different opinion.

                                                          Instance bans allow for coexistence without cohabitation. You always have the choice of choosing your own policy domain/deferring to someone else. Forcing all nodes to be wide open would remove a lot of point and cause unnecessary annoyance.

                                                          1. 2

                                                            You always have the choice of choosing your own policy domain/deferring to someone else.

                                                            To a degree yes, though I’d still rather that not be the case, because I rarely agree with someone on everything, meaning I have to administer an instance myself. But it is not only a personal issue: with instance bans threads are also fragmented, so depending on your perspective, you might unknowingly not see the entire conversation going on, leading to more confusion than necessary.

                                                            Instance bans are sledge hammers that are applied too eagerly (I do think they make sense for actual spam servers). Maybe the situation has improved since, but I remember there only being three states:

                                                            1. No limits on federation
                                                            2. Instance bans by Users
                                                            3. Instance bans by Instances

                                                            Where I think that there should be more going on between 2. and 3.

                                                            1. 3

                                                              There are degrees between 1 and 2, at least on mastodon. Admins can “silence”, meaning posts from that instance won’t show up in the federated timeline by default. If I’m not mistaken, there’s also “mute”, meaning interactions from that instance won’t be shown to the muting instance unless there’s a preexisting relationship between the actors.

                                                              I should also note that instance bans are not really a thing– you can mute an instance at a user level, but your data is still sent there and you must trust that server’s administration.

                                                              1. 2

                                                                instance bans threads are also fragmented, so depending on your perspective, you might unknowingly not see the entire conversation going on, leading to more confusion than necessary.

                                                                this seems to be an issue even if the instance isn’t banned. I see this happen with my small instance, where viewing the thread on the hosting instance (or from an account on another instance) shows different posts, and I’m pretty sure the missing posts aren’t from blocked instances.

                                                                1. 4

                                                                  iirc mastodon will fetch replies upthread, but not downthread: that is, if the chain goes X -> Y -> Z, and your instance is made aware of post Y (someone follows the poster, it gets boosted, whatever) then it will fetch X but not Z. this is why some people have a norm to boost the last post in a thread, as opposed to the first. this isn’t a technical limitation, since pleroma (the other big fedi server) will fetch the entire thread.

                                                                  of course, in either case, if one of the posts in the thread is private and you don’t follow the person you’ll just break the thread entirely, but there’s not much that can really be done there.
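
                                                                  To make the upthread/downthread difference concrete, here is a minimal sketch. It is a toy model, not code from Mastodon or Pleroma: the `REMOTE_PARENT` table and both function names are made up for illustration, and real servers resolve these links over HTTP rather than from a dict.

```python
from typing import Optional

# Hypothetical stand-in for the remote network: post id -> parent id
# (None marks the root of the thread). Chain is X -> Y -> Z.
REMOTE_PARENT = {"X": None, "Y": "X", "Z": "Y"}


def replies_to(post_id: str) -> list:
    """Posts whose parent is post_id (what a replies collection would list)."""
    return [pid for pid, parent in REMOTE_PARENT.items() if parent == post_id]


def fetch_upthread(post_id: str) -> list:
    """Follow parent links toward the root only (the behaviour described for Mastodon)."""
    seen = []
    current: Optional[str] = post_id
    while current is not None:
        seen.append(current)
        current = REMOTE_PARENT[current]
    return seen


def fetch_whole_thread(post_id: str) -> list:
    """Walk up to the root, then collect every descendant breadth-first
    (the behaviour described for Pleroma)."""
    root = fetch_upthread(post_id)[-1]
    seen, queue = [], [root]
    while queue:
        current = queue.pop(0)
        seen.append(current)
        queue.extend(replies_to(current))
    return seen
```

                                                                  With this model, an instance that only learns about Y ends up with Y and X under the first strategy and never sees Z, which is why boosting the last post of a thread pulls in more of it.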

                                                                  1. 1

                                                                    oh wow, that’s confusing. :|

                                                            2. 7

                                                              Due to how ActivityPub works, you need to have near-ultimate trust of an instance if you wish to federate with them. If you believe the admins are bad actors, using acceptance of harmful ideologies as a proxy for that, then you can’t trust them with your user’s data, and must defederate.

                                                              ironically this made me go out of my way to listen to the points of radical feminists, which I don’t think was the intention

                                                              This isn’t necessarily against what the blockers wanted! What is called “censorship” on the fedi is usually about protecting their own users. Trans folks don’t want to have to see the same tired take on trans exclusionism for the fifth time today, nor do they want their posts to be seen by those folks.

                                                              As you discovered, there was absolutely nothing stopping you from finding out more from the spinsters, and nothing stopping you from making an account there either, right?

                                                              If we think decentralization is the key to freedom, then we can’t stop short of free association.

                                                              1. 2

                                                                Due to how ActivityPub works, you need to have near-ultimate trust of an instance if you wish to federate with them. If you believe the admins are bad actors, using acceptance of harmful ideologies as a proxy for that, then you can’t trust them with your user’s data, and must defederate.

                                                                What do you mean by “trust them with your user’s data”? Is there something a server can only access if they are federated, that a “blocked” instance couldn’t see via its public feed?

                                                                What is called “censorship” on the fedi is usually about protecting their own users.

                                                                I get that an instance would decide to mute another instance by default, but if a user explicitly requests to receive data, why should they not be able to interact?

                                                                1. 8

                                                                  A user’s private posts are always federated to any instance that has a single actor subscribed to it. That means that instance is storing a user’s private posts. If the admin’s a bad actor, they could see the private posts even if they’re not authorized to normally.
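
                                                                  A rough sketch of that delivery model, in case it helps. Everything here is hypothetical (the `Instance` class, `deliver_followers_only`, the `user@domain` handles); it only illustrates the point that visibility rules are enforced by the receiving server's software, while the post itself sits in that server's storage.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """A hypothetical server; stored_posts is everything held in its storage."""
    domain: str
    stored_posts: list = field(default_factory=list)

    def visible_to(self, viewer: str, approved_followers: set) -> list:
        """What the UI shows: followers-only posts appear only for approved followers."""
        return list(self.stored_posts) if viewer in approved_followers else []

    def admin_view(self) -> list:
        """What an admin can read directly from storage: everything delivered."""
        return list(self.stored_posts)


def deliver_followers_only(post: str, followers: set, instances: dict) -> None:
    """Send one copy of the post to every instance hosting at least one follower."""
    for domain in sorted({handle.split("@", 1)[1] for handle in followers}):
        instances[domain].stored_posts.append(post)
```

                                                                  A single follower on an instance is enough for the post to land there, and from then on only that instance's software (and the goodwill of its admin) keeps it from other eyes.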

                                                                  1. 4

                                                                    so private posts are not actually private, much like Facebook, though for totally different reasons. great.

                                                                    1. 5

                                                                      Yes. Unfortunately, if you view private data disclosure as a security issue, Masto/ActivityPub is less secure than a centralized platform.

                                                                      There are hopes that CapTP will solve many of these concerns.

                                                                      1. 3

                                                                        It’s similar to plaintext email, no? As long as the plain text traverses a server somewhere it can be read by the server admins.

                                                                        As far as I know, end-to-end encryption isn’t supported by AP.

                                                                        1. 1

                                                                          yes, but email doesn’t use the term ‘private’ anywhere. I think many (most?) people understand that email is not useful for HIPAA or other things where privacy matters.

                                                                          1. 4

                                                                            Many people sign up for things with firstname @gmail.com, and then claim the account owner “hacked” them. Many people think that companyname @somecustomdomain.com means you work for them. Many people think that anything @someother.tld means you actually meant @someother.tld.com.

                                                                            I don’t think most people understand anything about email.

                                                                            1. 1

                                                                              I think with all things, it’s complicated. I’m sure people in their 70’s and older who have very little exposure to email are likely not very versed.

                                                                              For the average professional that is legally required to care about privacy, then I think they mostly have the understanding that email != private communication.

                                                                              Developers SHOULD know better, but they still do stupid things with email, because it’s the only thing you can reasonably assume someone has. (like login with email, use email for password recovery, etc) There are sane things you can do to help mitigate these things, like single use tokens, etc, but.. I’m sure there is still tons of code out there that doesn’t do these things.

                                                                              I agree email ADDRESSING, which is what you mostly are referring to, is full of assumptions and mostly none of them can be assumed. The only thing you can mostly assume from user@domain, is that the domain admin at some point thought that user should exist. :)
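
                                                                              For the single use token idea, a minimal sketch using only the Python standard library. The function names, the in-memory `_pending` store, and the 15 minute lifetime are assumptions for illustration; a real service would keep this in a database and send the token in a link.

```python
import hashlib
import secrets
import time
from typing import Optional

# Hypothetical in-memory store: sha256(token) -> (email, expiry timestamp).
# Only the hash is kept, so a leaked store does not reveal usable tokens.
_pending: dict = {}


def issue_token(email: str, ttl_seconds: float = 900.0, now: Optional[float] = None) -> str:
    """Create a random token for this address and remember its hash and expiry."""
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    digest = hashlib.sha256(token.encode()).hexdigest()
    _pending[digest] = (email, now + ttl_seconds)
    return token


def redeem_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the email if the token is known and unexpired; it is removed either way."""
    now = time.time() if now is None else now
    digest = hashlib.sha256(token.encode()).hexdigest()
    entry = _pending.pop(digest, None)  # pop makes the token single use
    if entry is None:
        return None
    email, expiry = entry
    return email if now <= expiry else None
```

                                                                              This does not make email itself private; it only limits the damage if a message is read later, since a used or expired token is worthless.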

                                                                            2. 2

                                                                              I agree with you it’s a bit of a branding problem.

                                                                              I’m just so used to the store and forward model of email and NNTP that I just applied that model to the fediverse too. And I have not heard anything about E2EE in the “mainstream” Fediverse.

                                                                2. 3

                                                                  The servers recommended on that page are some of the most notorious in the fediverse, notable for hosting bigoted shitheads and having nazi-friendly moderation policies.

                                                                  Citation needed.

                                                                  One of the servers recommended on that page, gleasonator.com, actually was created by someone that experienced bigoted behavior from mastodon’s toxic and neoracist moderation policies: https://blog.alexgleason.me/gab-block/

                                                                  1. 21

                                                                    As a queer person and regular fedi user, I concur that these servers are notorious. Multiple accounts from shitposter.club harassed a trans friend of mine just this week because they posted a selfie to their timeline. Freespeechextremist’s users have a habit of sea-lioning their way into my mentions; I think the last one was an extremely tedious “wow aren’t gay people bigoted” monologue mixed with Q-anon rants. Freespeechextremist.com, shitposter.club, spinster.xyz, and glindr.org (another Alex Gleason joint) all have the dubious distinction of being on the relatively short mastodon.social and mstdn.social blocklists for hate speech, harassment, and transphobia. With the exception of mstdn.social, this is not a general-purpose instance list: these instances all share moderation policies aligned with reactionary views on gender and sexuality.

                                                                    1. 5

                                                                      That transphobic bigot wasn’t ejected by mastodon’s moderation policies. Mastodon is the service, moderation responsibilities lie with the server admins.

                                                                      That transphobic bigot was ejected by todon’s moderation policies, because, as he so proudly proclaims, his bigotry is contrary to the server’s stated goals and aims.

                                                                      Those goals, aims and indeed the moderation policy are clearly stated on the server:-

                                                                      “we do not accept (among other things): racism, homophobia, transphobia, sexism, ableism and other forms of discrimination, harassment, trolling, hate speech, (sexual) abuse of minors and adults (also not virtual), glorification of violence, militarism, nationalism and right-wing populism, right-wing and religious extremism, tankies (ML), capitalists, (right-wing) conspiracy ‘theories’, hoaxes, and of course no spam and other forms of advertisement.”

                                                                      Gleason is a bigot. That bigotry was noted by other todon users (I number myself among them) and he was shown the door.

                                                                      1. -3

                                                                        The word “transphobia” is often used as a loaded term, just like “hate speech” is,

                                                                        Usually the use of these terms outside of political environments brings a toxic ambiance and is not conducive to anything felicitous or productive to the domain. I’m sure you’ll have a hard time finding any actual instances of fear/hate (which is what “phobia” literally indicates) from Gleason; and of course defending female sports rights doesn’t qualify as one (saying otherwise would be bigoted and would at best qualify as … umm … imagined phobia).

                                                                        ’tis a good thing Lobsters is not politically woke to ban the likes of Gleason, eh?

                                                                        1. 5

                                                                          Gleason is a peddler in transphobic bigotry. It’s an essential part of who he is. His “sex-essential” “gender-critical” nonsense is a paper-thin mask for hatespeech against a marginalised element in society.

                                                                          You have now defended him, Freedom of Speech Zealotry, White Supremacists and transphobic bigotry up and down this story, which you appear to have posted just to link to the aforementioned listing site for hatespeech and bigots.

                                                                          You can put all the ten-dollar words you want all over your post, I can say without hesitation that you’re both posturing and a troll.

                                                                          1. 0

                                                                            All you are doing, in your anonymous account to boot, is to accuse other people (Gleason and now me–that are not anonymous, neither are afraid to hide behind a mask) without evidence and without engaging rationally (as in without refuting the central point) but merely with politically loaded language (as in resorting to thinly veiled ad hominem).

                                                                            Lobsters would be better off without such toxic comments expressing actual bigotry, and I assume on good faith that you did not intend that, and are writing in a state of not being of sound mind - so I suggest you take a break.

                                                                      2. 7

                                                                        They out themselves as a transphobe one sentence into the blog post. I’m sure many transphobes think being told they’re a transphobe is toxic.

                                                                        They also had no problem joining Gab and admitting that it’s full of, their quote, “literal nazis” in the same article.

                                                                        1. -1

                                                                          They out themselves as a transphobe one sentence into the blog post. I’m sure many transphobes think being told they’re a transphobe is toxic.

                                                                          For those who haven’t read the article in full, this is what the first sentence (which according to the parent commenter indicates that Alex is outing himself to be a “transphobe”) reads: “I got deplatformed from Mastodon for supporting women’s sex-based rights. Now Mastodon is trying to stop me from using Gab.”

                                                                          They also had no problem joining Gab and admitting that it’s full of, their quote, “literal nazis” in the same article.

                                                                          Again, for those who haven’t read the article in full, here’s the full quote: “Gab is a free speech platform. It is true that there are indeed “literal Nazis” on it. This isn’t a hyperbole, as there are some users who quite literally advocate for the extermination of races of people. The reason is because Gab censors no one. It’s not because Gab likes those people or wants them there.” - and that quote was a prelude to explaining why censorship is bad, by citing past examples:

                                                                          • Marginalized people are at the greatest risk of being impacted by censorship. The Feminist movement laid the groundwork for freedom of speech in the United States with the formation of the Free Speech League in 1902. They were being censored from distributing material about sex-education and abortion. Keep in mind that the majority of people were against them at the time.
                                                                          • The Civil Rights movement of the 1960’s fought hard for free speech. The movement won a landmark case, New York Times vs Sullivan, in which Martin Luther King supporters were sued for running an ad which criticized the police.
                                                                          • Black Civil Rights activists were also arrested for: praying, “parading, demonstrating, boycotting, trespassing and picketing.”, “statements calculated to breach the peace.”, “distributing literature without a permit.”, “conduct customarily known as ‘kneel-ins’ in churches.”

                                                                          Nevermind that Twitter for instance has an uncommon number of neoracists as well.


                                                                          I flagged your comment as unkind, because essentially it is a low-effort post made to flippantly accuse somebody without evidence, and there is zero fellowship regard (much less an assumption of good faith) towards Alex to the point of even misrepresenting what he wrote.

                                                                          1. 16

                                                                            Friendly warning: anything anywhere that mentions transphobia or nazis becomes a bozo bit here on Lobsters. Don’t try to argue semantics, don’t appeal to actual text or logic or history, don’t waste yours or anybody else’s time–just steer clear of it and save those cycles for making things or engaging in communities with more mature discussion capabilities.

                                                                            1. 2

                                                                              You’re too wise for this place

                                                                              1. 7

                                                                                Wisdom is what you get when you do something really stupid but take notes.

                                                                                …I’ve taken a lot of notes.

                                                                            2. 17

                                                                              “Women’s sex-based rights” is absolutely a dogwhistle for transphobia, and if you look at what he wrote in his own words he says that ‘transgenderism [was] first popularized on Tumblr’ (?????), links the “TERF is a slur” page, and says “transgender ideology is fiction”. He’s transphobic through and through.

                                                                              I’m also extremely unconvinced that there’s no way to prevent people from being actual literal Nazis while not hurting marginalized people. Like, if someone was to come in to the comments section of a Lobsters post and say “by the way, I think we should kill all the Jews”, they’d get flagged and banned, right?

                                                                              And you’re ignoring the fact that constantly seeing people say that they think people like me (hi, I’m trans) are abominable freaks that are better off dead, or even ‘just’ mentally ill people who need to stop pretending, is likely to push me away from a place. This is going to happen with any sort of ‘free speech’-focused Masto instance: the bigots will migrate to your instance because they get kicked off elsewhere, and the people who don’t want to have to deal with bigots are going to go elsewhere.

                                                                              And, going back to the list, it’s not just gleasonator. As someone who’s used Fedi for several years, every single one of those instances aside from mstdn.social is one that I’ve had shitty experiences with. And it’s not a coincidence that mstdn.social is the only one that’s described as not allowing racism or sexism!

                                                                              1. 11

                                                                                Like, if someone was to come in to the comments section of a Lobsters post and say “by the way, I think we should kill all the Jews”, they’d get flagged and banned, right?

                                                                                Yes. And it’s happened: a few years ago a comment on a story about net neutrality attempted to use that to explain why the U.S. should commit genocide in the middle east. I deleted it and banned the author.

                                                                                1. 2

                                                                                  … now that’s a leap.

                                                                              2. 8

                                                                                branching off from the thread, that quote is infuriating. they make the following argument:

                                                                                • marginalized people are affected by censorship (citing civil rights activists)
                                                                                • gab does not participate in censorship
                                                                                • gab has literal nazis on it

                                                                                therefore:

                                                                                • it’s ok for gab to continue to host literal nazis because banning them is similar to the prejudice that civil rights activists face

                                                                                i.e. propagating the speech of people arguing for an ethnostate and committing real-life violence against minorities is somehow beneficial for those same minorities. fucking inane.

                                                                                1. 7

                                                                                  My comment below is terribly off-topic, I think.

                                                                                  A transphobe is someone who fears or has a negative perception of trans people. Supporting “women’s sex-based rights” is the same as saying that people born with female sex organs have different rights than trans people who are women. That is a negative perception of trans people who are women. Saying that women who were born with female sex organs have different rights than trans people who are women is, precisely, transphobia.

                                                                                  Your comment is a low-effort attempt to deny that basic fact; if you recognize that trans people exist, saying they should be denied affordances that cis people have is clearly a manifestation of transphobia.

                                                                                  That claim is so obviously false and inflammatory that I have flagged your comment as a troll. I’ve done you the courtesy of leaving this comment explaining why even though my own comment should rightly be flagged as offtopic. That’s because I’m assuming some good faith even though the obviously false and inflammatory nature of your comment makes me think that’s vanishingly unlikely.

                                                                        1. 3

                                                                          There are sort of two critiques here–one is whether to arrange data in a columnar (e.g. all ant types stored contiguously, all ant sizes stored contiguously) or row (for each ant, all properties stored together) oriented format. Another is just that pointer-chasing is expensive, and if you aren’t going to reference data multiple times, you don’t necessarily need that pointer indirection.

                                                                          On the latter problem (and perhaps to some extent, the first) readers might be interested in something like Zach Tellman’s Vertigo, which offers nestable, packed structs and arrays backed by flat JVM ByteBuffers. Rather than chasing multiple pointers, you can jump directly to any field at any offset immediately. Vertigo also shows some possibilities for friendly iteration syntax, lazy views over structs, etc.
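
                                                                          To make the layout distinction concrete, here is a minimal sketch in Python rather than Clojure, and it does not use Vertigo's API. The ant record (a one-byte type plus a 32-bit float size) and every function name are assumptions made up for illustration. It packs records into one flat buffer so that any field is reachable by offset arithmetic instead of pointer-chasing, then shows the columnar alternative.

```python
import struct
from array import array

# Hypothetical ant record: type (1 unsigned byte) followed by size (32-bit float).
# "<" means little-endian with no padding, so each record is exactly 5 bytes.
ANT = struct.Struct("<Bf")

def pack_ants(ants):
    """Row-oriented: pack (type, size) pairs back to back in one flat buffer."""
    buf = bytearray(ANT.size * len(ants))
    for i, (kind, size) in enumerate(ants):
        ANT.pack_into(buf, i * ANT.size, kind, size)
    return buf

def ant_size(buf, i):
    """Read the size of ant i by jumping straight to its offset."""
    # Skip i whole records, then the 1-byte type field.
    return struct.unpack_from("<f", buf, i * ANT.size + 1)[0]

def to_columns(ants):
    """Column-oriented: all types stored contiguously, all sizes contiguously."""
    types = bytes(kind for kind, _ in ants)
    sizes = array("f", (size for _, size in ants))
    return types, sizes
```

                                                                          Scanning a single field across many ants touches less memory in the columnar form, while reading every field of one ant is a single contiguous read in the packed row form.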

                                                                          1. 7

                                                                            I’m confused: WAS this a Byzantine fault? The post only describes a network partition: every node appears to have followed the Raft protocol correctly. That’s a normal behavior of asynchronous networks, not a Byzantine failure. It sounds like the partition interfered with Raft elections, but didn’t actually cause a safety violation.

                                                                            In the RAFT protocol, cluster members are assumed to be either available or unavailable, and to provide accurate information or none at all.

                                                                            This is sort of… half-true. Raft is designed to preserve safety under any asynchronous network conditions, including partitions like this one, and they haven’t described any kind of safety violation. Like all consensus systems, network partitions can interfere with availability, and during a partial network partition like this one you can wind up with less-than-ideal availability. Now… the Raft paper does say:

                                                                            [Nodes] are fully functional (available) as long as any majority of the servers are operational and can communicate with each other and with clients.

                                                                            And this claim is violated here! At the same time, we should understand that claim in the context of the paper’s repeated cautions around availability:

                                                                            … availability (the ability of the system to respond to clients in a timely manner) must inevitably depend on timing. For example, if message exchanges take longer than the typical time between server crashes, candidates will not stay up long enough to win an election; without a steady leader, Raft cannot make progress.

                                                                            Regardless, we do have what looks like a violation of the majority-liveness claim: it sounds like a partially isolated node can rapidly advance its own epoch through elections, forcing a well-connected majority component to go down. Curiously, there ARE mechanisms in the Raft paper which are specifically intended to address this:

                                                                            To prevent this problem, servers disregard RequestVote RPCs when they believe a current leader exists. Specifically, if a server receives a RequestVote RPC within the minimum election timeout of hearing from a current leader, it does not update its term or grant its vote. This does not affect normal elections, where each server waits at least a minimum election timeout before starting an election. However, it helps avoid disruptions from removed servers: if a leader is able to get heartbeats to its cluster, then it will not be deposed by larger term numbers.

                                                                            Does this mechanism fail to address this partial-partition scenario? Did etcd opt not to implement it? Or could there be a bug in etcd? Worth investigating!
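
                                                                            For readers who want the quoted rule in concrete form, here is a rough sketch of the vote guard in Python. It is not etcd's code: the Server fields and the handle_request_vote name are assumptions, and the log up-to-date check that real Raft performs before granting a vote is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Server:
    current_term: int
    voted_for: Optional[str]      # candidate voted for in current_term, if any
    last_leader_contact: float    # time we last heard from a current leader
    min_election_timeout: float   # same time units as last_leader_contact

def handle_request_vote(server, candidate_id, candidate_term, now):
    """Return True if the vote is granted. Omits Raft's log comparison."""
    # The guard from the quoted passage: if we heard from a leader within the
    # minimum election timeout, neither update our term nor grant a vote.
    if now - server.last_leader_contact < server.min_election_timeout:
        return False
    # Stale candidates are rejected as usual.
    if candidate_term < server.current_term:
        return False
    # A newer term resets our vote for that term.
    if candidate_term > server.current_term:
        server.current_term = candidate_term
        server.voted_for = None
    if server.voted_for in (None, candidate_id):
        server.voted_for = candidate_id
        return True
    return False
```

                                                                            With this guard, a partially isolated node that keeps bumping its term cannot drag up the terms of followers that are still receiving heartbeats, which is the disruption the quoted passage describes.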

                                                                            1. 8

                                                                              The introduction has a lovely bit of Norwegian:

                                                                              Freed, we dance
                                                                              For an eyeblink, we play
                                                                              We thousand small leafships
                                                                              we anticipate, on that clear morning light
                                                                              

                                                                              That’s a wonderful introduction to the story of the seedling!

                                                                              1. 18

                                                                                Oh thank goodness it worked. My Norwegian is marginal at best, and I really worried I messed up my article agreement or use of på/i in that poem.

                                                                                1. 3

                                                                                  I’m fluent in Swedish rather than Norwegian, but to me “på” fits better since that preposition translates as “on top of” rather than “i” which would be “inside of” or “encompassed by”; and they are leafships.

                                                                                  I did a quick check, looks like Swedish and Norwegian prepositions work the same way.

                                                                                  Nice poetry, thanks for this nifty post!

                                                                                  1. 7

                                                                                    Thank you! And… that’s what I was hoping for as well. På/i has been such a challenge for me–I once told a friend I was i kjøkkenet (in the kitchen) and he stared at me as if I’d uttered something completely unparseable: one can only be i certain rooms of the house. One is på hytta (upon the cabin) but i huset (in the house). One is i Oslo, but på Røros, because… inland or mountainous towns are something one is on, rather than a coastal city, which one is within, except for places like Skjåk? One is på shops, libraries, and restaurants (I think because there’s a sense that these aren’t just places, but sort of… activities that one has embarked upon?) ANYWAY languages are cool and hard and I like them, CARRY ON

                                                                                2. 1

                                                                                  This was a fantastic read. I was a bit hesitant based on some of the other recent interview links that ended up turning into long discourse NOT about the article, but your quote of the Norwegian hooked my interest. Thank you for pulling that out.

                                                                                  Highly recommend this for anyone that wants to discuss the “correct” answer to FizzBuzz.

                                                                                1. 17

                                                                                  On a table you would find Jepsen, essentially the equivalent of Yelp, ideally meant to keep restaurants honest, generally correct, but ultimately not that different from other tech companies,

                                                                                  This is a really strange take on the role that Jepsen plays in the ecosystem. Kyle isn’t motivated by any of the factors that make Yelp so pathological; if we’re going to place Jepsen into the restaurant metaphor, I’d say they’re the health inspector. Can you say more?

                                                                                  1. 3

                                                                                    Hi Peter, it’s part of the truly ranty part of the post (and explicitly marked as such), so I wouldn’t take it too seriously, but in this one blog post dedicated to describing my experience with Redis, I wanted to call out all the organizations that in one way or another I feel have not done justice to Redis, even if not directly and without monetary reasons to do so.

                                                                                    The history between Redis, Antirez, and Aphyr is pretty long and I don’t really intend to dive into it in full (I probably don’t even know half of it tbh), but recently Jepsen did an audit of RedisRaft and published both a video and a post on it. I did find hilarious how much he was bashing Redis Labs in the video, and I think all of it is well deserved, but when it comes to bashing Redis itself, I can’t help but notice a hint of malice in the way everything was worded. Maybe it’s just me being too sensitive about something I care about.

                                                                                    As for the analogy, I think the bar for health inspectors is a bit higher: they certify a place is safe to eat, while given the nature of the tests Jepsen performs, it only states that the plates Aphyr ate did not contain bugs or hairs, nothing more than that. That’s the big premise of every Jepsen audit and you are basically told to draw your own conclusions, not too differently from Yelp.

                                                                                    Also, while technically correct to contextualize the audits as Jepsen does, it’s still kinda lame how the company can basically only deal slaps and nothing else.

                                                                                    But, as I said, maybe it’s just me being too sensitive :)

                                                                                    1. 15

                                                                                      As for the analogy, I think the bar for health inspectors is a bit higher: they certify a place is safe to eat, while given the nature of the tests Jepsen performs, it only states that the plates Aphyr ate did not contain bugs or hairs, nothing more than that.

                                                                                      Well, all health inspectors do, ultimately, is certify that they didn’t find rats or unhealthy practices during their audit, right?

                                                                                      Regarding malice, I understand why you think the way you do. The distsys community kind of has their collective nose in the air about Redis, and it can manifest out in… not wonderful ways in backchannels and private conversations. But their frustrations aren’t totally unfounded. Redis is a fantastic tool with an extraordinary design and elegant implementation for what it is, which is a single-core single-machine data structure server. But each of the distsys features that Redis added were flawed, and not in an edge case or implementation detail sense but in a fundamentally unsound sense. The work reflected a kind of “let’s do this from first principles” approach to the problem space, which probably served the project very well for a lot of other features but is basically untenable when solving distsys problems. If a mis-step happens once it’s not a big deal, but Redis kept hammering away at different problems with this same attitude, and each of the solutions were similarly unsound. So the distsys folks kind of responded to that. Again, not well! But that’s the history there.

                                                                                      In any case, thanks for the context. I’m a happy user of, and advocate for, Redis, so it’s interesting to see your perspective.

                                                                                      1. 2

                                                                                        Thank you, I’m also a happy user of go-kit and generally appreciative of your Go insights. I agree that Redis is far from perfect, I just think that learning to appreciate OSS projects in their full nature (flaws included) is the starting point to reclaiming control of OSS software from VC-funded madness. The technical debate should continue to exist, but lately I feel we’ve all been tricked into concentrating on that so that we wouldn’t notice what was happening on every other level.

                                                                                      2. 11

                                                                                        given the nature of the tests Jepsen performs, it only states that the plates Aphyr ate did not contain bugs or hairs, nothing more than that. That’s the big premise of every Jepsen audit and you are basically told to draw your own conclusions, not too differently from Yelp.

                                                                                        I’m not happy about Hume’s problem of induction either, but it’s a real limitation of experimental inquiry, both theoretically and in practice. I can and do miss bugs, because modern distributed systems are incredibly complex, the space of faults under which they operate is intractably large, and I have limited time and make mistakes–not to mention that many of the core problems involved in this kind of work are, in general, NP-complete. The fact that experimental techniques work on concurrency verification at all is somewhat surprising, and has been the subject of several research papers.

                                                                                        I include these disclaimers in Jepsen because even though the limits of empiricism have been discussed for somewhere between 250 and 2000 years, users and vendors alike continue to cite Jepsen reports as “proof” of a distributed system’s correctness–and to treat Jepsen reports as if they were a uniform standard, rather than an investigatory process, tailored to the system in question, and shaped by the limits of time, money, computational power, space, system knowledge, and algorithmic know-how.

                                                                                        Acknowledging the limits of Jepsen’s methods is different from saying “draw your own conclusions”. In roughly six weeks of collaboration with Jepsen, Redis-Raft went from a system that could not execute a single operation or survive a single process fault, to providing strict serializability during hundreds of hours of mixed network and process faults. That’s nothing to sneeze at, and I noted as much in the talk and report.

                                                                                        1. -1

                                                                                          Acknowledging the limits of Jepsen’s methods is different from saying “draw your own conclusions”.

                                                                                          As usual, you took a very long road to express in an ultimately self-celebratory way the same overall concept I sketched in two words. You’re also absolutely correct about the limits of the approach you employ and in your precision in pointing them out, as I also noted in my own, way less formal rant.

                                                                                          Unfortunately, that doesn’t make the end result any less lame.

                                                                                    1. 2

                                                                                      I find it curious that all the examples in the post involve one of the transactions performing blind updates.

                                                                                      I don’t think these kinds of transactions are very common in normal OLTP usage, and many papers describing new consistency models explicitly point out “we assume no transactions perform blind updates”, or automatically insert “invisible reads” in transactions if they try to update a key without reading it first. Maybe that’s why these bugs went unpatched for so long?

                                                                                      1. 8

                                                                                        In my experience (webapps, HTTP APIs, streaming systems, search, monitoring), blind-write workloads are extremely common. It doesn’t matter though–we find just as many cycles even when reads precede every write.
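
                                                                                        For anyone unsure what counts as a blind write here, a small sketch in Python: a transaction is modeled as a list of (op, key) pairs, and a write is blind if the transaction has not read that key earlier. The representation and the blind_writes name are assumptions for illustration, not Jepsen's actual data model.

```python
def blind_writes(txn):
    """Return the keys a transaction writes without reading them first.

    txn is a list of (op, key) pairs, where op is "r" (read) or "w" (write).
    """
    read_keys = set()
    blind = []
    for op, key in txn:
        if op == "r":
            read_keys.add(key)
        elif op == "w" and key not in read_keys and key not in blind:
            blind.append(key)
    return blind
```

                                                                                        A transaction that reads x and then writes x and y has one blind write, to y; a read-modify-write of a single key has none.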

                                                                                      1. 13

                                                                                        Sometimes, Programs That Use Transactions… Are Worse

                                                                                        You ever see a reference that is such a deep cut you feel like it’s written for you, specifically?

                                                                                        1. 9

                                                                                          I am personally delighted every time someone catches these. :)

                                                                                          1. 2

                                                                                            Haha, amazing! Thank you for introducing me to this.

                                                                                          1. 5

                                                                                            I liked the analysis, but the part about ACID and Snapshot Isolation reads a bit off:

                                                                                            Snapshot isolation is a reasonably strong consistency model, but claiming that snapshot isolation is “full ACID” is questionable.

                                                                                            Marketing usage might be deceptive, but to me this reads like you’re saying that only Serializable transactions are ACID. But even Serializable transactions exhibit real-time anomalies. Then you’re left with strict serializability, which afaik only Spanner guarantees for distributed settings. Can Postgres still claim to have ACID transactions? Is Spanner then the only player with “full ACID” transactions?

                                                                                            Although the “I” in ACID means Isolation, there are degrees. Both Read Committed and Strict Serializable transactions are, to me, “fully ACID”.

                                                                                            1. 13

                                                                                              I’d disagree! ACID isn’t… really a well-defined property, which is why I couch it carefully in this report, but in general ACID “isolation” is understood to have something to do with transactional interleaving and equivalence to a serial history. Realtime-only anomalies in serializability don’t violate isolation, because they don’t create (visible) interleavings of operations across transactions. In this sense, serializability is the weakest of several consistency models which provides ACID “I”.

                                                                                              Consistency is a bit of a different beast, because there are application-level invariants which could be violated by serializability, but preserved under, say, strong session or strict-1SR. For those cases, yeah, you could argue serializable isn’t ACID either. That’s part of why I don’t like “ACID” as a descriptor, but people keep using it, so… here we are.

                                                                                              1. 2

                                                                                                Fair enough! It doesn’t help that the literature can call Snapshot Isolation (or any other) both an “Isolation level” and a “Consistency criterion”, although I see the first one less and less nowadays.

                                                                                                That’s part of why I don’t like “ACID” as a descriptor

                                                                                                It’s true that ACID transactions can mean different things to different people, which I guess is why it’s the perfect marketing material. With that… it’s on us to come up with a new hip term to sell to clients!

                                                                                            1. 14

                                                                                              Another (over my head) but nice, thoughtful job!

                                                                                              @aphyr, knowing that you spend a non-trivial amount of time doing your analysis, what prompted the un-compensated analysis?

                                                                                              1. 46

                                                                                                A few things came together! One is that people are always mentioning Jepsen and MongoDB together, and asking what it does now, and I keep fumbling the ball. Another is that the MongoDB Jepsen test suite is one that people frequently try to run, and when it fails, they file GitHub issues asking what’s wrong, so it’s been in the back of my head like “yeah, I need to go dig into that at some point.”

                                                                                                About… maybe a month ago, Evan Weaver, from FaunaDB, sent me a link to the MongoDB Jepsen page which accidentally forgot to talk about default behavior. I was busy and forgot about it until Jepsen got tagged in a Twitter thread where a MongoDB developer advocate said “We are passing the Jepsen test suite and it was back in 2017 already. So, no, MongoDB is not losing anything if you know what you are doing.”, and linked to the page again! THAT was like oh, yeah, I REALLY gotta do this.

                                                                                                https://twitter.com/MBeugnet/status/1253622755049734150

                                                                                                I’d just finished a full rewrite of Jepsen’s generator system and I needed a project to use as a proving ground, to make sure it was actually usable before release. Dug into the MongoDB test suite code and realized I couldn’t get it to run either–it’d accrued a bunch of code for other environments, and I think somewhere along the way it stopped working with a standard Jepsen environment. I started a rewrite expecting I’d basically just confirm transactions were SI, but then… I found the defaults weren’t, and then even when I fixed the test suite to use the correct safety levels THOSE looked broken, and it just kinda snowballed from there.

                                                                                              1. 2

                                                                                                The date is today’s date, but 2019, is this a year old?

                                                                                                1. 11

                                                                                                  This is what I get for writing “2019-TODO” in October and only updating the TODO part before release!

                                                                                                  Should be fixed momentarily, I’ve just been waiting on gcloud to pick up the changes. Takes forever.

                                                                                                  1. 1

                                                                                                    A glimpse into how long it takes to work on these! Very curious about this now, actually…

                                                                                                    1. 35

                                                                                                      Each report generally involves 2-3 months of my full-time work before the report is “finished”, and from that point, there’s an up-to-three-month delay before publication, which gives folks a chance to make bugfixes, inform users of risks, test and cut releases, etc. I check in a day or two prior to publication to make last-minute updates–“this issue is resolved by version foo”, etc.

                                                                                                      There’s generally 1-2 days of research and documentation review, about a week to write a basic “it turns on” test harness for a new database, and another 2-16 weeks of exploration, refinement, new workloads, failure modes, etc. Throughout this process I’m in close contact with the client, showing them my findings, asking for help understanding expected behavior, suggesting documentation fixes or possible algorithmic issues, debating whether something is “really broken”, etc.

                                                                                                      I spend about a week actually writing each report–sometimes multiple weeks, depending on the complexity. Sometimes I write at the end based on my “lab notebook”–really, a text logfile. Other times, I try to write throughout the process. Concurrent with the writeup, there’s a lot of work involved in collecting type specimens for each error, condensing those specimens to something you can actually understand in the paper, writing up bug tracker tickets, figuring out which of the 20-odd development builds I tested included/fixed which bugs, etc.

                                                                                                      Client comments can add additional weeks of back-and-forth; with some clients I just get a “looks good, ship it!”, and with others, we go through literally dozens of drafts fine-tuning language and debating how to visualize or label different issues. I review my own work for structure, language flow, and typos with at least three passes, spoken out loud. For a ~10 page paper, that’s multiple hours, possibly more than a day, of work each pass.

                                                                                                      And despite my own proofreading, as well as having a half-dozen client reviewers, and occasionally peers, we still sometimes miss obvious things like “what year is it”! ;-)

                                                                                                      1. 1

                                                                                                        I’ve always wondered about your process to get these out. What percentage of the systems you test are open as opposed to proprietary systems about which you don’t post online?

                                                                                                        1. 1

                                                                                                          Good question! I was worried, when I started out, that folks would try to weasel their way into keeping analyses from the public eye, and wrote a policy specifically to address this: https://jepsen.io/ethics.

                                                                                                          Whether a system is open vs proprietary is a different question from whether I release a public analysis. I test completely open-source systems and completely closed-source ones alike, and they both get the same treatment at release time: test suite, examples, and the report are free for everyone.

                                                                                                          Basically, when folks ask for Jepsen work, they choose whether they’d like an actual public analysis or not. If they say yes, it is (I mean, assuming I can physically do the work, get paid, etc) getting published. Almost everyone chooses public. I think the last private Jepsen-analysis work I did was… one week in 2017?

                                                                                                          In addition, sometimes I do internal consulting, like reading docs or talking to engineers about their plans for a DB or infra. And there’s classes, trainings, and internal tech talks, most of which I don’t talk about publicly. I don’t think that stuff matters as much as my analysis work though–it’s generally not of public interest that I taught at FooCorp. :)

                                                                                                          1. 1

                                                                                                            Thanks aphyr! Nice to know that the percentage is quite heavy on the open side.

                                                                                                            Since you mentioned trainings, would love to be in one of your distsys classes. I’m not sure if my employer would be willing to sponsor a class but if I could attend one of your classes that you organize in the open or with employers, that’d be great! :)

                                                                                                            1. 1

                                                                                                              They’re welcome to talk to me about running a class–I don’t have any open sessions planned right now though.