Threads for dgrnbrg

    1. 3

      I used Datomic to build Cook, a multitenant preemptive batch & Spark scheduler for Mesos. We’ve used all kinds of features of Datomic, such as the object-like API, many different query features, raw index access, transaction functions, and log tailing. We also developed our own core.async API for Datomic to simplify its usage from a concurrent application (especially with retries).

      I’d be happy to answer any questions about the experience.

      1. 1

        We open-sourced some code to do that. For example, to have retrying core.async-API transactions for Datomic, “update-in” like functionality, and idempotency, this whole namespace has that functionality:

    2. 1

      Nice! I don’t have too much experience with Datomic, so I just have two pretty general questions:

      • Are you using the free version of Datomic in production? If so, have you had any issues with the embedded storage engine vs. the external ones like SQL or DynamoDB?
      • Have you open-sourced your core.async API for Datomic as an independent project? I remember trying to find one back in the day when I was exploring Datomic, and I was quite baffled not to find any leads.
      1. 2

        We’re using Datomic Pro with Riak–it’s worked quite well for us for the past ~18 months.

        Our API is here: We also include some schema snippets if you want to have all-or-nothing commits across many transactions (for giant non-atomic but isolated transactions).

        transact-with-retries and update are the 2 most powerful functions. They have several variants for blocking and core.async compatible forms. Also, there’s a helper for making idempotent transactions. We built all this to integrate with core.async and enable our application to have zero issues during Datomic failovers, if a transaction were to fail.
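
To make the retry-plus-idempotency idea concrete, here is a minimal sketch in Python rather than Clojure. `FakeDb`, `TransientError`, `transact_with_retries`, and the idempotency-key scheme are all hypothetical stand-ins for illustration; they are not the API of the open-sourced namespace or of Datomic.

```python
import time


class TransientError(Exception):
    """Stands in for a timeout or failover error from the database."""


class FakeDb:
    """In-memory stand-in for a transactor that fails its first `failures` calls."""

    def __init__(self, failures=0, commit_before_failing=False):
        self.failures = failures
        self.commit_before_failing = commit_before_failing
        self.applied_keys = set()
        self.facts = []

    def transact(self, key, facts):
        fail = self.failures > 0
        if fail:
            self.failures -= 1
            if not self.commit_before_failing:
                raise TransientError("transactor unavailable")
        # A key that was already applied makes the transaction a no-op, so a
        # retry after an ambiguous failure cannot apply the facts twice.
        if key not in self.applied_keys:
            self.applied_keys.add(key)
            self.facts.extend(facts)
        if fail:
            # The write landed, but the caller never hears about it.
            raise TransientError("timed out waiting for the result")
        return len(self.facts)


def transact_with_retries(db, key, facts, attempts=5, base_delay=0.0):
    """Retry transient failures with exponential backoff; re-raise on the last attempt."""
    for attempt in range(attempts):
        try:
            return db.transact(key, facts)
        except TransientError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

The idempotency key is what makes blind retries safe: without it, a failover that commits the write but loses the response would lead to a duplicate on retry.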

  1. 7

    What do folks generally do as far as monitoring, alerting and logging in the brave new containers world? Do you have to now monitor both the host machine and the container insides? How do you deal with the added complexity?

    1. 5

      It’s generally a good idea to err on the side of over-monitoring, as you never know what random question may be useful to answer while handling an incident. Monitor your host system metrics, monitor your mesos metrics ( is a dead simple metric collection example for this), and monitor your workloads. This is existing best practice, and there are many existing tools.

      When I was an SRE at Tumblr I gained a lot of respect for host-local metric and log aggregators that forward to clusters and buffer to disk when downstream failures occur - these are super useful in the context of Mesos as well. This way your tasks can hit a local endpoint, and the local aggregator worries about remote failure handling or downstream reconfiguration in a uniform way.
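
A minimal sketch of the buffer-to-disk pattern described above, assuming a hypothetical `send` callable that raises `ConnectionError` when the downstream cluster is unreachable. `LocalAggregator` and the JSON-lines spool file are illustrative choices, not any particular tool's design.

```python
import json
import os
import tempfile


class LocalAggregator:
    def __init__(self, send, spool_path):
        self.send = send              # forwards one metric downstream; may raise
        self.spool_path = spool_path  # on-disk buffer used while downstream is down

    def record(self, metric):
        """Called by local tasks; downstream failures never reach the caller."""
        try:
            self.flush()       # drain buffered metrics first to keep ordering
            self.send(metric)
        except ConnectionError:
            self._spool(metric)

    def _spool(self, metric):
        with open(self.spool_path, "a") as f:
            f.write(json.dumps(metric) + "\n")

    def flush(self):
        """Replay buffered metrics; keep the unsent tail on disk if sending fails."""
        if not os.path.exists(self.spool_path):
            return
        with open(self.spool_path) as f:
            pending = [json.loads(line) for line in f if line.strip()]
        sent = 0
        try:
            for metric in pending:
                self.send(metric)
                sent += 1
        finally:
            remaining = pending[sent:]
            if remaining:
                with open(self.spool_path, "w") as f:
                    for metric in remaining:
                        f.write(json.dumps(metric) + "\n")
            else:
                os.remove(self.spool_path)
```

Tasks only ever call `record`, so retry and reconfiguration logic lives in one place per host instead of in every workload.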

      One thing I’ll warn about in a more dynamic environment is that you should test anything that you rely on to re-resolve DNS. The JVM, for instance, caches indefinitely unless you explicitly tell it not to on initialization. While DNS is a nice universally half-implemented solution, a ton of stuff will fail to re-resolve during timeouts or connection failures at any threshold.
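
On the JVM the relevant knob is the `networkaddress.cache.ttl` security property. The general pattern, looking the name up again on every reconnect instead of holding on to the first answer, can be sketched in Python as below. `connect_with_reresolve` is a hypothetical helper, and the OS resolver may still cache on its own.

```python
import socket


def resolve(host, port):
    """Look the name up again; this function keeps no cache of its own."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [info[4][:2] for info in infos]  # (ip, port) pairs


def connect_with_reresolve(host, port, attempts=3,
                           resolver=resolve,
                           connector=socket.create_connection):
    """Try to connect, resolving the hostname afresh on every attempt."""
    last_error = None
    for _ in range(attempts):
        try:
            addresses = resolver(host, port)
        except OSError as error:  # socket.gaierror is a subclass of OSError
            last_error = error
            continue
        for address in addresses:
            try:
                return connector(address)
            except OSError as error:
                last_error = error
    raise ConnectionError("could not connect to %s:%s" % (host, port)) from last_error
```

The `resolver` and `connector` parameters exist so the failure path can be exercised in a test without touching the network, which is the kind of check the warning above is asking for.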

    2. 3

      We use one monitoring system for Mesos itself (soon we’ll be open-sourcing it!), and have applications & containers self-report metrics & alerts to hosted instances of Riemann. Essentially, this even allows us to split the ownership responsibility of the applications on the cluster vs. the cluster infrastructure.

  2. 5

    having never really used databases other than simple persistence stores, the bit i’m most curious about is how stored procedures are managed as code - in particular, can you include them as part of your source tree and deploy them into production the way you would compile and push a web application to your server? or are they treated more akin to a smalltalk image, where you manage the state of your database code entirely within its own environment?

    1. 9

      It’s (usually) a very fragile list of separate SQL script “migrations” from a known start state into the desired state, and these scripts are what you keep in your VCS, or in some cases in a separate store such as an issue manager, because there’s often a difference between:

      • Changes to the database that can be done before deployment
      • Changes that must only be done at deployment time
      • Changes that massage existing data rather than the structures of the database - these often need to be different for each environment, say across regions, or DEV/TEST/UAT/PRD/etc

      It can get hairy, quick :D

      1. 3

        that in and of itself would cause me to be very wary of using them, even though they otherwise sound like an excellent idea.

    2. 3

      in particular, can you include them as part of your source tree and deploy them into production the way you would compile and push a web application to your server?

      Of course. Store your functions/procedures/triggers in .sql files in a source tree. Load those files to perform your migration.

      As Sophistifunk points out, there are different types of migrations (safe whenever, need to coordinate with app deploys, etc), but that’s true whether you have logic in your database or not. The only complexity added by putting logic in your DB is the same sort of thing you get managing library dependencies – make sure the library’s semantics don’t change between versions, etc, etc.

      If you’re using functions/procedures/triggers to implement a library of functionality (like, e.g. user management), then my only strong recommendation is: treat it like a third-party library. That means: put it in a separate source repository, think about how to make calling it consistent and stable, things like that. It will mean a little more work up front, but it will also enforce a boundary that should both reduce ongoing work and prompt the sort of thought that can help prevent feature creep.

    3. 2

      A great thing about Datomic is that stored procedures are actually serialized code that can be updated by the application itself; with all of our stored procedures in Datomic, they’re versioned with the code and configured at application startup, so we avoid the fragile migration business that other stored procedure systems might need to handle.
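
A rough sketch of the "configure at application startup" idea, in Python with a plain dict standing in for the database. `install_functions` and the hash-comparison scheme are assumptions made for illustration; they are not Datomic's API or necessarily how our code does it.

```python
import hashlib


def install_functions(db, functions):
    """Install or update every function whose source differs from what is stored.

    `db` maps function name -> {"source": ..., "sha256": ...} and stands in for
    the database; `functions` maps name -> source text from the application's
    own source tree. Returns the names that were (re)installed.
    """
    changed = []
    for name, source in sorted(functions.items()):
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        if db.get(name, {}).get("sha256") != digest:
            db[name] = {"source": source, "sha256": digest}
            changed.append(name)
    return changed
```

Because the step is idempotent, every application start can run it unconditionally, and the stored functions always match the deployed code.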

  3. 4

    Presenter here! I’m happy to answer any questions about Mesos that weren’t covered in the video.

    1. 2

      Thanks, this was a really cool talk! It clarified a lot of stuff about Mesos for me.

      The Mesos paper (in section 3.6) says:

      … to deal with scheduler failures, Mesos allows a framework to register multiple schedulers such that when one fails, another one is notified by the Mesos master to take over. Frameworks must use their own mechanisms to share state between their schedulers.

      I’m very curious about this, but I’ve never been able to find any documentation / more info on it. Any chance you have any links to more info / know where I should be looking?

      1. 3

        Sure! I don’t think the feature exists with the exact API you’ve described, although that’s possible for Mesos 1.0 based on discussions I’ve had with Mesos core committers.

        Instead, here’s what you’d do: first, use a leader election system, like Curator’s LeaderSelector + Zookeeper, etcd, or Hazelcast. Once a leader is elected, it can create its Scheduler instance, whose FrameworkInfo must have an existing FrameworkID and failover_timeout set [1]. Then, when it starts, it’ll forcibly take over for the given FrameworkID.

        In our system, we use Curator to do leader election and store the FrameworkID. First, we check if the FrameworkID has been set at a predetermined ZooKeeper path. If not, you’ll be assigned a new FrameworkID on startup (and then you can store it for subsequent runs). It may also be possible to simply choose a FrameworkID directly rather than letting it be generated and choosing the ZooKeeper path; I haven’t tried.
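
The flow above can be sketched as follows. This is Python with in-memory stand-ins: `InMemoryStore` plays the role of the ZooKeeper path, `FakeMaster` plays the Mesos master, and `FRAMEWORK_ID_PATH`, `build_framework_info`, and `become_leader` are hypothetical names. Leader election itself is assumed to have already happened before `become_leader` is called.

```python
import itertools

FRAMEWORK_ID_PATH = "/example/framework-id"  # hypothetical predetermined path


class InMemoryStore:
    """Stand-in for a ZooKeeper client: a path -> value map."""

    def __init__(self):
        self.data = {}

    def get(self, path):
        return self.data.get(path)

    def set(self, path, value):
        self.data[path] = value


class FakeMaster:
    """Stand-in for the master: keeps a supplied ID, otherwise assigns a new one."""

    def __init__(self):
        self._counter = itertools.count(1)

    def register(self, info):
        return info.get("id") or "framework-%04d" % next(self._counter)


def build_framework_info(store, name, failover_timeout_secs):
    info = {"name": name, "failover_timeout": failover_timeout_secs}
    framework_id = store.get(FRAMEWORK_ID_PATH)
    if framework_id is not None:
        info["id"] = framework_id  # take over for the existing framework
    return info


def become_leader(store, master, name="example-scheduler"):
    """Run by whichever scheduler instance has just won the election."""
    info = build_framework_info(store, name, failover_timeout_secs=3600.0)
    framework_id = master.register(info)
    store.set(FRAMEWORK_ID_PATH, framework_id)  # remember it for the next leader
    return framework_id
```

The first leader gets a fresh ID and stores it; any later leader finds the stored ID and registers with it, so it takes over the same framework.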


        1. 1

          interesting, thanks for the clarification!

          (sidenote, I wish this sort of thing were more clearly documented. In general the one gripe I’ve had trying to learn about Mesos is that documentation is hard to find)

    2. 2

      In the Q&A you address how persistent offers will allow durable enough storage for backing systems like HDFS that can handle their own placement & recovery from lost nodes. Is there a persistent storage solution for running a database such as Postgres inside Mesos? I checked out the ticket in JIRA but it sounds like that work is specific to HDFS/Riak style requirements.

      1. 2

        That ticket will work equally well for any persistent database on Mesos, be it Riak or Postgres. The trick is that the ticket is just providing a primitive: the ability to stay on the same machine over crashes and restarts.

        Once this exists, then we’ll be able to write a framework that handles issues like helping clients to discover where Postgres is running, to automatically configure replication and read slaves, and to automatically migrate databases between hosts.

        1. 1

          I see, makes sense. So for apps that don’t have the ability to handle their own replication the storage issues should be solved with a different tool? I’ve looked but it doesn’t look like there are any widely accepted options. Ceph RBD looks like it might do the job but I haven’t read much about it being used in production.

          1. 2

            That’s correct. Although, you probably shouldn’t be using distributed data stores without a real replication model, lest you discover the problems with replica divergence and availability the hard way.

  4. 1

    A 1-2GB cuckoo filter can track set membership of billions of items with under 1% error rate–that’s pretty cool, although I can’t think of any new applications it enables.
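
A back-of-the-envelope check of that sizing, as a sketch. It assumes the usual cuckoo filter analysis (fingerprint of at least log2(2b/ε) bits for bucket size b and false-positive rate ε) and picks b = 4 and a 95% load factor; those parameter choices are assumptions, not something from the paper under discussion.

```python
import math


def bits_per_item(error_rate, bucket_size=4, load_factor=0.95):
    """Approximate storage cost per item for a target false-positive rate."""
    fingerprint_bits = math.ceil(math.log2(2 * bucket_size / error_rate))
    return fingerprint_bits / load_factor


def capacity(num_bytes, error_rate, bucket_size=4, load_factor=0.95):
    """Roughly how many items fit in a table of `num_bytes` bytes."""
    return int(num_bytes * 8 / bits_per_item(error_rate, bucket_size, load_factor))


if __name__ == "__main__":
    for gib in (1, 2):
        print(gib, "GiB at 1% error:", capacity(gib * 2**30, 0.01), "items")
```

Under these assumptions each item costs a little over 10 bits, which puts a 1-2 GiB table on the order of a billion items.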

    1. 1

      It would be great for a terrorist watch list.

  5. 1

    What is exactly once semantics? If anyone can explain it to me simply?

    1. 1

      I linked some more in depth materials in the topic in the other branch of the thread, but it basically means guaranteeing that you’ll send each message once and only once.

      Guaranteeing message delivery is just that. That doesn’t necessarily mean it won’t get sent twice, three times, etc.

      Exactly-once messaging requires consistency if I understand correctly.

      1. 2

        Exactly-once is harder even than just having a sequentially consistent system. It is, in fact, impossible–suppose you want to send a fax (bear with me :)) exactly once when a certain event happens. Then, suppose that the call to sendFax() throws an exception. How can you know whether the fax went out or not, if the error says “didn’t receive ACK”? This fact, that there’s always some action that isn’t part of your nice, sequentially consistent data model, is what forces every real system to make compromises.
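
One common compromise is at-least-once delivery plus a receiver that deduplicates, sketched below in Python. `FaxMachine`, `LostAck`, and `send_fax` are hypothetical names for illustration. Note that it only works because the receiver cooperates by remembering message IDs, which a real fax machine does not.

```python
class LostAck(Exception):
    """The message may or may not have arrived; the sender cannot tell."""


class FaxMachine:
    """Receiver that deduplicates by message ID, making redelivery harmless."""

    def __init__(self, drop_acks=0):
        self.drop_acks = drop_acks  # how many ACKs to lose before behaving
        self.seen = set()
        self.printed = []
        self.deliveries = 0

    def receive(self, message_id, body):
        self.deliveries += 1
        if message_id not in self.seen:
            self.seen.add(message_id)
            self.printed.append(body)
        if self.drop_acks > 0:
            self.drop_acks -= 1
            raise LostAck("didn't receive ACK")


def send_fax(machine, message_id, body, attempts=5):
    """Resend until an ACK arrives; duplicates are the receiver's problem."""
    for _ in range(attempts):
        try:
            machine.receive(message_id, body)
            return
        except LostAck:
            continue
    raise LostAck("gave up after %d attempts" % attempts)
```

The message is delivered several times but has its effect once, which is usually what people mean in practice when they say "exactly once".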

      2. 1

        okay, thanks. Didn’t know it was such a difficult problem.

        1. 1

          Most computer stuff is like that :)

  6. 4

    I read the Tango paper a few months ago, and was generally disappointed, in that I felt that it didn’t have any novel contributions. Here, to me, is how Tango failed to be interesting:

    1. They claim to have a new idea for a high performance log. Their proof of this claim is essentially that SSDs are fast, and so by using something like paxos to dole out log locations, they can scalably write log entries in order. The issue here is that, for one thing, this strategy of doling out logical log indices before actually committing data has been known for a long time (for instance, in HPC durable queues).
    2. They write a lot about a system that amounts to “Hey look! Use materialized views to get value object semantics over a WAL!” This is pretty much how all databases work, so I was disappointed that they didn’t have a new idea here.
    3. They point out that they can shard the log in all kinds of cool ways, to meet any workload. Unfortunately, they can only shard the data portion of the log, not the transaction manager (i.e. the log index provider), and in order to determine whether transactions can proceed concurrently or must be serialized, they use the most restrictive trick in the book: analyze all reads and writes to do standard optimistic concurrency. The data portion of logs like this can obviously be cached and replicated to support the workload however needed.
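
    To make the three points concrete, here is a toy Python sketch of the design as I describe it above: a single sequencer doling out positions, data sharded by position, and a materialized view that replays the log with optimistic validation of each transaction's read set. All names are hypothetical, and this is my paraphrase of the approach, not the paper's actual protocol.

```python
import itertools


class Sequencer:
    """The one component that cannot be sharded: it hands out log positions."""

    def __init__(self):
        self._next = itertools.count()

    def next_position(self):
        return next(self._next)


class SharedLog:
    def __init__(self, shards=2):
        self.sequencer = Sequencer()
        self.shards = [dict() for _ in range(shards)]

    def append(self, entry):
        position = self.sequencer.next_position()  # reserve the index first
        self.shards[position % len(self.shards)][position] = entry  # then write
        return position

    def read(self, position):
        return self.shards[position % len(self.shards)].get(position)


def materialize(log, upto):
    """Replay commit records in order; apply only those whose reads are still valid."""
    state, versions = {}, {}
    for position in range(upto):
        entry = log.read(position)
        if entry is None:
            continue  # a reserved position that was never filled
        # Standard optimistic check: every key the transaction read must still
        # be at the version (log position of its last write) that it observed.
        if all(versions.get(key) == seen for key, seen in entry["reads"].items()):
            for key, value in entry["writes"].items():
                state[key] = value
                versions[key] = position
    return state, versions
```

    Every replica that replays the same log reaches the same commit and abort decisions, which is what lets the view be cached and replicated freely while the sequencer stays the bottleneck.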

    You can learn more about systems that behave like this by reading about Write Ahead Logs (for high performance transactions), Aeron [1] (HPC message queue with a similar style of fast appends), Datomic (uses a distributed WAL, materialized views, and adaptive replication [albeit simpler]), or Samza [2].

    Some areas of work that could, I think, improve the performance of systems like this:

    1. Come up with a limited form of transactions that would allow for logical merging of multiple logs, given limitations on transactions that span the transaction ID dispatcher.
    2. Come up with novel, adaptive ways to shard the queue to answer queries faster. Unfortunately, this is impossible with the structure of the usual queue, since the standard queue will have temporal locality and partition/entity locality. What about complex queries?

    [1] (please reply with a link to the project itself or video!)


  7. 4

    When I think of hypervisors, I think of running an ubuntu or centos image under the hypervisor. I think that this article points out something interesting: there’s another class of software, unikernels, that runs under a hypervisor. I believe that containers have beaten the VMs of yesterday–whole system images. On the flip side, I am not aware of any unikernels that have taken off or even been seriously considered for large production deployments–I’ve only seen POCs.

    I think that this is the fundamental point: containers share the kernel, thus inheriting isolation/security flaws and wonderful amounts of plumbing (e.g. filesystems, network APIs, etc). Unikernels share the hypervisor kernel, thus inheriting the security/isolation (which is better) but missing out on some of the higher-level plumbing. Ultimately, these systems seem to be converging on the same middle ground from two different sides: the userspace and the kernelspace.

    1. 3

      On the flip side, I am not aware of any unikernels that have taken off or even been seriously considered for large production deployments–I’ve only seen POCs.

      Pretty sure Galois' HaLVM has seen actual use. There are also some JeOS (just enough OS) flavors from a few distributions/OSes.

      1. 1

        That’s really exciting! Any idea what industries have been using it, or at what scale?

        1. 1

          I ran across this and this a few months ago. No firsthand knowledge.

  8. 1

    I wish they’d take migrations more seriously than they currently do.

    The only way to perform an arbitrary migration of data right now is to use my library Brambling.

    Which is a bit ridiculous since it took me a day or two to write and they could surely do a better job having access to the transactor and peer internals whereas I do not.

    Annoyingly, the only way to have a zero downtime migration is to have a middleware queue for your transactions which can pause and redirect tx flow.

    1. 2

      I think that migrations are philosophically opposed to Datomic’s notion of immutable history. When we’ve done migrations, we’re usually doing one of a couple things:

      1. Changing the index strategy. Here, you can just add or remove indices as needed
      2. Changing the data layout. If you are not changing your application’s API, but you are using a different schema, then both schemas can coexist and you can query old & new data either using DB rules or custom logic in the transactor. Alternatively, you can just rebuild the old entities to new ones in the same db; however, given that you always need a dual-read layer in any case, this buys you only a little over not rebuilding. Luckily, you can use database filters and reified transactions to ensure that, if you are rebuilding, data doesn’t show up in the old and new places.
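
      A tiny sketch of what I mean by a dual-read layer, in Python with entities as plain dicts. The attribute names (`user/name`, `user/first-name`, `user/last-name`) and the helper are made up for illustration; in Datomic the same logic would more likely live in a DB rule.

```python
def read_user_name(entity):
    """Return (first, last) regardless of which schema the entity was written with."""
    if "user/first-name" in entity:  # new schema: separate attributes
        return entity["user/first-name"], entity.get("user/last-name", "")
    # Old schema: one combined attribute, split on the first space.
    first, _, last = entity.get("user/name", "").partition(" ")
    return first, last
```

      Callers go through the helper and never see which layout an entity uses, so old and new data can coexist without rewriting history.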

      I can see that Brambling lets you make a new db with certain transformations of an old db, preserving the transaction structure, but I think that the other approaches (dual read layer or old/new schema) are more in line with the philosophy of datomic.

      1. 1

        I think that migrations are philosophically opposed to Datomic’s notion of immutable history.

        Having talked to them, it’s an excuse and not a reason. They used to say the same thing about shifting cardinality from One -> Many, now they support that.

  9. 10

    “So, you work in the service industry?”

    “No - I’m a chef. Cooking is definitely a profession, distinct from washing dishes and taking orders. I think it is even more than a profession, it is actually a new form of creative expression. That all said, ‘restauranteuring’ is a job that has many talented people working in it. But if they can’t cook, they are not part of my craft.”

    Establishing delineations between “those who can (code)” and “those who can’t” strikes me as nothing but an ego massage, and is a really ugly thing when you work at a company where those lines have been formed.

    1. 3

      I think that the reaction against this that some developers have comes down to a split between engineers and service professionals. In building a building, the architects, civil engineers, and construction workers perform very different tasks. If most people’s experience with construction is the contractor that redid their kitchen, it’s perfectly valid for a civil engineer who builds skyscrapers to want to be distinguished from the person who lays tile floors.

      There are certainly very skilled non-programmer technical IT professionals, but there are also many who aren’t [1] [2]. These depictions of the job “IT” cast it in a negative light that many, myself included, would rather not be associated with.

      [1] [2]

    2. 2

      It’s an ego massage. And there should be some. Being a good chef is a lot simpler than grokking today’s computer systems. Just try counting the levels of abstraction you are able to think at: user clicks on a link and a page loads. What really happened? Go!

      1. 8

        Customer orders a plate, a plate comes out. What really happened? Well, the chef was responsible for striking a balance between many different demands.

        Where did the meat come from - did he source it locally or was it purchased wholesale from a restaurant supplier? Depends on the focus of the restaurant, the demands of the owner, price evaluation… and boy, if we’re going farm-to-table things are going to get significantly more complicated!

        What spices were used? The local folks in Austin TX are going to have a different palate than a restaurant in North Dakota; you’re not going to have much luck with hákarl and brennivín outside Helsinki. So that’s an important consideration!

        Why did he pair it with kale? Is it because the taste is right? Because it’s seasonal right now? Because kale is an in-demand vegetable and therefore more likely to get ordered?

        Why is it the special tonight? Because the kale was starting to wilt and needed to be priced to move? Because the butcher had extra pork this morning?

        Dave called in sick; who am I going to put on meats? Sandra is a good commis but I don’t think she’s ready for that all on her own. Maybe Cassie can rotate in to cover?

        Ah. Table two just ordered the scallops, better get the oil heating…

        “What I do is hard, what he does is easy” is a pernicious myth that’s infected programmers. Sure, programming is hard. But lots of careers are hard. Most of the time we only think things are easy because we lack insight into what’s really happening.

        Also, as you note, there’s quite a bit happening when I click “Post” on this message. But I challenge you to find any programmer in the world that can truly explain it top to bottom. Does it make a difference that plenty of us can hand wave from the top to the bottom (“well, uh, you’ve got a bunch of transistors… on some silicon”)? Nah.

        At the end of the day we’re not that different from chefs - he works salads, I work databases. She’s the sous-chef de cuisine, he’s the team lead…

        1. 3

          I apologize, I mistook chef for a cook in your original reply. I stand corrected, being a programmer is a lot like being a chef.

          On the other hand, I doubt that a chef would not be offended by being considered just another kitchen monkey and I do not doubt that he is the most qualified person in the kitchen, able to replace any other role as required. I’ve seen programmers perform database or system administrator work and pick up help desk calls as needed. I have yet to see a help desk operator or a manager to restore a database backup.

          And yes, I believe that “hardness” and necessity of programming is only surpassed by “hardness” and necessity of the theoretical research in our domain (looking at Microsoft Research, IBM, Haskell guys, Racket guys and many more). Rest of IT are just the cooks, waiters and marketing guys.

          1. 2

            On the other hand, I doubt that a chef would not be offended by being considered just another kitchen monkey and I do not doubt that he is the most qualified person in the kitchen, able to replace any other role as required. I’ve seen programmers perform database or system administrator work and pick up help desk calls as needed. I have yet to see a help desk operator or a manager to restore a database backup.

            I’m the OP of the blog post. Bingo. This is exactly right.

            There is an awful lot of false equivalence going around. My post is, of course, full of ego. That’s intentional. Ego and pride go together.

            Most great programmers I know have a lot of pride in their work because of the amount of training (often autodidactic training) that goes into becoming a great programmer. And because people who don’t understand automation and programming simply don’t “get it”.

            The common conception of a software engineer is that it’s a person who “knows computers”. But this would be similar to saying that an astronomer just “knows telescopes” or a surgeon just “knows scalpels” or a chef just “knows cooking utensils”.

            I don’t believe that programming is the One True Craft. But I do believe, it is a craft. And that in the same way that there is a huge gap between a nurse and a doctor, or a paralegal and a lawyer, there is a huge gap between an IT analyst and a programmer. That gap includes a mixture of training, life devotion, and art – and is easiest to express as “programmers can code, and IT analysts can’t.”

            And it’s not “code” as in “make a script work”. Everyone can cook, but a world-class chef can create little meal masterpieces. Anyone can wield a scalpel, but a world-class surgeon can save your life with it. And yes, it’s true, anyone can program – but there are programs, and then there are programs!

    3. 1

      It seems to me you’re making his point? There’s a hell of a lot of difference between a chef, a maitre d', a silver service waiter, and a burger flipper / table wiper at McDonald’s.

      1. 1

        Right, but at the end of the day they’re all in the restaurant industry. Just like at the end of the day, we do “work in IT”. That’s the first point I was shooting at. Just because the author doesn’t like to be called “IT” doesn’t make it so.

        The second point I was going for: sure, there’s a big difference between the fast-food fry cook and Thomas Keller. But it’s foolishness to say one is more important than the other; McDonald’s can work without Keller but can’t work without Rita on the fries.

        And that’s the real importance of the “unskilled IT workers” others talk about (incorrectly, I’d say, because anything IT does requires some skill…). If you have a good IT department the project managers don’t have to worry about troubleshooting their printer, which means the good project managers can focus on calling the customer and negotiating requirements with management, which means the good programmers don’t have to worry about incomplete specifications and can instead… code.

        Could I troubleshoot a printer and negotiate with managers/customers? Sure! It might take me a bit longer, but I’ve done it before - we all have. Could the IT guy do my job? Likely no. But that doesn’t mean I’m necessarily more important than the IT guy. Good low-level workers act as a multiplier for levels above them.

        As a fun anecdote, I remember one day long ago when the owner of the company I worked at accidentally deleted some majorly important file and was freaking out. Guess who was the most important person that day - not us programmers, but the IT guy that recovered it for him!

        1. 1

          I have personally met many programmers that were competent programmers but sucked at IT.

          Likewise I firmly believe there are programming equivalents of fry cooks in our industry, generally those “consultants” whose main focus is to do one-off Sharepoint plugins.

  10. 2

    I’m developing a distributed job scheduler that heavily leverages Datomic, core.async, and Mesos. There are lots of interesting asynchronous message constructions, since Datomic and Mesos are event-based, and core.async makes a wonderful “glue” fabric.

  11. 3

    I’m not sure I understand the argument. It seems to acknowledge that there is a need for constructs that support programming concurrently, and state a preference for the ones that prevent us from having to think concurrently, but not to include them in the language?

    Does it matter where we make concurrency mechanisms available? Probably it’s more important that we understand concurrency models and apply them appropriately.

    1. 1

      I think that you can trace this justification by looking at a system like Boehm or Ravenbrook MPS–in these systems, you need to write a lot of boilerplate to get the GC to work, and there’s still the possibility of mistakes. Compare this to writing reactive code in node.js or Java vs. using Erlang or Go, and the parallel should be more clear. Essentially, the reduced barrier to entry for these existing useful concurrency models is important for their future success.

  12. 3

    I disagree with this. Even if an application developer does not realize they are communicating with other machines, history has shown that providing tools for concurrency inside the language greatly enhances that experience. Even in rather nice abstractions like TPL, you still need to care about things that you probably don’t want to because of the substrate a Task runs on. It’s also going to be harder for languages to not provide concurrency as a first-class citizen given the success of Go, IMO.

    1. 2

      I agree with you, though not really for the concurrency that a language like Go provides; more along the lines of what research languages like Bloom are moving towards.

      1. 1

        I don’t think that Bloom’s model has much hope in the next 5 years–based on my experience in the enterprise, with fairly capable developers, the mere mention of “eventual consistency” (even when followed up w/ “but it’s optional!”) causes many to believe that eventual consistency = incorrect results. I think it will require more people than just those in compiler courses to become comfortable with semilattices early on in their careers; however, I’m looking forward to that day.

  13. 12

    It seems like many of the issues in this article and those like it can be attributed to the absurdly low barrier to entry for software companies. If you have no money, no likely revenue, and nothing holding you accountable, are you a company or a club? Bio-tech startups certainly don’t work this way.

    Tangentially related: I don’t remember ever hearing anyone say “How can we get more people into chemical engineering?” or seen any Learn-to-play-with-deadly-chemicals-in-12-weeks “engineering” courses.

    1. 7

      The article’s critique was anchored specifically on the comments of the founders of Paypal, Facebook and 42Floors. “No money, no likely revenue, and nothing holding you accountable” doesn’t really seem to apply.

      Bio-tech startups would seem to have diversity problems that mirror those of software startups fairly closely.

      1. 1

        “You can’t underdress to a Bio-tech interview” He was failing the come-over-for-dungeons-and-dragons test and he didn’t even know it…

    2. 3

      Bio-tech startups certainly don’t work this way.

      Are you sure?

      1. 4

        I know of bio-tech startups that behave very similarly to this. Perhaps not in the clubby, hipster way, but in a manner that suggests a similar lack of professionalism in social aspects.

      2. 1


        1. 2

          Based on?

    3. 1

      I don’t remember ever hearing anyone say

      There are certainly efforts in other fields to encourage more people to get into them.

      But, and maybe this is just personal bias, chemical engineering does not underlie so much of modern socio/political/economic power in the way that software does. Software is eating the world, and without some kind of base literacy, it’s increasingly hard to understand what’s going on around us. That’s why I think more people should have at least a tiny understanding of how computing works.

      1. 8

        I would maybe argue that this is personal bias. Understanding chemistry is the basis of petroleum engineering, pharmaceuticals, materials science, and a bunch of other really important stuff (generating and storing electrical energy efficiently, fertilizers/pesticides, etc.). Between them those things underlie a vastly larger fraction of socio/political/economic power in the world than software.

        I think it’s easy to get an inflated idea of the importance of something (e.g. software) when you live and breathe it every day.

        If I had to guess why there’s such an outsize effort to get people into software I would probably say it’s just because there happens to be a labor shortage at the moment, but I don’t really know ¯\_(ツ)_/¯.

        1. 1

          Replying to both you and your sibling, computing is what ends up controlling all those important things, however. And communication is a greater power than any particular physical good, and that’s all computerized at this point.

          I do think all fields are important, regardless.

          1. 3

            Supporting @steveklabnik’s point, from the original Software is Eating the World essay:

            Oil and gas companies were early innovators in supercomputing and data visualization and analysis, which are crucial to today’s oil and gas exploration efforts. Agriculture is increasingly powered by software as well, including satellite analysis of soils linked to per-acre seed selection software algorithms

          2. 2

            Fair point, although you can’t communicate on any modern medium without oil and electricity to power it (nor can you even make a computer!); there’s sort of a chicken and egg thing here :P

            I guess I don’t really see the argument that computing is a more reasonable thing for everyone to understand the basics of than chemistry or physics or biology or what have you; all of it is important and runs the world in some way. Maybe we should encourage people to learn a bit of everything :)

            1. 3

              you can’t communicate on any modern medium without oil and electricity to power it

              This is absolutely true, and something I worry about way more than I should, probably.

              Maybe we should encourage people to learn a bit of everything :)

              Yes, very much this. This over-focus on STEM is incredibly harmful :(

            2. 2

              From here it looks like most jobs over the next century will require skills we currently think of as software skills. In the same way that most jobs today require some kind of “functional computer literacy”, and most jobs last century required ordinary literacy. The skills of programming seem generally applicable, in a way that something like materials science (while fascinating, and underlying lots of recent engineering advances) isn’t. Will knowing chemistry/physics/biology make you a better accountant/architect/artist? Maybe, but the connection seems more direct and obvious for computing.

              1. 1

                I’d tend to agree, which is what I posited in my OP: the push for people to learn computing is more about the labor market than anything else.

      2. 3

        I think that’s totally personal bias. Oil has more power than computing could ever dream of and no one is saying “teach all our kids petroleum engineering.”

      3. 2

        I think that the importance of software lies in the fact that it works as a tool to enhance people’s capabilities. While other fields may be directly involved in the production of goods and services that are fundamental to support our modern lifestyle, software has an impact not only on these fields but on almost everything else we do, be it big or small. Because of this, its reach is, at least, much larger than any other field I know of.

  14. 10

    Although there are lots of interesting points in this article, there’s one that is near and dear to me–how are these companies filled with very young people supposed to get better at real interviews? Based on my observations of both big companies like Google and MS, and small startups, there seems to be a theme that interview “training” involves watching someone else interview 2-3 times, and then you’re good to go! Or maybe just wing it!

    This doesn’t foster continual improvement or skills sharing–where have you seen a good model of interview training?

  15. 9

    I can, from personal experience, confirm a number of the scenarios laid out in this post as well as at least one other that isn’t covered.

    We alleviated our problems by reducing long GC pauses and moving to dedicated master nodes. I’d advise anyone using ES to, at a minimum, take those steps.

    1. 3

      How does one reduce long gc pauses? Was it a question of GC tuning, heap size adjustments, or splitting workload to multiple VMs?

      1. 5

        We have a lot of work to do on this. With dedicated master nodes, long GC pauses are now only a performance issue.

        Currently we are working on creating less garbage as that is the ideal way to deal with GC issues.

        What we did do was lower the CMS initiation rate from 75% to 50%, which, in our case, kicks in 2 gigs earlier as we currently have 8 gig nodes. We are looking at lowering heap size and moving to G1, which we have had great success with elsewhere.
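
        To make the arithmetic above concrete, here is a rough sketch. It treats the threshold as a share of the whole 8 GB heap, as the comment does; strictly, CMS measures occupancy of the old generation, so the real figures would be somewhat smaller. On HotSpot this threshold is presumably the one set with -XX:CMSInitiatingOccupancyFraction (usually alongside -XX:+UseCMSInitiatingOccupancyOnly), though the comment does not name the flag. The function name is made up for illustration.

```python
def cms_trigger_gb(heap_gb, occupancy_percent):
    """Occupancy (in GB) at which a concurrent CMS cycle would start.

    Simplification: uses the whole heap, not just the old generation.
    """
    return heap_gb * occupancy_percent / 100


HEAP_GB = 8  # node heap size mentioned in the comment

before = cms_trigger_gb(HEAP_GB, 75)  # old setting
after = cms_trigger_gb(HEAP_GB, 50)   # new setting
headroom_gained = before - after      # how much earlier CMS starts

print(before, after, headroom_gained)
```

        Starting the concurrent cycle earlier leaves more free space for it to finish before the heap fills, at the cost of running the collector more often.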

        Additionally, we lowered some internal ES cache values greatly. We noticed no overall performance impact and got a large win in terms of the length of GC pauses.

        We currently have few stop-the-world compactions, and long young gen GC pauses are a lot rarer. We could tune the GC more but are spending our time figuring out how we can change our indexes and queries to generate less garbage. We know we have some inefficiencies we can address there.

        In general on the JVM I try to take the following steps to deal with GC issues:

        1. lower heap size
        2. switch to G1 GC (I find it far easier to work with than the other algorithms on the JVM in terms of time invested tuning)
        3. create less garbage

        For me, as I work down that list, it usually becomes more and more work.

        I’d advise anyone interested in the subject get

        Java Performance by Charlie Hunt

        and check out some of these videos:

        and posts:

        The JVM can give you great reporting on GC events. If the following options are unfamiliar to you and you are interested in GC tuning, then definitely get Java Performance as it covers these and more in depth (plus how to read the resulting files):

        -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xloggc:***** -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=1M -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -XX:+PrintSafepointStatistics

        1. 1

          Why couldn’t I get any of this out of you when I was asking for information on Twitter?

          1. 3

            Timing. Hadn’t gone boom yet

  16. 8

    This is another fantastic article about distributed systems testing by @aphyr. I have been really glad to see that he’s been breaking down how to actually use Jepsen–perhaps we’ll start seeing more and more people applying Jepsen to their projects.

    With respect to etcd in particular, this has finally pushed me over the edge to the point where I’d consider building a system with etcd, rather than only considering ZK.

    1. 7

      I would actually hold off if safety is critical. Given how ZK, Doozer, Chubby, etc went, it’ll be another five years or so before they iron out all the kinks, haha.

    2. 1

      Yeah I was gonna say, this seemed to me to argue that sticking with Zookeeper for now is probably the right choice.

      One relevant question for @aphyr: IIRC your original ZK article was with an old version of Jepsen without the linearizability checker, have you tested ZK with knossos at all?

      1. 16

        Not yet, no. Each post is between 50-100 hours of work, and this is all nights+weekends, so it takes a while.

        1. 4

          Yeah I understand, just curious. Thank you for doing these, we all appreciate it :)

        2. 1

          Is there any way I can tip you? A paypal account maybe? I have learnt a lot about distributed systems from your blog. I want to show my appreciation by sending some funds.

  17. 11

    I think that when you’re thinking about these kinds of problems, you should really understand the amount of time worth spending on research. For instance, suppose you think it’ll take a month to develop the project from scratch. You should probably spend at least 40 full hours reading about existing systems before starting development.

    Another thing to consider is that most systems are the combination of some compute layer, some data layer, and some messaging/transport layer. Everything from databases, to websites, to mapreduce frameworks can fit in this abstract model. You can apply this to your project by thinking about what communication patterns exist internally, how you want to be able to query and access your data, and what computations you’re running. These considerations can help you realize that seemingly unrelated projects get you 90% of the way to your solution!

  18. 9

    I’m not sure how old this is but some of the Riak points are not accurate. I only know Riak so I can’t speak to if the other databases have mistakes:

    Riak (V1.2)

    Riak has been on 1.4.x for quite a while.

    Main point: Fault tolerance

    Not wrong, but I think ‘write-availability’ is a better way to put it.

    Protocol: HTTP/REST or custom binary

    It’s not custom binary, it’s ProtoBufs

    Secondary indices: but only one at once

    You can have multiple secondary indices, but you can only query by one (or a range).
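
    A toy model of that distinction, as a sketch only: this is plain Python, not the Riak client API, and the class, method, and index names are invented for illustration. An object can carry entries in several indexes, but each query names exactly one index and either a single value or a range.

```python
from collections import defaultdict


class ToyIndexedStore:
    """In-memory stand-in for a store with secondary indexes (hypothetical)."""

    def __init__(self):
        # index name -> list of (indexed value, object key)
        self._indexes = defaultdict(list)

    def put(self, key, index_entries):
        """Store a key with entries in any number of secondary indexes."""
        for index_name, value in index_entries.items():
            self._indexes[index_name].append((value, key))

    def query(self, index_name, low, high=None):
        """Query ONE index: exact match on `low`, or the inclusive range low..high."""
        if high is None:
            high = low
        return sorted(k for v, k in self._indexes[index_name] if low <= v <= high)


store = ToyIndexedStore()
store.put("order-1", {"customer": "alice", "total_cents": 1200})
store.put("order-2", {"customer": "bob", "total_cents": 4500})
store.put("order-3", {"customer": "alice", "total_cents": 9900})

by_customer = store.query("customer", "alice")     # exact match on one index
by_total = store.query("total_cents", 1000, 5000)  # range on another index
```

    In this model, anything that needs both conditions at once (for example `set(by_customer) & set(by_total)`) has to be combined on the caller's side.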

    In the process of migrating the storing backend from “Bitcask” to Google’s “LevelDB”

    No, support for LevelDB has been added, but you can still use (and are encouraged to use) BitCask just fine. In fact, each bucket (and you can have many) can have its own backend.

    I found this comparison severely lacking. No discussion of consistency model, failover model, scalability, etc. The comparisons were all rather shallow.

    1. 1

      I found this comparison severely lacking. No discussion of consistency model, failover model, scalability, etc. The comparisons were all rather shallow.

      Definitely the case. However, for someone who doesn’t know a lot about DBs and is trying to choose one, this could be pretty useful. The common use case was a nice touch I thought.

      No, support for LevelDB has been added, but you can still use (and are encouraged to use) BitCask just fine.

      Doesn’t BitCask not support secondary indexes? I haven’t used Riak in a bit.

      I only know Riak

      May I ask what your use case is? I find Riak to be a really interesting database and I like to hear how different people use it.

      1. 4

        However, for someone who doesn’t know a lot about DBs and is trying to choose one, this could be pretty useful

        Without describing the consistency model I think it’s doing a disservice to people. Eventual consistency means things can act in very unintuitive ways. If all you see is “fault tolerant” then that doesn’t tell you much about what you give up to be fault tolerant.

        Doesn’t BitCask not support secondary indexes? I haven’t used Riak in a bit.

        Correct, but secondary indexes should mostly be avoided IME.

        May I ask what your use case is?

        I work on an ecommerce system whose customer facing end is required to be ‘always-on’.

      2. 1

        I am in a similar situation to GP, and I disagree that this is a useful comparison for someone who doesn’t know a lot about DBs. The thing you want to look at in the very beginning is the multi-machine scaling and consistency story. For instance, with Redis, good luck–it’s only useful as a cache if you want to scale it–it can’t store data of record. When choosing between things like Riak, Cassandra, RethinkDB, Hyperdex, and Datomic, you need to understand if your workload is high write, high random read, or bulk analytics. Additionally, you need to understand what relaxation you want when your network or machines start failing–do you want as much data as possible to return in a query (but maybe be missing important things)? Or do you want to have your database help you wait until you’re able to see a consistent and reproducible view of things?