1. 36
  1.  

  2. 11

    A little trite, but worth emphasizing: avoid complexity so long as you absolutely can. If you can only take on complexity relevant to your problem, you are so much better off.

    It comes much later on that something like Kubernetes actually does simplify things. Lower global complexity at the cost of some central complexity.

    It’s also worth saying that you should always keep your mind open. Maybe etcd/Zookeeper actually drastically simplifies your solution because you need leader election.

    1. 11

      Maybe etcd/Zookeeper actually drastically simplifies your solution because you need leader election.

      Sure but if you’re making, for example, an e-commerce site handling relatively low volume and you find yourself needing leader election, you should probably think up the stack a few frames and question why you have multiple nodes that need to elect a leader.

      I’d be interested in seeing a set of tables (because, as a reformed mechanical engineer, I loves me some tables) that match various workloads with industry standard infrastructure.

      Things like:

      • If you do X transactions/hour in Y type of business, you only need a database of size Z.
      • If you have a customer base across X countries, you should have Y datacenters or availability zones.
      • If the average value of a transaction is X, and a downtime of Y results in Z missed transactions, consider HA solution W.

      I know that’s all horribly boring, but I’m getting the feeling nowadays that new developers (most of our industry, especially in startups) just have no idea what is a reasonably-sized deployment for their problem.

      It’s like bringing a new workman in to hang a picture and because it needs a nail they bring in an air compressor, hose, nailgun, and all the rest–when a tack hammer would have sufficed.

      Anybody feel similarly?

      1. 10

        Bro, once I get done handing out these sweet stickers at SXSW, we’re going to be doing like a billion requests per second.

        This article didn’t address why this happens. There’s the category of “problems it’s good to have”, and people seek out aspirational solutions to these problems, in the hope of then having the problem.

        1. 2

          I think that having a set of tables or something similar that tries to capture the common types of software and what usage patterns one should expect would be pretty handy. I’d be interested in seeing something like this, but I lack the knowledge and experience to complete it by myself. In the spirit of shipping things, however, I wouldn’t mind starting a wiki or distributed spreadsheet on it.

          I think the closest thing we have is the TechEmpower server benchmarks, and/or the discussions on sites like https://lowendbox.com/. http://highscalability.com/ seems like it might be relevant, but it also seems to focus on heavily engineered things like Uber or Amazon, which I suspect wouldn’t be the focus here.

          1. 6

            I would say you need one extra moving part for every factor of 10 users. There’s some question of how to count at the beginning, but I’d say if you have a web server, framework, and app (nginx, rails, whatevs) that’s 3 and should be good for 1000. Add memcached? 10000. I’ll count SQLite as nonmoving. This arithmetic implies not needing a separate database server or load balancer until another factor of 100, so perhaps a million users.

            1. 2

              How would you count “users” in this case? There is a large difference between passive consumers and active users. A blog may only have a single active user if it has no comments, but many readers.

              It’s funny (and very probably accurate) to count SQLite as a non-moving part. That is what forms both the appeal and trade-offs of using it.

              1. 2

                I think you can count any way (and any thing) you want. Passive users, daily actives, requests/sec, etc. The scaling math works out about the same, just with different minimum thresholds. Something like: parts required = log10(req/s) + 2.
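To make the rule of thumb in this sub-thread concrete, here is a minimal sketch of the arithmetic. The function name `moving_parts`, the `offset` parameter, and the `baseline` of three parts (web server, framework, app) are assumptions made for illustration; the thread itself only gives the rough rule "one extra moving part per factor of 10" and the variant "log(req/s) + 2".

```python
def moving_parts(load: int, offset: int = 0, baseline: int = 3) -> int:
    """Rough count of moving parts for a given load (hypothetical helper).

    load     -- users, daily actives, or requests/sec, whichever you count
    offset   -- shifts the threshold, e.g. 2 for the "log(req/s) + 2" variant
    baseline -- assumed minimum stack: web server, framework, app
    """
    if load < 1:
        raise ValueError("load must be at least 1")

    # Count the factors of 10 needed to reach `load` using integers only,
    # which avoids floating-point surprises from math.log10.
    decades = 0
    capacity = 1
    while capacity < load:
        capacity *= 10
        decades += 1

    return max(baseline, decades + offset)


if __name__ == "__main__":
    for users in (1_000, 10_000, 1_000_000):
        print(users, "users ->", moving_parts(users), "moving parts")
    print(10, "req/s ->", moving_parts(10, offset=2), "moving parts")
```

Under these assumptions, 1,000 users maps to the three baseline parts, 10,000 adds one (memcached in the example above), and a million users is where the fifth and sixth parts (a separate database server, a load balancer) come in.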

          2. 1

            In my commercial experience, the people who would read those tables are unfortunately the ones who least need them.

            1. 1

              Sure but if you’re making, for example, an e-commerce site handling relatively low volume and you find yourself needing leader election, you should probably think up the stack a few frames and question why you have multiple nodes that need to elect a leader.

              Totally agreed. Though sometimes it comes from external requirements, e.g. with GNIP. You can only have N consumers at a given time (where usually N=1), but want to ensure that if one fails another starts consuming. Maybe this is a really social ecommerce platform. :)

              I’d be interested in seeing a set of tables (because, as a reformed mechanical engineer, I loves me some tables) that match various workloads with industry standard infrastructure.

              That’d be great! I do think it’d need to be published every 6-12 months to keep pace with software and hardware evolution. What used to take a sharded, complicated setup with memcached and the whole nine yards in 2006 is now doable with a single server (+ a slave to failover to).

              I have a site that’s been running on vBulletin for ~10-11 years. Each year it’s grown at a pretty steady rate. Yet, beyond the first few years, I’ve been able to shrink the amount of hardware it uses. It used to soak up 90% of a dedicated server, now I’m on a fairly cheap ($40/mo) Linode VPS. In that time I’ve added lots of functionality and tons of home-rolled data collection (to avoid sending to Google Analytics). Had I been doing what I do today with the amount of traffic I have back in 2006 I’d easily be on 3-5 (or more!) servers. At this point there’s (basic) machine learning+model serving, home-rolled analytics, a Rails app fronting an ElasticSearch process with millions of items in it, MySQL for vBulletin, Postgres for everything else, and a Java API backing a mobile app.

              That’s only possible due to improvements in CPUs, RAM, and the general rollout of SSDs. What used to take an air compressor and nailgun is now as simple as a tack hammer.

          3. [Comment removed by author]

            1. 3

              Don’t go out and learn about all that stuff, guys, it’s such a hassle! :)

              You shouldn’t do this for the same reasons you shouldn’t write crypto software: you’re not qualified, and you almost certainly are not going to have the availability or reliability of RDS or Heroku Postgres. But as an educational activity, sure.

              1. 6

                “Qualified” is a red herring and unnecessarily combative. Running a HA RDBMS is quite feasible, it just takes some effort and time. The question here is really where does one want to spend their finite time. Even if you are qualified to run an HA RDBMS, it still might not be in your best interest to run it yourself.

            2. 10

              One thing that is incredibly tiring to me about modern software development is the notion going around that everybody’s thoughts are VERY important and totally worth writing a fluff blog post about.

              1. 3

                If this notion didn’t exist, both Lobsters and HN would have half as many terrible submissions :-)

              2. 4

                On the subject of containers, one thing I really like about FreeBSD jails is that you get a system out of it that you can treat like any other system. So you can carve a bunch of “machines” out of a single machine running somewhere without having to re-learn how to package every piece of software. Right now I’m doing this under a free tier in GCP where I have two VMs but I’ve carved closer to 8 systems out of them.

                1. 4

                  How can I learn more about this technique?

                  1. 7

                    There is the Jails section of the handbook:

                    https://www.freebsd.org/doc/handbook/book.html#jails

                    And I use iocage for managing my jails, which requires ZFS. iocage is undergoing a massive rewrite, which often goes poorly, but seems like it’s making good headway.

                    I create jails with 127.0.0.0/8 IPs and then I use a firewall to NAT external traffic in or out, however depending on your setup you can let a Jail get its own IP like any other system would.

                    If you have any specific questions, I’m happy to do my best to answer.

                    1. 1

                      Thanks! I’ll investigate later

                2. 3

                  This is both intelligent and moronic at the same time. Simplicity is better: but ignorance of what you’re doing and needing is worse. Don’t farm off the understanding of your problem and your solutions to someone else… some vendor happily offering those solutions for a lo lo cost…. you’ll hurt.

                  I deal with this a lot: don’t use NoSQL reflexively. Use a dang Postgres instance until it’s clear that the reasons for using a NoSQL database ( e.g., dynamodb) outweigh using a SQL database (approximately this looks like “very high volume with very few relations”, but YMMV - do the due diligence).

                  1. 2

                    I deal with this a lot: don’t use NoSQL reflexively. Use a dang Postgres instance until it’s clear that the reasons for using a NoSQL database ( e.g., dynamodb) outweigh using a SQL database (approximately this looks like “very high volume with very few relations”, but YMMV - do the due diligence).

                    I think this is backwards. If you’re using Postgres you’re signing up to a lot of constraints and non-obvious failure modes (“you want to do a 5-way unindexed join? Sure, knock yourself out. Oh, you put a few more rows in that table and your query’s running slowly?”) when you aren’t necessarily reaping the benefits (e.g. your database may be doing a lot of work enforcing ACID when you haven’t actually set up your transaction boundaries to correspond to something with business meaning; your inserts may commit slowly because your indices have to be updated every time when actually those indices are only used by daily batch queries; your database server may spend most of its CPU time parsing SQL syntax when you don’t even do any ad-hoc queries).

                  2. 1

                    I suspect we’d see less of this if problem domains were more substantial than “order underwear online.”

                    Devs also have a responsibility to step up and restrict playtime to non-work hours.

                    1. 1

                      Good points. Also: please stop spending business effort making people want useless/evil stuff they don’t need.

                      1. -2

                        Posts like this are appalling. Whilst it’s true that reinventing the exact same wheel is not an optimal use of time, the fundamental point of engineering and indeed growing engineers is to understand engineering, extend engineering, and improve engineering.

                        This post could have been written in one simple sentence: “Don’t try to understand anything or do any actual engineering, just use what is already provided, ok?”

                        1. 9

                          I don’t think the dichotomy is “use engineering” vs “don’t use engineering”. It’s “study engineering that is completely unrelated to your business product and then force the product to conform to those systems regardless of how applicable it actually is” vs “study the engineering that would actually be beneficial and use that.”

                          Can you make the case that container orchestration is worth the tradeoffs and opportunity costs? Great! Otherwise, don’t wreck the business chasing your interests.

                          1. 1

                            Totally agree with you; my primary complaint is that the original post author doesn’t summarize this point as well as you have in two tiny paragraphs. :-)

                          2. 5

                            the fundamental point of engineering and indeed growing engineers is to understand engineering, extend engineering, and improve engineering.

                            Bzzzzrt, wrong.

                            The fundamental point of engineering is solving problems under constraint using approximations. Occasionally, this means inventing new tools–however, it more often means (at least in real engineering disciplines that aren’t software) knowing how to match well-understood tools to the job at hand so as to avoid unexpected delays, unanticipated costs, and unneeded retraining.

                            1. 2

                              I cannot agree with you; without experimentation with and understanding of principles there can be no advancement, and crucially there can be no repairs.

                              The sentiment I want to believe the post author is trying to convey is indeed, as apy suggests, that unnecessary complexity is bad; this I agree with wholeheartedly. But that is absolutely not what he is actually saying in what is, in my humble opinion, an ill-considered post.

                            2. 6

                              I did not interpret this post as being about reinventing the wheel, but rather to hold off on introducing complexity to your system.

                              1. 2

                                That is not what I read. I read the eternal mantra of the good engineer: YAGNI.

                                1. 1

                                  I see it as telling people to recognize that any business has a limited amount of engineering bandwidth, and that bandwidth should be spent as much as possible on things that either directly increase business value, or need to be done and can’t be done any other way. If a reliable DB is what you need, then paying for RDS usually beats your own engineers spending time administrating your own DB server.