1. 12

  2. 18

    This post is mostly hand waving. There might be some good ideas in it, but the fact is: it comes down to money. One option is cheaper than the other for a given use case. The author throws out some very quick numbers and then lists a bunch of other costs without giving them a dollar value.

    Around 2010 I was doing cost estimations for a project, and we found that, with a fairly high cost of ownership for datacenters, the numbers favored the cloud for our use case if our average machine utilization was below 50%. Ours was closer to 25%, so we ran in the cloud. There are a few other things to consider as well, like some very low-latency workloads being difficult to get working in the cloud, and how much upfront capital one has for an idea.

    But, in the end, it’s money. And one can calculate these things, at least to an approximation, and that should be how one makes this decision. Hand waving doesn’t pay operating costs.
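    The break-even logic above can be sketched in a few lines. The figures here are hypothetical illustrations, not the actual 2010 costs:

    ```python
    # Back-of-the-envelope rent-vs-own break-even sketch.
    # All numbers below are made-up placeholders, not real prices.

    def breakeven_utilization(owned_monthly_cost, cloud_hourly_rate,
                              hours_per_month=730):
        """Utilization below which renting cloud capacity is cheaper
        than owning the equivalent machine outright."""
        cloud_full_time = cloud_hourly_rate * hours_per_month
        return owned_monthly_cost / cloud_full_time

    # e.g. a machine with an amortized TCO of $365/month vs. a $1/hour instance:
    u = breakeven_utilization(365, 1.0)
    print(f"cloud wins below {u:.0%} utilization")  # cloud wins below 50% utilization
    ```

    Plug in your own amortized hardware, power, space, and staffing costs on one side and the provider’s rates on the other; the comparison is only as good as those inputs.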

    1. 5

      There might be some good ideas in it but the fact is: it comes down to money

      There are capability issues as well. For example: as much as public clouds have improved over the past few years, storage throughput and latency are still years behind what you can do in your own datacenter. Some architectures still need the kind of reliability that you lose when moving to a shared environment.

      1. 2

        I do mention this if you read a bit further.

    2. 17

      This post doesn’t address the fundamental reasons for not using a public cloud. While it’s beneficial for tons of use cases (distributing and computing public content like livestreams, getting access to compute power for a few hours without having to buy tons of hardware, etc.), it also has downsides:

      Security

      It’s a significant attack vector to share resources with others. See the Rowhammer vulnerability, or this paper (extracting private keys on shared-CPU systems; there are more papers, this was just the first I found), for example.

      Vendor lock-in

      If you rely on AWS or GCE services that can’t be migrated easily (e.g. Route53), you’ve locked yourself in, and switching providers, should it become necessary, takes enormous effort.

      Privacy

      I think it’s simply irresponsible to host private customer data in locations that don’t provide basic security you can trust. Your database drive might end up in the hands of a customer or a third-party service contractor, to say nothing of government access. Encryption, e.g. on GCE, which relies on keys stored only on their servers, addresses just a few of those issues.

      Price

      Running a 24/7 machine on a major cloud provider usually costs a lot.

      Edit: Formatting

      1. 7

        Price

        Running a 24/7 machine on a major cloud provider usually costs a lot.

        This is especially true when you’re doing mainly compute, rather than hosting. Advocates of public clouds often correctly focus on the variety of infrastructure you’d have to handle yourself if you don’t go with them: uptime, several kinds of failover, fault-tolerant storage, load scaling, etc. Properly costed, that may well narrow or eliminate the price difference. But in data analysis it’s much more common to have a situation where I don’t need all that good uptime, failover, integration with databases, CDNs, etc. Instead I just need a lot of compute throughput over a period of months, for ideally as little money as possible.

        For example, if you’re training deep neural networks on GPUs, which is a very common current use-case both in academia and industry, AWS for a single month of usage will charge you about what it costs to buy the machine outright. Even if you add in substantial overhead, it’s hard to make AWS come out ahead here.
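        A quick sanity check of that claim, with assumed placeholder prices (actual instance rates and hardware costs vary):

        ```python
        # Hypothetical figures to illustrate the rent-vs-buy point above;
        # these are NOT real AWS prices or real hardware quotes.
        gpu_instance_hourly = 0.90    # assumed cloud GPU instance rate, $/hour
        workstation_price = 650.0     # assumed price of a comparable GPU machine

        # Hours of continuous rental before spend matches the purchase price.
        hours_to_breakeven = workstation_price / gpu_instance_hourly
        print(f"rental matches purchase after ~{hours_to_breakeven / 24:.0f} days")
        ```

        With these made-up numbers, continuous rental catches the purchase price in roughly a month, which is the shape of the comparison the parent describes; for multi-month training workloads the gap only widens.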

        1. 1

          If you rely on AWS or GCE services that can’t be migrated easily (e.g. Route53), you’ve locked yourself in, and switching providers, should it become necessary, takes enormous effort.

          Sad face - I have some domains in R53. Why is it so hard to migrate?

          1. 3

            I suppose you can migrate away (with the usual pain of migrating DNS) unless you use “special features”.

            I should’ve made it clearer in my comment: when I said “lock-in”, I was referring to “special” features of services. If you’re running a virtual machine with PostgreSQL on a cloud provider, you can migrate to another provider and set up your virtual machine there. But if you use a managed database (e.g. Amazon Redshift), migrating away won’t work as easily, as Redshift is not an open-source product you can just run anywhere.

            1. 1

              Got it. Thanks for clarifying. Yeah some of those extensions sure do have a warm embrace, huh.