1. 68
  1. 23

    I do think big cloud providers offer something that can’t be easily quantified by cost/complexity: flexibility.

    It’s great to use a PaaS… until your app needs to do anything non-standard. It’s great to colocate… until you get customers overseas. But if you start with a big cloud provider, there’s essentially nothing that can’t be solved with one of their myriad services and a swipe of the credit card.

    This is coming from someone whose job involves running our own datacenter, which has saved us millions of dollars and counting. So I know the value of staying out of the cloud, but there are still regular pain points where we have to resort to the cloud, like multi-region DR, CDNs, or other managed services that we don’t have the staff to support.

    1. 22

      It sounds like you and your company have done the engineering work to determine that running your own machines is more cost-effective than the cloud, and you know this in enough detail to quantify the exact savings. I think that’s just what the article is asking for: making the decision to be in the cloud or not on purpose, rather than using a cloud offering “because Netflix does” or “because shiny new services”.

    2. 19

      My thinking is:

      • at the low end, services are given away for free or below cost as a loss leader to gain customers
      • at the highest end, prices have to be close to competitive because big businesses will eat an N-year, $X million migration if it saves $Y million annually
      • therefore, avoid being medium sized!
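
      The high-end bullet is really a payback calculation. A minimal sketch of it, assuming made-up numbers; the function names and figures are illustrative and do not come from any real migration:

```python
def payback_years(migration_cost_musd: float, annual_savings_musd: float) -> float:
    """Years of savings needed to recoup a one-off migration cost (both in $ millions)."""
    if annual_savings_musd <= 0:
        raise ValueError("a migration with no savings never pays for itself")
    return migration_cost_musd / annual_savings_musd


def worth_migrating(migration_cost_musd: float,
                    annual_savings_musd: float,
                    horizon_years: float) -> bool:
    """True if the migration pays back within the planning horizon."""
    return payback_years(migration_cost_musd, annual_savings_musd) <= horizon_years


# Hypothetical: a $6M migration that saves $2M per year pays back in 3 years.
print(payback_years(6.0, 2.0))       # 3.0
print(worth_migrating(6.0, 2.0, 5))  # True
print(worth_migrating(6.0, 0.5, 5))  # False (12-year payback)
```

      The bigger the annual bill, the more likely the payback lands inside the planning horizon, which is why list prices can't hold at the top end.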
      1. 8

        Hmm. I wonder if all cloud sceptics think the cloud is obviously unsuitable for their scale, but just imagine it must be cost-effective at others?

        I’ve done a bunch of time at companies in the build-your-own-datacenter regime, and it doesn’t seem to make financial sense for them to use the cloud for anything, even if you assume you’ll pay a small fraction of list prices. (Sometimes they still did, but this was pretty transparently motivated by middle managers working on their resumés). I’ve always assumed that AWS works by attracting customers near the start of their life, and retaining them by being so deeply embedded into their architecture that it never seems worth the immediate pain of switching. After all, this is a world where no tech company can write software fast enough to satisfy their product department—they’re not going to want to stop making customer commitments for a year while they rewrite all their stuff to not depend on Amazon.

        So if, as OP is arguing, it doesn’t make much sense at small scales either, I wonder if the reasons for it are a bit subtler. One thought that’s crossed my mind before is that by paying Amazon for things one can sneak a bunch of stuff into the budget that’d never get past the bean counters otherwise. You could build an in-house network as fast as AWS’s, and then your developers wouldn’t ever have to worry about network topology, but you won’t be allowed to add a zero to your networking infra costs just so your developers can be lazy. Maybe paying Amazon is just a way to pamper programmers.

        1. 7

          To be clear, I am at the low end, so cloud is cost effective for me because I only pay a couple hundred per year for hosting. There’s very little money to be saved vs. what I’m paying now.

          One thought that’s crossed my mind before is that by paying Amazon for things one can sneak a bunch of stuff into the budget that’d never get past the bean counters otherwise.

          That was the primary motivation for me to use AWS at my last job. It was already billing, so I could do whatever I wanted without asking anyone.

          1. 4

            You basically hit on the reason. If you have huge demand from product for new features, then it makes sense to pay a cloud provider premium so you can dedicate engineering to your own product. In this situation, engineering time is probably more scarce than money.

            That said, the other nuance is that if your workload is highly predictable, then building your own data center is gonna save you a lot of money. At the end of the day, AWS, etc. are not getting that much better of a deal on Intel CPUs, RAM and the other physical bits that make up the cloud. Plus they need margin on top of that. So at some scale it can be done cheaper, if you can predict demand.

            If your workload is not very predictable, then either your data center is going to be too big for its utilization and you spend too much, or you're holding back product or new customer onboarding due to capacity. In those cases the cloud premium also makes sense for the flexibility.
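
            A rough sketch of that trade-off, assuming owned hardware is sized for peak demand and the cloud charges a flat markup per unit actually used. The unit cost, the 100% premium and the function names are all invented for illustration:

```python
def owned_monthly_cost(peak_units: int, cost_per_unit: float) -> float:
    """Owned capacity is sized for peak and paid for whether it is used or not."""
    return peak_units * cost_per_unit


def cloud_monthly_cost(average_units: float, cost_per_unit: float, premium: float) -> float:
    """Cloud capacity is paid per unit actually used, at a markup over owned hardware."""
    return average_units * cost_per_unit * (1 + premium)


def cheaper_option(peak_units: int, average_units: float,
                   cost_per_unit: float, premium: float) -> str:
    """Return 'own' or 'cloud', whichever has the lower monthly cost."""
    owned = owned_monthly_cost(peak_units, cost_per_unit)
    cloud = cloud_monthly_cost(average_units, cost_per_unit, premium)
    return "own" if owned < cloud else "cloud"


# Predictable load: average demand is 90% of peak, so owned hardware stays busy.
print(cheaper_option(peak_units=100, average_units=90, cost_per_unit=100.0, premium=1.0))
# Spiky load: average demand is 20% of peak, so most owned hardware sits idle.
print(cheaper_option(peak_units=100, average_units=20, cost_per_unit=100.0, premium=1.0))
```

            Under these assumptions the break-even sits at an average utilization of 1 / (1 + premium), i.e. 50% for a 100% markup. Staff costs and engineering time are left out on purpose.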

            1. 2

              One thought that’s crossed my mind before is that by paying Amazon for things one can sneak a bunch of stuff into the budget that’d never get past the bean counters otherwise.

              Been there :-)

          2. 16

            This rings false to me as someone who is mostly working at small shops these days. At my current job, using AWS means that our operations staff is a team of zero people and an occasional few hours of work by a single backend developer (me) who’s familiar with infrastructure-as-code and monitoring tools. Our annual AWS bill is less than the salary of one full-time operations person. Obviously not all shops are small and the value proposition is different at different places.

            I think the article would be more accurately titled, “You should not be misusing AWS.” It starts off with an example of having a bunch of idle hosts sitting around because you’ve been creating them willy-nilly in the AWS console and losing track of them. So… don’t do that. Creating hosts with zero discipline or recordkeeping is not a consequence of using AWS, it’s a consequence of poor operational processes. I don’t consider myself particularly expert at operations, but I’ve managed all my company’s AWS infrastructure in Terraform from day one and there are no EC2 instances or databases unaccounted for. Same with the smorgasbord of services such as serverless: I use Lambda in a couple specific places where the benefits outweigh the costs and avoid it where they don’t. Choosing the wrong technology for the job is also not really a consequence of using AWS and it happens all the time even in on-prem systems.

            1. 11

              I dunno man. EC2 is pretty good.

              1. 7

                Compared to a bare metal server? No it is not. I have yet to come across a managed service that has the same performance and feature parity as the self-hosted open source alternative.

                The upside of cloud services is that the product lifecycle is clearly defined, with some (most of the time semi-viable) solution, although most of the time an equally inefficient one.

                That said it’s true that you need a smaller team and that might be a big deal.

                Also, scale effects matter… when your app goes down, it’s on you. When AWS S3 doesn’t work no one cares because 15% of the internet doesn’t work. Even saying “AWS was down in region ABC” is taken lightly compared to “our data center went down in region X.”

                1. 7

                  Compared to a bare metal server? No it is not. I have yet to come across a managed service that has the same performance and feature parity as the self-hosted open source alternative.

                  You’re missing one of the most important dimensions in your analysis: cost.

                  I have a couple of VPSs at different providers, with the cheapest being <$5/month for a small VM on Vultr. The cheapest dedicated machine that I can get from the same provider is $120/month. That looks pretty similar to other providers. If I bought my own server and hosted it at home, then I’d have an up-front cost of a couple of thousand (less if I didn’t buy server-grade hardware) and then a monthly commercial ISP charge of about the same amount. I’d then also be on the hook for any repairs. The last time I ran things from my own dedicated server, I had a disk fail and, because I was a student and was cutting every possible corner on cost, had a week of downtime until the replacement arrived and was installed. I’ve also had CPUs and motherboards die on server machines I was administering in a local room and needed to go and buy a replacement and install it.

                  With the cloud offerings, if a disk dies then the provider just adds another one from a rack to the mirror set that’s backing my storage and I don’t even notice. If a CPU dies, my VM is automatically restarted on another one.

                  1. 3

                    Wot? That’s comparing apples to oranges my guy. If you’re a digital nomad running a passive income SaaS, bare metal sounds like a giant pain in the ass. I just want the thing to run and I don’t wanna think about it.

                2. 4

                  I might agree with many of those arguments if you are building your own and have the time and will to learn it all, from setting up backups to patching to everything else the app needs. But if you aren’t technical and you want a consultancy to do the job for you, AWS is the best option IMHO. Very little time to market, and minimal running expenses.

                  1. 3

                    This advice is accidentally good. AWS, in particular, is not a good public cloud; so, avoiding building upon AWS happens to be a good choice. For example, the diagram halfway down the page only makes sense because AWS’s Kubernetes offering is not very good; on any other public cloud with a Kubernetes offering, the choice to just use Kubernetes would be easy.

                    I want to tilt your perspective a little. Imagine that you already have an existing footprint with N containers working in concert. After a planned deployment of a new feature, the footprint will have N+1 containers. What is the marginal cost of adding a new container to your existing infrastructure? I believe that your point is that N is small, even N=1, for startups and small businesses. In my experience, though, N has never been smaller than maybe N=5 (from my failed small business.)

                    1. 3

                      This really depends on how you define “good public cloud”.

                      In 2020/2021 I was leading about 40 engineers responsible for a fairly well known cloud product that offers its products on AWS, GCP and Azure.

                      We had roughly 1000 k8s clusters to look after, all using the cloud providers’ managed k8s services. While AWS is probably the most complex to set up, doing the least for you, for our use it probably worked the best.

                      And they certainly have the best support. Azure had the worst support; they would give us actively harmful advice about operating k8s.

                      1. 1

                        Ignoring ethics for a moment, it’s worth remembering that AWS (2006) is one of the oldest public clouds with its API-driven design. In general, AWS is lacking because it has primacy. Many AWS products are clearly old, and we might imagine that the burden of supporting massive legacy customers is sufficient to delay the introduction of superior APIs. This is also visible in older cloud-like products like Google App Engine (2008) compared to Google Cloud Platform (2013); while both are API-driven, there is a clear ossification of App Engine APIs. Or, another good example is SoftLayer (2005) which was acquired under IBM’s banner and remixed into Bluemix (2013). In both cases, Kubernetes support would not have been possible on the older, less flexible stacks.

                        Keep in mind that Kubernetes (2014) itself was released after many of these clouds were designed and in beta releases; it wasn’t obvious back then that we could permanently avoid cloud-vendor lock-in by working with a high-level vendor-neutral cluster orchestration API. Some public clouds joyously embraced Kubernetes (like GCP and Bluemix), while others took their time (like DigitalOcean and AWS).

                        If we reconsider ethics, then Amazon profits from AWS, and Amazon directly commits human-rights and labor-rights abuses. Is this “good”?

                    2. 3

                      I am not a fan of AWS, but they do have their use cases. Startups usually do not have much in savings, so paying upfront for their own servers can be too expensive. Startups often do not know how many customers they will have next month or how the software requirements will evolve. One reason for AWS over some cheaper competitors can be GPU servers. I have not done a market check, but e.g. Hetzner (where I host my private projects) afaik currently does not offer GPUs.

                      Personal experience is: New big customer wants to do a test run, company needs a few more servers quickly. Will they sign a contract? Who knows, but for a few weeks they need to see that we can handle their load. The other way round: Code improvements led to much faster processing, so fewer servers are needed. This is a constant up and down I saw in a startup company.

                      I totally agree on the point that you should focus on application simplicity. With the myriad of services that AWS offers one can easily pick too many. And then the application becomes more and more difficult to understand.

                      And the title says “Probably”, that sounds right. I agree with the statement that one should evaluate whether AWS is the right fit. Not just use it, because everyone uses it.

                      1. 3

                        Another evaluation factor is optimizing on highest likelihood of the cloud provider staying in business.

                        1. 3

                          I really like AWS for personal computing projects because it’s almost impossible to go above the free tier for that. 400,000 GB-seconds of Lambda compute a month? Psssh, like I’m ever gonna go above 10,000.
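
                          For a sense of scale, a back-of-the-envelope sketch of how GB-seconds add up. The 400,000 figure is the one quoted above; the memory size, duration and invocation count are invented for illustration:

```python
FREE_TIER_GB_SECONDS = 400_000  # monthly allowance quoted in the comment above


def gb_seconds(memory_mb: int, avg_duration_ms: int, invocations: int) -> float:
    """Compute usage: configured memory in GB times total execution time in seconds."""
    return memory_mb * avg_duration_ms * invocations / (1024 * 1000)


# Hypothetical hobby workload: a 128 MB function, 200 ms per call, 100,000 calls a month.
usage = gb_seconds(memory_mb=128, avg_duration_ms=200, invocations=100_000)
print(f"{usage:.0f} GB-seconds, {usage / FREE_TIER_GB_SECONDS:.2%} of the allowance")
```

                          This only covers compute time; request counts, data transfer and other services are billed separately.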

                          1. 3

                            Funny, I hated them for personal projects because they don’t support spending limits. I’m afraid of configuring something wrongly (even though I am careful) and, even with a spending alarm, of the bill going up to more than I want to pay before I can react to the alarm.

                            1. 1

                              Funnily enough you can DIY this feature using... (surprise) Lambda.
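
                              One way this could look, as a sketch only: a Lambda subscribed to an SNS topic that a CloudWatch billing alarm publishes to, which stops running EC2 instances when the alarm fires. The wiring (alarm, topic, IAM permissions) is assumed to exist already, and the `stop` parameter is a hypothetical hook so the logic can be tried without AWS access:

```python
import json


def stop_running_ec2_instances():
    """Stop every running EC2 instance in the function's region."""
    import boto3  # bundled with the AWS Lambda Python runtime

    ec2 = boto3.client("ec2")
    # Simplification: ignores pagination, which is fine for a small personal account.
    reservations = ec2.describe_instances(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )["Reservations"]
    ids = [inst["InstanceId"] for res in reservations for inst in res["Instances"]]
    if ids:
        ec2.stop_instances(InstanceIds=ids)
    return ids


def handler(event, context, stop=stop_running_ec2_instances):
    """SNS-triggered entry point: act only when the alarm enters the ALARM state."""
    stopped = []
    for record in event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        if message.get("NewStateValue") == "ALARM":
            stopped.extend(stop())
    return {"stopped": stopped}


# Local dry run with a fake SNS event and a stand-in for the real stop function.
fake_event = {"Records": [{"Sns": {"Message": json.dumps(
    {"AlarmName": "billing", "NewStateValue": "ALARM"})}}]}
print(handler(fake_event, None, stop=lambda: ["i-0123456789abcdef0"]))
```

                              Stopping instances is not a true spending cap: storage, other services and the delay in billing data can still push the bill past the threshold.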

                          2. 2

                            Regarding costs, to be fair, AWS has a generous program for startups: https://aws.amazon.com/startups/

                            1. 2

                              I am not sure what this article is trying to say. It basically has a lot of fluff that boils down to:

                              • don’t use AWS
                              • deliver value to your users

                              But it is framed as if you are currently expending engineering resources and money on AWS instead of “delivering value to your users”. It then does not seem to present any actual alternative to AWS, which is great from the article’s point of view as it means all you’re doing is dropping the costs with AWS without replacing them with something else.

                              Now if all your site is is an order form, and you’re delivering complete software to your users, anything like AWS is clearly overkill and you should just use Squarespace, or some similar all-in-one hosting service that can manage essentially all the complexity and security for you.

                              If your company is delivering a service that depends on you having services running all the time, then I’m going to assume Squarespace-level hosting isn’t going to cut it.

                              So now you’re in a position where you need a service with high availability (e.g. ~100% uptime). If you aren’t using a cloud provider, all of which seem to be able to handle system rollover, etc. automatically and know how to maintain everything, you have to hire those people directly. So you’re now paying at least one SRE, which is easily >$100k/year, or your existing engineers are now doing double duty. Is it reasonable for app engineers, or what have you, to know how to bring up and manage systems? Is that going to be enjoyable (attrition due to non-enjoyable tasks is a real thing)?

                              Given reasonable uptime requirements, you’re going to need on call people, do your existing engineers want that? If not you’re probably going to want at least two SREs.

                              I am not saying that you have to use AWS or what have you in their full scaling setup; a single instance with their automatic lifetime management (liveness tests, rollover on hardware failure, etc.) can be enough.

                              To put it in perspective, I know nothing about service management or anything at all, but I was able to trivially spin up a single Linux system and have all the TLS rollover, updates, etc. just managed automatically. Then I was able to just futz around on a random Linux box with all that stuff managed automatically, which is afaict what the article is saying I should be doing, only also having to manage everything else as well.

                              So yes AWS costs money, but not AWS also costs money, and this article seems to simply ignore that entirely.

                              1. 2

                                AWS and other public cloud providers provide very cheap options for hobbyists. A PaaS makes sense for small startups until they need something custom. I wouldn’t pay the premium for a PaaS over buying directly from AWS.

                                1. 2

                                  I am honestly not qualified to comment on the content of this blog post as I barely use AWS. But I just wanted to say it is one of the clearest, most interesting, most fun to read technical blog posts I have seen.