1. 5

    This is really good - I appreciate the in-depth nature of it!

    I remain skeptical about the practicality of event sourcing, but it helps a lot that rather than handwaving over the details of archiving and event schemas, you’ve actually gone into detail about how you do it.

    What I’m still curious about (and most articles/books on event sourcing don’t discuss much) is:

    • How do you handle changing schemas for events (you mention rewriting your S3 store, I assume this is it?)
    • How many events do you have stored?
    • How long does it take you to rebuild a projection, or rewrite an event archive at the moment with that volume? Do you ever anticipate you’ll have too many events to make it practical? If so what will you do then?
    1. 3

      Thanks for the kind feedback!

      There are definitely tradeoffs to doing an event-sourced system. I wrote this post up mostly about our experience overall, trying to give enough detail about the system itself to make it understandable. From a business perspective it has enabled so much stuff to be so easy that we’ve found the tradeoffs to mostly be skewed in the direction of higher productivity and faster feature delivery. We’ll write more about specifics in future posts (not just by me). There is a whole analytics and BI component here that I didn’t even mention.

      To answer your questions above:

      • The schemas are Protobuf so we generally follow the recommended rules for changing Protobuf schemas: you can add fields at any time. We deprecate fields as needed, though this is rare. You can add new events any time. You never change a field name or type. We have processes around getting new event schemas approved that require review by folks who understand them well and know what works. We have LADRs that explain some of this. If you added a field that needs to be back-calculated for previous events, you would run a backfill to emit all of those for the entities that were missing them. If it’s more complicated, we can use Spark to rewrite the store.
      • We have under 10 billion or so events (this is all the public stuff, keep in mind, not all the RPC). At this scale you could obviously use other tools. But we plan to grow on this platform in perpetuity, hence some of the decisions. Generally you are not operating on all of the events, though. You are operating on a subset that you care about for your service. You can generally scope it to tens of millions for a service you are working on. We have some things that behave like events but which we treat as ephemeral and don’t archive because there is no utility in replays or audit trails.
      • We can run Spark over the whole event store in a couple of hours, less if we want to throw more/bigger instances at it. We can build a new projection for a reasonable service in a couple of hours. The limit here is usually the throughput on RabbitMQ for the size of cluster we replay on. We can pay for more if we need it faster. Snapshots are the solution to this longer term, by starting from an intermediary state rather than from scratch each time. We build and use them already, but by specializing them for each model, we could massively reduce bootstrap time. I mentioned the first way we will do that in the article.
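
To make the snapshot point concrete, here is a minimal sketch in Python of resuming a projection from an intermediate state instead of replaying everything. It is only an illustration, not the real system: the event shape, the `Snapshot` type, and the `apply_event` and `rebuild` functions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    # Hypothetical snapshot: the projected state plus the sequence
    # number of the last event that was folded into it.
    last_seq: int = 0
    balances: dict = field(default_factory=dict)


def apply_event(state, event):
    # Hypothetical event shape: {"seq": int, "account": str, "delta": int}
    state[event["account"]] = state.get(event["account"], 0) + event["delta"]


def rebuild(events, snapshot=None):
    """Fold events into a projection, skipping those the snapshot already covers.

    Returns the new snapshot and the number of events actually replayed.
    """
    start = snapshot if snapshot is not None else Snapshot()
    state = dict(start.balances)
    last_seq = start.last_seq
    replayed = 0
    for event in events:
        if event["seq"] <= start.last_seq:
            continue  # already reflected in the snapshot
        apply_event(state, event)
        last_seq = event["seq"]
        replayed += 1
    return Snapshot(last_seq, state), replayed


events = [{"seq": i, "account": "a" if i % 2 else "b", "delta": i} for i in range(1, 7)]

full, replayed_full = rebuild(events)                  # from scratch
midpoint, _ = rebuild(events[:4])                      # snapshot taken after event 4
resumed, replayed_resumed = rebuild(events, midpoint)  # only events 5 and 6 are replayed
```

Both paths end in the same state, but the resumed one touches only the events after the snapshot, which is where the bootstrap-time saving would come from.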

      Hope that helps answer some of the questions. We have been really practical about it and solve problems as they arrive, using an iterative approach rather than building a lot of support up front. We have a business to run first! :) Generally Protobuf, S3, Spark, and Athena mean we haven’t seen anything frightening yet. Quite the opposite: the tooling seems more than up to the job ahead.

      1. 1

        Thanks so much for getting back to me, that’s really impressive :)

    1. 5

      Can you re-use the test suite if your entire software is replaced with an opaque neural network?

      Haha this is great, I’m writing it down.

      1. 7

        The good thing is that many of the disasters I’ve talked about have good answers, and the industry has created better tools to make them solvable by organizations other than FAANG.

        I like the article, but it’d be 10x better if he enumerated what these actually were. Especially the one about dev environments, I still haven’t found a really good solution there.

        1. 5

          I’d love to see a video of this in action!

          1. 3

            I can try to get one today.

          1. 1

            I’ve never thought of watching videos of myself coding, but that’s a good idea. I’ve been thinking about trying to track my time for a while (since I read The Effective Executive), but I’ve never found a method that works well (e.g. a program that tracks the currently in-focus window can’t tell that I’m having a conversation with a colleague).

            I’ve also found that putting my phone where I can’t reach it from my bed helps a lot with not wasting time in the mornings, as does installing leechblock in all my browsers (higher limit at home, low limit at work!).

            1. 2

              I’ve started using ffmpeg -video_size 3840x2160 -framerate 2 -f x11grab -i :0.0 output.mp4, which saves two frames a second and uses my native screen resolution. It seems to work well enough.

            1. 6

              The README explains fairly well what this does, but what is the motivation? Like, when is this better than shell scripting?

              I’m probably missing something, but to me as an onlooker it feels like a lot more work to do the same things here, and why use NodeJS if you remove its only strong benefit (asynchronous from the ground up) in how your service works? Seems like Ruby or Perl do this better out of the box, there are strong libraries to do this in Python, and the difference in overhead of using this library vs writing it in Go or similar seems small.

              1. 2

                I find that shell code very quickly becomes hard to read as the script’s complexity increases; the syntax just isn’t as clear and explicit as that of something like Ruby or JavaScript. As to why you’d choose one over the other, I hear people have pretty strong preferences when it comes to languages :)

                1. 6

                  I think that bash gets a bad rep in this way. However, I think it’s not because the language is bad but because everyone treats every script in it as a quick-and-dirty hack. You do that in any language and you’ll get the same problem.

                2. 2

                  We use node for a lot of the glue scripts in our project - it’s easier than bash, works cross-platform, and because our project is already javascript we all both know the language and have the runtime installed.

                  1. 1

                    Ruby does indeed have anything you might need out of the box, and shelling out is just a matter of writing the command in backquotes. The only pain point is that when you need to both read the output of a command and get its status code you have to use popen3 which I always found to have a clunky API.
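
For comparison with the other languages mentioned in this thread, here is a minimal sketch in Python (not Ruby) of reading a command's output and its status code from a single call, using the standard `subprocess.run`. The helper name `run_command` is hypothetical, and the child command is just the current Python interpreter so the example does not depend on any particular shell.

```python
import subprocess
import sys


def run_command(args):
    """Run a command and return (stdout, exit status) from one call."""
    # capture_output collects stdout/stderr; text=True decodes them to str.
    result = subprocess.run(args, capture_output=True, text=True)
    return result.stdout, result.returncode


# A child process that prints a line and then exits with status 3.
output, status = run_command(
    [sys.executable, "-c", "print('hello'); raise SystemExit(3)"]
)
```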

                    1. 3

                      Sure, but any clunkier than we’ve got using node now?

                  1. 56

                    Fortunately, it’s also the best of currently available major browsers, so it’s not exactly a hardship.

                    1. 22

                      Not on macOS. Sure, it has a whole lot of great features, but it’s just slow. It feels slow, looks slow, and macOS keeps telling me that Firefox is using an excessive amount of power compared to other browsers.

                      I guess it’s too much to ask for, for Firefox to feel like a good, native macOS app, like Safari, but the fact of the matter is that that is why I don’t use it as my main browser.

                      1. 19

                        I use it on Mac OS X and it doesn’t feel slow to me at all. And it’s not using an excessive amount of power that I can tell. Perhaps it’s the version of Firefox being used?

                        1. 14

                          I’ve been sticking to Safari on MacOS because I’ve read that it really does make a difference to battery life (and I’m on a tiny Macbook so, you know, CPU cycles aren’t exactly plentiful). This thread just prompted me to check this for myself.

                          I opened a typical work mix of 10 tabs in both Safari 12.1 and Firefox 66.0.3 on MacOS 10.14.4: google calendar + drive, an open gdocs file, two jira tabs, this lobsters thread (well, it is lunchtime…) and the rest github. Time for some anec-data! :-)

                          After leaving both browsers to sit there for 10 mins while I made lunch (neither in the foreground, but both visible and showing a github page as the active tab), these are the numbers I eyeballed from Activity Monitor over about a 30 second period:

                          Firefox:

                          • Energy Impact: moving between 3.3 and 15.6, mostly about 4
                          • CPU: various processes using 0.3, 0.4, 0.5 up to one process using 1.4% CPU

                          Safari:

                          • Energy Impact: moving between 0.1 and 1.3, mostly around 0.5
                          • CPU: more processes than Firefox, but most using consistently 0.0 or 0.1% CPU

                          Firefox isn’t terrible but Safari seems really good at frequently getting itself down to a near-zero CPU usage state. I’ll be sticking with Safari, but if I was on a desktop mac instead I think I’d choose differently.

                          As an aside, Activity Monitor’s docs just say “a relative measure of the current energy consumption of the app (lower is better)”. Does anyone know what the “Energy Impact” column is actually measuring?

                          1. 5

                            I have had the same experience with Firefox/Chrome vs Safari.

                            I use Chrome for work because we’re a google shop and I tend to use Firefox any time my MacBook is docked.

                            But I’m traveling so much, I generally just use Safari these days.

                          2. 9

                            I use it on Mac OS X and it doesn’t feel slow to me at all.

                            If you can’t feel and see the difference in the experience between, say, Firefox and Safari, I don’t know what to tell you.

                            And it’s not using an excessive amount of power that I can tell. Perhaps it’s the version of Firefox being used?

                            Have you tried checking in the battery menubar-thing? There’s a “Using Significant Energy” list, and Firefox is always on it on my machine if it’s running. That goes for both Firefox and Firefox Nightly, and it has been so for all versions for a long time. My two installs are up to date as of today, and it’s the same experience.

                            1. 1

                              If you can’t feel and see the difference in the experience between, say, Firefox and Safari, I don’t know what to tell you.

                              There are plenty of people who can’t hear the difference between $300 and $2000 headphones. Yes, there are audiophile snobs who’re affronted by the mere idea of using anything but the most exquisitely constructed cans. But those people are a vanishingly small minority of headphone users. The rest of us are perfectly happy with bog standard headphones.

                              Apple likely had to descend through numerous circles of hell while hand-optimizing Safari for the single platform that it needs to run on. Will Firefox get there? Unlikely. Will most users even notice the difference? Most certainly not.

                              1. 6

                                They will when their battery life is abysmal and they start hearing that it’s because of Firefox.

                                I really want to see Firefox get more adoption, but there are a lot of techies with influence who will keep away because of this, myself included. It’s not a convenience thing - I just can’t get to mains power enough as it is in my job, so more drain is a major problem.

                                1. 1

                                  They will when their battery life is abysmal and they start hearing that it’s because of Firefox.

                                  The problem is that the feedback cycle isn’t even long enough for them to hear about this. The cause and effect are almost immediate depending on your display resolution settings with bug 1404042.

                                  1. 3

                                    This is what happens when you fight the platform.

                                    1. 2

                                      This is what happens when the platform is hostile to outsiders.

                                      1. 8

                                        See, I don’t see it that way. I see it as Mozilla deciding on an architecture for their software that renders that software definitely suboptimal on the Mac. It’s just a bad fit. I’m not claiming that Mozilla should have done things differently – they are welcome to allocate their resources as they see fit, and the Mac is most definitely a minority platform. There are many applications that run on the Macintosh that are not produced by Apple that don’t have these problems.

                                        iOS is a different story, one where hostility to outsiders is a more reasonable reading of Apple’s stance.

                                2. 2

                                  Now that I’m at work, I’m seeing what hjst is showing. This doesn’t bother me that much because I use the laptop at work more like a desktop (I keep it plugged in). But yes, I can see how Firefox might be a bit problematic to use on the Mac.

                                3. 1

                                  I’ll have to check the laptop at work. At home I have a desktop Mac (okay, a Mac mini).

                                4. 4

                                  There are known issues which are taking a long time to fix. Best example is if you change the display resolution on a retina Mac. You can almost see the battery icon drain away on my machine.

                                  1. 3

                                    I find it depends a lot on what FF is doing - usual browsing is fine, but certain apps like Google Docs or anything involving the webcam make it go crazy.

                                    1. 20

                                      Google sites, unsurprisingly if disappointingly, don’t work as well in Firefox as they do in Chrome. But that’s really on Google, not Mozilla.

                                      1. 15

                                        They used to actively break them - e.g. GMail would deliberately feed Firefox Android a barely-functional version of the site. https://bugzilla.mozilla.org/show_bug.cgi?id=668275 (The excuse was that Firefox didn’t implement some Google-specific CSS property, that had a version in the spec anyway.) They’ve stopped doing that - but Google’s actions go well beyond passively not-supporting Firefox.

                                  2. 5

                                    For me, it feels faster than Chrome on MacOS, but the reason I don’t use it is weird mouse scroll behavior (with an Apple mouse). It differs too much from Chrome’s behavior. I don’t know how to debug it, how to compare, or what the right behavior is (I suspect Chrome’s scrolling is non-standard and dampens acceleration, while Firefox uses standard system scrolling). It just feels very frustrating, but in a subtle way: I become nervous after reading lots of pages (not right after the first page). I tried various mouse-related about:config settings but none of them had any effect (and it’s hard to evaluate results because the differences are very subtle).

                                    Maybe the answer is to use a standard mouse with a clicky scroll wheel, but I hate clicky scroll wheels. “Continuous” scrolling is one of the best input device improvements of recent times (however it would be better if it was a real wheel/trackball instead of a touch surface).

                                    1. 1

                                      Have you tried Nightly yet? I believe there are some great improvements made recently for this. It isn’t all fixed, but it has improved.

                                      1. 3

                                        I’m on Nightly right now, and it hasn’t improved for me at least.

                                      2. -1

                                        I think macOS disadvantages apps that compete with Apple products. That’s unfortunate though.

                                        1. 7

                                          Any evidence for this statement?

                                          1. 9

                                            Do you have any proof?

                                            Anecdotally I use a lot of third-party apps that are a lot better than Apple’s counterparts.

                                            I just think the truth is that Firefox hasn’t spent enough time on optimizing for each platform, and on macOS, where look and feel is a huge deal, they simply fall short.

                                            1. 1

                                              The reports that Firefox has issues on macOS and Apple’s behaviour with iOS, for starters.

                                              1. 7

                                                Often the simplest solution is the correct one, meaning that it’s more likely that Firefox just hasn’t optimized for macOS properly. If you look at the bug reports on the bug tracker, this seems to be the case.

                                                Also, if your theory were correct, why are other non-Apple browsers like Chromium not having these issues? Could it perhaps be that they have in fact optimized for macOS, or do you propose that Apple is artificially advantaging them?

                                                1. 13

                                                  pcwalton hints on Twitter that the gains that e.g. Safari and WebKit have come from the use of private APIs in macOS. You could probably use those APIs from Firefox as well, at the cost of doing tons of research on your own, while WebKit can just use them. (Further down the thread, he hints at actually trying to bind to them.)

                                                  https://twitter.com/pcwalton/status/1068933432275681280

                                                  1. 3

                                                    That’s very interesting, and it’s probably a factor. However, these are problems that Firefox has, not all third-party browsers. No Chromium-based browser has these issues, at least in my experience. Maybe it’s through private APIs that you can optimise a browser the most on macOS, but it doesn’t change the fact that Firefox is under-optimised on macOS, which is why it performs as it does.

                                                    1. 8

                                                      Point being: Chromium inherits optimisations from Apple’s work which Mozilla has to work hard to develop in a fashion that works with their architecture. Yes, there’s something to be said about organisational priorities, but also about not being able to throw everyone at that problem.

                                                      I’m really looking forward to webrender fixing a lot of those problems.

                                                      1. 1

                                                        And it’s a sad fact, because I’d love to use Firefox instead of Safari.

                                                        1. 7

                                                          Sure, from a user’s perspective, all of that doesn’t matter.

                                                          Just wanted to say that this is hard and an uphill battle, not that people don’t care.

                                                          The Firefox team is well aware of those two contexts.

                                                  2. 0

                                                    It’s certainly possible. But at the very least Apple has little incentive to have Firefox work well on macOS. Chrom{e|ium} is so widely used, that Apple would hurt themselves if it didn’t work well on macOS.

                                                    I’d be a bit surprised if Mozilla is really falling down on optimising Firefox on macOS. It’s not as if Mozilla is a one man operation with little money. But perhaps they decided to invest resources elsewhere.

                                              2. 1

                                                That’s true in cases where apps want you to pay for features (like YouTube not offering Picture-in-Picture since it’s a paid feature and Apple wants money for it to happen) but not true in the case of Firefox. Unfortunately, Firefox’s JavaScript engine is just slower and sucks up more CPU when compared to others.

                                            2. 7

                                              Yeah, I’ve switched between Firefox and Chrome every year or two since Chrome came out. I’ve been back on Firefox for about 2 years now and I don’t see myself going back to Chrome anytime soon. It’s just better.

                                              1. 3

                                                Vertical tabs or bust.

                                                1. 6

                                                  A little more convincing: if you correlate with just “fifa”, the peaks do align. (And there are “fifa” spikes in the last week of June that are 10x bigger, and don’t align with “web” anything). Good reminder as to what’s really popular outside our expanding tech bubble.

                                                  1. 4

                                                    Wow, it’s the kind of thing that puts our little web development bubble into perspective. Just searches for a single web app are enough to swamp the numbers for “web app” in general :|

                                                    1. 3

                                                      Why does FIFA popularity increase every September?

                                                      1. 8

                                                        Like most sports franchise games, a new iteration is released annually, and in FIFA’s case it is released around September: typically $59.99 gets you minor gameplay and graphics updates, maybe a new gameplay mode nobody really cares about, and (most importantly) you get new player and team adjustments.

                                                        Why is it released in September? Players are free to move from club to club during transfer windows which are only open twice per season — and the leagues most people play and follow (England, Spain, most European leagues) close mid to late August. So this gives the developer time to handle any late transfers and set the rosters before release time.

                                                        Whether or not this is why “web app” spikes I’m not sure. But that’s why FIFA spikes in September.

                                                        1. 2

                                                          Ah, thanks for explaining! I didn’t know that.

                                                      2. 1

                                                        Wow, I think that’s it!

                                                      1. 3

                                                        Trying to publish a helm chart for Magda this week made me really wish that there was something like NPM for Helm where I could just write publish, so I’ve decided to make one.

                                                        Shouldn’t be that hard, just a wrapper around Chart Museum, although so far I haven’t even managed to get Minikube to run on my home linux box :(.

                                                        1. 5
                                                          • The learning curve is steep. Most tutorials do a great job of explaining it with super basic examples (e.g. if my function is sum(x, y) I should be able to swap the order of x and y with no change), but there’s a big jump from that to testing actual business logic with actual input
                                                          • I’ve used both ScalaCheck and JsVerify - in both I seem to spend a lot of time writing boilerplate for generators… e.g. there’ll be a generator for taking a subset of a list, but not an in-order sublist, or it can generate a random object but not one that conforms to a type definition. There’s gotta be room for a more user-friendly way of generating input.
                                                          • Shrinking is always a massive source of surprises - often you’ll hit a failing case, have the input shrunk to something that doesn’t satisfy the original parameters you set, then puzzle as to how it managed to generate a value that you specifically told it not to generate. This is especially bad in ScalaCheck because by default it shrinks lists of tuples down in a way that swaps the values inside the tuple.
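
As a rough illustration of the generator gap described above, here is a hand-rolled sketch that uses only Python's `random` module rather than ScalaCheck or JsVerify. It checks the basic sum(x, y) swap property and also generates in-order sublists. The names `gen_sublist`, `is_subsequence`, and `check` are hypothetical, and there is no shrinking at all.

```python
import random


def gen_sublist(rng, items):
    # Keep each element with probability 1/2, preserving the original order.
    return [x for x in items if rng.random() < 0.5]


def is_subsequence(sub, full):
    # True if `sub` appears in `full` in order (not necessarily contiguously).
    remaining = iter(full)
    return all(x in remaining for x in sub)


def check(prop, gen, runs=200, seed=0):
    """Return the first generated value that falsifies `prop`, or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        value = gen(rng)
        if not prop(value):
            return value
    return None


def gen_pair(rng):
    return rng.randint(-1000, 1000), rng.randint(-1000, 1000)


base = list(range(20))

# Swapping the arguments of a sum should never change the result.
sum_failure = check(lambda p: p[0] + p[1] == p[1] + p[0], gen_pair)

# Every generated sublist should be an in-order subsequence of the base list.
sublist_failure = check(
    lambda s: is_subsequence(s, base), lambda rng: gen_sublist(rng, base)
)

# Subtraction is not commutative, so this one should yield a counterexample.
sub_failure = check(lambda p: p[0] - p[1] == p[1] - p[0], gen_pair)
```

A real library adds shrinking and composable generators on top of this loop, which is exactly where the surprises mentioned above tend to come from.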
                                                          1. 6

                                                            For work, the automatic tools used by our pentester have uncovered that one of our services makes some seriously inefficient use of the database, enough so that a few queries per second can put it at 100% usage… so I’m trying to optimise them. There’s something oddly fun about using the postgres query analyzer, it’s a bit like a Zachlike game.

                                                            For my side project I’ve just finished doing a bunch of front-end work to try to make it more fun - inserting screens that show the user their progress along the way. Next technical step is probably making it so that you don’t have to log in right away, but the correct startup move is probably to close the text editor and do marketing for a while :(.

                                                            1. 3

                                                              Kamal: With Kubernetes you can set up a new service with a single command
                                                              Julia: I don’t understand how that’s possible.
                                                              Kamal: Like, you just write 1 configuration file, apply it, and then you have a HTTP service running in production

                                                              The best thing is that’s just the tip of the iceberg - you can describe entire n-tier systems like this, and with helm you can turn that configuration file into a template then publish it for everyone else to use and override where necessary. So say you have a setup with nginx in front of an app server talking to postgres and indexed by elasticsearch, you can write it all down as text, have the user override some key variables (maybe they need to use GCS instead of AWS) then have them install the whole thing with one command. If they need to run it locally, have them pass in a different set of variables and it’s the same command.

                                                              When you’ve got it all set up right, making a massive cluster of containers effortlessly spring into life and start talking to each other is just beautiful. Spending hours or days installing and setting up a multi-tier application is going to disappear as kubernetes catches on.

                                                              1. 2

                                                                Spending hours or days installing and setting up a multi-tier application is going to disappear as kubernetes catches on.

                                                                Maybe. The article even admits how much hand-waving the author is doing and links to “kubernetes the hard way”. Instead you’ll choose between spending hours then days spinning up and configuring kubernetes (make sure you really understand how it works including overlay networks, ingress/egress, security and secrets, etc.) or going with a k8s provider who has people who do that.

                                                              1. 2

                                                                Are there any decent alternatives?

                                                                1. 3

                                                                  I just use Google (which has full integration with these sites), hotel tonight, or the hotel’s website/phone line (they’ll usually price match).

                                                                  1. 1

                                                                    These days going directly to the hotel’s site is usually the same price or cheaper, especially if you’re booking way in advance. I’ve just been through a big booking spree for my trip through South America and I was amazed - even non-chain boutique hotels are sometimes 30% under the price of the aggregators.

                                                                    My usual process now is momondo -> google the hotel’s site.

                                                                    1. 1

                                                                      If you’re traveling to/within Asia, Agoda is the best option.

                                                                      1. 1

                                                                        hotels.com maybe, but I think they use similar techniques.

                                                                        1. 1

                                                                          trivago seems ok?

                                                                          1. 5

                                                                            They are the same company, just different domains: http://www.expediainc.com/expedia-brands/

                                                                            1. 2

                                                                              Never knew that, very interesting.

                                                                      1. 3

                                                                        I’ll let someone with more expertise in microservice architecture discuss the pros/cons. I didn’t find much here to convince me as to what is being missed.

                                                                        That’s probably because this is more an article about the Magda project (a data portal thing in JS), not what the title says.

                                                                        1. 2

                                                                          Yeah fair enough - I was trying to go for “and we’re already doing this so we know it’s practical” rather than something completely hypothetical. Didn’t actually realise until now that half the article’s specific to our project :|.

                                                                        1. 4

                                                                          I mostly agree, with the exception of calling into customer service (which the article specifically mentions) - for all intents and purposes that’s an open interface and if it turns out you can get valuable information out of it then there is remediation work to be done - either training the CSRs better or limiting what they have access to.

                                                                          1. 5

                                                                            I agree that there’s value in auditing ‘are CSRs following our security rules’ (outside of a pen-test).

                                                                            If you want to know whether the rules themselves are sufficient, you can give the pen-testers a copy to analyze and skip the part where they treat your staff like shit to see who cracks.

                                                                            1. 3

                                                                              With respect to CSRs, probably the most attention should be paid to their management and incentives. If they get rewarded for keeping whoever is on the phone happy, no matter what, and punished for refusing people, even if they’re asking to break the rules, then all demands to follow the rules no matter what aren’t going to have much effect.

                                                                            1. 3

                                                                              For my job I’m rushing to get a bunch of stuff into MAGDA for the end of June - primarily an authentication and discussion mechanism for datasets.

                                                                              Also been making a surprisingly large amount of progress towards the first MVP of my side project NicheTester - an injured wrist keeping me out of Judo has led to a bunch more free side project time :).

                                                                              1. 2

                                                                                Your project looks great.

                                                                                1. 1

                                                                                  Thanks :D

                                                                                  1. 1

                                                                                    I can see a paid path for the future to export the fake site to a real one using something like Shopify. I’ll be curious to see if any of the companies people validate on your platform take off.

                                                                              1. 6

                                                                                The result is a list of companies that do take-home tests… not much of an improvement to be honest, a lot of the time I’d rather spend 15 minutes on a whiteboard than the entire weekend trying to polish up whatever question they thought would only take 2 hours.

                                                                                1. 28

                                                                                  Now you can undermine unionized workers with free software, how cool is that? /s

                                                                                  1. 5

                                                                                    Could actual taxis use this kind of software?

                                                                                    1. 7

                                                                                      Here in Berlin the taxis have this kind of software. In SF you have Flywheel, which works like it. So yes, of course they can.

                                                                                    2. 4

                                                                                      It’s not like the taxi companies treat them well or anything.

                                                                                      1. 4

                                                                                        To be fair, I get the idea that this is more for remote places and the developing world where there’s no unionised taxi industry or even taxis - it’s more to give uber-like functionality to auto-rickshaws and taxi scooters.

                                                                                        EDIT: His blog on the app - the use case he had in mind was his village in Siberia: https://medium.com/@romanpushkin/how-i-made-uber-like-app-in-no-time-with-javascript-and-secret-sauce-94ef9120c7f6#.lez7k44zy

                                                                                        1. 1

                                                                                          How cool is it that the unions can get higher wages by keeping people out of the profession?

                                                                                          1. 0

                                                                                            Good point.

                                                                                          1. 7

                                                                                            While I agree with the title, the actual text of this article is pretty much “good coders aren’t like <things that don’t describe me>, they’re like <things that do describe me>”.

                                                                                            Grouping two 30-year-old female characters in with the Carver from Silicon Valley tops off what is pretty much an exercise in gatekeeping. I know plenty of good coders who don’t start being productive until after 9pm, who are brilliant but absent-minded enough to leave backups near speakers, who blast music, who wear confrontational t-shirts etc.

                                                                                            1. 7

                                                                                              I’ll be more interested to read their post after they’ve been on bare metal for a while. Going on-prem is one of those things that as a nerd I’d love to believe is better, but I haven’t heard a lot of great success stories.

                                                                                              1. 10

                                                                                                It turns out running a data center is a very difficult thing to do if you’re trying to optimize for cost, redundant utilities, manageability, and green-ness (high power efficiency per space).

                                                                                                1. 4

                                                                                                  Is renting space in a DC the same as running one? I was always under the impression one rented out some space and got some power outlets and network connections and everything else was managed by the DC owners.

                                                                                                  1. 4

                                                                                                    No, it’s not the same. It’s a compromise between the two extremes.

                                                                                                    You get to skip the headaches of having to manage redundant power, fibre connectivity, and most of the green-ness concerns, though you still need to do due diligence to assure yourself that the people running the DC are doing all those things to a standard that is acceptable for your use case.

                                                                                                    You get to keep the cost savings from buying machines up front (and amortising the capex) instead of renting them; this is AFAIK where most of the opex savings come from when switching from AWS to anything else. (Do price this up against reserved cloud VMs rather than against on-demand cloud VMs though, because you’re committing quite hard when you buy servers up front.)
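
                                                                                                    As a rough sketch of that comparison: every figure below is a made-up placeholder rather than a real quote or price list, and `monthly_owned_cost` is a hypothetical helper that uses simple straight-line amortisation. Plug in your own numbers before drawing conclusions.

                                                                                                    ```python
                                                                                                    def monthly_owned_cost(purchase_price, lifetime_months, colo_fee_per_month):
                                                                                                        """Up-front price spread evenly over the machine's life, plus the recurring colo fee."""
                                                                                                        return purchase_price / lifetime_months + colo_fee_per_month


                                                                                                    # Placeholder inputs (assumptions, not real prices).
                                                                                                    owned = monthly_owned_cost(
                                                                                                        purchase_price=6000.0,     # hypothetical server price
                                                                                                        lifetime_months=36,        # hypothetical amortisation period
                                                                                                        colo_fee_per_month=100.0,  # hypothetical space, power and bandwidth fee
                                                                                                    )
                                                                                                    reserved = 300.0   # hypothetical monthly price of a reserved cloud VM
                                                                                                    on_demand = 500.0  # hypothetical monthly price of an on-demand cloud VM

                                                                                                    # Comparing against on-demand alone overstates the saving;
                                                                                                    # the reserved price is the fairer baseline for a multi-year commitment.
                                                                                                    print(f"owned:     {owned:.2f} per month")
                                                                                                    print(f"reserved:  {reserved:.2f} per month (saving {reserved - owned:.2f})")
                                                                                                    print(f"on-demand: {on_demand:.2f} per month (saving {on_demand - owned:.2f})")
                                                                                                    ```

                                                                                                    This leaves out spares, hands-on time and hardware failures, so treat it as a lower bound on the owned side.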

                                                                                                    You will pay more for electricity, space, physical management and connectivity than the raw price of running a DC, because of course the company running the DC wants to make a profit. Note that a big DC selling colocation to a whole bunch of customers is going to be able to get way better economies of scale on some things that are really expensive (such as the person-time required to keep multiple actually-redundant internet connections in the face of telcos merging lines without telling you), which might offset the cost of their profit margin until your need for machines gets gargantuan.

                                                                                                    You will have to manage physical machines yourself, including managing the risk that an actual physical machine has an actual physical fault and dies. This can be “fun”; there are good and bad ways to find out what the lead time is for Dell/HP/etc to assemble and ship a new box to a colo (IME, most of a month). Try to do it one of the good ways.

                                                                                                    “I was always under the impression one rented out some space and got some power outlets and network connections and everything else was managed by the DC owners.”

                                                                                                    Yes, that’s what colocation facilities offer. You’ll have to manage what the machines actually do by yourself. You usually get a KVM-over-IP switch too for maintenance tasks. The colo will also provide basic services like “stick an Ubuntu 16.04 DVD in the drive and push the ‘on’ button for you so you can run the installer for that via the KVM”.

                                                                                                    1. 4

                                                                                                      Oh and one more thing: you can satisfy “this shared hypervisor isn’t providing enough storage IO, we need bare metal” without actually having to do all the above capex/opex tradeoff, purchase and manage physical machines yourself, etc. Some companies will quite happily rent you bare metal machines by the hour on roughly the same basis as they’d rent out VMs to you (e.g. Rackspace sell this as “OnMetal”).

                                                                                                      Edit: Amazon AWS sell something similar as “dedicated hosts”, where you still get a VM rather than bare metal but it’s guaranteed to be the only VM running on that physical server, so you aren’t subject to noisy-neighbour problems.