That’s a really good summary. Just like security, this is 90% about having an inventory and knowing what happens where. Awesome write-up, and pretty close to a guide for “you’re joining a new company as a devopsy/SRE person: how do you learn about the things missing from onboarding, and what questions should you ask?”
I bookmarked it with the same “what to ask on joining” intention, good point! Do you know anything like this for data engineering and adjacent disciplines? I can’t get through the massive number of “top 678 interview questions to ask”-type articles when I search for it :’(
[my problem is not primarily with the content of this article, but the context]
“Whatever got you here, your infrastructure person has left, maybe suddenly. You have been moved into the role with almost no time to prepare.”
This is a terrible, terrible situation.
The point of devops is not “devs can do ops!”. The point of devops is that devs and ops should work together, using the tools of modern software development to implement repeatable, debuggable operational infrastructure. This is not a new idea; cfengine was written in 1993, and that wasn’t the first run at devops methods, just the first one that gained traction. Ops have always written software tools to help out.
If you have no ops experience on your team, you cannot “do devops.” At best, you can start learning operations skills. Think of this as a software development project where you do not have a subject-matter expert at hand and you probably don’t have a requirements document other than “the spice must flow!”
If your company had a single operations person and suddenly they don’t, you have three problems:
If this is the situation you find yourself in, it is probably past time to find another company.
If you are the founder/owner: how did you not see this coming? There were no warning signs? Consider pausing all activity in order to rethink the company.
A company with a single devops person is probably a company with limited resources and not enough work for two devops people, and it probably faces many bigger risks than losing a devops person. While this is not a nice situation to be in, there are many worse situations to be in. It doesn’t make any sense to overcommit scarce resources in one business area, especially if one possible fix is to take an existing developer and hope he keeps the fires down for a while, giving you time to find someone else to keep things running smoothly. What else would you suggest?
You say “devops person” as though it were a separate job title.
In that small company, everyone in development and ops and network engineering and security is on the devnetopsec team. Cross-train. Document. Or accept the risk that one person leaving will sink the company.
Great takeaways from this article. I would love to see some of the listed follow-up articles on this topic, especially:
DNS. If it’s not in the terraform directory we made before under Route53, it must be somewhere else. We gotta manage that like we manage a server because users logging into the DNS control panel and changing something can cripple the business.
Kubernetes. Should you use it? Are there other options? If you are using it now, what do you need to know about it?
Migrating to managed services. If your company is running its own databases or baking its own AMIs, now might be a great time to revisit that decision.