Apologies if this is off-topic for lobsters, but it’s a big enough service to be interesting when it goes down. I’m hoping they post a postmortem at some point.
Dyn is being DDoSed. Apparently this also took down PagerDuty in the process.
Also, if you like knowing about outages, the outages mailing list is a good read/watch: https://puck.nether.net/mailman/listinfo/outages
I had no idea this existed, looks pretty handy, thanks!
Root cause appears to be https://www.dynstatus.com/incidents/nlr4yrr162t8
Maybe I’m naive, but I’m surprised that DNS is outsourced for a company as big as Twitter. It’s how customers get to your website, I’d have thought they’d want to control that.
Given the nature of DNS, having as few hops as possible makes a difference. So, it’s not surprising to me that it’s outsourced. What is surprising to me is the number of large sites (my employer included) that opted to host on a single provider without a backup plan or redundancy in place.
Amazon.com also hosts their DNS on Dyn, but they seem to replicate to UltraDNS as well. I’ve found very few examples of this strategy.
Amazon.com also hosts their DNS on Dyn, but they seem to replicate to UltraDNS as well.
As of some time today – when I looked this morning:
~> host -t NS us-east-1.amazonaws.com
us-east-1.amazonaws.com name server ns3.p31.dynect.net.
us-east-1.amazonaws.com name server ns1.p31.dynect.net.
us-east-1.amazonaws.com name server ns2.p31.dynect.net.
us-east-1.amazonaws.com name server ns4.p31.dynect.net.
Interestingly enough, east-2 and west-1 and 2 used UltraDNS, Dyn, and amazonaws.com. It looks like they’ve since added UltraDNS and amazonaws.com to east-1.
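For anyone who wants to check their own zones for the single-provider situation described above, here is a rough sketch. It assumes you have already collected the NS hostnames (for example from `host -t NS`), and it guesses the "provider" naively from the last two DNS labels, which will be wrong for suffixes like co.uk. The function names are made up for illustration.

```python
from collections import defaultdict


def group_by_provider(nameservers):
    """Group NS hostnames by a naive provider key: the last two DNS labels.

    Assumption: the last two labels identify the provider. That holds for
    names like ns1.p31.dynect.net but not for multi-part public suffixes.
    """
    providers = defaultdict(list)
    for ns in nameservers:
        labels = ns.rstrip(".").lower().split(".")
        providers[".".join(labels[-2:])].append(ns)
    return dict(providers)


def is_single_provider(nameservers):
    """Return True when every listed name server maps to the same provider key."""
    return len(group_by_provider(nameservers)) == 1


# NS records copied from the `host -t NS` output above.
east_1 = [
    "ns3.p31.dynect.net.",
    "ns1.p31.dynect.net.",
    "ns2.p31.dynect.net.",
    "ns4.p31.dynect.net.",
]

if __name__ == "__main__":
    print(group_by_provider(east_1))
    print("single provider:", is_single_provider(east_1))
```

This only looks at the names in the delegation. It says nothing about whether the providers share upstream infrastructure.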
I’d be very curious to know why. I’d guess “Dyn is fine enough, it’s on the backlog” is the likely answer.
Yeah, it’s not clear to me when they added UltraDNS. Maybe they were quicker to respond to this problem than most and did it when Dyn started failing.
Some time between 09:51 EDT and 15:03 EDT today, if my Slack timestamps are to be trusted!
But yeah, I’d imagine they had built the ability to do DNS replication and simply hadn’t placed it in their legacy us-east-1.
Interesting. I would’ve thought they’d use their own infrastructure (eg, Route 53) rather than relying on Dyn or anyone else.
Neither Google nor Facebook seem to outsource this.
I’m more surprised that Facebook doesn’t outsource this. But then again, they likely have data centers, and full staffs supporting them. If you own everything but the actual link, you can tune and optimize it all, and potentially save power, energy…milliseconds of latency.
And I’m sure they have lots of mitigation strategies for DDoS, though I’m sure Dyn does too…
If it is off-topic enough you have to say it in the description, don’t post it.
Wait for the outage report, then post that. Good instincts, just follow them next time.
I disagree. This is more than just Twitter. The Dyn DNS outage is breaking news.
I could agree with that, but then the news is “Dyn under attack”, not “Twitter down”.
Agreed. Unfortunately I can’t edit it once posted :/
If I’m not mistaken–and this is just for future reference–I think you can usually poke one of the mods to help in this sort of case.
There are 4 upvotes, so some people are obviously interested.
I agree though. The public post mortem will be a much more interesting post.
“interested” is the threshold over at HN. We exercise a little more choosiness here.
If somebody posted a cartoon about the current election, I guarantee it would get upvotes despite being woefully off-topic and toxic.
We exercise a little more choosiness here.
As a member of the lobste.rs community, I haven’t given you permission to speak for me. Feel free to point out if you don’t think certain content is appropriate, but don’t try to speak for me.
Would you agree or disagree that submissions to Lobsters generally do not follow the simple “interesting to good hackers” metric that has caused spam to pervade HN?
I appreciate your point about not speaking for you, and I’m sorry if my wording caused offense.
The problem with HN is not the “interesting” part, but the definition of “hacker”, really. Well, that’s a problem with HN.
Please. Do you have any idea how long I spent browsing curated lists of editor configs? I know all about being a hacker.
But are you a growth hacker?
You didn’t browse them from home, did you?
I had no problem with the submission. Anything that brings Twitter down for hours, the keyword being hours, is likely to be interesting.
This is an interesting perspective, because I assume by “we” you mean the community at large, which has now posted many comments on the original story, and voted it to 8 points. Apparently we don’t want to exercise choosiness here. I didn’t upvote this, though I find the comments particularly interesting.
The Dyn outage is a pain. GitHub login is down for us :(
GitHub … another painful single point of failure. We’ll all regret it someday.
I think many are already regretting it today, since deployments that have to call out to a Git or package repo (like npm) aren’t working.
When big DNS outages like this happen, people are quick to come to the defense of customers on those providers, saying it’s smarter to use a big provider like Dyn than to do it yourself because surely you alone would not be able to mitigate such an attack. But what usually happens is that these large providers get targeted because they are so large. If someone wanted to take down GitHub, PayPal, Twitter, and a bunch of other sites all on one day and they all hosted their own DNS, would such an attack be possible? Would someone even want to target all of them at once?
When small companies start relying on these few giant providers like Dyn, AWS, CloudFlare, Akamai, etc., they become collateral damage and probably suffer more outages because of problems with other customers than if they had hosted themselves (at least in terms of being the target of a DDoS attack).
I fear that the world we live in today is so hacked and broken that for any company the gap between “too small for anyone to care about” and “big enough to trust real companies with core infrastructure” is shrinking incredibly quickly.