General themes I saw:
Understand your third-party dependencies. Two outages happened because the authors depended on an external service. In one, they didn’t track how close they were getting to their usage limit. In the other, the lesson is that if you rely on a third party or your database, you’re bound to their SLAs.
Static types are probably a good idea when dealing with Unicode.
Another thing to note is that these numbers are not very large. The reminders for 5000 people would easily fit in RAM.
Hey, thanks for reading! Yeah, we did not understand our services very well. We didn’t even realize Google’s timezone API had a limit until we hit it. We’re also using Python, so unfortunately no static types. Finally, yes, our data was not very large (it was about 2MB before we migrated), but it was computationally expensive to search because we naively put everything into a Redis list. It worked fine when it was only us and some friends using Jarvis, but obviously this does not work at scale.
I’m frankly completely baffled that you didn’t use something simpler for such a small data set.
Why’d you all start out with Redis? And why’d you use it the wrong way?
What is simpler than Redis? We could just store things in memory, but we’d lose it on every push. MongoDB is probably a better choice, but that’s about as complicated as Redis to set up: you still need to connect to the db and make the same queries.
By push do you mean redeploy?
How hard would it be to write a signal handler to accept something like SIGUSR1, stop accepting connections, barf the memory as a JSON blob to disk, and shut down? And then to add a command-line option to specify a file to seed with on startup before accepting connections?
Hah, don’t even need json. Python does this automagically.
Literally just pickle and signal.
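For anyone curious, here is a rough sketch of the pickle-and-signal idea, not anyone's actual code. The `REMINDERS` dict, the `SNAPSHOT_PATH` file name, and the `--seed` flag are all made up for illustration, and it assumes a Unix host (SIGUSR1 does not exist on Windows) with a filesystem that survives restarts. Also, only unpickle files you wrote yourself, since pickle can run arbitrary code on load.

```python
import argparse
import os
import pickle
import signal
import sys
import tempfile

# Hypothetical in-memory store: reminder id -> reminder data.
REMINDERS = {}
# Assumed snapshot location; not taken from the thread.
SNAPSHOT_PATH = "reminders.pickle"


def save_snapshot(path, data):
    """Pickle the store to a temp file, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(data, f)
    # os.replace is atomic on the same filesystem, so a crash mid-write
    # never leaves a half-written snapshot at `path`.
    os.replace(tmp_path, path)


def load_snapshot(path):
    """Return the saved store, or an empty dict if no snapshot exists."""
    if not os.path.exists(path):
        return {}
    with open(path, "rb") as f:
        return pickle.load(f)


def handle_sigusr1(signum, frame):
    # A real server would stop accepting connections before dumping.
    save_snapshot(SNAPSHOT_PATH, REMINDERS)
    sys.exit(0)


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--seed", help="snapshot file to load before serving")
    args = parser.parse_args(argv)
    if args.seed:
        REMINDERS.update(load_snapshot(args.seed))
    # Must be called from the main thread.
    signal.signal(signal.SIGUSR1, handle_sigusr1)
    # ... start accepting connections here ...
```

You would call `main()` at startup, then run `kill -USR1 <pid>` before a restart and pass `--seed reminders.pickle` on the next launch.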
This doesn’t seem that much easier to me. I’m not very familiar with writing signal handlers in Python, whereas I’ve gotten Redis set up before. Plus, what you suggested can’t work on our platform. Heroku uses an ephemeral filesystem, so the filesystem gets destroyed on every redeploy. So we’d lose the json blob anyways.
Not to mention the Heroku docs make it ridiculously easy to get Redis set up. So really, we just went with the path of least resistance for us.
Ah, okay, that makes more sense if you’re on Heroku. Carry on!
Doesn’t Heroku have a trivial PostgreSQL setup?
Yeah, but then we’d have to deal with migrations and ORMs. In the end, we had to do it anyways, but it was a hedge: if we didn’t get as much traffic as we did, we wouldn’t have wasted too much time writing boilerplate.
You don’t need an ORM. This is a common misconception. You can also store indexable key-value data and indexable JSON documents in Postgres.
Postgres is pretty damn flexible…
Tangential observation about peakiness: 13000 reminders at 150 req/min is less than 100 minutes. Warhol was right. In the future, everyone’s app will be famous for 15 minutes. :)
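The back-of-the-envelope math checks out, assuming one request per reminder and a steady rate (both simplifications, not something stated in the post):

```python
reminders = 13000      # total reminders, figure quoted in the comment
rate_per_min = 150     # requests per minute, figure quoted in the comment

# Assumes one request per reminder at a constant rate.
minutes = reminders / rate_per_min
print(f"{minutes:.1f} minutes")  # 86.7 minutes
```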
Hopefully that’s not the case here! We’re planning on submitting Jarvis again to other places this weekend. We’d really like to see him grow, as we think he’s legitimately useful to people. Thanks for reading!
What is the stack behind this, and what was the road to get this working with Facebook Messenger?