For most of my apps we use both Redis and Memcached. I'm thinking about removing Memcached from the stack completely, since everything Memcached does works just as well with Redis. What do you think? Is it time to remove Memcached from the stack and use only Redis?
I’ve dealt with both at relatively high scale (low MHz, persistent and non-persistent workloads, interesting outages from pushing them to high scale, heavy sharding) and may be able to help you zero in on an answer. Before anyone can give you a useful one, please answer these questions:
Generally I think of redis as a nice prototyping tool, but I haven’t been very impressed with it at high scale, and it has a lot of operational edges that cause it to be more human-time expensive than something like memcached for caches or mysql for persistence.
Concur. I found most of my semi-permanent, semi-ephemeral data could be recategorized as either permanent or ephemeral.
I’ve used Redis at high scale (x00,000 operations per second per instance, with several instances, for years) and it’s worked like a charm. Redis by itself never crashes unless you exceed some operational parameter, or you have people who don’t understand Redis writing code (e.g. using KEYS at all for any reason). I’ve additionally never lost data from a Redis instance that wasn’t directly attributable to user error or Amazon.
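To make the KEYS point concrete: KEYS walks the entire keyspace in one blocking call on Redis's single thread, while SCAN hands back a cursor plus a small batch and lets the client iterate incrementally. Here is a toy model of the cursor contract (not the real server code — the real SCAN uses a reverse-binary cursor over the hash table — just an illustration of why many short calls beat one O(N) blocking one):

```python
# Toy model of Redis SCAN semantics: each call returns (next_cursor, batch),
# and iteration is complete when the cursor comes back as 0.
def scan(keyspace, cursor=0, count=10):
    keys = sorted(keyspace)            # stand-in for the server's hash table
    batch = keys[cursor:cursor + count]
    next_cursor = cursor + count
    if next_cursor >= len(keys):
        next_cursor = 0                # 0 signals the iteration is finished
    return next_cursor, batch

keyspace = {f"user:{i}": i for i in range(25)}

# Client-side loop: many short, bounded calls instead of one blocking KEYS.
cursor, seen = 0, []
while True:
    cursor, batch = scan(keyspace, cursor)
    seen.extend(batch)
    if cursor == 0:
        break

assert sorted(seen) == sorted(keyspace)  # every key visited
```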
That said, Redis is actually a database construction kit rather than a database, so if you show up thinking you’re going to use it without understanding it, like you sort-of-can with mysql, then you’re screwed. It also has sharp operational edges (e.g. you should leave about twice, and certainly no less than once again, the size of your maximum Redis memory load as headroom, because Redis forks in order to save state, and rapid writes may dirty all the copy-on-write pages and force at least one full copy of your entire state; the filesystem driver may do so as well), and it has zero HA story (‘redis cluster’ is a bandaid, not a solution). If you want ‘high’ availability, have two or more slave readers subscribed to the master, and if the master ever goes away, manually promote a slave to master, reconfigure the topology, and continue.
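The fork/copy-on-write headroom rule is just arithmetic, but it's the thing people most often get wrong when provisioning. A minimal sketch — the 1x floor and 2x comfortable figures come straight from the advice above; the 8 GB instance size is a made-up example:

```python
def provisioning_bounds(maxmemory_gb):
    """Headroom for BGSAVE: Redis forks to save state, and a heavy write
    load can dirty every copy-on-write page, so in the worst case resident
    memory doubles while the save is in flight."""
    floor = 2 * maxmemory_gb        # absolute minimum: once again the dataset
    comfortable = 3 * maxmemory_gb  # dataset + ~2x headroom, per the advice above
    return floor, comfortable

# Hypothetical 8 GB maxmemory instance:
floor, comfortable = provisioning_bounds(8)
print(floor, comfortable)  # 16 24
```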
Redis is probably not my first choice for a persistent store, but it’s not that bad. It has a decent enough filesystem log story that is actually not too dissimilar from the way, e.g., a mysql instance handles it. It has the ability to send writes to slaves. You can persist to disk in the background without screwing the main thread. Pragmatically all that it really lacks is a real clustering story, but most of the time you can shard your database somehow, and Redis is so fast that it offers a lot of headroom until you get to that point unless your data is very large. Now that you can get machines with 2 terabytes of memory, very large is pretty darn large.
I really like this description! In the infrastructure I became involved with, Redis had been thrown up as a way to easily persist state at high throughput without requiring users to go through the time-intensive capacity planning conversations that needed to happen in that organization before using MySQL. As you may predict, the org learned many lessons about the importance of capacity planning! Well after non-recomputable data was being stored in it via an in-house sharding proxy (with auto-promote built in for handling [read: creating] failures), we realized that it was impossible to safely fail over a host and begin reslaving 10 instances off a single rotational disk.
Lessons:
Yeah. If you underprovision most database servers, they will grind to a halt in a moderately safe manner. If you underprovision redis, you can get into unexpected trouble real fast.
That said: properly provisioned Redis with enough memory to fit the use case and a fast SSD: insanely fast, great collection of features, about as safe log-wise as any other database, and scales surprisingly far.
I’m somewhat of an expert in the area, having developed the main memcached client for Ruby (Dalli) and a major user of Redis (Sidekiq). Short answer: I love them both.
memcached is fantastic because it requires almost no configuration at all: set the amount of memory to use and you’re done. It’s threaded so it will use as many cores as necessary and scale to the moon.
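To illustrate how little configuration that is: on Debian-style systems the whole service config fits in a few lines of /etc/memcached.conf (the values below are made-up examples, not recommendations):

```
# /etc/memcached.conf -- sketch; each line mirrors a command-line flag
-m 1024    # memory cap in megabytes -- the one setting that really matters
-t 4       # worker threads; memcached is threaded and scales across cores
-p 11211   # listen port (the default)
```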
redis requires more configuration to tune the persistence properly based on your needs. If you are using Redis for background jobs and caching, you ideally should have two different Redis instances with different persistence configurations. It’s also single threaded but this shouldn’t be a problem for anything but the largest of scales. Typical business apps will be fine.
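The two-instance split might look like this in redis.conf terms — a sketch, with numbers to be tuned to your workload. The cache instance disables persistence entirely; the jobs instance takes an append-only file fsynced every second:

```
# cache instance -- ephemeral, safe to lose
maxmemory 2gb
maxmemory-policy allkeys-lru
appendonly no
save ""                      # disable RDB snapshots too

# jobs instance -- don't silently drop work
maxmemory-policy noeviction
appendonly yes
appendfsync everysec         # at most ~1s of writes lost on a crash
```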
More info:
I think it’s worth drilling in on a few points you made and expanding a bit:
Unlike memcached, Redis is persistent, but like you said, that does require configuration. Given its persistence options, Redis can lend itself well to becoming a critical piece of infrastructure storing more than just cache data. Practically, relying on Redis works out 99.9% of the time (or more), but as soon as it’s not just a cache anymore you need to think about HA and disaster recovery. That’s where having two instances with two different configurations makes a lot of sense – you don’t need hard durability for things you’re caching.
Memcached is sharded out of the box. Add another machine and you’re good to go. After all, it’s just a cache, right? Redis requires a little more effort and care to shard. twemproxy is a great way to do that if you need to, and it works for memcached too (though the benefits are less obvious unless you’re running a huge cache farm).
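For reference, a twemproxy (nutcracker) pool for sharding Redis is just a few lines of YAML; addresses and weights below are placeholders:

```yaml
# nutcracker.yml -- consistent hashing across two Redis shards
redis_pool:
  listen: 127.0.0.1:22121
  hash: fnv1a_64
  distribution: ketama      # consistent hashing
  redis: true               # speak the Redis protocol, not memcached's
  servers:
    - 10.0.0.1:6379:1       # host:port:weight (placeholders)
    - 10.0.0.2:6379:1
```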
Reading the above, you might think I’d be voting in favor of keeping memcached. Honestly, it depends on how big your deployment is, and the kinds of work you’re bringing to Redis. If your cache usage is fairly light (100/s-1000/s), Redis might well be a good way to consolidate infrastructure. Of all the workloads you could heap onto Redis, caching is among the best since it does not require durability (though is helped considerably by it; cold caches hurt).
Using Redis for most things besides caches and truly ephemeral data is a pretty bad idea. While it can be made persistent, it has no real HA story (Sentinel is not ready for production, writes are not guaranteed to make it to each slave before a failover, and in testing I found it really easy to confuse sentinel and get it in a state where it could not elect a new master).
Hello, just a few comments about Sentinel and HA in Redis, in order to offer a point of view different from yours.
Also note that newer versions of Redis tend to be much more verbose and are able to report why a promotion is not possible. You can also configure it with different tradeoffs: for example, slaves that do not appear to have an up-to-date state are normally not considered candidates to replace the old master, so with random reboots it is very easy to get into a state where no promotion is possible, because none of the slaves can provide evidence that they have a decently updated data set.
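Those tradeoffs live in sentinel.conf. A minimal monitoring sketch — the master name, address, and timeouts here are examples, not recommendations:

```
# sentinel.conf -- a quorum of 2 sentinels must agree the master is down
sentinel monitor mymaster 10.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
# how many replicas may resync from the newly promoted master at once
sentinel parallel-syncs mymaster 1
```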
Disclaimer: I’m the primary author of Redis and Redis Sentinel.
Thank you for sharing! So basically at high volume you wouldn’t recommend replacing Memcached with Redis for the simple key/value cache use case?
It totally depends on your use case. Just don’t use either as your primary data store. Redis is a better choice for pure KV in certain situations, but has more operational complexity. That operational complexity probably isn’t worth it, but if you need to maximize throughput at the cost of multitenancy and increased automation and debugging time, you can pin Redis instances to different CPU cores and pin network-related IRQ handling to another. Disable all disk activity, and make your users pick the maxmemory-policy for their clusters so that it becomes psychologically real to them that their data will be deleted over time, or that their cluster will stop accepting new data and their write path will block.
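That maxmemory-policy choice is exactly the "psychologically real" knob: it decides what happens when the memory cap is hit. A sketch of the two contracts users have to pick between:

```
maxmemory 4gb

# Cache semantics: least-recently-used keys are evicted,
# so data visibly disappears over time.
maxmemory-policy allkeys-lru

# Store semantics (alternative): writes fail with an OOM error once the
# cap is reached, which in practice stalls the application's write path.
# maxmemory-policy noeviction
```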