1. 13

    I’m somewhat of an expert in the area, having developed the main memcached client for Ruby (Dalli) and being a major user of Redis (Sidekiq). Short answer: I love them both.

    memcached is fantastic because it requires almost no configuration at all: set the amount of memory to use and you’re done. It’s threaded so it will use as many cores as necessary and scale to the moon.

    redis requires more configuration to tune the persistence properly based on your needs. If you are using Redis for background jobs and caching, you ideally should have two different Redis instances with different persistence configurations. It’s also single threaded but this shouldn’t be a problem for anything but the largest of scales. Typical business apps will be fine.
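
To make the "two instances, two persistence configurations" idea concrete, here is a minimal sketch that renders one redis.conf fragment per instance. The directive names are standard redis.conf settings; the helper `render_conf`, the ports, and the memory limit are assumptions for illustration only, not tuning advice.

```python
# Hypothetical settings for two Redis instances on one host.
# Values are illustrative assumptions, not recommendations.

CACHE_INSTANCE = {
    "port": 6379,
    "save": '""',                       # disable RDB snapshots
    "appendonly": "no",                 # no append-only file either
    "maxmemory": "2gb",                 # assumed limit for the example
    "maxmemory-policy": "allkeys-lru",  # evict old keys when full
}

JOBS_INSTANCE = {
    "port": 6380,
    "appendonly": "yes",                # log every write to the AOF
    "appendfsync": "everysec",          # fsync the AOF once per second
    "maxmemory-policy": "noeviction",   # never silently drop queued jobs
}


def render_conf(settings):
    """Render a dict of directives as redis.conf text (hypothetical helper)."""
    return "\n".join(f"{name} {value}" for name, value in settings.items()) + "\n"


cache_conf = render_conf(CACHE_INSTANCE)
jobs_conf = render_conf(JOBS_INSTANCE)

print("# cache instance")
print(cache_conf)
print("# background jobs instance")
print(jobs_conf)
```

The cache instance gives up durability entirely, while the jobs instance trades some write throughput for an append-only log.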

    More info:

    1. 6

      I think it’s worth drilling in on a few points you made and expanding a bit:

      Unlike memcached, Redis is persistent, but like you said it does require configuration. Because it can persist data, Redis lends itself well to becoming a critical piece of infrastructure that stores more than just cache data. Practically, relying on Redis works out 99.9% of the time (or more), but as soon as it’s not a cache anymore you need to think about HA and disaster recovery. That’s where having two instances with two different configurations makes a lot of sense – you don’t need hard durability for things you’re caching.

      Memcached is sharded out of the box. Add another machine and you’re good to go. After all, it’s just a cache, right? Redis requires a little more effort and care to shard. twemproxy is a great way to do that if you need to, and it works for memcached too (though the benefits are a little less obvious unless you’re running a huge cache farm).
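
Memcached servers do not talk to each other; the "sharding" lives in the client, which hashes each key to pick a server. A minimal sketch of that idea, assuming plain modulo hashing (the server names and `pick_server` are hypothetical, and many real clients use consistent hashing instead so that adding a machine remaps fewer keys):

```python
import hashlib


def pick_server(key, servers):
    """Map a key to one server by hashing the key (hypothetical helper)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Assumed server list for the example.
servers = ["cache1:11211", "cache2:11211"]

for key in ("user:1", "user:2", "session:abc"):
    print(key, "->", pick_server(key, servers))
```

Because every client computes the same hash, they all agree on where a key lives without any coordination between the servers.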

      Reading the above, you might think I’d be voting in favor of keeping memcached. Honestly, it depends on how big your deployment is, and the kinds of work you’re bringing to Redis. If your cache usage is fairly light (100/s-1000/s), Redis might well be a good way to consolidate infrastructure. Of all the workloads you could heap onto Redis, caching is among the best since it does not require durability (though it is helped considerably by it; cold caches hurt).

      1. 4

        Using Redis for most things besides caches and truly ephemeral data is a pretty bad idea. While it can be made persistent, it has no real HA story (Sentinel is not ready for production, writes are not guaranteed to make it to each slave before a failover, and in testing I found it really easy to confuse sentinel and get it in a state where it could not elect a new master).

        1. 2

          Hello, just a few comments about Sentinel and HA in Redis, in order to offer a point of view different from yours.

          1. Sentinel starting with Redis 3.0 is considered to be production ready.
          2. The fact that it has a weak (and very documented) consistency model (best effort consistency with asynchronous replication, with different failure modes that can lead to data loss, but with attempts to avoid losing data when possible) does not mean you can’t use it; it means that you need to apply it to use cases where this consistency model makes sense for the application.
          3. The new Redis Sentinel (v2) acts on a very small set of fixed rules. For example, every failover generates a guaranteed unique configuration number, and newer configurations eventually win over old ones when partitions heal; it’s all documented. If there are behaviors that don’t conform with what Sentinel is supposed to do, please report them: we test Sentinel and can’t find issues like the ones you describe (however, make sure to test the latest 3.0 for maximum stability – the latest 3.2 can also be an option but has a lot of new code).
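
The "newer configurations win" rule in point 3 can be sketched as a tiny merge function. This is only an illustration of the rule as described above, not Sentinel's actual code; `MasterConfig` and `merge` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MasterConfig:
    epoch: int    # unique configuration number produced by a failover
    master: str   # address of the node believed to be master


def merge(local, remote):
    """Keep whichever configuration has the higher epoch."""
    return remote if remote.epoch > local.epoch else local


# Assumed scenario: one side of a partition still holds the old view,
# the other side performed a failover and bumped the epoch.
old_view = MasterConfig(epoch=1, master="10.0.0.1:6379")
new_view = MasterConfig(epoch=2, master="10.0.0.2:6379")

# When the partition heals, both sides exchange views and converge.
print(merge(old_view, new_view))
print(merge(new_view, old_view))
```

Since epochs are unique, the comparison never ties between two different configurations, so every node ends up with the same answer regardless of the order in which it hears about them.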

          Also note that newer versions of Redis tend to be much more verbose and are able to report why a promotion is not possible; you can also configure it with different tradeoffs. For example, slaves that do not appear to have an updated state are normally not considered candidates to replace the old master, so with random reboots it is very easy to get into a state where no promotion is possible, because none of the slaves can provide evidence of having a decently updated data set.

          Disclaimer: I’m the primary author of Redis and Redis Sentinel.

        2. 1

          Thank you for sharing! So basically at high volume you wouldn’t recommend replacing Memcached with Redis for the simple key/value cache use case?

          1. 2

            It totally depends on your use case. Just don’t use either as your primary data store. Redis is a better choice for pure KV in certain situations, but has more operational complexity. That operational complexity probably isn’t worth it, but if you need to maximize throughput at the cost of multitenancy and increased automation and debugging time, then you can pin redis instances to different cpu cores, and pin network related IRQ handling to another. Disable all disk activity, and make your users pick the maxmemory-policy for their clusters so that it becomes psychologically real for them that their data will be deleted over time / their cluster will stop accepting new data and their writepath will block.
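
To make the maxmemory-policy choice "psychologically real", here is a toy model of the two outcomes mentioned above: data being deleted over time versus writes being refused. It is a sketch only. `TinyStore` is hypothetical, it counts keys rather than bytes, and it uses exact LRU, whereas Redis measures memory and approximates LRU by sampling. The policy names `allkeys-lru` and `noeviction` are real Redis values; under `noeviction` a full Redis answers writes with an error.

```python
from collections import OrderedDict


class TinyStore:
    """Toy bounded key-value store (hypothetical, for illustration)."""

    def __init__(self, max_keys, policy):
        if policy not in ("noeviction", "allkeys-lru"):
            raise ValueError(f"unsupported policy: {policy}")
        self.max_keys = max_keys
        self.policy = policy
        self.data = OrderedDict()  # least recently used key comes first

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data[key] = value
            self.data.move_to_end(key)
            return
        if len(self.data) >= self.max_keys:
            if self.policy == "noeviction":
                raise MemoryError("store is full, write rejected")
            self.data.popitem(last=False)  # evict least recently used key
        self.data[key] = value


cache = TinyStore(max_keys=2, policy="allkeys-lru")
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")       # "a" is now more recently used than "b"
cache.set("c", 3)    # evicts "b"
print(sorted(cache.data))
```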

      1. 4

        The major feature in this seems to be Redis Cluster. The specification of Redis Cluster is:

        • Acceptable degree of write safety: the system tries (in a best-effort way) to retain all the writes originating from clients connected with the majority of the master nodes. Usually there are small windows where acknowledged writes can be lost. Windows to lose acknowledged writes are larger when clients are in a minority partition.
        • Availability: Redis Cluster is able to survive to partitions where the majority of the master nodes are reachable and there is at least a reachable slave for every master node that is no longer reachable. Moreover using replicas migration, masters no longer replicated by any slave, will receive one from a master which is covered by multiple slaves.

        I think these semantics are a bit odd. I’m having trouble parsing the Availability section completely but I believe it is saying that it is neither consistent nor available. I’m not sure what class of problems this solution fits into.

        1. 3

          Hello, availability is limited on purpose in the minority partition. Semantically Redis Cluster is eventually consistent, but the merge function is “last failover wins”, so there is no gain at all in writing to the minority partition: those writes would all go to populate the “lost writes” fiesta. So in practical terms the cluster is available only on the side of the partition where: 1) there is the majority of masters, and 2) there is at least one slave serving the hash slots of each unreachable master. Assuming you have a 6-node cluster, M1, M2, M3, S1, S2, S3: if M1, M2, and S3 are alive, the cluster can continue, but if a partition splits it into M1, M2, S1, S2 | M3, S3, neither side is able to continue.
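
The two conditions can be checked mechanically. A minimal sketch, assuming a static map from each master to its slaves; `side_can_continue` is a hypothetical helper and it ignores everything else a real cluster does (failure detection, voting, timeouts):

```python
def side_can_continue(reachable, replicas_of):
    """Return True if this side of a partition satisfies both conditions.

    reachable   -- names of the nodes visible on this side
    replicas_of -- maps each master to the list of its slaves
    """
    reachable = set(reachable)
    masters = set(replicas_of)

    # 1) The majority of masters must be on this side.
    if 2 * len(masters & reachable) <= len(masters):
        return False

    # 2) Every unreachable master needs at least one reachable slave.
    for master in masters - reachable:
        if not any(slave in reachable for slave in replicas_of[master]):
            return False
    return True


cluster = {"M1": ["S1"], "M2": ["S2"], "M3": ["S3"]}

print(side_can_continue(["M1", "M2", "S3"], cluster))
print(side_can_continue(["M1", "M2", "S1", "S2"], cluster))
print(side_can_continue(["M3", "S3"], cluster))
```

The three calls correspond to the cases in the comment above: the first side can continue, while both halves of the M1, M2, S1, S2 | M3, S3 split fail a condition (no reachable slave for M3 on one side, no majority of masters on the other).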

          The availability is improved via replicas migration. In the above setup, if you add S4, an additional slave, it will migrate to whichever master remains uncovered. For example, if M1 fails and S1 gets promoted as the new master for the same keys, but M1 does not come back, S4 will migrate away from the master that had two slaves in order to protect the keys M1 used to serve.
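
A rough sketch of that migration step, assuming the same master-to-slaves map as a plain dict. `migrate_spare_replica` is hypothetical, and which slave gets moved is arbitrary here; the real selection rules are not described in this thread.

```python
def migrate_spare_replica(replicas_of):
    """Move one slave from a master with two or more slaves to a master
    with none. Returns the name of the moved slave, or None if no move
    is possible. Mutates replicas_of in place."""
    uncovered = [m for m, slaves in replicas_of.items() if not slaves]
    donors = [m for m, slaves in replicas_of.items() if len(slaves) >= 2]
    if not uncovered or not donors:
        return None
    donor = max(donors, key=lambda m: len(replicas_of[m]))
    moved = replicas_of[donor].pop()
    replicas_of[uncovered[0]].append(moved)
    return moved


# Assumed state after M1 failed and S1 was promoted in its place:
# S1 is now a master with no slaves, M2 holds the extra slave S4.
after_failover = {"S1": [], "M2": ["S2", "S4"], "M3": ["S3"]}

moved = migrate_spare_replica(after_failover)
print(moved, after_failover)
```

After the move every master is covered by one slave again, so the cluster can survive one more master failure.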

          You are right: Redis Cluster is neither consistent nor available in terms of CAP.

          About the set of problems served: the guarantees are identical to current Redis master-slave setups, and also similar to PostgreSQL / MySQL failovers when asynchronous replication is used. So people that are now using those systems may find Redis Cluster appropriate for their set of problems.

          1. 1

            Oh, I understand the semantics, I just don’t see how they are useful.

            “also similar to PostgreSQL / MySQL failovers when asynchronous replication is used. So people that are now using those systems may find Redis Cluster appropriate for their set of problems.”

            Yes, I also don’t really know what problems that setup solves in PostgreSQL/MySQL.

            IMO, Redis semantics simply are not usefully distributed without paying the price of synchronous replication.

        1. 2

          “The fourth misconception is that eventual consistency is all about CAP, and that everybody would chose strong consistency for every application if it wasn’t for the CAP theorem.”

          This is an extremely widespread misconception. CP systems are actually pretty good at availability while not reaching the “A” of CAP, since all the clients in the majority partition can make progress. This is enough HA for most systems. The huge tradeoff in CP systems is performance. That’s why eventually consistent systems are so popular, not because most people need clients able to make progress in minority partitions (a few use cases definitely require that, btw).