This is a pretty balanced article. One omission is Nomad, which promises the world, but when you dig into the code, pretty large chunks of what is advertised are simply not there yet. Nomad seems like a much more straightforward implementation of the Borg paper, and it may be interesting one day once they write the rest of it. A nice Kubernetes feature that is similar to what you can do with fleet is the “DaemonSet”, which lets you run certain things on every node. Some cool Mesos features that are pretty new and haven’t been talked about much yet:
persistent volumes: let frameworks get their directories on agent machines back after an availability event, which is nice for replicated stateful services
maintenance primitives: schedule machines to go offline at certain times, and ask frameworks whether it’s safe to take nodes offline. This will soon start being used for stateful services so that they can vote on when it’s safe to take out a replica, and to trigger proactive re-replication when maintenance is planned.
oversubscription: if an agent has given away all of its resources but detects that there is still some unutilized CPU, it can start “revocable tasks” to fill the slack, up to the point where they start interfering with existing workloads.
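To make the oversubscription idea a bit more concrete, here is a rough Python sketch of the slack calculation. This is not Mesos code and not how its resource estimator or QoS controller is actually implemented; the `Agent` fields, the `headroom` margin, and both function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    total_cpus: float      # CPUs the agent has in total
    allocated_cpus: float  # CPUs already handed out to regular tasks
    used_cpus: float       # CPUs those regular tasks are actually using


def revocable_cpus(agent: Agent, headroom: float = 0.1) -> float:
    """Estimate how much allocated-but-idle CPU could be offered as revocable.

    `headroom` is a hypothetical safety margin (a fraction of total CPUs)
    kept free so regular tasks can burst without immediate interference.
    """
    slack = agent.allocated_cpus - agent.used_cpus
    reserve = agent.total_cpus * headroom
    return max(0.0, slack - reserve)


def should_revoke(agent: Agent, revocable_in_use: float) -> bool:
    """Return True when regular plus revocable usage exceeds the agent's
    capacity, i.e. revocable tasks have started to interfere."""
    return agent.used_cpus + revocable_in_use > agent.total_cpus
```

For example, an agent with 8 CPUs, all allocated but only 3 in use, would have about 4.2 CPUs to offer as revocable under these assumptions. Once the regular tasks ramp up and the total demand goes past 8, `should_revoke` flips to `True` and the revocable tasks are the ones that get killed.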
Would it be worth it to use something like Kubernetes with only one node?
I don’t know whether any of them supports it, but I would like to have zero-downtime deployment with containers, and it would be nice to have the option of adding another node someday if I need to.
I wouldn’t consider any form of cluster manager and scheduler until I was well past 10 nodes; the operational overhead of managing such a setup will far exceed any benefits you get.
For a single node you can get away with managing it entirely by hand (document your setup process), or use something like Ansible so it’s more easily reproducible. If you plan on growing, it’s good to make sure your system is compatible with the direction you plan to take your infrastructure, but it’s rarely wise to adopt that infrastructure from day one.
I’m not sure that ‘virtualization’ is a good tag for this, but there is no ‘orchestration’, ‘cluster’ or similar.
The “distributed” tag seems like an obvious choice.
Has anyone worked with RancherOS? I hear good things.