Presenter here! I’m happy to answer any questions about Mesos that weren’t covered in the video.
Thanks, this was a really cool talk! It clarified a lot of stuff about Mesos for me.
The Mesos paper (in section 3.6) says:
… to deal with scheduler failures, Mesos allows a
framework to register multiple schedulers such that when
one fails, another one is notified by the Mesos master to
take over. Frameworks must use their own mechanisms
to share state between their schedulers.
I’m very curious about this, but I’ve never been able to find any documentation / more info on it. Any chance you have any links to more info / know where I should be looking?
Sure! I don’t think the feature exists with the exact API you’ve described, although that’s possible for Mesos 1.0 based on discussions I’ve had with Mesos core committers.
Instead, here’s what you’d do: first, use a leader election system, like Curator’s LeaderSelector + ZooKeeper, etcd, or Hazelcast. Once a leader is elected, it can create its Scheduler instance, whose FrameworkInfo must have an existing FrameworkID and failover_timeout set. Then, when it starts, it’ll forcibly take over for the given FrameworkID.
In our system, we use Curator to do leader election and store the FrameworkID. First, we check if the FrameworkID has been set at a predetermined ZooKeeper path. If it has, we reuse it. If not, you’ll be assigned a new FrameworkID on startup (and then you can store it for subsequent runs). It may also be possible to simply choose a FrameworkID directly rather than letting it be generated and stored at the ZooKeeper path; I haven’t tried.
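To make the flow concrete, here’s a rough Python sketch of the FrameworkID bookkeeping. It doesn’t touch the real Mesos or Curator APIs: InMemoryStore stands in for ZooKeeper, FakeMaster stands in for the Mesos master, FrameworkInfo is a simplified dataclass rather than the real protobuf message, and FRAMEWORK_ID_PATH and start_as_leader are names I made up for illustration. Leader election itself is left out; assume start_as_leader only runs in the process that just won it.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical path; any agreed-upon location in the store works.
FRAMEWORK_ID_PATH = "/myframework/framework-id"


class InMemoryStore:
    """Stand-in for ZooKeeper: maps paths to string values."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, path: str) -> Optional[str]:
        return self._data.get(path)

    def set(self, path: str, value: str) -> None:
        self._data[path] = value


@dataclass
class FrameworkInfo:
    """Simplified stand-in for the Mesos FrameworkInfo message."""
    name: str
    failover_timeout: float  # seconds the master waits for a replacement
    id: Optional[str] = None  # None means "please assign me a new ID"


class FakeMaster:
    """Stand-in for the master: tracks which scheduler owns each ID."""

    def __init__(self) -> None:
        self.active: Dict[str, str] = {}

    def register(self, info: FrameworkInfo, scheduler_name: str) -> str:
        # A missing ID gets a fresh one; an existing ID is taken over.
        framework_id = info.id or str(uuid.uuid4())
        self.active[framework_id] = scheduler_name
        return framework_id


def start_as_leader(store: InMemoryStore, master: FakeMaster,
                    scheduler_name: str) -> str:
    """Run by whichever process has just won the leader election."""
    stored_id = store.get(FRAMEWORK_ID_PATH)
    info = FrameworkInfo(name="myframework", failover_timeout=3600.0,
                         id=stored_id)
    framework_id = master.register(info, scheduler_name)
    if stored_id is None:
        # First run ever: remember the assigned ID for future leaders.
        store.set(FRAMEWORK_ID_PATH, framework_id)
    return framework_id
```

If you call start_as_leader twice against the same store (say, once for "scheduler-a" and again for "scheduler-b" after a failover), both calls return the same FrameworkID and the second scheduler ends up as the owner. That is the takeover behavior described above, in miniature.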
interesting, thanks for the clarification!
(sidenote, I wish this sort of thing were more clearly documented. In general the one gripe I’ve had trying to learn about Mesos is that documentation is hard to find)
In the Q&A you address how persistent offers will allow durable enough storage for backing systems like HDFS that can handle their own placement & recovery from lost nodes. Is there a persistent storage solution for running a database such as Postgres inside Mesos? I checked out the ticket in JIRA but it sounds like that work is specific to HDFS/Riak style requirements.
That ticket will work equally well for any persistent database on Mesos, be it Riak or Postgres. The trick is that the ticket is just providing a primitive: the ability to stay on the same machine over crashes and restarts.
Once this exists, then we’ll be able to write a framework that handles issues like helping clients to discover where Postgres is running, to automatically configure replication and read slaves, and to automatically migrate databases between hosts.
I see, makes sense. So for apps that don’t have the ability to handle their own replication the storage issues should be solved with a different tool? I’ve looked but it doesn’t look like there are any widely accepted options. Ceph RBD looks like it might do the job but I haven’t read much about it being used in production.
That’s correct. Although, you probably shouldn’t be using distributed data stores without a real replication model, lest you discover the problems with replica divergence and availability the hard way.