I can, from personal experience, confirm a number of the scenarios laid out in this post, as well as at least one other that isn't covered.
We alleviated our problems by reducing long GC pauses and moving to dedicated master nodes. I'd advise anyone using ES to take those steps at a minimum.
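For anyone wanting to follow that advice, a minimal sketch of dedicated master nodes in Elasticsearch 1.x looks like the following; the node counts and role split here are my own illustrative assumptions, not the commenter's actual topology:

```shell
# Master-eligible nodes hold cluster state but no data.
# These settings normally live in elasticsearch.yml
# (node.master / node.data); ES 1.x also accepts them
# as -Des. system properties on the command line:
bin/elasticsearch -Des.node.master=true -Des.node.data=false

# Data nodes are the inverse: they index and serve queries,
# but never take on master duties (and so never stall cluster
# coordination during a long GC pause):
bin/elasticsearch -Des.node.master=false -Des.node.data=true
```

The point of the split is that a GC pause on a busy data node no longer freezes the node that is coordinating the cluster.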
How does one reduce long GC pauses? Was it a question of GC tuning, heap size adjustments, or splitting the workload across multiple VMs?
We have a lot of work to do on this. With dedicated master nodes, long GC pauses are now only a performance issue.
Currently we are working on creating less garbage, as that is the ideal way to deal with GC issues.
What we did do was lower the CMS initiation threshold from 75% to 50%, which in our case kicks in 2 GB earlier, as we currently run 8 GB heaps on our nodes. We are also looking at lowering the heap size and moving to G1, which we have had great success with elsewhere.
Additionally, we greatly lowered some internal ES cache sizes. We noticed no overall performance impact and got a large win in GC pause times.
We now see few stop-the-world compactions, and long young-gen GC pauses are much rarer. We could tune the GC further, but we are spending our time figuring out how to change our indexes and queries to generate less garbage, since we know we have some inefficiencies we can address there.
In general on the JVM I try to take the following steps to deal with GC issues:
For me, as I work down that list, each step usually becomes more and more work.
I’d advise anyone interested in the subject to get
Java Performance by Charlie Hunt (http://www.amazon.com/Java-Performance-Charlie-Hunt/dp/0137142528)
and check out some of these videos:
The JVM can give you great reporting on GC events. If the following options are unfamiliar to you and you are interested in GC tuning, then definitely get Java Performance as it covers these and more in depth (plus how to read the resulting files):
-XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xloggc:***** -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=1M -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -XX:+PrintSafepointStatistics
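Put together in a launch command, those options look something like this; the log path and jar name are placeholders I've made up, not values from the comment:

```shell
# Log every GC event with timestamps and per-phase detail, rotate the
# log across 5 x 1 MB files, and record how long the application was
# actually stopped (including safepoint time, not just collection time).
java -Xms8g -Xmx8g \
  -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintGCDetails \
  -Xloggc:/var/log/myapp/gc.log \
  -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=1M \
  -XX:+PrintGCApplicationStoppedTime \
  -XX:+PrintGCApplicationConcurrentTime \
  -XX:+PrintSafepointStatistics \
  -jar myapp.jar
```

The `PrintGCApplicationStoppedTime` output is the one to watch first: it captures all pauses, not only those caused by the collector.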
Why couldn’t I get any of this out of you when I was asking for information on Twitter?
Timing. Hadn’t gone boom yet
FWIW, I was recently speaking to some ElasticSearch devs, and resolving the split-brain scenarios in their cluster protocol is apparently a top priority for them. It’s a bit of uncharted territory, as ElasticSearch isn’t really a database but a search index (Lucene) that was used as the basis for a distributed system.
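For what it's worth, the standard mitigation in Elasticsearch 1.x is to require a quorum of master-eligible nodes before a master can be elected, via `discovery.zen.minimum_master_nodes`; the three-node count below is an example I'm assuming, not something from this thread:

```shell
# With 3 master-eligible nodes, a majority quorum is (3 / 2) + 1 = 2,
# so a partitioned minority of 1 node can never elect its own master.
MASTER_ELIGIBLE=3
QUORUM=$(( MASTER_ELIGIBLE / 2 + 1 ))
echo "$QUORUM"   # prints 2

# The setting is dynamically updatable through the cluster settings API:
# curl -XPUT localhost:9200/_cluster/settings -d \
#   '{"persistent": {"discovery.zen.minimum_master_nodes": '"$QUORUM"'}}'
```

This doesn't fix the protocol issues the devs mention, but it closes off the simplest split-brain case where both sides of a partition elect a master.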
FYI, literally the first massive distributed systems in production in the way we talk about them today were search systems. Inktomi, etc. Heck, I’ve seen many distributed production systems using Lucene, too. And CAP came out of search systems, too! So, this isn’t really uncharted territory.