    Short article, but decent substance and useful links. I’d heard of Storm, but not Spark, and it looks like it might be worthwhile for something I’m working on.

        Spark is implemented in Scala, but you don’t need to know Scala to use it. Right now, you can write Spark jobs in Scala, Java, or Python (source: https://spark.incubator.apache.org/). I assume more languages will be supported later.

          What interests me most is how little complexity it has. It doesn’t deal with data replication; it lets HDFS do that. It doesn’t deal with replication during calculations either; if a server fails, it just recalculates from the last good checkpoint. It seems to be a thin layer on top of existing technologies. I also think Scala is part of that. If you use Scala well, you can significantly reduce complexity. (Granted, you can also massively complect a simple idea, so there’s that too.) This is probably why we’re seeing a big increase in major Apache projects written in Scala.
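
To make the "recalculate from the last good checkpoint" idea concrete, here is a toy sketch in plain Python. It is not Spark's API and not how Spark is implemented; `run_with_checkpoints`, `fail_at`, and `checkpoint_every` are hypothetical names invented for illustration, and the "failure" is simulated. The point is only that if intermediate state is saved now and then, a lost step can be redone by replaying the steps since the last save instead of keeping live replicas.

```python
from typing import Callable, Dict, List, Optional, Tuple

Step = Callable[[List[int]], List[int]]


def run_with_checkpoints(
    data: List[int],
    steps: List[Step],
    checkpoint_every: int,
    fail_at: Optional[int] = None,  # hypothetical: step index where we simulate one crash
) -> Tuple[List[int], int]:
    """Apply steps in order; on a simulated failure, resume from the last checkpoint.

    Returns the final data and how many steps had to be recomputed.
    """
    checkpoints: Dict[int, List[int]] = {0: list(data)}  # steps completed -> saved state
    current = list(data)
    recomputed = 0
    failed_once = False
    i = 0
    while i < len(steps):
        if fail_at == i and not failed_once:
            # Simulated server loss: in-memory state is gone, so fall back
            # to the most recent saved state and replay from there.
            failed_once = True
            last = max(checkpoints)
            current = list(checkpoints[last])
            recomputed += i - last
            i = last
            continue
        current = steps[i](current)
        i += 1
        if i % checkpoint_every == 0:
            checkpoints[i] = list(current)
    return current, recomputed


def add_one(xs: List[int]) -> List[int]:
    return [x + 1 for x in xs]


def double(xs: List[int]) -> List[int]:
    return [x * 2 for x in xs]


pipeline = [add_one, double, add_one, double]

clean_result, clean_recomputed = run_with_checkpoints([1, 2, 3], pipeline, checkpoint_every=2)
result, recomputed = run_with_checkpoints([1, 2, 3], pipeline, checkpoint_every=2, fail_at=3)
print(result, recomputed)
```

The run with a simulated failure ends up with the same result as the clean run; it just pays for replaying the steps since the last checkpoint. That trade (recompute on failure rather than replicate all the time) is the simplicity I was getting at.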