1. 8

  2. 2

    Short article, but decent substance and useful links. I’d heard of Storm, but not Spark, and it looks like it might be worthwhile for something I’m working on.

    1. [Comment removed by author]

      1. 4

        Spark is implemented in Scala, but you don’t need to know Scala to use it. Right now, you can write Spark jobs in Scala, Java, or Python (source: https://spark.incubator.apache.org/). I assume more languages will be supported later.

        1. 1

          The interesting part to me is how little complexity it takes on. It doesn’t deal with data replication; it lets HDFS do that. It doesn’t replicate anything during calculations either; if a server fails, it just recalculates from the last good checkpoint. It seems to add a minimal amount of complexity on top of existing technologies. I also think Scala is part of that. If you use Scala well, you can significantly reduce the complexity. (Granted, you can also massively complect a simple idea, so there’s that too.) This is probably why we’re seeing a big increase in major Apache projects written in Scala.
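
To make the "recalculate from the last good checkpoint" idea concrete, here is a toy sketch in plain Python. It is not Spark's API and not how Spark is implemented; `run_with_checkpoints`, `checkpoint_every`, and `fail_at` are hypothetical names made up for illustration. It only shows the general shape: save intermediate results every few stages, and on a failure replay the stages after the most recent saved result instead of keeping replicas of in-flight state.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A stage is a pure function from a list of ints to a list of ints.
Stage = Callable[[List[int]], List[int]]


def run_with_checkpoints(
    data: List[int],
    stages: List[Stage],
    checkpoint_every: int,
    fail_at: Optional[int] = None,
) -> Tuple[List[int], List[int]]:
    """Run stages in order, saving a checkpoint every `checkpoint_every` stages.

    If `fail_at` is set, a failure is simulated once, just before that stage
    runs; the in-flight result is discarded and recomputed from the most
    recent checkpoint. Returns (final result, indices of stages executed).
    """
    # Checkpoint 0 is the input itself (hypothetically, durable storage).
    checkpoints: Dict[int, List[int]] = {0: list(data)}
    executed: List[int] = []
    current = list(data)
    failed_once = False
    i = 0
    while i < len(stages):
        if fail_at == i and not failed_once:
            # Simulated server failure: roll back to the last good checkpoint.
            failed_once = True
            last_good = max(checkpoints)
            current = list(checkpoints[last_good])
            i = last_good
            continue
        current = stages[i](current)
        executed.append(i)
        i += 1
        if i % checkpoint_every == 0:
            checkpoints[i] = list(current)
    return current, executed


stages: List[Stage] = [
    lambda xs: [x * 2 for x in xs],
    lambda xs: [x + 1 for x in xs],
    lambda xs: [x * x for x in xs],
    lambda xs: [x - 3 for x in xs],
]

clean_result, clean_log = run_with_checkpoints([1, 2, 3], stages, checkpoint_every=2)
failed_result, failed_log = run_with_checkpoints(
    [1, 2, 3], stages, checkpoint_every=2, fail_at=3
)
```

Both runs end with the same result; the failed run just repeats stage 2, because the last checkpoint was taken after stage 1. The trade-off is extra compute on failure in exchange for not replicating intermediate state.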