This could be one of the best presentations ever made.
The startup time for a JRuby application comes down to four things: the startup time of the JVM, the loading of the JRuby runtime, the loading of libraries, and finally the loading of the application code. The JVM is actually not that slow to start, and JRuby starts executing Ruby code pretty quickly, but due to a lot of factors it is nowhere near as fast as MRI at loading and compiling Ruby code.
The overwhelming majority of JRuby startup time is unfortunately spent loading the JRuby core, the core libraries, the standard library, and your application code. Looking for files on disk and sometimes in JAR files, reading hundreds if not thousands of files, handling lots of exceptions that libraries generate during loading (try running with -Xlog.exceptions=true -Xlog.backtraces=true to see them all): it’s a lot of work.
Is it worth it? It’s definitely a pain waiting those seconds for the tests to run (though it’s not two minutes anymore), and I have still not found anything that helps. Nailgun, Drip, and all the other preloading tools I’ve seen over the years (I don’t remember them all) either don’t work very well or don’t matter, because loading all that code is what takes the majority of the time.
I tried running the unit tests for an application (not Rails, just plain JRuby) and they run in 25s (RSpec reports that “files took 10s to load”, but I’m not sure how it measures that or exactly what it means). I ran with -Xdebug.loadService.timing=true and I start getting output in one or two seconds, so the JVM and the JRuby runtime have started executing by then; that part is not slow at least.
I can see that it takes 15s from when JRuby starts running Ruby code until the first test has run. First it’s the JRuby core (jruby.rb, ~300ms), then Ruby core (jruby/kernel.rb, ~100ms), then RubyGems (~500ms). Then it starts loading the libraries my app depends on, and there are a lot of things in the standard library that take on the order of 500ms each to load.
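If you want a rough per-library breakdown without the JRuby-specific flag, something like this sketch should work on any Ruby. The library names are placeholders I picked for illustration, not the ones my app loads, and a library that is already loaded will report close to zero.

```ruby
require "benchmark"

# Time each require and return { library name => seconds }.
# Libraries that are already loaded return almost immediately.
def time_requires(libraries)
  libraries.each_with_object({}) do |library, timings|
    timings[library] = Benchmark.realtime { require library }
  end
end

# Placeholder library names, picked only for illustration.
timings = time_requires(%w[json time fileutils])

# Print the slowest loads first.
timings.sort_by { |_, seconds| -seconds }.each do |library, seconds|
  puts format("%-10s %7.1f ms", library, seconds * 1000)
end
```

It only measures wall-clock time around each require, so nested requires are counted as part of whichever library pulled them in.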
time reports 40s of real time, which sounds about right: 15s for loading Ruby, 25s to run the tests.
Running just ruby -e 'puts 1' takes less than two seconds, and could be a lot faster if RubyGems weren’t always loaded (try it yourself: time ruby -Xdebug.loadService.timing=true -e 'puts 1').
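To put a number on the RubyGems part, here is a minimal sketch that times a trivial script a few times, with and without --disable-gems (the standard interpreter flag for skipping RubyGems). The helper name and the run count are my own choices, and the numbers will vary a lot between machines and between JRuby and MRI.

```ruby
require "benchmark"
require "rbconfig"

# Start the given interpreter `runs` times on a trivial script and
# return the wall-clock seconds of each run. Extra interpreter flags
# (for example "--disable-gems") can be passed after the interpreter.
def time_startup(interpreter, *flags, runs: 3)
  Array.new(runs) do
    Benchmark.realtime do
      system(interpreter, *flags, "-e", "puts 1", out: File::NULL)
    end
  end
end

# RbConfig.ruby is the path of the interpreter running this script.
with_gems    = time_startup(RbConfig.ruby)
without_gems = time_startup(RbConfig.ruby, "--disable-gems")

puts format("with RubyGems:    best of %d = %.2fs", with_gems.size, with_gems.min)
puts format("without RubyGems: best of %d = %.2fs", without_gems.size, without_gems.min)
```

Taking the best of a few runs smooths out disk cache effects on the first start.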
Having a runtime without a GIL, being able to load any Java library as if it were a Ruby library, and the performance make it a very appealing platform for me, but YMMV.
It’s really too bad that it’s only got two zones. At least that means Paris is going to have three so that there are two EU regions with three zones (there was an AWS blog post a couple of weeks ago that mentioned the total number of zones in EU and there were five in total unaccounted for).
This should probably include “(2008)” in the title, it’s old enough that “the old way” has been new and then old again, again.
Perhaps you have already done this but you can suggest titles for the story. Here is a direct link.
It’s funny when RDBMS proponents bash NoSQL for not being “durable” or not “being ACID”, when in fact most RDBMSs, including Postgres, aren’t either.
I found this presentation to be mostly FUD and misunderstandings about what the things they talked about mean. I’m sure the Postgres things are correct, but the first half of the presentation was mostly misinformation about other ways of storing data. It’s easy to be consistent when there’s only one copy of your data – but to say that something that stores only a single copy of the data is “durable” when one that replicates it is not, then what do you even mean by “durable”?
I’m sure Postgres' HStore and JSON columns are very useful if you’re already using Postgres, and for things where a relational model is useful (on the fly aggregations of small data sets, for example), and the Postgres replication story seems to have gotten better lately. I just find it silly and unhelpful to bash NoSQL for things like “not being ACID”, when you include a slide like #24 and say “by the way, if you need better performance, you can turn off that pesky durability, just like those fast NoSQL systems”. Why do you think those NoSQL systems exist in the first place?
to say that something that stores only a single copy of the data is “durable” when one that replicates it is not, then what do you even mean by “durable”?
That’s easily answerable by looking up what durability means in the context of ACID. It means that, once a commit succeeds, the changes are not lost in case of system-level outage.
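As a rough illustration of that definition on a single machine (a sketch, not how Postgres or any particular database implements it): a write only counts as committed once it has been forced to stable storage, which is what fsync is for. The helper name and the log file name below are made up for the example.

```ruby
require "tmpdir"

# Append a record to a log file and return only after the data has been
# flushed and fsynced. A caller that sees `true` can treat the record as
# committed in the ACID sense on this one machine.
def durable_append(path, record)
  File.open(path, "a") do |log|
    log.write(record + "\n")
    log.fsync # flushes Ruby's buffer, then asks the OS to sync to disk
  end
  true
end

# Hypothetical log location, chosen only for the example.
log_path = File.join(Dir.tmpdir, "durability-sketch.log")
puts "committed" if durable_append(log_path, "balance=100")
```

Skipping the fsync call is roughly the "turn off durability for speed" trade-off: the write still usually lands on disk, but a crash right after the acknowledgement can lose it.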
This is actually really cool. I’d love to see a proof of concept, maybe forking and changing libketama to see how it works with memcached?
Guava has a Java implementation of the algorithm, that’s how I found the paper. Someone on the Guava mailing list asked about the code and if there was any background on the algorithm.
Great point! I’m using 0x1F as field separator all the time when building cache keys or similar things, it’s much better than using , - / or other printables since that’s bound to break sooner or later.
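For what it's worth, this is roughly what that looks like, as a minimal sketch. The helper names are my own, and rejecting fields that already contain 0x1F is an extra safety check I added, not something anyone above described.

```ruby
# ASCII 0x1F is the "unit separator" control character.
UNIT_SEPARATOR = "\x1F"

# Join the fields into one cache key. Fields may contain commas, dashes,
# slashes and other printable characters without any escaping.
def build_cache_key(*fields)
  parts = fields.map(&:to_s)
  if parts.any? { |part| part.include?(UNIT_SEPARATOR) }
    raise ArgumentError, "field contains the separator byte 0x1F"
  end
  parts.join(UNIT_SEPARATOR)
end

# Split a key back into its fields; the -1 limit keeps trailing empty fields.
def split_cache_key(key)
  key.split(UNIT_SEPARATOR, -1)
end

key = build_cache_key("user", 42, "2014-05-01,eu-west-1/a-b")
p split_cache_key(key)
```

The fields come back as strings, so the 42 above round-trips as "42".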
There’s a few screenshots in the repositories.
Those look very much like SublimeText with the Soda and amCoder themes.