I think, fundamentally, Hadoop really isn’t intended for scientific HPC. It’s designed to do incredibly stupid analysis over huge volumes of data.
It is, very much, built specifically to serve the sort of high-volume, completely uninteresting analysis that non-R&D corporations do. It might be possible to shoehorn it into some scientific problem domains; but I’m pretty sure in general it’s not a good fit.
Hadoop serialization boundaries also impose a phenomenal cost on multistage jobs; not exactly ideal for high-performance computing.