This is considered to be the bible of Deep Learning, authored by Ian Goodfellow, Yoshua Bengio and Aaron Courville.
In our reading group, we went through most of the book. What is likeable is that it covers most of the techniques and theory pretty much up to the state of the art. What we disliked is that it is quite badly organized: some chapters lack a logical progression. It also tends to dwell on trivial things while going over more complex and important material very quickly (the sections on back-propagation are a good example of suboptimal explanations).
In the end, the book is a bit of a mixed bag. It’s great for machine learning practitioners - it treats most of the topics in NNs - but it’s not something that I’d recommend for an undergraduate or even a graduate course. I am still keeping my hopes up for Neural Networks and Deep Learning.
Note that we read a draft from 4-6 months ago. Our experience may not be completely relevant anymore.
Thanks for a nice review. So, what do you recommend to someone who has an understanding of Neural Nets (through Udacity’s ML Engg. nano-degree) for getting into Deep Learning?
Books/(MOOC)Courses/any resources welcome.
I think it depends on your goal. I can only reason from the perspective of someone who uses machine learning to contribute to another field (computational linguistics), not as someone who is looking to contribute to the state of the art of machine learning.
If your goal is similar (application of NNs), it’s very doable to use neural nets without diving head-first into theory. Some packages allow you to get started quickly, but still offer the possibility to build more complex graphs when you need to. Keras, in particular, has this property: simple networks are easy, and complex networks can be done without much pain. I also found that Keras' source code is easy to read if you want a deeper understanding of the implementation. At some point interesting theory questions will definitely bubble up (why pick a ReLU over a hyperbolic tangent activation function? what is the difference between a GRU unit and an LSTM unit? what is this unrolling business?). That’s a good point to read the relevant sections in Goodfellow’s book and/or look at the original literature.
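To give a flavor of the ReLU-vs-tanh question mentioned above, here is a minimal pure-Python sketch (the function names are my own, not from any library): tanh saturates for large inputs, so its gradient vanishes, while ReLU keeps a constant gradient of 1 for any positive input - one common argument for preferring ReLU in deep networks.

```python
import math

def tanh_grad(x):
    # The derivative of tanh(x) is 1 - tanh(x)^2,
    # which shrinks toward 0 as |x| grows (saturation).
    return 1.0 - math.tanh(x) ** 2

def relu_grad(x):
    # The derivative of max(0, x) is 1 for x > 0, else 0,
    # so positive activations never saturate.
    return 1.0 if x > 0 else 0.0

for x in [0.5, 2.0, 5.0]:
    print(f"x={x}: tanh grad={tanh_grad(x):.4f}, relu grad={relu_grad(x):.1f}")
```

Running this shows the tanh gradient collapsing toward zero as x grows while the ReLU gradient stays at 1 - the "vanishing gradient" issue that the book's chapters on optimization discuss in more depth.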
There is also a really good resource on ML and deep learning: “A Course in Machine Learning” by Hal Daumé III. Some of its chapters are still works in progress, though.