1. 9
  1.  

  2. 1

    In my opinion, this is the state of affairs in the fuzzing world too. No one discusses the non-obvious questions or makes any attempt to find out exactly why a certain approach worked.

    1. 1

      This post especially resonates with me because I know nothing about deep learning but I’d like to. I’ve been searching for a reference sheet that explains “how do I construct a dataset and model to answer my question and why do I make these specific decisions.” I have yet to find good explanations about even simple things like “how many layers do I need to classify an image and why.”

      I have a hypothesis that this has to do in part with the fact that deep learning is still a new and rapidly changing field, and that many of the people with substantial knowledge know better than to waste time documenting rationales that could easily be obsolete in under a year.

      Maybe I’m not looking hard enough, but even many simple Flappy Bird YouTube tutorials link repositories that don’t compile or run anymore because the ecosystem changes so quickly. It’s a relief to hear a similar grievance voiced, but I don’t know what approach would best mitigate this problem.

      1. 1

        Deep learning is a gold rush. It’s a chance to plant a flag, and if you plant a flag first, you have a better ticket in the lottery of getting many citations. That attracts a lot of people and motivates them to push out work as quickly as possible. Papers with higher numbers generally do better than papers with a strong theoretical grounding. The problem is compounded by the fact that, with so many more submissions, conferences have to rely on an increasing number of junior reviewers. When I started in computational linguistics, it was something special if, near the end of your PhD, you were asked to review for one of the top-tier conferences under the supervision of your PhD supervisor. Now first-year PhD students get invited, simply because there are too many submissions.

        That said, there is genuine, well-understood progress. For example in natural language processing (in very broad strokes):

        • Feed-forward neural networks have generally beaten SVMs and logistic regression, largely because they implicitly extract features. Plus, NNs work well with word embeddings, which have improved lexical coverage tremendously.
        • RNNs such as LSTMs have generally beaten FF nets, because they are able to model larger contexts (long-distance dependencies, e.g. consider correctly annotating separable verb particles).
        • Bidirectional RNNs have generally beaten unidirectional RNNs, because it helps to have information from the right context in addition to the left context. (E.g. consider distinguishing copular verbs from auxiliary verbs in many languages.)
        • Transformers have generally beaten bidirectional RNNs: because they can attend to all tokens in a sequence at the same time, it is easier for the network to percolate information over longer distances. Plus, the extensive use of residual connections makes it possible to train deeper networks (although residual connections can of course also be used in stacked RNNs). Practically, even though most commonly used attention mechanisms result in O(n^2) complexity, transformers are faster than RNNs, because n is usually small (especially when processing at the sentence level) and every position can be computed in parallel (see the sketch below).
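
        A minimal sketch of that last point, in NumPy (my own illustration; the function names, variable names, and toy sizes are made up, not taken from the comment): the (n, n) score matrix is where the O(n^2) cost comes from, and because it is a single matrix product rather than a step-by-step recurrence, all positions are computed at once.

        ```python
        import numpy as np

        def softmax(x, axis=-1):
            x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
            e = np.exp(x)
            return e / e.sum(axis=axis, keepdims=True)

        def self_attention(X, Wq, Wk, Wv):
            """Single-head scaled dot-product self-attention over X of shape (n, d_model)."""
            Q, K, V = X @ Wq, X @ Wk, X @ Wv      # project tokens to queries, keys, values
            d_k = Q.shape[-1]
            scores = Q @ K.T / np.sqrt(d_k)       # (n, n): every token scores every other token
            weights = softmax(scores, axis=-1)    # attention distribution per token
            return weights @ V                    # weighted sum of values for all positions at once

        # Toy usage with made-up sizes: n = 5 tokens, d_model = 8, d_k = d_v = 4.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(5, 8))
        Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
        print(self_attention(X, Wq, Wk, Wv).shape)  # (5, 4)
        ```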

        Perhaps people will become more interested in the why when the gold rush is over and accuracy ceilings have been reached.

        1. 1

          Perhaps people will become more interested in the why when the gold rush is over and accuracy ceilings have been reached.

          But then, there will be another ‘gold rush’, for something else.

          Maybe a more general issue is that the current academic environment does not reward skeptics and does not reward checking papers for reproducibility.

          Unless those incentives are organically in place, it is, at least for me, hard to see how the current situation will change.

          1. 2

            Maybe a more general issue is that the current academic environment does not reward skeptics and does not reward checking papers for reproducibility.

            Unless those incentives are organically in place, it is, at least for me, hard to see how the current situation will change.

            It is hard to change these incentives (from with)in academia. As long as funding is allocated based on the number of citations, prominence, etc., it will remain more attractive to push out papers with spectacular numbers obtained through lots of hyperparameter tuning, etc.