1. 6
  2. 1

    Side question: does anyone know of something like this but for handwriting recognition? (from recorded stylus paths, not OCR)

    1. 1

      The only RNN or LSTM approach I’ve heard about is from the work on Google Handwriting Input for Android: https://arxiv.org/abs/1902.10525

      I don’t imagine you’ll find much in the way of pre-trained models since the main use-case of taking the directed graph approach would presumably be to improve the real-time accuracy of the same app that’s providing the training data.

      1. 1

        By directed graph approach you mean stylus paths? I’m absolutely green in ML and such stuff, sorry… I’m basically somewhat interested in writing open-source apps with handwriting recognition for stylus-based tablets. Auto-adjusting / learning to better recognize a specific person seems like a very desirable feature to me, but I’d think I still need something pre-trained on some general corpus, no? My understanding is that one can’t just teach an RNN by asking the user to write out the alphabet a few times? Or can I? Though even if I can, that would be enough of a chore that I could barely call it user-friendly…

        1. 1

          Ah, I took your question to mean you would consider an “O” drawn clockwise to be different from the same “O” drawn counter-clockwise, i.e. there is temporal ordering to the character data in the stylus path itself.

          If you’re just talking about recognizing a finished character, that’s no different than any other OCR problem; you can apply all of the same basic techniques you’d see in any tutorial with e.g. the MNIST data set. I haven’t sought out any pre-trained models or sets of labeled data for the full English alphabet, but I do expect they’re out there.

          Either way, the MNIST data set is for sure the place to start learning about handwriting recognition; it has long been the standard starting point on this subject.

          1. 1

            You took it right initially :) it’s just that I’m also confused… Maybe let’s state it this way: firstly, AFAIU the “directed graph approach” brings intrinsically more information, so it should be easier to do than pure OCR, no? Then, if I wanted to write an app for this, as an absolute newb, how would you advise me to start? If that’s not too huge of a question :)

            1. 2

              Have a look at the $-family of recognizers, starting from the $1 unistroke recognizer. The others build from there. They’re all designed to be simple and easy to implement with no ML required.
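
To give a feel for how little code that takes, here is a rough Python sketch in the spirit of the $1 unistroke recognizer: resample the stroke, rotate it to a canonical angle, scale it to a reference square, then pick the template with the smallest average point-to-point distance. It is a simplification, not the published algorithm: it skips the golden-section search over candidate rotations, and the names (`resample`, `normalize`, `recognize`, `circle`) and constants are illustrative assumptions. Because points are compared in drawing order, it also tells a clockwise “O” from a counter-clockwise one.

```python
import math

N = 64        # points per resampled stroke (illustrative default)
SIZE = 250.0  # side of the reference square (illustrative default)

def resample(pts, n=N):
    """Resample a stroke to n points spaced evenly along its path."""
    pts = list(pts)
    total = sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))
    step = total / (n - 1)
    out, acc, i = [pts[0]], 0.0, 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if d > 0 and acc + d >= step:
            t = (step - acc) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            out.append(q)
            pts.insert(i, q)  # q starts the next segment
            acc = 0.0
        else:
            acc += d
        i += 1
    while len(out) < n:       # rounding can leave the last point out
        out.append(pts[-1])
    return out[:n]

def normalize(stroke):
    """Resample, rotate to a canonical angle, and scale to a square."""
    pts = resample(stroke)
    cx = sum(x for x, _ in pts) / len(pts)
    cy = sum(y for _, y in pts) / len(pts)
    # Rotate about the centroid so the first point lies at angle zero.
    theta = math.atan2(pts[0][1] - cy, pts[0][0] - cx)
    c, s = math.cos(-theta), math.sin(-theta)
    pts = [((x - cx) * c - (y - cy) * s, (x - cx) * s + (y - cy) * c)
           for x, y in pts]
    # Scale non-uniformly to the reference square; the centroid stays at the origin.
    xs, ys = [x for x, _ in pts], [y for _, y in pts]
    w = (max(xs) - min(xs)) or 1.0
    h = (max(ys) - min(ys)) or 1.0
    return [(x * SIZE / w, y * SIZE / h) for x, y in pts]

def recognize(stroke, templates):
    """Return the name of the template closest to the stroke."""
    cand = normalize(stroke)
    def avg_dist(name):
        return sum(math.dist(a, b) for a, b in zip(cand, templates[name])) / N
    return min(templates, key=avg_dist)

def circle(cx, cy, r, start=0.0, clockwise=False, segments=32):
    """Hypothetical helper: a closed circular stroke (y axis pointing up)."""
    sign = -1.0 if clockwise else 1.0
    return [(cx + r * math.cos(start + sign * 2 * math.pi * k / segments),
             cy + r * math.sin(start + sign * 2 * math.pi * k / segments))
            for k in range(segments + 1)]

# One recorded example per gesture is enough to get started.
templates = {
    "o_ccw": normalize(circle(0, 0, 1)),
    "o_cw": normalize(circle(0, 0, 1, clockwise=True)),
    "v": normalize([(0, 0), (50, 100), (100, 0)]),
}

# A larger circle, drawn elsewhere and started at the top, still matches.
print(recognize(circle(100, 100, 40, start=math.pi / 2), templates))
```

Adding a gesture is just one more entry in `templates`, which is also an easy way to let users record their own variants instead of training a model.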