Nice writeup. If I’m understanding this correctly, the network is purely a discriminative model that gives a short-horizon evaluation of the current position (tactics), and the proposed moves come from MCTS.
I wonder if there’s been an attempt to use this discriminator with a learned generative model, for example a VAE or a GAN.