I’ve been working on an article about AlphaStar for a while (cough, very late), but after last week’s events I decided to sit down and write about OpenAI Five’s success instead.
There are a few areas I wish I had more knowledge to expand on:
- I’m pretty hazy on how OpenAI’s Rapid system really works (there isn’t much public information beyond the PR articles)
- Policy gradients and their implications for self-play
- Pros and cons of PPO for Dota 2. I’d also like to know what else OpenAI tried that didn’t work. It was pretty evident that scaled hardware covered for model inefficiencies.
- Decision tree of Starcraft 2 vs Dota 2. Relative to each game, OpenAI has solved more of Dota 2’s action space. However, Starcraft 2 seems to have a larger decision tree?
- OpenAI mentioned “surgery” in the context of transferring learned weights across Dota 2 patches, but there isn’t much information out there (yet?).
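
For reference, since PPO comes up above: the core of PPO is the clipped surrogate objective from the original paper, which limits how far each update can move the policy. This is the standard formulation, not anything Dota-specific:

```latex
L^{CLIP}(\theta) = \hat{\mathbb{E}}_t\left[\min\left(r_t(\theta)\hat{A}_t,\ \mathrm{clip}\left(r_t(\theta),\ 1-\epsilon,\ 1+\epsilon\right)\hat{A}_t\right)\right],
\quad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}
```

Here $\hat{A}_t$ is the advantage estimate and $\epsilon$ is the clip range; the clipping removes the incentive to push the probability ratio $r_t(\theta)$ outside $[1-\epsilon, 1+\epsilon]$, which is part of why it scales well with the massive batch sizes OpenAI used.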
Very open to feedback and suggestions, thanks!