1. 9

With today's third victory in this best-of-five match, AlphaGo, the AI from Google's DeepMind, has already proven that it is far better than its opponent Lee Sedol, the best Go player of the last ten years.

It did so in such an overwhelming way, according to the commentators, that there is, in my opinion, only a tiny hope that any human can win a single game against this unstoppable machine.

I think that, for Google, this has proven their point: they have created a new kind of game AI, standing on the shoulders of giants. The last two games will show to what extent they are right.

I preferred to write this in my own words; there is already a lot of media attention that you have probably seen.

  2. 10

    only a tiny hope that any human can win a single game against this unstoppable machine.

    And then Lee Sedol won ;): https://gogameguru.com/alphago-4/

    1. 4

      It wasn’t worth its own post, but here’s an AI researcher (/doomsayer) writing before the fourth match: https://www.facebook.com/yudkowsky/posts/10154018209759228

      1. 1

        I tried to read this page, found it strange and somewhat wrong, looked at the site it points to, and did not find anything more enlightening.

      2. 3

        It feels like Game 4 not only indicates that AlphaGo can be defeated but also that there might be a systematic weakness in it. That would not be exactly surprising, I suppose, since it’s a novel system playing such powerful opponents for the first time. It’ll be interesting to see whether Lee Sedol can pin the weakness down more completely and exploit it again, though it may or may not be something the DeepMind team can fix by improving the structure of the AlphaGo system (BetaGo?).

        Altogether, this has been a fascinating set of matches.

        1. 3

          It’s possible, but I think not likely beyond the very short term. There was an idea in chess of anti-computer tactics, that humans could improve their play versus a system like Deep Blue if they learned its specific strengths and weaknesses, instead of playing it as if it were a human. Some hoped this would lead to the initial Deep Blue win being more of a fluke, taking advantage of poorly prepared humans who simply were unfamiliar with computer-style play, but would in time adapt and learn to counter it. I think there is some plausibility in the idea that humans' best play against a computer may not be identical to their best play against humans. But in the case of chess, any possibilities there were swamped by just the rapid increase in computing power. Anti-computer tactics can only plausibly help you on the margins, not once you’re hugely outclassed by a system that has all but solved the game.

          Go is harder computationally, but it’s still a finite game on a smallish discrete board, with well-specified formal rules, and a very well-defined goal. All things that are ideal terrain for a computer. So I think it’s only a matter of time before Go-playing systems improve to the point that they have a very good approximation of the entire combinatorial space, good enough that they’re impossible for humans to beat using any tactics. If I were looking for areas where humans aren’t likely to be outclassed in the medium term, I wouldn’t pin my hopes on trying to beat computers in understanding formally specified combinatorial spaces—that’s their turf.
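
          For a rough sense of the numbers (my own back-of-the-envelope, not something from the match coverage): each of the 361 points can be empty, black, or white, so 3^361 is a loose upper bound on board configurations.

          ```python
          # Back-of-the-envelope only: each of the 361 points is empty, black,
          # or white, so 3**361 loosely bounds the number of board
          # configurations (most of which are not legal positions).
          print(f"3^361 ~= {float(3 ** 361):.2e}")  # ~1.74e+172
          ```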

        2. 2

          One interesting detail about how AlphaGo works: each network evaluation sees the board position afresh, without reusing the results of previous analysis.

          Lee Sedol, or the folks commenting on the game, obviously don’t do that–if a group of stones on some side of the board is “settled”, they don’t spend much effort on thinking about how a move elsewhere affects them. Many hours and moves into the game they’ve probably developed some sense of how the current state of the board “works” and can understand one new move much quicker than they could understand an unfamiliar board.

          AlphaGo looking at each position afresh may occasionally lead it to notice an interaction that humans wouldn’t (or put the other way, sometimes humans may think we already understand some part of the board when we don’t), but on the whole, “throwing away” previous analysis work feels like it must not be the theoretically ideal way. The value network is looking over and over at a part of the board where nothing’s changed, for example.
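
          To make the contrast concrete, here’s a hypothetical Python sketch (my own illustration; `value_net` is just a stand-in, not AlphaGo’s code): even caching evaluations by exact position only helps when the whole board repeats, and shares no work between two closely related positions.

          ```python
          from functools import lru_cache

          def value_net(board):
              """Stand-in for an expensive value-network evaluation of a position."""
              return (hash(board) % 1000) / 1000.0  # placeholder "win probability"

          @lru_cache(maxsize=100_000)
          def cached_value(board):
              # Only an *identical* position is free on re-evaluation; a position
              # where one quiet corner is unchanged is still computed from scratch.
              return value_net(board)

          board_a = tuple("." * 361)        # empty 19x19 board, as a hashable tuple
          board_b = ("B",) + (".",) * 360   # differs from board_a by one stone

          cached_value(board_a)  # computed
          cached_value(board_a)  # cache hit: exact same position
          cached_value(board_b)  # computed again; nothing is shared with board_a
          ```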

          Recurrent neural networks can remember their previous inputs, and are used when an NN needs to respond to a stream of inputs (bytes of text, pen movements for handwriting recognition, etc.). You can imagine AlphaGo’s “fast rollout” network being replaced with an RNN that passes some state (estimated white/black strength in various chunks of the board?) between when it outputs one move and starts to look for the next. How you train those, how best to maintain state in a search process that jumps back and forth between positions, and other hard questions are for the experts. But “memory” seems like one general direction you could explore for improving computer Go with NNs.
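
          As a very rough sketch of that idea (plain NumPy with untrained random weights; the feature encoding and sizes are assumptions for illustration, not AlphaGo’s architecture), a recurrent cell carries a hidden state from one evaluation to the next instead of starting from nothing each time:

          ```python
          import numpy as np

          rng = np.random.default_rng(0)
          FEATURES, HIDDEN = 361, 128  # e.g. one feature per board point (assumption)

          W_in  = rng.normal(scale=0.01, size=(HIDDEN, FEATURES))
          W_h   = rng.normal(scale=0.01, size=(HIDDEN, HIDDEN))
          W_out = rng.normal(scale=0.01, size=(1, HIDDEN))

          def recurrent_eval(board_features, hidden):
              """One step: combine the current position with the hidden state
              carried over from previous evaluations (the 'memory')."""
              hidden = np.tanh(W_in @ board_features + W_h @ hidden)
              value = (W_out @ hidden).item()  # an evaluation of this position
              return value, hidden

          hidden = np.zeros(HIDDEN)
          for _ in range(3):  # successive positions in a game or search
              board_features = rng.random(FEATURES)  # stand-in for an encoded position
              value, hidden = recurrent_eval(board_features, hidden)
          ```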

          (There are some memory-related things already in AlphaGo. The policy networks do have inputs to help them focus on moves near the previous two, and it probably remembers evaluations of some exact board positions across moves. But in the AlphaGo paper I don’t see any mention of using “learnings” from one neural-network evaluation to inform the evaluation on a related position, and that’s what I’m thinking about here.)

          1. 1

            (Less speculatively, neural network evaluations will get much cheaper; folks are playing with 8-bit NNs and wacky approximate-math ideas, and in any case, if enough folks really use networks, that alone will lead to cheaper hardware and refined software and techniques. If nothing else, you can expect AlphaGo-like feats to become feasible for folks other than Google.)
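
            A rough sketch of the 8-bit idea, generic post-training quantization rather than any particular library’s method: store weights as int8 with a per-tensor scale and dequantize on the fly, trading a little accuracy for a 4x smaller footprint.

            ```python
            import numpy as np

            def quantize_int8(weights):
                """Map float weights onto int8 with a single per-tensor scale."""
                scale = max(float(np.abs(weights).max()) / 127.0, 1e-12)  # avoid div-by-zero
                q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
                return q, scale

            def dequantize(q, scale):
                return q.astype(np.float32) * scale

            w = np.random.randn(256, 256).astype(np.float32)  # toy weight matrix
            q, scale = quantize_int8(w)

            print("bytes: float32 =", w.nbytes, " int8 =", q.nbytes)  # 4x smaller
            print("max abs error:", float(np.abs(w - dequantize(q, scale)).max()))
            ```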