AlphaGo: Solving Go with Machine Learning
For the IUT International Days
By Hugo Mougard
On March 3
DeepMind published a paper in Nature on the 27th of January to
introduce AlphaGo.
AlphaGo is the first AI to ever beat a professional human Go player
in an even game, that is, without handicap stones.
Overview
- The Game of Go
- Previous Go AIs
- Convolutional Neural Networks
- AlphaGo
- Conclusion
The Game of Go
A Simple Ruleset
There are 2 important rules:
- stones surrounded by enemy stones are captured
- empty intersections surrounded by your stones are your territory
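The capture rule can be made concrete with a small sketch. It assumes a board stored as a square grid of characters ('.' for empty, 'B' and 'W' for stones), and the names `group_and_liberties` and `is_captured` are hypothetical helpers chosen for this illustration. A group of connected stones is captured when it has no adjacent empty intersection (no "liberty") left.

```python
from typing import List, Set, Tuple

Point = Tuple[int, int]
Board = List[List[str]]  # '.' = empty, 'B' = black stone, 'W' = white stone


def group_and_liberties(board: Board, row: int, col: int) -> Tuple[Set[Point], Set[Point]]:
    """Flood-fill the group containing (row, col); return (stones, liberties)."""
    color = board[row][col]
    size = len(board)  # assumption: the board is square
    stones: Set[Point] = set()
    liberties: Set[Point] = set()
    stack = [(row, col)]
    while stack:
        r, c = stack.pop()
        if (r, c) in stones:
            continue
        stones.add((r, c))
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < size and 0 <= nc < size:
                if board[nr][nc] == ".":
                    liberties.add((nr, nc))      # adjacent empty intersection
                elif board[nr][nc] == color:
                    stack.append((nr, nc))       # same-colored neighbor joins the group
    return stones, liberties


def is_captured(board: Board, row: int, col: int) -> bool:
    """A group with no liberties is surrounded and therefore captured."""
    _, liberties = group_and_liberties(board, row, col)
    return not liberties


board = [list(line) for line in (
    ".B...",
    "BWB..",
    ".B...",
    ".....",
    ".....",
)]
print(is_captured(board, 1, 1))  # the white stone is surrounded on all four sides
print(is_captured(board, 0, 1))  # this black stone still has empty neighbors
```

Territory counting can reuse the same flood fill, started from an empty intersection and checking which color borders the region.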
The Game of Go
Capture
The Game of Go
Territory
The Game of Go
A Complex Game
Despite its very simple rules, Go is very hard to
master:
- Average number of possible moves at each turn: 200
- Average number of moves in a game: 300
- Number of legal positions: roughly the square of the estimated
number of atoms in the observable universe
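The order of magnitude can be checked with a few lines of arithmetic. This sketch assumes a 19x19 board and the commonly quoted rough estimate of 10^80 atoms in the observable universe. It computes only an upper bound: every intersection is empty, black, or white, which also counts illegal positions.

```python
POINTS = 19 * 19                     # intersections on a standard board
upper_bound = 3 ** POINTS            # empty / black / white for each intersection
atoms_squared = (10 ** 80) ** 2      # assumption: about 10^80 atoms in the observable universe

print(len(str(upper_bound)))         # number of decimal digits of the upper bound
print(upper_bound > atoms_squared)   # the bound exceeds the squared atom count
```

The number of legal positions is smaller than this bound, but it stays in the same range as the squared atom count.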
Previous Go AIs
Prerequisite: Game Tree
Previous Go AIs
Objective of a game AI
Explore the game tree efficiently to find the best move.
Previous Go AIs
Min-Max
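As an illustration of Min-Max, here is a minimal sketch on a hand-made game tree. The tree encoding (nested lists with integer scores at the leaves) and the function names are assumptions made for this example, not something taken from the slides.

```python
from typing import List, Union

Tree = Union[int, List["Tree"]]  # a leaf score, or a list of child subtrees


def minimax(node: Tree, maximizing: bool = True) -> int:
    """Return the score reached when both players play perfectly."""
    if isinstance(node, int):          # leaf: the game is over, return its score
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def best_move(root: List[Tree]) -> int:
    """Index of the root move with the best guaranteed outcome for the maximizer."""
    return max(range(len(root)), key=lambda i: minimax(root[i], maximizing=False))


# Three possible moves for us, each followed by two possible replies.
tree = [[3, 5], [2, 9], [0, 7]]
print(minimax(tree))    # value of the position under perfect play
print(best_move(tree))  # index of the move to play
```

The opponent picks the minimum in each subtree (3, 2, and 0), so the first move is the one that guarantees the most.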
Previous Go AIs
Min-Max applicable to Go?
- Average number of possible moves at each turn: 200
- Average number of moves in a game: 300
→ 200^300 (about 10^690) sequences of moves to explore
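The size of this search space, 200 raised to the power 300, can be computed exactly with Python's arbitrary-precision integers. The evaluation speed used below (one billion sequences per second) is an arbitrary assumption, chosen only to show that exhaustive search is hopeless.

```python
branching = 200   # average number of possible moves per turn
depth = 300       # average number of moves per game

sequences = branching ** depth
digits = len(str(sequences))
print(digits)  # number of decimal digits in 200^300

# Assumption: a machine that evaluates 10^9 complete sequences per second.
seconds = sequences // 10 ** 9
years = seconds // (365 * 24 * 3600)
print(len(str(years)))  # the number of years needed still has hundreds of digits
```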
Previous Go AIs
Monte Carlo Tree Search
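An approximation of Min-Max: instead of exploring the whole tree,
estimate the value of each move from many randomly played games.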
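To give an idea of how this works, here is a minimal sketch of Monte Carlo Tree Search in its common UCT variant. Go is far too large for a short example, so the sketch assumes a toy game instead: a pile of stones from which each player takes 1 to 3 in turn, and whoever takes the last stone wins. The `Node` class, the exploration constant 1.4, and the iteration count are all choices made for this illustration.

```python
import math
import random


class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left; the player to move takes 1 to 3
        self.parent = parent
        self.move = move          # move that led to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node


def uct_child(node, c=1.4):
    """Pick the child balancing a high win rate against a low visit count."""
    return max(
        node.children,
        key=lambda ch: ch.wins / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def rollout(stones, rng):
    """Play random moves to the end; True if the player to move at the start wins."""
    turn = 0
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        turn += 1
    return turn % 2 == 1  # an odd move count means the starting player took the last stone


def mcts(stones, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: walk down while the node is fully expanded.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one new child if possible.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new position.
        mover_wins = not rollout(node.stones, rng)
        # 4. Backpropagation: update statistics, flipping the perspective each level.
        while node is not None:
            node.visits += 1
            node.wins += 1.0 if mover_wins else 0.0
            mover_wins = not mover_wins
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move


print(mcts(5))  # number of stones the search suggests taking from a pile of 5
```

In this toy game the winning strategy is to leave the opponent a multiple of 4 stones, so from a pile of 5 the search should settle on taking 1.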
Convolutional Neural Networks
Introduction
Approximators of very complex functions, usually “intuitive” ones.
Convolutional Neural Networks
Application
Mainly image and text understanding.
Convolutional Neural Networks
Architecture
Multiple layers of filters combined together.
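A single filter can be sketched in plain Python. The example below slides a 3x3 filter over a small image and applies a ReLU, which is roughly what one convolutional layer does for each of its filters. The vertical-edge filter is hand-written for illustration, whereas in a real network the filter values are learned. As in most deep learning libraries, the operation is technically a cross-correlation (the filter is not flipped).

```python
from typing import List

Matrix = List[List[float]]


def convolve2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image (no padding) and sum the elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map: Matrix) -> Matrix:
    """Keep positive responses, zero out the rest."""
    return [[max(0, v) for v in row] for row in feature_map]


# A tiny image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
# Hand-written filter that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
print(relu(convolve2d(image, vertical_edge)))  # strong response near the edge, none in the flat area
```

Stacking layers means feeding such feature maps into further filters, so that later layers respond to combinations of simpler patterns.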
Convolutional Neural Networks
Filters
Convolutional Neural Networks
Learned Filters
Convolutional Neural Networks
Application
Convolutional Neural Networks
Relation to Go
Instead of working on pixels, work on intersections.
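One way to picture this is to turn the board into binary "images", one per kind of intersection, much like the color channels of a picture. The sketch below is a simplified assumption with only three planes (own stones, opponent stones, empty). The function name `encode_board` is hypothetical, and the real AlphaGo input uses many more feature planes.

```python
from typing import List

Plane = List[List[int]]


def encode_board(board: List[str], to_move: str) -> List[Plane]:
    """Return three binary planes: stones of the player to move, opponent stones, empty points."""
    opponent = "W" if to_move == "B" else "B"

    def plane(symbol: str) -> Plane:
        return [[1 if cell == symbol else 0 for cell in row] for row in board]

    return [plane(to_move), plane(opponent), plane(".")]


# A 3x3 board for readability; 'B' = black, 'W' = white, '.' = empty.
board = [
    "B.W",
    ".B.",
    "W..",
]
own, opp, empty = encode_board(board, to_move="B")
for name, p in (("own", own), ("opponent", opp), ("empty", empty)):
    print(name, p)
```

Each plane plays the role of one image channel, so the convolutional filters scan intersections exactly as they would scan pixels.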
AlphaGo
Intro by DeepMind
AlphaGo
Core of the approach
Augment Monte Carlo Tree Search with two Convolutional Neural
Networks.
AlphaGo
Policy Network
Predict the next move given the position.
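The output of such a network can be read as a probability for each candidate move. The sketch below assumes the network has already produced one raw score per move (hard-coded here) and shows how a softmax restricted to legal moves turns them into probabilities. The function name `masked_softmax` and the numbers are illustrative only.

```python
import math
from typing import List


def masked_softmax(scores: List[float], legal: List[bool]) -> List[float]:
    """Softmax over the legal moves only; illegal moves get probability 0."""
    top = max(s for s, ok in zip(scores, legal) if ok)   # subtract the max for numerical stability
    exps = [math.exp(s - top) if ok else 0.0 for s, ok in zip(scores, legal)]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for four candidate moves; the last one is illegal.
scores = [2.0, 1.0, 0.5, 3.0]
legal = [True, True, True, False]

probs = masked_softmax(scores, legal)
best = max(range(len(probs)), key=probs.__getitem__)
print(probs)
print(best)  # index of the move the policy considers most likely
```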
AlphaGo
Value Network
Predict the winner given the position.
AlphaGo
Integration in MCTS
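A rough sketch of how the two networks can plug into the tree search: the policy network supplies a prior probability for each move, which steers exploration, and evaluations of the positions reached are averaged into a value for each move. The selection rule below follows the general shape described in the AlphaGo paper (average value plus a bonus proportional to the prior that shrinks with the visit count). The `Edge` class, the constant `c_puct=1.0`, and the numbers are assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass
class Edge:
    prior: float               # probability given to this move by the policy network
    visits: int = 0            # how many simulations went through this move
    total_value: float = 0.0   # sum of the evaluations backed up through this move

    @property
    def q(self) -> float:
        """Average evaluation of the move so far (0 if never visited)."""
        return self.total_value / self.visits if self.visits else 0.0


def select_move(edges: Dict[str, Edge], c_puct: float = 1.0) -> str:
    """Pick the move maximizing average value plus a prior-weighted exploration bonus."""
    total_visits = sum(e.visits for e in edges.values())

    def score(move: str) -> float:
        e = edges[move]
        bonus = c_puct * e.prior * math.sqrt(total_visits) / (1 + e.visits)
        return e.q + bonus

    return max(edges, key=score)


def backup(edge: Edge, value: float) -> None:
    """Record one simulation result, e.g. an evaluation from the value network."""
    edge.visits += 1
    edge.total_value += value


# Hypothetical statistics after 12 simulations from one position.
edges = {
    "A": Edge(prior=0.6, visits=10, total_value=2.0),
    "B": Edge(prior=0.3, visits=2, total_value=1.0),
    "C": Edge(prior=0.1),
}
print(select_move(edges))  # the move to explore in the next simulation
```

With these numbers, move B is selected: its average value is the highest and it has been visited little, so its bonus is still large.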
AlphaGo
Supervised Learning
- 29M positions from 160k KGS games
- 8M positions from Tygem games
AlphaGo
Reinforcement Learning
- Make the policy network play against its previous versions to
create new data to learn from
- Use the new policy network to create 30M positions to learn the
value network
AlphaGo
Hardware
Google scale:
- 1202 CPUs
- 176 GPUs
- 40 search threads
AlphaGo
Performance
Beat Fan Hui, the European Go champion, 5-0.
AlphaGo
Final Boss
Google challenged Lee Sedol, the best player of the last 10 years.
Starting in 6 days!
$1,000,000 prize!
Conclusion
- Yet another milestone reached for AI
- Very general techniques, applicable to many tasks
Thank you very much for your attention 😍
Do you have any questions?