Inaugural Lecture - David Silver: Mastering Go, Chess, Shogi by Self-Play with a General Reinforcement Learning Algorithm

Speaker: David Silver, UCL-CS
UCL Contact: CSC.Communications (Visitors from outside UCL please email in advance).
Date/Time: 23 May 2018, 17:30 - 20:00
Venue: Ambrose Fleming LT
Further Information:

There are just a few places left at the inaugural lecture of Professor David Silver next Wednesday, 23 May, from 5.30pm to 8pm. The event will take place in the Ambrose Fleming LT, Roberts Building, and will be followed by a reception with drinks and canapés in the Roberts Foyer, GO2.



A long-standing ambition of artificial intelligence (AI) is to construct algorithms that can learn, without human guidance, to achieve superhuman performance in challenging domains. In this talk I will describe AlphaGo: the first program to defeat a professional player, and subsequently a world champion, in the game of Go - long viewed as the most challenging of classic games for AI. I will then describe AlphaZero: an algorithm that learns entirely by self-play, without any human data or guidance. The core idea is that AlphaZero becomes its own teacher: a neural network is trained both to predict AlphaZero’s own move selections and also to predict the winner of AlphaZero vs. AlphaZero games. This neural network improves the strength of the tree search, resulting in higher quality move selection and stronger self-play games in the next iteration. Starting tabula rasa from random play and given no domain knowledge except the game rules, AlphaZero achieved within 24 hours a superhuman level of play in the games of Go, chess and shogi (Japanese chess), and convincingly defeated the incumbent world-champion program in each case.
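
As a rough illustration of the "own teacher" idea in the abstract, the two training targets (the search's move selections and the eventual winner) can be combined into a single per-position loss. The sketch below is not taken from the talk: the function name, argument names, and the choice of cross-entropy plus squared error are assumptions made for illustration, and regularisation is left out.

```python
import math


def self_play_loss(search_policy, predicted_policy, outcome, predicted_value):
    """Hypothetical per-position loss for a policy/value network.

    search_policy    -- move probabilities produced by the tree search (target)
    predicted_policy -- move probabilities output by the network
    outcome          -- game result from the current player's view (+1, 0, -1)
    predicted_value  -- network's prediction of that result, in [-1, 1]
    """
    eps = 1e-12  # guards against log(0); an illustrative choice
    # Cross-entropy: pushes the network towards the search's move selections.
    policy_loss = -sum(
        target * math.log(pred + eps)
        for target, pred in zip(search_policy, predicted_policy)
    )
    # Squared error: pushes the network towards predicting the winner.
    value_loss = (outcome - predicted_value) ** 2
    return policy_loss + value_loss


# One position with two legal moves; the search preferred the first move
# and the player to move went on to win.
good = self_play_loss([0.9, 0.1], [0.9, 0.1], 1.0, 0.8)
bad = self_play_loss([0.9, 0.1], [0.1, 0.9], 1.0, -0.8)
print(good < bad)
```

A network whose outputs agree with the search and with the game result receives the lower loss, which is the sense in which each round of self-play supplies the targets for the next.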

David Silver

David Silver leads the reinforcement learning research group at Google DeepMind. He graduated from Cambridge University in 1997 with the Addison-Wesley award. He then co-founded the video games company Elixir Studios, where he was CTO and lead programmer, receiving several awards for technology and innovation. David returned to academia in 2004 to study for a PhD on reinforcement learning with Rich Sutton, during which he co-introduced the algorithms used in the first master-level 9x9 Go programs. He was awarded a Royal Society University Research Fellowship in 2011, and subsequently became a lecturer at University College London. David consulted for DeepMind from its inception, joining full-time in 2013. His recent work has focused on combining reinforcement learning with deep learning, including a program that learns to play Atari games directly from pixels (Nature 2015). He led the AlphaGo project, culminating in the first program to defeat a top professional player in the full-size game of Go (Nature 2016), and the AlphaZero project, which learned by itself to defeat the world's strongest chess, shogi and Go programs (Nature 2017).