The Science Behind a Computer Program’s Mastery of Go

AlphaGo Paper Figure 1

Figure 1. The neural network training pipeline and architecture of AlphaGo, from Figure 1 of the paper “Mastering the game of Go with deep neural networks and tree search,” published January 28, 2016 in Nature.

This past March, a high-profile challenge match for one million dollars took place between the computer program AlphaGo and Lee Sedol, the world’s top human Go player. AlphaGo won four of the five games, winning the challenge.

Five months earlier, in October 2015, AlphaGo became the first computer program in history to beat a professional Go player when it won 5 out of 5 matches against European Champion Fan Hui. The following January, the paper “Mastering the game of Go with deep neural networks and tree search” was published in Nature (January 28, 2016), describing the mechanisms that powered AlphaGo in that match. In summary, AlphaGo is driven by deep neural networks trained using a combination of 1) supervised learning from human expert games and 2) reinforcement learning from games of self-play (see Figure 1 above).
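To make the two-stage training idea concrete, here is a minimal toy sketch (not AlphaGo’s actual code or network architecture): a tiny linear softmax policy over moves is first fit to hypothetical “expert” (state, move) pairs by maximizing log-likelihood, and then nudged further with REINFORCE-style reward-weighted updates standing in for self-play outcomes. All sizes, data, and the reward signal are made up for illustration.

```python
# Toy sketch of AlphaGo's two-stage training recipe (illustrative only):
#   Stage 1: supervised learning from expert moves (log-likelihood ascent)
#   Stage 2: reinforcement learning from self-play reward (REINFORCE)
import numpy as np

rng = np.random.default_rng(0)
N_FEATURES, N_MOVES = 8, 4            # hypothetical tiny board encoding
W = np.zeros((N_FEATURES, N_MOVES))   # policy weights (linear, for brevity)

def policy(state):
    """Softmax distribution over moves for a feature vector `state`."""
    z = state @ W
    e = np.exp(z - z.max())           # subtract max for numerical stability
    return e / e.sum()

def log_policy_grad(state, move, probs):
    """Gradient of log p(move | state) w.r.t. W for a softmax policy."""
    grad = -np.outer(state, probs)
    grad[:, move] += state
    return grad

# --- Stage 1: supervised learning from (state, expert_move) pairs ---
expert_states = rng.normal(size=(200, N_FEATURES))
expert_moves = rng.integers(0, N_MOVES, size=200)   # fake "expert" labels
for s, a in zip(expert_states, expert_moves):
    W += 0.1 * log_policy_grad(s, a, policy(s))      # gradient ascent

# --- Stage 2: reinforcement learning from game outcomes ---
for _ in range(200):
    s = rng.normal(size=N_FEATURES)
    p = policy(s)
    a = rng.choice(N_MOVES, p=p)                     # sample move from policy
    reward = 1.0 if a == a % N_MOVES else -1.0       # stand-in for win/loss
    W += 0.05 * reward * log_policy_grad(s, a, p)    # reward-weighted update
```

The real system trains deep convolutional policy and value networks over full 19×19 board positions and combines them with Monte Carlo tree search; the sketch above only shows the shape of the supervised-then-reinforcement training loop.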

An enormous amount of research is encapsulated in the overview shown in Figure 1. Stay tuned as we examine the science behind the algorithms that enabled AlphaGo to play Go at such an impressive level of mastery.