AlphaGo – Part (II)
ANN research was confined to a few isolated groups, most prominently Geoffrey Hinton in Toronto, Yann LeCun in New York, Yoshua Bengio in Montreal and Jürgen Schmidhuber in Lugano. They worked hard to solve a fundamental problem in ANNs: how to train deep neural networks?
Two main difficulties seemed insurmountable: 1) how to avoid the vanishing and exploding gradient problems? and 2) how to avoid overfitting?
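To see why gradients vanish or explode, note that the gradient reaching the first layer of a deep network is a product of per-layer factors. A bare-bones sketch of my own (not from this article) showing both failure modes:

```python
# If each layer scales the backpropagated gradient by a factor slightly
# below 1, the product shrinks exponentially with depth (vanishing);
# slightly above 1, it blows up (exploding).
depth = 50
for factor in (0.9, 1.1):
    grad = 1.0
    for _ in range(depth):
        grad *= factor
    print(f"per-layer factor {factor}: gradient after {depth} layers = {grad:.2e}")
```

With just 50 layers, a factor of 0.9 leaves about 0.5% of the gradient, while 1.1 amplifies it more than a hundredfold.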
Faster computers, more data and efficient learning tricks came to the rescue. A new wave was initiated in 2006, when Hinton published a landmark work proposing an algorithm to train a class of machines called Restricted Boltzmann Machines (RBMs) stacked in several layers. The idea was simple: train the network layer by layer with an algorithm called Contrastive Divergence (CD). When a layer is trained, its weights are frozen and its hidden states are taken as input to the next layer.
The architecture of these layers stacked in a greedy fashion was called a Deep Belief Network (DBN). DBNs worked fine, achieving top performance on the classic MNIST handwritten digit recognition problem.
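The CD update for a single RBM layer can be sketched in a few lines. This is a minimal CD-1 toy of my own (the patterns, sizes and learning rate are arbitrary illustrations, not from the article): compare the hidden activations driven by the data against those driven by a one-step reconstruction, and nudge the weights toward the former.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.1):
    """One Contrastive Divergence (CD-1) update for a binary RBM.
    v0: batch of visible vectors; W: weights; b, c: visible/hidden biases."""
    # Positive phase: hidden activations driven by the data
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step down to the visibles and back up
    pv1 = sigmoid(h0 @ W.T + b)
    ph1 = sigmoid(pv1 @ W + c)
    # Update: data-driven correlations minus reconstruction-driven ones
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)
    b += lr * (v0 - pv1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return np.abs(v0 - pv1).mean()  # reconstruction error, for monitoring

# Two toy binary patterns the RBM should learn to reconstruct
data = np.array([[1., 1., 0., 0.], [0., 0., 1., 1.]])
W = 0.01 * rng.standard_normal((4, 3))
b, c = np.zeros(4), np.zeros(3)
errors = [cd1_step(data, W, b, c) for _ in range(500)]
print(f"reconstruction error: {errors[0]:.3f} -> {errors[-1]:.3f}")
```

In a DBN one would run this until the layer converges, freeze `W`, feed the hidden probabilities forward as data for the next RBM, and repeat.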
Note that, from a statistical point of view, training a DBN (i.e. finding the latent parameters describing the posterior probability distribution based on a set of observations) is an intractable problem: the number of possibilities is far too high to compute the likelihood integrals exactly (sorry for all this Bayesian statistical jargon…). The trick was to use sampling techniques like Markov chain Monte Carlo (MCMC) or Gibbs sampling. And it worked!
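The core idea of Gibbs sampling is easier to see away from RBMs. A toy illustration of my own: draw from a correlated bivariate normal using only its one-dimensional conditionals, which is exactly the alternate-and-resample pattern the RBM training loop relies on.

```python
import numpy as np

# For a standard bivariate normal with correlation rho, each conditional
# is itself normal:  x | y ~ N(rho * y, 1 - rho**2), and symmetrically.
rng = np.random.default_rng(1)
rho, n = 0.8, 20000
sd = np.sqrt(1 - rho ** 2)
x = y = 0.0
samples = np.empty((n, 2))
for i in range(n):
    x = rng.normal(rho * y, sd)  # resample x given the current y
    y = rng.normal(rho * x, sd)  # then y, given the fresh x
    samples[i] = x, y
corr = np.corrcoef(samples[:, 0], samples[:, 1])[0, 1]
print(f"target correlation {rho}, empirical {corr:.3f}")
```

No joint density was ever evaluated, yet the chain's samples reproduce the target correlation; the same dodge makes the intractable RBM likelihood workable.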
However, bear in mind that these machines are wild beasts. They are complex and time-consuming to train, and finding the right architecture can become a nightmare, as almost no theoretical support has been developed so far. These machines can have hundreds of millions of parameters. Finding the right set of parameters is like locating the Grand Canyon in complete darkness…
Despite all these difficulties, Deep Learning is the thing. It beats all the other techniques on complex problems, sometimes by orders of magnitude: image recognition, video, speech recognition, translation, NLP. To the human race's dismay, we have reached a point where machines beat humans at recognizing objects in an image. So it's not much of a surprise that Google beat humans at Go. Our old brains simply cannot cope with these huge and powerful learning algorithms.
So much for deep learning. Now comes the "No" part. Why will Deep Learning alone not lead us to GAI?
That will be explained in part III.