Deep Learning is here to stay. It is not just another clever way to train machines on big data; it is a revolution.
Recent advances in deep learning methods have generated widespread enthusiasm in the pattern recognition and machine learning communities. Inspired by the layered structure of the brain, deep learning architectures have revolutionized the way computers analyze data. Initially proposed by Geoffrey E. Hinton (now leading the AI team at Google), deep learning networks have won a remarkable number of hard machine learning contests, from speech recognition and image classification to Natural Language Processing (NLP) and time-series prediction – sometimes by a large margin.
Deep models have feature detector units at each layer (level) that gradually extract more sophisticated and invariant features from the original raw input signals. Lower layers extract simple features that are then fed into higher layers, which in turn detect more complex features. In contrast, shallow models (such as a two-layer neural network or a support vector machine) have only a few layers that map the original input features into a problem-specific feature space.
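This layered feature extraction can be sketched in a few lines. The following is a minimal, illustrative forward pass (not any specific published architecture): each level applies a set of feature detectors to the representation produced by the level below, and the toy weights are arbitrary values chosen for the example.

```python
# Minimal sketch of a deep model as a stack of layers, each mapping the
# previous layer's features to a more abstract representation.
# All weights below are arbitrary toy values, not learned parameters.

def relu(v):
    """Rectified linear activation applied element-wise."""
    return [max(0.0, x) for x in v]

def layer(inputs, weights, biases):
    """One fully connected layer: a row of feature detectors over the layer below."""
    return relu([
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ])

def deep_forward(x, stack):
    """Pass the raw input through each (weights, biases) level in turn."""
    for weights, biases in stack:
        x = layer(x, weights, biases)
    return x

# Two stacked levels: raw input (4 dims) -> 3 simple features -> 2 complex features.
stack = [
    ([[0.5, -0.2, 0.1, 0.0],
      [0.0, 0.3, -0.1, 0.2],
      [0.1, 0.1, 0.4, -0.3]], [0.0, 0.1, 0.0]),
    ([[0.6, -0.4, 0.2],
      [0.1, 0.5, -0.2]], [0.05, 0.0]),
]
features = deep_forward([1.0, 0.5, -0.3, 0.8], stack)
print(features)
```

A shallow model would stop after a single such mapping; depth comes from composing many of these layers, so that each level can build on the detectors below it.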
Because their layers can be pre-trained from unlabeled examples, deep architectures can be exponentially more efficient than shallow ones at representing certain functions. Since each element of the architecture is learned from examples, the number of computational elements one can afford is limited only by the number of training samples – which can be on the order of billions. Deep models can be trained with hundreds of millions of weights and therefore tend to outperform shallow models such as SVMs. Moreover, theoretical results suggest that deep architectures are essential for learning the kind of complex functions that represent high-level abstractions (e.g. vision, language, semantics), which are characterized by many factors of variation that interact in non-linear ways, making the learning process difficult.