In the match between AlphaGo and the famous South Korean Go player Lee Se-dol, AlphaGo eventually defeated Lee Se-dol 4:1. This was a victory for artificial intelligence, and the core technology behind AlphaGo is deep learning!

Deep learning is actually built on neural network technology. By the late 1990s, neural networks had fallen out of favor. To win back people's attention to the field, the Canadian scientist Geoffrey Hinton gave his multi-layer neural network technology a new name in 2006: "deep learning". Deep-learning systems then swept global computer image recognition competitions, far surpassing rival entries, including those from Google and other large commercial players, and deep learning gradually became the most popular artificial intelligence technology in the world! The basic characteristic of a neural network is that it tries to imitate the way neurons in the brain transmit and process information.

For a computing model to be classified as a neural network, it usually requires a large number of interconnected nodes (also known as "neurons") and has two features:

Each neuron computes the weighted input values it receives from neighboring neurons through a specific output function (also called an excitation function or activation function).

The strength of information transmission between neurons is defined by so-called weights, and the algorithm continuously self-learns by adjusting these weight values.
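The two features above can be sketched in a few lines of Python. This is a minimal illustration, not any particular library's API; the input values, weights, bias, and the choice of a sigmoid activation are all assumptions made for the example:

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of the inputs from
    neighboring neurons, passed through an activation function
    (here the sigmoid) to produce the output."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-weighted_sum))  # sigmoid activation

# illustrative inputs and weights; learning would adjust the weights
out = neuron([0.5, -1.0, 2.0], [0.4, 0.3, 0.1], bias=0.0)
```

Training a network amounts to repeatedly adjusting `weights` and `bias` so the neuron's outputs move closer to the desired values.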

The basic idea of deep learning is automatic machine learning: suppose we have a neural network system S with n layers (S1, ..., Sn), input I, and output O, which can be vividly expressed as I => S1 => S2 => ... => Sn => O. If the output O equals the input I, that is, if the input I passes through the system without any information loss, it means that the input I passes through each layer Si without losing information; in other words, every layer Si is just another representation of the original information I. Now suppose we have a pile of inputs (such as a heap of images or text) and we have designed such an n-layer system S. By adjusting the system's parameters so that its output equals its input I, we can automatically obtain a series of hierarchical features of the input I, namely S1, ..., Sn.
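This "output equals input" idea can be illustrated with a toy Python sketch. The `encode`/`decode` functions and the scale factor below are hypothetical stand-ins for the layers Si, chosen only to make the reconstruction exact:

```python
def encode(i, scale=0.5):
    """A toy 'layer' S1: maps the input to a hidden representation."""
    return [x * scale for x in i]

def decode(h, scale=0.5):
    """The inverse mapping back to the input space."""
    return [x / scale for x in h]

I = [1.0, 2.0, 3.0]
O = decode(encode(I))
assert O == I  # no information lost: the hidden layer is just
               # another representation of the same input
```

Real systems learn `encode`/`decode` from data rather than fixing them by hand, but the criterion is the same: if the input can be reconstructed, the intermediate representation preserved the information.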

For deep learning, the idea is to stack multiple such layers, that is, to feed the output of one layer in as the input of the next layer. In this way, the input information can be expressed in stages. In simple terms, the machine learns to express the same information in different forms, and this is actually the essence of so-called intelligence!
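The stacking of layers can be sketched as follows; this is a minimal illustration, and the layer sizes, weight values, and sigmoid activation are invented for the example:

```python
import math

def layer(inputs, weights, biases):
    """One fully connected layer: each output neuron takes a weighted
    sum of all inputs and applies a sigmoid activation."""
    return [1.0 / (1.0 + math.exp(-(sum(x * w for x, w in zip(inputs, row)) + b)))
            for row, b in zip(weights, biases)]

def forward(x, layers):
    """Stack layers: the output of each layer is the input of the next."""
    for weights, biases in layers:
        x = layer(x, weights, biases)
    return x

# two stacked layers: a 3-dim input re-expressed as a 2-dim representation
net = [
    ([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], [0.0, 0.0]),  # layer 1: 3 -> 2
    ([[0.7, 0.8], [0.9, 1.0]], [0.0, 0.0]),            # layer 2: 2 -> 2
]
out = forward([1.0, 0.5, -0.5], net)
```

Each pass through a layer re-expresses the input in a new form, which is exactly the staged expression of information described above.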

Search and optimization

Many problems in AI can be solved in theory by intelligently searching through many possible solutions: reasoning can be reduced to performing a search. For example, logical proof can be viewed as searching for a path that leads from premises to conclusions, where each step is the application of an inference rule. Planning algorithms search through trees of goals and subgoals, attempting to find a path to a target goal, a process called means-ends analysis. Robotics algorithms for moving limbs and grasping objects use local searches in configuration space. Many learning algorithms use search algorithms based on optimization.

Simple exhaustive searches are rarely sufficient for most real-world problems: the search space (the number of places to search) quickly grows to astronomical numbers. The result is a search that is too slow or never completes. The solution, for many problems, is to use "heuristics" or "rules of thumb" that eliminate choices that are unlikely to lead to the goal (called "pruning the search tree"). Heuristics supply the program with a "best guess" for the path on which the solution lies. Heuristics limit the search for solutions into a smaller sample size.
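One concrete instance of heuristic search is greedy best-first search, sketched below. The toy graph and the heuristic values (invented "best guesses" of distance to the goal) are illustrative assumptions, not from the original text:

```python
import heapq

def greedy_best_first(graph, heuristic, start, goal):
    """Greedy best-first search: always expand the frontier node the
    heuristic rates closest to the goal, effectively pruning paths
    it judges unpromising."""
    frontier = [(heuristic[start], start, [start])]
    visited = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(frontier,
                               (heuristic[neighbor], neighbor, path + [neighbor]))
    return None

# hypothetical toy graph; heuristic = guessed distance to goal 'G'
graph = {'S': ['A', 'B'], 'A': ['G'], 'B': ['C'], 'C': ['G']}
heuristic = {'S': 3, 'A': 1, 'B': 2, 'C': 1, 'G': 0}
path = greedy_best_first(graph, heuristic, 'S', 'G')
```

Because the heuristic rates `A` as closer to the goal than `B`, the branch through `B` and `C` is never expanded, which is the pruning the paragraph describes.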

A very different kind of search came to prominence in the 1990s, based on the mathematical theory of optimization. For many problems, it is possible to begin the search with some form of a guess and then refine the guess incrementally until no more refinements can be made. These algorithms can be visualized as blind hill climbing: we begin the search at a random point on the landscape, and then, by jumps or steps, we keep moving our guess uphill, until we reach the top. Other optimization algorithms include simulated annealing, beam search and random optimization.
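The hill-climbing picture can be sketched in a few lines; the objective function, starting point, and step size below are illustrative assumptions:

```python
def hill_climb(f, x, step=0.1, max_iters=1000):
    """Blind hill climbing: from a starting guess, keep taking the
    neighboring step that increases f until no step improves it."""
    for _ in range(max_iters):
        neighbors = [x + step, x - step]
        best = max(neighbors, key=f)
        if f(best) <= f(x):  # no uphill neighbor: a (local) peak
            return x
        x = best
    return x

# maximize f(x) = -(x - 2)^2, whose single peak is at x = 2
peak = hill_climb(lambda x: -(x - 2) ** 2, x=0.0)
```

Note the hedge in the comment: on landscapes with several hills this procedure stops at a local peak, which is why variants such as simulated annealing, which sometimes accept downhill moves, exist.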

Deep feedforward neural networks

Deep learning in artificial neural networks with many layers has transformed many important subfields of artificial intelligence, including computer vision, speech recognition, natural language processing and others.

According to a survey, the expression "Deep Learning" was introduced to the machine learning community by Rina Dechter in 1986 and gained traction after Igor Aizenberg and colleagues introduced it to artificial neural networks in 2000. The first functional deep learning networks were published by Alexey Grigorevich Ivakhnenko and V. G. Lapa in 1965. These networks are trained one layer at a time. Ivakhnenko's 1971 paper describes the learning of a deep feedforward multilayer perceptron with eight layers, already much deeper than many later networks. In 2006, a publication by Geoffrey Hinton and Ruslan Salakhutdinov introduced another way of pre-training many-layered feedforward neural networks (FNNs) one layer at a time, treating each layer in turn as an unsupervised restricted Boltzmann machine, then using supervised backpropagation for fine-tuning. Similar to shallow artificial neural networks, deep neural networks can model complex non-linear relationships. Over the last few years, advances in both machine learning algorithms and computer hardware have led to more efficient methods for training deep neural networks that contain many layers of non-linear hidden units and a very large output layer.

Deep learning often uses convolutional neural networks (CNNs), whose origins can be traced back to the Neocognitron introduced by Kunihiko Fukushima in 1980. In 1989, Yann LeCun and colleagues applied backpropagation to such an architecture. By the early 2000s, in an industrial application, CNNs already processed an estimated 10% to 20% of all the checks written in the US. Since 2011, fast implementations of CNNs on GPUs have won many visual pattern recognition competitions.

Deep feedforward neural networks were used in conjunction with reinforcement learning by AlphaGo, Google DeepMind's program that was the first to beat a professional human Go player.