From its origins in the 1950s until the mid-1980s, the major approach to AI was to design a set of rules describing the behaviour of the machine. These rules tell the system what to do in a variety of conditions: if a certain condition is met, then a specific action is performed. With a good set of rules and heuristics describing a task, we can make a machine behave intelligently. This approach has been hugely successful in application areas where we can specify exactly how to perform a task under a specific range of conditions, for example, controlling a factory production line.
However, rule-based systems require us to specify what the rules are, and so their application is limited to well-understood tasks in simple or controlled environments. For example, visually checking the size of a product is straightforward if the product is a known distance from the camera and against an uncluttered background such as a conveyor belt. Identifying the same product in a natural scene would be much more challenging and is typically beyond current rule-based approaches. Despite these limitations, rule-based AI systems are widely used in today’s industries, and for many problems they represent the best approach.
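The conveyor belt check described above can be sketched as a hand-written rule. This is only an illustration, not a real inspection system: the calibration constant, target width, tolerance, and brightness threshold are all hypothetical values, and the "image" is a single synthetic row of pixel brightnesses.

```python
# Hypothetical calibration and tolerance values, chosen only for illustration.
MM_PER_PIXEL = 0.5        # only valid at one fixed, known camera distance
TARGET_WIDTH_MM = 120.0
TOLERANCE_MM = 2.0


def measure_width_px(row, threshold=128):
    """Width in pixels of a bright product against a dark belt in one row."""
    bright = [i for i, value in enumerate(row) if value > threshold]
    if not bright:
        return 0
    return bright[-1] - bright[0] + 1


def check_product(row):
    """The hand-written rule: accept only if the width is within tolerance."""
    width_mm = measure_width_px(row) * MM_PER_PIXEL
    if abs(width_mm - TARGET_WIDTH_MM) <= TOLERANCE_MM:
        return "accept"
    return "reject"


# Synthetic scan lines: dark belt, bright product, dark belt.
good_row = [10] * 30 + [200] * 240 + [10] * 30
short_row = [10] * 30 + [200] * 200 + [10] * 70
print(check_product(good_row))   # accept (240 px * 0.5 mm = 120 mm)
print(check_product(short_row))  # reject (200 px * 0.5 mm = 100 mm)
```

Every assumption is baked into the rule. Move the camera or clutter the background and both the calibration and the threshold stop being valid, which is exactly the limitation described above.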
Below we describe how we got from rule-based systems to Deep Learning:
In the mid-1980s, a different paradigm emerged: designing machines that learn how to solve problems by themselves. Machine Learning differs from rule-based systems in that we don’t have to specify how a task is to be performed. Instead, the system learns how to solve a problem by looking at training examples, refining its behaviour using statistical methods. When it works well, the trained system will have learned the relationship between the input data and the desired output (the actions it should perform) sufficiently well to generalise to new situations. The clear advantage here is that problems can be solved even if we don’t know how to solve them ourselves.
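As a minimal sketch of learning from examples, the snippet below fits a straight line to a handful of input/output pairs using gradient descent. The data, learning rate, and step count are assumptions made for illustration, and a linear model is just one of the simplest possible learners, not the method used by any particular system.

```python
# Training examples: inputs paired with the desired outputs.
# The underlying relationship (y = 2x + 1) is never given to the learner.
EXAMPLES = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]


def train(examples, learning_rate=0.05, steps=2000):
    """Fit y ~ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        # Refine the model a little in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict(model, x):
    w, b = model
    return w * x + b


model = train(EXAMPLES)
print(predict(model, 10.0))  # close to 21, for an input never seen in training
```

Nobody wrote down the rule "double it and add one". The system recovered it from the examples and then applied it to a new input.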
Of course, there are many approaches to Machine Learning, including Bayesian Networks, Support Vector Machines, and Neural Networks, to name a few. Until recently, however, most successful Machine Learning applications have still required some level of problem-specific design. The performance of Machine Learning algorithms on raw data is often poor, so hand-designed manipulations of the data (for example, applying edge detection to an image, or frequency analysis to a sound wave) are often required before good performance can be achieved.
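To make "hand-designed manipulation" concrete, here is a small sketch of edge detection using the standard horizontal Sobel kernel, written in plain Python. The tiny image is synthetic, and the helper name `filter_image` is made up for this example. The filter is applied as a sliding window (cross-correlation) over the valid region only.

```python
# Sobel kernel that responds to left-to-right changes in brightness.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]


def filter_image(image, kernel):
    """Slide a 3x3 kernel over the image (valid region only, no padding)."""
    rows, cols = len(image), len(image[0])
    output = []
    for r in range(rows - 2):
        output_row = []
        for c in range(cols - 2):
            total = sum(kernel[i][j] * image[r + i][c + j]
                        for i in range(3) for j in range(3))
            output_row.append(total)
        output.append(output_row)
    return output


# A tiny synthetic image: dark on the left, bright on the right.
image = [[0, 0, 0, 9, 9, 9] for _ in range(4)]
edges = filter_image(image, SOBEL_X)
print(edges)  # [[0, 36, 36, 0], [0, 36, 36, 0]]
```

The output is large only where the brightness changes, so a downstream learner sees "where the edges are" rather than raw pixels. The important point is that a person chose this kernel by hand.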
This all changed with the rise of Deep Learning, an approach to training deep neural networks that learns the whole problem for you, including the data manipulations that previously had to be designed by hand.
Deep Learning has quickly proved successful and has already established itself as the mainstream modern approach to pattern recognition for perceptual tasks. On some narrowly defined tasks, Deep Learning systems can outperform humans, making fewer errors, from identifying items in a photo to finding tumours in MRI scans. When you talk to Siri, Google voice, Cortana, or Skype Translator, your speech is being interpreted by a Deep Neural Network.
Whether it’s targeted at product recommendations, autonomous vehicles, defeating the world Go champion, or predicting financial markets, Deep Learning is being applied in social media, defence/intelligence, consumer electronics, medical, energy, media & entertainment, finance, robotics, and beyond… so, where next?
Deep Learning is not a new idea and isn’t complicated. The basic idea is to train a very deep neural network (one with many layers) on huge amounts of training data. We know that a properly configured neural network with enough hidden units can approximate any continuous function to any desired accuracy (the universal approximation theorem). The problem then becomes how to properly configure that neural network. This is where Deep Learning comes in: instead of solving a problem in one or two big steps (as shallow neural networks do), Deep Neural Networks, having many layers, solve a problem in lots of little steps, because smaller steps are easier to learn. This makes it possible to solve problems or perform tasks that have previously eluded us.
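A small sketch can show what "properly configured" and "layers as steps" mean. The network below has its weights set by hand rather than learned, purely as an illustration. It computes XOR, a function that no single linear layer can represent, by passing the input through two simple layers in turn. The layer sizes and weights are assumptions chosen for this example.

```python
def relu(values):
    """Keep positive values, replace negative ones with zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One layer: a weighted sum of the inputs plus a bias, per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


# Hand-configured (not trained) weights: each entry is (weights, biases).
LAYERS = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer, two units
    ([[1.0, -2.0]], [0.0]),                   # output layer, one unit
]


def forward(inputs, layers=LAYERS):
    """Pass the input through each layer in turn, one small step at a time."""
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:  # non-linearity on hidden layers only
            activations = relu(activations)
    return activations


for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((a, b), forward([a, b])[0])  # 0.0, 1.0, 1.0, 0.0
```

In a real Deep Learning system the same `forward` structure is used with many more layers, and the weights are found by training on data rather than written down by a person.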
While Deep Learning is not a new idea, training typically requires a large training set and is so computationally expensive that it is only with the use of graphics cards as massively parallel processors that deep networks have finally been able to reach their potential. Training Deep Neural Networks is now almost exclusively done using graphics cards (GPUs); however, the resulting deep network can often be deployed as a relatively light load running on embedded hardware, or even a smartphone.
By far the most commonly used Deep Learning model is the Convolutional Neural Network, originally designed for computer vision problems and now widely applied to all kinds of perceptual problems (labelling complex data): for example, identifying a plant from a photo, an individual from a security camera, a tank from a satellite image, a tumour from an MRI scan, or words from speech. But those labels can just as easily be actions (e.g. steering a drone, or playing Go) or predictions (e.g. when to buy or sell).
Once trained, Deep Learning models are easily embedded into larger hybrid systems enabling new technologies. For example, any autonomous car must be able to reliably recognise road signs, pedestrians, cyclists, other cars, and so on.
Ultimately, Machine Learning and Deep Learning models find complex patterns and relationships in your data. How you use that is up to you. With Deep Learning, your initiative’s future is what you choose for it.