Practically all of the achievements mentioned so far stem from machine learning, a subset of AI that accounts for the vast majority of progress in the field in recent years. When people talk about AI today, they are generally talking about machine learning.
Machine learning, which is currently enjoying something of a resurgence, is in simple terms where a computer system learns how to perform a task rather than being explicitly programmed to do so. The term dates all the way back to 1959, when it was coined by Arthur Samuel, a pioneer of the field who developed one of the world’s first self-learning systems, the Samuel Checkers-playing Program.
To learn, these systems are fed huge amounts of data, which they use to work out how to carry out a specific task, such as understanding speech or captioning a photograph. The quality and size of this dataset are important for building a system able to carry out its designated task accurately. For example, if you were building a machine-learning system to predict house prices, the training data should include not just the property size but also other salient factors, such as the number of bedrooms or the size of the garden.
Neural networks are key to the success of machine learning. These mathematical models are able to tweak internal parameters to change what they output. During training, a neural network is fed datasets that teach it what it should output when presented with certain data. In concrete terms, the network might be fed greyscale images of the digits from zero to nine, alongside a string of binary digits (zeroes and ones) that indicates which digit is shown in each greyscale image.
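The article does not say exactly how that string of binary digits is laid out. One common convention, assumed here purely for illustration, is one-hot encoding: a list of ten zeroes with a single one at the position of the digit shown. The function names `one_hot` and `decode` are hypothetical.

```python
def one_hot(digit, num_classes=10):
    """Return a list of 0s and 1s with a single 1 at position `digit`."""
    if not 0 <= digit < num_classes:
        raise ValueError(f"digit must be between 0 and {num_classes - 1}")
    return [1 if position == digit else 0 for position in range(num_classes)]


def decode(bits):
    """Recover the digit from its one-hot label."""
    return bits.index(1)


label = one_hot(3)
print(label)          # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
print(decode(label))  # 3
```

Each training example would then pair one greyscale image with one such label, so the network has a target to compare its own output against.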
The network would then be trained, adjusting its internal parameters until it classifies the digit shown in each image with a high degree of accuracy. This trained neural network could then be used to classify other greyscale images of the digits from zero to nine. Such a network featured in a seminal 1989 paper by Yann LeCun demonstrating the application of neural networks, and has been used by the US Postal Service to recognise handwritten zip codes.
The structure and functioning of neural networks are very loosely based on the connections between neurons in the brain. Neural networks are made up of interconnected layers of simple processing units, often called nodes or neurons, that feed data into each other. They can be trained to carry out specific tasks by modifying the importance, or weight, attributed to data as it passes between these layers. During training, these weights continue to be varied until the output from the neural network is very close to what is desired. At that point, the network will have ‘learned’ how to carry out a particular task. The desired output could be anything from correctly labelling fruit in an image to predicting when an elevator might fail based on its sensor data.
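The idea of varying weights until the output is close to what is desired can be sketched with the smallest possible case: a single artificial neuron rather than a full multi-layer network. The example below is a minimal illustration, not how any production system is trained. The task (learning the logical OR of two inputs), the learning rate, the number of passes and all function names are assumptions chosen for the sketch.

```python
import math


def sigmoid(z):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, inputs):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


def train(samples, epochs=2000, learning_rate=0.5):
    """Nudge the weights after every example to shrink the output error."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            # Positive error means the output was too high, negative too low.
            error = predict(weights, bias, inputs) - target
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias -= learning_rate * error
    return weights, bias


# Toy dataset (an assumption for this sketch): the logical OR of two inputs.
OR_SAMPLES = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

weights, bias = train(OR_SAMPLES)
for inputs, target in OR_SAMPLES:
    print(inputs, "target:", target,
          "output:", round(predict(weights, bias, inputs), 3))
```

After training, the outputs sit close to the targets. In that sense the neuron has 'learned' the task, even though nobody wrote the rule for OR into the code.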
A subset of machine learning is deep learning, where neural networks are expanded into sprawling networks with a large number of sizeable layers that are trained using massive amounts of data. These deep neural networks have fuelled the current leap forward in the ability of computers to carry out tasks like speech recognition and computer vision.
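To make the notion of stacked layers concrete, the sketch below passes data through a list of layers, each feeding the next. It is a toy under stated assumptions: the weights are hand-picked rather than learned, the network has only two layers, and the names `dense`, `relu`, `forward` and `XOR_LAYERS` are hypothetical. Real deep networks use the same pattern with many more layers and far larger ones.

```python
def relu(values):
    """Replace negative values with zero, leaving positive values unchanged."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One layer: every output is a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(inputs, layers):
    """Feed the inputs through each layer in turn.

    ReLU is applied after every layer except the last one.
    """
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations


# Hand-picked (not learned) weights that compute the exclusive OR of two bits.
XOR_LAYERS = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer with two units
    ([[1.0, -2.0]], [0.0]),                   # output layer with one unit
]

for pair in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(pair, forward(pair, XOR_LAYERS))
```

Exclusive OR is a classic example of a function that no single neuron can compute but a network with one hidden layer can, which hints at why adding layers extends what a network is able to represent.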
There are various types of neural networks, each with different strengths and weaknesses. Recurrent neural networks (RNNs) are a type of neural net particularly well suited to natural language processing (NLP), which involves understanding the meaning of text, and to speech recognition. Convolutional neural networks (CNNs), meanwhile, have their roots in image recognition and have uses as diverse as recommender systems and NLP.
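The difference between the two families comes down to how they traverse their input, which the following sketch tries to show on plain lists of numbers. It is a simplification under stated assumptions: the filter values, the weights `w_x` and `w_h`, and the function names are all hypothetical, and real networks learn these values and work on much larger inputs.

```python
import math


def conv1d(signal, kernel):
    """Slide a small filter along the signal, as a convolutional layer does.

    As in most deep learning libraries, the kernel is not flipped, so strictly
    speaking this is a cross-correlation.
    """
    width = len(kernel)
    return [
        sum(k * s for k, s in zip(kernel, signal[start:start + width]))
        for start in range(len(signal) - width + 1)
    ]


def rnn_final_state(sequence, w_x=1.0, w_h=0.5):
    """Read a sequence one item at a time, carrying a hidden state forward."""
    h = 0.0
    for x in sequence:
        h = math.tanh(w_x * x + w_h * h)
    return h


# The same two-value filter is reused at every position to spot changes.
edges = conv1d([0, 0, 1, 1, 0], [-1, 1])
print(edges)  # [0, 1, 0, -1]

# The recurrent state depends on where in the sequence the 1 appeared.
early = rnn_final_state([1, 0, 0])
late = rnn_final_state([0, 0, 1])
print(early, late)
```

The convolution applies one small filter everywhere, which suits patterns that can appear anywhere in an image. The recurrent loop keeps a running summary of what it has seen so far, which suits text and speech, where order matters.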
The design of neural networks is also evolving. Researchers have refined a more effective form of deep neural network called long short-term memory (LSTM), a type of RNN architecture used for tasks such as NLP and stock market prediction, so that it operates fast enough to be used in on-demand systems like Google Translate.
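For readers curious what sets an LSTM apart from a plain RNN, the sketch below steps a single LSTM cell through a short sequence using the standard textbook gate equations. It is heavily simplified: every quantity is a single number rather than a vector, the parameters in `PARAMS` are hand-picked assumptions rather than learned values, and it says nothing about how Google Translate or any other product implements its models.

```python
import math


def sigmoid(z):
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """Advance a scalar LSTM cell by one time step.

    `params` maps each gate name to (input weight, recurrent weight, bias).
    """
    def pre_activation(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    forget = sigmoid(pre_activation("forget"))           # how much memory to keep
    write = sigmoid(pre_activation("input"))             # how much new content to store
    candidate = math.tanh(pre_activation("candidate"))   # the new content itself
    expose = sigmoid(pre_activation("output"))           # how much memory to reveal

    c = forget * c_prev + write * candidate  # updated long-term cell state
    h = expose * math.tanh(c)                # updated short-term output
    return h, c


# Hypothetical, hand-picked parameters; a real LSTM learns these in training.
PARAMS = {
    "forget": (0.0, 0.0, 2.0),
    "input": (1.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 2.0),
}

h, c = 0.0, 0.0
for x in [1.0, 0.0, 0.0, 0.0]:
    h, c = lstm_step(x, h, c, PARAMS)
    print(round(h, 3), round(c, 3))
```

With these values the cell stores something when it sees the initial 1.0 and then lets that memory fade only gradually over the following steps. This gated cell state is what allows an LSTM to retain information over longer sequences than a plain RNN.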