The History and Future of Neural Networks

In January 2020, the BBC listed developments in the application of neural networks, ranging from facial recognition to language translation. Whilst these applications have surfaced in the twenty-first century, the technology underpinning Machine Learning is the result of decades of research, with plenty of highs and lows along the way. This article outlines the long journey from early logic gates, through “AI winters”, to the deep neural networks being developed today.

The first Artificial Neural Network

The first artificial neural network was proposed back in 1943 by Warren McCulloch, a Neurophysiologist, and Walter Pitts, a Mathematician, as a result of their research into applying Mathematics, in the form of Boolean logic, to model how neurons within the brain work. Using logic gates, they demonstrated how an output could be activated when specified inputs were active, mimicking the way a neuron fires. As a model of the brain, however, it was considerably simpler than the billions of interconnected neurons in a human brain, and it could only process binary inputs and outputs.
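
As a rough illustration of the idea (a sketch in modern code, not McCulloch and Pitts’ original notation), such a unit can be written as a simple threshold over binary inputs, with the choice of threshold determining which logic gate it behaves as:

# A minimal sketch of a McCulloch-Pitts style threshold unit.
# Inputs and the output are binary; the unit fires when enough inputs are active.
def mcculloch_pitts(inputs, threshold):
    return 1 if sum(inputs) >= threshold else 0

# With two inputs, a threshold of 2 behaves like an AND gate,
# and a threshold of 1 behaves like an OR gate.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", mcculloch_pitts([a, b], 2), "OR:", mcculloch_pitts([a, b], 1))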

The Perceptron

In 1958, Frank Rosenblatt, a Psychologist, developed a more advanced neural network called the Perceptron, which could process numbers as inputs and associate a weight with each input. By using weights, the Perceptron modelled behaviour that had been observed in the brain: a neuron becomes more effective at activating a nearby neuron when the two are triggered together repeatedly. This is described by Hebb’s Rule, which Siegrid Löwel paraphrased as “neurons that fire together, wire together.”

The Perceptron was able to categorise basic images without being given a pre-programmed sequence of steps to execute.  This caused understandable excitement; for the first time, a machine was able to function without being given detailed instructions to follow.
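
As a sketch of the mechanism (trained here on a toy, linearly separable problem invented for illustration, rather than Rosenblatt’s image data), a Perceptron sums its weighted inputs, fires if the total crosses a threshold, and nudges its weights whenever it gets an answer wrong:

# A minimal sketch of a Perceptron: weighted inputs, a bias, a step function,
# and the classic Perceptron learning rule.
def predict(weights, bias, x):
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train(data, epochs=20, lr=0.1):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in data:
            error = target - predict(weights, bias, x)
            # Nudge the weights in the direction that reduces the error.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Toy training data: the (linearly separable) OR function.
or_data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train(or_data)
print([predict(weights, bias, x) for x, _ in or_data])   # [0, 1, 1, 1]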

Unfortunately, the breakthrough with the Perceptron also fed a problem that has recurred throughout the history of Artificial Intelligence: unrealistic claims and hype, including a claim reported by The New York Times in 1958 that the Perceptron would “be able to walk, talk, see, write, reproduce itself and be conscious of its existence.”

The first AI winter

In reality, it was becoming apparent that there were limits to what could be achieved with single-layer networks such as the Perceptron. In 1969, Marvin Minsky and Seymour Papert demonstrated these limitations, including the inability to solve problems such as the XOR classification problem, and funding dried up.
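
To see why XOR is troublesome, note that its four input/output pairs cannot be split by any single straight line. The brute-force check below (an illustration over a small grid of candidate weights, not Minsky and Papert’s formal argument) finds a linear separator for OR but none for XOR:

# A sketch showing that no single linear threshold unit computes XOR:
# searching a grid of weights and biases finds a separator for OR but not for XOR.
def separates(weights, bias, data):
    return all((sum(w * xi for w, xi in zip(weights, x)) + bias > 0) == bool(target)
               for x, target in data)

def find_separator(data):
    grid = [i / 4 for i in range(-8, 9)]          # candidate values from -2.0 to 2.0
    return next(((w1, w2, b) for w1 in grid for w2 in grid for b in grid
                 if separates([w1, w2], b, data)), None)

xor_data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
or_data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
print("OR separator: ", find_separator(or_data))    # a separator is found
print("XOR separator:", find_separator(xor_data))   # None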

This period of inactivity, referred to as the first AI winter, continued until the 1980s.  

Back Propagation

The breakthrough that finally ended the first AI winter came in 1986, when the back-propagation algorithm (popularised by Rumelhart, Hinton, and Williams) was shown to train multi-layer networks, overcoming the limitations of the single-layer Perceptron. This led to renewed interest in the potential applications of neural networks.
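
A minimal sketch of the idea (a tiny two-layer network trained on XOR with toy settings, not any published configuration) shows how back-propagation passes error signals from the output back through a hidden layer, letting the network solve the very problem that defeated the single-layer Perceptron:

# A minimal sketch of back-propagation: a network with one hidden layer learns XOR.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Small random weights for the input->hidden and hidden->output layers.
W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros((1, 8))
W2, b2 = rng.normal(0, 1, (8, 1)), np.zeros((1, 1))
lr = 1.0

for _ in range(10000):
    # Forward pass.
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)

    # Backward pass: propagate the output error back through each layer.
    d_output = (output - y) * output * (1 - output)
    d_hidden = (d_output @ W2.T) * hidden * (1 - hidden)

    # Gradient-descent updates.
    W2 -= lr * hidden.T @ d_output
    b2 -= lr * d_output.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_hidden
    b1 -= lr * d_hidden.sum(axis=0, keepdims=True)

print(np.round(output.ravel(), 2))   # should be close to [0, 1, 1, 0]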

The Second Winter

However, in the 1990s, with processing power slow enough that training a model could take weeks, and with large datasets hard to come by, the AI community moved away from neural networks towards other methods such as Support Vector Machines (a technique that categorises data by determining which side of a hyperplane an item falls on), as these regularly produced better results.
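
As a hedged sketch of that alternative (using scikit-learn with a handful of points invented for illustration), a Support Vector Machine fits a separating hyperplane and then categorises new items by which side of it they fall on:

# A minimal sketch of a Support Vector Machine separating two toy classes.
from sklearn.svm import SVC

X = [[1, 2], [2, 3], [3, 3],        # class 0
     [6, 5], [7, 7], [8, 6]]        # class 1
y = [0, 0, 0, 1, 1, 1]

model = SVC(kernel="linear")        # fit a linear separating hyperplane
model.fit(X, y)

# New items are categorised by which side of the hyperplane they fall on.
print(model.predict([[2, 2], [7, 6]]))   # expected: [0 1]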

Although neural nets were out of favour in the 1990s, there was still active research that led to key breakthroughs: in 1997, Hochreiter and Schmidhuber developed the long short-term memory (LSTM) network, which allows information to be retained across long sequences of time steps, and in 1998 LeCun, Bottou, Bengio, and Haffner presented the Convolutional Neural Network (CNN), a multi-layer architecture inspired by the visual cortex that is widely applied to visual imagery.
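
As a rough sketch of what these two architectures look like in a modern framework (PyTorch here, with arbitrary toy sizes rather than the original 1997 and 1998 configurations), an LSTM carries information across the steps of a sequence, while a convolutional layer scans an image with small learned filters:

# Toy-sized LSTM and convolutional layers in PyTorch.
import torch
import torch.nn as nn

# LSTM: processes a sequence, carrying a memory cell across time steps.
lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
sequence = torch.randn(1, 20, 8)              # batch of 1, 20 time steps, 8 features
outputs, (hidden, cell) = lstm(sequence)
print(outputs.shape)                          # torch.Size([1, 20, 16])

# CNN: convolutional filters scan an image for local patterns,
# loosely inspired by receptive fields in the visual cortex.
conv = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
image = torch.randn(1, 1, 28, 28)             # batch of 1 greyscale 28x28 image
features = conv(image)
print(features.shape)                         # torch.Size([1, 6, 24, 24])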

Deep Neural Networks

The end of the second neural network winter came in 2006, when Geoffrey Hinton, a Cognitive Psychologist, developed greedy layer-wise pretraining, which addressed the problem that the layers closest to the input of a deep neural network received barely any updates during training. Being able to train deep neural networks (networks with many hidden layers) made much more powerful processing possible. This technology has led to major breakthroughs, from advanced image recognition (including human faces) through to the emergence of self-driving cars and the ability to translate between languages.
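
As a hedged sketch of the general idea (using simple autoencoders in PyTorch and invented toy data, rather than Hinton’s original restricted Boltzmann machine recipe), each layer is pre-trained in turn to reconstruct the output of the layers beneath it, giving the deep network a sensible starting point before end-to-end fine-tuning:

# A sketch of greedy layer-wise pretraining using simple autoencoders.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.rand(256, 20)                        # toy unlabelled data, invented for illustration

layer_sizes = [20, 16, 8, 4]
encoders = []
inputs = X

# Pre-train one layer at a time: each new layer learns to reconstruct
# the output of the already-trained layers beneath it.
for d_in, d_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    encoder, decoder = nn.Linear(d_in, d_out), nn.Linear(d_out, d_in)
    optimiser = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=0.01)
    for _ in range(200):
        optimiser.zero_grad()
        reconstruction = decoder(torch.sigmoid(encoder(inputs)))
        loss = nn.functional.mse_loss(reconstruction, inputs)
        loss.backward()
        optimiser.step()
    encoders.append(encoder)
    inputs = torch.sigmoid(encoder(inputs)).detach()   # freeze this layer's output as the next input

# The pre-trained layers are then stacked and would normally be fine-tuned
# end-to-end with a task-specific output layer on top.
stack = nn.Sequential(*[nn.Sequential(e, nn.Sigmoid()) for e in encoders])
print(stack(X).shape)                          # torch.Size([256, 4])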

Challenges with Neural Networks

However, there are still outstanding challenges with the deployment of neural networks. Training a neural network involves adjusting a large number of weights so that the behaviour of the model reflects the data it was trained on. Explaining how a model that consists of nothing more than these numeric weights derived a particular conclusion is not always possible. Research into explainable artificial intelligence is essential if neural networks are to be deployed into fields such as law or healthcare, where life-changing decisions will be challenged.

In addition, the robustness of neural networks needs to be improved if they are to be deployed in areas where lives could be endangered, such as self-driving cars. For example, an article in Nature showed how self-driving cars have failed to recognise stop signs when stickers were placed on them. Overcoming this brittleness may require some level of understanding of the data being processed, and a number of approaches are being researched, including hybrid AI, which aims to combine neural networks with symbolic AI (which describes the environment in terms of hard-coded rules).

The future for Neural Networks

With the size of neural networks increasing at a considerable pace (roughly doubling every 2.4 years), applying the technology to increasingly complex problems becomes ever more feasible. Provided that AI ethics are incorporated, the combination of larger neural networks, increasing processing power, larger datasets, and the results of decades of research offers an exciting future for the application of Artificial Neural Networks to benefit society.

Author

  • Tamsin Crossland

    Tamsin Crossland is a Senior Architect at Icon Solutions with thirty years’ experience of implementing technology solutions, primarily in the Finance sector. Tamsin is currently researching the use of Machine Learning in Payments. She has spoken at conferences including QCon and Minds Mastering Machines. Tamsin is co-author of “The AI Book”.
