A Primer for Busy Executives Who Are Too Embarrassed to Ask
We've all heard how artificial intelligence (AI), machine learning (ML), and high-tech algorithms are changing the world, but how much does the layperson understand about what these terms even mean? In this no-judgment zone we will do a cursory overview of the basic terms, which will hopefully be a launching point for your deeper exploration into the topic.
In 1950, Alan Turing wrote "Computing Machinery and Intelligence," in which he asked a simple question: "Can machines think?" AI is the branch of computer science that aims to answer this question by replicating human intelligence in machines.
Narrow vs. General AI
Many still think of AI as the stuff of science fiction (think HAL 9000 in the movie 2001 or Data, the android from Star Trek). There is an important distinction between those AI systems and the ones that do everything from drive cars to make movie recommendations. The sci-fi bots would be classified as artificial general intelligence (AGI), the application of generalized AI to any domain (i.e., it mirrors human intelligence), while the AI programs of today would be classified as narrow AI (ANI). ANI is the application of AI to solve a very specific task, such as identifying faces in images or beating the best players in the world at chess or Go.
Narrow AI has advanced rapidly over the last decade thanks to huge increases in computing power and the exponentially increasing amount of data available to train these systems, but there is still a long way to go before computers are talking to us via our flying suits like Jarvis in the Marvel movies. That's not to say that AI isn't already revolutionizing the way we interact with the world. From self-driving cars to robotic process automation (RPA) streamlining processes at work, we're feeling the impact of AI in all aspects of our lives.
But what is it? How do computers know how to identify people in our pictures or how to properly code invoices?
Most conversations around AI today are focused on machine learning. ML is the subset of AI that uses algorithms that can learn from data without being explicitly programmed. In the broadest sense, the two types of ML are statistical learning and deep learning.
Statistical Learning
Statistical learning involves "inference": gaining knowledge, making predictions, making decisions, or constructing models from a set of data based on a statistical framework. Statistical learning is also sometimes called "supervised learning," supervised because it involves human interaction in labeling the data used to train the algorithms. The two types of supervised learning are regression and classification. Regression analysis is used to find relationships between a dependent variable and one or more independent variables. Regression analysis can solve data science problems like estimating home prices based on variables like the number of square feet, year built, zip code, and the number of bathrooms, or predicting ice cream sales based on the outside temperature.
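To make that concrete, here is a minimal sketch of a regression model in Python using the scikit-learn library. The square footages, years, bathroom counts, and prices are made-up numbers purely for illustration.

```python
# A toy regression example using scikit-learn (illustrative numbers only).
from sklearn.linear_model import LinearRegression

# Each row describes one home: [square feet, year built, number of bathrooms]
X = [
    [1400, 1995, 2],
    [2100, 2005, 3],
    [1750, 1988, 2],
    [3000, 2015, 4],
]
# Known sale prices: the "labels" a human supplied for each home above.
y = [235_000, 340_000, 280_000, 510_000]

model = LinearRegression()
model.fit(X, y)  # learn the relationship between the features and the price

# Predict the price of a home the model has never seen before.
print(model.predict([[2400, 2010, 3]]))
```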
The other type of statistical or supervised learning is classification. Instead of predicting a number, classification assigns data to categories, and classification algorithms power everything from facial recognition to natural language processing (NLP).
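A classification sketch looks almost identical to the regression one; the main difference is that the labels are categories instead of numbers. Again, the measurements and species labels below are toy values used only for illustration.

```python
# A toy classification example using scikit-learn (illustrative values only).
from sklearn.neighbors import KNeighborsClassifier

# Each row describes one flower: [petal length in cm, petal width in cm]
X = [[1.4, 0.2], [1.3, 0.2], [4.7, 1.4], [4.5, 1.5]]
# Labels a human supplied: which species each flower belongs to.
y = ["setosa", "setosa", "versicolor", "versicolor"]

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X, y)

# Classify a new, unlabeled flower based on the labeled examples.
print(model.predict([[4.6, 1.3]]))
```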
One of the advantages of statistical learning is that the process used by algorithms to arrive at a solution is clear and understood, at least by data scientists and mathematicians. Models can be trained and tuned to limit bias, prevent overfitting, and eliminate most bad results (assuming the model is built properly and well trained).
Deep Learning
Deep learning, on the other hand, is a little more difficult to understand. Deep learning is a subset of machine learning that attempts to mimic the way the human brain functions. For many tasks, it outperforms traditional machine learning approaches. Deep learning can be used to solve many of the same problems as statistical learning, but the algorithms work in a fundamentally different way.
Deep learning models, called artificial neural networks (ANNs), are patterned after the way human brains work in that they learn progressively by processing data rather than relying on task-based programming. Unlike statistical learning models, the way deep learning algorithms work is opaque (at best) to human observers. That's because neural networks "interpret" sensory data through a kind of machine perception. Artificial neural networks have been around since the 1960s, but it is only in the last decade or so that we've had enough computing power to make them work.
The simplest way to think of an ANN is to visualize a single node at first. Think of a single node in a neural network like a biological neuron: a cell body (as pictured below) with dendrites that act as antennae to collect inputs. When inputs are received, the neuron performs an activation function and outputs new information via a digital axon.
Anatomy of a Neuron
Just think of a neuron or a node as a section of code that fires when an input is received. The neuron takes the inputs it receives, multiplies each one by a weight, adds them together, and passes the result through an activation function to create an output.
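In code, a single artificial neuron can be sketched in a few lines. The weights and bias below are arbitrary placeholder values; in a real network they would be learned during training.

```python
# A minimal sketch of a single artificial neuron (weights chosen arbitrarily).
import math

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs, plus a bias term.
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Activation function (here a sigmoid) squashes the result into 0..1.
    return 1 / (1 + math.exp(-total))

# Three inputs arriving on the "dendrites".
print(neuron(inputs=[0.5, 0.8, 0.1], weights=[0.4, -0.2, 0.9], bias=0.1))
```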
In the most common type of ANN, a feed-forward network, the digital version of this construct is generally represented like this:
In a feed-forward network, data flows in a single direction without looping back. Inputs are presented to the model in the input layer and then passed to the hidden layer, where they are assigned weights. These weights assign significance to each input feature, and this is how a neural network begins to classify the data.
The data flows through the model, and if the output layer correctly classifies the data, it is given a value of 1. If the data is not correctly classified, it is given a value of 0. In model "tuning," the weights are adjusted to make certain features or parameters more influential until the model can correctly classify the data.
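Here is a minimal sketch of one pass of data through a tiny feed-forward network, with an input layer, one hidden layer, and an output layer. The weight values are arbitrary placeholders; training would adjust them until the outputs line up with the correct labels.

```python
# A minimal sketch of one forward pass through a tiny feed-forward network.
# The weight values here are arbitrary; training would adjust them.
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

inputs = np.array([0.5, 0.8, 0.1])             # input layer: 3 features

hidden_weights = np.array([[0.2, -0.4, 0.7],   # each row: one hidden neuron's weights
                           [0.5,  0.1, -0.3]])
hidden = sigmoid(hidden_weights @ inputs)      # hidden layer: 2 neurons

output_weights = np.array([[0.6, -0.8]])       # one output neuron
output = sigmoid(output_weights @ hidden)      # output layer: a score between 0 and 1

print(output)  # read as "class 1" if above 0.5, "class 0" otherwise
```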
While the above model only shows a single hidden layer, the "deep" in deep learning comes from the number of layers in a neural network. In real-world deep learning models there are many hidden layers, limited only by the amount of computational power available. Each layer of neurons trains on a distinct set of features based on the previous layer's output, with each layer discerning higher- and higher-level features.
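To see what "deep" looks like in practice, here is a sketch of a network with several hidden layers stacked on top of one another, written with the Keras API (this assumes TensorFlow is installed; the layer sizes are arbitrary choices for illustration).

```python
# A sketch of a "deep" network: the same idea as above, with several hidden layers.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(3,)),                  # input layer: 3 features
    layers.Dense(16, activation="relu"),      # hidden layer 1
    layers.Dense(16, activation="relu"),      # hidden layer 2
    layers.Dense(16, activation="relu"),      # hidden layer 3
    layers.Dense(1, activation="sigmoid"),    # output layer: a single score
])

model.summary()  # prints the layer stack and how many weights it must learn
```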
As is evident in the graphic above, hidden layers are able to make sense of intermediate representations that are not immediately clear to human viewers. For this reason, a lot of the learning that goes on in the hidden layers is considered a "black box" of computation. Neural networks are not programmed to detect specific items at each level of the feature hierarchy; the model is choosing those features on its own.
Before getting bogged down in backpropagation, gradient descent, loss functions, and all of the calculus that goes on behind the scenes (we can save that for another day), let's turn our focus to why neural networks are so valuable. With today's available computing power, ANNs are capable of identifying correlations within massive amounts of raw data (i.e., unstructured and unlabeled data). This is important because, with the volume, variety, and velocity of data available today, they are the best way to classify and organize raw data into something usable and useful.
Like humans, neural networks can model non-linear and complex relationships by building on previous knowledge using live data. They are able to cluster and classify huge amounts of data, and even have some level of built-in fault tolerance. For this reason, deep learning is the go-to methodology for time-sensitive activities like credit card fraud detection, robotics, hedge fund analytics, and video analytics and surveillance.
The application of deep learning and other machine learning algorithms has become an integral part of so many aspects of our lives, from gaming to medical diagnoses. We see AI at work in natural language processing (NLP) that lets computers interact with humans in a more natural way, and in intelligent robots that can react to sensory perceptions of the physical world like heat, light, sound, and movement. We see AI in chatbots that answer customer support questions and in other expert systems that process data and make recommendations.
As the amount of available data and computing power continue to increase, we can expect the explosion of machine learning to continue. With the largest technology companies in the world (Facebook, Amazon, Apple, Netflix, and Google) pouring billions into research and development of these tools, it is easy to imagine the pace of advancement will only accelerate from here.
It is not important that we all become computer engineers, but it is important that we commit to a base understanding of the technology so that we don't get left behind.