AI practitioner John Spooner of H2O says the sector needs to own up to its core issue and deal with it as a priority
It’s hardly news that there are serious concerns about bias in Artificial Intelligence (AI): recruiting software that screens out women applying for programming roles, police systems that appear to be racially biased; the list, unfortunately, goes on.
But be assured: the AI sector knows this. And while we may not have a silver bullet that will resolve the issue tomorrow, we do have practical ideas about how to address the bias monster, and move to what we all want: fully ‘responsible’ AI.
We do all need to be fair here and start by accepting that there is going to be bias in AI models. The underlying challenge is that the data you feed into AI systems is a representation of reality, and so machine learning inherits bias. Why? Because these models are just a representation of the world, and there is bias inherent in the world.
Bias: too endemic to solve?
We don’t need to get into all the psychology, philosophy and sociology of this, but humans are imperfect creatures who don’t always use pure logic to make sense of the world around them. Bias has always been present in every decision we make. But if bias is endemic, is it too big a problem to solve? Absolutely not; there are things you can do to address it. We are largely putting it on machine learning and data science practitioners to fix this, and we are rising to the challenge, though I suspect the end point will be some form of regulation. However we choose to progress, the first step is always to educate ourselves that bias exists in different forms.
So how do we deal with it? How do we uncover the bias and build a framework around it that enables people to trust these systems and the results that emerge from them? If we think about what machine learning is ultimately trying to do, it is trying to create systems that learn through experience, and the only way it can do that is by building on data.
What that obliges us to do, before we build machine learning models and incorporate them into AI systems, is to think about the fact that bias will exist within this process. So how do we make sure we are putting things in place to prevent bias from running systematically through the whole machine learning pipeline? At present, most of the time these models are optimised for accuracy. Data scientists try to squeeze out the extra percentage point, but by focusing on accuracy they forget about optimising for fairness.
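To make that contrast concrete, here is a minimal sketch in Python; the model names, scores and the 0.05 tolerance are invented for illustration, and simply show how picking a model on accuracy alone can differ from picking one with a fairness constraint in the mix.

```python
# Candidate models scored on held-out data. The fairness gap here is the
# absolute difference in approval rates between two groups; the numbers and
# model names are hypothetical.
candidates = {
    "model_a": {"accuracy": 0.91, "fairness_gap": 0.22},
    "model_b": {"accuracy": 0.89, "fairness_gap": 0.04},
    "model_c": {"accuracy": 0.86, "fairness_gap": 0.02},
}

# Optimising for accuracy alone picks model_a, despite its large fairness gap.
best_by_accuracy = max(candidates, key=lambda m: candidates[m]["accuracy"])

# Optimising for accuracy subject to a fairness constraint picks model_b:
# nearly as accurate, far more even-handed. The 0.05 tolerance is arbitrary.
acceptable = {m: s for m, s in candidates.items() if s["fairness_gap"] <= 0.05}
best_responsible = max(acceptable, key=lambda m: acceptable[m]["accuracy"])

print(best_by_accuracy, best_responsible)  # model_a model_b
```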
So the challenge for data scientists is to make sure that when they build those machine learning models, the data is clean, accurate and free from any bias that could skew results. Sometimes machine learning algorithms can very quickly surface the biases that already exist in us and in the world, but we always need to be careful not to introduce even more bias when we select data, or decide which types of data to collect.
Bias also very often creeps in during the actual model-building process. The key here is to document all of the steps taken to collect and select the data, and to check for bias in the data we put into those models. Practically speaking, that means everyone building AI, whether vendor, business person, open-source developer, government organisation or citizen, takes the right steps to ensure that bias does not leak into the decisioning of these machine learning platforms.
The need to broaden data science diversity
The next thing is to make sure we are exploring this from all angles: knowing where to look for bias. One of the big challenges we have in data science is the kind of person who typically works in IT, i.e. male and nerdy, as the stereotype goes. So we really should broaden the diversity of the people working in data science wherever we can, because with more voices on the team, we are better able to work out where bias could exist in the decision-making process.
We also need to broaden the diversity of the data we collect and analyse. We tend to jump straight in and analyse a subset of data, when we should be asking how we could create a wider selection of data for the machine learning model to learn from. To be honest, I think many organisations go straight to ‘let’s build the most accurate machine learning model possible: we’ve got some data, let’s build a model and push up the accuracy’, rather than asking: what are we trying to solve here? What business decisions will we be making on the back of this machine learning model, and what data will feed it? Let’s look at that data first and see whether there are any problems with it before we build the model. I don’t think we spend enough time doing that right now, but we absolutely can and should. Ideally, you need to ensure the data genuinely represents the relevant characteristics, as far as is practical.
So we have some simple steps for identifying bias and creating the building blocks of Responsible AI. But can this process be automated? Not fully, as yet. What you can automate is checking the quality of the data across a number of different dimensions: for example, checking that you have a representative sample of males and females, or a representative sample across all of the protected characteristics you care about.
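As a rough illustration of the kind of check that can be automated, here is a minimal Python/pandas sketch; the column names, data and 30% threshold are hypothetical, not part of any particular product.

```python
import pandas as pd

def check_representation(df: pd.DataFrame,
                         protected_columns: list[str],
                         min_share: float = 0.30) -> dict:
    """Flag protected groups whose share of the dataset falls below min_share.

    A deliberately simple proxy for 'representative': every group in each
    protected column should account for at least min_share of the rows.
    """
    findings = {}
    for col in protected_columns:
        shares = df[col].value_counts(normalize=True)
        under_represented = shares[shares < min_share]
        if not under_represented.empty:
            findings[col] = under_represented.to_dict()
    return findings

# Hypothetical applicant dataset with a 'gender' column; the threshold is arbitrary.
applicants = pd.DataFrame({
    "gender": ["male"] * 800 + ["female"] * 200,
    "approved": [1] * 500 + [0] * 300 + [1] * 80 + [0] * 120,
})
print(check_representation(applicants, ["gender"]))
# {'gender': {'female': 0.2}}  -> women are under-represented in this sample
```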
One blockage here is that there is no central standard for what de-biased data should look like, but there is no reason at all we could not move toward such a standard. We do also now have tools that can help measure how fair a model is, and whether it is biased toward or against particular groups, via something called disparate impact analysis. This works by selecting protected characteristics and checking whether the model behaves similarly across the groups they define: if you look at gender, for example, is the accuracy of the model the same for males and females?
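To sketch what a disparate impact check can look like in practice, here is a generic Python illustration, not any specific vendor’s implementation; the toy data is made up, and the ‘four-fifths’ threshold mentioned in the comment is a common convention rather than a rule of this article.

```python
import numpy as np

def disparate_impact_report(y_true, y_pred, groups, reference_group):
    """Compare positive-prediction rates and accuracy across protected groups.

    The 'four-fifths rule' commonly flags an impact ratio below 0.8 as a sign
    of potential disparate impact; that threshold is a convention, not a law.
    """
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    ref_positive_rate = y_pred[groups == reference_group].mean()

    report = {}
    for g in np.unique(groups):
        mask = groups == g
        positive_rate = y_pred[mask].mean()
        report[g] = {
            "positive_rate": positive_rate,
            "accuracy": (y_pred[mask] == y_true[mask]).mean(),
            "impact_ratio_vs_reference": positive_rate / ref_positive_rate,
        }
    return report

# Toy example: does the model approve (1) men and women at similar rates,
# and is it equally accurate for both groups?
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
groups = ["m", "m", "m", "m", "f", "f", "f", "f"]
print(disparate_impact_report(y_true, y_pred, groups, reference_group="m"))
```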
It also needs to be said that models are a little bit like fish; they go off easily! If you don’t constantly monitor, review and rebuild those machine learning models, bias will creep into the decisioning frameworks. So a very good safeguard against bias is to have a governance process around building and maintaining models.
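At its core, that governance process can be as simple as recording the metrics a model had when it was approved and flagging when live performance or fairness drifts away from them. A minimal sketch, with hypothetical thresholds and names:

```python
from dataclasses import dataclass

@dataclass
class ModelHealthCheck:
    """Illustrative drift check: compare live metrics against those recorded
    when the model was approved, and flag a review/rebuild when they slip."""
    baseline_accuracy: float
    baseline_impact_ratio: float
    max_accuracy_drop: float = 0.05   # arbitrary tolerance
    min_impact_ratio: float = 0.8     # 'four-fifths' convention

    def review(self, live_accuracy: float, live_impact_ratio: float) -> list[str]:
        issues = []
        if self.baseline_accuracy - live_accuracy > self.max_accuracy_drop:
            issues.append("accuracy has drifted: review and retrain the model")
        if live_impact_ratio < self.min_impact_ratio:
            issues.append("fairness has degraded: investigate bias before redeploying")
        return issues

check = ModelHealthCheck(baseline_accuracy=0.91, baseline_impact_ratio=0.95)
print(check.review(live_accuracy=0.84, live_impact_ratio=0.74))
# flags both the accuracy drop and the fairness degradation
```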
My company is already doing this, by the way. We are working with a number of financial services organisations on this issue, because they are very focused on eliminating bias. For example, US card issuer Discover Financial Services is using our technology to speed up the process of checking its models, and can now create measures that break down individual machine learning predictions into their components. Very sophisticated machine learning algorithms work out which customers to accept or decline, but the brand is also able to give detailed explanations for its decisions. That means that for every single credit application it accepts or declines, the company can give the individual the specific reasons why their application was approved or declined.
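The article doesn’t spell out the mechanics behind that breakdown, but the general idea of decomposing a prediction into per-feature contributions is easiest to see with a linear model, where the decomposition is exact. The sketch below uses made-up credit features and is purely illustrative; non-linear models typically need Shapley-value style methods to get an analogous breakdown.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features for a credit decision; the data is synthetic.
feature_names = ["income", "debt_ratio", "years_at_address"]
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] - X[:, 1] + 0.3 * rng.normal(size=500) > 0).astype(int)

model = LogisticRegression().fit(X, y)

def explain_decision(x):
    """Break one prediction's log-odds into per-feature contributions.

    For a linear model the decomposition is exact:
    log-odds = intercept + sum(coef_i * x_i).
    """
    contributions = model.coef_[0] * x
    return dict(zip(feature_names, contributions)), model.intercept_[0]

applicant = X[0]
per_feature, base = explain_decision(applicant)
decision = "approved" if model.predict([applicant])[0] == 1 else "declined"
print(f"Decision: {decision}")
for name, value in sorted(per_feature.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"  {name}: {value:+.2f} towards approval")
```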
‘Every company I speak to is genuinely very conscious about gender or racial bias’
Summing up, I think AI does need to address this core concern about bias, but it also needs to be acknowledged that AI developers are aware of the issue. Brands are increasingly open about using machine learning technology to speed up a process, and they are very conscious that machine learning models have a reputation for being opaque black boxes, so they are trying to open them up. And every company I speak to is genuinely conscious that it cannot have machine learning-based decisioning processes open to any gender or racial bias, and they are, more and more, making sure they check those decisions for as much bias as they can catch.
And let’s be honest: from a commercial perspective, the worst thing that could happen is reputational damage. If you become known in the marketplace for automating decisions that fail to offer the right products to people from a particular background, that will come back and bite you.
After all, the bias monster has pretty sharp teeth, and you want to stay well clear of them. So let’s put it in the cage where it belongs, for everyone’s benefit.
These topics are discussed in more detail in a special white paper which is accessible here: