Capturing valuable business data is one thing; analysing it using machine learning (ML) technology is another. In the coming years we will see the democratisation of ML, whereby tools such as AutoML will enable organisations to rapidly adopt automation and tap into real-time data. Organisations of all sizes will be able to reap the benefits of automation more cost-effectively, and without the need for as many specialised data scientists. As a result, more businesses will be able to build applications on top of cloud data platforms to drive digital transformation, improve customer experiences, and generate faster data insights.
However, for all the promise AutoML holds, organisations must remain acutely aware of the need to eliminate any potential biases encoded in their ML algorithms, and must encourage an ethical data science environment to ensure effective and accurate data insights.
Evidence suggests that ML bias can affect business revenue, with one in three companies suffering financial losses due to bias in one or more of their algorithms. Tackling such bias requires companies to address both the technological and human issues that cause it. The challenge, therefore, is to build a team that can look at not only the algorithms, but also the data, conclusions, and results, in an equitable and fair-minded way.
Use diverse datasets
To help overcome the challenge of bias, AutoML algorithms and tools must have access to more diverse and expansive datasets. Data can be structurally biased: if it doesn’t accurately represent a model’s use case, a machine learning algorithm analysing it will produce skewed results. When examining the risk of bias in machine learning, companies must first ask themselves: are we using a broad enough set of data that we’re not presupposing the outcome? If the answer is no, then IT and data teams should widen their net to ensure the data captured represents a comprehensive cross-section of the entire business and so provides the most equitable results.
In addition to ingesting a vast range of data from within a company, organisations can leverage third-party data from data marketplaces, enabling them to build more sophisticated AutoML models. Because this data comes from outside the organisation, including from competitors and the wider marketplace, it reduces the risk of bias within the models themselves. The development of successful models using AutoML will also see organisations sharing and monetising these models, as part of a more collaborative data sharing ecosystem.
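The question of whether a dataset is broad enough can be made concrete with a simple representation check. The sketch below is illustrative only: the `region` attribute, the reference shares, the 5-point tolerance and the function names are all assumptions made for the example, not part of any AutoML product. It compares each group’s share of the training data with the share expected from a reference population and flags groups that fall short.

```python
from collections import Counter


def representation_gaps(records, attribute, reference_shares):
    """Return (observed share - expected share) for each reference group."""
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("records is empty")
    return {
        group: counts.get(group, 0) / total - expected
        for group, expected in reference_shares.items()
    }


def under_represented(gaps, tolerance=0.05):
    """List groups whose observed share trails the reference by more than tolerance."""
    return sorted(group for group, gap in gaps.items() if gap < -tolerance)


# Hypothetical toy data: 100 training records tagged with a customer region.
training_records = (
    [{"region": "north"}] * 60
    + [{"region": "south"}] * 30
    + [{"region": "west"}] * 10
)
# Assumed share of each region in the population the model will serve.
reference = {"north": 0.5, "south": 0.3, "west": 0.2}

gaps = representation_gaps(training_records, "region", reference)
flagged = under_represented(gaps)
print(flagged)
```

A check like this only covers attributes the team thinks to measure, so it supports the question above rather than answering it outright.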
Remove coded bias in algorithms
Once a broad, diverse set of data has been established, companies must then confront the issue of potential bias in the algorithm itself. How an algorithm is coded depends on the actions and thought process of the person doing the coding, meaning that it is susceptible to bias depending on who wrote it.
This is why business leaders should consider the impact that diversity in the workforce has on ML algorithms. This covers all the dimensions of diversity, including experience, socio-economic background, ethnicity and gender. It’s not just one factor; it’s multidimensional, like so many things in analytics. If ML algorithms are created without diversity in mind, businesses risk skewing data and producing results that don’t reflect diverse perspectives. Diversifying the workforce is a significant step towards ethical ML and data analytics.
Establish an ethics council
When it comes to tackling bias in machine learning, there are both technology and human issues to factor in. Diversifying the workforce is a good first step towards ethical data analytics; however, companies wishing to make the biggest change can go further by establishing a dedicated ethics council or ethics board. This is a group of people who examine outputs and processes, and ensure there is a balance of data and values.
An ethics council can also invest in initiatives such as an AI and ML risk framework, an evaluation framework, and an ethics programme in which people are actively engaged in resolving issues of bias. AI and ML evaluation frameworks help ensure that algorithms, data, conclusions, and results are produced equitably. Through these robust measures, AutoML can flourish, establishing automated data insights that are fair, accurate and reliable.
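As one illustration of what an evaluation framework might measure, the sketch below computes the demographic parity difference, a commonly used fairness metric: the gap between the highest and lowest rate of positive predictions across groups. This is an assumption about what such a framework could include, not a description of any particular council’s method. The group labels, predictions and function names are hypothetical.

```python
def selection_rates(predictions, groups):
    """Return the share of positive (1) predictions for each group."""
    totals, positives = {}, {}
    for prediction, group in zip(predictions, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if prediction == 1 else 0)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest group selection rates (0 means parity)."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical model outputs (1 = approved) for two groups of four people each.
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = selection_rates(predictions, groups)
gap = demographic_parity_difference(predictions, groups)
print(rates, gap)
```

A large gap does not prove a model is unfair, and a small one does not prove it is fair; the figure is a prompt for the kind of human review the council exists to provide.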
An equitable future
Discussing the ethics of AutoML models is integral for any company that wants to do the right thing and use these technologies in the most equitable way. However, discussion for the sake of discussion won’t yield results. Companies should reflect on how they currently use these tools and define tangible actions that will help bring about ethical practices when it comes to AutoML.
Organisations that are serious about eliminating bias encoded in their ML algorithms must take a multi-layered approach by broadening their datasets, diversifying their workforce and forming a dedicated team of people whose function it is to remove any potential bias in the process of collecting and analysing data. Only then will businesses make the most of their data by creating an ethical data science environment that delivers accurate and fair insights.