In the last few decades, Artificial Intelligence (AI) has gone from science fiction to business-critical technology. AI now sits at the heart of many of our daily activities: designing the chemicals that flavour your breakfast, checking if your credit card transaction is legitimate, deciding what to display next in your Netflix queue, and even ensuring your train to work arrives on time. Yet despite the prevalence of AI, we’re still a long way from a world in which AI ethics is well understood.
In theory, AI promises to automate complex systems to be safer, fairer, and better able to operate at scale. But reaching widespread adoption requires trust based on ethical design and transparency. Without transparency into the data as well as the ethical safeguards used to create AI systems, we won’t achieve the kind of accountability we expect in other aspects of our common life (e.g. consumer rights, non-discrimination, etc.).
Ethical Design
The past few years have seen growing momentum among public, private, and government bodies to create guidelines for AI systems that better align with cultural values. Expert international bodies have put forward high-level ethical principles, including human rights, data agency, transparency, and accountability. The EU AI Act will come into force later this year as the first comprehensive regulatory framework for AI anywhere in the world.
The US has also undertaken efforts to create guidelines for AI use through the National Institute of Standards and Technology (NIST), bringing together stakeholders across the country to debate and comment upon its draft Artificial Intelligence Risk Management Framework (AI RMF). Neo4j submitted comments on the need to focus risk management on the prevention of human harm, calling for NIST to recognise ethical principles as the foundation of AI risk management.
We argued that the ethical principles proposed by NIST—fairness, transparency, and accountability—should be embedded in AI at the design stage. These principles must be defined in operational terms according to the AI use case and implementation context. For example, fairness can be defined as group fairness (equal representation of groups) or procedural fairness (treating every individual the same way). Ethical principles like fairness have different technical implications depending on the goal of the AI application and the nature of the risk involved.
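To make this concrete, here is a minimal sketch of operationalising group fairness as a demographic parity check. The decision data, group names, and threshold-free reporting are all illustrative assumptions, not drawn from any real system; the point is simply that "fairness" becomes a measurable quantity once a definition is chosen.

```python
# Minimal sketch: operationalising "group fairness" as demographic parity.
# The decisions below are made up for illustration (1 = approved, 0 = denied).

def selection_rate(outcomes):
    """Share of positive decisions in a list of 0/1 outcomes."""
    return sum(outcomes) / len(outcomes)

def demographic_parity_gap(outcomes_by_group):
    """Largest difference in selection rates across groups.

    A gap near 0 suggests the groups are treated similarly in aggregate;
    what counts as acceptable depends on the use case and the risk involved.
    """
    rates = [selection_rate(o) for o in outcomes_by_group.values()]
    return max(rates) - min(rates)

# Illustrative decisions from a hypothetical credit model, split by group.
decisions = {
    "group_a": [1, 1, 0, 1, 1, 0, 1, 1],
    "group_b": [1, 0, 0, 1, 0, 0, 1, 0],
}

print(f"Demographic parity gap: {demographic_parity_gap(decisions):.2f}")
```

A procedural-fairness definition would instead check that the same rules and inputs are applied to every individual, which calls for a different test entirely; that is precisely why the principle has to be pinned down per use case.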
Being transparent about the ethical principles used in design is an important step in fostering public trust in AI. One can't help but wonder whether many of today's biggest AI scandals might have been avoided with a little more thought around ethical guardrails, because AI is only as good as the data you feed it.
Accountability
People often talk about the need for accountability with AI given the inherent difficulties of seeking recourse for decisions made by a machine. But achieving any kind of meaningful accountability for AI is impossible until we have a deep understanding of our data. Data lineage is an unsexy term for what I’d call one of the most pressing issues in modern data and analytics. Without knowing where data has come from and when, who has changed it and how, we can’t develop accountability mechanisms. Responsible AI means having practices in place to ensure that we can examine the data, including how it was selected and what was omitted.
If an opaque AI system has been used to make significant decisions, we cannot unpack the causes behind a course of action without carefully documented data lineage. Knowledge graphs excel at storing the history of data over time given their flexible, open-ended structure. By documenting the processes of data collection with the data itself in a knowledge graph, organisations can scale their data acquisition efforts with the assurance that they’ll be able to adapt to new (and likely more stringent) compliance requirements. They’ll also have the ability to go back into the data to answer questions, tracing it in time to the raw data, its source, and its transformation.
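As a rough illustration of what that looks like in practice, the sketch below records lineage as a small property graph and then walks back from a derived dataset to its raw data and source. It assumes the official neo4j Python driver, a local instance with placeholder credentials, and an illustrative schema (the Source, Dataset, and Transformation labels and relationship types are my own, not a prescribed model).

```python
# A minimal sketch of recording and tracing data lineage in a property graph,
# using the official neo4j Python driver. Connection details, labels, and
# relationship types are illustrative assumptions.
from neo4j import GraphDatabase

URI = "bolt://localhost:7687"   # assumed local instance
AUTH = ("neo4j", "password")    # placeholder credentials

RECORD_LINEAGE = """
MERGE (src:Source {name: $source})
MERGE (raw:Dataset {name: $raw_name})
MERGE (out:Dataset {name: $out_name})
MERGE (t:Transformation {name: $transform, run_at: $run_at})
MERGE (raw)-[:EXTRACTED_FROM]->(src)
MERGE (t)-[:READS]->(raw)
MERGE (out)-[:PRODUCED_BY]->(t)
"""

TRACE_LINEAGE = """
MATCH path = (d:Dataset {name: $name})-[:PRODUCED_BY|READS|EXTRACTED_FROM*]->(origin)
RETURN [n IN nodes(path) | labels(n)[0] + ': ' + n.name] AS lineage
"""

with GraphDatabase.driver(URI, auth=AUTH) as driver:
    with driver.session() as session:
        # Document how a training table was derived from its raw source.
        session.run(
            RECORD_LINEAGE,
            source="claims_db",
            raw_name="claims_raw_2024",
            out_name="claims_training_v1",
            transform="deduplicate_and_anonymise",
            run_at="2024-05-01T09:00:00Z",
        )
        # Walk back from the training table to the raw data and its source.
        for record in session.run(TRACE_LINEAGE, name="claims_training_v1"):
            print(" -> ".join(record["lineage"]))
```

Because the lineage lives alongside the data itself, new compliance questions can be answered later by adding queries rather than re-engineering the pipeline.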
While it's not always straightforward to know exactly which data was used to reach an AI output, tracking data lineage makes it possible to go backward at all, a vital step towards explainability.
Transparent Data Context
One of the greatest challenges in training AI is having enough relevant, representative data. And as the COVID-19 crisis has amply demonstrated, historical data can quickly become irrelevant even when you have enough to train a model. Given these difficulties, it’s clear that we need a new paradigm.
When AI must operate within new and changing situations, it needs to integrate context the way the human mind can. Yet current methods often fail to incorporate data context—the relationships between data—missing out on powerful predictive information.
Graph theory was developed to represent network structures, and it gives us a way to analyse the relationships between data points. Graph databases retain data with all the existing relationships intact, providing valuable context for ML models. Data relationships should be thought of as an entirely new class of data: one that is not only more predictive but also decreases our dependence on historical data. It's a new class of data not because it didn't exist before, but because traditional databases do not capture it.
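One common way to put those relationships to work is to derive structural features from the graph and feed them to a model alongside conventional attributes. The sketch below assumes a hypothetical account graph and the networkx library; the specific features (degree, PageRank, triangle counts) are illustrative choices, not a recommended recipe.

```python
# A small sketch of turning relationships into model features.
import networkx as nx

# Accounts connected by shared devices or transactions (made-up edge list).
edges = [
    ("acct_1", "acct_2"), ("acct_2", "acct_3"), ("acct_3", "acct_1"),
    ("acct_4", "acct_5"), ("acct_1", "acct_6"),
]
G = nx.Graph(edges)

# Relationship-derived features a downstream classifier could use
# alongside (or instead of) purely historical attributes.
pagerank = nx.pagerank(G)
degree = dict(G.degree())
triangles = nx.triangles(G)

features = {
    node: {
        "degree": degree[node],        # how connected the account is
        "pagerank": pagerank[node],    # influence within the network
        "triangles": triangles[node],  # membership in tightly-knit clusters
    }
    for node in G.nodes()
}

for node, feats in features.items():
    print(node, feats)
```

Features like these describe how an entity is connected right now, which is part of why relationship data can reduce reliance on long histories of past behaviour.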
Relying on models that learn from statistical patterns in historical data often leads to situations in which AI makes improper decisions based on demographic attributes. Ethically and legally, this is a grave problem. Adding the context of data relationships helps us create more responsible AI by giving the model more relevant information to take into account. We don’t want AI to learn from the biases in historical data, so let’s give it the context of that data. Within an effective graph data model, this context will yield new insights into sources of bias in the dataset. While I do not posit graphs as a panacea for such a complex issue, I see a need for further research to explore the ethical potential of context-driven AI.
We're already seeing organisations improve their capacity to serve customers ethically because they can store context in a graph. Qualicorp, a leading healthcare benefits administrator in Brazil, created a graph-based AI application that helps them avoid omitting information when advising customers in need of an insurance policy. Providing the right information at the right time is an ethical responsibility, but it's also a challenge given the thousands of products, rules and conditions, and health histories involved in this use case. The context provided by graphs makes it easier to treat people as individuals with many characteristics, rather than as groups handled in a similar way.
To sum up: as fostering trust in AI becomes an ever-more pressing concern, we need to embed ethical principles in our systems while keeping to a high standard of transparency. With graphs, we can supply AI systems with the context necessary to enable more appropriate decisions and understand the data used to arrive at those decisions.