Artificial Intelligence, Machine Learning, and Deep Learning: Addressing the Bias
The title of a blog post such as this, from a company that is offering certifications on AI, might be expected to include something upbeat: “the potential of AI,” “what AI can do for you,” and so on. A lot is being written about that potential. But the truth is, there’s almost no need to promote the benefits, which are quite obvious. Less obvious perhaps, and a point of urgent concern, is the potential that this cluster of technologies has to put us, or some of us, at a disadvantage. And the question of who that “some” will be is related to the bias that some AI output has shown from the start.
As the introduction to the ISO/IEC TR 24027:2021 standard notes1, bias in AI can turn up in different ways: it can “be introduced as a result of structural deficiencies in system design, arise from human cognitive bias held by stakeholders or be inherent in the datasets used to train models. That means that AI systems can perpetuate or augment existing bias or create new bias.”
Both the human-like aspects of the bias that systems show and the failed attempts to have them emulate human decision-making deserve our urgent attention. But perhaps surprisingly, given how much scrutiny AI has been coming in for, not a lot has been written about the explosive combination of the two kinds, even in work that refers to both of them.
AI is the most general of three terms we’ll refer to here. Within it, machine learning (ML) is “based around the idea that we should really just be able to give machines access to data and let them learn for themselves.” But machine learning isn’t just a subdiscipline within AI: it is, as Bernard Marr puts it, “the current state-of-the-art—...the field of AI which today is showing the most promise at providing tools that industry and society can use to drive change.”
Within machine learning there is the further specialization, deep learning, which Marr invites us to think of “as the cutting-edge of the cutting-edge:”2
"ML takes some of the core ideas of AI and focuses them on solving real-world problems with neural networks designed to mimic our own decision-making. Deep Learning focuses even more narrowly on a subset of ML tools and techniques, and applies them to solving just about any problem which requires “thought”—human or artificial."
Marr goes on to list some of the “impressive applications” that are coming out of this sub-sub-set of AI, including self-driving cars, which “are learning to recognize obstacles and react to them appropriately,” and the ability to “correctly predict a court’s decision, when fed the basic facts of the case.”
For whatever reason, though, Marr steers clear of the entire question of bias, of either kind—the “successful-emulation” kind or the dumb-machine-after-all kind. For the latter, we only have to look at cases in which self-driving cars carrying back-up drivers have been involved in accidents, in one case killing a pedestrian.3
In those cases, drivers are performing the same function as the cars. Thus, while the technology is advanced, how the error arises is not that complex: the algorithms in the object-detection (pedestrian-detection) function didn’t work as intended, and the back-up driver wasn’t paying attention. So the failsafe mechanism failed all the same.
In other cases, by contrast, human and machine compound each other’s errors in a kind of negative, blind-leading-the-blind synergy in which errors start off bad and get even worse.
To see how this can happen, let’s consider what kinds of bias there are. First, there’s “pre-existing bias in an algorithm,” which is “a consequence of underlying social and institutional ideologies.”4 “Technical bias” is the dumb-machine kind, which “emerges through limitations of a program, computational power, its design, or other constraints on the system,”5 while “emergent bias” is “the result of the use and reliance on algorithms across new or unanticipated contexts.”6
To see these errors in action, let’s look at an example:
Sgt. Charles Coleman popped out of his police SUV and scanned a trash-strewn street popular with the city’s homeless, responding to a crime that hadn’t yet happened.
It wasn’t a 911 call that brought the Los Angeles Police Department officer to this spot, but a whirring computer crunching years of crime data to arrive at a prediction: An auto theft or burglary would probably occur near here on this particular morning.
...Soon, Coleman was back in his SUV on his way to fight the next pre-crime.7
The “whirring computer” is from PredPol, a predictive-policing system that is used in “more than 60...departments across the country” in addition to Los Angeles, “making it the nation’s most popular predictive-policing system.”
Simulations of PredPol in Oakland, California “suggested an increased police presence in black neighborhoods based on crime data reported by the public”:
The simulation showed that the public reported crime based on the sight of police cars, regardless of what police were doing. The simulation interpreted police car sightings in modeling its predictions of crime, and would in turn assign an even larger increase of police presence within those neighborhoods.
The initial error was the confusion of a police presence with actual crime—an error that the system compounded by assigning more police to the area. Not surprisingly, then:
There are widespread fears among civil liberties advocates that predictive policing will actually worsen relations between police departments and black communities. “It’s a vicious cycle,” said John Chasnoff, program director of the ACLU chapter for Eastern Missouri. “The police say, ‘We’ve gotta send more guys to North County...because there have been more arrests there,’ and then you end up with even more arrests, compounding the racial problem.”8
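The feedback loop described above can be made concrete with a few lines of Python. What follows is only a toy sketch under loudly stated assumptions: it is not PredPol's algorithm and not the Oakland simulation. The function name `simulate_feedback`, the `reports_per_patrol` parameter, and the rule of moving one patrol per round toward the area with the most reports are all invented for illustration. The point it shows is that two areas with identical underlying crime can drift far apart once police presence itself generates reports.

```python
def simulate_feedback(true_crime, patrols, reports_per_patrol=2, rounds=5):
    """Toy model of a self-reinforcing patrol allocation (hypothetical).

    Reports in each area = underlying crime + reports triggered merely by
    the sight of patrols. Each round, one patrol is moved from the area
    with the fewest reports to the area with the most.
    """
    patrols = list(patrols)
    history = [tuple(patrols)]
    for _ in range(rounds):
        # The system cannot tell crime-driven reports from sighting-driven ones.
        reports = [c + reports_per_patrol * p
                   for c, p in zip(true_crime, patrols)]
        hot = max(range(len(reports)), key=reports.__getitem__)
        cold = min(range(len(reports)), key=reports.__getitem__)
        # Shift one patrol toward the apparent hot spot, if there is one.
        if hot != cold and patrols[cold] > 0:
            patrols[cold] -= 1
            patrols[hot] += 1
        history.append(tuple(patrols))
    return history


# Both areas have the same underlying crime; only the starting patrols differ.
for step, allocation in enumerate(simulate_feedback([10, 10], [6, 4], rounds=4)):
    print(step, allocation)
```

In this sketch the small initial imbalance of six patrols to four ends with all ten patrols in one area, even though the underlying crime never differed. Starting from an even split, nothing moves, which suggests the imbalance comes from the loop rather than from the neighborhoods.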
We are thus witnessing a clash between potential advances for humanity and the undoing of those advances as a result of bias. That clash is framed nicely in what is known as the Toronto Declaration on machine learning, which begins:
As machine learning systems advance in capability and increase in use, we must examine the impact of this technology on human rights. We acknowledge the potential for machine learning and related systems to be used to promote human rights, but are increasingly concerned about the capability of such systems to facilitate intentional or inadvertent discrimination against certain individuals or groups of people. We must urgently address how these technologies will affect people and their rights. In a world of machine learning systems, who will bear accountability for harming human rights?9
Examples abound of efforts to ensure this accountability. The summary to AI Now’s 2019 report (the most recent it has put out) gives a brief overview of initiatives on this front:
From tenant rights groups opposing facial recognition in housing to Latinx activists and students protesting lucrative tech company contracts with military and border agencies, this year we saw community groups, researchers, and workers demand a halt to risky and dangerous AI technologies.10
The conclusion to the report itself notes that “urgent concerns remain, and the agenda of issues to be addressed continues to grow: the environmental harms caused by AI systems are considerable, from extraction of materials from our earth to the extraction of labor from our communities. In healthcare, increasing dependence on AI systems will have life-or-death consequences.”11
The pushback and the concerns are being complemented by searches for solutions to what has been called “algorithmic authority,” including having an “AI audit, in which the auditor is an algorithm that systematically probes the original machine-learning model to identify biases in both the model and the training data.”
The concept of an AI audit is the brainchild of James Zou and Londa Schiebinger, who give an example of what their AI auditing tool can do when applied to word embedding, a widely used technique that represents words as numerical vectors:
It captures analogy relations, such as “man” is to “king” as “woman” is to “queen”. We developed an algorithm—the AI auditor—to query the word embedding for other gender analogies. This has revealed that “man” is to “doctor” as “woman” is to “nurse”, and that “man” is to “computer programmer” as “woman” is to “homemaker.”12
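To give a feel for what querying a word embedding for analogies involves, here is a minimal sketch. It is not Zou and Schiebinger's auditor, and the three-dimensional vectors in `TOY_EMBEDDING` are invented by hand so that the example stays readable (real embeddings are learned from large text corpora and have hundreds of dimensions). The helper names `cosine` and `analogy` are likewise assumptions. The sketch uses the common vector-offset approach: to complete "a is to b as c is to ?", it looks for the word whose vector is closest to b - a + c.

```python
import math

# Hand-made toy vectors (hypothetical): axes are roughly
# (gender, royalty, medicine). "doctor" and "nurse" are deliberately
# given a gender component to mimic bias absorbed from training text.
TOY_EMBEDDING = {
    "man":    (1.0, 0.0, 0.0),
    "woman":  (-1.0, 0.0, 0.0),
    "king":   (1.0, 1.0, 0.0),
    "queen":  (-1.0, 1.0, 0.0),
    "doctor": (0.5, 0.0, 1.0),
    "nurse":  (-0.5, 0.0, 1.0),
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, embedding=TOY_EMBEDDING):
    """Complete 'a is to b as c is to ?' using the offset b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(embedding[a], embedding[b], embedding[c])]
    # Exclude the three query words, then pick the most similar remaining word.
    candidates = [w for w in embedding if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, embedding[w]))


print(analogy("man", "king", "woman"))
print(analogy("man", "doctor", "woman"))
```

With these invented vectors, the first query returns "queen" and the second returns "nurse". An audit in this spirit would run many such queries against a real embedding and flag the completions that encode stereotypes.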
The authors ask some pointed questions about how we can approach attempts to solve these issues:
As computer scientists, ethicists, social scientists and others strive to improve the fairness of data and of AI, all of us need to think about appropriate notions of fairness. Should the data be representative of the world as it is, or of a world that many would aspire to? ...Who should decide which notions of fairness to prioritize?13
As urgent as these questions are, and as much as we would like to find objective answers, even the criteria by which we should proceed have not yet been agreed on.
One thing is clear: we cannot look on as the balance of power shifts slowly but surely to algorithms, as the world shifts from one in which “human behavior generated data to be collected and studied” to a dystopia where “powerful algorithms...could shape and define human behaviors.”
So we have to keep up, one way or the other, to maintain the initiative. With so much at stake—the need for genuinely collective responses to a still-raging pandemic and a planet that’s heating up faster than ever—we need the best decision-making we can get. That, and not the narrower requirement that one or another business succeed, is what calls for us to get in the know, stay in the know, and remain vigilant.
Zou and Schiebinger call, among other things, for “human-centered AI,” and observe:
AI is transforming economies and societies, changing the way we communicate and work and reshaping governance and politics. Our societies have long endured inequalities. AI must not unintentionally sustain or even worsen them.
Interested in learning more?
If you’ve got this far in this post, perhaps you might like to start, or continue, your journey to becoming AI-savvy and—who knows?—help deliver solutions to the world’s most pressing problems. One way to do that is through the EXIN BCS Artificial Intelligence Foundation certification. It tests a candidate’s knowledge and understanding of the terminology and general principles of AI. It covers the potential benefits of, but also the big challenges associated with, creating and maintaining AI systems ethically and sustainably. It delves into the basics of machine learning.
To find out more, you can download the EXIN BCS Artificial Intelligence Foundation certification preparation guide. Or if you’re a learning provider, feel free to get in touch with our support team about delivering the course as an accredited EXIN partner.
1. Information technology — Artificial intelligence (AI) — Bias in AI systems and AI aided decision making, https://www.iso.org/standard/77607.html
3. Uber's self-driving operator charged over fatal crash: https://www.bbc.com/news/technology-54175359
7. Police are using software to predict crime. Is it a ‘holy grail’ or biased against minorities? The Washington Post, 17 November 2016: https://www.washingtonpost.com/local/public-safety/police-are-using-software-to-predict-crime-is-it-a-holy-grail-or-biased-against-minorities/2016/11/17/525a6649-0472-440a-aae1-b283aa8e5de8_story.html
8. Policing the Future: In the aftermath of Michael Brown's death, St. Louis cops embrace crime-predicting software, The Marshall Project (non-profit journalism about criminal justice), March 2, 2016.
9. The Toronto Declaration, which was published by Amnesty International and Access Now, and signed by other organizations including Human Rights Watch and the Wikimedia Foundation. https://www.torontodeclaration.org/declaration-text/english/
Emphasis in the original.
12. James Zou and Londa Schiebinger, AI can be sexist and racist—it’s time to make it fair: https://www.nature.com/articles/d41586-018-05707-8
14. James Zou and Londa Schiebinger, op. cit.