“The artificial intelligence developed by Google to detect ‘hate language’ seems to be racist.”
That, or something very similar, is how various English-language media outlets are headlining an academic study of Perspective, an AI developed by Google's anti-abuse team to assign 'toxicity' scores to online text, and which is used to moderate online debates by various organizations, among them the New York Times.
It sounds like the umpteenth story about 'racial bias' in an artificial intelligence (a bias that, in most cases, is really attributable to a poor selection of the data used to train the AI). In this case, however, the problem lies not in any neural network or dataset, but in political correctness.
Twice as likely to be considered offensive if you are African American?
Let's start at the beginning: a team of researchers at the University of Washington, led by NLP (natural language processing) PhD student Maarten Sap, has found that Perspective is twice as likely to flag as 'hate speech' tweets written in "African American English" or by users who "identify as African American".
They came to this conclusion after analyzing two datasets of texts used for 'hate speech' detection: in total, more than 100,000 tweets that human annotators had previously labeled as 'hate speech', 'offensive' or 'neither'.
The researchers tested both AI classifiers built expressly for the study (and trained on the datasets' texts) and Google's own Perspective: in both cases, roughly half of the harmless tweets containing 'African American English' terms were categorized as 'offensive'.
A later test, run on a much larger dataset (5.4 million tweets) in which the race of the authors was indicated, showed that tweets by African Americans were 1.5 times more likely to be classified as offensive.
Maarten Sap and his colleagues then asked volunteers to rate the toxicity of another 1,000 tweets, this time taking racial factors into account, and the result was a significant drop in the number of African American tweets marked as offensive.
Matthew Williams, a researcher at Cardiff University (UK) cited by New Scientist, concluded that “because humans have inherent biases, we have to assume that all algorithms are biased”.
Should we require AI to know and apply our double standards?
How is this possible? How do human biases infiltrate the classification criteria of an algorithm, especially one like Perspective, designed precisely to identify hate speech? Or are we framing it wrongly: does Perspective work perfectly, with the bias lying instead in how its performance is being assessed?
The Observer gives us a fundamental clue to understanding what is happening:
"For example, a tweet that reads 'Wassup, nigga' has an 87% probability of being detected as toxic, while another that reads 'Wassup, bro' has only a 4% probability of being labeled as toxic."
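Those percentages are the kind of score Perspective returns through its public Comment Analyzer API. As a minimal sketch (the endpoint and payload shape follow Perspective's documented REST API, but the sample response below is a hypothetical illustration mirroring the 4% figure quoted by the Observer, not captured API output, and a real call would require an API key), a toxicity query looks roughly like this:

```python
import json

# Perspective's Comment Analyzer endpoint (an API key is required in practice).
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"

def build_request(text):
    """Build the JSON payload Perspective expects for a TOXICITY query."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }

def toxicity_score(response):
    """Extract the overall toxicity probability (0.0-1.0) from a response."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

# Hypothetical response, illustrative only: the 0.04 value mirrors the
# score the Observer reports for "Wassup, bro".
sample_response = {
    "attributeScores": {
        "TOXICITY": {
            "summaryScore": {"value": 0.04, "type": "PROBABILITY"}
        }
    },
    "languages": ["en"],
}

payload = build_request("Wassup, bro")
print(json.dumps(payload))
print(toxicity_score(sample_response))  # 0.04
```

The key point for the discussion that follows is that the API sees only the text itself: nothing in that payload tells the model who wrote the tweet or in what context.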
"Nigga" (like the less colloquial, but equally offensive, "nigger") is a socio-culturally charged word, so controversial that it is not uncommon to see U.S. media refer to it only as "the N-word".
However, a double standard applies when assessing the use of this word: while resorting to it can have negative social (and even legal) consequences for many Americans, it is understood that members of the African American community are entitled to use it habitually, almost as a synonym for 'buddy', when speaking with another African American. Despite the differences, it is similar to what happens in Spain with the term "maricón" within the LGBT community.
That is why Sap and his team see a bias against African Americans in Perspective, although one could argue that Perspective is applying a precautionary criterion here: since the machine cannot know the full context of a tweet, it acts neutrally by classifying those terms, and similar ones, as offensive.