Automated Content Moderation, Hate Speech and Human Rights
Within the framework of SHERPA (‘Shaping the Ethical Dimensions of Smart Information Systems (SIS)’), a multi-stakeholder, cross-border EU project led by De Montfort University (UK), a deliverable was developed on 11 specific challenges that SIS (the combination of artificial intelligence and big data analytics) raise with regard to human rights. This blog post focuses on one of those challenges, namely ‘Democracy, Freedom of Thought, Control and Manipulation’, which considered, amongst others, the impact of SIS on freedom of expression, bias and discrimination. Building on the initial findings of that report, this short piece examines the use of Artificial Intelligence (AI) in the moderation of online hate speech. This challenge was chosen because the moderation of online hate speech is a hot potato for social media platforms, States and other stakeholders such as the European Union, and because recent developments such as the EU’s Digital Services Act and the proposed Artificial Intelligence Act seek to cultivate new ground through which online content and the use of AI will be managed.
Online communication occurs on a “massive scale”, rendering it impossible for human moderators to review all content before it is made available. The sheer quantity of online content also makes reviewing even reported content a difficult task. In response, social media platforms are depending more and more on AI in the form of automated mechanisms that proactively or reactively tackle problematic content, including hate speech. Technologies handling content such as hate speech are still in their “infancy”. The algorithms developed to achieve this automation are habitually customized for content type, such as pictures, videos, audio and text. The use of AI is a response not only to issues of quantity but also to increasing State pressure on social media platforms to remove hate speech quickly and efficiently. Examples of such pressure include, inter alia, the German NetzDG, which requires large social media platforms to remove reported content that is deemed illegal under the German Penal Code, to do so quickly (sometimes within 24 hours), and at the risk of heavy fines (up to 50 million euros in certain cases). To comply with such standards, companies use AI alone or in conjunction with human moderation to remove allegedly hateful content. As noted by Oliva, such circumstances have prompted companies to “act proactively in order to avoid liability…in an attempt to protect their business models”. Gorwa, Binns and Katzenbach highlight that as “government pressure on major technology companies build, both firms and legislators are searching for technical solutions to difficult platform governance puzzles such as hate speech and misinformation”. Further, the “work from home” situation created by Covid-19 has also led to enhanced reliance on AI, accompanied by errors in moderation.
In fact, as noted, for example, by YouTube, the number of in-office staff was reduced due to COVID-19, meaning that the company temporarily relies more on technology for content review, and this could lead to errors in content removals.
Over-blocking and Freedom of Expression
Relying on AI, even without human supervision, can be justified for content that could never be ethically or legally defensible, such as child abuse material. However, the issue becomes more complicated in contested areas where there is little or only complicated legal (or ethical) clarity on what should, and should not, be allowed – such as hate speech. In the ambit of such speech, Llansó states that the use of these technologies raises “significant questions about the influence of AI on our information environment and, ultimately, on our rights to freedom of expression and access to information”. For example, YouTube wrongly shut down (then reinstated) an independent news agency reporting war crimes in Syria: several videos were wrongly flagged as inappropriate by an automatic system designed to identify extremist content. Hash matching technologies such as PhotoDNA also seem to operate with a “context blindness” that could explain the removal of the videos on Syria. YouTube subsequently reinstated thousands of the videos which had been wrongly removed.
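The “context blindness” of hash matching can be made concrete with a minimal sketch. Real systems such as PhotoDNA compute perceptual fingerprints that survive re-encoding and cropping; the toy below uses a plain cryptographic hash purely to illustrate the principle, and all filenames and byte strings are hypothetical:

```python
import hashlib

# Hypothetical database of fingerprints of known banned media.
# A real deployment would hold perceptual hashes, not SHA-256 digests.
BANNED_HASHES = {
    hashlib.sha256(b"<bytes of a known extremist video>").hexdigest(),
}

def is_blocked(upload: bytes) -> bool:
    """Return True if the upload matches a known banned fingerprint."""
    return hashlib.sha256(upload).hexdigest() in BANNED_HASHES

# The same footage is blocked whether it appears in propaganda or in a
# news report documenting war crimes -- the match carries no context.
propaganda = b"<bytes of a known extremist video>"
news_report = b"<bytes of a known extremist video>"  # identical footage
print(is_blocked(propaganda))   # True
print(is_blocked(news_report))  # True: context blindness
```

Because the lookup sees only bytes, nothing in the pipeline distinguishes documentation of atrocities from their glorification; that distinction has to be supplied by human review or contextual signals outside the hash match itself.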
As highlighted in a Council of Europe report, automated mechanisms directly impact freedom of expression, which raises concerns vis-à-vis the rule of law and, in particular, notions of legality, legitimacy and proportionality. The Council of Europe noted that the enhanced use of AI for content moderation may result in over-blocking and consequently place freedom of expression at risk. Beyond that, Gorwa, Binns and Katzenbach argue that the increased use of AI threatens to exacerbate the already existing opacity of content moderation, further complicate the question of justice online and “re-obscure the fundamentally political nature of speech decisions being executed at scale”. Automated mechanisms fundamentally lack the ability to comprehend the nuance and context of language and human communication. The following section provides an example of how automated mechanisms may become inherently biased and thereby raise concerns relating to respect for the right to non-discrimination.
The Issue of Bias and Non-Discrimination
AI can absorb biases at the stage of design or enforcement. In its report ‘Mixed Messages? The Limits of Automated Social Media Content Analysis’, the Center for Democracy and Technology revealed that automated mechanisms may disproportionately impact the speech of marginalized groups. Although technologies such as natural language processing and sentiment analysis have been developed to detect harmful text without having to rely on specific words or phrases, research has shown that they are “still far from being able to grasp context or to detect the intent or motivation of the speaker”. Such technologies are simply not equipped to pick up on the language used, for example, by the LGBTQ community, whose “mock impoliteness” and use of terms such as “dyke,” “fag” and “tranny” occurs as a form of reclamation of power and a means of empowering members of this community to deal with hatred and discrimination.
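Why word-list approaches hit reclaimed in-group speech can be seen in a deliberately naive sketch. This is not any platform’s actual classifier; the term list and example posts are illustrative only:

```python
# A hypothetical keyword filter of the kind word-list moderation relies on.
SLUR_LIST = {"dyke", "fag", "tranny"}

def flag(post: str) -> bool:
    """Flag a post if it contains any listed term, ignoring speaker and intent."""
    words = {w.strip(".,!?").lower() for w in post.split()}
    return not words.isdisjoint(SLUR_LIST)

# Hostile use and in-group reclamation receive the same verdict:
print(flag("Get lost, you dyke"))             # True
print(flag("Proud dyke marching at Pride!"))  # True: same verdict
```

The filter has no representation of who is speaking or why, so reclaimed usage by the very community the rule is meant to protect is removed at the same rate as genuine abuse, which is precisely the disproportionate impact the Center for Democracy and Technology report describes.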
Through the conscious or unconscious biases embedded in the automated mechanisms moderating content, as depicted in the above examples, the use of AI against online hate speech not only infringes freedom of expression through over-blocking and the silencing of dissenting voices but also shrinks the space available to minority groups such as the LGBTQ community. This shrinking space, resulting from inherent bias, amounts to a violation of a fundamental doctrine of international human rights law, namely that of non-discrimination.
As noted by Llansó, the above issues cannot be fixed simply by developing more sophisticated AI. Tackling hate speech by relying on AI without human oversight, and doing so proactively rather than only reactively, places freedom of expression in a fragile position. At the same time, the inability of these technologies to pick up on the nuances of human communication, together with the biases that shape their make-up and functioning, raises issues pertaining to the doctrine of non-discrimination.
Natalie Alkiviadou is a Senior Research Fellow at Justitia, Denmark.