Forgotten humans-in-the-loop. Labour-oriented critique of the current discussion of algorithmic content moderation

In the past decade, thriving online harassment and hate speech, armies of bots spreading disinformation as a mean of interference in the elections, far-right propaganda, and waves of obscurantism disseminating through the COVID-19 related fake news repeatedly made online platforms’ content moderation the topic on everyone’s lips and newsfeeds. A recent round of discussion was triggered by the suspension of former US president Donald Trump’s Twitter account in January 2021 in the aftermath of Capitol storming. Twitter’s controversial decision to limit interactions, and later hide Trump’s posts, followed by the ultimate suspension of his account, was taken at the board of directors level. However, not every single user on the platform is paid equal attention when it comes to content moderation. More often, due to the large scope of data, machine learning systems are tasked with moderation. Human moderators are only responsible for reviewing the machine decisions in cases when users appeal those decisions or when the machine learning algorithms flag a given case as contentious.

Concerned by the implications algorithmic moderation can have on freedom of expression, David Kaye, a former UN special rapporteur on the promotion and protection of the right to freedoms of opinion and expression, called on online platforms to “[ensure] that any use of automation or artificial intelligence tools [in this particular case, for the enforcement of hate speech rules] involve human-in-the-loop”. Although the watchdog expressed valid concerns about the implications of algorithmic moderation for the freedom of expression, there is a strong case to argue that humans have never been out of the content moderation loop. 

Human-in-the-loop refers to the need for human interaction with machine learning systems in order to improve their performance. Indeed, algorithmic moderation systems cannot function without humans serving it. The machine has to be designed, created, maintained and constantly provided with new training data, which requires a complex human labor supply chain.

Given the increasing relevance of content moderation in public discourse, it is important to adopt a labour-oriented perspective to understand how algorithmic moderation functions. Contrary to popular fallacy that contraposes machine moderation to human moderation, indeed, current moderation presents a mixture of humans and machines. In other words, humans are pretty much “in-the-loop”.

The aggregated supply chain of all the human involvement is visually presented:

Human labour supply chain involved into algorithmic content moderation

Boards of Platforms

Algorithmic moderation starts from platforms boards’ members and employed consultants who decide on the general rules for moderation. These rules are primarily a product of the need to manage external risks. Among risks that have to be managed are pressures from civil society, regulators, and internet gatekeeping companies.

For instance, Facebook’s initial policy was to remove breastfeeding photos. The platform was defending itself by referring to the general no nudity policy which the fully exposed breasts violated. It was only in 2014, when after years of pressure from Free the Nipple activists, Facebook allowed pictures of women “actively engaged in breastfeeding”. Among other achievements of pressure from civil society was tightening of most of the platforms’ policies towards hate speech like misogyny, racism, and explicit threats of rape and violence.

A number of more established civil society groups also propose solutions on how to moderate based on their proposed ethics or derived from the existing norms of national and international law. Intergovernmental organizations like OSCE, Council of Europe, and the United Nations have their own projects dedicated to ensuring the freedom of expression while respecting the existing norms of international law.

The pressure from the regulators most visibly manifested itself in the adoption of the rules and practices aimed at curtailing the spread of terrorist content and fake news. The latter problem, which became widely discussed after the alleged interference of Russia’s government in the US elections, received additional attention due to the biopolitical concerns raised by Covid-19 pandemic.

When it comes to pressure from the gatekeeping companies, the example of the Parler social network is the most graphical. This social network ceased to exist after Amazon stopped providing the platform with cloud computing under the pretext of insufficient moderation, and both major distribution platforms App Store and Google Play, suspended Parler’s apps. In similar fashion, nudity ban on Tumblr, which led to a mass exodus of users from that platform, came after Apple banned Tumblr’s app from the iOS App Store due to reported child pornography. Likewise, Telegram’s CEO Pavel Durov reported that his decision to remove channels disclosing the personal data of law enforcement officers responsible for brutal dispersion of rally participants in Russia was forced by the gatekeeping company: he claimed that Apple did not allow the update for the IOS app to be released until these channels were removed. During the 2021 Russian parliamentary elections, Apple and Google, being squeezed by the Russian government, in their turn demanded Telegram to suspend a chatbot associated with the Smart Voting project run by the allies of jailed politician Alexey Navalny. The chatbot provided recommendations for Russian voters on which candidates to support in order to prevent the representatives of ruling party from getting the mandates

Not every platform CEOs would employ or at least report on using the algorithmic moderation systems as a solution. Clubhouse’s moderation, for example, works in a way that no machine-learning algorithm is employed. The conversations are recorded and stored by Agora, the Shanghai-based company providing the back-end for Clubhouse. In case of a complaint either by the users or a government, the platform could study the recording and pass the verdict.

The decision on whether to manage the aforementioned risks with the help of algorithmic moderation, always lies with board members of the platform. The boards create the rules, the engineers find the ways to impose them – although the border is not well demarcated, given that the CEOs often hold degrees in engineering themselves and might also be directly involved in designing the systems’ architecture.


It is engineers who decide which algorithm to introduce for content moderation, design and maintain that algorithm, and seek ways to modernize or replace it.

Engineers choose between two main categories of algorithms, which are commonly both applied to content moderation.

One category of algorithms deals with searching for partial similarity (perceptual hashing) between newly uploaded content and an existing database of inappropriate content. For example, perceptual hashing is effective in preventing the circulation of inappropriate content such as viral videos of mass shootings, extremist texts or copyrighted films and songs. The most well-known example of a perceptual hashing-based algorithm is the Shared Industry Hash Database (SIHD), used by companies like Google, Facebook, Microsoft, LinkedIn, and Reddit. The database was created in 2017, contains terrorism-related content and has been criticized for its lack of transparency.

The second category encompasses algorithms that predict (machine learning) if content is inappropriate. Machine learning technologies like speech recognition and computer vision are effective in classifying user-generated content that infringes on the platforms’ terms of services (ToS). This technology has however drawn criticism for discriminating against certain groups, as in the case of overmoderation of tweets written in Black English. These biases are not generated by the algorithm itself, but are formed due to inappropriately compiled datasets in the training of that algorithm.

Driven by the orders coming from the platforms’ board, engineers constantly seek new datasets to improve the work of their algorithms. These datasets are manually labelled by human moderators, outsourced data flaggers, and regular users.

Human moderators

The main role of human moderators is to review users’ appeals against particular machine-made decisions and to decide in those cases when the level of machine learning algorithm confidence is low. Moderators often work for outsourced companies based in the Global South and the conditions of their labour are a matter of concern of the human rights activists. Besides usually being in economically precarious situations, the moderators suffer huge psychological pressure by dealing with very sensitive content like videos of live streamed suicides on a daily basis.

Moderators’ role is not limited to resolving disputes between the user and platform. If human moderators confirm that the uploaded content violates the ToS of the platform, this content, now verified by the expert, can further augment the dataset used for algorithmic training.

The mainstream practice of human moderation presupposes anonymity of the moderators. The course on the pioneering approach has been taken by Facebook. In their attempt to meet the demand for an increased transparency and improve the company’s legitimacy, Facebook’s board has introduced the system considerably reminding of the constitutional technology of separation of powers employed by the national states. Indeed, the idea of creating a quasi-legal judicial body within Facebook being dedicated to content moderation matters came from Noah Feldman, a professor at Harvard Law School who took part in drafting the interim Iraqi constitution.

In 2020, the platform established the so-called Oversight Board (OB) referred to by commentators as Facebook’s Supreme Court. The OB comprises twenty members “paid six-figure salaries for putting in about fifteen hours a week” among whom are acknowledged human rights activists, journalists, academics, lawyers, as well as former judges and politicians. By October 2021, the OB has adopted 18 decisions, some of which have overturned the initial decision passed by anonymous human moderators or the board itself. Other decisions, the most significant of which is Trump’s account suspension, have been upheld by the OB. In passing its decisions, the OB refers to both the platform’s community guidelines and the international human rights standards, namely the provisions of International Covenant on Civil and Political Rights. Clearly, twenty OB members are unable to review all the cases eligible for appeal so their goal is limited to reviewing the most representative ones, chosen by Facebook, to produce advisory policy recommendations and, supposedly, create the precedents human moderators can refer to in their practice. The company states that their “teams are also reviewing the board’s decisions to determine where else they should apply to content that’s identical or similar”.

Crowdsourced flaggers

Algorithm training sets are usually compiled by crowdsourced flaggers who classify content for a small financial reward, working through platforms such as Amazon’s Mechanical ‘Turk’ or Yandex’s ‘Toloka’. Using the example of Yandex Toloka, flaggers are tasked with classifying images into the following six categories: “pornography”, “violence”, “perversion”, “hinting”, “doesn’t contain porn”, “didn’t open”. As shown below, the left image, taken from the tutorial, is classified as “doesn’t contain porn”, while the other two images are classified as “hinting”. The explanatory signs indicate that the middle image displays “an obvious focus on the genital area” while the right image shows an anatomical depiction of genitals. These classified datasets are most probably used by Yandex to moderate their social media platforms like Messenger and Zen. The latter enjoys relative popularity in the Russian segment of the Web. At the same time, the explicitly norm prescribing manner in which these datasets are compiled serves to illustrate an observation that “the training dataset is a cultural construct, not just a technical one”.


Regular users of online platforms also contribute to the training of algorithms or updating the databases for similarity-searching algorithms through reporting on content they deem inappropriate. While for users themselves reporting is a way to make their voices heard by the platform, for the latter the feedback is valuable as any feedback could be and as an unpaid labour of mapping the training datasets for predictive algorithms. 

Once a sufficient number of users report that a piece of content doesn’t meet the requirements of the platform, the content is sent to the human moderators for further review. If the moderator confirms that the content violates the ToS of the platform, those users have demonstrably contributed to improving the algorithms.


While current mainstream approaches to the analysis of automated moderation systems focus strictly on the technical details of how the algorithms work, the people involved always go unseen. This paper pays tribute to the humans whose labour makes the automated moderation possible but kept lost in the false human-machine dichotomy, when in fact the current practice of content moderation presents an assemblage of humans and machines intertwined.

Ilya Lobanov is an independent researcher from Saint-Petersburg, currently based in Vienna. His interests lie in the areas of political economy of digital capitalism, urban politics, and history of mind.

Read more: