In the past decade, thriving online harassment and hate speech, armies of bots spreading disinformation as a means of interference in elections, far-right propaganda, and waves of obscurantism spread through COVID-19-related fake news have repeatedly made online platforms’ content moderation the topic on everyone’s lips and newsfeeds. A recent round of discussion was triggered by the suspension of former US president Donald Trump’s Twitter account in January 2021 in the aftermath of the storming of the Capitol. Twitter’s controversial decision to limit interactions with, and later hide, Trump’s posts, followed by the ultimate suspension of his account, was taken at the board of directors level. However, not every user on the platform receives equal attention when it comes to content moderation. More often, due to the sheer volume of data, machine learning systems are tasked with moderation. Human moderators are only responsible for reviewing machine decisions when users appeal those decisions or when the machine learning algorithms flag a given case as contentious.
Concerned by the implications algorithmic moderation can have for freedom of expression, David Kaye, a former UN Special Rapporteur on the promotion and protection of the right to freedom of opinion and expression, called on online platforms to “[ensure] that any use of automation or artificial intelligence tools [in this particular case, for the enforcement of hate speech rules] involve human-in-the-loop”. Although the watchdog’s concerns are valid, there is a strong case that humans have never been out of the content moderation loop.
Human-in-the-loop refers to the need for human interaction with machine learning systems in order to improve their performance. Indeed, algorithmic moderation systems cannot function without humans serving them. The machine has to be designed, created, maintained and constantly provided with new training data, which requires a complex human labour supply chain.
Given the increasing relevance of content moderation in public discourse, it is important to adopt a labour-oriented perspective to understand how algorithmic moderation functions. Contrary to the popular fallacy that opposes machine moderation to human moderation, current moderation is a mixture of humans and machines. In other words, humans are very much “in-the-loop”.
The aggregated supply chain of all the human involvement is presented in the figure:
Figure: Human labour supply chain involved in algorithmic content moderation
Boards of Platforms
Algorithmic moderation starts with the platforms’ board members and employed consultants, who decide on the general rules for moderation. These rules are primarily a product of the need to manage external risks. Among the risks that have to be managed are pressures from civil society, regulators, and internet gatekeeping companies.
For instance, Facebook’s initial policy was to remove breastfeeding photos. The platform defended itself by referring to its general no-nudity policy, which fully exposed breasts violated. It was only in 2014, after years of pressure from Free the Nipple activists, that Facebook allowed pictures of women “actively engaged in breastfeeding”. Another achievement of pressure from civil society was the tightening of most platforms’ policies towards hate speech such as misogyny, racism, and explicit threats of rape and violence.
A number of more established civil society groups also propose solutions for how to moderate, either based on their own ethics or derived from the existing norms of national and international law. Intergovernmental organizations like the OSCE, the Council of Europe, and the United Nations have their own projects dedicated to ensuring freedom of expression while respecting the existing norms of international law.
The pressure from regulators manifested itself most visibly in the adoption of rules and practices aimed at curtailing the spread of terrorist content and fake news. The latter problem, which became widely discussed after the alleged interference of Russia’s government in the US elections, received additional attention due to the biopolitical concerns raised by the COVID-19 pandemic.
When it comes to pressure from the gatekeeping companies, the example of the Parler social network is the most illustrative. This social network was forced offline after Amazon stopped providing the platform with cloud computing under the pretext of insufficient moderation, and both major distribution platforms, the App Store and Google Play, suspended Parler’s apps. In similar fashion, the nudity ban on Tumblr, which led to a mass exodus of users from that platform, came after Apple banned Tumblr’s app from the iOS App Store due to reported child pornography. Likewise, Telegram’s CEO Pavel Durov reported that his decision to remove channels disclosing the personal data of law enforcement officers responsible for the brutal dispersal of rally participants in Russia was forced by the gatekeeping company: he claimed that Apple did not allow the update for the iOS app to be released until these channels were removed. During the 2021 Russian parliamentary elections, Apple and Google, squeezed by the Russian government, in turn demanded that Telegram suspend a chatbot associated with the Smart Voting project run by the allies of jailed politician Alexey Navalny. The chatbot provided recommendations for Russian voters on which candidates to support in order to prevent representatives of the ruling party from winning seats.
Not every platform CEO employs, or at least reports employing, algorithmic moderation systems as a solution. Clubhouse’s moderation, for example, works without any machine learning algorithm. The conversations are recorded and stored by Agora, the Shanghai-based company providing the back-end for Clubhouse. In case of a complaint, either by users or by a government, the platform can study the recording and pass a verdict.
The decision on whether to manage the aforementioned risks with the help of algorithmic moderation always lies with the board members of the platform. The boards create the rules and the engineers find ways to impose them, although the border is not well demarcated, given that the CEOs often hold degrees in engineering themselves and might also be directly involved in designing the systems’ architecture.
It is engineers who decide which algorithm to introduce for content moderation, design and maintain that algorithm, and seek ways to modernize or replace it.
Engineers choose between two main categories of algorithms, which are commonly both applied to content moderation.
One category of algorithms searches for partial similarity (perceptual hashing) between newly uploaded content and an existing database of inappropriate content. For example, perceptual hashing is effective in preventing the circulation of inappropriate content such as viral videos of mass shootings, extremist texts or copyrighted films and songs. The most well-known example of a perceptual hashing-based system is the Shared Industry Hash Database (SIHD), used by companies like Google, Facebook, Microsoft, LinkedIn, and Reddit. The database was created in 2017, contains terrorism-related content and has been criticized for its lack of transparency.
The second category encompasses algorithms that predict (machine learning) whether content is inappropriate. Machine learning technologies like speech recognition and computer vision are effective in classifying user-generated content that infringes on the platforms’ terms of service (ToS). This technology has, however, drawn criticism for discriminating against certain groups, as in the case of the overmoderation of tweets written in Black English. These biases are not generated by the algorithm itself, but arise from inappropriately compiled datasets used in the training of that algorithm.
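The similarity search described above can be illustrated with a minimal sketch. It assumes a simple "average hash" over a tiny grayscale image and a Hamming-distance cutoff; the function names, the 4x4 images and the `max_distance` value are all hypothetical and chosen for illustration only. Production systems rely on more robust hash functions, and nothing here reflects how the SIHD itself is implemented.

```python
from typing import List, Sequence


def average_hash(pixels: Sequence[Sequence[int]]) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def matches_database(candidate: int, known_hashes: List[int],
                     max_distance: int = 2) -> bool:
    """True if the candidate is within max_distance bits of any known hash.

    max_distance is a hypothetical tolerance, not a value from the article.
    """
    return any(hamming_distance(candidate, known) <= max_distance
               for known in known_hashes)


# Hypothetical 4x4 grayscale images (0 = black, 255 = white).
banned = [[10, 200, 10, 200],
          [200, 10, 200, 10],
          [10, 200, 10, 200],
          [200, 10, 200, 10]]
# The same picture, slightly brightened, as a re-upload might be.
reupload = [[value + 5 for value in row] for row in banned]
# A structurally different picture.
unrelated = [[10, 10, 200, 200],
             [10, 10, 200, 200],
             [200, 200, 10, 10],
             [200, 200, 10, 10]]

database = [average_hash(banned)]
print(matches_database(average_hash(reupload), database))
print(matches_database(average_hash(unrelated), database))
```

Because the hash depends on each pixel's relation to the image mean and not on exact pixel values, a uniformly brightened copy still matches, which is the property that makes this family of techniques useful against re-uploads.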
Driven by the orders coming from the platforms’ board, engineers constantly seek new datasets to improve the work of their algorithms. These datasets are manually labelled by human moderators, outsourced data flaggers, and regular users.
The main role of human moderators is to review users’ appeals against particular machine-made decisions and to decide those cases in which the machine learning algorithm’s confidence is low. Moderators often work for outsourced companies based in the Global South, and the conditions of their labour are a matter of concern for human rights activists. Besides usually being in economically precarious situations, the moderators suffer huge psychological pressure from dealing with very sensitive content, like videos of live-streamed suicides, on a daily basis.
Moderators’ role is not limited to resolving disputes between the user and platform. If human moderators confirm that the uploaded content violates the ToS of the platform, this content, now verified by the expert, can further augment the dataset used for algorithmic training.
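The division of labour between the classifier and the human moderator can be sketched roughly as follows. This is an illustration under stated assumptions, not a description of any platform's pipeline: the two thresholds, the `handle` function and the `TrainingSet` class are hypothetical names and values introduced here.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical confidence thresholds; real platforms do not publish theirs.
REMOVE_THRESHOLD = 0.95
ALLOW_THRESHOLD = 0.05


def route(score: float) -> str:
    """Decide what happens to content given the model's violation score."""
    if score >= REMOVE_THRESHOLD:
        return "auto_remove"
    if score <= ALLOW_THRESHOLD:
        return "auto_allow"
    return "human_review"  # the model is not confident either way


@dataclass
class TrainingSet:
    """Content confirmed as violating by a human moderator."""
    examples: List[str] = field(default_factory=list)


def handle(content: str, score: float, appealed: bool,
           moderator_says_violation: bool, training_set: TrainingSet) -> str:
    """Return the final outcome: 'removed' or 'kept'."""
    decision = route(score)
    if decision == "human_review" or appealed:
        # A human decides low-confidence and appealed cases.
        if moderator_says_violation:
            # Confirmed violations feed back into the training data.
            training_set.examples.append(content)
            return "removed"
        return "kept"
    return "removed" if decision == "auto_remove" else "kept"


training_set = TrainingSet()
print(handle("post B", 0.50, False, True, training_set))
print(training_set.examples)
```

In this sketch the moderator's verdict is passed in as an argument for simplicity; in practice it would arrive asynchronously from a review queue.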
The mainstream practice of human moderation presupposes the anonymity of the moderators. Facebook has taken a pioneering approach. In an attempt to meet the demand for increased transparency and improve the company’s legitimacy, Facebook’s board has introduced a system strongly reminiscent of the constitutional separation of powers employed by nation states. Indeed, the idea of creating a quasi-judicial body within Facebook dedicated to content moderation matters came from Noah Feldman, a professor at Harvard Law School who took part in drafting the interim Iraqi constitution.
In 2020, the platform established the so-called Oversight Board (OB), referred to by commentators as Facebook’s Supreme Court. The OB comprises twenty members “paid six-figure salaries for putting in about fifteen hours a week”, among whom are acknowledged human rights activists, journalists, academics, lawyers, as well as former judges and politicians. By October 2021, the OB had adopted 18 decisions, some of which overturned the initial decision passed by anonymous human moderators or the board itself. Other decisions, the most significant of which is Trump’s account suspension, were upheld by the OB. In passing its decisions, the OB refers to both the platform’s community guidelines and international human rights standards, namely the provisions of the International Covenant on Civil and Political Rights. Clearly, twenty OB members are unable to review all the cases eligible for appeal, so their goal is limited to reviewing the most representative ones, chosen by Facebook, to produce advisory policy recommendations and, supposedly, create precedents that human moderators can refer to in their practice. The company states that their “teams are also reviewing the board’s decisions to determine where else they should apply to content that’s identical or similar”.
Algorithm training sets are usually compiled by crowdsourced flaggers who classify content for a small financial reward, working through platforms such as Amazon’s ‘Mechanical Turk’ or Yandex’s ‘Toloka’. In the case of Yandex Toloka, flaggers are tasked with classifying images into the following six categories: “pornography”, “violence”, “perversion”, “hinting”, “doesn’t contain porn”, “didn’t open”. As shown below, the left image, taken from the tutorial, is classified as “doesn’t contain porn”, while the other two images are classified as “hinting”. The explanatory signs indicate that the middle image displays “an obvious focus on the genital area” while the right image shows an anatomical depiction of genitals. These classified datasets are most probably used by Yandex to moderate its social media platforms like Messenger and Zen. The latter enjoys relative popularity in the Russian segment of the Web. At the same time, the explicitly norm-prescribing manner in which these datasets are compiled illustrates the observation that “the training dataset is a cultural construct, not just a technical one”.
Regular users of online platforms also contribute to the training of algorithms, or to updating the databases for similarity-searching algorithms, by reporting content they deem inappropriate. While for users themselves reporting is a way to make their voices heard by the platform, for the platform it is valuable both as feedback and as unpaid labour that maps out the training datasets for predictive algorithms.
Once a sufficient number of users report that a piece of content doesn’t meet the requirements of the platform, the content is sent to the human moderators for further review. If the moderator confirms that the content violates the ToS of the platform, those users have demonstrably contributed to improving the algorithms.
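The escalation step described above can be sketched as a simple counter with a cutoff. The threshold of three distinct reporters, the `ReportQueue` class and its method names are assumptions made for illustration; the article does not state what number of reports any platform treats as sufficient.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical number of distinct reporters needed to trigger human review.
REPORT_THRESHOLD = 3


class ReportQueue:
    """Collects user reports and escalates content to human moderators."""

    def __init__(self, threshold: int = REPORT_THRESHOLD) -> None:
        self.threshold = threshold
        # content id -> set of users who reported it (a set, so that
        # repeated reports by the same user count only once)
        self.reporters: Dict[str, Set[str]] = defaultdict(set)
        self.review_queue: List[str] = []

    def report(self, content_id: str, user_id: str) -> bool:
        """Record a report; return True if it escalated the content."""
        self.reporters[content_id].add(user_id)
        enough = len(self.reporters[content_id]) >= self.threshold
        if enough and content_id not in self.review_queue:
            self.review_queue.append(content_id)
            return True
        return False


queue = ReportQueue()
for user in ["u1", "u1", "u2", "u3"]:
    queue.report("post-42", user)
print(queue.review_queue)
```

Keeping the set of reporters alongside the queue also records which users contributed to a decision, should the moderator later confirm the violation.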
While current mainstream approaches to the analysis of automated moderation systems focus strictly on the technical details of how the algorithms work, the people involved go unseen. This paper pays tribute to the humans whose labour makes automated moderation possible but who remain hidden behind the false human-machine dichotomy, when in fact the current practice of content moderation is an assemblage of intertwined humans and machines.
Ilya Lobanov is an independent researcher from Saint-Petersburg, currently based in Vienna. His interests lie in the areas of political economy of digital capitalism, urban politics, and history of mind.
As an organisation dedicated to digital rights and freedoms, fighting against the use of mass biometric surveillance, we welcome the decision of the Serbian Minister of Interior to withdraw the controversial Draft Law on Internal Affairs.
We call on the authorities to take another step and impose a moratorium on the use of advanced technologies for biometric surveillance and mass processing of citizens’ biometric data. Such a move would be in line with the recommendations of the United Nations and the European Union, as well as of numerous organisations and experts around the world.
We also call on the Ministry and the Government of Serbia to secure a broad public debate in the future law-making process, especially when intending to regulate the use of advanced technologies in our society, so that we could jointly contribute to the quality of laws concerning all Serbian citizens.
The public debate on the Draft Law on Internal Affairs officially introduced into legal procedure provisions for the use of mass biometric surveillance in public spaces in Serbia, that is, advanced technologies equipped with facial recognition software that enable the capture and processing of large amounts of sensitive personal data in real time.
If Serbia adopts the provisions on mass biometric surveillance, it will become the first European country conducting permanent indiscriminate surveillance of citizens in public spaces. Technologies that would thus be made available to the police are extremely intrusive to citizens’ privacy, with potentially drastic consequences for human rights and freedoms, and a profound impact on a democratic society. For that reason, the United Nations and the European Union have already taken a stand against the use of mass biometric surveillance by the police and other security services of the states.
SHARE Foundation has used the opportunity of the Draft Law public debate to submit its legal comments on the provisions regulating mass biometric surveillance in public spaces, demanding that the authorities declare a moratorium on the use of such technologies and systems in Serbia without delay.
Although modestly publicized and only three weeks long, the public debate on the disputed Draft Law gathered national and international organizations in a common front against the harmful use of modern technologies. Among others, EDRi, the European network of NGOs, experts, advocates and academics advancing digital rights, reacted. Its official letter to the Serbian government and the Ministries of Interior and Justice states that the provisions of the Draft Law allowing the capture, processing and automated analysis of people’s biometric and other sensitive data in public spaces are incompatible with the European Convention on Human Rights, which Serbia ratified in 2004.
“The Serbian government’s proposal for a new internal affairs law seeks to legalise biometric mass surveillance practices and thus enable intrusion into the private lives of Serbian citizens and residents on an unprecedented scale. Whilst human rights and data protection authorities across the EU and the world are calling to protect people from harmful uses of technology, Serbia is moving in a dangerously different direction”.
Diego Naranjo, EDRi
Gwendoline Delbos-Corfield, a French MEP from the Greens, has warned against the use of these intrusive technologies and against further restricting the rights of those living in Serbia, emphasizing that these technologies magnify the discrimination that marginalised groups already face in their everyday lives. “We oppose this draft law that would allow law enforcement to use biometric mass surveillance in Serbia. It poses a huge threat to fundamental rights and the right to privacy”, said Delbos-Corfield.
“In Serbia, a country that Freedom House rated as only ‘partly free’, we suspect that the government has already begun the deployment of high-resolution Huawei cameras, equipped with facial recognition technology, in the city of Belgrade. If this draft law comes into effect, the government might have a legal basis for the use of biometric mass surveillance and the use of these cameras. Serbia now runs the risk of becoming the first European country to be covered by biometric mass surveillance. We call on the Serbian government to immediately withdraw the articles of this draft law that regulate biometric mass surveillance.”
Gwendoline Delbos-Corfield, MEP, Greens/EFA Group
The disputed provisions stipulate the installation of a system of mass biometric surveillance throughout Serbia without establishing the necessity of a measure under which all residents of Serbia would be constantly treated as potential criminals and their privacy disproportionately invaded. Of particular concern is the lack of a detailed assessment of the impact that the use of total biometric surveillance can have on vulnerable social groups, but also on journalists, civic activists and other actors in a democratic society.