Guiding Principles for the Future of Content Moderation: Four Scholars and Advocates in Conversation

At All Things in Moderation 2017, a group of people came together in the final breakout session to brainstorm the way forward for content moderation and platform governance. Working at breakneck pace (and into the final plenary – we had to go get them so they wouldn’t miss it!), Jillian C. York (EFF), Sarah Myers West (USC), Tarleton Gillespie (Microsoft Research), and Nicolas Suzor (QUT) put forth a set of brainstormed principles that they wrestled with and worked through with each other and with the assembled group. At our request, they authored the following as a document of the day’s endeavors in order to capture the substance and the spirit of the session. We anticipate that this conversation has only just begun, and, in that light, are pleased to share the fruits of it with you now.

Guiding Principles for the Future of Content Moderation

With increasing attention to the labor, criteria, and implications of content moderation come opportunities for real change in the ways that platforms are governed. After high-profile exposés like The Guardian’s “Facebook Files,” it is becoming more difficult for platforms to regulate in secret. Governments around the world are increasingly seeking to influence moderation practices, and platforms now face substantial pressure from users, civil society, and industry groups to do more on specific issues like terrorism, hatred, ‘revenge porn’, and ‘fake news.’

In light of this pressure and the opportunities it implies, we hosted a roundtable discussion at All Things in Moderation to consider possibilities for the future of content moderation and to ask not just how the moderation apparatus should change, but also what principles should guide those changes.

A conversation among four researchers and advocates, Tarleton Gillespie, Nic Suzor, Jillian York, and Sarah Myers West, brought together perspectives from media and information studies, law, and civil society to explore a variety of approaches to envisioning a set of high-level principles that could guide the interventions of different actors in content moderation processes in the future.

Due Process

Jillian C. York

The origins of due process are generally understood to be contained in chapter 39 of the Magna Carta, which declares that “No free man shall be arrested, or detained in prison, or deprived of his freehold, or outlawed, or banished, or in any way molested; and we will not set forth against him, nor send against him, unless by the lawful judgment of his peers and [or] by the law of this land.” This entered into the realm of English common law, trickling down to the American Constitution in the Fifth Amendment, which provides that “No person shall…be deprived of life, liberty, or property, without due process of law,” and the Fourteenth Amendment, which applies the concept to all states.

On platforms, no such due process is required to exist—and in most cases, it doesn’t. In the early days of Facebook and other platforms, decisions handed down to users were often final, with users being told they had no opportunity to appeal. As time went on, and advocates caught wind of the issue, companies began to change their tune, but even today, most companies only offer a limited set of options for appealing takedowns—for example, Facebook users can only appeal the removal of an account or a page, but not individual pieces of content. They also cannot appeal during a temporary suspension, even if that suspension was made in error.

On the day of All Things in Moderation, Jillian was contacted by a well-known and verified Instagram user, whose account had been hijacked (or hacked). The user had tried reaching out to the company to fix the situation, but to no avail. It was only after Jillian tweeted about it, and was contacted by a Facebook employee who knew the right person at Instagram, that the problem was resolved.

We know that companies make a lot of errors, and that most users—unless they’re famous or have proximity to a company’s employees—cannot access the help they need. It’s incredible that customer service has all but disappeared in the age of social media. Due process is a key principle that should be available to all users in all instances, regardless of how or whether they have violated community standards.


Transparency

Sarah Myers West

A common – and still unaddressed – critique of content moderation systems is that they lack transparency. Though many social media companies produce transparency reports, almost none of them include any information about content taken down as a result of Terms of Service or content policies. Journalists have played an important role in shedding light on content moderation processes through investigative projects, some of which relied on leaks of confidential material – such as the operational guidelines provided to moderators – from within companies.

More transparency in this space is critical if platforms are to be held accountable for how they moderate content. However, this may be an important moment to consider transparency as a research question to be investigated rather than only as a policy goal. What does meaningful transparency look like in the context of content moderation? It may require more than another data point in a transparency report.

For example, often when there is a crisis over how a particular instance of content moderation is handled, the response from the company will be that the takedown was a mistake, the result of moderator error. Understanding what proportion of terms of service takedowns occur in error would be an important step toward accountability. However, this data would not tell us much about other, just as critical, elements of content moderation systems: who are the moderators of our online content? What guidelines are they provided for interpretation? What conditions do they work under, and how does this shape moderation outcomes?

Transparency includes not only the content that is moderated (though this is important), but also transparency of the process and of the broader content moderation system. The process of making content moderation processes transparent may mean changing content moderation systems themselves. Given the high level of complexity – both social and technical – of the process of content moderation, mandating more transparency may mean reframing the question that guides this principle. Instead, we might ask, “how can content moderation systems be made legible, both to us as users and to the companies that are running them?”


Custodianship

Tarleton Gillespie

Platforms sometimes describe their content moderation as a kind of ‘custodianship’ – relatively hidden janitorial work, sweeping out the ‘bad stuff’ as it is flagged. This crystallized in a recent case, when Facebook repeatedly deleted a photo, taken by Nick Ut of the Associated Press, of a nude Vietnamese girl running from a napalm attack, commonly described as the ‘napalm girl’ photo.

The press presented the deletions as a failure on Facebook’s part. And it was mishandled. But this photo has always been immensely hard to deal with, and is without a doubt upsetting and shocking. Editors at the Associated Press debated whether to distribute the photo at all, and provided a blurred version to newspapers. After it was published in its original form by the New York Times and elsewhere, many newspapers received letters of complaint from readers, calling it obscene. The photo is obscene; the question is whether it should be published anyway. These challenging edge cases don’t have any clean answer. But they point to the limits of the ‘custodian’ model as it’s understood by the platforms.

We are at an inflection point in which social media platforms are increasingly understood as responsible for consequences at a public scale, affecting essential social functions. The harms caused by these systems are no longer limited to a single user; they extend to the public itself, even those who do not directly participate on the platform. Fraudulent news circulating on social media, for example, impacts the entire democratic process, not only those who encounter it directly.

What would it mean for social media platforms to take on responsibility for their role in curating content and profiting off of it? These platforms have become a cultural forum where conflicting values must encounter one another, and where malicious actors want to take advantage of their proximity to their intended targets.

We should consider whether these companies can be custodians—not in the janitorial sense, but in the sense of guardianship where they are responsible for facilitating processes for working out these unresolved tensions, publicly. One possibility would be to hand back – with care – the agency to users to address these questions ourselves, rather than platforms reserving the right to handle it for us. What would have happened if some of the innovation that has gone into designing the existing systems of moderation had instead gone into developing tools to support users’ efforts to collectively figure that out for themselves?

Human Rights

Nicolas Suzor

The Magna Carta isn’t a bad starting point for how we want our social spaces to be governed. Currently, however, we’re still trapped in thinking about governance in ways that pose a binary of self-regulation vs. full government control, and we have good reasons to be distrustful of government regulation: It is often not particularly thoughtful about how technology works.

Perhaps a better role for law when it comes to technology is a more protection-oriented mindset, and laws that protect what we love about the internet. Law is, inherently, a system for developing rules on the fly: It has flexibility to address new circumstances; it can be developed quickly through parliamentary or congressional processes; and there is largely transparency throughout this system. It is nevertheless expensive.

One of the big challenges we face when dealing with platforms, however, is that they’re often governed outside of the law. Companies are allowed through Section 230 of the Communications Decency Act (CDA 230) to impose rules as they see fit, and aren’t subject to any real oversight. Fundamentally, this is not what we would recognize as legitimate governance in any legal sense. With these restrictions in place, is it possible to imagine a better future where we can see more public intervention in internal processes?

One of the biggest challenges is that whenever debates about how technology companies should regulate speech or data come up in governance fora, the core concern of tech companies is whether regulation will “break the internet,” so to speak.

It is possible, however, to create regulations that don’t violate the core principle of CDA 230; that is, protection from liability for companies when they make mistakes. We can regulate standards of transparency or due process that might allow for escalation of disputes and that would allow NGOs or other groups to monitor how these systems are working—which is how regulation works best.

This can be depressing work in the United States, because human rights don’t really enter the discourse. Globally, we have a set of consensus-based instruments (such as the UDHR and the ICCPR) that set out what we think of as the fundamental rights of human beings, a whole set of international organizations designed to monitor how well various entities are doing, and, of course, enforcement mechanisms. These are flawed, but they can apply to the digital realm just as they have to everything that came before it.

As Kate Klonick has said, moderation is a new system of governance, but these platforms govern the same kind of space that has traditionally been public. So what do we do when private actors are doing public things? We’re at a “Magna Carta moment,” and it’s increasingly clear that public values are at stake. It’s myopic to see CDA 230 as the only way to govern speech. For example, the DMCA sets out a set of processes for regulating copyright material, and tries to set out some protection for disputes and due process. It’s not perfect, but we can imagine how we could construct a system that works at scale (like the DMCA) with escalating levels of due process to handle problems when things go wrong.

So what can we do? We can look at other structures of governance, ones that protect competition and the freedom of platforms to innovate but that also set out certain responsibilities to govern in a legitimate way. We can look to other types of institutions and administrative bodies that do adjudicative work, to find new mechanisms of due process that work at scale. These may help us imagine different systems that can help people find redress for their grievances without sacrificing efficiency or innovation. The language of human rights helps here: it helps us understand that businesses have an obligation to respect the rights of the people they affect. The international human rights framework, which has developed to build consensual norms and set out multiple responsibilities of state and non-state actors, can help us move past the limitations in how we think about legitimacy in our social spaces.

Toward a more principled future

Following these interventions, a lively discussion took place among participants from various backgrounds, details of which can be found in the live notetaking document for the session. While the discussion covered a variety of issues, most participants agreed on the importance of the above principles, and of reconsidering the role that these spaces inhabit in our societies. More specifically, participants expressed a desire for a more holistic approach to advocacy, one that better considers the needs of marginalized or targeted communities, and that ensures human rights principles are considered throughout every aspect of how these companies operate.




The Future of Eating Disorder Content Moderation

This guest post is the second in a series of writings on the state of content moderation from ATM participants and colleagues engaged in examining content moderation from a variety of perspectives. We welcome Dr. Ysabel Gerrard in the post below.

In recent months, there have been various public furores over social media platforms’ failures to successfully catch — to moderate — problematic posts before they are uploaded. Examples include two YouTube controversies: vlogger Logan Paul’s video of a suicide victim’s corpse which he found hanging in a Japanese forest, and other disturbing videos found on the YouTube Kids app, featuring characters from well-known animations and depicting them in upsetting and frightening scenarios. By and large, it’s easy to see why these cases hit the headlines and why members of the public called on platforms to rethink their methods of moderation, however high their expectations may be. But it’s harder to straightforwardly categorise other kinds of content as ‘bad’ and decide that they don’t have a place on social media, like pro-eating disorder (pro-ED) communities.

In early 2012, Tumblr announced its decision to moderate what it called ‘blogs that glorify or promote anorexia’. Its intervention came five days after a widely-circulated Huffington Post exposé about the ‘secret world’ of Tumblr’s thinspiration blogs. Facing mounting public pressures about their roles in hosting pro-ED content, Instagram and Pinterest introduced similar policies. All three platforms currently issue public service announcements (PSAs) when users search for terms related to eating disorders, like #anorexia and #thinspo, and Instagram blocks the results of certain tag searches. They also remove users and posts found to be violating their rules.

“Important questions remain unanswered.”

Almost six years later, important questions remain unanswered: why did platforms decide to intervene in the first place, given their hesitancy to remove other kinds of content despite public pressures? How often do moderators get it wrong? And, perhaps most crucially, what are the consequences of platforms’ interventions for users? I can’t answer all of these questions here, but I will address some of the issues with platforms’ current moderation techniques and propose some future directions for both social media companies and researchers.

The politics of the tag.

Platforms’ decisions to moderate hashtags imply that people use certain tags to ‘promote’ eating disorders. Hashtags tell moderators and users what a post is about, and moderated tags currently include ‘#proana’ and ‘#proed’, but they also include more generic phrases that aren’t explicitly pro-ED, like ‘#anorexia’ and ‘#bulimia’. A blanket ban on all eating disorder-related tags — ‘pro’ or otherwise — implies a disconnect between what platforms think people are doing when they use a certain tag and what they are actually doing and why they are doing it. In a forthcoming paper, I show how Instagram and Tumblr users continue to use ED-related tags despite their awareness of content moderation and of the volatility of ED hashtags. Without speaking to users (an ethically tricky but nonetheless important future research direction), it is difficult to know why they continue to use tags they know are banned. Perhaps their actions are not intended to ‘trigger’ or encourage other users — if indeed people can ‘catch’ an eating disorder solely by looking at images of thin women — but to find like-minded people, as so many other social media users do. After all, this is the ethos on which social media companies were built.

For example, one anonymised Tumblr user addressed one of her posts to users who report ‘thinspo’ blogs. She called these blogs a ‘safe place’ for people with eating disorders, warning them that reporting posts can be triggering and assuring them that she will always create replacement blogs. Her post was re-blogged over 10,000 times and seemed to echo the sentiment of other Tumblr users, ‘pro-ED’ or otherwise.

Tumblr said it would not remove blogs that are ‘pro-recovery’, but this user’s feed — along with many others’ — is a tangled web of pro-eating disorder, pro-recovery and other positionalities. Users do not always conform to a stereotypically and recognisably ‘pro’ eating disorder identity, if indeed platforms (and anyone, for that matter) would know how to recognise it when they saw it. If social media offer ‘safe’ spaces to people whose conditions are socially stigmatised and marginalised, then platforms need to exercise greater care to understand the content of social media posts. But as I explain below, this might not be feasible for platforms, whose moderation workforces are already pushed to their absolute limits.

Why blanket rules don’t work.

The fact is that blanket rules for content moderation do not work, and nor should we expect them to. They always and inevitably miss the subtleties: the users who sit somewhere in the murky middle-ground between acceptability, distaste and often illegality. In social media’s eating disorder communities, various discourses — pro-ED, pro-recovery, not-pro-anything, amongst many others — are entangled with each other. But while platforms took measures to address eating disorders, they evidently did not adjust their own moderation mechanisms to suit such a complex issue. As Roberts explains, Commercial Content Moderators (CCMs) have only a few seconds to make a decision about moderation. It is concerning that individual posts can be de-contextualised from a user’s full feed — through no fault of a CCM’s, given the speed with which they must make decisions — and be removed.

For example, in its Community Guidelines, Pinterest includes an example of an image of a female body that would be ‘acceptable’ in a moderator’s eyes. The image’s overlaid text (‘It’s not a diet, it’s a way of life. FIT Meals’) de-situates it from pro-eating disorder discourses.



“Perhaps we are expecting too much of moderators, but not enough of platforms.”

But in the absence of hashtags and a text overlay, would a content moderator know not to categorise this as ‘pro-ED’ and ban it? How do they know how to interpret the signals that ED communities have become infamous for? How can a moderator do all of this in only a few seconds? And what happens to users if moderators get it wrong, something Facebook recently admitted to in relation to hate speech posts? Perhaps we are expecting too much of moderators, but not enough of platforms.

The future of eating disorder content moderation.

If eating disorder content continues to be moderated, are the current approaches appropriate? Perhaps social media companies should not encourage users to ‘flag’ each other in their public-facing policies, given the historical and problematic surveillance of girls’ and women’s bodies. Maybe Instagram should not continue to chase hashtags and ban the ones that emerge in their place, given the minimal space they occupy at the margins of social networks. Platforms could also provide full and exhaustive lists of banned tags to help users navigate norms, vocabularies and cultures. In the follow-up to its original policy, Tumblr admitted that it was ‘not under the illusion that it will be easy to draw the line between blogs that are intended to trigger self-harm and those that support sufferers and build community’. This was admirable, but what if Tumblr became more transparent about how its moderators make these decisions?

Moving forward, one suggestion is for social media companies to foster collaborations with social scientists to research and better understand these cultures (a point made recently by Mary Gray). Those with an understanding of eating disorders know that their stigmatisation has historically prevented people from seeking treatment; a problematic and gendered issue that platforms’ panicked decisions to intervene have potentially worsened. An in-depth exploration of online eating disorder communities and a more open dialogue between researchers and platform policy-makers might help social media companies to promote an alternative view of eating disorders, rather than treating them as simply ‘bad’.

Dr. Ysabel Gerrard is a Lecturer in Digital Media and Society at the University of Sheffield. She co-organises the Data Power Conference and is the current Young Scholars’ Representative for ECREA’s Digital Culture and Communication Section. One of the aims of her next project is to talk to social media workers who have been/are involved in making decisions about pro-eating disorder and self-harm content moderation. If you are, or know of anyone who might want to share their experiences, please email her at:

Content Moderation and Corporate Accountability: Ranking Digital Rights at #ATM2017

This entry is the first in a series of posts recapping portions of #ATM2017 from the perspective of participants. More to come.


Who gets to express what ideas online, and how? Who has the authority and the responsibility to police online expression and through what mechanisms?

Dozens of researchers, advocates, and content moderation workers came together in Los Angeles this December to share expertise on what are emerging as the critical questions of the day. “All Things in Moderation” speakers and participants included experienced content moderators — like Rasalyn Bowden, who literally wrote the moderation manual for MySpace — and pioneer researchers who understood the profound significance of commercial content moderation before anyone else, alongside key staff from industry. After years of toiling in isolation, many of us working on content moderation issues felt relief at finally finding “our people” and seeing the importance of our work acknowledged.

If the idea that commercial content moderation matters is quickly gaining traction, there is no consensus on how best to study it — and until we understand how it works, we can’t know how to structure it in a way that protects human rights and democratic values. One of the first roundtables of the conference considered the methodological challenges to studying commercial content moderation, key among which is companies’ utter lack of transparency around these issues.

While dozens of companies in the information and communication technology (ICT) sector publish some kind of transparency report, these disclosures tend to focus on acts of censorship and privacy violations that companies undertake at the behest of governments. Companies are much more comfortable copping to removing users’ posts or sharing their data if they can argue that they were legally required to do it. They would much rather not talk about how their own activities and their business model impact not only people’s individual rights to free expression and privacy, but the very fabric of society itself. The data capitalism that powers Silicon Valley has created a pervasive influence infrastructure that’s freely available to the highest bidder, displacing important revenue from print journalism in particular. This isn’t the only force working to erode the power of the Fourth Estate to hold governments accountable, but it’s an undeniable one. As Victor Pickard and others have forcefully argued, the dysfunction in the American media ecosystem — which has an outsized impact on the global communications infrastructure — is rooted in the original sin of favoring commercial interests over the greater good of society. The FCC’s reversal of the 2015 net neutrality rules is only the latest datapoint in a decades-long trend.

The first step toward reversing the trend is to get ICT companies on the record about their commitments, policies and practices that affect users’ freedom of expression and privacy. We can then evaluate whether these disclosed commitments, policies and practices sufficiently respect users’ rights, push companies to do better, and hold them to account when they fail to live up to their promises. To that end, the Ranking Digital Rights (RDR) project (where I was a fellow between 2014 and 2017) has developed a rigorous methodology for assessing ICT companies’ public commitments to respect their users’ rights to freedom of expression and privacy. The inaugural Corporate Accountability Index, published in November 2015, evaluated 16 of the world’s most powerful ICT companies across 31 indicators, and found that no company in the Index disclosed any information whatsoever about the volume and type of user content that is deleted or blocked when enforcing its own terms of service. Indeed, Indicator F9 — examining data about terms of service enforcement — was the only indicator in the entire 2015 Index on which no company received any points.

We revamped the Index methodology for the 2017 edition, adding six new companies to the mix, and were encouraged to see that three companies — Microsoft, Twitter, and Google — had modest disclosures about terms of service enforcement. Though it didn’t disclose any data about enforcement volume, the South Korean company Kakao disclosed more about how it enforces its terms of service than any other company we evaluated. Research for the 2018 Index and company engagement is ongoing, and we are continuing to encourage companies to clearly communicate what kind of content is or is not permitted on their platforms, how the rules are enforced (and by whom), and to develop meaningful remedy mechanisms for users whose freedom of expression has been unduly infringed. Stay tuned for the release of the 2018 Corporate Accountability Index this April.

Our experience has proven that this kind of research-based advocacy can have a real impact on company behavior, even if it’s never as fast as we might like. Ranking Digital Rights is committed to sharing our research methodology and our data (downloadable as a CSV file and in other formats) with colleagues in academia and the nonprofit sector. The Corporate Accountability Index is already being cited in media reports and scholarly research, and RDR is working closely with civil society groups around the world to hold a broader swath of companies accountable. All of RDR’s methodology documents, data, and other outputs are available under a Creative Commons license (CC-BY) — just make sure to give RDR credit.

Nathalie Maréchal is a PhD candidate at the University of Southern California’s Annenberg School for Communication and Journalism. Between 2014 and 2017, she held a series of fellowships at Ranking Digital Rights, where she authored several white papers and scholarly articles, conducted company research for the Corporate Accountability Index, and spearheaded the expansion of the Index’s methodology to include mobile ecosystems starting in 2017.