Implementing Responsible Data Enrichment Practices at an AI Developer: The Example of DeepMind

Sonam Jindal

Executive Summary

As demand for AI services grows, so, too, does the need for the enriched data used to train and validate machine learning (ML) models. While these datasets can only be prepared by humans, the data enrichment workers who do so (performing tasks like data annotation, data cleaning, and human review of algorithmic outputs) are an often-overlooked part of the development lifecycle, frequently working in poor conditions continents away from AI-developing companies and their customers.


For the purposes of this white paper we refer to individuals completing data enrichment as “workers.” In doing so, we recognize the variety of employment statuses that can exist in the data enrichment industry, including independent contractors on self-service crowdsourcing platforms, subcontractors of data enrichment providers, and full-time employees.

Last year, the Partnership on AI (PAI) published “Responsible Sourcing of Data Enrichment Services,” a white paper exploring how the choices made by AI practitioners could improve the working conditions of these data enrichment professionals. This case study documents an effort to put that paper’s recommendations into practice at one AI developer: DeepMind, a PAI Partner.

In addition to creating guidance for responsible AI development and deployment, PAI’s Theory of Change includes collaborating with Partners and others to implement our recommendations in practice. From these collaborations, PAI collects findings which help us further develop our curriculum of responsible AI resources. This case study serves as one such resource, offering a detailed account of DeepMind’s process and learnings for other organizations interested in improving their data enrichment sourcing practices.

Sourcing enriched data
Sourcing data enrichment work is a process that requires a number of steps including, but not limited to, defining the enrichment goal, choosing the enrichment provider, defining the enrichment tools, defining the technical requirements, writing instructions, ensuring that instructions make sense, setting worker hours, determining time spent on a particular task, communicating with enrichment workers, rejecting or accepting work, defining a project budget, determining workers’ payment, checking work quality, and providing performance feedback.

After assessing DeepMind’s existing practices and identifying what was needed to consistently source enriched data responsibly, PAI and DeepMind worked together to prototype the necessary policies and resources. The Responsible Data Enrichment Implementation Team (referred to in this case study as “the implementation team”), which consisted of PAI and members of DeepMind’s Responsible Development and Innovation team, then collected multiple rounds of feedback, testing the following outputs and changes with smaller teams before they were rolled out organization-wide:

A two-page document offering fundamental guidelines for responsible data enrichment sourcing
An updated ethics review process
A checklist detailing what constitutes “good instructions” for data enrichment workers
A table to easily compare the salient features of various data enrichment platforms and vendors
A spreadsheet listing the living wages in areas where data enrichment workers commonly live

Versions of these resources have been added to PAI’s responsible data enrichment sourcing library and are now available for any organization that wishes to improve its data enrichment sourcing practices.

Ultimately, DeepMind’s multidisciplinary teams developing AI research, including applied AI researchers (or “researchers” for the purposes of this case study, though this term might be defined differently elsewhere), said that these new processes felt efficient and helped them think more deeply about the impact of their work on data enrichment workers. They also expressed gratitude for centralized guidance that had been developed through a rigorous process, relieving them of the burden of individually figuring out how to set up data enrichment projects.

Data Enrichment

Data enrichment is the curation of data for the purposes of machine learning model development that requires human judgment and intelligence. This can include data preparation, cleaning, labeling, and human review of algorithmic outputs, sometimes performed in real time.

Examples of data enrichment work:

Data preparation, annotation, cleaning, and validation: intent recognition, sentiment tagging, image labeling
Human review (sometimes referred to as “human in the loop”): content moderation, validating low-confidence algorithmic predictions, speech-to-text error correction

While organizations hoping to adopt these resources may want to similarly engage with their teams to make sure their unique use cases are accounted for, we hope these tested resources will provide a better starting point to incorporate responsible data enrichment practices into their own workflows. Furthermore, to identify where the implemented changes fall short of ideal, we plan to continue developing this work through engagement and convenings. To stay informed, sign up for updates on PAI’s Responsible Sourcing Across the Data Supply Line Workstream page.

This case study details the process by which DeepMind adopted responsible data enrichment sourcing recommendations as organization-wide practice, how challenges that arose during this process were addressed, and the impact on the organization of adopting these recommendations. By sharing this account of how DeepMind did it and why they chose to invest time to do so, we intend to inspire other organizations developing AI to undertake similar efforts. It is our hope that this case study and these resources will empower champions within AI organizations to create positive change.

Table of Contents

Importance of Data Enrichment Workers and Pathways to Improve Working Conditions
Case Study as a Method of Increasing Transparency and Sharing Actionable Guidance
Background on DeepMind’s Motivations
Process and Outcomes of the DeepMind and PAI Collaboration
Changes and Resources Introduced to Support Adoption of Recommendations
Two-Page Data Enrichment Sourcing Guidelines Document
Adapted Review Process
Good Instructions Checklist
Vendor and Platform Feature Comparison Table
Living Wages Spreadsheet
Addressing Practical Complexities That Arose While Finalizing Changes
Assessing Clarity of Guidelines and Rolling Out Changes Organization-Wide
Reactions, Impact, and Next Steps
Response from Research and Development Teams
Key Stakeholders/Leadership Reflections and Motivations
Continued Work for DeepMind
Limitations of Case Study Applicability
Appendix A: Initial Discovery Process and Getting Reactions to PAI Responsible Sourcing Recommendations
Sources Cited

AI and Job Quality

Insights from Frontline Workers

PAI Staff

Based on an international study of on-the-job experiences with AI, this report draws from workers’ insights to point the way toward a better future for workplace AI. In addition to identifying common themes among workers’ stories, it provides guidance for key stakeholders who want to make a positive impact.

Across industries and around the world, AI is changing work. In the coming years, this rapidly advancing technology has the potential to fundamentally reshape humanity’s relationship with labor. As highlighted by previous Partnership on AI (PAI) research, however, the development and deployment of workplace AI often lack input from an essential group of experts: the people who directly interact with these systems in their jobs.

Bringing the perspectives of workers into this conversation is both a moral and pragmatic imperative. Despite the direct impact of workplace AI on them, workers rarely have direct influence in AI’s creation or decisions about its implementation. This neglect raises clear concerns about unforeseen or overlooked negative impacts on workers. It also undermines the optimal use of AI from a corporate perspective.

This PAI report, based on an international study of on-the-job experiences with AI, seeks to address this gap. Through journals and interviews, workers in India, sub-Saharan Africa, and the United States shared their stories about workplace AI. From their reflections, PAI identified five common themes:

  1. Executive and managerial decisions shape AI’s impacts on workers, for better and worse. This starts with decisions about business models and operating models, continues through technology acquisitions and implementations, and finally manifests in direct impacts to workers.
  2. Workers have a genuine appreciation for some aspects of AI in their work and how it helps them in their jobs. Their spotlights here point the way to more mutually beneficial approaches to workplace AI.
  3. Workplace AI’s harms are not new or novel — they are repetitions or extensions of harms from earlier technologies and, as such, should be possible to anticipate, mitigate, and eliminate.
  4. Current implementations of AI often serve to reduce workers’ ability to exercise their human skills and talents. Skills like judgment, empathy, and creativity are heavily constrained in these implementations. To the extent that the future of AI is intended to increase humans’ ability to use these talents, the present of AI is sending many workers in the opposite direction.
  5. Empowering workers early in AI development and implementation increases the opportunities to attain the aforementioned benefits and avoid the harms. Workers’ deep experience in their own roles means they should be treated as subject-matter experts throughout the design and implementation process.

In addition, PAI drew from these themes to offer opportunities for impact for the major stakeholders in this space:

  1. AI-implementing companies, who can commit to AI deployments that do not decrease employee job quality.
  2. AI-creating companies, who can center worker well-being and participation in their values, practices, and product designs.
  3. Workers, unions, and worker organizers, who can work to influence and participate in decisions about technology purchases and implementations.
  4. Policymakers, who can shape the environments in which AI products are developed, sold, and implemented.
  5. Investors, who can account for the downside risks posed by practices harmful to workers and the potential value created by worker-friendly technologies.

The actions of each of these groups have the potential to both increase the prosperity enabled by AI technologies and share it more broadly. Together, we can steer AI in a direction that ensures it will benefit workers and society as a whole.

Table of Contents

The need for workers’ perspectives on workplace AI
The contributions of this report
Our Approach
Key research questions
Research methods
Site selection
Who we learned from
Participant recruitment
Major Themes and Findings
Theme 1: Executive and managerial decisions shape AI’s impacts on workers, for better and worse
Theme 2: Workers appreciate how some uses of AI have positively changed their jobs
Theme 3: Workplace AI harms repeat, continue, or intensify known possible harms from earlier technologies
Theme 4: Current implementations of AI in work are reducing workers’ opportunities for autonomy, judgment, empathy, and creativity
Theme 5: Empowering workers early in AI development and implementation increases opportunities to implement AI that benefits workers as well as their employers
Opportunities for Impact
Stakeholder 1: AI-implementing companies
Stakeholder Group 2: AI-creating companies
Stakeholder Group 3: Workers, unions, and worker organizers
Stakeholder Group 4: Policymakers
Stakeholder Group 5: Investors
Appendix 1: Detailed Site and Technology Descriptions
Appendix 2: Research Methods
Sources Cited

Making AI Inclusive: 4 Guiding Principles for Ethical Engagement

Tina Park


While the concept of “human-centered design” is hardly new to the technology sector, recent years have seen growing efforts to build inclusive artificial intelligence (AI) and machine learning (ML) products. Broadly, inclusive AI/ML refers to algorithmic systems which are created with the active engagement of and input from people who are not on AI/ML development teams. This includes both end users of the systems and non-users who are impacted by the systems. To collect this input, practitioners are increasingly turning to engagement practices like user experience (UX) research and participatory design.

“Impacted non-user” refers to people who are impacted by the deployment of an AI/ML system, but are not the direct user or customer of that system. For example, in the case of students in the United Kingdom in 2020 whose A-level grades were determined by an algorithm, the “user” of the algorithmic system is Ofqual, the official exam regulator in England, and the students are “impacted non-users.”

Amid rising awareness of structural inequalities in our society, embracing inclusive research and design principles helps signal a commitment to equitable practices. As many proponents have pointed out, it also makes for good business: Understanding the needs of a more diverse set of people expands the market for a given product or service. Once engaged, these people can then further improve an AI/ML product, identifying issues like bias in algorithmic systems.

Despite these benefits, however, there remain significant challenges to greater adoption of inclusive development in the AI/ML field. There are also important opportunities. For AI practitioners, AI ethics researchers, and others interested in learning more about responsible AI, this Partnership on AI (PAI) white paper provides guidance to help better understand and overcome the challenges related to engaging stakeholders in AI/ML development.

Ambiguities around the meaning and goals of “inclusion” present one of the central challenges to AI/ML inclusion efforts. To make the changes needed for a more inclusive AI that centers equity, the field must first find agreement on foundational premises regarding inclusion. Recognizing this, this white paper provides four guiding principles for ethical engagement grounded in best practices:

  1. All participation is a form of labor that should be recognized
  2. Stakeholder engagement must address inherent power asymmetries
  3. Inclusion and participation can be integrated across all stages of the development lifecycle
  4. Inclusion and participation must be integrated to the application of other responsible AI principles

To realize ethical participatory engagement in practice, this white paper also offers three recommendations aligned with these principles for building inclusive AI:

  1. Allocate time and resources to promote inclusive development
  2. Adopt inclusive strategies before development begins
  3. Train towards an integrated understanding of ethics

This white paper’s insights are derived from the research study “Towards An Inclusive AI: Challenges and Opportunities for Public Engagement in AI Development.” That study drew upon discussions with industry experts, a multidisciplinary review of existing research on stakeholder and public engagement, and nearly 70 interviews with AI practitioners and researchers, as well as data scientists, UX researchers, and technologists working on AI and ML projects, over a third of whom were based in areas outside of the US, EU, UK, or Canada. Supplemental interviews with social equity and Diversity, Equity, and Inclusion (DEI) advocates contributed to the development of recommendations for individual practitioners, business team leaders, and the field of AI and ML more broadly.

This white paper does not provide a step-by-step guide for implementing specific participatory practices. It is intended to renew discussions on how to integrate a wider range of insights and experiences into AI/ML technologies, including those of both users and the people impacted (either directly or indirectly) by these technologies. Such conversations — between individuals, inside teams, and within organizations — must be had to spur the changes needed to develop truly inclusive AI.

Table of Contents

Guiding Principles for Ethical Participatory Engagement
Principle 1: All Participation Is a Form of Labor That Should Be Recognized
Principle 2: Stakeholder Engagement Must Address Inherent Power Asymmetries
Principle 3: Inclusion and Participation Can Be Integrated Across All Stages of the Development Lifecycle
Principle 4: Inclusion and Participation Must Be Integrated to the Application of Other Responsible AI Principles
Recommendations for Ethical Engagement in Practice
Recommendation 1: Allocate Time and Resources to Promote Inclusive Development
Recommendation 2: Adopt Inclusive Development Strategies Before Development Begins
Recommendation 3: Train Towards an Integrated Understanding of Ethics

After the Offer: The Role of Attrition in AI’s ‘Diversity Problem’

Jeffrey Brown

As a field, AI struggles to retain team members from diverse backgrounds. Given the far-reaching effects of algorithmic systems and the documented harms to marginalized communities, the fact that these communities are not represented on AI teams is particularly troubling. Why is this such a widespread phenomenon and what can be done to close the gap? This research paper, “After the Offer: The Role of Attrition in AI’s ‘Diversity Problem,’” seeks to answer these questions, providing four recommendations for how organizations can make the AI field more inclusive.


Amid heightened attention to society-wide racial and social injustice, organizations in the AI space have been urged to investigate the harmful effects that AI has had on marginalized populations. It’s an issue that engineers, researchers, project managers, and various leaders in both tech companies and civil society organizations have devoted significant time and resources to in recent years. In examining the effects of AI, organizations must consider who exactly has been designing these technologies.

Diversity reports have revealed that the people working at the organizations that develop and deploy AI lack diversity across several dimensions. While organizations have blamed pipeline problems in the past, research has increasingly shown that once workers belonging to minoritized identities get hired in these spaces, systemic difficulties affect their experiences in ways that their peers from dominant groups do not have to worry about.

Attrition in the tech industry is a problem that disproportionately affects minoritized workers. In AI, where technologies already have a disproportionately negative impact on these communities, this is especially troublesome.

Minoritized Workers

This report uses minoritized workers as an umbrella term to refer to people whose identities (in categories such as race, ethnicity, gender, or ability) have been historically marginalized by those in dominant social groups. The minoritized workers in this study include people who identified as minoritized within the identity categories of race and ethnicity, gender identity, sexual orientation, ability, and immigration status. Because this study was international in scope, it is important to note that these categories are relative to their social context.

We are left wondering: What leads to these folks leaving their teams, organizations, or even the AI field more broadly? What about the AI field in particular influences these people to stay or leave? And what can organizations do to stem this attrition to make their environments more inclusive?

The current study uses interviews with folks belonging to minoritized identities across the AI field, managers, and DEI (diversity, equity, and inclusion) leaders in tech to get rich information about what aspects of cultures within an organization promote inclusion or contribute to attrition. Themes that emerged during these interviews formed 3 key takeaways:

  1. Diversity makes for better team climates
  2. Systemic supports are difficult but necessary to undo the current harms to minoritized workers
  3. Individual efforts to change organizational culture fall disproportionately on minoritized folks who are usually not professionally rewarded for their efforts

In line with these takeaways, the study makes 4 recommendations about what can be done to make the AI field more inclusive for workers:

  1. Organizations must systemically support ERGs
  2. Organizations must intentionally diversify leadership and managers
  3. DEI trainings must be specific in order to be effective and be more connected to the content of AI work
  4. Organizations must interrogate their values as practiced and fundamentally alter them to include the perspectives of people who are not White, cis, or male

These takeaways and recommendations are explored in more depth below.

Key Takeaways

1. Diversity makes for better team climates

Across interviews, participants consistently expressed that managers who belonged to minoritized identities or who took the time to learn about working with diverse identities were more supportive of their needs and career goals. Such efforts reportedly resulted in teams that were more diverse, inclusive, and interdisciplinary, and that had a more positive team culture and climate. In these environments, workers belonging to minoritized identities thrived. A diversity in backgrounds and perspectives was particularly important for AI teams that needed to solve interdisciplinary problems.

Conversely, the negative impact of work environments that were sexist or where participants experienced acts of prejudice such as microaggressions was also a recurring theme.

While collaborative or positive work environments were also a common theme, such environments did not in themselves negate predominant cultures which deprioritized “DEI-focused” work, work that was highly interdisciplinary, or work that did not serve the dominant group. Negative organizational cultures seemed to exacerbate experiences of prejudice or discrimination on AI teams.

2. Systemic supports are difficult but necessary to undo the current harms to minoritized workers

Participants belonging to minoritized identities said that they either left or intended to leave organizations that did not support their continued career growth or possessed values that did not align with their own. Consistent with this, participants described examples of their organizations not valuing the content of their work.

Participants also tied their desires to leave with instances of prejudice or discrimination, which may also be related to “toxic” work environments. Some participants reported instances of being tokenized or being subject to negative stereotypes about their identity groups, somewhat reflective of wider contexts in tech beyond AI.

Systemic supports include incentive structures that allow minoritized workers to succeed at every level, from the teams that they work with actively validating their experiences to their managers finding the best ways for them to deliver work products in accordance with both individual and institutional needs. Guidelines for promotion that recognize the barriers these workers face in environments mostly occupied by dominant group norms are another important support.

3. Individual efforts to change organizational culture fall disproportionately on minoritized folks who are usually not professionally rewarded for their efforts

Individuals discussed ways in which they tried to make their workplaces or teams more inclusive or otherwise sought to incorporate diverse perspectives into their work around AI. Participants sometimes had to contend with bias against DEI efforts, reporting that other workers in their organizations would dismiss their efforts as lacking rigor or focus on the product.

There were some institutional efforts to foster a more inclusive culture, most commonly DEI trainings. DEI trainings that were very specific to some groups (e.g., gender diverse folks, Black people) were reported as being the most effective. However, even when they were specific, DEI trainings seemed to be disconnected from some aspects of the workplace climate or the content of what teams were working on.

Participants who mentioned Employee Resource Groups (ERGs) uniformly praised them, discussing the huge positive impact they had on a personal level, forming the bases of their social support networks in their organizations and having a strong impact on their ability to integrate aspects of their identities or other “DEI topics” they were passionate about into their work.

Recommendations



1. Organizations must systemically support ERGs

Employees specifically named ERGs as one of their main sources of support even in work environments that were otherwise toxic. Additionally, ERGs provided built-in mentorship for those who did not have ready access to mentors or whose supervisors had not done the work to understand the kinds of support needed for those of minoritized identities to thrive in predominantly White and male environments.

What makes this recommendation work?

Within these ERGs, there existed other grass-roots initiatives that supported workers, such as informal talking circles and networks of employees that essentially provided peer mentoring that participants found crucial to navigating White- and male-dominated spaces. The mentorship provided by ERGs was also essential when HR failed to provide systemic support for staff and instead prioritized protecting the organization.

What must be in place?

While participants uniformly praised ERGs, they required large amounts of time from staff members that detracted from their work. Such groups also ran the risk of getting taken over by leadership and having their original mission derailed. Institutions should seek a balance between supporting these groups and giving them the freedom to organize in pursuit of their own best interests.

What won’t this solve?

ERGs will not necessarily make an organization’s AI or tech more inclusive. Rather, systematically supporting ERGs will provide more support and community for minoritized workers, which is meant to promote a more inclusive workplace in general.

2. Organizations must intentionally diversify leadership and managers

What makes this recommendation work?

Participants repeatedly pointed to managers and upper-level leaders who belonged to minoritized identities (especially racial ones) as important influences, changing policy that permeated through various levels of their organizations. A diverse workforce may also bring with it multiple perspectives, including those belonging to people from different disciplines who may be interested in working in the AI field due to the opportunity for interdisciplinary collaboration, research, and product development. Bringing in folks from various academic, professional, and technical backgrounds to solve problems is especially crucial for AI teams.

What must be in place?

There must be understanding about the reasons behind the lack of diversity and the “bigger picture” of how powerful groups more easily perpetuate power structures already in place. Participants spoke of managers who did not belong to minoritized identities themselves but who took the time to learn in depth about differences in power and privilege in the tech ecosystem, appreciating the diverse perspectives that workers brought. These managers, while not perfect, tended to take advocating for their reports very seriously, particularly female reports who often went overlooked.

What won’t this solve?

Intentionally diversifying leadership and managers will not automatically create a pipeline for diversity at the leadership level, nor will it automatically override institutional culture or policies that ignore DEI best practices.

3. DEI trainings must be specific in order to be effective and be more connected to the content of AI work

What makes this recommendation work?

Almost all participants reported that their organizations mandated some form of DEI training for all staff. These ranged widely, from very general ones to very specific trainings that discussed cultural competency about more specific groups of people (e.g., participants reported that there were trainings on anti-Black racism). Participants discussed that the more specific trainings tended to be more impactful.

What must be in place?

Organizations must invest in employees who see the importance of inclusive values in AI research and product design. Participants pointed to the importance of managers who had an ability to foster inclusive team values, which was not something that HR could mandate.

What won’t this solve?

As several participants observed, DEI trainings will not uproot or counteract institutional stigmas against DEI. It would take sustained effort and deliberate alignment of values for an organization to emphasize DEI in its work.

4. Organizations must interrogate their values as practiced and fundamentally alter them to include the perspectives of people who are not White, cis, or male

What makes this recommendation work?

Participants frequently reported that a misalignment of values was a primary reason they left, or wanted to leave, their organizations. Participants in this sample discussed joining the AI field to create a positive impact while growing professionally. They were therefore disappointed when their organizations did not prioritize these goals, despite listing them among their stated values.

What must be in place?

Participants found it frustrating when organizations stated that they valued diversity and then failed to live up to this value with hiring, promotion, and day-to-day operations, ignoring the voices of minoritized individuals. If diversity is truly a value, organizations may have to investigate their systems of norms and expectations that are fundamentally male, Eurocentric, and do not make space for those from diverse backgrounds. They then must take additional steps to consider how such systems influence their work in AI.

What won’t this solve?

Because a fundamental realignment like this is a comprehensive, longer-term solution, it cannot satisfy the most immediate and urgent needs for reform, such as the need for more diversity in leadership and on teams in general. In the short term, organizations must work with DEI professionals to recognize how they are perpetuating potentially harmful norms of the dominant group and work to create policies that are more equitable.

Table of Contents

Why Study Attrition of Minoritized Workers in AI?
Problems Due to Lack of Diversity of AI Teams
More Diverse Teams Yield Better Outcomes
Current Level of Diversity in Tech
Diversity in AI
What Has Been Done
Attrition in Tech
Current Study and Methodology
Efforts to Improve Inclusivity
Summary and the Path Forward
Appendix 1: Recruitment Document
Appendix 2: Privacy Document
Appendix 3: Research Protocol
Appendix 4: Important Terms

Fairer Algorithmic Decision-Making and Its Consequences: Interrogating the Risks and Benefits of Demographic Data Collection, Use, and Non-Use

PAI Staff

Introduction and Background



Algorithmic decision-making has been widely accepted as a novel approach to overcoming the purported cognitive and subjective limitations of human decision makers by providing “objective” data-driven recommendations. Yet, as organizations adopt algorithmic decision-making systems (ADMS), countless examples of algorithmic discrimination continue to emerge. Harmful biases have been found in algorithmic decision-making systems in contexts such as healthcare, hiring, criminal justice, and education, prompting increasing social concern regarding the impact these systems are having on the wellbeing and livelihood of individuals and groups across society. In response, algorithmic fairness strategies attempt to understand how ADMS treat certain individuals and groups, often with the explicit purpose of detecting and mitigating harmful biases.

Many current algorithmic fairness techniques require access to data on a “sensitive attribute” or “protected category” (such as race, gender, or sexuality) in order to make performance comparisons and standardizations across groups. These demographic-based algorithmic fairness techniques assume that discrimination and social inequality can be overcome with clever algorithms and collection of the requisite data, removing broader questions of governance and politics from the equation. This paper seeks to challenge this assumption, arguing instead that collecting more data in support of fairness is not always the answer and can actually exacerbate or introduce harm for marginalized individuals and groups. We believe more discussion is needed in the machine learning community around the consequences of “fairer” algorithmic decision-making. This involves acknowledging the value assumptions and trade-offs associated with the use and non-use of demographic data in algorithmic systems. To advance this discussion, this white paper provides a preliminary perspective on these trade-offs derived from workshops and conversations with experts in industry, academia, government, and advocacy organizations as well as literature across relevant domains. In doing so, we hope that readers will better understand the affordances and limitations of using demographic data to detect and mitigate discrimination in institutional decision-making more broadly.
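To make concrete what a “performance comparison across groups” involves, the following is a minimal illustrative sketch, not a method taken from this paper. The records, the group labels `"A"` and `"B"`, and the function names are all hypothetical. It computes a per-group selection rate and accuracy, then the gap in selection rates (often called the demographic parity difference).

```python
from collections import defaultdict


def rates_by_group(records):
    """Compute selection rate and accuracy for each demographic group.

    Each record is a (group, y_true, y_pred) tuple with binary labels.
    The group value is the "sensitive attribute" the technique depends on.
    """
    totals = defaultdict(lambda: {"n": 0, "selected": 0, "correct": 0})
    for group, y_true, y_pred in records:
        t = totals[group]
        t["n"] += 1
        t["selected"] += y_pred               # count positive decisions
        t["correct"] += int(y_true == y_pred)  # count correct decisions
    return {
        g: {
            "selection_rate": t["selected"] / t["n"],
            "accuracy": t["correct"] / t["n"],
        }
        for g, t in totals.items()
    }


def demographic_parity_difference(stats):
    """Largest gap in selection rate between any two groups."""
    rates = [s["selection_rate"] for s in stats.values()]
    return max(rates) - min(rates)


# Hypothetical toy data: (group, true outcome, model decision).
records = [
    ("A", 1, 1), ("A", 0, 1), ("A", 1, 1), ("A", 0, 0),
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 1), ("B", 0, 0),
]

stats = rates_by_group(records)
gap = demographic_parity_difference(stats)
print(stats)
print("selection-rate gap:", gap)
```

In this toy data both groups have the same accuracy, yet their selection rates differ, which is the kind of disparity such comparisons are meant to surface. Note that the calculation cannot be run at all without the group column, which is why these techniques presuppose the collection of demographic data.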



Demographic-based algorithmic fairness techniques presuppose the availability of data on sensitive attributes or protected categories. However, previous research has highlighted that data on demographic categories, such as race and sexuality, are often unavailable due to a range of organizational challenges, legal barriers, and practical concerns (Andrus, M., Spitzer, E., Brown, J., & Xiang, A. (2021). “What We Can’t Measure, We Can’t Understand”: Challenges to Demographic Data Procurement in the Pursuit of Fairness. ArXiv:2011.02282 (Cs)). Some privacy laws, such as the EU’s GDPR, not only require data subjects to provide meaningful consent when their data is collected, but also prohibit the collection of sensitive data such as race, religion, and sexuality. Some corporate privacy policies and standards, such as Privacy By Design, call for organizations to be intentional with their data collection practices, only collecting data they require and can specify a use for. Given the uncertainty around whether or not it is acceptable to ask users and customers for their sensitive demographic information, most legal and policy teams urge their corporations to err on the side of caution and not collect these types of data unless legally required to do so. As a result, concerns over privacy often take precedence over ensuring product fairness since the trade-offs between mitigating bias and ensuring individual or group privacy are unclear (Andrus et al., 2021).

In cases where sensitive demographic data can be collected, organizations must navigate a number of practical challenges throughout its procurement. For many organizations, sensitive demographic data is collected through self-reporting mechanisms. However, self-reported data is often incomplete, unreliable, and unrepresentative, due in part to a lack of incentives for individuals to provide accurate and full information (Andrus et al., 2021). In some cases, practitioners choose to infer protected categories of individuals based on proxy information, a method which is largely inaccurate. Organizations also face difficulty capturing unobserved characteristics, such as disability, sexuality, and religion, as these categories are frequently missing and often unmeasurable (Tomasev, N., McKee, K. R., Kay, J., & Mohamed, S. (2021). Fairness for Unobserved Characteristics: Insights from Technological Impacts on Queer Communities. ArXiv:2102.04257 (Cs)). Overall, deciding on how to classify and categorize demographic data is an ongoing challenge, as demographic categories continue to shift and change over time and between contexts. Once demographic data is collected, antidiscrimination law and policies largely inhibit organizations from using this data since knowledge of sensitive categories opens the door to legal liability if discrimination is uncovered without a plan to successfully mitigate it (Andrus et al., 2021).

In the face of these barriers, corporations looking to apply demographic-based algorithmic fairness techniques have called for guidance on how to responsibly collect and use demographic data. However, prescribing statistical definitions of fairness on algorithmic systems without accounting for the social, economic, and political systems in which they are embedded can fail to benefit marginalized groups and undermine fairness efforts (Bakalar, C., Barreto, R., Bogen, M., Corbett-Davies, S., Hall, M., Kloumann, I., Lam, M., Candela, J. Q., Raghavan, M., Simons, J., Tannen, J., Tong, E., Vredenburgh, K., & Zhao, J. (2021). Fairness On The Ground: Applying Algorithmic Fairness Approaches To Production Systems. 12). Therefore, developing guidance requires a deeper understanding of the risks and trade-offs inherent to the use and non-use of demographic data. Efforts to detect and mitigate harms must account for the wider contexts and power structures that algorithmic systems, and the data that they draw on, are embedded in.

Finally, though this work is motivated by the documented unfairness of ADMS, it is critical to recognize that bias and discrimination are not the only possible harms stemming directly from ADMS. As recent papers and reports have forcefully argued, focusing on debiasing datasets and algorithms is (1) often misguided because proposed debiasing methods are only relevant for a subset of the kinds of bias ADMS introduce or reinforce, and (2) likely to draw attention away from other, possibly more salient harms (Balayn, A., & Gürses, S. (2021). Beyond Debiasing. European Digital Rights. uploads/2021/09/EDRi_Beyond-Debiasing-Report_Online.pdf). In the first case, harms from tools such as recommendation systems, content moderation systems, and computer vision systems might be characterized as a result of various forms of bias, but resolving bias in those systems generally involves adding in more context to better understand differences between groups, not just trying to treat groups more similarly. In the second case, there are many ADMS that are clearly susceptible to bias, yet the greater source of harm could arguably be the deployment of the system in the first place. Pre-trial detention risk scores provide one such example. Using statistical correlations to determine if someone should be held without bail, or, in other words, potentially punishing individuals for attributes outside of their control and past decisions unrelated to what they are currently being charged for, is itself a significant deviation from legal standards and norms, yet most of the debate has focused on how biased the predictions are. Attempting to collect demographic data in these cases will likely do more harm than good, as demographic data will draw attention away from harms inherent to the system and towards seemingly resolvable issues around bias.

Fairer Algorithmic Decision-Making and Its Consequences: Interrogating the Risks and Benefits of Demographic Data Collection, Use, and Non-Use

Social Risks of Non-Use

Hidden Discrimination

“Colorblind” Decision-Making

Invisibility to Institutions of Importance

Social Risks of Use

Risks to Individuals

Encroachments on Privacy and Personal Life

Individual Misrepresentation

Data Misuse and Use Beyond Informed Consent

Risks to Communities

Expanding Surveillance Infrastructure in the Pursuit of Fairness

Misrepresentation and Reinforcing Oppressive or Overly Prescriptive Categories

Private Control Over Scoping Bias and Discrimination

Conclusion and Acknowledgements



Table of Contents

ABOUT ML Reference Document

Last Updated

To share your ideas, suggestions, and other feedback related to this evolving document, please reach out to Sarah Villeneuve, Lead of Fairness, Transparency, Accountability & ABOUT ML. Learn more about the origins of ABOUT ML and contributors to the project here.

Section 0: How to Use this Document

This ABOUT ML Reference Document is a reference and foundational resource. Future contributions of the ABOUT ML work will include a PLAYBOOK of specifications, guides, recommendations, templates, and other meaningful artifacts to support ML documentation work by individuals in any and all of the roles listed below. Use cases made up of various artifacts from the PLAYBOOK along with other implementation instructions will be packaged as PILOTS for PAI Partners to try out in their organizations. Feedback from their use of these cases will further mature the artifacts in the PLAYBOOK and will support the ABOUT ML team’s continued, rigorous, scientific investigation of relevant research questions in the ML documentation space.

Recommended Reading Plan

Based on the role a reader plays in their organization and/or the community of stakeholders they belong to, there are several different approaches for reading and using the information in this ABOUT ML Reference Document:

Role: Recommendations

ML system developers/deployers: ML system developers/deployers are encouraged to do a deep dive exploration of Section 3: Preliminary Synthesized Documentation Suggestions and use it to highlight gaps in their current understanding of both data- and model-related documentation and planning needs. This group will most benefit from further participation in the ABOUT ML effort by engaging with the community in the forthcoming online forum and by testing the efficacy and applicability of templates and specifications to be published in the PLAYBOOK and PILOTS, which will be developed based on use cases as an opportunity to implement ML documentation processes within an organization.

ML system procurers: ML system procurers might explore Section 2.2: Documentation to Operationalize AI Ethics Goals to get ideas about what concepts to include as requirements for models and data in future requests for proposals relevant to ML systems. Additionally, they could use Section 2.3: Research Themes on Documentation for Transparency to shape conversations with the business owners and requirements writers to further elicit detailed key performance indicators and measures for success for any procured ML systems.

Users of ML system APIs and/or experienced end users of ML systems: Users of ML system APIs and/or experienced end users of ML systems might skim the document and review all of the coral-colored Quick Guides to get a better understanding of how ML concepts are relevant to many of the tools they regularly use. A review of Section 2.1: Demand for Transparency and AI Ethics in ML Systems will provide insight into conditions where it is appropriate to use ML systems. This section also explains how transparency is a foundation for both internal accountability among the developers, deployers, and API users of an ML system and external accountability to customers, impacted non-users, civil society organizations, and policymakers.

Internal compliance teams: Internal compliance teams are encouraged to explore Section 4: Current Challenges of Implementing Documentation and use it to shape conversations with developer/deployment teams to find ways to measure compliance throughout the Machine Learning Lifecycle (MLLC).

External auditors: External auditors could skim Appendix: Compiled List of Documentation Questions and familiarize themselves with high-level concepts as well as tactically operationalized tenets to look for in their determination of whether or not an ML System is well-documented.

Lay users of ML systems and/or members of low-income communities: Lay users of ML systems and/or members of low-income communities might skim the document and review all of the blue-colored How We Define boxes in order to get an overarching understanding of the text’s contents. These users are encouraged to continue learning ABOUT ML systems by exploring how they might impact their everyday lives. Additional insights can be gathered from the Glossary section of this Reference Document.

Quick Guides

More information about a topic. Oftentimes, this will be a high-level and less academic expression of a term or concept.

Throughout this ABOUT ML Reference Document, we will use coral callout boxes with text to further explain a concept. This is a readability enhancement tactic recommended by our Diverse Voices panel and is meant to make the content more accessible and consumable to lay users of machine learning systems.

How We Define

Example Term

We’ll use this space to give background definitions of terms and phrases and, in some cases, to call out existing work related to the ABOUT ML effort.

Throughout this ABOUT ML Reference Document, we will use the blue callout boxes with text to showcase our accepted (near-consensus) definition of a term or phrase. This is meant to give foundational background information to viewers of the document and also provides a baseline of understanding for any artifacts that may be derived from this work. Additional terms can be found in the glossary section. Future versions of this reference and/or artifacts in the forthcoming PLAYBOOK will explore audio/video offerings to support the consumption of this information by verbal/visual learners.

Contact for Support

If you have any questions or would like to learn more about this effort, please reach out to us by:

Visiting our ABOUT ML page to make contributions to the work

ABOUT ML Reference Document

Section 0: How to Use this Document

Recommended Reading Plan

Quick Guides

How We Define

Contact for Support

Section 1: Project Overview

1.1 Statement of Importance for ABOUT ML Project

1.1.0 Importance of Transparency: Why a Company Motivated by the Bottom Line Should Adopt ABOUT ML Recommendations

1.1.1 About This Document and Version Numbering

1.1.2 ABOUT ML Goals and Plan

1.1.3 ABOUT ML Project Process and Timeline Overview

1.1.4 Who Is This Project For?

  • Audiences for the ABOUT ML Resources
  • Stakeholders That Should Be Consulted While Putting Together ABOUT ML Resources
  • Audiences for ABOUT ML Documentation Artifacts
  • Whose Voices Are Currently Reflected in ABOUT ML?
  • Origin Story

Section 2: Literature Review (Current Recommendations on Documentation for Transparency in the ML Lifecycle)

2.1 Demand for Transparency and AI Ethics in ML Systems 

2.2 Documentation to Operationalize AI Ethics Goals

2.2.1 Documentation as a Process in the ML Lifecycle

2.2.2 Key Process Considerations for Documentation

2.3 Research Themes on Documentation for Transparency 

2.3.1 System Design and Set Up

2.3.2 System Development

2.3.3 System Deployment

Section 3: Preliminary Synthesized Documentation Suggestions

3.4.1 Suggested Documentation Sections for Datasets

  • Data Specification
  • Motivation
  • Data Curation
  • Collection
  • Processing
  • Composition
  • Types and Sources of Judgement Calls
  • Data Integration
  • Use
  • Distribution
  • Maintenance

3.4.2 Suggested Documentation Sections for Models

  • Model Specifications
  • Model Training
  • Evaluation
  • Model Integration
  • Maintenance

Section 4: Current Challenges of Implementing Documentation

Section 5: Conclusions

Version 0

Version 1

Appendix A: Compiled List of Documentation Questions 

Fact Sheets (Arnold et al. 2018)

Data Sheets (Gebru et al. 2018)

Model Cards (Mitchell et al. 2018)

A “Nutrition Label” for Privacy (Kelley et al. 2009)

The Dataset Nutrition Label: A Framework To Drive Higher Data Quality Standards (Holland et al. 2019)

Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science (Bender and Friedman 2018)

Appendix B: Diverse Voices Process and Artifacts

Procurement Recruitment Email

Procurement Confirmation Email 

Appendix C: Glossary

Table of Contents

Responsible Sourcing of Data Enrichment Services

PAI Staff

As AI becomes increasingly pervasive, there has been growing and warranted concern over the effects of this technology on society. To fully understand these effects, however, one must closely examine the AI development process itself, which impacts society both directly and through the models it creates. This white paper, “Responsible Sourcing of Data Enrichment Services,” addresses an often overlooked aspect of the development process and what AI practitioners can do to help improve it: the working conditions of data enrichment professionals, without whom the value being generated by AI would be impossible. This paper’s recommendations will be an integral part of the shared prosperity targets being developed by Partnership on AI (PAI) as outlined in the AI and Shared Prosperity Initiative’s Agenda.

High-precision AI models are dependent on clean and labeled datasets. While obtaining and enriching data so it can be used to train models is sometimes perceived as a simple means to an end, this process is highly labor-intensive and often requires data enrichment workers to review, classify, and otherwise manage massive amounts of data. Despite the foundational role played by these data enrichment professionals, a growing body of research reveals the precarious working conditions these workers face. This may be the result of efforts to hide AI’s dependence on this large labor force when celebrating the efficiency gains of technology. Out of sight is also out of mind, which can have deleterious consequences for those being ignored.

Data Enrichment Choices Impact Worker Well-being

There is, however, an opportunity to make a difference. The decisions AI developers make while procuring enriched data have a meaningful impact on the working conditions of data enrichment professionals. This paper focuses on how these sourcing decisions impact workers and proposes avenues for AI developers to meaningfully improve their working conditions, outlining key worker-oriented considerations that practitioners can use as a starting point to raise conversations with internal teams and vendors. Specifically, this paper covers worker-centric considerations for AI companies making decisions in:

  • selecting data enrichment providers,
  • running pilots,
  • designing data enrichment tasks and writing instructions,
  • assigning tasks,
  • defining payment terms and pricing,
  • establishing a communication cadence with workers,
  • conducting quality assurance,
  • and offboarding workers from a project.

This paper draws heavily on insights and input gathered during semi-structured interviews with members of the AI enrichment ecosystem conducted throughout 2020 as well as during a five-part workshop series held in the fall of 2020. The workshop series brought together more than 30 professionals from different areas of the data enrichment ecosystem, including representatives from data enrichment providers, researchers and product managers at AI companies, and leaders of civil society and labor organizations. We’d like to thank all of them for their engaged participation and valuable feedback. We’d also like to thank Elonnai Hickok for serving as the lead researcher on the project and Heather Gadonniex for her committed support and championship. Finally, this work would not be possible without the invaluable guidance, expertise, and generosity of Mary Gray.

Our intention with this paper is to aid the industry in accounting for wellbeing when making decisions about data enrichment and to set the stage for further conversations within and across AI organizations. Additional work is needed to ensure industry practices recognize, appreciate, and fairly compensate the workers conducting data enrichment. To that end, we want to use this paper as an opportunity to increase awareness amongst practitioners and launch a series of conversations. We recognize that there is a lot of variance in practices across the industry and hope to start a productive dialogue with organizations across the spectrum who are working through these questions. If you work at a company involved in building AI and want to host a conversation with your colleagues around data enrichment practices, we would love to join and help facilitate the conversation. If you are interested, please get in touch here.

To read “Responsible Sourcing of Data Enrichment Services” in full, click here.