
Protecting AI’s Essential Workers: A Pathway to Responsible Data Enrichment Practices


When we think about the way that AI will transform society, the conversation often focuses on its impact once it is deployed. However, there is less attention devoted to understanding the conditions under which AI is developed. Bringing greater scrutiny to the development process is important not only to ensure that the resulting AI models are safe for society but also to ensure that we are not harming society as a means to develop AI. 

One key aspect of the development process that PAI has been working on since 2020 is the role and conditions of data enrichment workers – workers who label, categorize, annotate, or otherwise enrich data so it can be used to train high-quality AI models. Data enrichment work includes activities like labeling objects in traffic videos to train self-driving vehicles, identifying when online content is toxic to train content moderation algorithms, labeling medical images to train AI radiology software, responding to prompts with short stories to train large language models, and other tasks required to train and maintain AI models.

While mainstream media outlets have declared the onset of the “AI Revolution,” this excitement frequently overlooks the humans using their very human intelligence to train AI models. By focusing on end results, the dominant AI discourse ignores questions about the conditions under which humans are embedding their intelligence into AI models. This has enabled the development of a haphazard global data supply chain in which developers and contracting companies provide data enrichment workers with poor working conditions. 

To ensure that we build an equitable economy around AI, it is critical that the data enrichment workers responsible for enabling AI development are treated fairly.

The AI industry has the power to improve conditions for data enrichment workers, and it can leverage solutions and approaches developed in other sectors to tackle similar problems. Guided by our research and best practices from other fields, we have proposed the following Path for Developing Responsible AI Supply Chains:

  1. Act: Adopt the Data Enrichment Guidelines to improve conditions for data enrichment workers.
  2. Introduce Internal Governance: Embed these guidelines into an internal governance mechanism that enables consistent adherence to the guidelines by key decision-makers across the organization.
  3. Promote Consistent Practices Across the Supply Chain: Monitor impact on workers throughout the data enrichment value chain and make improvements as needed. This will require AI-developing companies to work closely with downstream vendors and platforms to ensure that data enrichment best practices are being followed.
  4. Publish Transparency Reports: Publicly report on activities relating to data enrichment practices across their supply chain. These reports should include the data enrichment practices and policies being followed by the company, internal governance mechanisms to create consistency in adherence to these policies, how the company works with their downstream vendors, and other key data enrichment practices.
  5. Include Workers’ Voices When Evolving Best Practices: Develop a pathway for workers and worker representatives to shape future iterations of best practices. This is important because the nature of data enrichment work, and the harms associated with it, will continue to evolve as AI development progresses.

We see these actions as initial steps to enable better accountability and governance across the data supply chain. We hope that these actions enable worker representatives, policymakers, auditors, and others to hold key actors across the supply chain accountable for their impact on data enrichment workers. As we continue working with the Community of Practice members and other key stakeholders to adopt these changes, we welcome feedback.

How We Got Here

PAI has investigated how to improve conditions for this workforce for the past four years – starting with a five-part multistakeholder workshop series in 2020. The findings from this series, along with research in the field, informed our 2021 white paper, which offered recommendations for AI-developing companies on improving conditions for data enrichment workers. To further this work and push for the adoption of these recommendations, we began conversations with various AI developers to see what it would take for them to act on them.

Based on the white paper’s recommendations, we drafted and tested the implementation of the Data Enrichment Sourcing Guidelines. We first collaborated with DeepMind, one of our partner companies, to pilot the guidelines in their data enrichment workflows. Our resulting case study demonstrated that these recommendations were practical and feasible. We then began engaging with more industry partners to understand what it would take to achieve industry-wide adoption, holding a workshop in June 2023 with AI-developing companies that focused on barriers to adoption. This workshop yielded three key insights, commonly seen in other globally distributed supply chains:

  1. Advancing just labor conditions for data enrichment workers will require AI-developing companies to work closely with other actors across the supply chain, including downstream vendors. 
  2. Greater transparency and accountability are necessary to drive widespread adoption of responsible data enrichment practices across the value chain and AI industry.
  3. Cross-industry action on these issues will limit “race to the bottom” dynamics that harm workers, and establish baselines for acceptable treatment of workers. 

Since the June 2023 workshop, this Community of Practice has continued to meet regularly to identify pathways for addressing these themes and to push for adoption within their respective companies. Our work with this Community of Practice, along with conversations with our broader partner network, civil society organizations, and labor advocates, has helped inform our proposed next steps for AI-developing companies: the Path for Developing Responsible AI Supply Chains outlined above.

Going Forward

Many of these recommendations align with actions that have been put in place to better protect workers and human rights in other supply chains. Companies can learn from and leverage the existing infrastructure used in other supply chains to extend these processes to cover data enrichment workers.

We are motivated by the recent interest in improving working conditions for data enrichment workers, and we will continue to push for more widespread adoption of responsible data enrichment practices. In the coming months, we will be sharing:

  • Our takeaways from a workshop we co-hosted with Fairwork in May 2024, which brought together key AI-developing companies, data enrichment vendors, and experts from intergovernmental organizations, civil society, and academia. The workshop focused on identifying how to create shared responsibility for data enrichment workers across the entire value chain.
  • A first draft of two additional resources for our existing resource library, released for public comment: a Vendor Engagement Guide and a Transparency Reporting Template. Created in part to address the challenges identified through our 2023 workshop, these resources are intended to make it easier for the industry to uphold better data enrichment practices. The public comment period is meant to elicit feedback from worker representatives, policymakers, human rights advocates, procurement and supply chain professionals, and other key experts, which will help shape the final versions to be published on our website.
  • Our efforts to help inform policymakers and key government agencies about the harms across the data supply chain and the role that they can play in improving conditions for data enrichment workers.

If you are interested in getting involved in this work, please contact Sonam Jindal at sonam@partnershiponai.org. To stay up to date on how PAI continues to make an impact in this space, sign up for our newsletter!