Reactions, Impact, and Next Steps


Response from Research and Development Teams


Establishing clear processes with supporting resources empowered researchers and reduced the burden on them to determine best practices on their own.

Our intention from the start of this collaboration was to improve conditions for workers by building and testing resources that could be feasibly adopted by the people setting up data enrichment projects.

From early conversations, the implementation team heard from many researchers that they were actively seeking more guidance on how to set up data enrichment projects. Some of this was motivated by concerns over data quality. Researchers recognized that some tasks could only be completed by human annotators and were eager for guidance on how to set up projects in a way that would result in high-quality datasets. Researchers with experience in setting up data enrichment projects were exploring how to share best practices across the company and appreciated that there was an effort to align on worker-oriented considerations, given that those challenges required buy-in and input from a more diverse set of internal stakeholders. These conversations underscored the desire for guidance, affirmed researchers’ appetite to engage on questions related to data enrichment, and presented an opportunity to shape best practices that incorporated worker-oriented considerations from the start.

While there was an appetite for guidance, some researchers raised concerns over the cost and additional time needed to implement the guidelines. DeepMind’s Responsible Development and Innovation team and leadership concluded that introducing a set of clear guidelines and a dedicated review process for any new data enrichment projects would actually save researchers’ time and reduce long-term costs through the creation of higher-quality datasets. By standardizing best practices for data enrichment projects and providing actionable guidance to researchers, DeepMind was able to lessen the burden on individual researchers who previously had to grapple with questions they didn’t necessarily have the expertise or experience to address on their own.


Early feedback from teams using the new guidelines was generally positive. They appreciated having a clear set of best practices to follow, and shared that having an open and collaborative review process provided them with a useful level of reassurance. They highlighted that the process felt efficient and helped prompt them to consider the impact of their project design on the user experience of data enrichment workers. After going through the new process and completing a data enrichment project, one researcher shared that the experience made them realize how useful the guidelines and resources on the internal site were, reflecting, “Everything you have on the microsite is exactly what we needed.” Through conversations with researchers using these guidelines, it also became clear that researchers recognized and appreciated the attentiveness required to work with data enrichment workers in order to build higher-quality datasets. This reaffirmed that the design of data enrichment projects, and specifically the work that data enrichment workers do, is central to building high-quality AI models.

The implementation team also received constructive feedback that helped us make a few additional adjustments to the guidance itself and gave us a deeper understanding of the level of guidance that researchers sought. For instance, teams brought up that it would be helpful to have a set of examples showing how other teams had set up similar projects. The Responsible Development and Innovation team will collect such examples from internal teams as more projects go through the process and make them available to prospective teams (pending consent). To complement this, researchers also thought it would be helpful to have additional workshops or office hours where they could learn best practices and pose questions.

The requests for more examples from teams that had already set up data enrichment projects, and for more opportunities to engage directly with people knowledgeable about data enrichment, demonstrated that researchers appreciated the guidance. Acting on this feedback, the Responsible Development and Innovation team will host regular office hours to help DeepMind researchers get acquainted with the changes and to serve as an additional resource for teams setting up data enrichment projects.

Key Stakeholders/Leadership Reflections and Motivations


Seeing data enrichment as critical to dataset quality and recognizing the importance of ethical data supply chains, leaders at DeepMind appreciated the standardization of responsible data enrichment practices.

As the need for enriched data and, therefore, data enrichment workers has grown across DeepMind, more individuals across the organization were thinking through how to design data enrichment projects. However, much of this thinking was happening in silos. By initiating this effort to centralize an ethics review process, the Responsible Development and Innovation team brought together stakeholders from across the organization to provide feedback on the proposed ethical data enrichment guidelines and review process. This effort tapped into a strong appetite to engage on these topics and prompted people to think more critically about how they were constructing data enrichment projects. Beyond wanting to know how to design experiments well, there was also a recognition that this needed to be done in an ethical and safe way. This was driven both by an existing company culture of building AI ethically and by a desire for a level of guidance and review similar to that maintained for projects requiring IRB approval.


While researchers raised a few questions about the potential costs mentioned above, DeepMind leadership supported this initiative led by the Responsible Development and Innovation team. Leadership agreed that it was important to fill the gap left by the absence of explicit ethical guidance for projects requiring data enrichment workers, particularly given the increasing number of projects requiring enriched data. Additionally, as mentioned above, DeepMind’s Responsible Development and Innovation team had previously concluded that, in the absence of existing guidance, introducing the guidelines would actually save researchers’ time and lead to higher-quality data. This was underscored in a conversation the implementation team had with one senior researcher, who reflected that rather than having to individually advise research teams trying to set up data enrichment projects, they could now simply point them to the internal site housing standardized resources that had been created through a rigorous process. Additionally, while many of the guidelines focus on the impact on data enrichment workers, many are also about making sure that researchers and workers are aligned on how tasks need to be done. When data enrichment projects are designed with this alignment in mind, researchers are less likely to need to repeat them.

Continued Work for DeepMind


This collaboration acted as an important launch point for DeepMind to further invest in internal infrastructure to scale their data enrichment operations responsibly.

As the industry builds more complex AI models requiring enriched datasets and begins to scale up its reliance on data enrichment workers, data enrichment workflows and the nature of this work will continue to change. As a result, it is important to recognize that our understanding of data enrichment work might evolve, and we will need to continually analyze the impact of changing data supply chains on workers. This DeepMind and PAI collaboration represents DeepMind’s starting point in formalizing and consciously incorporating worker-oriented considerations into the company’s data enrichment practices. Given the lack of regulation or industry-wide standards guiding how these workers should be treated, this is an important step. However, DeepMind acknowledges that additional work is needed to continuously improve conditions for data enrichment workers. While PAI will explore ecosystem-wide changes that would help workers with a future Data Supply Lines Roadmap, there are also impactful ways for DeepMind to build on the guidelines and review process they have introduced.

First, as new data enrichment use cases emerge, the resources developed during this collaboration should be adapted to make sure they provide additional guidance to researchers as needed. While the guidance was designed to be general enough that it would broadly apply to the various research teams across the organization, new use cases may require the guidelines to be adjusted in the future.

Second, the process of finalizing the data enrichment guidelines for DeepMind revealed a few obstacles that make it difficult to fully adhere to the guidelines in some contexts. Internally, DeepMind will continue to invest in infrastructure and resources that will make it easier to adhere to these guidelines and close some of the gaps identified during this collaboration. Outlining these guidelines provided the necessary impetus for organization-level investment in creating this infrastructure. This may also need to be supplemented by working directly with vendors and platforms to make it easier for researchers to consistently uphold the guidelines.


For example, one recommendation that we pursued during this engagement was creating regular communication channels between workers and researchers, as well as re-engaging the same set of workers for similar projects. However, we found that following this recommendation was not as straightforward as anticipated: some platforms permit only limited functionality for communicating with workers, and privacy restrictions make it difficult to re-engage the same workers. Effectively adhering to this guideline would, among other changes, require building tools and infrastructure that allow researchers to easily communicate and re-engage with workers.

This collaboration has helped DeepMind build internal support to invest in infrastructure that would make regular communication with workers more feasible. That said, while platforms are the best fit for some of DeepMind’s use cases, DeepMind may benefit from considering managed services for use cases that require more regular communication or re-engagement with workers.

Limitations of Case Study Applicability


While some AI organizations may have less infrastructure in place to make similar changes, the resources shared in this case study are designed to make it easier for organizations of any size to responsibly create enriched datasets.

Because DeepMind has a strong research practice and already adheres to IRB review for studies involving human subjects, it may have been less of a lift for them to implement a parallel review process for projects involving data enrichment workers. Less research-focused organizations may have less infrastructure in place to support this kind of review. Even so, despite having different organizational structures, more commercially oriented organizations are procuring data and involving data enrichment workers in similar ways, and they too should be thinking about the impact on workers. One of the key motivating factors for DeepMind to invest in this process was that it saved researchers time and lessened the burden on individual researchers to deal with these questions on their own.

The other major limitation is that it is difficult to immediately assess the ultimate impact on workers. In the absence of this information, all of the recommendations are backed by research and multistakeholder input, and the intention of this effort was to implement those recommendations in practice to evaluate their feasibility. Additionally, as stated earlier, we recognize that we are still at an early stage in our collective understanding of how to transform data supply chains in the AI industry so that they work better for workers. We hope this case study will help us have further conversations about the additional work and guidance needed to improve conditions for these workers. Being able to put theoretical recommendations to the test has helped us identify additional levers of change that we plan to explore as we continue to strive toward improved conditions for workers. At the same time, we hope to get feedback from workers on this effort and on PAI’s future Data Supply Lines Roadmap.