Documenting the Impacts of Foundation Models

A Progress Report on Post-Deployment Governance Practices

Albert Tanjaya, Jacob Pratt

Web pages related to ChatGPT were viewed over 3.7 billion times in January 2025, making it the 13th most viewed domain on the internet. Meanwhile, billions of people use products powered by foundation models, such as Gemini in Google Search and Meta AI in Meta products. However, we are just beginning to understand the use and societal impact of these models after they have been deployed.

What are people using these systems for? How are they helping people do things better? What are the most common or severe harms they cause?

This report, PAI’s inaugural Progress Report, explores the field’s progress in answering these questions about the impacts of foundation models. It provides key insights for different audiences, including:

Policymakers
  • analysis of the current adoption of four documentation practices
  • recommended actions to take over the next 6 – 24 months
Model providers, model deployers, and other actors in the value chain
  • benefits of adopting four documentation practices
  • current challenges to adopting these practices
  • recommended actions to take to improve foundation model governance
  • over 70 examples of how organizations collect and document information on the impacts of foundation models
Academic or civil society researchers or interested users of AI models and systems
  • open questions that merit further exploration
  • information on benefits and challenges to support future research
  • over 70 examples of how organizations collect and document information on the impacts of foundation models

Key Findings

Developed with input from over 30 stakeholders across academia, civil society, and industry, the report builds on PAI’s Guidance for Safe Foundation Model Deployment and Policy Alignment on AI Transparency work and presents the following key findings.

Practices

The report highlights four key practices for documenting the impacts of foundation models.

Practice 1

Share usage information

E.g.

  • Activity data (input data; output data)
  • Usage by geography, sector, and use case
  • Total chat time usage
  • Information on downstream applications
Practice 2

Enable and share research on post-deployment societal impact indicators

E.g.

  • Labor impact indicators
  • Environmental impact indicators
  • Synthetic content impact indicators
Practice 3

Report incidents and disclose policy violations

E.g.

  • Safety incidents
  • Violations of terms of use and policies
  • Mitigation and remediation actions
Practice 4

Share user feedback

E.g.

  • How users submit feedback
  • Feedback received and type
  • How the provider utilizes feedback

Challenges & Recommendations

The main challenges for collecting and sharing information on post-deployment impacts can be grouped into five themes. To overcome these challenges and realize the benefits of effective post-deployment governance, we recommend that stakeholders take the following actions to move the field forward.

Challenge 1

Lack of standardization and established norms

Recommendation 1

Define norms for the documentation of post-deployment impacts through multistakeholder processes, which may be formalized through technical standards

Challenge 2

Data sharing and coordination barriers

Recommendation 2

Explore mechanisms for responsible data sharing

Challenge 3

Misaligned incentives

Recommendation 3

Policymakers should explore where guidance and rules on documenting post-deployment impacts are needed

Challenge 4

Limited data sharing infrastructure

Recommendation 4

Policymakers should develop blueprints for national post-deployment monitoring functions

Challenge 5

Decentralized nature of open model deployment

Recommendation 5

Conduct research into methods for collecting information on open model impacts

Looking forward

PAI will continue to improve collective understanding of the field and drive accountability through the development of future progress reports. If you would like to know more, please contact policy@partnershiponai.org.

Acknowledgments

This report reflects the collaborative efforts of our multistakeholder community. We are grateful to our working group members who contributed their expertise and dedicated their time to shape this work. We also thank the experts who provided critical insights through our initial questionnaire and Policy Forum workshop. This work was developed with guidance from the Policy Steering Committee. For a complete list of contributors, please see the full report.

Eyes Off My Data

Exploring Differentially Private Federated Statistics To Support Algorithmic Bias Assessments Across Demographic Groups

PAI Staff

Executive Summary

Designing and deploying algorithmic systems that work as expected every time, for all people and situations, remains both a challenge and a priority. Rigorous pre- and post-deployment fairness assessments are necessary to surface potential bias in algorithmic systems. Post-deployment fairness assessments, which observe whether an algorithm is operating in ways that disadvantage any specific group of people, can pose additional challenges to organizations because they often involve collecting new user data, including sensitive demographic data. The collection and use of demographic data is difficult for organizations because it is entwined with highly contested social, regulatory, privacy, and economic considerations. Over the past several years, Partnership on AI (PAI) has investigated, through its demographic data workstream, key risks and harms individuals and communities face when companies collect and use demographic data. In addition to well-known data privacy and security risks, such harms can stem from having one’s social identity miscategorized or one’s data used beyond data subjects’ expectations. These risks and harms are particularly acute for socially marginalized groups, such as people of color, women, and LGBTQIA+ people.

Given these risks and concerns, organizations developing digital technology are invested in the responsible collection and use of demographic data to identify and address algorithmic bias. For example, in an effort to deploy algorithmically driven features responsibly, Apple introduced IDs in Apple Wallet with mechanisms in place to help Apple and their partner issuing state authorities (e.g., departments of motor vehicles) identify any potential biases users may experience when adding their IDs to their iPhones. (At the time of writing, IDs in Wallet, offered in partnership with state identification-issuing authorities, were only available in select US states.)

In addition to pre-deployment algorithmic fairness testing, Apple followed a post-deployment assessment strategy as well. As part of IDs in Wallet, Apple applied differentially private federated statistics as a way to protect users’ data, including their demographic data. The main benefit of using differentially private federated statistics is the preservation of data privacy: it combines differential privacy (e.g., adding statistical noise to data to prevent re-identification) with federated statistics (e.g., analyzing user data on individual devices, rather than on a central server, to avoid creating and transferring datasets that can be hacked or otherwise misused). What is less clear is whether differentially private federated statistics can attend to some of the other risks and harms associated with the collection and analysis of demographic data. Answering that question requires a sociotechnical lens, one that considers the potential social impact of applying a technical approach.
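To make this combination concrete, the following minimal Python sketch illustrates the general idea under a local model of differential privacy: each device perturbs its own report with Laplace noise before anything leaves the device, and the server only ever aggregates the noisy values. The scenario, the epsilon value, and the function names are hypothetical and chosen purely for illustration; this is a sketch of the technique in general, not a description of Apple’s implementation.

    # Illustrative sketch of differentially private federated statistics
    # (local model): raw observations stay on the device; only noisy
    # values are ever shared with the server.
    import numpy as np

    rng = np.random.default_rng(seed=0)

    def local_report(saw_failure: bool, epsilon: float) -> float:
        # On-device step: add Laplace noise scaled to sensitivity / epsilon.
        # Sensitivity is 1 because one user changes the count by at most 1.
        return float(saw_failure) + rng.laplace(loc=0.0, scale=1.0 / epsilon)

    def server_estimate(noisy_reports: list[float]) -> float:
        # Server step: average the noisy contributions. The noise averages
        # out across many devices, but no individual's true value can be
        # read off from any single report.
        return float(np.mean(noisy_reports))

    # Simulate 10,000 opted-in devices where 3% of one demographic group
    # hits a failure when adding an ID to the wallet (made-up numbers).
    true_failures = rng.random(10_000) < 0.03
    reports = [local_report(bool(x), epsilon=1.0) for x in true_failures]
    print(f"estimated failure rate: {server_estimate(reports):.3f}")

A production system would typically layer further safeguards on top, such as restricting which queries may run on devices and how results are aggregated; the sketch above shows only the core noise-before-sharing mechanism.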

This report is the result of two expert convenings independently organized and hosted by PAI. As a partner organization of PAI, Apple shared details about the use of differentially private federated statistics as part of their post-deployment algorithmic bias assessment for the release of this new feature.

During the convenings, responsible AI, algorithmic fairness, and social inequality experts discussed how algorithmic fairness assessments can be strengthened, challenged, or otherwise unaffected by the use of differentially private federated statistics. While the IDs in Wallet use case is limited to the US context, the participants expanded the scope of their discussion to consider differentially private federated statistics in other contexts. Recognizing that data privacy and security are not the only concerns people have regarding the collection and use of their demographic data, participants were directed to consider whether differentially private federated statistics could also be leveraged to address some of the other social risks that can arise, particularly for marginalized demographic groups.

The multi-disciplinary participant group repeatedly emphasized the importance of having both pre- and post-deployment algorithmic fairness assessments throughout the development and deployment of an AI-driven system or product/feature. Post-deployment assessments are especially important as they enable organizations to monitor algorithmic systems once deployed in real-life social, political, and economic contexts. They also recognized the importance of thoughtfully collecting key demographic data in order to help identify group-level algorithmic harms.

The expert participants, however, clearly stated that a secure and privacy-preserving way of collecting and analyzing sensitive user data is, on its own, insufficient to address the risks and harms of algorithmic bias; nor is such a technique, by itself, sufficient to address the risks and harms of collecting demographic data. Instead, the convening participants identified key choice points facing AI-developing organizations to ensure the use of differentially private federated statistics contributes to overall alignment with responsible AI principles and ethical demographic data collection and use.

This report provides an overview of differentially private federated statistics and the choice points facing AI-developing organizations as they apply the technique within their overall algorithmic fairness assessment strategies. Recommendations for best practices are organized into two parts:

  1. General considerations that any AI-developing organization should factor into their post-deployment algorithmic fairness assessment
  2. Design choices specifically related to the use of differentially private federated statistics within a post-deployment algorithmic fairness strategy

The choice points identified by the expert participants emphasize the importance of carefully applying differentially private federated statistics in the context of algorithmic bias assessment. For example, several features of the technique can be configured in ways that reduce the efficacy of its privacy-preserving and security-enhancing aspects. Apple’s approach to using differentially private federated statistics aligned with some of the practices suggested during the expert convenings: limiting the data retention period to 90 days, allowing users to actively opt in to data sharing (rather than creating an opt-out model), clearly and simply disclosing what data users will be providing for the assessment, and maintaining organizational oversight of the query process and parameters.
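As a purely illustrative aid (the field names and defaults below are hypothetical and are not drawn from Apple’s system), such choice points can be captured as an explicit configuration object that is reviewed alongside the assessment plan, so that decisions about consent, retention, and oversight are made deliberately rather than incidentally:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class BiasAssessmentConfig:
        # Hypothetical choice points for a differentially private federated
        # bias assessment; the defaults shown are illustrative only.
        consent_model: str = "opt_in"        # users actively opt in, not opt out
        data_retention_days: int = 90        # delete collected reports after this window
        epsilon: float = 1.0                 # per-contribution privacy-loss budget
        queries_require_review: bool = True  # oversight of query parameters
                                             # before they run on devices
        disclosure_text: str = "plain-language notice of exactly what data is shared"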

The second set of recommendations surfaced by the expert participants primarily focuses on the resources (e.g., financial, time allocation, and staffing) necessary to achieve alignment and clarity on the kind of “fairness” and “equity” AI-developing organizations are seeking for their AI-driven tools and products/features. While these considerations may seem tangential, expert participants emphasized the importance of establishing a robust foundation on which differentially private federated statistics can be effectively utilized. Differentially private federated statistics, in and of itself, does not mitigate all the potential risks and harms related to collecting and analyzing sensitive demographic data. It can, however, strengthen overall algorithmic fairness assessment strategies by supporting better data privacy and security throughout the assessment process.

Table of Contents

  • Executive Summary
  • Introduction
    • The Challenges of Algorithmic Fairness Assessments
    • Prioritization of Data Privacy: An Incomplete Approach for Demographic Data Collection?
    • Premise of the Project
    • A Sociotechnical Framework for Assessing Demographic Data Collection
  • Differentially Private Federated Statistics
    • Differential Privacy
    • Federated Statistics
    • Differentially Private Federated Statistics
  • A Sociotechnical Examination of Differentially Private Federated Statistics as an Algorithmic Fairness Technique
    • General Considerations for Algorithmic Fairness Assessment Strategies
    • Design Considerations for Differentially Private Federated Statistics
  • Conclusion
  • Acknowledgments
  • Funding Disclosure
  • Appendices
    • Appendix 1: Fairness, Transparency and Accountability Program Area at Partnership on AI
    • Appendix 2: Case Study Details
    • Appendix 3: Multistakeholder Convenings
    • Appendix 4: Glossary
    • Appendix 5: Detailed Summary of Challenges and Risks Associated with Demographic Data Collection and Analysis
