Direct disclosure has limited impact on AI-generated Child Sexual Abuse Material

An analysis by researchers at Stanford HAI

How can disclosure support harm mitigation methods for AI-generated Child Sexual Abuse Material?

  • Child Sexual Abuse Material (CSAM) poses a unique challenge when it comes to mitigating harm from generative AI models – the harm is done as soon as the content is created, unlike other synthetic content categories, which cause harm only when shared.
  • However, both direct and indirect disclosure can still be helpful to a number of non-user audiences that seek to mitigate harm from this content, such as Trust and Safety teams, researchers, and law enforcement.
  • Although bad actors have little incentive to disclose AI-generated CSAM, Builders should still incorporate direct and indirect disclosure into their models in order to mitigate harm from such content.

This is a case submission by researchers Riana Pfefferkorn and Caroline Meinhardt of Stanford HAI as a supporter of PAI’s Synthetic Media Framework. Explore the other case studies.


Mitigating the risk of generative AI models creating Child Sexual Abuse Materials

An analysis by child safety nonprofit Thorn


Can generative AI models be built in a way that prevents creation of Child Sexual Abuse Materials (CSAM)?

  • Thorn identified how even generative AI models created by well-intentioned Builders, such as Stable Diffusion 1.5, can contain CSAM in their training data or be fine-tuned by bad actors to create CSAM.
  • They also highlight how the use of generative AI to create CSAM furthers harm beyond the creation of the content itself: it can impede victim identification, increase revictimization, and reduce barriers to harm.
  • Builders and hosting sites of generative AI models can help mitigate the risk of their tools creating CSAM by removing models trained on or capable of creating CSAM from their platforms, and by evaluating training data to ensure abuse material is not included.

This is Thorn’s case submission as a supporter of PAI’s Synthetic Media Framework. Explore the other case studies.


Policy Alignment on AI Transparency

Analyzing Interoperability of Documentation Requirements across Eight Frameworks


As governments and organizations worldwide race to develop policy frameworks for foundation models, we face a juncture that demands both swift action and careful coordination. Without that coordination, we risk creating an inconsistent patchwork of frameworks and divergent understandings of best practices.

Ensuring these frameworks work together is critical.

Partnership on AI’s Policy Alignment on AI Transparency presents a comparative analysis of eight leading policy frameworks for foundation models, with a particular focus on documentation requirements, a critical lever for achieving transparency and safety.

In this report, we analyze current and potential near-term interoperability challenges between the documentation requirements in leading policy frameworks, and offer recommendations that aim to promote interoperability and establish a common baseline of best practices for accountability and transparency. However, we recognize that this is only the beginning of a much larger conversation. Achieving global interoperability will require ongoing effort and substantial input, particularly from the Global Majority, to reflect diverse perspectives and priorities.

The full report provides a comprehensive exploration of interoperability challenges, including the nuances of our methodology, key findings, and detailed recommendations. This summary serves to highlight our most salient conclusions, aiming to inform and guide ongoing policy discussions and decision-making in this rapidly evolving field.

As we navigate the complex landscape of AI governance, the need for coordinated, interoperable policy frameworks becomes increasingly clear. By working together across borders and sectors, we can create a more coherent, effective approach to managing the risks and harnessing the potential of foundation models – ensuring accountability and transparency while fostering innovation in the global AI ecosystem.


Read the Report

Download the complete Policy Alignment on AI Transparency report (41 pages) as a PDF.


Frameworks Reviewed

Documentation and transparency play a key role in managing risk for foundation models and are a common feature of policy frameworks.

To explore how documentation guidance is being incorporated in current policy frameworks, we compared key frameworks from the US, EU, UK, and select multilateral initiatives (see Table 1). The table outlines each framework’s provisions for foundation model documentation, ranging from high-level transparency guidelines to specific artifact requirements. It also indicates ongoing development processes for each framework.

Table 1. Frameworks reviewed in this report

Multilateral
  • OECD AI Principles
  • Seoul Frontier AI Safety Commitments
  • Hiroshima AI Process Code of Conduct
  • Council of Europe AI Convention

Regional
  • EU AI Act

National
  • US AI Executive Order 14110
  • NIST AI RMF (with Gen-AI Profile)
  • UK AI White Paper and follow-up

The table assesses each framework on five dimensions:
  A: Contains high-level commitments to transparency
  B: Requires/recommends documentation practices
  C: Requires documentation artifacts
  D: Further/more detailed provisions proposed or in development
  E: Specifically addresses foundation models

Mapping Documentation Requirements Across the Frameworks

Documentation guidance and requirements under the reviewed frameworks are summarized in Table 2A below. Key findings include:

  • Documentation is a common feature of the frameworks, though it is couched in varying terms. Several of the high-level frameworks recommend providing certain kinds of information to various actors; some require recording and/or reporting of information; and some require the preparation of specific documentation artifacts.
  • The most commonly referenced artifacts are technical documentation, instructions for use, information about datasets, and incident reports (see the illustrative sketch after this list). However, most of the frameworks give little detail about what each of these documents should include or what form it should take.
  • This analysis suggests that there is an opportunity to develop standardized requirements for some of the key documentation artifacts required across frameworks – provided that agreement can be reached about what the content of these artifacts should be.
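
To make the idea of standardized artifacts concrete, the sketch below shows one way the four commonly referenced artifact types could be expressed as a shared, machine-readable structure. This is purely illustrative: it is written in Python, every type and field name is hypothetical, and none of the reviewed frameworks prescribes this (or any) schema.

    from dataclasses import dataclass, field

    # Hypothetical schema covering the four artifact types most commonly
    # referenced across the reviewed frameworks. Field names are invented
    # for illustration; no framework defines these structures.

    @dataclass
    class TechnicalDocumentation:
        training_description: str        # how the model was built and trained
        testing_and_evaluations: str     # methods and results of evaluations

    @dataclass
    class InstructionsForUse:
        intended_uses: list[str]
        known_limitations: list[str]

    @dataclass
    class DatasetInformation:
        sources: list[str]               # e.g., a public summary of training data
        curation_policies: str

    @dataclass
    class IncidentReport:
        description: str
        severity: str
        mitigation_steps: list[str] = field(default_factory=list)

Agreement on even a minimal shared structure like this would let documentation produced under one framework be read, compared, and audited under another.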

Tables 2A, 2B, and 2C compare documentation requirements across the in-scope frameworks. Specific documentation artifacts are shown in red in the original report. The principal documentation guidelines from PAI’s Model Deployment Guidance are included as a comparator.

Table 2A. Comparison of documentation requirements across in-scope frameworks

[Overview matrix: rows are stages in the AI lifecycle (R&D; pre-deployment/on deployment; post-deployment; across lifecycle; unspecified); columns are the frameworks (PAI Model Deployment Guidance; EU AI Act; AI Executive Order; NIST RMF and Generative AI Companion; Hiroshima Code of Conduct; Seoul Frontier AI Commitments; COE Convention; OECD AI Principles; UK AI White Paper, AI Principles, Response). Cell shading in the original distinguishes documentation requirements for a specific stage in the AI lifecycle, specific documentation artifacts, and general documentation requirements; the cell-level detail is set out in Tables 2B and 2C.]
Table 2B. Comparison of documentation requirements across in-scope frameworks
Specific documentation artifacts are shown in red in the original report. The principal documentation guidelines from PAI’s Model Deployment Guidance are included as a comparator.

R&D
  • PAI Model Deployment Guidance: Pre-system card – planned testing, evaluation, and risk management procedures for foundation/frontier models prior to development, including the intended training data approach, responsible AI practices, the development team, and a written “safety case”
  • EU AI Act: Notify the EU Commission of models with systemic risk
  • AI Executive Order: Report dual-use models to the Department of Commerce; report cybersecurity protections
  • NIST RMF and Generative AI Companion: N/A
  • Hiroshima Code of Conduct: N/A

Pre-deployment/on deployment
  • PAI Model Deployment Guidance: Publicly report model impacts; “key ingredient list” including details of evaluations, limitations, risks, compute, parameters, architecture, training data approach, and model documentation; disclose performance benchmarks, intended use, risks and mitigations, testing and evaluation methodologies, and environmental and labor impacts; downstream use documentation including appropriate uses, limitations, mitigations, and safe development practices; share red-teaming findings
  • EU AI Act: Technical documentation, including information about training, testing, and evaluations; documentation for downstream developers, including information about capabilities and limitations and to aid downstream compliance; public summary of training data
  • AI Executive Order: Report red-teaming results to the Department of Commerce
  • NIST RMF and Generative AI Companion: Multiple guidelines for documentation, including of risks and potential impacts; knowledge limits; TEVV considerations and tools; measures of trustworthiness; residual risks after mitigations; model details; data curation policies; and environmental impacts
  • Hiroshima Code of Conduct: Technical documentation; transparency reports with “meaningful information”; instructions for use; documentation to include details of evaluations, capabilities/limitations regarding domains of use, risks to safety and society, and red-teaming results

Post-deployment
  • PAI Model Deployment Guidance: Incident reporting; transparency reporting (frontier model usage and policy violations)
  • EU AI Act: Serious incident reports
  • AI Executive Order: N/A
  • NIST RMF and Generative AI Companion: Incident and performance reporting; transparency reports with steps taken to update generative AI systems
  • Hiroshima Code of Conduct: Maintain “appropriate documentation” of reported incidents

Across lifecycle
  • PAI Model Deployment Guidance: Iteration of model development; provide documentation to government as required; environmental impacts; severe labor market risks; human rights impact assessments
  • EU AI Act: N/A
  • AI Executive Order: N/A
  • NIST RMF and Generative AI Companion: Multiple guidelines to document processes and management systems
  • Hiroshima Code of Conduct: “Work towards” information sharing and incident reporting, including on evaluation reports and safety and security risks, and “ensuring appropriate and relevant documentation and transparency across the AI lifecycle”; document datasets, processes, and decisions during development; regularly update technical documentation

Table 2C. Comparison of more general documentation and transparency requirements, at unspecified stages of the AI lifecycle

Seoul Frontier AI Commitments
  • Publicly report model or system capabilities, limitations, and domains of appropriate and inappropriate use
  • Provide public transparency on the implementation of commitments, including on risk assessments, the effectiveness of mitigations and evaluation results, risk thresholds, the approach to mitigations, and the processes to follow if risk thresholds are met or exceeded

COE Convention
  • Countries ratifying the convention must have frameworks (such as national laws) that contain documentation requirements allowing people to seek remedies for human rights violations
  • Those frameworks must also require developers to adopt measures to identify, prevent, and mitigate risk, including documentation of risks and mitigations

OECD AI Principles
  • Transparency and Explainability: “provide meaningful information” to “foster understanding of AI Systems”; “provide plain and easy-to-understand information on the sources of data/input, factors, processes and/or logic”; “provide information [to] enable those adversely affected by an AI system to challenge its output.”
  • Accountability: “ensure traceability, including in relation to datasets, processes and decisions made during the AI system lifecycle”

UK AI White Paper, AI Principles, Response
  • Provide transparency and accountability, including “documentation on key decisions throughout the AI system life cycle”

Other Features of the Frameworks Relevant to Interoperability

In reviewing the in-scope frameworks, a number of additional factors emerged, including their binding nature, enforcement mechanisms, scope of applicable models, overseeing institutions, development processes, and emphasis on collaboration, as detailed in Table 3.

Table 3. In-scope frameworks
Normative status, coverage/thresholds, reference to international standardization processes and collaboration/interoperability

PAI Model Deployment Guidance
  • Binding or voluntary? Voluntary
  • Coverage: Foundation models (with guidance tailored according to three capability bands and four release strategies); the most stringent guidance applies to “paradigm-shifting or frontier” models
  • Initial threshold: N/A
  • Institutions/oversight: N/A
  • Next steps: N/A
  • Cooperation/collaboration: Collaborate with cross-sector AI stakeholders on risk identification, methodologies, best practices, and standardization
  • Standards: Development and adoption of standards

EU AI Act
  • Binding or voluntary? Binding
  • Coverage: General-purpose AI models (baseline requirements); “general-purpose AI models with systemic risk”
  • Initial threshold: None for baseline requirements; 10^25 FLOPs for models “with systemic risk”
  • Institutions/oversight: AI Office
  • Next steps: Codes of Practice for GPAI due August 2025; templates for training data (AI Office); harmonized standards; delegated acts on thresholds for GPAI with systemic risk and on documentation requirements
  • Cooperation/collaboration: Mandates creation of an AI Board and Advisory Forum; multistakeholder participation in the development of Codes of Practice and harmonized standards
  • Standards: EU harmonized standards, though the EU is committed to adopting international standards where possible

AI Executive Order (EO 14110)
  • Binding or voluntary? Partly binding
  • Coverage: “Dual-use foundation models”
  • Initial threshold: 10^26 FLOPs (10^23 FLOPs for models trained on biological sequence data)
  • Institutions/oversight: Department of Commerce (for reporting requirements)
  • Next steps: Various, including OMB materials for federal procurement; copyright guidance; the Department of Commerce can change the threshold for dual-use model reporting
  • Cooperation/collaboration: Under the EO, NIST released a plan for global engagement on AI standards; the Secretary of State is developing a Global Development Playbook; the EO contained several consultation requirements
  • Standards: NIST is required to develop standards; under the EO, NIST has released a plan for global engagement on promoting and developing AI standards

NIST RMF and Generative AI Companion
  • Binding or voluntary? Voluntary
  • Coverage: AI systems (NIST AI RMF); generative foundation models (Gen-AI Profile)
  • Initial threshold: N/A
  • Institutions/oversight: N/A
  • Next steps: NIST and the NIST AISI have a broad work plan, including developing tools, evaluations, and metrics
  • Cooperation/collaboration: Several references to collaboration, e.g., with external researchers, industry experts, and community representatives about best risk measurement and management practices; NIST is committed to collaboration and cooperation, e.g., through the AISI Consortium and the pending Network of AISIs
  • Standards: Contains references to considering the relevance of standards (including NIST frameworks); NIST will continue to align the AI RMF with international standards

Hiroshima Code of Conduct
  • Binding or voluntary? Voluntary
  • Coverage: “The most advanced AI systems, including the most advanced foundation models and generative AI systems”
  • Initial threshold: N/A
  • Institutions/oversight: OECD (monitoring mechanism under development)
  • Next steps: Code of Conduct to be iterated by the G7 Hiroshima AI Process; OECD developing a monitoring mechanism
  • Cooperation/collaboration: Across sectors, including on research to assess and adopt risk mitigations, document incidents, and share information with the public to promote safety
  • Standards: Advance development and adoption of standards

UK AI White Paper, Consultation Response
  • Binding or voluntary? Voluntary (guidance for sectoral regulators)
  • Coverage: AI systems, generally under a sectoral approach; initial focus of the UK AISI on advanced systems; planned laws for “the most powerful AI systems”
  • Initial threshold: N/A
  • Institutions/oversight: AISI
  • Next steps: Intention to legislate announced regarding advanced models, and to place the AISI on a statutory footing
  • Cooperation/collaboration: Focus on collaboration across government, stakeholder groups, and internationally
  • Standards: Support for work on assurance techniques and technical standards

OECD AI Principles
  • Binding or voluntary? Voluntary
  • Coverage: AI systems
  • Initial threshold: N/A
  • Institutions/oversight: N/A
  • Next steps: OECD developing Due Diligence Guidance (DDG) for AI under the OECD Responsible Business Conduct (RBC) guidelines
  • Cooperation/collaboration: The OECD convenes the Network of Experts
  • Standards: Governments should promote standards development

Seoul Frontier AI Commitments
  • Binding or voluntary? Voluntary
  • Coverage: “Frontier AI” – “highly capable general-purpose AI models or systems that can perform a wide variety of tasks and match or exceed the capabilities present in the most advanced models”
  • Initial threshold: N/A
  • Institutions/oversight: N/A
  • Next steps: AI Action Summit, February 2025 (France); AI Safety Science Report to be published at the AI Action Summit
  • Cooperation/collaboration: Information sharing and collaboration on safety research (Seoul AI Principles)
  • Standards: Contribute to and take account of international standards
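
The two quantified thresholds above differ by an order of magnitude, which affects which obligations a given model triggers. The sketch below (Python; the function and example figures are hypothetical, and only the threshold values come from the frameworks) illustrates how the same training-compute figure can fall under one framework but not the other.

    # Threshold values from Table 3; everything else is illustrative.
    EU_SYSTEMIC_RISK_FLOPS = 1e25   # EU AI Act: GPAI "with systemic risk"
    US_EO_REPORTING_FLOPS = 1e26    # EO 14110: dual-use model reporting
    US_EO_BIO_FLOPS = 1e23          # EO 14110: models trained on biological sequence data

    def triggered_obligations(training_flops: float, bio_sequence_data: bool = False) -> list[str]:
        """Return which threshold-based obligations a model's training compute crosses."""
        obligations = []
        if training_flops >= EU_SYSTEMIC_RISK_FLOPS:
            obligations.append("EU AI Act: presumed GPAI with systemic risk")
        us_threshold = US_EO_BIO_FLOPS if bio_sequence_data else US_EO_REPORTING_FLOPS
        if training_flops >= us_threshold:
            obligations.append("US EO 14110: report to Department of Commerce")
        return obligations

    # A model trained with 3e25 FLOPs crosses the EU threshold but not the
    # tenfold-higher US reporting threshold -- one concrete way divergent
    # thresholds fragment coverage across jurisdictions.
    print(triggered_obligations(3e25))
    print(triggered_obligations(3e25, bio_sequence_data=True))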

Summary of Findings

  • Interoperability and collaboration are explicitly included as policy goals in a number of the international frameworks, though there is no current agreement on how those goals will be achieved.
  • The frameworks emphasize the importance of documentation for foundation models, but remain light on detail about the form and content it should take.
  • There are a number of steps that could be taken to advance interoperability now and in the future, leveraging existing and proposed forums, mechanisms and processes.
  • An early focus should be on agreeing upon thresholds for regulation, to provide international consistency about which foundation models are captured by regulatory and policy frameworks.
  • While there are some challenges to relying on international standardization processes to align AI policy frameworks, standards remain an important plank in that effort.
  • Harmonizing key documentation requirements across national, regional, and international foundation model policy frameworks – and in particular, harmonizing the form and content of documentation artifacts – should be made a priority.
  • The lack of consensus on the best approaches to manage AI risks is a significant challenge to developing interoperable frameworks, including for documentation.
  • The existing and newly announced AI Safety Institutes can establish a foundation for AI safety consensus through research, evaluations, scientific advancement, and collaborative development of safety standards and documentation practices.
  • Participation by civil society and the global community is needed in all major foundation model policy initiatives if we are to ensure that they lead to alignment around best practices, and that the agenda for global interoperability is not set by a comparatively small group of nations from the Global North.


Summary of Recommendations

  • National governments and the EU should prioritize cooperation in setting thresholds for identifying which foundation models require additional governance measures, including by supporting the OECD’s work on this issue. The AI Summit Series could also be used to take this forward. Agreeing on a common definition of, and thresholds for, the models covered by policy frameworks should flow through to greater alignment between the frameworks, including in relation to documentation requirements.
  • The G7 Presidency should continue developing the Hiroshima Code of Conduct into a more detailed framework that specifies thresholds, relevant risks, and the form and content of documentation artifacts. This work should be a focus of Canada’s G7 Presidency in 2025 and should align closely with the EU Code of Practice development timeline. In doing so, the G7 should seek input from foundation model providers, civil society, academia, and other stakeholder groups equally.
  • When creating and approving initial Codes of Practice for the EU AI Act, all involved parties should prioritize compatibility with other major AI governance frameworks where possible. The involvement of non-EU model providers, experts and civil society organizations will help advance this objective.
  • To support the development of standardized documentation artifacts, Standards Development Organizations should ensure that their processes are informed by socio-technical expertise and diverse perspectives as well as required resources. To that end, SDOs, industry, governments, and other bodies should invest in capacity building for civil society and academic stakeholders to engage in standards-making processes, including to ensure participation from the Global South.
  • The development of standardized documentation artifacts for foundation models should be a priority in AI standardization efforts. This would promote internationally comparable documentation requirements for foundation models – promoting interoperability and establishing a baseline for best practice internationally.
  • International collaboration and research initiatives should prioritize research that will support the development of standards for foundation model documentation artifacts. Documentation is a key feature of foundation model policy requirements, and common requirements for artifacts will directly improve interoperability. It will also make comparisons between models from different countries easier, promoting accountability and innovation.
  • National governments should continue to prioritize both international dialogue and collaboration on the science of AI safety, but with more specificity and with tracking of progress on commitments that will foster good practice. This work will inform a common understanding of what should be included in documentation artifacts to promote accountability and address foundation model risks.
  • National governments should support the creation and development of AI Safety Institutes (or equivalent bodies) and ensure they have the resources, functions, and powers necessary to fulfill their core tasks. Efforts should be made to align the functions of these bodies with those common among existing AISIs. This will promote efforts to develop trusted mechanisms to evaluate advanced foundation models and may, at a later stage, open the way to “institutional interoperability.”
  • The Network of AISIs (and bodies with equivalent or overlapping functions such as the EU AI Office) should be supported and efforts should be made to expand its membership. Consideration should be given to how the Network could support broader AI Safety research initiatives.

Background and Methodology

The work plan leading to this report was developed with guidance from PAI’s Policy Steering Committee. This report has been informed through desk research and consultations with experts from industry, civil society, academia, and non-profit organizations, drawn from PAI’s partner and wider stakeholder networks. We tested our initial thinking in a virtual multistakeholder workshop in August 2024. The views and recommendations in this report remain those of PAI.

For a comprehensive exploration of interoperability challenges, including our methodology, key findings, detailed recommendations, and more, please download the full report. To stay in touch with our latest policy work, sign up here.



How OpenAI is building disclosure into every DALL-E image


What’s the best way to inform people that an image is AI-generated?

  • OpenAI explored the use of an image classifier (a synthetic media detector) to provide disclosure for the synthetic content created with their generative AI tools and to prevent potential misuse.
  • OpenAI weighed the tradeoffs of rolling out an image classifier, including accessibility (open vs. closed), accuracy, and public perception of OpenAI as a leader in the synthetic media space. Learning from its earlier decision to take down a text classifier that was not meeting accuracy goals, OpenAI opted for a slow rollout of a more accurate image classifier.
  • The Framework provided OpenAI with guidance for Builders on how to responsibly disclose content created with DALL•E, including providing transparency to users about the classifier’s limitations, which OpenAI addressed through a phased rollout.

This is OpenAI’s case submission as a supporter of PAI’s Synthetic Media Framework. Explore the other case studies.


ABOUT ML Foundational Resource

Overview


ABOUT ML (Annotation and Benchmarking on Understanding and Transparency of Machine Learning Lifecycles) is a multi-year, multi-stakeholder initiative aimed at building transparency into the AI development process, industry-wide, through full lifecycle documentation. On this page, you will find the collected outputs of ABOUT ML: a library of resources designed to help organizations and individuals begin implementing transparency at scale. To further increase the usability of these resources, recommended reading plans for different readers are provided below.

Learn more about the origins of ABOUT ML and contributors to the project here.

Recommended Reading Plans

At the foundation of these resources lies the newly revised ABOUT ML Reference Document, which both identifies transparency goals and offers suggestions on how they might be achieved. Using principles provided by the Reference Document and insights about implementation gathered through our research, PAI plans to release additional ML documentation guides, templates, recommendations, and other artifacts. These future artifacts will also be available on this page.

Read the full ABOUT ML Reference Document


Recommended Reading Plans for…


ML System Developers/Deployers

ML system developers/deployers are encouraged to do a deep dive into Section 3: Preliminary Synthesized Documentation Suggestions and use it to highlight gaps in their current understanding of both data- and model-related documentation and planning needs. This group will benefit most from further participation in the ABOUT ML effort: engaging with the community in the forthcoming online forum and testing the efficacy and applicability of the templates and specifications to be published in the PLAYBOOK and PILOTS, which will be developed from use cases as an opportunity to implement ML documentation processes within an organization.


ML System Procurers

ML system procurers might explore Section 2.2: Documentation to Operationalize AI Ethics Goals to get ideas about what concepts to include as requirements for models and data in future requests for proposals relevant to ML systems. Additionally, they could use Section 2.3: Research Themes on Documentation for Transparency to shape conversations with the business owners and requirements writers to further elicit detailed key performance indicators and measures for success for any procured ML systems.


Users of ML System APIs and/or Experienced End Users of ML Systems

Users of ML system APIs and/or experienced end users of ML systems might skim the document and review all of the coral Quick Guides to get a better understanding of how ML concepts are relevant to many of the tools they regularly use. A review of Section 2.1: Demand for Transparency and AI Ethics in ML Systems will provide insight into conditions where it is appropriate to use ML systems. This section also explains how transparency is a foundation for both internal accountability among the developers, deployers, and API users of an ML system and external accountability to customers, impacted non-users, civil society organizations, and policymakers.


Internal Compliance Teams

Internal compliance teams are encouraged to explore Section 4: Current Challenges of Implementing Documentation and use it to shape conversations with developer/deployment teams to find ways to measure compliance throughout the Machine Learning Lifecycle (MLLC).


External Auditors

External auditors could skim the Appendix: Compiled List of Documentation Questions to familiarize themselves with the high-level concepts and tactically operationalized tenets to look for when determining whether an ML system is well-documented.


Lay Users of ML Systems and/or Members of Low-Income Communities

Lay users of ML systems and/or members of low-income communities might skim the document and review all of the blue “How We Define” boxes to get an overarching understanding of the text’s contents. These users are encouraged to continue learning ABOUT ML systems by exploring how such systems might impact their everyday lives. Additional insights can be gathered from the Glossary section of the ABOUT ML Reference Document.

Managing the Risks of AI Research: Six Recommendations for Responsible Publication

PAI Staff

Once a niche research interest, artificial intelligence (AI) has quickly become a pervasive aspect of society with increasing influence over our lives. In turn, open questions about this technology have, in recent years, transformed into urgent ethical considerations. The Partnership on AI’s (PAI) new white paper, “Managing the Risks of AI Research: Six Recommendations for Responsible Publication,” addresses one such question: Given AI’s potential for misuse, how can AI research be disseminated responsibly?

Many research communities, such as biosecurity and cybersecurity, routinely work with information that could be used to cause harm, either maliciously or accidentally. These fields have thus established their own norms and procedures for publishing high-risk research. Thanks to breakthrough advances, AI technology has progressed rapidly in the past decade, giving the AI community less time to develop similar practices.

Recent pilots, such as OpenAI’s “staged release” of GPT-2 and the “broader impact statement” requirement at the 2020 NeurIPS conference, demonstrate a growing interest in responsible AI publication norms. Effectively anticipating and mitigating the potential negative impacts of AI research, however, will require a community-wide effort. As a first step towards developing responsible publication practices, this white paper provides recommendations for three key groups in the AI research ecosystem:

  • Individual researchers, who should disclose and report additional information in their papers and normalize discussion about the downstream consequences of research.
  • Research leadership, which should review potential downstream consequences earlier in the research pipeline and commend researchers who identify negative downstream consequences.
  • Conferences and journals, which should expand peer review criteria to include engagement with potential downstream consequences and establish separate review processes to evaluate papers based on risk and downstream consequences.

Additionally, this white paper includes an appendix which seeks to disambiguate a variety of terms related to responsible research which are often conflated: “research integrity,” “research ethics,” “research culture,” “downstream consequences,” and “broader impacts.”

This document is intended as a basis for further discussion, and we seek feedback on it to inform future iterations of its recommendations. Our aim is to help build the field’s capacity to anticipate downstream consequences and mitigate potential risks.

To read “Managing the Risks of AI Research: Six Recommendations for Responsible Publication” in full, click here.