As machine learning becomes central to many decision-making processes — including high-stakes decisions in criminal justice, healthcare, and banking — organizations using ML systems to aid or automate decisions face increased pressure for transparency on how these decisions are made. In a 2019 Harvard Business Review article, Eric Colson states that routine decisions based on structured data are best handled by artificial intelligence as AI is “less prone to human’s cognitive bias.” However, the author goes on to warn, developers and deployers of AI, specifically ML systems, should consider the inherent “risk of using biased data that may cause AI to find specious relationships that are unfair.” Annotation and Benchmarking on Understanding and Transparency of Machine learning Lifecycles (ABOUT ML) is a project of the Partnership on AI (PAI) working towards establishing new norms on transparency by identifying best practices for documenting and characterizing key components and phases throughout the ML system lifecycle from design to deployment, including annotations of data, algorithms, performance, and maintenance requirements.

Transparency

For this document we adopt a meaning for transparency that includes any “efforts to increase explainability, interpretability, or other acts of communication and disclosure.”

Presently, there is no consensus on which documentation practices work best, on what information needs to be disclosed, or on which goals that disclosure should serve. Moreover, the definition of transparency itself is highly contextual. Because there is currently no standardized process across the industry, each team that wants to improve transparency in an ML system must address the entire suite of questions about what transparency means for their team, product, and organization within the context of their specific goals and constraints. Our goal is to provide a starting point for that process of exploration. We will offer a summary of recommendations and practices that is mindful of the variance in transparency expectations and outcomes. We hope to provide an adaptive resource that highlights common themes about transparency, rather than a rigid list of requirements. This should help guide teams in identifying and addressing context-specific challenges.

While substantial decentralized experimentation is currently taking place, ABOUT ML aims to accelerate progress by pooling insights more quickly, sharing resources, and reducing redundancy among highly similar efforts. In doing this together, the community can improve the quality, reproducibility, rigor, and consistency of these efforts by gathering evaluation data for a variety of proposals. The Partnership on AI (PAI) aims to provide a gathering place for researchers, AI practitioners, civil society organizations, and especially those affected by AI products to discuss, debate, and ultimately decide on broadly applicable recommendations.

ABOUT ML

The ABOUT ML Initiative was presented at the Human-Centric Machine Learning Workshop at the Neural Information Processing Systems Conference in 2019.

In this work, Deb Raji and Jingying Yang note that “transparency through documentation is a promising practical intervention that can integrate into existing workflows to provide clarity in decision making for users, external auditors, procurement departments, and other stakeholders alike.”

ABOUT ML seeks to bring together representatives from a wide range of relevant stakeholder groups to improve public discussion and promulgate best practices into new industry norms that will reflect diverse interests and chart a path forward for greater transparency in ML. We encourage any organization undertaking transparency initiatives to share its practices and lessons learned with PAI for incorporation into future versions of this document and/or artifacts in the forthcoming PLAYBOOK.

This is an ongoing project with regular evaluation points to keep up with the rapidly evolving field of AI. PAI’s broad range of partner organizations, including corporate developers of AI, civil society organizations, and academic institutions, will be involved in the drafting and vetting of documentation themes recommended in this document. In addition, PAI engaged with the Tech Policy Lab at the University of Washington to run a Diverse Voices panel to gather opinions from stakeholders whose perspectives might not otherwise be captured. Through this process, PAI has gained deeper insights into the Diverse Voices process in order to inform the ABOUT ML recommendations on how to incorporate the perspectives of diverse stakeholders.

We began by highlighting recurrent themes in ML research about documentation, but our ambitious aim is to identify all practices that have sufficient positive data of efficacy to be deemed best practices in ML transparency. Alongside the design of ABOUT ML PILOTS, PAI has welcomed a public discussion of how much data is sufficient for a practice to be deemed a best practice. Now that the input from the Diverse Voices process has been incorporated into the current version of the document, PAI aims to continue investigating and refining best practices so they can be disseminated broadly as new norms that improve transparency in the AI industry. We will also continue to highlight practices that are promising but insufficiently well supported and are especially deserving of further study.

Companies can showcase and implement their commitment to responsible AI by adopting the tenets set forth in this Version 1 (v1) reference document and any forthcoming components of the PLAYBOOK. This work is meant to empower that intention with scientifically supported recommendations and artifacts to support the “actioning” of transparency and accountability. As noted in Section 2.2: Documentation to Operationalize AI Ethics Goals, documentation provides important benefits even in contexts where full external sharing is not possible.

The ABOUT ML effort aims to encourage organizations to invest in and build the internal processes and infrastructure needed to implement and scale the creation of documentation artifacts. Internal documentation (for other teams inside the same organization, with more details) and external documentation (for broader consumption, with fewer sensitive details) are both valuable and should be undertaken together as they provide complementary incentives and benefits. Organizations will benefit from the alignment of internal and external incentives with the incentives behind proper documentation.

Section | Latest version | Public Comment | Steering Committee | Diverse Voices
1. Project Overview | v1.0 | | |
2. Literature review | v1.0 | | |
3. Preliminary Synthesized Documentation Suggestions | v1.0 | | |
4. Challenges | v1.0 | | |
5. Promising interventions to try | v1.0 | | |
6. ML primer | v1.0 | | |
7. Appendix | v1.0 | | |

1.1.4.1 Audiences for the ABOUT ML resources

The primary audiences for the ABOUT ML resources vary by stage of the plan laid out in Section 1.1.2 ABOUT ML Goals and Plan. Below is a summary of these key audiences and why each plays a key role in its subgoal.

Sequence | ABOUT ML Subgoal | Key Audience for ABOUT ML Resource | Theory of Change
1 | Enable internal accountability | Individual champions at all levels and roles inside organizations that build ML systems who are interested in implementing ABOUT ML recommendations | Motivate resource investment in building internal processes and tooling to enable implementing ABOUT ML’s documentation recommendations
2 | Enable external accountability | Groups with the most influence over external accountability for organizations that build ML systems, including advocacy organizations, government agencies, and policy and compliance teams inside organizations | Once internal processes and tooling exist to enable implementing documentation, builders of ML technology will be ready to enter and act on a detailed conversation with other stakeholders on what the contents of the documentation need to be to enable external accountability
3 | Standardize documentation across industry based on high adoption of practice | Organizations that build ML systems | With enough data and iteration from organizations that implement the documentation for external accountability, this community can decide what set of questions make sense as an initial industry norm, which can still evolve over time

Beyond the refinement of this v1 document, any additional templates or resources developed as a part of the ABOUT ML effort should be shared with and reviewed by various stakeholder groups. Here is an overlapping, non-comprehensive list of stakeholders that should be particularly consulted while putting together ABOUT ML resources and why their input is valued for the ABOUT ML project. These are stakeholders who may not otherwise use the ABOUT ML resources nor read the documentation artifacts:

  • People impacted by ML technology because their priorities, desires, and concerns should be acknowledged in the ABOUT ML resources and reflected in the documentation artifacts
  • People in roles that would potentially implement ABOUT ML recommendations (e.g., product, engineering, data science, analytics, and related departments in industry; researchers who collect datasets and build models in academia and other nonprofits) because ABOUT ML needs to practically fit into their workflow
  • People in roles that have the power, headcount, and/or budget to sign off on implementing ABOUT ML because they need to buy in to the recommendations
  • People in roles that have auditing rights or power over ML technologies (e.g., government agencies and civil society organizations like advocacy organizations) because they could use ABOUT ML’s artifacts to audit technologies and the artifacts need to be usable for that purpose

Additionally, all audiences for the ABOUT ML resources and audiences for the artifacts should also be consulted.