ABOUT ML Reference Document

Last Updated

1.1.4 Who Is This Project For?

Many sets of stakeholders should be considered and incorporated into the ABOUT ML project at various stages so that the project benefits as many people as possible.

  • Stakeholders who should be consulted while putting together ABOUT ML resources, with a particular focus on people impacted by ML technology who may otherwise not be given a say in how that technology is built. This includes feedback from panels hosted in late 2019 by Diverse Voices, which represented:
    • Lay users of expert systems
    • Members of low-income communities
    • Procurement decision makers
  • Audiences for ABOUT ML documentation artifacts, which include:
    • The ABOUT ML Reference Document: This document, which serves as an evolving resource and reference for the ABOUT ML foundational work.
    • PLAYBOOK: A repository of artifacts, specifications, guides, and templates developed and/or recommended by the ABOUT ML effort and based on foundational tenets introduced in the Reference Document.
    • PILOTS: Late-2021 implementation of use cases developed from several artifacts in the PLAYBOOK and shared with PAI Partners in an effort to acquire feedback on the use of recommended ML documentation templates.

To bring focus and prioritization to this large undertaking, PAI, in consultation with the Steering Committee, has set out an initial plan for how to sequence efforts for each of the above audience sets. To identify communities and groups within each set of stakeholders and audiences, it is important to detail what goals each meta-category of stakeholders might have for engaging with ABOUT ML, so each section below begins with a discussion of possible goals. We welcome feedback on this plan.

Audiences for the ABOUT ML Resources

The primary audiences for the ABOUT ML resources vary by stage of the plan laid out in Section 1.1.2 ABOUT ML Goals and Plan. Below is a summary of these key audiences and why each plays a key role in its subgoal.

  1. Enable internal accountability
    • Key audience for ABOUT ML resources: Individual champions at all levels and roles inside organizations that build ML systems who are interested in implementing ABOUT ML recommendations
    • Theory of change: Motivate resource investment in building internal processes and tooling to enable implementing ABOUT ML’s documentation recommendations
  2. Enable external accountability
    • Key audience for ABOUT ML resources: Groups with the most influence over external accountability for organizations that build ML systems, including advocacy organizations, government agencies, and policy and compliance teams inside organizations
    • Theory of change: Once internal processes and tooling exist to enable implementing documentation, builders of ML technology will be ready to enter and act on a detailed conversation with other stakeholders about what the contents of the documentation need to be to enable external accountability
  3. Standardize documentation across industry based on high adoption of practice
    • Key audience for ABOUT ML resources: Organizations that build ML systems
    • Theory of change: With enough data and iteration from organizations that implement the documentation for external accountability, this community can decide what set of questions makes sense as an initial industry norm, which can still evolve over time

Stakeholders That Should Be Consulted While Putting Together ABOUT ML Resources

Beyond the refinement of this ABOUT ML Reference Document, any additional templates or resources developed as part of the ABOUT ML effort should be shared with and reviewed by various stakeholder groups. Here is an overlapping, non-comprehensive list of stakeholders who should be particularly consulted while putting together ABOUT ML resources and why their input is valued for the ABOUT ML project. These are stakeholders who may not otherwise use the ABOUT ML resources or read the documentation artifacts:

  • People impacted by ML technology because their priorities, desires, and concerns should be acknowledged in the ABOUT ML resources and reflected in the documentation artifacts
  • People in roles that would potentially implement ABOUT ML recommendations (e.g., product, engineering, data science, analytics, and related departments in industry; researchers who collect datasets and build models in academia and other nonprofits) because ABOUT ML needs to practically fit into their workflow
  • People in roles that have the power, headcount, and/or budget to sign off on implementing ABOUT ML because they need to buy into the recommendations
  • People in roles that have auditing rights or power over ML technologies (e.g., government agencies and civil society organizations like advocacy organizations) because they could use ABOUT ML’s artifacts to audit technologies and the artifacts need to be usable for that purpose

Additionally, all audiences for the ABOUT ML resources and for the documentation artifacts should be consulted.

Audiences for ABOUT ML Documentation Artifacts

The audiences most likely to use ABOUT ML documentation artifacts are people for whom the documentation would fit directly and naturally into their workflow. This includes people directly involved in either the building or purchasing of ML systems or people who have another strong reason to examine ML systems, including end users, compliance departments, or external auditors. They fall into the following categories:

  • ML system developers/deployers
  • ML system procurers
  • Users of ML system APIs
  • End users
  • Internal compliance teams
  • External auditors
  • Marketing groups

Other people who have a stake in reading the ABOUT ML documentation artifacts, but who are not necessarily as likely to know that such documentation could exist, are non-users who are impacted by ML systems (for example, people assigned credit scores by an ML model) and people advocating on behalf of these impacted non-users, such as civil society organizations. It is important to make ABOUT ML documentation artifacts accessible to these people as well, especially given that they may have less direct access, knowledge, and influence over the ML systems than the groups named above.

Whose Voices Are Currently Reflected in ABOUT ML?

The current releases as of mid-2021 reflect the work and input of the following groups:

  • PAI editors (Alice Xiang, Deb Raji, Jingying Yang, Christine Custis)
  • Authors of the Datasheets, Model Cards, Factsheets, and Data Statements papers
  • Interested people from PAI’s Partner community during an internal review process
  • People who submitted comments during the public comment process
  • ABOUT ML Steering Committee
  • Diverse Voices panels consisting of experiential experts from the following communities:
    • Lay users of expert systems: The Diverse Voices process of the Tech Policy Lab within the University of Washington defined lay users of ML systems as anyone who uses or might use ML systems as part of their work (such as rideshare drivers) but who does not have expertise in the technical engineering of ML systems. In this panel, one panelist was a rideshare driver, one was a medical student, and one was an administrative office worker. All panelists were currently using ML systems or expected to use them in the near future.
    • Members of low-income communities: The Diverse Voices process defined members of low-income communities as anyone whose household income is less than twice the federal poverty threshold. In this panel, five panelists identified themselves as members of low-income communities and two panelists served the low-income community in a professional capacity (e.g., employment counselor, property manager for a low-income apartment building).
    • Procurement decision makers: The Diverse Voices process defined procurement decision makers as anyone who, as part of their work, is involved in the acquisition of new technology by defining technological needs for an organization, preparing requests or bids for new technology, or ensuring that a service or product complies with state and federal laws. In this panel, all panelists were involved in some part of the technology procurement process, though none held the title of procurer. Three panelists were responsible for procurement decisions in the public sector (e.g., public libraries, city government, state government) and two had experience with procurement in nonprofit organizations.

Origin Story

ABOUT ML is a project of PAI working towards establishing new norms on transparency by identifying best practices for documenting and characterizing key components and phases throughout the ML system lifecycle, from design to deployment, including annotations of data, algorithms, performance, and maintenance requirements.

Hanna Wallach, Meg Mitchell, Jenn Wortman Vaughan, and Timnit Gebru held a series of meetings growing out of their work on documentation and standardization, including seminal research related to Datasheets for Datasets and Model Cards for Model Reporting. After those initial discussions, which coincided with the early days of PAI (circa 2018-2019), Hanna and Meg approached PAI and suggested that this work be continued and advanced under the umbrella of the multistakeholder organization, with the continued support and input of the Partner community.

Francesca Rossi and Kush Varshney, both from Partner IBM, also approached PAI with the idea to focus on documentation work and contributed to the early and ongoing efforts of ABOUT ML. IBM’s research related to Factsheets was meaningful to this practical effort. PAI has since continued to work with tech companies, nonprofits, academic researchers, policymakers, end users, and impacted non-users to coordinate and influence practice in the ML documentation space. Eric Horvitz at Microsoft was also a key contributor, identifying the need to unify these projects, bringing datasheets, model cards, and other documentation practices and templates together to inspire the research focus of a single PAI program.

Jingying Yang was PAI’s original Program Lead for the ABOUT ML work. She, along with other staff members within PAI, developed a research plan for how to engage with the stakeholders in order to set a new industry norm of documenting all ML systems built and deployed, thus changing practice at scale. Important contributors during this stage of the work included PAI Fellow Deb Raji and Alice Xiang, Head of Fairness, Transparency, and Accountability Research, who served as PAI editors of the v0 foundational document. Hanna Wallach, Meg Mitchell, Jenn Wortman Vaughan, and Timnit Gebru continued their pivotal support along with Lassana Magassa in shaping the program’s intentions and heightening awareness of important concepts related to attribution and inclusion.

Through an evidence- and research-based, multi-pronged initiative that includes and responds to solicited feedback from many stakeholders, the ABOUT ML work has progressed. Its ultimate goal is to bring together companies and organizations with similar ideas around AI documentation to push for general guidelines and an overall higher bar for responsible AI. We believe this work has helped, and will continue to help, create an organizational infrastructure for ethics in ML and increase responsible tech development and deployment via transparency and accountability.

The work continues, and we welcome the input of the AI community in the ongoing revisions to our foundational document as well as the artifacts and templates we plan to share as a result of that work. We have listed several other contributors to this effort on an internal website and ask that you visit this list and help us add to it the names of other supporters, reviewers, researchers, and contributors in the ABOUT ML effort by filling out this form.

Below is a list of contributors to the ABOUT ML project since its inception:

  • Norberto Andrade – Facebook
  • Thomas Arnold – Tufts HRILab
  • Amir Banifatemi – XPRIZE
  • Rachel Bellamy – IBM
  • Umang Bhatt – Leverhulme Centre for the Future of Intelligence
  • Miranda Bogen – Facebook
  • Ashley Boyd – Mozilla Foundation
  • Jacomo Corbo – QuantumBlack
  • Hannah Darnton – BSR
  • Anat Elhalal – Digital Catapult
  • Daniel First – McKinsey / QuantumBlack
  • Sharon Bradford Franklin – Open Technology Institute
  • Ben Garfinkel – Future of Humanity Institute
  • Timnit Gebru – AI/ML Researcher
  • Jeremy Gillula – EFF
  • Jeremy Holland – Apple
  • Ross Jackson – EY
  • Libby Kinsey – Digital Catapult
  • Brenda Leong – Future of Privacy Forum
  • Tyler Liechty – DeepMind
  • Lassana Magassa – Tech Policy Lab
  • Richard Mallah – Future of Life Institute
  • Meg Mitchell – AI/ML Researcher
  • Amanda Navarro – PolicyLink
  • Deborah Raji – Mozilla
  • Thomas Renner – Fraunhofer IAO
  • Andrew Selbst – Data & Society
  • Ramya Sethuraman – Facebook
  • Reshama Shaikh – Data Umbrella
  • Moninder Singh – IBM
  • Spandana Singh – Open Technology Institute
  • Amber Sinha – Centre for Internet and Society
  • Michael Spranger – Sony
  • Andrew Strait – Ada Lovelace Institute
  • Michael Veale – UCL
  • Briana Vecchione – Cornell University
  • Jennifer Wortman Vaughan – Microsoft
  • Hanna Wallach – Microsoft
  • Adrian Weller – Leverhulme Centre for the Future of Intelligence
  • Abigail Hing Wen – Author & Filmmaker
  • Alexander Wong – Vision and Image Processing Lab at University of Waterloo
  • Andrew Zaldivar – Google
  • Gabi Zijderveld – Affectiva

Table of Contents

Section 0: How to Use this Document

Recommended Reading Plan

Quick Guides

How We Define

Contact for Support

Section 1: Project Overview

1.1 Statement of Importance for ABOUT ML Project

1.1.0 Importance of Transparency: Why a Company Motivated by the Bottom Line Should Adopt ABOUT ML Recommendations

1.1.1 About This Document and Version Numbering

1.1.2 ABOUT ML Goals and Plan

1.1.3 ABOUT ML Project Process and Timeline Overview

1.1.4 Who Is This Project For?
  Audiences for the ABOUT ML Resources
  Stakeholders That Should Be Consulted While Putting Together ABOUT ML Resources
  Audiences for ABOUT ML Documentation Artifacts
  Whose Voices Are Currently Reflected in ABOUT ML?
  Origin Story

Section 2: Literature Review (Current Recommendations on Documentation for Transparency in the ML Lifecycle)

2.1 Demand for Transparency and AI Ethics in ML Systems 

2.2 Documentation to Operationalize AI Ethics Goals

2.2.1 Documentation as a Process in the ML Lifecycle

2.2.2 Key Process Considerations for Documentation

2.3 Research Themes on Documentation for Transparency 

2.3.1 System Design and Set Up

2.3.2 System Development

2.3.3 System Deployment

Section 3: Preliminary Synthesized Documentation Suggestions

3.4.1 Suggested Documentation Sections for Datasets
  Data Specification
  Motivation
  Data Curation
  Collection
  Processing
  Composition
  Types and Sources of Judgement Calls
  Data Integration
  Use
  Distribution
  Maintenance

3.4.2 Suggested Documentation Sections for Models
  Model Specifications
  Model Training
  Evaluation
  Model Integration
  Maintenance

Section 4: Current Challenges of Implementing Documentation

Section 5: Conclusions

Version 0

Version 1

Appendix A: Compiled List of Documentation Questions 

Fact Sheets (Arnold et al. 2018)

Data Sheets (Gebru et al. 2018)

Model Cards (Mitchell et al. 2018)

A “Nutrition Label” for Privacy (Kelley et al. 2009)

The Dataset Nutrition Label: A Framework To Drive Higher Data Quality Standards (Holland et al. 2019)

Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science (Bender and Friedman 2018)

Appendix B: Diverse Voices Process and Artifacts

Procurement Recruitment Email

Procurement Confirmation Email 

Appendix C: Glossary

Sources Cited

  1. Holstein, K., Vaughan, J.W., Daumé, H., Dudík, M., & Wallach, H.M. (2018). Improving Fairness in Machine Learning Systems: What Do Industry Practitioners Need? CHI.
  2. Young, M., Magassa, L. and Friedman, B. (2019) Toward inclusive tech policy design: a method for underrepresented voices to strengthen tech policy documents. Ethics and Information Technology 21(2), 89-103.
  3. World Wide Web Consortium Process Document (W3C) process outlined here: https://www.w3.org/2019/Process-20190301/
  4. Internet Engineering Task Force (IETF) process outlined here: https://www.ietf.org/standards/process/
  5. The Web Hypertext Application Technology Working Group (WHATWG) process outlined here: https://whatwg.org/faq#process
  6. Oever, N., Moriarty, K. The Tao of IETF: A novice's guide to the Internet Engineering Task Force. https://www.ietf.org/about/participate/tao/.
  7. Young, M., Magassa, L. and Friedman, B. (2019) Toward inclusive tech policy design: a method for underrepresented voices to strengthen tech policy documents. Ethics and Information Technology 21(2), 89-103.
  8. Friedman, B, Kahn, Peter H., and Borning, A., (2008) Value sensitive design and information systems. In Kenneth Einar Himma and Herman T. Tavani (Eds.) The Handbook of Information and Computer Ethics., (pp. 70-100) John Wiley & Sons, Inc. http://jgustilo.pbworks.com/f/the-handbook-of-information-and-computer-ethics.pdf#page=104; Davis, J., and P. Nathan, L. (2015). Value sensitive design: applications, adaptations, and critiques. Handbook of Ethics, Values, and Technological Design: Sources, Theory, Values and Application Domains. (pp. 11-40) DOI: 10.1007/978-94-007-6970-0_3. https://www.researchgate.net/publication/283744306_Value_Sensitive_Design_Applications_Adaptations_and_Critiques; Borning, A. and Muller, M. (2012). Next steps for value sensitive design. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '12). (pp 1125-1134) DOI: https://doi.org/10.1145/2207676.2208560 https://dl.acm.org/citation.cfm?id=2208560
  9. Pichai, S., (2018). AI at Google: our principles. The Keyword. https://www.blog.google/technology/ai/ai-principles/; IBM’s Principles for Trust and Transparency. IBM Policy. https://www.ibm.com/blogs/policy/trust-principles/; Microsoft AI principles. Microsoft. https://www.microsoft.com/en-us/ai/our-approach-to-ai; Ethically Aligned Design – Version II. IEEE. https://standards.ieee.org/content/dam/ieee-standards/standards/web/documents/other/ead_v2.pdf
  10. Zeng, Y., Lu, E., and Huangfu, C. (2018) Linking artificial intelligence principles. CoRR https://arxiv.org/abs/1812.04814.
  11. Jessica Fjeld, Hannah Hilligoss, Nele Achten, Maia Levy Daniel, Sally Kagay, and Joshua Feldman (2018). Principled artificial intelligence - a map of ethical and rights based approaches, Berkman Klein Center for Internet and Society, https://ai-hr.cyber.harvard.edu/primp-viz.html
  12. Jobin, A., Ienca, M., & Vayena, E. (2019). Artificial Intelligence: the global landscape of ethics guidelines. arXiv preprint arXiv:1906.11668. https://arxiv.org/pdf/1906.11668.pdf
  13. Jobin, A., Ienca, M., & Vayena, E. (2019). Artificial Intelligence: the global landscape of ethics guidelines. arXiv preprint arXiv:1906.11668. https://arxiv.org/pdf/1906.11668.pdf
  14. Ananny, M., and Kate Crawford (2018). Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability. New Media and Society 20 (3): 973-989.
  15. Whittlestone, J., Nyrup, R., Alexandrova, A., & Cave, S. (2019, January). The Role and Limits of Principles in AI Ethics: Towards a Focus on Tensions. In Proceedings of the AAAI/ACM Conference on AI Ethics and Society, Honolulu, HI, USA (pp. 27-28). http://www.aies-conference.com/wp-content/papers/main/AIES-19_paper_188.pdf; Mittelstadt, B. (2019). AI Ethics–Too Principled to Fail? https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3391293
  16. Greene, D., Hoffmann, A. L., & Stark, L. (2019, January). Better, nicer, clearer, fairer: A critical assessment of the movement for ethical artificial intelligence and machine learning. In Proceedings of the 52nd Hawaii International Conference on System Sciences. https://scholarspace.manoa.hawaii.edu/handle/10125/59651
  17. Raji, I. D., & Buolamwini, J. (2019). Actionable auditing: Investigating the impact of publicly naming biased performance results of commercial ai products. In AAAI/ACM Conf. on AI Ethics and Society (Vol. 1). https://www.media.mit.edu/publications/actionable-auditing-investigating-the-impact-of-publicly-naming-biased-performance-results-of-commercial-ai-products/
  18. Algorithmic Impact Assessment (2019) Government of Canada https://www.canada.ca/en/government/system/digital-government/modern-emerging-technologies/responsible-use-ai/algorithmic-impact-assessment.html
  19. Benjamin, M., Gagnon, P., Rostamzadeh, N., Pal, C., Bengio, Y., & Shee, A. (2019). Towards Standardization of Data Licenses: The Montreal Data License. arXiv preprint arXiv:1903.12262. https://arxiv.org/abs/1903.12262; Responsible AI Licenses v0.1. RAIL: Responsible AI Licenses. https://www.licenses.ai/ai-licenses
  20. See Citation 5
  21. Safe Face Pledge. https://www.safefacepledge.org/; Montreal Declaration on Responsible AI. Universite de Montreal. https://www.montrealdeclaration-responsibleai.com/; The Toronto Declaration: Protecting the right to equality and non-discrimination in machine learning systems. (2018). Amnesty International and Access Now. https://www.accessnow.org/cms/assets/uploads/2018/08/The-Toronto-Declaration_ENG_08-2018.pdf; Dagstuhl Declaration on the application of machine learning and artificial intelligence for social good. https://www.dagstuhl.de/fileadmin/redaktion/Programm/Seminar/19082/Declaration/Declaration.pdf
  22. Dobbe, R., Dean, S., Gilbert, T., & Kohli, N. (2018). A Broader View on Bias in Automated Decision-Making: Reflecting on Epistemology and Dynamics. https://arxiv.org/pdf/1807.00553.pdf
  23. Wagstaff, K. (2012). Machine learning that matters. https://arxiv.org/pdf/1206.4656.pdf ; Friedman, B., Kahn, P. H., Borning, A., & Huldtgren, A. (2013). Value sensitive design and information systems. In Early engagement and new technologies: Opening up the laboratory (pp. 55-95). Springer, Dordrecht. https://vsdesign.org/publications/pdf/non-scan-vsd-and-information-systems.pdf
  24. Dobbe, R., Dean, S., Gilbert, T., & Kohli, N. (2018). A Broader View on Bias in Automated Decision-Making: Reflecting on Epistemology and Dynamics. https://arxiv.org/pdf/1807.00553.pdf
  25. Safe Face Pledge. https://www.safefacepledge.org/
  26. Montreal Declaration on Responsible AI. Universite de Montreal. https://www.montrealdeclaration-responsibleai.com/
  27. Diverse Voices How To Guide. Tech Policy Lab, University of Washington. https://techpolicylab.uw.edu/project/diverse-voices/
  28. Bender, E. M., & Friedman, B. (2018). Data statements for natural language processing: Toward mitigating system bias and enabling better science. Transactions of the Association for Computational Linguistics, 6, 587-604.
  29. Ethically Aligned Design – Version II. IEEE. https://standards.ieee.org/content/dam/ieee-standards/standards/web/documents/other/ead_v2.pdf
  30. Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J. W., Wallach, H., Daumé III, H., & Crawford, K. (2018). Datasheets for datasets. https://arxiv.org/abs/1803.09010; Hazard Communication Standard: Safety Data Sheets. Occupational Safety and Health Administration, US Department of Labor. https://www.osha.gov/Publications/OSHA3514.html
  31. Holland, S., Hosny, A., Newman, S., Joseph, J., & Chmielinski, K. (2018). The dataset nutrition label: A framework to drive higher data quality standards. https://arxiv.org/abs/1805.03677; Kelley, P. G., Bresee, J., Cranor, L. F., & Reeder, R. W. (2009). A nutrition label for privacy. In Proceedings of the 5th Symposium on Usable Privacy and Security (p. 4). ACM. http://cups.cs.cmu.edu/soups/2009/proceedings/a4-kelley.pdf
  32. Mitchell, M., Wu, S., Zaldivar, A., Barnes, P., Vasserman, L., Hutchinson, B., ... & Gebru, T. (2019, January). Model cards for model reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency (pp. 220-229). ACM. https://arxiv.org/abs/1810.03993
  33. Hind, M., Mehta, S., Mojsilovic, A., Nair, R., Ramamurthy, K. N., Olteanu, A., & Varshney, K. R. (2018). Increasing Trust in AI Services through Supplier's Declarations of Conformity. https://arxiv.org/abs/1808.07261
  34. Veale M., Van Kleek M., & Binns R. (2018) ‘Fairness and Accountability Design Needs for Algorithmic Support in High-Stakes Public Sector Decision-Making’ in Proceedings of the ACM Conference on Human Factors in Computing Systems, CHI 2018. https://arxiv.org/abs/1802.01029.
  35. Benjamin, M., Gagnon, P., Rostamzadeh, N., Pal, C., Bengio, Y., & Shee, A. (2019). Towards Standardization of Data Licenses: The Montreal Data License. https://arxiv.org/abs/1903.12262
  36. Cooper, D. M. (2013, April). A Licensing Approach to Regulation of Open Robotics. In Paper for presentation for We Robot: Getting down to business conference, Stanford Law School.
  37. Responsible AI Practices. Google AI. https://ai.google/education/responsible-ai-practices
  38. Everyday Ethics for Artificial Intelligence. (2019). IBM. https://www.ibm.com/watson/assets/duo/pdf/everydayethics.pdf
  39. Federal Trade Commission. (2012). Best Practices for Common Uses of Facial Recognition Technologies (Staff Report). Federal Trade Commission, 30. https://www.ftc.gov/sites/default/files/documents/reports/facing-facts-best-practices-common-uses-facial-recognition-technologies/121022facialtechrpt.pdf
  40. Microsoft (2018). Responsible bots: 10 guidelines for developers of conversational AI. https://www.microsoft.com/en-us/research/uploads/prod/2018/11/Bot_Guidelines_Nov_2018.pdf
  41. Tramer, F., Atlidakis, V., Geambasu, R., Hsu, D., Hubaux, J. P., Humbert, M., ... & Lin, H. (2017, April). FairTest: Discovering unwarranted associations in data-driven applications. In 2017 IEEE European Symposium on Security and Privacy (EuroS&P) (pp. 401-416). IEEE. https://github.com/columbia/fairtest, https://www.mhumbert.com/publications/eurosp17.pdf
  42. Kishore Durg (2018). Testing AI: Teach and Test to raise responsible AI. Accenture Technology Blog. https://www.accenture.com/us-en/insights/technology/testing-AI
  43. Kush R. Varshney (2018). Introducing AI Fairness 360. IBM Research Blog. https://www.ibm.com/blogs/research/2018/09/ai-fairness-360/
  44. Dave Gershgorn (2018). Facebook says it has a tool to detect bias in its artificial intelligence. Quartz. https://qz.com/1268520/facebook-says-it-has-a-tool-to-detect-bias-in-its-artificial-intelligence/
  45. James Wexler. (2018) The What-If Tool: Code-Free Probing of Machine Learning Models. Google AI Blog. https://ai.googleblog.com/2018/09/the-what-if-tool-code-free-probing-of.html
  46. Miro Dudík, John Langford, Hanna Wallach, and Alekh Agarwal (2018). Machine Learning for fair decisions. Microsoft Research Blog. https://www.microsoft.com/en-us/research/blog/machine-learning-for-fair-decisions/
  47. Veale, M., Binns, R., & Edwards, L. (2018). Algorithms that Remember: Model Inversion Attacks and Data Protection Law. Phil. Trans. R. Soc. A, 376, 20180083. https://doi.org/10/gfc63m
  48. Floridi, L. (2010, February). Information: A Very Short Introduction.
  49. Data Information Specialists Committee UK, 2007. http://www.disc-uk.org/qanda.html.
  50. Harwell, Drew. “Federal Study Confirms Racial Bias of Many Facial-Recognition Systems, Casts Doubt on Their Expanding Use.” The Washington Post, WP Company, 21 Dec. 2019, www.washingtonpost.com/technology/2019/12/19/federal-study-confirms-racial-bias-many-facial-recognition-systems-casts-doubt-their-expanding-use/
  51. Hildebrandt, M. (2019) ‘Privacy as Protection of the Incomputable Self: From Agnostic to Agonistic Machine Learning’, Theoretical Inquiries in Law, 20(1) 83–121.
  52. D'Amour, A., Heller, K., Moldovan, D., Adlam, B., Alipanahi, B., Beutel, A., ... & Sculley, D. (2020). Underspecification presents challenges for credibility in modern machine learning. arXiv preprint arXiv:2011.03395.
  53. Selinger, E. (2019). ‘Why You Can’t Really Consent to Facebook’s Facial Recognition’, One Zero. https://onezero.medium.com/why-you-cant-really-consent-to-facebook-s-facial-recognition-6bb94ea1dc8f
  54. Lum, K., & Isaac, W. (2016). To predict and serve?. Significance, 13(5), 14-19. https://rss.onlinelibrary.wiley.com/doi/full/10.1111/j.1740-9713.2016.00960.x
  55. LabelInsight (2016). “Drive Long-Term Trust & Loyalty Through Transparency”. https://www.labelinsight.com/Transparency-ROI-Study
  56. Crawford and Paglen, https://www.excavating.ai/
  57. Geva, Mor & Goldberg, Yoav & Berant, Jonathan. (2019). Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets. https://arxiv.org/pdf/1908.07898.pdf
  58. Bender, E. M., & Friedman, B. (2018). Data statements for natural language processing: Toward mitigating system bias and enabling better science. Transactions of the Association for Computational Linguistics, 6, 587-604.
  59. Desmond U. Patton et al (2017).
  60. See Cynthia Dwork et al.,
  61. Katta Spiel, Oliver L. Haimson, and Danielle Lottridge. (2019). How to do better with gender on surveys: a guide for HCI researchers. Interactions. 26, 4 (June 2019), 62-65. DOI: https://doi.org/10.1145/3338283
  62. A. Doan, A. Y. Halevy, and Z. G. Ives. Principles of Data Integration. Morgan Kaufmann, 2012
  63. Momin M. Malik. (2019). Can algorithms themselves be biased? Medium. https://medium.com/berkman-klein-center/can-algorithms-themselves-be-biased-cffecbf2302c
  64. Fire, Michael, and Carlos Guestrin (2019). “Over-Optimization of Academic Publishing Metrics: Observing Goodhart’s Law in Action.” GigaScience 8 (giz053). https://doi.org/10.1093/gigascience/giz053.
  65. Vogelsang, A., & Borg, M. (2019, September). Requirements engineering for machine learning: Perspectives from data scientists. In 2019 IEEE 27th International Requirements Engineering Conference Workshops (REW) (pp. 245-251). IEEE
  66. Eckersley, P. (2018). Impossibility and Uncertainty Theorems in AI Value Alignment (or why your AGI should not have a utility function). arXiv preprint arXiv:1901.00064.
  67. Partnership on AI. Report on Algorithmic Risk Assessment Tools in the U.S. Criminal Justice System, Requirement 5.
  68. Eckersley, P. (2018). Impossibility and Uncertainty Theorems in AI Value Alignment (or why your AGI should not have a utility function). arXiv preprint arXiv:1901.00064.https://arxiv.org/abs/1901.00064
  69. If it is not, there is likely a bug in the code. Checking a predictive model's performance on the training set cannot distinguish irreducible error (which comes from intrinsic variance of the system) from error introduced by bias and variance in the estimator; this is universal, and has nothing to do with different settings or
  70. Selbst, Andrew D. and Boyd, Danah and Friedler, Sorelle and Venkatasubramanian, Suresh and Vertesi, Janet (2018). “Fairness and Abstraction in Sociotechnical Systems”, ACM Conference on Fairness, Accountability, and Transparency (FAT*). https://ssrn.com/abstract=3265913
  71. Tools that can be used to explore and audit the predictive model fairness include FairML, Lime, IBM AI Fairness 360, SHAP, Google What-If Tool, and many others
  72. Wagstaff, K. (2012). Machine learning that matters. arXiv preprint arXiv:1206.4656. https://arxiv.org/abs/1206.4656