ABOUT ML Reference Document

Section 4: Current Challenges of Implementing Documentation

This section is where PAI invites comments, anecdotes, case studies, and broader stories from documentation efforts, as well as results from any solutions (effective or ineffective) attempted to address these challenges. Please also share whether your organization has encountered these challenges, has run into new ones, or has not experienced them in your work.

When organizations attempt to implement the recommended documentation guidelines, a number of common challenges arise. The following is an overview of the challenges identified so far. The eventual goal of this chapter is to help practitioners foresee challenges in their own settings and to provide options for addressing them.

Intra- and Inter-Organizational Cooperation

Without cooperation and alignment across multiple internal (and possibly external) teams, it is very difficult to set up the new organizational processes, secure buy-in, and obtain the budget and staff required to adopt documentation for transparency as an organizational norm. This is especially true when documentation is not required by external factors (e.g., procurement requirements or audits). Getting alignment on prioritizing the principle of transparency is the critical first step to implementing any documentation practices. Given the early stage of documentation practices, organizations may also need outside expertise to help design the right processes, templates, and practices. ABOUT ML hopes to offer a starting resource for this work.

Documentation Takes Time

Related to the above, another challenge is that proper documentation takes time, and time is a valuable commodity in the fast-moving technology industry. Internal and external incentives often do not align with the incentives behind proper documentation.

Benchmark Customization

For many documentation criteria, it is difficult to identify appropriate and demographically representative benchmarks with which to fully evaluate an ML system. Often, separate benchmarks customized to the ML system's context of use must be developed to satisfy documentation requirements and to evaluate the deployment. Because this can be costly and time intensive, there are real organizational tradeoffs to navigate among benchmark quality, speed to deployment, and quality of evaluation and documentation.
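To make this concrete, below is a minimal sketch of how a team might report per-subgroup performance on a custom benchmark. The schema (the group, label, and prediction columns) and the accuracy threshold are illustrative assumptions, not part of any ABOUT ML template.

```python
# Minimal sketch: per-subgroup evaluation on a custom benchmark.
# Column names and the threshold below are illustrative assumptions.
import pandas as pd

def subgroup_report(benchmark: pd.DataFrame, min_accuracy: float = 0.9) -> pd.DataFrame:
    """Compute per-subgroup sample counts and accuracy on a benchmark
    assumed to have 'group', 'label', and 'prediction' columns."""
    report = (
        benchmark
        .assign(correct=lambda df: df["label"] == df["prediction"])
        .groupby("group")["correct"]
        .agg(n="size", accuracy="mean")
        .reset_index()
    )
    # Flag subgroups that fall below the documented performance bar,
    # so gaps surface in the documentation rather than staying hidden.
    report["meets_bar"] = report["accuracy"] >= min_accuracy
    return report

if __name__ == "__main__":
    demo = pd.DataFrame({
        "group":      ["a", "a", "b", "b", "b"],
        "label":      [1, 0, 1, 1, 0],
        "prediction": [1, 0, 0, 1, 0],
    })
    print(subgroup_report(demo))
```

Even a small report like this makes the tradeoff explicit: subgroups with a low sample count signal that the benchmark, not just the model, needs further investment.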

Metric Selection

There exist numerous academic and industry metrics for measuring the performance of an ML system, as well as many different fairness evaluation metrics and definitions, too many to cover in detail in this document. (Tools that can be used to explore and audit predictive model fairness include FairML, LIME, IBM AI Fairness 360, SHAP, the Google What-If Tool, and many others.) Metric selection thus requires ethical choices based on the specific situation, which will often require interdisciplinary work with ethicists, legal experts, representatives of affected communities, and others. Selecting metrics, both performance metrics and fairness-oriented metrics, and documenting why a given metric was chosen is therefore additional work that teams need to budget time for. In "Machine Learning That Matters," Kiri Wagstaff suggests the high-level guidance of defining metrics according to the intended outcome rather than evaluating model performance on arbitrary test sets with typical ML performance metrics, which is a reasonable starting point for most projects. For instance, if an ML system is meant to optimize for revenue, measure revenue outcomes from the ML system directly, or use a proxy like advertisement impressions, rather than relying on the Area Under the Curve (AUC) of an isolated model.
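As a hedged illustration of Wagstaff's point, the sketch below contrasts a generic ranking metric with an outcome-aligned one on the same holdout data; all numbers are invented for the example.

```python
# Sketch: outcome-aligned metric vs. a generic ML metric.
# All data below is invented for illustration.
from sklearn.metrics import roc_auc_score

# Hypothetical holdout: whether each user converted, the model's score,
# and the revenue realized when the system acted on that prediction.
converted = [1, 0, 1, 1, 0, 0]
scores    = [0.9, 0.4, 0.7, 0.6, 0.5, 0.2]
revenue   = [12.0, 0.0, 3.5, 8.0, 0.0, 0.0]  # dollars per decision

# Generic metric: ranking quality of the isolated model.
auc = roc_auc_score(converted, scores)

# Outcome-aligned metric: measure what the system is meant to optimize.
revenue_per_decision = sum(revenue) / len(revenue)

print(f"AUC: {auc:.2f}; revenue per decision: ${revenue_per_decision:.2f}")
```

Documenting which of these numbers drove a deployment decision, and why, is exactly the kind of record this section argues teams should budget time for.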

Privacy

Soliciting and mining the demographic metadata used to evaluate whether an ML system performs fairly across intersectional subgroups can itself expose identifying information about users and image subjects. To mitigate this risk, it is important to store and report such data in ways that do not compromise individual privacy in exchange for system-level transparency.
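One common mitigation, sketched below under assumed names and an assumed threshold, is to publish only aggregated subgroup statistics and suppress any cell smaller than a minimum count, so that no individual can be singled out from a fairness report.

```python
# Sketch: aggregate subgroup counts with small-cell suppression.
# The threshold k and the group labels are illustrative assumptions.
from collections import Counter
from typing import Dict, List, Union

def suppressed_counts(groups: List[str], k: int = 5) -> Dict[str, Union[int, str]]:
    """Return subgroup counts, masking any cell with fewer than k members."""
    counts = Counter(groups)
    return {group: (n if n >= k else f"<{k}") for group, n in counts.items()}

print(suppressed_counts(["a"] * 12 + ["b"] * 3))  # {'a': 12, 'b': '<5'}
```

Suppression is only one option; differential privacy and restricted-access storage are stronger alternatives when the metadata itself is sensitive.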

Compromising Intellectual Property

One commonly feared risk of documentation is losing trade secrets and intellectual property by disclosing too much information. However, allowing the protection of intellectual property or "trade secrets" to serve as a blanket excuse for omitting information opens the door for companies to hide crucial information that should be revealed in the public interest. One goal of the ABOUT ML guidelines is to indicate the areas all companies should be willing to share. Each documentation recommendation in Section 3: Preliminary Synthesized Documentation Suggestions discusses more specific pros and cons of disclosing that information about an ML system, to better inform this tradeoff.

System Security Vulnerability

Another fear about documentation is that it reveals attack surfaces in an ML system by providing too much insight into how the system was built. This is a fine line to walk. On the one hand, it is an organization's responsibility to build robust security measures into its ML systems, and documenting those measures can spread knowledge to other organizations attempting to secure their own systems. On the other hand, the same knowledge can be misused for attacks rather than for shoring up collective defenses. A detailed discussion as an industry would help discern between low- and high-risk types of information disclosure. A related risk is documentation that reveals blind spots of the system that nefarious actors could exploit to game or hack it. The first step toward solutions for these risks is naming them; the next is investing in research and working toward best practices.

Lack of Formal Decision-Making or Development Practices

In some situations, models are tuned by adjusting various parameters until the result "looks right." Such ad hoc practices can be difficult to document because they lack formal structure. Yet this sort of ad hoc decision-making is especially important to document, precisely because it can obscure consequential decisions about who developed the model, what choices were made, and how.
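One lightweight remedy is a running decision log. Below is a minimal sketch, assuming a JSON-lines file and invented field names, of how each manual tuning step and its rationale could be captured at the moment it happens.

```python
# Sketch: append-only log of ad hoc tuning decisions.
# The file format and field names are illustrative assumptions.
import getpass
import json
import time

def log_tuning_step(path, parameter, old_value, new_value, rationale):
    """Append one manual tuning decision to a JSON-lines audit file."""
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "author": getpass.getuser(),
        "parameter": parameter,
        "old_value": old_value,
        "new_value": new_value,
        "rationale": rationale,  # why the result now "looks right"
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

log_tuning_step("tuning_log.jsonl", "smoothing_window", 7, 14,
                "weekly seasonality made the 7-day output too noisy")
```

Even this minimal record answers the who, what, and how questions that purely informal tuning leaves undocumented.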

Section 5: Conclusions

Version 0

The objective of the ABOUT ML project is to work towards a new industry emphasis on transparent ML systems, and this document is a first step in providing a guide for practitioners to start taking that goal seriously. The goal of the v0 document was to synthesize insights and recommendations from the existing body of literature and to begin a public multistakeholder conversation about how to improve ML transparency.

Key ideas include:

  • Documentation is valuable both as a process and an artifact built to accomplish specific goals (and thus extra documentation for its own sake will not always be useful).
  • Internal documentation (for other teams inside the same organization, more detailed) and external documentation (for broader consumption, fewer sensitive details) are both valuable and should be undertaken together as they provide complementary incentives and benefits.
  • Avoiding misuse and harm from ML systems is a focus of current research and practice. Adhering to a documentation process that demands intentional reflection about how a system might be used and misused, in which contexts, and impacting whom is one first step towards potentially reducing harms. Incorporating feedback from diverse perspectives early, often, and throughout every stage in the ML lifecycle is another risk mitigation strategy. The Diverse Voices process from the Tech Policy Lab at the University of Washington is one formalized methodology for incorporating this type of feedback.

Version 1

The transition from the v0 document to the current ABOUT ML Reference Document is based on input we received from the Diverse Voices panels. It does not represent an overhaul of the original work but an update with an enhanced focus on readability and utility. Notable adjustments include:

Table of Contents

ABOUT ML Reference Document

Section 0: How to Use this Document

Recommended Reading Plan

Quick Guides

How We Define

Contact for Support

Section 1: Project Overview

1.1 Statement of Importance for ABOUT ML Project

1.1.0 Importance of Transparency: Why a Company Motivated by the Bottom Line Should Adopt ABOUT ML Recommendations

1.1.1 About This Document and Version Numbering

1.1.2 ABOUT ML Goals and Plan

1.1.3 ABOUT ML Project Process and Timeline Overview

1.1.4 Who Is This Project For?

1.1.4.1 Audiences for the ABOUT ML Resources

1.1.4.2 Stakeholders That Should Be Consulted While Putting Together ABOUT ML Resources

1.1.4.3 Audiences for ABOUT ML Documentation Artifacts

1.1.4.4 Whose Voices Are Currently Reflected in ABOUT ML?

1.1.4.5 Origin Story

Section 2: Literature Review (Current Recommendations on Documentation for Transparency in the ML Lifecycle)

2.1 Demand for Transparency and AI Ethics in ML Systems 

2.2 Documentation to Operationalize AI Ethics Goals

2.2.1 Documentation as a Process in the ML Lifecycle

2.2.2 Key Process Considerations for Documentation

2.3 Research Themes on Documentation for Transparency 

2.3.1 System Design and Set Up

2.3.2 System Development

2.3.3 System Deployment

Section 3: Preliminary Synthesized Documentation Suggestions

3.4.1 Suggested Documentation Sections for Datasets

3.4.1.1 Data Specification

3.4.1.1.1 Motivation

3.4.1.2 Data Curation 

3.4.1.2.1 Collection

3.4.1.2.2 Processing

3.4.1.2.3 Composition

3.4.1.2.4 Types and Sources of Judgement Calls

3.4.1.3 Data Integration

3.4.1.3.1 Use

3.4.1.3.2 Distribution

3.4.1.4 Maintenance

3.4.2 Suggested Documentation Sections for Models

3.4.2.1 Model Specifications

3.4.2.2 Model Training

3.4.2.3 Evaluation

3.4.2.4 Model Integration

3.4.2.5 Maintenance

Section 4: Current Challenges of Implementing Documentation

Section 5: Conclusions

Version 0

Version 1

Appendix A: Compiled List of Documentation Questions 

Fact Sheets (Arnold et al. 2018)

Data Sheets (Gebru et al. 2018)

Model Cards (Mitchell et al. 2018)

A “Nutrition Label” for Privacy (Kelley et al. 2009)

The Dataset Nutrition Label: A Framework To Drive Higher Data Quality Standards (Holland et al. 2019)

Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science (Bender and Friedman 2018)

Appendix B: Diverse Voices Process and Artifacts

Procurement Recruitment Email

Procurement Confirmation Email 

Appendix C: Glossary

Sources Cited

  1. Holstein, K., Vaughan, J.W., Daumé, H., Dudík, M., & Wallach, H.M. (2018). Improving Fairness in Machine Learning Systems: What Do Industry Practitioners Need? CHI.
  2. Young, M., Magassa, L. and Friedman, B. (2019) Toward inclusive tech policy design: a method for underrepresented voices to strengthen tech policy documents. Ethics and Information Technology 21(2), 89-103.
  3. World Wide Web Consortium Process Document (W3C) process outlined here: https://www.w3.org/2019/Process-20190301/
  4. Internet Engineering Task Force (IETF) process outlined here: https://www.ietf.org/standards/process/
  5. The Web Hypertext Application Technology Working Group (WHATWG) process outlined here: https://whatwg.org/faq#process
  6. Oever, N., Moriarty, K. The Tao of IETF: A novice's guide to the Internet Engineering Task Force. https://www.ietf.org/about/participate/tao/.
  7. Young, M., Magassa, L. and Friedman, B. (2019) Toward inclusive tech policy design: a method for underrepresented voices to strengthen tech policy documents. Ethics and Information Technology 21(2), 89-103.
  8. Friedman, B, Kahn, Peter H., and Borning, A., (2008) Value sensitive design and information systems. In Kenneth Einar Himma and Herman T. Tavani (Eds.) The Handbook of Information and Computer Ethics., (pp. 70-100) John Wiley & Sons, Inc. http://jgustilo.pbworks.com/f/the-handbook-of-information-and-computer-ethics.pdf#page=104; Davis, J., and P. Nathan, L. (2015). Value sensitive design: applications, adaptations, and critiques. Handbook of Ethics, Values, and Technological Design: Sources, Theory, Values and Application Domains. (pp. 11-40) DOI: 10.1007/978-94-007-6970-0_3. https://www.researchgate.net/publication/283744306_Value_Sensitive_Design_Applications_Adaptations_and_Critiques; Borning, A. and Muller, M. (2012). Next steps for value sensitive design. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '12). (pp 1125-1134) DOI: https://doi.org/10.1145/2207676.2208560 https://dl.acm.org/citation.cfm?id=2208560
  9. Pichai, S., (2018). AI at Google: our principles. The Keyword. https://www.blog.google/technology/ai/ai-principles/; IBM’s Principles for Trust and Transparency. IBM Policy. https://www.ibm.com/blogs/policy/trust-principles/; Microsoft AI principles. Microsoft. https://www.microsoft.com/en-us/ai/our-approach-to-ai; Ethically Aligned Design – Version II. IEEE. https://standards.ieee.org/content/dam/ieee-standards/standards/web/documents/other/ead_v2.pdf
  10. Zeng, Y., Lu, E., and Huangfu, C. (2018) Linking artificial intelligence principles. CoRR https://arxiv.org/abs/1812.04814.
  11. Jessica Fjeld, Hannah Hilligoss, Nele Achten, Maia Levy Daniel, Sally Kagay, and Joshua Feldman (2018). Principled artificial intelligence: a map of ethical and rights-based approaches. Berkman Klein Center for Internet and Society. https://ai-hr.cyber.harvard.edu/primp-viz.html
  12. Jobin, A., Ienca, M., & Vayena, E. (2019). Artificial Intelligence: the global landscape of ethics guidelines. arXiv preprint arXiv:1906.11668. https://arxiv.org/pdf/1906.11668.pdf
  13. Jobin, A., Ienca, M., & Vayena, E. (2019). Artificial Intelligence: the global landscape of ethics guidelines. arXiv preprint arXiv:1906.11668. https://arxiv.org/pdf/1906.11668.pdf
  14. Ananny, M., and Kate Crawford (2018). Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability. New Media and Society 20 (3): 973-989.
  15. Whittlestone, J., Nyrup, R., Alexandrova, A., & Cave, S. (2019, January). The Role and Limits of Principles in AI Ethics: Towards a Focus on Tensions. In Proceedings of the AAAI/ACM Conference on AI Ethics and Society, Honolulu, HI, USA (pp. 27-28). http://www.aies-conference.com/wp-content/papers/main/AIES-19_paper_188.pdf; Mittelstadt, B. (2019). AI Ethics–Too Principled to Fail? https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3391293
  16. Greene, D., Hoffmann, A. L., & Stark, L. (2019, January). Better, nicer, clearer, fairer: A critical assessment of the movement for ethical artificial intelligence and machine learning. In Proceedings of the 52nd Hawaii International Conference on System Sciences. https://scholarspace.manoa.hawaii.edu/handle/10125/59651
  17. Raji, I. D., & Buolamwini, J. (2019). Actionable auditing: Investigating the impact of publicly naming biased performance results of commercial ai products. In AAAI/ACM Conf. on AI Ethics and Society (Vol. 1). https://www.media.mit.edu/publications/actionable-auditing-investigating-the-impact-of-publicly-naming-biased-performance-results-of-commercial-ai-products/
  18. Algorithmic Impact Assessment (2019) Government of Canada https://www.canada.ca/en/government/system/digital-government/modern-emerging-technologies/responsible-use-ai/algorithmic-impact-assessment.html
  19. Benjamin, M., Gagnon, P., Rostamzadeh, N., Pal, C., Bengio, Y., & Shee, A. (2019). Towards Standardization of Data Licenses: The Montreal Data License. arXiv preprint arXiv:1903.12262. https://arxiv.org/abs/1903.12262; Responsible AI Licenses v0.1. RAIL: Responsible AI Licenses. https://www.licenses.ai/ai-licenses
  20. See Citation 5
  21. Safe Face Pledge. https://www.safefacepledge.org/; Montreal Declaration on Responsible AI. Université de Montréal. https://www.montrealdeclaration-responsibleai.com/; The Toronto Declaration: Protecting the right to equality and non-discrimination in machine learning systems. (2018). Amnesty International and Access Now. https://www.accessnow.org/cms/assets/uploads/2018/08/The-Toronto-Declaration_ENG_08-2018.pdf; Dagstuhl Declaration on the application of machine learning and artificial intelligence for social good. https://www.dagstuhl.de/fileadmin/redaktion/Programm/Seminar/19082/Declaration/Declaration.pdf
  22. Dobbe, R., Dean, S., Gilbert, T., & Kohli, N. (2018). A Broader View on Bias in Automated Decision-Making: Reflecting on Epistemology and Dynamics. https://arxiv.org/pdf/1807.00553.pdf
  23. Wagstaff, K. (2012). Machine learning that matters. https://arxiv.org/pdf/1206.4656.pdf ; Friedman, B., Kahn, P. H., Borning, A., & Huldtgren, A. (2013). Value sensitive design and information systems. In Early engagement and new technologies: Opening up the laboratory (pp. 55-95). Springer, Dordrecht. https://vsdesign.org/publications/pdf/non-scan-vsd-and-information-systems.pdf
  24. Dobbe, R., Dean, S., Gilbert, T., & Kohli, N. (2018). A Broader View on Bias in Automated Decision-Making: Reflecting on Epistemology and Dynamics. https://arxiv.org/pdf/1807.00553.pdf
  25. Safe Face Pledge. https://www.safefacepledge.org/
  26. Montreal Declaration on Responsible AI. Université de Montréal. https://www.montrealdeclaration-responsibleai.com/
  27. Diverse Voices How To Guide. Tech Policy Lab, University of Washington. https://techpolicylab.uw.edu/project/diverse-voices/
  28. Bender, E. M., & Friedman, B. (2018). Data statements for natural language processing: Toward mitigating system bias and enabling better science. Transactions of the Association for Computational Linguistics, 6, 587-604.
  29. Ethically Aligned Design – Version II. IEEE. https://standards.ieee.org/content/dam/ieee-standards/standards/web/documents/other/ead_v2.pdf
  30. Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J. W., Wallach, H., Daumé III, H., & Crawford, K. (2018). Datasheets for datasets. https://arxiv.org/abs/1803.09010; Hazard Communication Standard: Safety Data Sheets. Occupational Safety and Health Administration, US Department of Labor. https://www.osha.gov/Publications/OSHA3514.html
  31. Holland, S., Hosny, A., Newman, S., Joseph, J., & Chmielinski, K. (2018). The dataset nutrition label: A framework to drive higher data quality standards. https://arxiv.org/abs/1805.03677; Kelley, P. G., Bresee, J., Cranor, L. F., & Reeder, R. W. (2009). A nutrition label for privacy. In Proceedings of the 5th Symposium on Usable Privacy and Security (p. 4). ACM. http://cups.cs.cmu.edu/soups/2009/proceedings/a4-kelley.pdf
  32. Mitchell, M., Wu, S., Zaldivar, A., Barnes, P., Vasserman, L., Hutchinson, B., ... & Gebru, T. (2019, January). Model cards for model reporting. In Proceedings of the Conference on Fairness, Accountability, and Transparency (pp. 220-229). ACM. https://arxiv.org/abs/1810.03993
  33. Hind, M., Mehta, S., Mojsilovic, A., Nair, R., Ramamurthy, K. N., Olteanu, A., & Varshney, K. R. (2018). Increasing Trust in AI Services through Supplier's Declarations of Conformity. https://arxiv.org/abs/1808.07261
  34. Veale M., Van Kleek M., & Binns R. (2018) ‘Fairness and Accountability Design Needs for Algorithmic Support in High-Stakes Public Sector Decision-Making’ in Proceedings of the ACM Conference on Human Factors in Computing Systems, CHI 2018. https://arxiv.org/abs/1802.01029.
  35. Benjamin, M., Gagnon, P., Rostamzadeh, N., Pal, C., Bengio, Y., & Shee, A. (2019). Towards Standardization of Data Licenses: The Montreal Data License. https://arxiv.org/abs/1903.12262
  36. Cooper, D. M. (2013, April). A Licensing Approach to Regulation of Open Robotics. In Paper for presentation for We Robot: Getting down to business conference, Stanford Law School.
  37. Responsible AI Practices. Google AI. https://ai.google/education/responsible-ai-practices
  38. Everyday Ethics for Artificial Intelligence. (2019). IBM. https://www.ibm.com/watson/assets/duo/pdf/everydayethics.pdf
  39. Federal Trade Commission. (2012). Best Practices for Common Uses of Facial Recognition Technologies (Staff Report). Federal Trade Commission, 30. https://www.ftc.gov/sites/default/files/documents/reports/facing-facts-best-practices-common-uses-facial-recognition-technologies/121022facialtechrpt.pdf
  40. Microsoft (2018). Responsible bots: 10 guidelines for developers of conversational AI. https://www.microsoft.com/en-us/research/uploads/prod/2018/11/Bot_Guidelines_Nov_2018.pdf
  41. Tramer, F., Atlidakis, V., Geambasu, R., Hsu, D., Hubaux, J. P., Humbert, M., ... & Lin, H. (2017, April). FairTest: Discovering unwarranted associations in data-driven applications. In 2017 IEEE European Symposium on Security and Privacy (EuroS&P) (pp. 401-416). IEEE. https://github.com/columbia/fairtest, https://www.mhumbert.com/publications/eurosp17.pdf
  42. Kishore Durg (2018). Testing AI: Teach and Test to raise responsible AI. Accenture Technology Blog. https://www.accenture.com/us-en/insights/technology/testing-AI
  43. Kush R. Varshney (2018). Introducing AI Fairness 360. IBM Research Blog. https://www.ibm.com/blogs/research/2018/09/ai-fairness-360/
  44. Dave Gershgorn (2018). Facebook says it has a tool to detect bias in its artificial intelligence. Quartz. https://qz.com/1268520/facebook-says-it-has-a-tool-to-detect-bias-in-its-artificial-intelligence/
  45. James Wexler. (2018) The What-If Tool: Code-Free Probing of Machine Learning Models. Google AI Blog. https://ai.googleblog.com/2018/09/the-what-if-tool-code-free-probing-of.html
  46. Miro Dudík, John Langford, Hanna Wallach, and Alekh Agarwal (2018). Machine Learning for fair decisions. Microsoft Research Blog. https://www.microsoft.com/en-us/research/blog/machine-learning-for-fair-decisions/
  47. Veale, M., Binns, R., & Edwards, L. (2018). Algorithms that Remember: Model Inversion Attacks and Data Protection Law. Phil. Trans. R. Soc. A, 376, 20180083. https://doi.org/10/gfc63m
  48. Floridi, L. (2010, February). Information: A Very Short Introduction.
  49. Data Information Specialists Committee UK, 2007. http://www.disc-uk.org/qanda.html.
  50. Harwell, Drew. “Federal Study Confirms Racial Bias of Many Facial-Recognition Systems, Casts Doubt on Their Expanding Use.” The Washington Post, WP Company, 21 Dec. 2019, www.washingtonpost.com/technology/2019/12/19/federal-study-confirms-racial-bias-many-facial-recognition-systems-casts-doubt-their-expanding-use/
  51. Hildebrandt, M. (2019) ‘Privacy as Protection of the Incomputable Self: From Agnostic to Agonistic Machine Learning’, Theoretical Inquiries in Law, 20(1) 83–121.
  52. D'Amour, A., Heller, K., Moldovan, D., Adlam, B., Alipanahi, B., Beutel, A., ... & Sculley, D. (2020). Underspecification presents challenges for credibility in modern machine learning. arXiv preprint arXiv:2011.03395.
  53. Selinger, E. (2019). ‘Why You Can’t Really Consent to Facebook’s Facial Recognition’, One Zero. https://onezero.medium.com/why-you-cant-really-consent-to-facebook-s-facial-recognition-6bb94ea1dc8f
  54. Lum, K., & Isaac, W. (2016). To predict and serve?. Significance, 13(5), 14-19. https://rss.onlinelibrary.wiley.com/doi/full/10.1111/j.1740-9713.2016.00960.x
  55. LabelInsight (2016). “Drive Long-Term Trust & Loyalty Through Transparency”. https://www.labelinsight.com/Transparency-ROI-Study
  56. Crawford, K. and Paglen, T. Excavating AI: The Politics of Images in Machine Learning Training Sets. https://www.excavating.ai/
  57. Geva, Mor & Goldberg, Yoav & Berant, Jonathan. (2019). Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets. https://arxiv.org/pdf/1908.07898.pdf
  58. Bender, E. M., & Friedman, B. (2018). Data statements for natural language processing: Toward mitigating system bias and enabling better science. Transactions of the Association for Computational Linguistics, 6, 587-604.
  59. Desmond U. Patton et al (2017).
  60. See Cynthia Dwork et al.,
  61. Katta Spiel, Oliver L. Haimson, and Danielle Lottridge. (2019). How to do better with gender on surveys: a guide for HCI researchers. Interactions. 26, 4 (June 2019), 62-65. DOI: https://doi.org/10.1145/3338283
  62. A. Doan, A. Y. Halevy, and Z. G. Ives. Principles of Data Integration. Morgan Kaufmann, 2012
  63. Momin M. Malik. (2019). Can algorithms themselves be biased? Medium. https://medium.com/berkman-klein-center/can-algorithms-themselves-be-biased-cffecbf2302c
  64. Fire, Michael, and Carlos Guestrin (2019). “Over-Optimization of Academic Publishing Metrics: Observing Goodhart’s Law in Action.” GigaScience 8 (giz053). https://doi.org/10.1093/gigascience/giz053.
  65. Vogelsang, A., & Borg, M. (2019, September). Requirements engineering for machine learning: Perspectives from data scientists. In 2019 IEEE 27th International Requirements Engineering Conference Workshops (REW) (pp. 245-251). IEEE
  66. Eckersley, P. (2018). Impossibility and Uncertainty Theorems in AI Value Alignment (or why your AGI should not have a utility function). arXiv preprint arXiv:1901.00064.
  67. Partnership on AI. Report on Algorithmic Risk Assessment Tools in the U.S. Criminal Justice System, Requirement 5.
  68. Eckersley, P. (2018). Impossibility and Uncertainty Theorems in AI Value Alignment (or why your AGI should not have a utility function). arXiv preprint arXiv:1901.00064.https://arxiv.org/abs/1901.00064
  69. If it is not, there is likely a bug in the code. Checking a predictive model's performance on the training set cannot distinguish irreducible error (which comes from intrinsic variance of the system) from error introduced by bias and variance in the estimator; this is universal, and has nothing to do with different settings or
  70. Selbst, Andrew D. and Boyd, Danah and Friedler, Sorelle and Venkatasubramanian, Suresh and Vertesi, Janet (2018). “Fairness and Abstraction in Sociotechnical Systems”, ACM Conference on Fairness, Accountability, and Transparency (FAT*). https://ssrn.com/abstract=3265913
  71. Tools that can be used to explore and audit the predictive model fairness include FairML, Lime, IBM AI Fairness 360, SHAP, Google What-If Tool, and many others
  72. Wagstaff, K. (2012). Machine learning that matters. arXiv preprint arXiv:1206.4656. https://arxiv.org/abs/1206.4656