Requirement 10: Jurisdictions must take responsibility for the post-deployment evaluation, monitoring, and auditing of these tools
Jurisdictions must periodically publish an independent review, algorithmic impact assessment, or audit of every risk assessment tool they use, to verify that the requirements listed in this report have been met. For further guidance on how such audits and evaluations might be structured, see AI Now Institute, Algorithmic Impact Assessments: A Practical Framework for Public Agency Accountability, https://ainowinstitute.org/aiareport2018.pdf; Christian Sandvig et al., Auditing Algorithms: Research Methods for Detecting Discrimination on Internet Platforms (2014). Subsequent audits will need to examine the outcomes and operation of the system on a regular basis. Such review processes must also be localized, because the conditions of crime, law enforcement response, and culture among judges and clerks are all local phenomena. See John Logan Koepke and David G. Robinson, Danger Ahead: Risk Assessment and the Future of Bail Reform, 93 Wash. L. Rev. 1725 (2018). Ideally, these processes should operate with staff support and buy-in within judicial institutions while also drawing on external expertise.
To ensure transparency and accountability, an independent outside body (such as a review board) must be responsible for overseeing the audit. This body should be composed of legal, technical, and statistical experts, currently and formerly incarcerated individuals, public defenders, public prosecutors, judges, and civil rights organizations, among others. The audits and their methodology must be open to public review and comment. To mitigate privacy risks, published versions of the audits should be redacted and sufficiently blinded to prevent de-anonymization. For a discussion, see Latanya Sweeney & Ji Su Yoo, De-anonymizing South Korean Resident Registration Numbers Shared in Prescription Data, Technology Science (Sept. 29, 2015), https://techscience.org/a/2015092901. Techniques exist that provide provable limits on the risk of re-identification. See the literature on methods for provable privacy, notably differential privacy. A good introduction is Kobbi Nissim, Thomas Steinke, Alexandra Wood, Mark Bun, Marco Gaboardi, David R. O’Brien, and Salil Vadhan, Differential Privacy: A Primer for a Non-technical Audience, http://privacytools.seas.harvard.edu/files/privacytools/files/pedagogical-document-dp_0.pdf.
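As a rough illustration of what differential privacy can look like in a published audit table, the sketch below adds Laplace noise to aggregate counts before release. It is a minimal example under stated assumptions, not a recommended implementation: the category names, the counts, and the privacy parameter `epsilon` are hypothetical, and the function names `laplace_noise` and `dp_counts` are invented here. It also assumes each person appears in exactly one category, so that adding or removing one person changes a single count by at most 1.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    # The difference of two independent exponential draws with the same rate
    # follows a Laplace distribution centered at zero.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_counts(true_counts, epsilon, rng=None):
    """Return noisy counts using the Laplace mechanism.

    Assumption: each person contributes to exactly one category, so the
    sensitivity of the whole table is 1 and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = 1.0 / epsilon
    return {key: value + laplace_noise(scale, rng)
            for key, value in true_counts.items()}


# Hypothetical audit table: these figures are made up for illustration only.
counts = {"released": 412, "detained": 138}
noisy = dp_counts(counts, epsilon=0.5, rng=random.Random(0))
for category, value in noisy.items():
    print(category, round(value, 1))
```

A smaller `epsilon` gives stronger privacy but noisier published figures, so the choice is a policy decision as much as a technical one. Simple floating-point samplers like this one have known weaknesses, so a real audit would more plausibly rely on a vetted differential privacy library and on the statistical expertise the report calls for.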
A current challenge to implementing these audits is the lack of data needed to assess the consequences of tools already deployed. When some partners of PAI tried to assess the consequences of California’s pretrial risk assessment legislation, they found inadequate data on the pretrial detention population in California. They also could not identify data or studies showing how the definitions of low, medium, and high risk, and the thresholds between them, would affect how many people are held or released pretrial. Similarly, evaluating or correcting tools and training data for error and bias requires better data on discrimination at various points in the criminal justice system. To understand the impact of current risk assessment tools, whether in pretrial, sentencing, or probation, court systems should collect data on pretrial decisions and outcomes. In addition, data on individual judges’ decisions before and after an intervention should be collected and analyzed.
To meet these responsibilities, whenever legislatures or judicial bodies decide to mandate or purchase risk assessment tools, those authorities should simultaneously ensure the collection of post-deployment data, provide the resources to do so adequately, and support open analysis and review of the tools in deployment. That requires both (i) allocation or appropriation of sufficient funding for those needs and (ii) institutional commitment to recruiting (or contracting with) statistical, technical, and criminological expertise to ensure that data collection and review are conducted appropriately.