From racial discrimination in predictive policing algorithms to sexist AI hiring tools, algorithmic bias — the ways in which algorithms might produce disparate outcomes for different demographic groups — has become a growing social concern. This, in turn, has led to an increased interest in the collection of demographic data, information required by most current algorithmic fairness techniques designed to measure and mitigate bias. However, collecting data about sensitive attributes carries its own risks, even if done in support of fairness. If the goal of demographic data collection is to reduce harms for marginalized individuals and groups, then such risks must be considered fully.
To help provide greater clarity around the use of demographic data to mitigate algorithmic bias, the Partnership on AI’s (PAI) Demographic Data Workstream seeks to identify data collection practices and governance frameworks that ensure this usage is in the public interest. To advance this work, PAI hosted two events this year exploring the challenges involved with collecting and using demographic data in service of algorithmic fairness techniques. The first, a workshop held at the ACM FAccT machine learning conference in March, asked participants to think critically about demographic data use and the infrastructures it requires. The second, a Community Lab session at the RightsCon summit in June, had participants discuss specific trade-offs involved in using or not using demographic data.
Risks of Use and Non-Use
As previously stated, most currently proposed algorithmic fairness techniques require access to demographic data in order to make performance comparisons or standardizations across groups. In practice, demographic data is often not available, leading many organizations to consider new data collection efforts. Collecting more data in support of “fairness,” however, is not always the answer, and can, in fact, exacerbate or introduce harm for marginalized individuals and groups.
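To make the dependence on demographic data concrete, here is a minimal sketch of one kind of group comparison: the gap in positive-decision rates between groups. The function names, decisions, and group labels are hypothetical and chosen only for illustration; they are not taken from PAI's work or from any particular fairness library.

```python
from collections import defaultdict


def selection_rates(decisions, groups):
    """Return the fraction of positive decisions for each group label."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for decision, group in zip(decisions, groups):
        totals[group] += 1
        positives[group] += int(decision)
    return {group: positives[group] / totals[group] for group in totals}


def selection_rate_gap(decisions, groups):
    """Difference between the highest and lowest per-group selection rate."""
    rates = selection_rates(decisions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: 1 = favorable decision, 0 = unfavorable decision.
decisions = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = selection_rates(decisions, groups)
gap = selection_rate_gap(decisions, groups)
print(rates, gap)
```

Without the `groups` list, neither function can be computed, which is the practical reason so many organizations consider new data collection efforts in the first place.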
A number of risks are connected with the use of demographic data. These risks of use include encroachments on privacy, increased monitoring and surveillance of vulnerable groups, explicit discrimination due to knowledge of sensitive categories, the reinforcement of harmful stereotypes, misrepresentation due to narrow categories, and using data in a way that supports dominant narratives to the detriment of data subjects.
Conversely, there are a number of risks associated with the non-use of demographic data, and these are often cited as further justification for new data collection efforts to support algorithmic fairness. Risks of non-use include difficulty detecting discrimination without this data, entrenchment of “colorblind” decision making that ignores certain aspects of an individual’s identity, reinforcement or exacerbation of disparities experienced by marginalized groups, and the inability to hold system owners accountable for discriminatory harms.
Organizations that wish to use demographic data must also grapple with further concerns identified in PAI’s research paper “What We Can’t Measure We Can’t Understand.” These include both organizational constraints (specifically, privacy and anti-discrimination policies that hinder the collection of sensitive attribute data such as race and gender) and practitioner concerns (such as the unreliability of self-reported demographic data and the inaccuracy of inferences).
The choice between using or not using demographic data in algorithmic decision-making will always carry a risk. It is therefore important to confront the risk trade-offs associated with the decision to use, or not use, demographic data.
Demographic Data Use in Practice
At ACM FAccT 2021, PAI researchers organized a workshop titled “Contesting and Rethinking Demographic Data Infrastructures for Algorithmic Fairness.” The goal of this event was to have participants confront whether or not demographic data should be collected and the implications of continuing to design fairness interventions that presuppose demographic data availability.
Participants heard from a series of speakers, each of whom provided a different perspective through which to examine both demographic data collection practices and the requirements and goals of current demographic-based algorithmic fairness techniques. These speakers were McKane Andrus (Partnership on AI), Nick Couldry (LSE) and Ulises A. Mejias (SUNY Oswego), Nithya Sambasivan (Google Research), Emily Denton (Google Research), and danah boyd (Microsoft Research, Data & Society). Additionally, participants engaged in two breakout activities.
The first breakout activity sought to illuminate the core challenges participants face in collecting and using demographic data in their work. Participants’ experiences with demographic data largely revolved around three core challenges. The first is the matter of data subject agency. Demographic data is often more sensitive than other types of data, leading to more stringent standards around data privacy, security, and governance, whether these are imposed by external bodies or by the participant’s own institution. As such, participants recalled often being unsure what constituted appropriate justification for demographic data collection and what types of commitments should be made to data subjects in order to collect this kind of data. The second challenge that many participants recalled was the question of demographic data quality. Practically speaking, the demographic data available to participants was often sparse and unreliable, if not missing entirely. The third challenge concerned categorization processes and norms, specifically uncertainty around where to draw the lines between groups or what classification schema to follow (e.g., whether practitioners should use legally protected categories as a framework).
The second breakout activity introduced participants to a fictional data governance structure called a “data coalition” and prompted participants to think of the specific types of supporting infrastructure required to sustain systems that rely on sensitive attribute data. After being provided with six categories of supporting infrastructure (social, technical, physical, economic, legal, and values) as well as the option to add their own category, participants were asked to elaborate on the practices, norms, values, or forms of social and technical organization that are required for this fictional data coalition to exist in the world. The goal of this exercise was to have participants think critically about the infrastructures required to enable the collection of demographic data, and thereby the practice of algorithmic fairness.
Participants pointed to a wide range of supporting infrastructures that went well beyond our expectations. Two recurring themes stood out in their responses. The first was that mechanisms enabling the participatory collection and use of demographic data will require clear and consistent support from governmental institutions. Currently, there are few legal and political requirements to provide data subjects with meaningful agency over their data, meaning that if this were to change, we would need a concomitant shift in the amount and kind of enforcement. Building on this, another salient theme was participants’ interrogation of the values underlying “democratic” or “participatory” data infrastructures. Some participants likened the expansion of individual agency over sensitive data to an embrace of some of the core tenets of critical practices like Indigenous Data Sovereignty. Others, however, felt that arrangements allowing individuals to effectively “vote” with their data would simply reinforce the individualistic values already at the core of our data-driven institutions, and that a deeper shift to communalistic thinking around the potentials of data would be necessary to achieve something like “participatory data governance.”
On June 9th, the Partnership on AI hosted a Community Lab session at RightsCon aimed at exploring the various trade-offs involved in collecting and using demographic data for the purpose of mitigating bias in algorithmic decision-making systems. After a brief presentation from PAI’s McKane Andrus and Sarah Villeneuve, followed by a Q&A, participants were put into groups. Each group discussed a pair of risks associated with demographic data, one of use and one of non-use, and explored the trade-offs between addressing one over the other.
One trade-off participants explored was the tension between group invisibility and privacy. When building algorithmic systems, the non-use of demographic data may result in marginalized groups being unseen. On the other hand, the collection and use of demographic data is likely to go beyond what members of marginalized groups are comfortable with, given histories of discrimination and oppression based on their demographic attributes. One example referenced during group discussions was the child welfare system in Canada, in which Indigenous children represent the majority of those in foster care and their families face increased surveillance. While the child welfare system engages in identity-based data collection, likely in an effort to improve outcomes for specific populations, this collection has had the effect of encroaching on privacy and has not given Indigenous communities agency over how they are seen and acted on by the state.
Another potential trade-off participants discussed was between the benefit of collecting demographic data to generate evidence of oppression and the risk of that data reinforcing oppressive categories. This tension is an important one to explore in relation to algorithmic decision-making because having quantitative evidence about structural forms of oppression is often treated as a prerequisite to making change. When the demographic categories in use are misinformed or outdated, however, categorization can mischaracterize individuals’ experiences and subject them to further discrimination. One example participants discussed was the healthcare context, where racial data is seen as necessary for accurate diagnoses, so racial categorization is quite common. The issue that arises is that differences in health outcomes are frequently attributed to racial differences without much consideration for how systemic racism impacts health or affects the practice of medicine itself. In this way, though racial data in healthcare does provide evidence for various forms of discrimination, the racial categories in use can end up reinforcing discriminatory practices.
While participants recognized that risks will differ in their degree depending on the specific context of use, there was broad agreement that the use of demographic data presents more manageable risks compared to the risks that can emerge from non-use across all trade-off pairs. That being said, many participants emphasized that much of the risk in demographic data collection stems from misalignment between the data collectors and data subjects. As such, multiple participants noted that in order to use demographic data and mitigate risks related to privacy, surveillance, and the reinforcement of oppressive categories, data should likely be collected and managed by an intermediary third party — possibly an independent, non-profit organization. Some participants also proposed that organizations should be more transparent about the use and storage of demographic data during the initial collection process. As one participant noted, however, when demographic data is not used, there is still the potential for algorithmic decision-making systems to identify protected categories (such as race and gender) through proxy variables, so it is critical that we find ways to mitigate immediate potential harms while also developing better procedures for the long term.
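The proxy-variable concern raised by that participant can be illustrated with a small sketch. It estimates how well a single non-demographic feature predicts group membership by guessing the most common group for each feature value, then compares that with always guessing the overall most common group. The feature (a coarse region code), the data, and the function names are all hypothetical, and this is only one simple way to probe for proxies, not a method described in the sessions.

```python
from collections import Counter, defaultdict


def proxy_accuracy(feature_values, group_labels):
    """Accuracy of guessing each record's group from its feature value alone,
    using the most common group observed for that feature value."""
    counts_by_value = defaultdict(Counter)
    for value, group in zip(feature_values, group_labels):
        counts_by_value[value][group] += 1
    correct = sum(c.most_common(1)[0][1] for c in counts_by_value.values())
    return correct / len(group_labels)


def baseline_accuracy(group_labels):
    """Accuracy of always guessing the overall most common group."""
    return Counter(group_labels).most_common(1)[0][1] / len(group_labels)


# Hypothetical audit sample: a region code and a group label per record.
regions = ["north", "north", "north", "south",
           "south", "south", "north", "south"]
groups = ["a", "a", "a", "b", "b", "b", "b", "a"]

with_feature = proxy_accuracy(regions, groups)
without_feature = baseline_accuracy(groups)
print(with_feature, without_feature)
```

In this toy sample, the region code raises the guess accuracy from 0.5 to 0.75, so a system given region could partly recover group membership even if the group label is never supplied. Note that running such a check itself requires group labels for at least an audit sample.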
As more organizations look to employ algorithmic fairness techniques, we must carefully consider the trade-offs associated with the use of demographic data and ensure that fairness strategies are properly aligned with the data subjects’ interests and values. Simply collecting more data cannot support fairness if it exacerbates or introduces harms.
In late 2021, PAI will publish an introductory white paper cataloguing the challenges and opportunities associated with the collection and use of demographic data for algorithmic fairness. We have established an Algorithmic Decision Making and Demographic Data Community mailing list, open to the public, which gives members the opportunity to provide direct feedback on drafts of our white paper and subsequent research outputs. If you would like to join our mailing list, you can do so here.