Shedding Light on the Trade-offs of Using Demographic Data for Algorithmic Fairness

Sarah Villeneuve, McKane Andrus, Hudson Hongo

December 8, 2021

As an increasing number of organizations deploy algorithmic systems to help them make decisions, alarming new examples of algorithmic discrimination continue to emerge. From racial discrimination in predictive policing algorithms to sexist AI hiring tools, harmful biases have been found in a variety of algorithmic decision-making systems, including in critical contexts such as healthcare, criminal justice, and education.

While a number of strategies for mitigating these biases have been proposed, most algorithmic fairness techniques share one common requirement: access to demographic data related to sensitive attributes such as race, gender, sexuality, or nationality. Too frequently overlooked is that collecting and using such data carries its own risks, even if done in support of fairness.

The Partnership on AI’s (PAI) new white paper, “Fairer Algorithmic Decision Making and Its Consequences,” seeks to shed light on the risks and benefits associated with using demographic data in algorithmic systems. By offering a deeper understanding of these trade-offs, this paper challenges the assumption that collecting more data in support of fairness is always the answer, detailing how this can actually exacerbate or introduce harm for marginalized individuals and groups.

In this paper, PAI highlights the following key risks associated with not collecting and using demographic data:

obscuring the discriminatory impact of algorithmic systems,
preventing algorithmic systems from addressing historical inequities, and
rendering certain groups invisible to important institutions.

We also highlight the following risks associated with collecting and using demographic data:

violating personal privacy,
misrepresenting individuals,
using sensitive data beyond what was consented to,
increasing surveillance of disenfranchised groups,
reinforcing oppressive or overly prescriptive categories, and
giving private entities control over what constitutes bias.

Efforts to detect and mitigate harms from algorithmic systems must account for the wider contexts and power structures in which those systems, and the data they draw on, are embedded. By cataloguing the affordances and limitations associated with the use of demographic data, we hope this paper will advance discussions of technical solutions to algorithmic bias, pushing them to incorporate questions about governance and impact. To read the white paper in full, click here.

The insights collected in this paper were drawn from workshops and conversations with experts in industry, academia, government, and advocacy organizations as well as academic literature across relevant domains. We are grateful to those who engaged with us and this work over the last year.

To learn more about our Demographic Data Workstream, please visit our website, where you can fill out this form to become more involved with the demographic data community at PAI. You can also reach out to us about this white paper directly by emailing sarah.v@partnershiponai.org and mckane@partnershiponai.org.
