Monthly Archives: May 2016

White House report on big data and discrimination

The White House has recently published another report on the social and ethical impacts of big data, entitled ‘Big Data: A Report on Algorithmic Systems, Opportunity and Civil Rights’. This could be considered the third in a trilogy, following 2012’s ‘Consumer Data Privacy in a Networked World’ and 2014’s ‘Big Data: Seizing Opportunities, Preserving Values’.

Each report has been a welcome contribution to an evolving debate about data and society, with impacts around the world as well as in the US. They also reflect, I think, the progress that’s been made in understanding this complex and fast-moving policy area.

The 2012 report was largely an affirmation of commitment to the decades-old fair information practice principles, while noting the challenges posed by new technology. The 2014 report addressed the possibility that big data might lead to forms of unintended discrimination, but didn’t demonstrate any advanced understanding of the potential mechanisms behind such effects. In a paper written shortly after, Solon Barocas and Andrew Selbst commented that ‘because the origin of the discriminatory effects remains unexplored, the 2014 report’s approach does not address the full scope of the problem’.

The latest report does begin to dig more deeply into the heart of big data’s discrimination problem. It describes a number of policy areas – including credit, employment, higher education and criminal justice – in which there is a ‘problem’ to which a ‘big data opportunity’ might be a solution, along with a civil rights ‘challenge’ which must be overcome.

This framing is not without its problems. One might reasonably suspect that the problems in these policy areas are themselves at least partly the result of government mismanagement or market failure, and that advocating a big data ‘solution’ would merely be a sticking plaster.

In any case, the report does well to note some of the perils and promise of big data in these areas. It acknowledges some of the complex processes by which big data may have disparate impacts – thus filling the gap in understanding identified by Barocas and Selbst in their 2014 paper. It also alludes to ways in which big data could help us detect discrimination and thus help prevent it (something I have written about recently). It advocates what it calls ‘equal opportunity by design’ approaches to algorithmic hiring. Towards the end of the report, it refers to ‘promising avenues for research and development that could address fairness and discrimination in algorithmic systems, such as those that would enable the design of machine learning systems that constrain disparate impact or construction of algorithms that incorporate fairness properties into their design and execution’. This may be a reference to nascent interdisciplinary research on computational fairness, transparency and accountability (see e.g. the FAT-ML workshop).
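
To make ‘disparate impact’ a little more concrete, here is a minimal sketch in Python of one simple way it is sometimes quantified: the ratio of favourable-outcome rates between two groups. The function names, the group labels and the toy data are all hypothetical, and the 0.8 threshold borrows the ‘four-fifths’ rule of thumb from US employment guidelines; none of this is taken from the report itself, and real fairness audits involve far more than a single ratio.

```python
from collections import defaultdict


def selection_rates(records):
    """Return the share of favourable outcomes for each group.

    `records` is an iterable of (group, selected) pairs, where `selected`
    is True when the person received the favourable outcome (e.g. was hired).
    """
    totals = defaultdict(int)
    favourable = defaultdict(int)
    for group, selected in records:
        totals[group] += 1
        if selected:
            favourable[group] += 1
    return {group: favourable[group] / totals[group] for group in totals}


def disparate_impact_ratio(records, protected, reference):
    """Ratio of the protected group's selection rate to the reference group's."""
    rates = selection_rates(records)
    if rates[reference] == 0:
        raise ValueError("reference group has no favourable outcomes")
    return rates[protected] / rates[reference]


# Hypothetical toy data: group "a" is selected 6 times out of 10,
# group "b" 3 times out of 10.
records = (
    [("a", True)] * 6 + [("a", False)] * 4
    + [("b", True)] * 3 + [("b", False)] * 7
)

ratio = disparate_impact_ratio(records, protected="b", reference="a")

# Assumption: 0.8 is the 'four-fifths' rule of thumb, used here only as an
# illustrative threshold, not as a legal test.
flagged = ratio < 0.8
print(f"ratio={ratio:.2f}, flagged={flagged}")
```

A ratio well below 1 only signals that outcomes differ between groups; it says nothing about why, which is precisely the question of mechanism that Barocas and Selbst raise.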

While I’d like to see more recognition of this research, both among the wider academic community and in policy discussions, I hope that its inclusion in the White House report signals a positive direction in the big data debate over the coming years.