Category Archives: research

Predictions, predictions

The Crystal Ball by John William Waterhouse

It is prediction time. Around about this time of year, pundits love to draw on their specialist expertise to predict significant events for the year ahead. The honest ones also revisit their previous yearly predictions to see how they did.

Philip Tetlock is an expert on expert judgement. He conducted a series of experiments between 1984 and 2003 to uncover whether experts were better than average at predicting outcomes in their given specialism. Predictions had to be specific and quantifiable, in areas ranging from economics to politics and international relations. He found that many so-called experts were pretty bad at telling the future, and some were even worse than the man or woman on the street. Most worryingly, there was an inverse relationship between an expert’s fame and the accuracy of their predictions.

But he also discovered certain factors that make some experts better predictors than others. He found that training in probabilistic reasoning, avoidance of common cognitive biases, and evaluating previous guesses enabled experts to make more accurate forecasts. Predictions based on the combined judgements of multiple individuals also proved helpful.

This work has continued in the Good Judgement Project, a research project involving several thousand volunteer forecasters, which now has a commercial spin-off, Good Judgement Inc. The project has experimented with various training methods, ways of forming forecasters into complementary teams, and aggregation algorithms. By rigorously and systematically testing these and other factors, the project aims to uncover the determinants of accurate forecasts. It has already proven highly successful, winning an intelligence-community forecasting competition several years in a row (IARPA’s ‘Aggregative Contingent Estimation’ (ACE) programme, run by the US Intelligence Advanced Research Projects Activity).

The commercial spin-off began as an invite-only scheme, but it now has a public arm, ‘Good Judgement Open’, which allows anyone to sign up and have a go. I’ve just signed up and made my first prediction in response to the following question:

“Before the end of 2016, will a North American country, the EU, or an EU member state impose sanctions on another country in response to a cyber attack or cyber espionage?”

You can view the question and the current forecast from users of the site. Users compete to be the most prescient in their chosen areas of expertise.

It’s an interesting concept. I expect it will also prove to be a pretty shrewd way of harvesting intelligence and recruiting superforecasters for Good Judgement Inc. In this sense the business model is like Facebook’s or Google’s, i.e. monetising user data, although not in order to sell targeted advertising.

I can think of a number of ways the site could be improved further. I’d like a tool which helps me break down my prediction into its necessary and jointly sufficient elements, and lets me place a probability estimate on each. For instance, let’s say the question about international sanctions in response to a cyberattack depends on several factors: the likelihood of severe attacks, the ability of digital forensics to determine the origin of an attack, and the likelihood of sanctions as a response. I have more certainty about some of these factors than others, so a system which split the question into these parts, and let me periodically revise each estimate, would be helpful (in Bayesian terms, I’d like to be explicit about my priors and posteriors).
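
As a rough illustration of what I mean, here is a minimal sketch in Python, with made-up numbers and hypothetical factor names: each component gets its own probability estimate, and the headline forecast is their product, on the simplifying assumption that the components are independent and jointly determine the outcome.

```python
# Toy sketch: decompose a forecast into component estimates (made-up numbers).
# Assumes the components are roughly independent and all need to hold,
# so the headline probability is simply their product.

components = {
    "severe_cyber_attack_occurs": 0.80,    # prior that a serious attack happens this year
    "attack_attributed_to_a_state": 0.50,  # forensics pins down the origin convincingly
    "sanctions_chosen_as_response": 0.30,  # governments opt for sanctions over other measures
}

def combined_estimate(components):
    """Multiply component probabilities to get an overall forecast."""
    p = 1.0
    for prob in components.values():
        p *= prob
    return p

print(f"Headline forecast: {combined_estimate(components):.2f}")

# Revising a single component later (a new posterior) updates the whole forecast:
components["attack_attributed_to_a_state"] = 0.65
print(f"Revised forecast:  {combined_estimate(components):.2f}")
```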

One could also experiment with keeping predictions secret until after the outcome of some event. This would mean one forecaster’s prediction couldn’t contaminate the predictions of others (a particular risk if the forecaster is known to be reliable). It would also allow forecasters to say ‘I knew it!’ without saying ‘I told you so’ (we could call it the ‘Reverse Cassandra’). Of course, you’d need some way to prove that you didn’t just write the prediction after the event and back-date it, or create a prediction for every possible outcome and then selectively reveal the correct one, a classic con illustrated by TV magician Derren Brown. If you wanted to get really fancy, you could do that kind of thing with cryptography and the blockchain.
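
Incidentally, the cryptographic part doesn’t need anything as heavyweight as a blockchain to sketch out. A simple hash commitment, shown below in Python with a hypothetical prediction string, lets a forecaster publish a fingerprint of their prediction now and reveal the text only after the event:

```python
import hashlib
import secrets

# Toy commitment scheme (a sketch, not production cryptography).
# Commit: publish the hash of (nonce + prediction) before the event.
# Reveal: publish the prediction and nonce afterwards; anyone can re-hash and check.

def commit(prediction):
    nonce = secrets.token_hex(16)  # stops others guessing the prediction from its hash
    digest = hashlib.sha256((nonce + prediction).encode()).hexdigest()
    return digest, nonce

def verify(digest, prediction, nonce):
    return hashlib.sha256((nonce + prediction).encode()).hexdigest() == digest

digest, nonce = commit("Sanctions will be imposed before the end of 2016.")
print("Published now:", digest)

# After the event, reveal the prediction and nonce so anyone can verify:
print("Checks out:", verify(digest, "Sanctions will be imposed before the end of 2016.", nonce))
```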

After looking into this a bit more, I came across this blog by someone called gwern, who appears to be incredibly knowledgeable about prediction markets (and much else besides).

Galileo and Big Data: The Philosophy of Data Science

The heliocentric view of the solar system was initially rejected… for good reasons.

I recently received a targeted email from a recruitment agency, advertising relevant job openings. Many of them were for ‘Data Science’ roles. At recruitment agencies. There’s a delicious circularity here; somewhere, a data scientist is employed by a recruitment agency to use data science to target recruitment opportunities to potential data scientists for data science roles at recruitment agencies… to recruit more data scientists.

For a job title that barely existed a few years ago, Data Science is clearly a growing role. There are many ways to describe a data scientist, but ‘experts who use analytical techniques from statistics, computing and other disciplines to create value from new (‘big’) data’ provides a good summary (from Nesta).

For me, one of the most interesting and novel aspects of these new data-intensive techniques is the extent to which they seem to change the scientific method. In fact, the underpinnings of this new approach can be helpfully explained in terms of an old debate in the philosophy of science.

Science is typically seen as a process of theory-driven hypothesis testing. For instance, Galileo’s heliocentric theory made predictions about the movements of the stars; he tested these predictions with his telescope; and his telescopic observations appeared to confirm his predictions and, at least in his eyes, to prove his theory (if Galileo were a pedantic modern scientist, he might instead have described it more cautiously as ‘rejecting the null hypothesis’). As we know, Galileo’s theory didn’t gain widespread acceptance for many years (indeed, it still isn’t universally accepted today).

We could explain the establishment’s dismissal of Galileo’s findings as a result of religious dogma. But as Paul Feyerabend argued in his seminal book Against Method (1975), Galileo’s case was actually quite weak (despite being true). His theory appeared to directly contradict many ordinary observations. For instance, rather than birds getting thrown off into the sky, as one might expect if the earth were moving, they can fly a stable course. It was only later that such phenomena could be accommodated by the heliocentric view.

So Galileo’s detractors were, at the time, quite right to reject his theory. Many of their objections came from doubts about the veracity of the data gathered from Galileo’s unfamiliar telescopic equipment. Even if a theory makes a prediction that is borne out by an observation, we might still rationally reject it if it contradicts other observations, if there are competing theories with more explanatory power, and if – as with Galileo’s telescope – there are doubts about the equipment used to derive the observation.

Fast forward three centuries, to 1906, when the French physicist Pierre Duhem published La Théorie Physique. Son Objet, sa Structure (The Aim and Structure of Physical Theory). Duhem argued that physicists cannot test a single hypothesis in isolation, because whatever the result, one could always reject the ‘auxiliary’ hypotheses that support the observation: hypotheses about the background conditions, about the efficacy of the measurement equipment, and so on. No experiment can conclusively confirm or refute a hypothesis, because there will always be background assumptions that are open to question.

This idea became known as the Duhem-Quine thesis, after the American philosopher and logician W. V. O. Quine argued that this problem applied not only to physics, but to all of science and even to the truths of mathematics and logic.

The Duhem-Quine thesis is echoed in the work of leading proponents of data-driven science, who argue that a paradigm shift is taking place. Rather than coming up with a hypothesis that predicts, say, a linear relationship between one set of variables and another (‘output’) variable, data-driven science can explore every possible relationship (including highly complex, non-linear functions) between any set of variables. This shift has been described as a move from data models to algorithmic models, from ‘model-based’ to ‘model-free’ science, and from parametric to non-parametric modelling (see, for instance, Stuart Russell & Peter Norvig’s 2009 Artificial Intelligence, chapter 18).

While this paradigm shift may be less radical than the Duhem-Quine thesis (it certainly doesn’t cast doubt on the foundations of mathematical truths, as Quine’s holism appears to do), it does embrace one of its key messages: you cannot test a single hypothesis in isolation with a single observation, since there are always auxiliary hypotheses involved (other potential causal factors which might be relevant to the output variable). Modern data science techniques attempt to include as many potential causal factors as possible, and to automatically test an extensive range of complex, non-linear relationships between them and the output variables.
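
To make the contrast concrete, here is a minimal sketch in Python using scikit-learn on synthetic data of my own invention (not an example from the works cited): a parametric model committed to a linear form in advance, versus a non-parametric method that searches a much larger space of possible relationships.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data: the true relationship is non-linear and involves an interaction.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 2))
y = np.sin(X[:, 0]) * X[:, 1] + rng.normal(scale=0.1, size=500)

# Parametric / 'model-based': we commit to a linear functional form in advance.
linear = LinearRegression().fit(X, y)

# Non-parametric / 'model-free': the algorithm searches a large space of
# possible non-linear relationships between the inputs and the output.
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Scored on the training data, purely to illustrate the difference in flexibility.
print("Linear model R^2: ", round(linear.score(X, y), 2))
print("Random forest R^2:", round(forest.score(X, y), 2))
```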

Contemporary philosophers of science are beginning to draw these kinds of connections between their discipline and this paradigm shift in data science. For instance, Wolfgang Pietsch’s work on theory-ladenness in data-intensive science compares the methodology behind common data science techniques (namely, classificatory trees and non-parametric regression) to theories of eliminative induction originating in the work of John Stuart Mill.

There are potential dangers in pursuing theory-free and model-free analysis. We may end up without comprehensible explanations for the answers our machines give us. This may be scientifically problematic, because we often want science to explain a phenomenon rather than just predict it. It may also be problematic in an ethical sense, because machine learning is increasingly used in ways that affect people’s lives, from predicting criminal behaviour to targeting job offers. If data scientists can’t explain the underlying reasons why their models make certain predictions, why they target certain people rather than others, we cannot evaluate the fairness of decisions based on those models.

The Who, What and Why of Personal Data (new paper in JOAL)

Some of my PhD research has been published in the current issue of the Journal of Open Access to Law, from Cornell University.

Abstract:
“Data protection laws require organisations to be transparent about how they use personal data. This article explores the potential of machine-readable privacy notices to address this transparency challenge. We analyse a large source of open data comprised of semi-structured privacy notifications from hundreds of thousands of organisations in the UK, to investigate the reasons for data collection, the types of personal data collected and from whom, and the types of recipients who have access to the data. We analyse three specific sectors in detail; health, finance, and data brokerage. Finally, we draw recommendations for possible future applications of open data to privacy policies and transparency notices.”

It’s available under open access at the JOAL website (or download the PDF).

A Study of International Personal Data Transfers

Whilst researching open registers of data controllers, I was left with some interesting data on international data transfers which didn’t make it into my main research paper. This formed the basis of a short paper for the 2014 Web Science conference which took place last month.

The paper presents a brief analysis of the destinations of 16,000 personal data transfers from the UK. Each ‘transfer’ represents an arrangement by a data controller in the UK to send personal data to a country overseas. Many of these destinations are listed only under the rather general categories of ‘European Economic Area’ or ‘Worldwide’, so the analysis focuses on just those transfers where specific countries were mentioned.

I found that even when we adjust for the size of their existing UK export market, countries whose data protection regimes are approved as ‘adequate’ by the European Commission had higher rates of data transfers. This indicates that easing legal restrictions on cross-border transfers does indeed correlate with a higher number of transfers (although the direction of causation can’t be established). I was asked by the organisers to produce a graphic to illustrate the findings, so I’m sharing that below.

[Figure: destinations of international personal data transfers from the UK]
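
As an aside on method, the adjustment itself is simple to sketch. Here is a toy version in Python/pandas with entirely made-up figures (not the data from the paper), just to show the shape of the calculation: transfer counts per destination divided by existing UK export volumes, then compared across adequacy status.

```python
import pandas as pd

# Made-up illustrative figures, not the data from the paper.
df = pd.DataFrame({
    "country":       ["Canada", "Argentina", "India", "Brazil"],
    "transfers":     [900, 150, 1200, 300],        # count of registered transfers
    "uk_exports_bn": [8.0, 1.2, 10.5, 3.1],        # existing UK export market size (£bn)
    "adequacy":      [True, True, False, False],   # EC 'adequacy' decision in place?
})

# Adjust transfer counts for the size of the existing trade relationship.
df["transfers_per_bn_exports"] = df["transfers"] / df["uk_exports_bn"]

# Compare adequacy-approved destinations with the rest.
print(df.groupby("adequacy")["transfers_per_bn_exports"].mean())
```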

Open Research in Practice: responding to peer review with GitHub

I wrote a tweet last week that got a bit of unexpected (but welcome) attention:

I got a number of retweets and replies in response, including:

The answers are: Yes, yes, yes and yes. I thought I’d respond in more detail and pen some short reflections on github, collaboration and open research.

The backstory: I had submitted a short paper (co-authored with my colleague David Matthews from the Maths department) to a conference workshop (WWW2014: Social Machines). While the paper was accepted (yay!), the reviewers had suggested a number of substantial revisions. Since the whole thing had to be written in LaTeX, with various associated style, bibliography and other files, and version control was important, we decided to create a GitHub repo for the revision process. I’d seen GitHub used for paper authoring before by another colleague and wanted to try it out.

Besides version control, we also decided to make use of other GitHub features, including ‘issues’. I took the long list of reviewer comments and filed them as a new issue. We then separated these out into themes, each of which was given its own sub-issue. From here, we could clearly see what needed changing, and discuss how we were going to do it. Finally, once we were satisfied that an issue had been dealt with, it could be closed.
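
We did this step by hand, but if you wanted to script it, the rough sketch below (Python, against the public GitHub REST API) shows how reviewer comments could be filed as issues automatically. The repository name, token variable and comments are all hypothetical, and a real run would need an access token with the right permissions.

```python
import os
import requests

# Hypothetical details: replace with your own repository and access token.
REPO = "example-user/paper-revisions"
TOKEN = os.environ["GITHUB_TOKEN"]

reviewer_comments = [
    ("Clarify contribution", "Reviewer 1: the paper's contribution needs to be stated more clearly."),
    ("Expand related work", "Reviewer 2: discuss prior work on social machines in more depth."),
]

for title, body in reviewer_comments:
    # Create one GitHub issue per reviewer comment.
    resp = requests.post(
        f"https://api.github.com/repos/{REPO}/issues",
        headers={"Authorization": f"token {TOKEN}"},
        json={"title": title, "body": body},
    )
    resp.raise_for_status()
    print("Filed issue:", resp.json()["html_url"])
```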

At first I considered making the repository private – I was a little nervous about putting all the reviewer comments up on a public repo, especially as some were fairly critical of our first draft (although, in hindsight, entirely fair). In the end, we opted for the open approach – that way, if anyone is interested they can see the process we went through. While I doubt anyone will be interested in such a level of detail for this paper, opening up the paper revision process as a matter of principle is probably a good idea.

With the movement to open up the ‘grey literature’ of science – preliminary data, unfinished papers, failed experiments – it seems logical to extend this to the post-peer-review revision process. For very popular and / or controversial papers, it would be interesting to see how authors have dealt with reviewer comments. It could help provide context for subsequent debates and responses, as well as demystify what can be a strange and scary process for early-career researchers like myself.

I’m sure there are plenty of people more steeped in the ways of open science who’ve given this a lot more thought. New services like FigShare, and open access publishers like PLoS and PeerJ, are experimenting with opening up the whole process of academic publishing. There are also dedicated paper-authoring tools that build on git-like functionality – next time, I’d like to try one of the collaborative web-based LaTeX editors like ShareLaTeX or WriteLaTeX. Either way, I’d recommend adopting git, or something git-like, for co-authoring papers and for the post-peer-review revision process. The future of open peer review looks bright, and integrating it with an open, collaborative revision process is a no-brainer.

Next on my reading list for this topic is Kathleen Fitzpatrick’s book on the future of academic publishing, Planned Obsolescence.
UPDATE: Chad Kohalyk just alerted me to a relevant new feature rolled out by GitHub yesterday: a better way to track diffs in rendered prose. Thanks!