
Predictions, predictions

The Crystal Ball by John William Waterhouse

It is prediction time. Around about this time of year, pundits love to draw on their specialist expertise to predict significant events for the year ahead. The honest ones also revisit their previous yearly predictions to see how they did.

Philip Tetlock is an expert on expert judgement. He conducted a series of experiments between 1984 and 2003 to uncover whether experts were better than average at predicting outcomes in their given specialism. Predictions had to be specific and quantifiable, in areas ranging from economics to politics and international relations. He found that many so-called experts were pretty bad at telling the future, and some were even worse than the man or woman on the street. Most worryingly, there was an inverse relationship between an expert’s fame and the accuracy of their predictions.

But he also discovered certain factors that make some experts better predictors than others. He found that training in probabilistic reasoning, avoidance of common cognitive biases, and evaluating previous guesses enabled experts to make more accurate forecasts. Predictions based on the combined judgements of multiple individuals also proved helpful.
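To make ‘evaluating previous guesses’ and ‘combined judgements’ concrete, here is a minimal sketch. It assumes forecasts are scored with the Brier score (the squared error between a probability and a 0/1 outcome) and combined by a plain average. The numbers are made up for illustration, and the aggregation methods used in the actual research may well be more sophisticated.

```python
from statistics import mean


def brier_score(forecast: float, outcome: int) -> float:
    """Squared error between a probability forecast and a 0/1 outcome.

    0.0 is a perfect forecast, 1.0 is the worst possible; lower is better.
    """
    return (forecast - outcome) ** 2


def aggregate(forecasts: list[float]) -> float:
    """Combine several forecasters' probabilities with an unweighted mean."""
    return mean(forecasts)


# Hypothetical probabilities from three forecasters for one yes/no question.
individual = [0.9, 0.4, 0.6]
outcome = 1  # the event happened

combined = aggregate(individual)
individual_scores = [brier_score(p, outcome) for p in individual]
combined_score = brier_score(combined, outcome)

print("average individual score:", round(mean(individual_scores), 3))
print("score of combined forecast:", round(combined_score, 3))
```

Because squared error is convex, the averaged forecast never scores worse than the average of the individual scores, which is one simple reason pooling judgements tends to help.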

This work has continued in the Good Judgement Project, a research project involving several thousand volunteer forecasters, which now has a commercial spin-off, Good Judgement Inc. The project has experimented with various methods of training, forming forecasters into complementary teams, and aggregation algorithms. By rigorously and systematically testing these and other factors, the project aims to uncover the determinants of accurate forecasts. It has already proven highly successful, winning a CIA-funded forecasting competition (‘Intelligence Advanced Research Projects Activity – Aggregative Contingent Estimation’, or IARPA-ACE) several years in a row.

The commercial spin-off began as an invite-only scheme, but now it has a new part called ‘Good Judgement Open’ which allows anyone to sign up and have a go. I’ve just signed up and made my first prediction in response to the following question:

“Before the end of 2016, will a North American country, the EU, or an EU member state impose sanctions on another country in response to a cyber attack or cyber espionage?”

You can view the question and the current forecast from users of the site. Users compete to be the most prescient in their chosen areas of expertise.

It’s an interesting concept. I expect it will also prove to be a pretty shrewd way to harvest intelligence and recruit superforecasters for Good Judgement Inc. In this sense the business model is like that of Facebook and Google, i.e. monetising user data, although not in order to sell targeted advertising.

I can think of a number of ways the site could be improved further. I’d like to be given a tool which helps me break down my prediction into its individually necessary and jointly sufficient elements and allows me to place a probability estimate on each. For instance, let’s say the question about international sanctions in response to a cyberattack depends on several factors: the likelihood of severe attacks, the ability of digital forensics to determine the origin of the attack, and the likelihood of sanctions as a response. I have more certainty about some of these factors than others, so a system which split the question into parts, and let me periodically revise each estimate, would be helpful (in Bayesian terms, I’d like to be explicit about my priors and posteriors).
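The kind of tool described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the factor names and probabilities are invented, and each estimate is treated as conditional on the factors before it, so that the chain rule lets us multiply them to get the overall forecast.

```python
from math import prod

# Hypothetical estimates; each is conditional on the previous factors holding.
factors = {
    "severe_attack_occurs": 0.8,   # P(a severe attack happens this year)
    "attack_is_attributed": 0.5,   # P(forensics identify the origin | attack)
    "sanctions_follow": 0.3,       # P(sanctions imposed | attributed attack)
}


def overall_probability(estimates: dict[str, float]) -> float:
    """Chain rule: multiply the conditional probabilities of every factor."""
    return prod(estimates.values())


before = overall_probability(factors)

# New evidence arrives about one factor only, so only that estimate is revised.
factors["attack_is_attributed"] = 0.7
after = overall_probability(factors)

print(f"forecast before revision: {before:.3f}")
print(f"forecast after revision:  {after:.3f}")
```

Keeping the factors separate means a revision touches only the estimate the new evidence bears on, and the overall forecast follows from it.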

One could also experiment with keeping predictions secret until after the outcome of some event. This would mean one forecaster’s prediction wouldn’t contaminate the predictions of others (which is a particular risk if they are well known as a reliable forecaster). It would allow forecasters to say ‘I knew it!’ without saying ‘I told you so’ (we could call it the ‘Reverse Cassandra’). Of course you’d need some way to prove that you didn’t just write the prediction after the event and back-date it, or create a prediction for every possible outcome and then selectively reveal the correct one, a classic con illustrated by TV magician Derren Brown. If you wanted to get really fancy, you could do that kind of thing with cryptography and the blockchain.
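One way to do the cryptographic part is a hash-based commit-and-reveal scheme, sketched below. This is a minimal illustration, not a description of any existing site: the function names are made up, and it assumes the commitment is published somewhere that timestamps it reliably before the event.

```python
import hashlib
import hmac
import secrets


def commit(prediction: str) -> tuple[str, bytes]:
    """Return (commitment, nonce). Publish the commitment, keep the nonce secret.

    The random nonce stops others from guessing the prediction by hashing
    every plausible answer and comparing the results.
    """
    nonce = secrets.token_bytes(32)
    digest = hashlib.sha256(nonce + prediction.encode("utf-8")).hexdigest()
    return digest, nonce


def verify(commitment: str, prediction: str, nonce: bytes) -> bool:
    """After the event, check a revealed prediction against the commitment."""
    digest = hashlib.sha256(nonce + prediction.encode("utf-8")).hexdigest()
    return hmac.compare_digest(digest, commitment)


# Before the event: publish only the commitment.
commitment, nonce = commit("Yes, sanctions will be imposed before the end of 2016")

# After the event: reveal the prediction and the nonce so anyone can check.
print(verify(commitment, "Yes, sanctions will be imposed before the end of 2016", nonce))
print(verify(commitment, "No sanctions will be imposed", nonce))
```

The hash only shows that the prediction was fixed when the commitment was published. Guarding against the ‘one prediction per outcome’ trick needs something extra, such as tying commitments to a forecaster’s identity and requiring every one of them to be revealed.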

After looking into this a bit more, I came across this blog by someone called gwern, who appears to be incredibly knowledgeable about prediction markets (and much else besides).

How to improve how we prove; from paper-and-ink to digital verified attributes

‘Stamp of Approval’ by Sudhamshu Hebbar, CC-BY 2.0

Personal information management services (PIMS) are an emerging class of digital tools designed to help people manage and use data about themselves. At the core of this is information about your identity and credentials, without which you cannot prove who you are or that you have certain attributes. This is a boring but necessary part of accessing services, claiming benefits and compensation, and a whole range of other general ‘life admin’ tasks.

Currently the infrastructure for managing these processes is stuck somewhere in the Victorian era, dominated by rubber stamps, handwritten signatures and paper forms, dealt with through face-to-face interactions with administrators and shipped around through snail mail. A new wave of technology aims to radically simplify this infrastructure through digital identities, certificates and credentials. Examples include GOV.UK Verify, the UK government identity scheme, and services like MiiCard and Mydex which allow individuals to store and re-use digital proofs of identity and status. The potential savings from these new services are estimated at £3 billion in the UK alone (disclosure: I was part of the research team behind this report).

Yesterday I learned a powerful first-hand lesson about the current state of identity management, and the dire need for PIMS to replace it. It all started when I realised that a train ticket, which I’d bought in advance, would be invalid because my discount rail card expired before the date of travel. After discovering I could not simply pay off the excess to upgrade to a regular ticket, I realised my only option would be to renew the railcard.

That may sound simple, but it was not. To be eligible for the discount, I’d need to prove to the railcard operator that I’m currently a post-graduate student. They require a specific class of (very busy) University official to fill in, sign and stamp their paper form and verify a passport photo. There is a semi-online application system, but this still requires a University administrator to complete the paperwork and send a scanned copy, and then there’s an additional waiting time while a new railcard is sent by post from an office in Scotland.

So I’d need to make a face-to-face visit to one of the qualified University administrators with all the documents, and hope that they would be available and willing to deal with them. Like many post-graduate students, I live in a different city, so this involves a 190-minute, £38 round trip by train. When I arrived, the first administrator I asked to sign the documentation told me that I would have to leave it with their office for an unspecified number of days (days!) while they ‘check their system’ to verify that I am who I say I am.

I tried to communicate the absurdity of the situation: I had travelled 60 miles to get a University-branded pattern of ink stamped on a piece of paper, in order to verify my identity to the railcard company, but the University administrators couldn’t stamp said paper because they needed several days to check a database to verify that I exist and I am me – while I stand before them with my passport, driver’s license, proof of address and my student identity card.

Finally I was lucky enough to speak to another administrator whom I know personally, who was able to deal with the paperwork in a matter of seconds. In the end, the only identity system which worked was a face-to-face interaction predicated on interpersonal trust: a tried-and-tested protocol which pre-dates the scanned passport, the Kafka-esque rubber stamp, and the pen-pushing Victorian clerk.

Here’s how an effective digital identity system would have solved this problem. Upon enrolment, the university would issue me with a digital certificate, verifying my status as a postgraduate, which would be securely stored and regularly refreshed in my personal data store (PDS). When the time comes to renew my discount railcard, I would simply log in to my PDS and accept a connection from the railcard operator’s site. I pay the fee and they extend the validity of my existing railcard.

From the user experience perspective, that’s all there is to it – a few clicks and it’s done. In the background, there’s a bit more complexity. My PDS would receive a request from the railcard operator’s system for the relevant digital certificate (essentially a cryptographically signed token generated by the University’s system). After verifying the authenticity of the request, my PDS sends a copy of the certificate. The operator’s back-end system then checks the validity of the certificate against the public key of the issuer (in this case, the university). If it all checks out, the operator has assurance from the University that I am eligible for the discount. It should take a matter of seconds.
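The final verification step can be illustrated with a deliberately simplified sketch. It uses textbook RSA over two small, publicly known primes purely so that it runs with the standard library. That is insecure (the key is far too small and there is no padding), and a real system would use a vetted signature library. The certificate fields and function names are hypothetical.

```python
import hashlib

# --- University (issuer) key material: TOY VALUES, NOT SECURE ---
P = 2**127 - 1                      # a known Mersenne prime
Q = 2**89 - 1                       # another known Mersenne prime
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, kept by the university


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce it to an integer below the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Issuer side: sign the certificate with the private exponent D."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Operator side: check the signature using only the public key (N, E)."""
    return pow(signature, E, N) == digest_as_int(message)


# The university issues the certificate, which then lives in my PDS.
certificate = b"subject=student-123;status=postgraduate"  # hypothetical fields
signature = sign(certificate)

# The railcard operator receives both and checks them.
print(verify(certificate, signature))                                # genuine
print(verify(b"subject=student-123;status=professor", signature))    # altered
```

The operator never needs the university’s private key or a phone call to an administrator. The public key alone is enough to confirm that the certificate is exactly what the university signed.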

From a security perspective, it’s harder to fake a signature made out of cryptography than one made out of ink (ironically, it would probably have been less effort for me to forge the ink signature than to obtain it legitimately). Digital proofs can also be better for privacy, as they reveal the minimal amount of information about me that the railcard operator needs to determine my eligibility, and the data is only shared when I permit it.

Identity infrastructure is important for reasons beyond convenience and security – it’s also about equality and access. I’m lucky that I can afford to pay the costs when these boring parts of ‘life admin’ go wrong – paying for a full price ticket wouldn’t have put my bank balance in the red. But if you’re at the bottom of the economic ladder, you have much more to lose when you can’t access the discounted services, benefits and compensation you are entitled to. Reforming our outdated systems could therefore have a disproportionately positive impact for the least well-off.

Public Digital Infrastructure: Who Pays?

Glen Canyon Bridge & Dam, Page, Arizona, by flickr user Thaddeus Roan under CC-BY 2.0

Every day, we risk our personal security and privacy by relying on lines of code written by a bunch of under-funded non-profits and unpaid volunteers. These essential pieces of infrastructure go unnoticed and under-funded; that is, until they fail.

Take OpenSSL, one of the most common tools for encrypting internet traffic. It means that things like confidential messages and credit card details aren’t transferred as plain text. It probably saves you from identity fraud, theft, stalking, blackmail, and general inconvenience dozens of times a day. At the time when a critical security flaw (known as ‘Heartbleed’) was discovered in OpenSSL’s code last April, there was just one person paid to work full-time on the project – the rest of it being run largely by volunteers.

What about the Network Time Protocol? It keeps most of the world’s computer clocks synchronised so that everything is, you know, on time. NTP has been developed and maintained over the last 20 years by one university professor and a team of volunteers.

Then there is OpenSSH, which is used to securely log in to remote computers across a network – used every day by systems administrators to keep IT systems, servers, and websites working whilst keeping out intruders. That’s maintained by another under-funded team who recently started a fundraising drive because they could barely afford to keep the lights on in their office.

Projects like these are essential pieces of public digital infrastructure; they are the fire brigade of the internet, the ambulance service for our digital lives, the giant dam holding back a flood of digital sewage. But our daily dependence on them is largely invisible and unquantified, so it’s easy to ignore their importance. There is no equivalent to pictures of people being rescued from burning buildings. The image of a programmer auditing some code is not quite as visceral.

So these projects survive on small handouts, and occasionally on larger ones from big technology companies. Whilst it’s great that commercial players want to help secure the open source code they use in their products, this alone is not an ideal solution. Imagine if the ambulance service were funded by ad-hoc injections of cash from various private hospitals, who had no obligation to maintain their contributions. Or if firefighters only got new trucks and equipment when some automobile manufacturer thought it would be good PR.

There’s a good reason to make this kind of critical public infrastructure open-source. Proprietary code can only be audited behind closed doors, so that means everyone who relies on it has to trust the provider to discover its flaws, fix them, and be honest when they fail. Open source code, on the other hand, can be audited by anyone. The idea is that ‘many eyes make all bugs shallow’ – if everyone can go looking for them, bugs are much more likely to be found.

But just because anyone can, that doesn’t mean that someone will. It’s a little like the story of four people named Everybody, Somebody, Anybody, and Nobody:

There was an important job to be done and Everybody was sure that Somebody would do it. Anybody could have done it, but Nobody did it. Somebody got angry about that because it was Everybody’s job. Everybody thought that Anybody could do it, but Nobody realized that Everybody wouldn’t do it. It ended up that Everybody blamed Somebody when Nobody did what Anybody could have done.

Everybody would benefit if Somebody audited and improved OpenSSL/NTP/OpenSSH/etc, but Nobody has sufficient incentive to do so. Neither proprietary software nor the open source world is delivering the quality of critical public digital infrastructure we need.

One solution to this kind of market failure is to treat critical infrastructure as a public good, deserving of public funding. Public goods are traditionally defined as ‘non-rival’, meaning that one person’s use of the good does not reduce its availability to others, and ‘non-excludable’, meaning that it is not possible to exclude certain people from using it. The examples given above certainly meet these criteria. Code is infinitely reproducible at nearly zero marginal cost, and its use, absent any patents or copyrights, is impossible to constrain.

The costs of creating and sustaining a global, secure, open and free-as-in-freedom digital infrastructure are tiny in comparison to the benefits. But direct, ongoing public funding for those who maintain this infrastructure is rare. Meanwhile, we find that billions have been spent on intelligence agencies whose goal is to make security tools less secure. Rather than undermining such infrastructure, governments should be pooling their resources to improve it.


Related: The Linux Foundation has an initiative to address this situation, with the admirable backing of some industry heavyweights http://www.linuxfoundation.org/programs/core-infrastructure-initiative/
While any attempt to list all the critical projects of the internet is likely to be incomplete and lead to disagreement, Jonathan Wilkes and volunteers have nevertheless begun one https://wiki.pch.net/doku.php?id=pch:public:critical-internet-software