Bayes in the World I: Wikileaks – The Random Universe

I’ve come across a couple of bits of popular/political culture that give me the opportunity to discuss one of my favorite topics: the uses and abuses of probability theory.

The first is a piece by Nate Silver of the New York Times’ FiveThirtyEight blog, which is dedicated to crunching the political numbers of polls and other data in as transparent a manner as possible. Usually, Silver relies on a relentlessly frequentist take on probability: he runs lots of simulations, letting the inputs vary according to the poll results (correctly taking into account the “margin of error”) and more than occasionally using other information to re-weight the results of different polls. Nonetheless, these techniques give a good summary of the results at any given time, and have made for far and away the best discussion of the numerical minutiae of electioneering for both the 2008 and 2010 US elections.

But yesterday, Silver wrote a column, “A Bayesian Take on Julian Assange,” which tackles the question of Assange’s guilt in the sexual-assault offense with which he has been charged. Bayes’ theorem, you will probably recall if you’ve been reading this blog, states that the probability of some statement (“Assange is innocent of sexual assault, despite the charges against him”) is proportional to the product of the probability that he would be charged if he were innocent (the “likelihood”) and the probability of his innocence in the absence of knowledge about the charge (the “prior”):

P(innocent|charged, context) ∝ P(innocent | context) × P(charged|innocent, context)

where P(A|B) means the probability of A given B, and the “∝” means that I’ve left off an overall normalizing number that you multiply by so that the probabilities of all the possibilities add up to one. The most important thing I’ve left in here is the “context”: all of these probabilities depend upon the entire context in which you consider the problem.
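For completeness, here is a sketch of what that left-off number looks like. It assumes there are only two possibilities, “innocent” and “guilty” (the “guilty” label and the two-way split are my simplification for illustration):

```latex
\[
P(\text{innocent} \mid \text{charged}, \text{context})
  = \frac{P(\text{innocent} \mid \text{context}) \,
          P(\text{charged} \mid \text{innocent}, \text{context})}
         {P(\text{charged} \mid \text{context})}
\]
where the denominator sums over both possibilities:
\[
P(\text{charged} \mid \text{context})
  = P(\text{innocent} \mid \text{context}) \,
    P(\text{charged} \mid \text{innocent}, \text{context})
  + P(\text{guilty} \mid \text{context}) \,
    P(\text{charged} \mid \text{guilty}, \text{context})
\]
```

The denominator is the same whichever statement we ask about, which is why it can be dropped and restored at the end by requiring the two posterior probabilities to sum to one.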

To figure out these probabilities, there are no simulations we can perform — we can’t run a big social-science model of Swedish law-enforcement, possibly in contact with, say, American diplomats, and make small changes and see what happens. We just need to assign probabilities to these statements.

But even to do that requires considerable thought, and important decisions about the context in which we want to make these assignments. For Silver, the important context is that there is evidence that other governments, particularly the US, may have an ulterior motive for wanting to not just prosecute, but persecute Assange. Hence, the probability of his being unjustly accused [P(charged|innocent, context)] is larger than it would be for, say, an arbitrary Australian citizen traveling in Britain. Usually, Bayesian probability is accused of needing a subjective prior, but in this case the context affects and adds a subjective aspect to the likelihood.

Some of the commenters on the site make a different point: given that Assange is, at least in some sense, a known criminal (he has leaked secret documents, which is likely against the law), he is more likely to commit other criminal acts. This time it is not the likelihood that is affected, but the prior: the commenter believes that Assange is less likely to be innocent irrespective of the information about the charge.
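To see how these two arguments pull in opposite directions, here is a minimal Python sketch. Every number in it is made up purely for illustration (none come from Silver's column or from the commenters), and the function name `posterior_innocent` and the two-way innocent/guilty split are my own assumptions:

```python
def posterior_innocent(prior_innocent, p_charged_if_innocent, p_charged_if_guilty):
    """Return P(innocent | charged, context) for a two-hypothesis problem.

    Assumes the only possibilities are "innocent" and "guilty", so the
    prior for guilt is one minus the prior for innocence.
    """
    prior_guilty = 1.0 - prior_innocent
    # Unnormalized posteriors: prior times likelihood for each hypothesis.
    weight_innocent = prior_innocent * p_charged_if_innocent
    weight_guilty = prior_guilty * p_charged_if_guilty
    # Dividing by the sum restores the overall number hidden in the "∝".
    return weight_innocent / (weight_innocent + weight_guilty)


# Hypothetical baseline: an arbitrary person, with no special context.
baseline = posterior_innocent(
    prior_innocent=0.99, p_charged_if_innocent=0.001, p_charged_if_guilty=0.1
)

# Silver-style argument: the context makes an unjust charge more probable,
# so only the likelihood P(charged | innocent, context) goes up.
larger_likelihood = posterior_innocent(
    prior_innocent=0.99, p_charged_if_innocent=0.01, p_charged_if_guilty=0.1
)

# Commenter-style argument: the context lowers the prior for innocence,
# leaving the likelihoods untouched.
smaller_prior = posterior_innocent(
    prior_innocent=0.90, p_charged_if_innocent=0.001, p_charged_if_guilty=0.1
)

print(f"baseline:          {baseline:.3f}")
print(f"larger likelihood: {larger_likelihood:.3f}")
print(f"smaller prior:     {smaller_prior:.3f}")
```

With these invented inputs, raising the likelihood of an unjust charge pushes the posterior probability of innocence up, while lowering the prior pushes it down. The point is not the particular values but that both changes enter through the context, and neither can be read off from data alone.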

Next: game shows.