# Is Presuming Innocence A Bayesian Prior?

Note: An earlier version of this post was accidentally sent out in unedited form. My enemies caused me to hit the wrong button. Subscribers: apologies for the near-duplicate email.

I don’t mean to pick on Deborah Mayo, but her site has lots of good probability teasers that don’t confuse the question with a lot of math. Nothing wrong with math, except that most of it is just plodding along. Solving math problems for fixed situations is not as much fun as solving philosophical ones, to me.

The conundrum today is provided by Larry Laudan, who wrote “Why Presuming Innocence is Not a Bayesian Prior” at Mayo’s site.

Let’s switch things up. I’ll say what I think is the right answer, then we’ll let Laudan have a go.

Judging a man guilty or innocent, or at least not guilty, is a decision, an act. It is not probability. Like all decisions it uses probability. The probability you form depends on the evidence you assume or believe. Probability is the deduction, not always quantified, from the set of assumed evidence of the proposition of interest. In this case, “He’s guilty.”

When jurors are empanelled they enter with minds full of chaos. Some might have already formed high probabilities of guilt of the defendant (“Just look at him!”), some will have formed low (“Just look at him!”), because all will have different evidence assumed. Yet most, we imagine, will accept the proposition “There’s more evidence about guilt that I haven’t yet heard.” Add that to what’s in their minds, perhaps after subtracting some beliefs, and some jurors might form a low probability.

Now no juror at this point is ever asked to form the decision from his probability to guilty or not guilty. Each could, though. People do. You do when you read of trials in the paper, for instance. There is nothing magical that turns the evidence at the final official decision into “real probability”. Decisions could be made at any time. It is only that the law states only one decision will count, and that’s the one directed by the judge.

Of course, what’s going on in a juror’s mind—and I speak from experience—is nearly constantly shifting. One moment you believe or accept this set of evidence, the next moment maybe something entirely different. You’re nearly always ready to judge based on the probability you’ve formed right now. “He was at the school? He’s Guilty!” Then you hear something new and you think Not guilty. The judge may tell you to ignore a piece of evidence, and maybe you can or maybe you can’t. Some jurors see a certain mannerism and interpret it in a certain way, some didn’t. And so on.

At trial’s end, every juror retires to their room with what they started with: minds full of augmented chaos—a directed chaos, though. The direction is honed by the discussion the jurors have. They will try to agree on two things: a set of evidence, which necessarily leads to a deduction of a (non-quantified, thank the Lord) probability (which won’t be precisely identical for each juror, because the set of evidence considered will never be precisely identical), and a decision based on the probability. Decisions are above probability. They account for thinking about being right and wrong, and what consequences flow from that. Each juror might come to a high probability of Guilty, but they might decide Not guilty because they think the law is stupid, or (think OJ) “racist”.

That’s the scheme. But we still haven’t accounted for the initial directive of “presuming innocence”. What happens with that?

You hear “You must presume the defendant innocent.” That can be taken as a judgement, i.e. a decision, or a command to clear the mind of evidence probative to the question of guilt. Or both. If it’s a decision, it’s nothing but a formality. Jurors don’t get a vote at the beginning of a trial anyway, so hearing they would have to vote Not guilty right now, if they were allowed to vote, isn’t much beyond legal theater.

But if it’s a command to clear the mind, or a command to at least implant the evidence “I don’t know all the evidence, but know more is on its way”, and to the extent each juror obeys this command, it is treated as a piece of evidence, and therefore forms part of each juror’s total evidence, which itself implies a (non-quantified) probability for each juror.

So the command is not a “prior” per se. A “prior” is a probability, and probability is the deduction from a set of evidence. That the command is used in forming a probability (of course very informally), does make it prior evidence, though. Prior to the trial itself.

That’s the answer. We’re done. With the reminder that Bayes itself is not what is important in probability. Bayes is just a helpful formula, which isn’t strictly needed. Our answer is the same as what we began with. Probability is deduced from the evidence assumed (at any point), and decisions are acts made with reference to the probability and other matters.

What does Laudan say?

He says the command is “an instruction about [the jurors’] probative attitudes”. I agree with that, in the sense just stated. But Laudan amplifies:

> asking a juror to begin a trial believing that defendant did not commit a crime requires a doxastic act that is probably outside the jurors’ control. It would involve asking jurors to strongly believe an empirical assertion for which they have no evidence whatsoever.

That jurors have “no evidence whatsoever” is false, and not even close. I walked into my last trial with the thought, “The guy probably did it because he was arrested and is on trial.” That is positive evidence for Guilt. I had lots of other thought-evidence, as did each other juror. I’m sure some came in thinking Not guilty for any number of other reasons (evidence). The name of the crime itself, taken in context, is more evidence. Each juror could commit, as I said, his “doxastic act” (his decision), at any time. Only his decision doesn’t count until the end.

> asking jurors to believe that defendant did not commit the crime seems a rather strange and gratuitous request to make since at no point in the trial will jurors be asked to make a judgment whether defendant is materially innocent. The key decision they must make at the end of the trial does not require a determination of factual innocence. On the contrary, jurors must make a probative judgment: has it been proved beyond a reasonable doubt that defendant committed the crime? If they believe that the proof standard has been satisfied, they issue a verdict of guilty. If not, they acquit him. It is crucial to grasp that an acquittal entails nothing about whether defendant committed the crime, [sic]

We have already seen how each juror forms his probability and then decision based on the evidence. That evidence can very well start with the evidence provided by the judge’s command. I don’t buy his “at no point” either. Many jurors take the vote of Not guilty to mean exactly “He didn’t do it!”—by which they mean they believe the defendant is innocent. Anybody who has served on a jury can verify this. Some jurors might say, of course, they’re not sure, not convinced. To insist that “an acquittal entails nothing about whether defendant committed the crime” is just false—except in a narrow, legal sense.

Laudan says “Legal jurisprudence itself makes clear that the presumption of innocence must be glossed in probatory terms.” That’s true, and I agree the judge’s statement is often taken as theater, part of the ritual of the trial. But it can, and in the manner I showed, be taken as evidence, too.

Now it seems Laudan is not a Bayesian (and neither am I):

> Bayesians will of course be understandably appalled at the suggestion here that, as the jury comes to see and consider more and more evidence, they must continue assuming that defendant did not commit the crime until they make a quantum leap and suddenly decide that his guilt has been proven to a very high standard. This instruction makes sense if and only if we suppose that the court is not referring to belief in the likelihood of material innocence (which will presumably gradually decline with the accumulation of more and more inculpatory evidence) but rather to a belief that guilt has been proved.
>
> As I see it, the presumption of innocence is nothing more than an instruction to jurors to avoid factoring into their calculations the fact that he is on trial because some people in the legal system believe him to be guilty. Such an instruction may be reasonable or not (after all, roughly 80% of those who go to trial are convicted and, given what we know about false conviction rates, that clearly means that the majority of defendants are guilty). But I’m quite prepared to have jurors urged to ignore what they know about conviction rates at trial and simply go into a trial acknowledging that, to date, they have seen no proof of defendant’s culpability.

I can’t say what Bayesians would be appalled by, though the ones I have known have strong stomachs. That Bayesians see an accumulation of evidence leading to a point seems to me to be exactly what Bayesians do think, though. How to think of the initial instruction (command), we have already seen.

I agree that the command is used “to avoid factoring into their calculations the fact that he is on trial because some people in the legal system believe him to be guilty.” That’s evidence (which he just said jurors didn’t have). Increasing the probability of guilty because the defendant is on trial is what many jurors do. Even Laudan does that! That’s why he quotes that “80%”. The command (sometimes) removes this evidence. (Laudan may be using evidence as true statements of reality; I do not and instead call it the premises the jury believes true; some lawyers have been known to lie.)
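Laudan’s “80%” arithmetic can be checked in a few lines. This is only a sketch: the 80% is his figure, while the 5% false conviction rate below is a made-up number used purely for illustration, and the function name is mine. If a fraction c of defendants are convicted and a fraction f of those convictions are wrongful, then at least c(1 - f) of all defendants are guilty (more, if some guilty defendants are acquitted).

```python
def min_guilty_share(conviction_rate: float, false_conviction_rate: float) -> float:
    """Lower bound on the share of defendants who are guilty.

    Counts only the correctly convicted; guilty defendants who are
    acquitted would push the true share higher.
    """
    return conviction_rate * (1.0 - false_conviction_rate)


# 0.80 is Laudan's figure; 0.05 is a hypothetical false conviction rate.
bound = min_guilty_share(0.80, 0.05)
print(f"At least {bound:.0%} of defendants guilty")
```

With those assumed numbers the bound is about 76%, comfortably a majority, which is the very evidence the judge’s command asks jurors to set aside.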

Laudan doesn’t say, but I’m guessing he’s a frequentist. Jury trials are perfect at showing frequentism fails as a definition of probability. In that theory, probabilities are defined by infinite sequences of positive (guilty) measurements embedded in infinite sequences of positive and negative (guilty and not guilty) measurements. (Large doesn’t count: has to be infinite.)

Tell me just what exact unique no-dispute no-possibility-of-other infinite sequence this real-life trial is embedded in. Black guy on trial for selling a certain quantity of cocaine within so many yards of a school. Guy, born and raised Christian in the States, dresses in Muslim garb. Good luck!

Sequence has to be exact unique no-dispute no-possibility-of-other otherwise you could come to different probabilities.

Incidentally, I was a juror on a trial with these circumstances. The black women on the jury were incensed to high degree and never forgave the defendant for wearing the garb.

1. Gary

There is another bit of evidence that jurors bring to the trial just touched on briefly (i.e., “Not guilty because they think the law is stupid”) but which contributes greatly to the decision in the minds of some jurors. That evidence is their personal struggle with the responsibility of judging someone guilty of a crime that leads to severe punishment. They ask themselves “who am I to judge?” The answer is — the juror weighs the evidence and decides if it’s sufficient to prove the case. It’s the judge who decides the punishment. In the trial I experienced as a juror, we worked through the emotional conflicts and came to a proper decision under the guidance of a foreman who managed the process well.

Great post. It makes the distinction between probability and decision quite clear.

2. cdquarles

I, unfortunately, have had enough experience with the judicial system in the USA, that it is easy for me to agree with the presumption of innocence. I’ve been a plaintiff, a defendant, a witness and a potential juror. That one was the most ‘eye opening’ of all. If you get justice, thank God for it.

As a plaintiff, your complaint is presumed true. As a defendant, you are presumed innocent. Remember, the plaintiff makes a claim or a charge. The claim or charge is open to rebuttal. The presumption of innocence is also open to rebuttal.

Since evidence is open to interpretation, tests subject to error, and humans are fallible, it is generally accepted, in the USA, that it is better for the ‘guilty’ to go unpunished than it is to subject the innocent to punishment.

3. Plantagenet

I worked as a pupil at a criminal set in the UK, and again briefly as an advisor in Canada, so my knowledge of the American system is only general. However, the presumption of innocence notwithstanding, 999 times out of a thousand you have to be pretty darn guilty to get caught up in the criminal justice system to begin with. Having said that, once you have what the British call “form”, and I believe Americans refer to as a “rap sheet”, you will be pulled back in if you do anything remotely suspicious or are just in the wrong place at the wrong time. Again, often there is good reason; nevertheless, I have met those who gave up trying to go straight because the system had made up its mind. As a thief in Newfoundland once said to me, your only choice is to be an amateur or a professional.

4. People complain about all the violence in America. They talk about our “gun culture.” But what they don’t talk about are the facts. The average murder victim has been arrested at least 10 times. The average murderer has been arrested over 20 times. Both are known gang/organized crime members.

We don’t have a violence problem. We have a lack of criminal justice problem. If these career criminals and thugs were in prison or under the ground, they wouldn’t be bothering anybody else.

5. “In that [frequentist] theory, probabilities are defined by infinite sequences..”

Nope, that is just one type of frequentism.

If I toss a coin a trillion times (a trillion is still less than infinity), I’m pretty convinced that the probability of it landing on heads is p, whatever p happens to tend to.

Justin

6. Andyd

There is no presumption of innocence. Who could possibly assume such a thing, it’s absurd. The onus is to prove guilt beyond a reasonable doubt.

7. Getting back to this: “Jury trials are perfect at showing frequentism fails as a definition of probability.”

Just how do you solve this difficult problem with your favored definition of probability?

I was re-reading Boole’s “Laws of Thought” (LOT) recently. In LOT towards the end, at pages 376-398 in a “Probability of Judgments” section, he talks about probability for trials. To summarize:

P(accused person is guilty) = k
P(juryman forms the correct opinion) = x
P(accused person will be condemned) = X
P(accused party is guilty and juryman judges him to be guilty) = kx
P(accused person is innocent and juryman judges him guilty) = (1-k)(1-x)
Then for 1 juryman, X = kx + (1-k)(1-x)
If there are n jurymen whose separate probability of judgment is x1, x2, …, xn, then
X = k*x1*x2*…*xn + (1-k)(1-x1)(1-x2)…(1-xn)
Suppose x1, x2, … , xn are equal, then X = kx^n + (1-k)(1-x)^n
In this case, P(accused person is guilty, given that all n jurymen condemn him) = kx^n / (kx^n + (1-k)(1-x)^n)
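The formulas summarized above are easy to try out numerically. What follows is a minimal sketch, not Boole’s own presentation: the function names are mine, and the values k = 0.6 and x = 0.75 are hypothetical, chosen only so that both exceed one half.

```python
def prob_condemned(k: float, x: float, n: int) -> float:
    """X = k*x^n + (1-k)*(1-x)^n: all n jurymen judge the accused guilty.

    k is the probability the accused is guilty, x the probability that a
    single juryman forms the correct opinion (the same for every juryman).
    """
    return k * x**n + (1 - k) * (1 - x) ** n


def prob_guilty_given_condemned(k: float, x: float, n: int) -> float:
    """k*x^n / X: probability of guilt once all n jurymen have condemned."""
    return k * x**n / prob_condemned(k, x, n)


if __name__ == "__main__":
    k, x = 0.6, 0.75  # hypothetical values, for illustration only
    for n in (1, 2, 12):
        print(n, prob_condemned(k, x, n), prob_guilty_given_condemned(k, x, n))
```

Under these assumed inputs, unanimous condemnation becomes less likely as n grows, while the probability of guilt given that condemnation climbs toward 1.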

He calculates:
P(i voices out of jury of n declares guilty)
P(condemnation by a majority “a”) (this is just the above probability with i = (n+a)/2)
P(condemnation by a majority of at least m out of n jurors)
All the above equations but without k (i.e. where it is not a jury setting, but an assembly of n people)
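The list above does not spell out the vote-count formulas, so here is a hedged sketch of what follows from the same model: independent jurymen, each correct with the same probability x, and an accused who is guilty with probability k. The binomial form and the function names are my reconstruction, not a quotation from Boole.

```python
from math import comb


def prob_exactly_i_guilty_votes(k: float, x: float, n: int, i: int) -> float:
    """P(exactly i of n jurymen vote guilty) under the independence model.

    If the accused is guilty, a guilty vote is a correct one (probability x);
    if innocent, a guilty vote is a mistaken one (probability 1 - x).
    """
    guilty_case = k * x**i * (1 - x) ** (n - i)
    innocent_case = (1 - k) * (1 - x) ** i * x ** (n - i)
    return comb(n, i) * (guilty_case + innocent_case)


def prob_at_least_m_guilty_votes(k: float, x: float, n: int, m: int) -> float:
    """P(condemnation when at least m of n guilty votes are required)."""
    return sum(prob_exactly_i_guilty_votes(k, x, n, i) for i in range(m, n + 1))


if __name__ == "__main__":
    k, x, n = 0.6, 0.75, 12  # hypothetical values, for illustration only
    print(prob_at_least_m_guilty_votes(k, x, n, 7))  # simple majority of 12
    print(prob_at_least_m_guilty_votes(k, x, n, 8))  # majority of at least four
```

Setting i = n recovers X = kx^n + (1-k)(1-x)^n, which is a quick consistency check against the unanimous case.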

He also speculates on values of x and k. He notes that Laplace’s speculations are not great.
He estimates x and k from real trial data that Poisson compiled, i.e. he is using relative frequencies. Estimates 5286/11016 = .4782 for some timespan, and 743/2046 = .3631 for another. He uses these to derive values for k and x or something.
He assumes accused is more likely to be guilty than not.
He assumes juryman is more likely to make correct decision than not.

He builds this all up from basic principles, to finally derive:
-the mean probability of a correct judgement
-the general probability k of guilt of an accused person

Based on models and relative frequency data.

Justin

8. Timothy S

Dr. Briggs, [1] “A decision” is not “A probability” + [2] Presumption of innocence has meaning outside of jury trials.

A few years ago I was charged with kidnapping and doing bodily harm to a child. *gasps*

Now on the precautionary principle, most of us will take a comparatively low standard of evidence to draw our beloved children away from any potential harm. However, many people take that decision as a probability. That would account for the reluctance to change people’s mind in the face of evidence. Today I have the benefit of a court ruling, but a significant portion of the population are stubbornly convinced by such accusations. A * goes beside my name everywhere I go.
https://calgarysun.com/news/crime/man-attempted-to-snatch-nine-year-old-girl-in-front-of-mother-trial-told
https://nationalpost.com/news/crime/court-views-video-of-child-abduction-suspect-attacking-police-officer
https://www.alainhepner.ca/in-the-news/man-cleared-of-assault-after-judge-rules-he-was-suffering-epileptic-seizure

One day in the middle of town I had a seizure and grabbed onto the nearest person. This has happened all my life, but this time it was a poor 9 year old! The mom tried to wrest the child from my grasp. The kid was bruised and both were traumatized. I fled – another pattern of epilepsy evident for decades – usually happening without any inciting incident, but which would seem to prove culpability. In seconds I didn’t even know anything happened and continued merrily on the way I was going and I walked right back past the poor family, who were terrified and the gathering crowd was angry.

It was incumbent on me to provide affirmative evidence of my innocence with neurologists and character witnesses. On this point, I don’t have any particular objection, the “presumption of innocence” serves important functions most people never think about. Some problems it mitigates do not occur in the courtroom and are thus invisible to the public, even the judges. Prosecutors get to decide who to charge, what they’re charged with, whether they are guilty, how to weigh punishment/deterrence and how much time the accused spends in jail.

By overcharging, they can ensure the full sentence for the underlying allegation is served even before conviction. In my case, after several [violent] months in jail, Ramaswami – a prosecutor who substitutes vindictiveness for a moral compass – set a cynical trap for me: If I pleaded guilty to one small charge, I’d be free the next day and the case would be over. I got strangled unconscious and kicked around a few times and the reputational damage would be permanent. When I refused, the sleazy b… dog, having just offered unconditional release, convinced a judge I was too dangerous to release! So my telling the truth resulted in another full year of imprisonment.

In any case Mr. Briggs has cleared up a fact which perplexed me: The continued perception of ‘guilt’ is not about evidence, but a cognitive consequence of confusing decisions with probabilities.

*The police lied, but in fairness to them, these lies were only in relation to their own criminal conduct.
**And apologies for the long post. Please feel fre