Let’s do a little science experiment together. Go into the closet and pull out an opaque sack or bag. Anything will do, even a large sock. If you can fit your hand in, it’s fine.
Now reach in and pull out a random observation. Pinch between your fingers a piece of probability and carefully remove it. Hold on tight! Probability is slippery. We will call this a “draw from a probability distribution”.
What does yours look like? Nothing, you say? That’s odd. Mine also looks like nothing. Let’s try again, because drawing random observations is what statisticians do all the time. If we didn’t manage to find something, the fault must lie with us, and not with the idea.
The idea is that these “random draws” will tell us what the uncertainty in some proposition is (described below). “Random draws” are used in Gibbs sampling, Metropolis-Hastings, Markov Chain Monte Carlo, bootstrap, and the like, which we can call MCMC for short.
Maybe the sack is the problem. Doesn’t seem to be anything in there. Maybe that’s because there is no such thing as probability in the sense that it is not a physical thing, not a real property of objects? Nah.
Let’s leave that aside for now and trade the sack for a computer. After all, statisticians use computers. Reach instead inside your computer for a random draw. Still nothing? Did you have the ON/OFF switch in the OFF position?
You probably got nothing because when you reach into the computer for a random draw you have to do it in a specific way. Here’s how.
Step one: select a number between 0 and 1. Any will do. Close your eyes when picking, because in order to make the magic work, you have to pretend not to see it. I do not jest. Now since this pick will be on a finite, discrete machine, the number you get will belong to some finite, discrete set of values between 0 and 1. There is no harm in thinking this set is 0.1, 0.2, …, 0.9 (or indeed only, say, 0.3333 and 0.6666!). Call your number s. (Sometimes you have to pick more than one s at a time, but that is irrelevant to the philosophy.)
Step two: Transform s with a function, f(s). The function will turn s into a “random draw” from the “distribution” specified by f(). This function is like a map, like adding 1 to any number is. In other words, f(s) could be f(s) = s + 1. It will be more complicated than that, of course, but it’s the same idea. (And s might be more than one number, as mentioned.)
Step three: That f(s) is usually fed into an algorithm which transforms it again with another function, g(), giving g(f(s)). The output of g(f(s)) is almost the answer we wanted, which was the uncertainty of the proposition of interest.
Step four: Repeat Steps one-three many times. The result will be a pile of g(f(s)), each having a different value for every s found in Step one. From this pile, we can ask, “How many g(f(s)) are larger than x?” and use that as a guess for the probability of seeing values larger than x. And so on.
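Here is a minimal sketch of Steps one through four in Python. The particular f() and g() are assumptions made up for illustration (f is the inverse CDF of an exponential distribution, g just squares its input), as are the names pile_of_draws and fraction_larger_than. Nothing in the argument depends on these choices.

```python
import math
import random


def f(s):
    # Hypothetical f(): maps s in [0, 1) to a "draw" from an
    # Exponential(1) distribution via its inverse CDF.
    return -math.log(1.0 - s)


def g(y):
    # Hypothetical g(): any further transformation; here, squaring.
    return y * y


def pile_of_draws(n, seed=1):
    # The seed is fixed so the sketch is repeatable.
    rng = random.Random(seed)
    pile = []
    for _ in range(n):
        s = rng.random()        # Step one: a number in [0, 1)
        pile.append(g(f(s)))    # Steps two and three: g(f(s))
    return pile                 # Step four: the pile


def fraction_larger_than(pile, x):
    # "How many g(f(s)) are larger than x?" as a fraction of the pile.
    return sum(1 for value in pile if value > x) / len(pile)


pile = pile_of_draws(10_000)
estimate = fraction_larger_than(pile, 1.0)
print(estimate)
```

With these made-up functions the exact answer happens to be available, exp(-1) or about 0.368, so the guess from the pile can be checked against it.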
Steps two-four are reasonable and even necessary because we cannot often solve for the uncertainty of the proposition of interest analytically. The math is too hard. So we have to derive approximations. If you know any calculus, it is like finding approximations to integrals that don’t have easy solutions. You plot the curve, and bust it up into lots of sections, compute the area of each section, then add them up.
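The calculus analogy can be sketched in a few lines of Python. The curve exp(-x²) is my own choice of example, picked because it has no elementary antiderivative, and the midpoint rule used here is only one of several ways to slice up the area.

```python
import math


def curve(x):
    # Example curve (an assumption for illustration): exp(-x^2).
    return math.exp(-x * x)


def midpoint_area(func, a, b, sections):
    # Bust the interval [a, b] into equal sections, treat each as a
    # rectangle as tall as the curve at its midpoint, and add them up.
    width = (b - a) / sections
    total = 0.0
    for i in range(sections):
        midpoint = a + (i + 0.5) * width
        total += func(midpoint) * width
    return total


coarse = midpoint_area(curve, 0.0, 1.0, 4)
fine = midpoint_area(curve, 0.0, 1.0, 1000)
print(coarse, fine)
```

More sections give a closer approximation, and at no point does anybody need to close their eyes.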
Same idea here. Except for the bit about magic. Let’s figure out what that is.
Now instead of picking “randomly”, we could just cycle through the allowable, available s, which we imagined could equal 0.1, 0.2, …, 0.9. That would give us 9 g(f(s))s. And that pile could be used as in Step four. No problem!
Of course, having only 9 in the pile would have the same effect as slicing an integral only coarsely. The approximation won’t be great. But it will be an approximation. The solution is obvious: increase the number of possible s. Maybe 0.05, 0.10, 0.15, …, 0.95 is better. Try it yourself! (There are all sorts of niceties about selecting good s as part of the steps which do not interest us philosophically. We’re not after efficiency here, but understanding.)
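For those who want to try it, here is a sketch that cycles through the allowable s with eyes wide open. The f() and g() are hypothetical stand-ins (an exponential inverse CDF and a squaring), and the helper names grid and fraction_larger_than are mine. With these choices the exact answer to "how often is g(f(s)) larger than 1?" is exp(-1), about 0.368.

```python
import math


def f(s):
    # Hypothetical f(): inverse CDF of an Exponential(1) distribution.
    return -math.log(1.0 - s)


def g(y):
    # Hypothetical g(): squaring.
    return y * y


def grid(steps):
    # Every allowable s, in order: 1/steps, 2/steps, ..., (steps-1)/steps.
    return [k / steps for k in range(1, steps)]


def fraction_larger_than(pile, x):
    return sum(1 for value in pile if value > x) / len(pile)


# 0.1, 0.2, ..., 0.9: nine values of s, nine g(f(s))s.
coarse_pile = [g(f(s)) for s in grid(10)]
# 0.05, 0.10, ..., 0.95: nineteen values.
finer_pile = [g(f(s)) for s in grid(20)]
# A much finer slicing.
finest_pile = [g(f(s)) for s in grid(1000)]

coarse = fraction_larger_than(coarse_pile, 1.0)
finer = fraction_larger_than(finer_pile, 1.0)
finest = fraction_larger_than(finest_pile, 1.0)
print(coarse, finer, finest, math.exp(-1))
```

The nine-point pile gives 3/9 and the nineteen-point pile gives 7/19, which is already close. No s was picked blindly at any stage.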
This still doesn’t explain the magic. To us, random means not predictable with certainty. Our sampling from s is not random, because we know with certainty what s is (as well as what f() and g() are). We are just approximating some hard math in a straightforward fashion. There is no mystery.
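The same holds when the computer does the picking with its "eyes closed". As a small sketch, assuming Python's standard random module as the picker (the helper name picks is hypothetical): start it from the same seed and it hands back exactly the same s every time.

```python
import random


def picks(seed, n):
    # A pseudo-random generator is a deterministic function of its seed.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


first = picks(42, 5)
second = picks(42, 5)
print(first == second)  # True: same seed, same s, every time
```

Anybody who knows the seed and the algorithm knows each s with certainty before it is "drawn".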
To some, though, random is a property of a thing. That’s why they insist on picking s with their eyes closed. The random of the s is real, in the same way probability is real. The idea is that the random of the s, as long as we keep our eyes closed, attaches itself to f(s), which inherits the random, and f(s) in turn paints g(f(s)) with its random. Thus, the questions we ask of the pile of g(f(s))s are also random. And random means real probability. The magic has been preserved!
As long as you keep your eyes closed, that is. Open them at any point and the random vanishes! Poof!
“Are you saying, Briggs, that those who believe in the random get the wrong answers?”
Nope. I said they believe in magic. Like I wrote in the link above, it’s like they believe gremlins are what make their cars go. It’s not that cars don’t go, it’s that gremlins are the wrong explanation.
“So what difference does it make, then, since they’re getting the right answers? You just like to complain.”
Because it’s nice to be right rather than wrong. Probability and randomness are not real features of the world. They are purely epistemic. Once people grasp that, we can leave behind frequentism and a whole host of other errors.
“You are a bad person. Why should I listen to you?”
Because I’m right.