It’s a lazy Saturday, so some musings today on entropy and information and probability. It’s about time we started tying these things together.
Things like the following are heard: “Given a list of premises, I judge the probability of rain tomorrow at 0%”. Swap in for “rain tomorrow” your favorite prediction: “the team wins”, “the customer buys”, “the man defaults”, and so on.
Now zero-percent, a probability of 0, is mighty strong. The strongest. It means the proposition is impossible: not unlikely, impossible. Impossible things cannot happen. Even God cannot do the impossible.
Yet impossible events occur all the time. It rains when it’s not supposed to, the team loses, the customer leaves, the man honors. And so on. Something has gone wrong.
The failure, as always, lies in not keeping assiduous track of our language. Equivocation creeps in unawares. Impossible here doesn't mean impossible full stop, but conditionally impossible. Why?
All (as in all) probability is conditional. That means some list of premises existed which allowed the judgement of a probability of 0. That is, the forecaster had some model—a model is a list of premises—which spit out “probability 0” for his proposition.
Since it did later rain, we have falsified this model, i.e. this list of premises. But we haven't shown which of the premises is false: we have only proved that at least one premise of the model isn't so. The "model", i.e. the collection of premises as a whole, is false, it is wrong, but there may be pieces of it which are true. So much we already know.
Introduce the idea of "probabilistic surprise." A surprising event is a rare one. If our model (you simply must get into the habit of thinking of any model as a list of premises; "list of premises" is just not as compact as "model") says the probability of some proposition is low, then, conditional on that model, if the proposition is observed to be true, we are surprised. Rarer events are more surprising.
Think how you’d feel winning the lottery. The model easily lets us deduce the probability for the proposition “I win.” This is small; your shock at winning is correspondingly large.
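To make the lottery concrete, here is a sketch assuming a hypothetical 6-of-49 draw (not any particular lottery); the model's premises fix the possibilities, and the probability of "I win" falls right out. Measuring surprise as the negative log of the probability is my choice of convention here:

```python
from math import comb, log2

# Hypothetical 6-of-49 lottery: the premises (49 balls, pick 6,
# exactly one winning combination) let us deduce P("I win").
p_win = 1 / comb(49, 6)

# One standard quantification of surprise: negative log probability, in bits.
surprise_bits = -log2(p_win)

print(f"P(I win) = {p_win:.2e}")                       # ~7.15e-08
print(f"Surprise if I win: {surprise_bits:.1f} bits")  # ~23.7 bits
```

A probability around seven in a hundred million yields roughly 24 bits of surprise, which squares with the shock.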
Propositions which are certain conditional on the model are not surprising; indeed, since they must happen, since they have a probability of one, they are inevitable. Who could be surprised by what must occur? (Insert your own joke here.)
Lotteries and other dichotomous situations are great examples since we can easily track possibilities. Tracking possibilities isn’t always easy, nor even always possible with every model (list of premises!). Tracking means deducing every (as in every) proposition which has a probability relative to the model. In formal situations, we’re fine; informally, not so fine. So let’s stick with dichotomy, which is free beer.
It turns out that with this definition of "surprise" (or rather, granting the notion that we can quantify surprise, a move which is open to debate) we can deduce a formula for the amount of surprise we can expect, given we have tracked the model, identified all the propositions, and deduced their conditional probabilities p_i. This formula is:

H = −Σ p_i log(p_i),

with the sum running over every tracked proposition.
Yes, entropy. Now, isn’t that curious?
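As a sketch of the calculation (the function name and the base-2 logarithm are my choices; any log base works, changing only the units), the expected surprise over a fully tracked model is the entropy of its probabilities:

```python
from math import log2

def expected_surprise(probs):
    """Entropy: the probability-weighted average of -log2(p_i) over
    every tracked proposition. Zero-probability terms are skipped,
    following the usual convention that 0 * log(0) = 0."""
    return -sum(p * log2(p) for p in probs if p > 0)

# A dichotomy, e.g. "rain tomorrow" vs "no rain", with P(rain) = 0.3
print(expected_surprise([0.3, 0.7]))   # about 0.88 bits
print(expected_surprise([0.5, 0.5]))   # 1 bit: maximal for a dichotomy
print(expected_surprise([1.0]))        # 0 bits: certainty, no surprise
```

For a dichotomy the expected surprise peaks at 50/50, matching the intuition that an even split is the least predictable two-way model.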
Update See below for comments about calculating entropies. I originally had calculations here (something I don’t normally do) and because of my sloppiness (and laziness) I distracted us from the main points.
What about impossible propositions? Once again, our habitual slackness with language leads to mistakes. If a conditionally impossible event happens, according to our derivation, our surprise should be infinite.
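The blow-up is easy to see numerically (a sketch using the same negative-log-probability measure of surprise as above):

```python
from math import log2

def surprise(p):
    """Surprise, in bits, at a proposition of probability p coming true."""
    return -log2(p)

# As the conditional probability shrinks, surprise grows without bound.
for p in [0.1, 0.01, 1e-6, 1e-100]:
    print(f"p = {p:g}: surprise = {surprise(p):.1f} bits")

# surprise(0) would require log2(0), which diverges: a conditionally
# impossible event that nonetheless happens carries infinite surprise.
```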
In one way, this is right. If something truly impossible happened—these events are defined just as we define necessary truths, i.e. from indubitable first principles and irresistible deduction—then our surprise would surely be infinite. And deadly. Who could handle the shock? And since entropy is often given physical interpretation, impossible events imply the Trump of Doom.
But in another way, it's obviously wrong. Impossible events which really occur always mean a broken model. They always imply that a false premise has snuck in and been believed.
The lesson (the only one we can manage today): models which assign zero probabilities to contingent events are inherently flawed. (Contingent events are those which are not logically necessary or, you guessed it, logically impossible. It all ties together!)
Well, that's it for today. We haven't done information, nor scores, nor a world of other things.