We started by assuming each X was measured without error, that each observation was perfectly certain. This is not always so for real X. The measurement apparatus itself may not produce error-free values; instead, the values come with some uncertainty. Satellite values, for example, are like this: the values you see are actually the output of a mathematical model, which contains uncertainty. And it is often the case that the values of X from one time point to another come from different sources, even different locations. This, too, implies uncertainty, as we shall see. All proxy reconstructions have error.
Now, if each Xi is itself an average of, say, Yi,j , j = 1…m, where the Yi,j might be different fixed locations, then as long as each Yi,j is measured without error, we do not need to treat X any differently than we have been treating it. Those Yi,j can remain “hidden.” But if Yi,j is measured with error, that error must be accounted for in Xi. Averaging a bunch of Yi,j that are measured with error does not remove the error in the average: the average is inviolably subject to uncertainty. If you find yourself disbelieving this, then let m = 1. Aha! There the error is, plain as day. Therefore, if there is uncertainty in the Yi,j there is uncertainty in the Xi.
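A minimal simulation makes the point concrete. Assuming (hypothetically) that each Yi,j carries independent normal measurement error with standard deviation sigma, the average of m of them still carries error of size sigma/sqrt(m), which is positive for every finite m, and equals sigma itself when m = 1. The function name and numbers below are illustrative, not from the text:

```python
import math
import random

random.seed(1)

def sd_of_average(true_values, sigma, trials=20000):
    """Add independent normal error (sd = sigma) to each Y, average the
    noisy values, and return the standard deviation of that average
    across many simulated trials."""
    m = len(true_values)
    averages = []
    for _ in range(trials):
        noisy = [y + random.gauss(0, sigma) for y in true_values]
        averages.append(sum(noisy) / m)
    mean = sum(averages) / trials
    return math.sqrt(sum((a - mean) ** 2 for a in averages) / trials)

sigma = 0.5
for m in (1, 4, 16):
    sd = sd_of_average([0.0] * m, sigma)
    # Theory: the average retains error of size sigma / sqrt(m) > 0
    print(m, round(sd, 3), round(sigma / math.sqrt(m), 3))
```

The simulated spread shrinks as m grows but never reaches zero: averaging reduces, and cannot remove, the uncertainty.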
Another form of error arises when the locations and instrumentation change in time. This means that Xi is the average of Yi,j , j = 1…m, but that Xk (k ≠ i) is the average of, say, Wk,l , l = 1…n, where some of the locations/instrumentation in W may match some of those in Y, but not all of them do. This means that Xi and Xk are measuring different things, not just in time but in substance. It is possible to “map” Xk onto the scale of Xi, but this requires a model. That is, if we want to speak of X everywhere being the same substance, we need to make sure all the values are talking about the same thing. Doing so requires some kind of model, usually probabilistic. This model introduces uncertainty into the “mapped” Xi, even if the Y and W are measured without error. If, as is likely for physical variables, Y and W are measured with error, then this error and the error due to “mapping” must both be accounted for when speaking of X.
We began by asking
(2′) Pr(X1 = 0.43 | Error-free Observations),
We noticed that X1 = 0.43, so that (2′) = 1 or 100%. But if there is error we must instead ask
(24) Pr(X1 = 0.43 | Error-filled Observations Z),
where all we can say at this point is that (24) < 1 (it will also be > 0). The notation changes slightly here. We observe the value Z1 to be 0.43 and we ask what is the probability the actual, unobserved value X1 is the same. What we really need is this:
(25) Pr(X1 = 0.43 | Error-filled Observations Z & Mprob),
where we have some model, here assumed to be a probability model Mprob. Once we have it, we can answer questions like this
(26) Pr(X156 > X1 | Error-filled Observations Z & Mprob),
(27) Pr(X Decreased | Error-filled Observations Z & Mprob).
Let’s focus on (26) because it’s easier. It is no longer enough to look at the graph, which is now understood to be a plot of Z and not of X, and say whether X156 > X1. We don’t observe X, we see Z, so we can’t say with certainty whether this is so or not.
Since Mprob is a probability model, it will have a set of unobservable parameters in which we haven’t the slightest interest but which we must account for if we are to calculate (26). We certainly do not want to guess those parameters and then claim the guesses are really X (as many who work with proxies do). We don’t want to calculate something like (19) or (20) and make a decision whether some of these parameters should be set to 0 (or some other value). We absolutely, positively, 100% do not want to treat the uncertainty in the parameters themselves as a substitute for (26) (as the BEST people did). This is the mistake most people make, especially when they are “reconstructing” temperatures from proxies. We are presented a long line of parameter estimates which say nothing about (26) (or (27) or any other question about the real data).
Instead, we want to calculate (26) after removing all uncertainty we have in the parameters. In Bayesian terms, we say these parameters must be “integrated out.” The result, (26), is called a “prediction” because, of course, we are predicting what we do not see. ((26) is found from the “posterior predictive distribution.”)
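Here is a deliberately minimal sketch of such a prediction. Assume (hypothetically) the simplest possible Mprob: each X has a known normal prior and Z = X plus known normal measurement error, so the parameters are fixed rather than integrated numerically and the posterior for each X is available in closed form. The observed values and error sizes are invented:

```python
import math

def posterior(z, sigma, mu0=0.0, tau=1.0):
    """Conjugate normal update: X ~ N(mu0, tau^2) a priori, Z = X + e
    with e ~ N(0, sigma^2).  Returns posterior mean and variance of X."""
    var = 1.0 / (1.0 / tau**2 + 1.0 / sigma**2)
    mean = var * (mu0 / tau**2 + z / sigma**2)
    return mean, var

def pr_greater(z_a, z_b, sigma):
    """Pr(X_a > X_b | Z, Mprob): the difference of two independent
    normal posteriors is normal, so the probability is a CDF value."""
    ma, va = posterior(z_a, sigma)
    mb, vb = posterior(z_b, sigma)
    diff_mean = ma - mb
    diff_sd = math.sqrt(va + vb)
    # Standard normal CDF of diff_mean / diff_sd, via math.erf
    return 0.5 * (1.0 + math.erf(diff_mean / (diff_sd * math.sqrt(2.0))))

# Hypothetical observed values at times 1 and 156
z1, z156 = 0.43, 0.55
for sigma in (0.05, 0.2, 0.5):
    print(sigma, round(pr_greater(z156, z1, sigma), 3))
```

Notice the behavior, which is the whole lesson: the answer is strictly between 0 and 1, and the larger the measurement error, the closer it falls to 1/2, i.e. the less we can say about whether X156 really exceeded X1.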
All we can say for certain is that, if there is any measurement error, 0 < (26) < 1, where the limits are never reached. Different models of measurement error will give different values of (26). And all the same goes for (27). A glance at the plot is not enough to confirm that X really decreased or increased. And no matter what, (27) will stay away from both 0 and 1. There is no “straight line” or “trend” in (27), either: it is just the question of whether X decreased more often than it increased or stayed the same. We can add a trend to (27), but that does not allow us to bypass modeling the measurement error: we must combine the two models. And again, even if the trend+error model says one thing, that does not mean the plain error model is wrong.
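Question (27) can be answered by simulation under the same kind of toy model: draw whole X series from their posteriors and count how often decreases outnumber increases. Everything here is hypothetical, assuming independent conjugate normal posteriors per time point (X_i ~ N(0, tau^2) a priori, Z_i = X_i plus normal error of sd sigma), with an invented Z series:

```python
import math
import random

random.seed(3)

def pr_decreased(z, sigma, tau=1.0, draws=5000):
    """Pr(X decreased | Z): fraction of posterior draws of the whole X
    series in which step-to-step decreases outnumber increases."""
    var = 1.0 / (1.0 / tau**2 + 1.0 / sigma**2)
    sd = math.sqrt(var)
    means = [var * zi / sigma**2 for zi in z]  # prior mean 0 assumed
    hits = 0
    for _ in range(draws):
        x = [random.gauss(m, sd) for m in means]
        down = sum(1 for a, b in zip(x, x[1:]) if b < a)
        up = len(x) - 1 - down  # continuous draws: ties have probability 0
        if down > up:
            hits += 1
    return hits / draws

# Hypothetical Z series that mostly drifts downward
z = [0.55, 0.50, 0.52, 0.45, 0.41, 0.38, 0.43]
for sigma in (0.01, 0.5):
    print(sigma, pr_decreased(z, sigma))
```

With small error the answer tracks what the plotted Z suggests; with large error it drifts toward the value a purely random series would give. Either way it never collapses to 0 or 1, which is the point of (27).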
Everything else I said about comparing probability models to physical models still holds, except that now the physical models must be augmented by some notion of the measurement error. Usually this means the physical model becomes a physical+probability model. But however it is done, none of the interpretations change. We just need to be sure we’re accounting for the measurement error.
We are finished. So if somebody now makes a claim such as “Temperatures have not increased over the last decade” we now know exactly how to verify this claim. Even better, we know how not to verify it.
Quod erat demonstrandum.