From Bill Raynor:
You like to discuss P(Y|M) a lot, but haven’t spent much time talking about the practical construction of that.
A topic I’d like to see: a constructive development of P(Y|X) for some real problem involving real objects…Every time I’ve chatted with a Bayesian about priors, I get a lot of handwaving and mathematical idealization (hello, mathematical Platonism) but very little in the way of real examples. The resulting math is very pretty and elegant, yadda, yadda. The part where I ask “what does that physically imply…” gets rather vague, quickly….
I have in mind something rather Kolmogorovian:
1. define a finite reference set of objects and measurements on those objects (e.g. body weights usually have a body attached…) You can use a finite sampling frame if you wish.
2. define a set of observable propositions on those objects and measurements (mutually exclusive and exhaustive, no infinities or absolute continuity, etc.) including the means of measurement. (e.g. the Brewers beat the Yankees, the mean difference between two (blocked) partitions of objects.) If it involves means, show how object/measurements really are additive — e.g. weights of grain from a field plot.
3. Assign an additive measure, and hence a probability, to those propositions.
There are innumerable practical uses of Pr(Y|X). The worn stock examples were made for this. Let X = ‘This interocitor must take one of n states’, then if Y = ‘This interocitor is in state s = i’, we have Pr(Y|X) = 1/n.
This is as practical as it gets. Casinos use it; everybody does. And without priors! Most probability is done in this uncomplicated way. Pure probability, no formal models, no parameters.
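The interocitor calculation can be written out directly, assuming nothing beyond counting states; a minimal sketch (the function name is mine, purely for illustration):

```python
from fractions import Fraction

def pr_in_state(states, target):
    # X says the device must be in exactly one of `states`, and X gives
    # no reason to favor any state over another, so each counts equally:
    # Pr(Y|X) = (states matching the target) / (total states).
    return Fraction(sum(1 for s in states if s == target), len(states))

# A six-state interocitor: Pr('in state 3' | 'one of 6 states') = 1/6
print(pr_in_state(range(6), 3))
```

No prior, no parameters, no model: the probability is deduced from X by counting.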
Now let’s do your Kolmogorovian scheme to see it’s exactly the same thing. Take two blocked objects, A and B, on which finite, discrete measurements are taken, as all measurements are.
Next, since you’re interested in the means, or presumably some function of the means, we can compute A’ and B’, which are the means (also discrete and finite). We want Y = f(A’,B’).
Finally, we gather X, evidence probative about Y, call it some additive measure if you like, and compute Pr(Y|X).
See, I told you it was the same.
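The three steps can be sketched with made-up numbers (the measurement sets, and taking Y = ‘mean(A) > mean(B)’, are my invention, purely for illustration): the evidence X lists the finite possible values of each measurement, and Pr(Y|X) comes from enumerating every combination.

```python
from fractions import Fraction
from itertools import product
from statistics import mean

# Hypothetical X: each of two measurements on A is one of {1, 2, 3},
# each of two on B is one of {2, 3}, all combinations counted equally.
a_values, b_values = [1, 2, 3], [2, 3]

# Enumerate every possible (A1, A2, B1, B2); compute the means A', B';
# Y = 'mean(A) > mean(B)'. Pr(Y|X) = favorable / total.
outcomes = list(product(a_values, a_values, b_values, b_values))
favorable = sum(1 for a1, a2, b1, b2 in outcomes
                if mean([a1, a2]) > mean([b1, b2]))
pr_y_given_x = Fraction(favorable, len(outcomes))
print(pr_y_given_x)
```

Finite objects, discrete measurements, means, an additive count over propositions: still just Pr(Y|X).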
You can make it more complicated, but that doesn’t change the end result. Make it complicated by making X into math.
It could be, and it would be ideal, that you can deduce from X the probability of Y, as we did with the interocitor. The strategy is to consider the nature of the measurements. I give some examples in Uncertainty. Mostly, folks are too anxious to wait for the deduction, which won’t be simple, and so start cramming ad hoc models into X.
Any number of models are used, most of them continuous approximations. There could be one model for A, another for B, which then implies some sort of model for A’ and B’, which in turn implies a model for Y.
All these models are usually parameterized. These parameters, being part of the models, are just another part of X. The models of the parameters, the priors, are also part of X.
Any past observations, if any, are also part of X. In the end, still Pr(Y|X).
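A minimal sketch of how this looks, with invented numbers throughout: a discrete parameter, a prior over it, and past observations all sit inside X, and Pr(Y|X) falls out with the parameter summed away.

```python
from fractions import Fraction

# Hypothetical model in X: a binary outcome with chance theta, where
# theta is one of 1/4, 1/2, 3/4. The prior over theta is also part of X
# (uniform here, by assumption), as are the past observations.
thetas = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
prior = {t: Fraction(1, 3) for t in thetas}
past = [1, 1, 0]  # past observations: also just part of X

def likelihood(t, data):
    # Probability of the observed sequence given theta = t.
    p = Fraction(1)
    for d in data:
        p *= t if d else (1 - t)
    return p

# Update the prior with the past observations (Bayes).
evidence = sum(prior[t] * likelihood(t, past) for t in thetas)
posterior = {t: prior[t] * likelihood(t, past) / evidence for t in thetas}

# Pr(Y|X) for Y = 'the next outcome is a success': the parameter is
# summed out, leaving only Y and the whole of X.
pr_y = sum(posterior[t] * t for t in thetas)
print(pr_y)
```

The model, the prior, and the data never appear on their own in the answer; they are all absorbed into the X of Pr(Y|X).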
“Why no specifics, Briggs? We want concrete examples.”
Did you try the many examples in the class? Lots of common cases. There’s no general solution, but there are many, many particular ones.
“Not yet. I saw them, but didn’t try. Too busy. What really worries me are the priors. They’re so much nonsense.”
Sure, like the ad hoc models themselves, to which the priors are related. But they might, if you do it right, be reasonable nonsense, good approximations.
Here’s the idea. Start with a model for A and B, or start with A’ or B’, or even start with Y. It makes no difference. In the end we still get Pr(Y|X). If there are parameterized models, and you want to try different parameters, or different models with their different parameters, then you have X_1, X_2, X_3, …, each of these the full X for that choice of model and parameter uncertainty; including, of course, any past observations, and other evidence that went into suggesting the models used.
Which is “best” depends on the uses to which you put the model, the decisions you make with it. X_17 might be great for you, lousy for me; maybe I prefer X_2. Who knows?
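One way to see the “best for you, lousy for me” point, with made-up numbers throughout: score each candidate Pr(Y|X_i) against later observations of Y under different losses, since different decision-makers care about different losses.

```python
from math import log

# Hypothetical: two evidence sets X_1, X_2 (different models, priors,
# etc.) yield different Pr(Y|X_i); outcomes of Y observed afterward.
candidates = {"X_1": 0.7, "X_2": 0.4}  # Pr(Y|X_i), invented
observed_y = [1, 0, 1, 1]              # later outcomes, invented

def brier(p, ys):
    """Mean squared error of the forecast probability."""
    return sum((p - y) ** 2 for y in ys) / len(ys)

def log_score(p, ys):
    """Mean negative log probability assigned to what happened."""
    return -sum(log(p) if y else log(1 - p) for y in ys) / len(ys)

for name, p in candidates.items():
    print(name, round(brier(p, observed_y), 3),
          round(log_score(p, observed_y), 3))
```

Neither score is privileged; which X_i wins depends on which loss matches the decision you will actually make with the model.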
None of these are the true model, the cause of Y. If we knew the cause, we wouldn’t be worried about all this other nonsense. And the model isn’t true because we haven’t deduced it like we did with the interocitor.
“This is not a satisfying answer.”
Is it not? It is the true answer. It’s complete. There was no point going on about specific examples (which are in the class anyway). The idea is what counts.