First, and most strongly, probability need not have anything to do with data. For example, we can compute a value for the probability that “Matt wears a hat” given the assumed evidence “Most (but not all) men of superior wisdom wear hats and Matt is a man of superior wisdom.” The probability that “Matt wears a hat” is a range: less than one and more than 1/2 (because of our tacit premise that most means more than half but not all).
Data can be, and often are, necessary, but not always. When we do take data, they form additional premises, or evidence, in our chain of argument.
Now, the purpose of taking data is usually to learn about a general situation. For example, a poll is used to sample a small group of people, and this sample will be used to infer the responses of the larger group of people who were not polled.
Since we don’t know with certainty what this larger group will say, we are uncertain. And we can quantify our uncertainty using a probability model, which we assume is true, and by using the evidence of our observations on the smaller group.
That is, we want to say something like this: “Given that this probability model is true, and given the observations I have taken, the probability that somebody from the larger group will answer question one ‘Yes’ is p”, where p is the result of a calculation which depends on the model.
Does this make sense to you? We could change the latter part of the sentence to “the probability that each of ten people from the larger group” or “the fraction of the larger group” and so on. In each case, we ask something about data that are in principle observable, but that we have not yet observed.
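To make the poll example concrete, here is a minimal sketch of such a calculation. It rests on assumptions I am supplying for illustration, not on anything fixed above: the model is binomial, every value of the unknown “Yes” fraction is treated as equally likely before the poll, and the poll counts (12 “Yes” out of 20) are made up. The function names are hypothetical too. Change the model and p changes with it.

```python
from fractions import Fraction
from math import comb, factorial


def prob_next_yes(yes: int, n: int) -> Fraction:
    """P(one not-yet-polled person answers Yes | `yes` of `n` polled did).

    ASSUMPTION: binomial model with a uniform prior on the Yes fraction,
    which gives Laplace's rule of succession.
    """
    return Fraction(yes + 1, n + 2)


def prob_k_of_m_yes(k: int, m: int, yes: int, n: int) -> Fraction:
    """P(exactly k of m not-yet-polled people answer Yes | the poll).

    Same assumed model; this is the beta-binomial predictive distribution,
    written with factorials because all the counts are integers.
    """
    numerator = factorial(k + yes) * factorial(m - k + n - yes) * factorial(n + 1)
    denominator = factorial(m + n + 1) * factorial(yes) * factorial(n - yes)
    return comb(m, k) * Fraction(numerator, denominator)


# Made-up poll: 12 of 20 people answered Yes to question one.
p_one = prob_next_yes(12, 20)
p_all_ten = prob_k_of_m_yes(10, 10, 12, 20)
print(f"P(next person says Yes)    = {float(p_one):.3f}")
print(f"P(each of ten says Yes)    = {float(p_all_ten):.5f}")
```

Both answers are statements about people who have not yet been polled, which is exactly the kind of question posed above.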
Another scenario (from this year’s class): We want to know how years of education predict starting salary for individuals in a given field. Presumably, the more education the greater the salary, at least on average.
So, we go out and survey a group of individuals and ask them their years of education and their starting salary. Then we can answer questions like this: “Given that the probability model which relates education to salary is true, and given the observations I have taken (and assuming nobody lied), and given that a new person I have not yet surveyed has x years of education, what is the probability that his salary will be greater than y?”
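Here is one way that question could be answered in code. Treat it as a sketch under loudly stated assumptions: I assume a straight-line model with normal errors and the usual flat reference prior, under which the salary of a new person follows a Student t distribution with n - 2 degrees of freedom. The survey numbers are invented for illustration, and the function names are hypothetical.

```python
import math


def t_cdf(t: float, df: int, steps: int = 2000) -> float:
    """CDF of Student's t, by Simpson's rule on the density (stdlib only)."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u: float) -> float:
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    # Integrate the density from 0 to |t|, then use symmetry about 0.
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + math.copysign(area, t)


def prob_salary_exceeds(x_new, y, xs, ys):
    """P(salary of a new person with x_new years of education > y | data).

    ASSUMPTION: salary = a + b * years + normal error, flat reference prior.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (v - y_bar) for x, v in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((v - intercept - slope * x) ** 2 for x, v in zip(xs, ys))
    df = n - 2
    s = math.sqrt(rss / df)
    # The predictive scale includes uncertainty in the fitted line AND the
    # person-to-person scatter, so it is wider than the line's own error.
    scale = s * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)
    center = intercept + slope * x_new
    return 1 - t_cdf((y - center) / scale, df)


# Invented survey: years of education and starting salary (thousands).
years = [12, 14, 16, 16, 18, 20]
salary = [38, 45, 52, 50, 61, 66]
p = prob_salary_exceeds(17, 60, years, salary)
print(f"P(salary > 60 | 17 years of education) = {p:.3f}")
```

The answer is a probability about an observable salary, not a statement about a slope or any other unobservable parameter.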
Does this make sense to you? Once more, I move from taking data to making probabilistically quantified predictions about data I have not yet observed. This is the primary goal of statistics. These are the only questions of interest to most of us. These questions are what we want to know.
But we never teach students how to answer these questions! Whether the instructor is a frequentist or a Bayesian, he will not teach students what they really want to know.
Instead, professors will bedevil them with “hypothesis testing”, “statistical significance”, and talks of “priors” and “posteriors”. They will set students to calculating dozens of equations, none of which are useful in answering the questions the student has.
Not only that, but since the student wants to know about observable data, he will assume that what he has learned about “hypothesis testing” tells him about observable data. He will either go away hating statistics, or will be completely confused about what he has learned and will thus make mistake after mistake in interpreting his results.
In short, if he is not downright stupefied, he will at least be far more certain than he has a right to be in all of his statistical judgments. This is so because all those older methods (“significance”, “posteriors”, etc.) will produce results which will seem certain, but which are not with respect to actual observations.
Now, this sad state of affairs is so for many reasons, one of which is inertia. Academic statisticians know how to talk about observable data, but they do not because nobody else does. If they tried, they’d have to change the entire system, which was set up to teach the old way.
Another is that almost all statisticians view themselves as mathematicians: indeed, many of them are. But statistics and probability, when applied to quantifying uncertainty in real-world applications, have nothing to do with math. Applied probability is no more a branch of math than is physics, or chemistry, or automotive engineering, and so on. Math is useful in all these areas, but the point of each of them is to learn something about the real world. They are not there to learn to do math.
Stick around to learn how to talk about quantifying uncertainty in real observable data.