The answer is yes. Or, if you prefer, no. It depends. Here’s a long query about the subject from friend-of-the-blog Kevin Gray.
Over the years many have disagreed about many things with the man some feel was the Founding Father of Modern Statistics.
Part of this has surely been a reaction to R.A. Fisher’s frequently caustic ways of interacting with colleagues, but the discipline itself has developed extraordinarily since he was active.
For instance, suppose in an agricultural study we want to test the main effects and interactions of soybean variety, amount of watering, amount of fertilizer, and soil compaction on yield. The design was structured (rotated) so that soil composition beneath and surrounding the plants would not bias the results.
In this example, there would essentially be no heterogeneity in any of the predictors. That is, the water and fertilizer would be identical across all cells, as would soil compaction (which would be performed by a trained professional with a mechanical device). The seeds for the different varieties should show essentially no variation within cell, either.
There is no true sampling error to account for. Do we really need ANOVA?
Rather than assuming that our experiment was “sampled” from a hypothetical population of identical experiments, wouldn’t it make more sense to set a criterion for effect size in advance, and if the experiment meets or surpasses this threshold, to repeat it?
If the results of the trials differ more than slightly, this means the methodology was not replicated exactly, a possibility ANOVA does not address.
We may wish to estimate the main effects and interactions of the predictors (variety, watering, fertilizer, etc.) via multiple regression, perhaps with a Bayesian hierarchical approach. I would focus on the point estimates and ignore the confidence/credible intervals.
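A minimal sketch of the point-estimates-only approach Gray describes, assuming a made-up two-factor slice of the design (watering and fertilizer at two levels each) and plain least squares in place of the full Bayesian hierarchical model; every number here is invented for illustration:

```python
import numpy as np

# Hypothetical yields from a 2x2 factorial (watering x fertilizer),
# two runs per cell. All numbers are invented for illustration.
water = np.array([0, 0, 0, 0, 1, 1, 1, 1])
fert = np.array([0, 0, 1, 1, 0, 0, 1, 1])
yields = np.array([40.0, 41.0, 48.0, 47.0, 50.0, 51.0, 63.0, 62.0])

# Design matrix: intercept, two main effects, and their interaction.
X = np.column_stack([np.ones_like(water), water, fert, water * fert])

# Point estimates only: no standard errors, no intervals.
beta, *_ = np.linalg.lstsq(X, yields, rcond=None)
print(beta)  # [intercept, water effect, fertilizer effect, interaction]
```

The fitted coefficients are the point estimates of the main effects and the interaction; nothing here manufactures a standard error or an interval.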
Similarly, in observational studies when we perform causal modeling on population data, as is common in many econometric applications, point estimates are what matters. A t-test to see if a linear trend (for example) differed from zero would be meaningless.
So, my non-scholarly stance is that Fisher was wrong in this case.
The problem has always been, and still is in spite of recent attempts, a confusion about cause. The classical view is schizophrenic.
Gray is right: if we assume the only causes of yield are variety, watering, fertilizer, and soil compaction, then the only causes are variety, watering, fertilizer and compaction. Those are it. There are no others. As in none. Including imaginary ones. None. By assumption. That assumption is the key move.
There is therefore no such thing as “sampling variability.” And no need for ANOVA (which is basically regression), or for any other kind of statistical-probability model. All we have to do is set the levels of all those causes, wait for the beans to do their thing, and measure them.
We still have to assume the measurement process itself has no error. This is often a perfectly fine assumption. Go count the number of licensed working cars in your parking lot. I bet most of you, save those well into their third mimosa, this being breakfast time, arrive at a firm, reliable, error-free number.
Putting a ruler up to a speedy subatomic particle is a different question, and so is asking a person how fretful they are on a scale of sqrt(-32) to 18.7. The act of measurement is part of the system. We won’t worry about that problem here.
If we’re right about these four causes being the sole causes, then after we measure yield at a specific setting of those causes, we’re done. Finis. And there should be no deviation from the measured yields if we redo the growing with the same cause-settings.
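To make the logic concrete, here is a toy sketch in which yield is written as a pure function of the four assumed causes (the formula is entirely invented). If the assumption that these are the only causes holds, repeated runs at the same settings cannot differ:

```python
# Toy model: if variety, water, fertilizer, and compaction are the ONLY
# causes, yield is a pure function of them. The formula is invented.
def bean_yield(variety, water, fertilizer, compaction):
    base = {"A": 40.0, "B": 45.0}[variety]
    return base + 3.0 * water + 2.0 * fertilizer - 1.5 * compaction

# Two runs at identical cause-settings: the yields must be identical.
run1 = bean_yield("A", water=2, fertilizer=1, compaction=1)
run2 = bean_yield("A", water=2, fertilizer=1, compaction=1)
assert run1 == run2  # no deviation is even possible
```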
Ah, but what if there is a deviation? Two or more runs of the same settings of the four assumed causes and two or more different yields? Assuming no measurement problems, then necessarily there must be at least one more cause than our four assumed causes.
And, even worse, it could mean that one, or even all, of the four assumed causes aren’t causes at all. Pause and reflect on this.
All this must be so because if there are only four causes, then necessarily we should get the same measurement each and every time. If the measurements differ, then our assumption is falsified.
Now it could be that when we repeat runs the measurements differ, but always by some trivial amount (we can discuss another day what to do if this amount is constant, predictable, or unpredictable). There are still other causes operating we haven’t assumed, or errors in the ones we did assume, but since the departures are not important, we don’t care. And again ANOVA isn’t needed.
ANOVA (which is to say, some probability model) is only needed if, at any repeated combination of cause-settings, the measurements differ importantly. It’s easiest to think of this with just one assumed cause. Pick any. How about watering.
For instance, with water set to “level 1” (whatever this might be), we see “yield 1” one time and “yield 2” another time (where the two yields are importantly different). Clearly there is another cause besides watering, or the watering isn’t a cause at all. This disjunction is a logical deduction from this information alone, using no outside information (such as what plants without water do).
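The same toy sketch shows the deduction. Suppose a hidden, unassumed cause (labeled “sunlight” here purely for illustration) also operates while water is held at level 1:

```python
# Toy sketch: water is the assumed sole cause, but a hidden cause
# (here called "sunlight"; entirely invented) also operates.
def observed_yield(water, sunlight):
    return 40.0 + 3.0 * water + 2.0 * sunlight  # sunlight is unknown to us

# Two runs at water level 1; sunlight differs, unseen by the experimenter.
y1 = observed_yield(water=1, sunlight=0)
y2 = observed_yield(water=1, sunlight=2)
assert y1 != y2  # same assumed cause-setting, different yields
```

The experimenter, seeing only the water setting, can deduce that either another cause exists or water is not a cause at all; the code merely acts that situation out.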
There is now uncertainty in the eventual measurement. We express uncertainty using probability, quantified or not. The uncertainty is because we do not understand the cause of the measurements. We think the cause is water, based on information outside this system, but we know, because of differing measurements, there must be at least one more cause at work. It could be many causes. We don’t and can’t know.
If we did know, we’d be back in the first situation, and probability isn’t needed.
Therefore, the only time we need probability is when we don’t understand cause.
This is why probability can’t be used to discover cause, in spite of the many burgeoning claims that say it can. Whenever a claim like this is made, it is always because somebody has snuck in an outside premise having nothing to do with the measurement system at hand.
We learn cause by induction. We know water is a cause in plant growth because of induction. In our experiment, we start the problem by assuming water is a cause. We don’t back it out: we begin with it. We deduce (a weaker form of knowledge, in the scale of things) that another cause or causes must exist because we get different measurements at the same setting of water.
We can’t induce what these causes are with just the information provided. And we don’t need to, as long as we’re comfortable using probability to deal with the uncertainty in the unknown cause or causes.
Which probability model to use, should one demand quantification of uncertainty, is an entirely separate matter. I’ll leave off here saying the model is usually picked by custom, and for scarcely any other reason.
But at no point, ever, does probability come alive, become some real thing, become a cause or some strange physical thing. Probability stays in the mind, as part of your understanding only.
That’s why it’s strange that the unknown or mistaken causes (when we have differing measures) are said to be probability, and called “error”. It’s not so wrong to use the word error, because it signals a mistake in our thinking, but to say, as they do say, that the error is “normal” (or follows some other probability distribution) is to say that probability becomes a cause. Which is absurd.
So, yes, Fisher was right and Fisher was wrong. It depends on the precise question asked.