
# The Philosophy of Probability and Statistics (Book. Sort Of.)

I’m going to add another one to the stack.

I have decided to let you, dear reader, help me finish my book, which I have tentatively entitled The Philosophy of Probability and Statistics. This is about the seventeenth version of the title, so it might change again.

I’ve been working on this book piecemeal for some time, but not consistently enough (I’ve been spending more time on another one, a more popular version about over-certainty). So I decided to release it as she is, in separated segments. In a sort of fashion. Kinda sorta. More or less.

Today, an outline. Comments more than welcome.

There are chapters and fragments of chapters and bare notes floating all over my hard drive and on odd pieces of paper. Putting them up will force me to gather them into something resembling coherence.

The writing will be in LaTeX, in raw code. Maybe I’ll PDF a few, links at bottom of posts. Luckily, PPS is not a math book. Mathematics is useful to probability, and there exists a mathematical subdivision called measure theory which makes great use of it, but I am interested in probability as measures of evidence, probability as she is or should be used for real-life matters. In this sense, probability is not mathematics; therefore I don’t need as much of it as is ordinarily used. Meaning reading LaTeX code won’t be that difficult for the uninitiated.

Incidentally, there is no difference between probability and statistics, except that the latter is a name for data. So I’ll mostly use probability to mean what people usually mean by either subject.

The new category tag PPS has been added to note posts which are part of the book. Click it to see all posts (just this one so far).

Rough gross mysterious outline:

1. The way it’s done now has led to an (unnoticed) epidemic of over-certainty. Logic and probability belong to epistemology, which is the study of what we can know. Truth exists, relativism is silly but understandable, skepticism is stupid and not understandable, Gettier problems aren’t. I am not a Bayesian, but I love much of it.

2. Logic, which isn’t formal. Logic is the study of the relations between propositions. Let’s return to syllogistic logic to educate initiates. Symbolic and mathematical logic, fine things, can be saved for adepts. Math and symbolic logic are formal because they constrain the range of propositions. With freedom comes responsibility!

3. Probability, which is logic, it is its natural extension, or rather, its completion. Every result which holds for logic therefore holds for probability; thus probability isn’t formal until its propositions are constrained. Probability is not (of course it is not) relative frequency, a fallacy which mixes up epistemological propositions with ontological ones, and neither is it subjective. Beliefs, decisions, acts are not logic therefore are not probability. Probability is rarely quantifiable.

4. Causality and Induction, which is fine. Logic is not the proper language of causality, therefore neither is probability. Causality has four dimensions (formal, material, efficient, and final). Logic-probability can measure relations between causal propositions, but again beliefs etc. are not logic. Induction is fine and rational. Induction is rarely quantifiable. Grue is no problem.

5. Observational propositions, which are statistics. An observational proposition is “I saw m people in the drug group out of n become well, and r people out of s in the placebo group become well.” This is statistics as she is normally thought of. Measurements, except in exceptionally rare circumstances, and possibly not even then, are finite and discrete. Again, not all probability is quantifiable.

6. Probability models, most of which aren’t deduced, but some are. Deduced models aren’t models, but optimal and true statements of probability. Deduced probabilities aren’t well known, aren’t well developed, and will save your soul. Non-deduced, i.e. assumed, habitual, or customary, models are killing science softly and slowly and with a smile. And they lead to endless and incorrect debates about truths of models, which we know are false.

7. Over-certainty, which is parameters, p-values, hypothesis tests, estimation, credible and confidence intervals, and premature jumps to infinity. Domine exaudi orationem meam, let the Cult of Parameter end!

8. Predictive statistics, probability leakage. If you’re going to use a non-deduced model, then at least do it so it can be verified, which means use the model in a predictive sense (Bayesians say “posterior predictive distributions”).
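To make the predictive idea concrete, here is a minimal sketch (not from the book) using the kind of observational proposition from item 5: m of n patients in a drug group got well. Assuming a binomial observation model and a uniform Beta(1,1) prior, the posterior predictive probability that the next patient gets well works out to (m+1)/(n+2), Laplace’s rule of succession. The data values and function name are illustrative, not the author’s.

```python
# A hedged sketch of predictive (rather than parameter-centric) statistics.
# Assumptions: binomial observation model, uniform Beta(1,1) prior.
# Under those assumptions the posterior is Beta(m+1, n-m+1), and the
# predictive probability that the NEXT observation is a success is the
# posterior mean: (m+1)/(n+2), i.e. Laplace's rule of succession.

def predictive_prob_next_well(m: int, n: int) -> float:
    """Posterior predictive probability the next patient gets well,
    given m of n observed successes, uniform prior, binomial model."""
    return (m + 1) / (n + 2)

# Hypothetical data: 7 of 10 well on the drug, 4 of 10 on placebo.
drug = predictive_prob_next_well(7, 10)
placebo = predictive_prob_next_well(4, 10)
print(round(drug, 3), round(placebo, 3))  # 0.667 0.417
```

The point of stating the result this way is that it is about an observable (the next patient), so it can be checked against what actually happens, unlike a statement about an unobservable parameter.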

9. Models to decisions to verification. Since probabilities aren’t decisions, acts, or beliefs, to be useful they must be transformed into decisions, acts, and beliefs. Verifying probabilities is not the same as verifying decisions, since by definition probabilities are true statements and therefore not in need of verification. But decisions can be good or bad, as long as you understand what good and bad are.

10. Examples like time series, regression, and so on will be spread throughout. But maybe a special chapter with the regular suspects. There are thousands of procedures and I can’t hope to do more than a handful.

This is not a recipe book, but a starting point for somebody to write one. One step at a time!

Categories: Statistics

### 17 replies »

1. Gary says:

To start, the title “The Philosophy of Probability and Statistics” is certainly clear enough, but Levitt and Dubner know how to capture attention http://freakonomics.com/2014/04/02/think-like-a-freak-our-new-book-out-on-may-13/ Just sayin’.

So who’s the audience? And why does it need this book? Your target really determines your ammunition. Don’t end up hunting rhinos with slingshots or bunnies with bazookas. Hmm, do I sense a new title gestating?

2. Briggs says:

Gary,

Audience is not the public, but those scholars (there are a few left) interested in getting probability right.

Book should sell tens of copies. If I’m lucky.

3. Wits' End says:

Briggs:

It seems you have two books you are trying to write over the same period. I’m guessing, but there is probably some overlap in the content. Overconfidence about certainty in part flows from misunderstanding the underlying philosophy.

I’d suggest finishing the first book, dealing with over-certainty, look at the result and say, “All right, we have a fair amount of clarity on that topic, but what are the underlying philosophical mistakes that lead to the confusion?” The underlying mistakes are not all philosophical, so the sources of error need to be separated into different camps: philosophical, experimental design, misuse of statistics, and so on.

The results of that reflection (here is a list of the main philosophical mistakes and related issues) could be the foreword to the second book – PPS.

Then you could do the outline of the actual structure for PPS. With the actual structure in hand your ‘dear readers’ could be more helpful.

This is probably, there’s that word again, not the ‘dear reader’ help you were looking for but in terms of getting the work done it might be a helpful suggestion.

4. Gary says:

Briggs,

Well, how about undergraduates in STEM majors who need a better understanding of over-certainty before they are foisted upon the working world. It’s probably a lost cause to suggest journalism students could benefit …

5. Briggs says:

Wit’s, Gary,

It’s a good suggestion except that I need PPS to point to and that it’s my professional duty to get this book done.

The duty is self-imposed. I am not employed to do it.

6. In lecturing on a similar topic, I find it helpful to tell the audience that the key ideas in the lecture that I am about to deliver are those of measure, information, optimization and missing information. Later on, I point out that in the probabilistic logic, an inference has the unique measure that is called its “entropy” and that the entropy of an inference is the missing information in it for a deductive conclusion. In view of the existence and uniqueness of the measure of an inference, I say, the problem of induction can be solved by an optimization in which the entropy of each inference that is made by the model is optimized by minimization of its conditional entropy or maximization of its entropy under constraints expressing the available information. Thus, the principles of reasoning are entropy minimax say I.

7. Ray says:

I am looking forward to reading your draft. Years ago I commented on a book by Hans Camenzind when he solicited comments on the draft. I pointed out some mistakes and suggested numerous grammatical changes. He corrected the mistakes but didn’t change a word of his text. Pride of authorship I suppose. It is a very good book if you are interested in analog circuit design. http://www.designinganalogchips.com/

8. Francsois says:

I am looking forward to the book, will buy it. I will read anything that clears up this question: which real world problems can statistics/probability be used for (and how?). I have been reading your blog for the last year or so, and you have pointed out many instances where statistics are used wrongly. I am left wondering if probability and statistics can be applied to ANY real world problems, apart from solving problems with dice, roulette wheels, playing cards etc. Can we use statistics to find out if one medical treatment is better than another? How?

9. Chronus says:

I have quoted you to my son about statistics and logic. I think the 2 books are a good idea and a positive gain for western civilization. To build on Gary’s statement, a short, third book for journalists should be considered also. We have the mid-term elections coming up. You could be the next Nate Silver.

Adding two more cents of free advice, your target reader is likely a mathematician and may be unfamiliar (or rusty) when it comes to the background knowledge of philosophy. I would broadly summarize the field itself before going into logic. Perhaps you meant that in sections 1 and 2, but I read it as much narrower.

Lastly, add an appendix that lists the common fallacies. You can’t remind people of them too much.

10. Jim S says:

Over-certainty is only a problem when you demand that everyone agree with you. So many philosophical problems disappear when it’s finally grasped that you have ZERO power over what others believe or think.

Epistemologically, each of us is an island unto ourselves. See Umwelt on wiki.

11. I think such a book should be written. I can give you more examples in the MRI literature for the “over-certainty” issue. I disagree with your item 3. Probability was engendered by Pascal and Fermat to support belief in the best gambling strategy, and I believe probability is only useful as a guide to action… I’m with “Richard Jeffrey” in his “Subjective Probability”… Can you write about that in more detail, why using probability as a measure of willingness to bet is not justified?

12. Briggs says:

Bob,

Certainly you can use probability in gambling. But a probability is not a bet, it is not an action. You use probability to decide actions.

13. Matt, OK, I get your point. But if the motivation for the use is to win a bet, doesn’t that say something about its epistemological roots?

14. Briggs says:

Bob,

Only in the sense of the motivation to understand the epistemology. That (given some evidence) the probability of a proposition is quantifiable, say it equals X, says nothing at all about how one should act on that information. Except that, accepting the evidence, it is rational to believe X and irrational to doubt X.

15. Probability theory was invented by the mathematician and gambler Girolamo Cardano who, however, perished before being published. Thus, the honor of first to publish goes to others.

16. Briggs says:

Terry,

Mathematical probability first saw print because of Cardano. Probability as uncertainty has been with us since the beginning. See Jim Franklin’s Science of Conjecture.

17. William:

Thanks for the citation. I thought I’d lighten up the conversation by telling the joke about perishing before publishing.