# Which Statistics Grad School If You’re Sick Of Hocus-Pocus P-values?

I received this query from a reader and thought it important enough for all of us to answer.

I was hoping you could provide some advice for an aspiring statistician. I am an undergraduate in math preparing to apply to graduate school in statistics, and I love studying the subject and analyzing data. However, I am now convinced by the arguments against classical statistics and I fear committing to a career path where I will be forced to participate in hocus-pocus. Is there an intellectually honest way forward? For instance, are there particular statistics departments doing things right? And if I can’t get accepted into / afford any of these schools, should I just do something else?

I’d greatly appreciate any guidance you can give.

J.

Dear J,

I do not know any graduate department that will allow you to bypass classical methods. It’s true there are some departments that are greatly sympathetic to Bayesian statistics, but Bayes must always come after frequentist; even in “Bayesian” departments it’s an add-on. Anyway, many of the same mistakes in philosophy are made in Bayes as they are in frequentism.

The idea is to eschew standard statistics departments and come at the subject sideways.

You can consider going the straight mathematical route. Choose a math department which will allow you to do probability. You won’t get much data analysis that way, and there is (of course) an over-emphasis on probability as mathematics, which means you can begin to forget that most probabilities aren’t quantifiable. On the other hand, you can focus on what I think will be a huge area for observable predictive statistics: combinatorics.

Since all measurements are finite and discrete, and no measurement can in reality go to infinity, all probability is finite and discrete, or should be, which makes figuring problems exercises in combinatorics. Learn to count. You can do the math without telling statisticians that you’ll eventually use it for statistics. The mathematicians won’t care.
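To make “learn to count” concrete, here is a minimal Python sketch. It is an illustration only, not something from the reader’s letter. It assumes a hypothetical urn with `total` balls, of which `white` are white, and finds the probability of drawing exactly `k` white balls in `draws` draws without replacement purely by counting outcomes. The function name and the numbers are invented for the example.

```python
from fractions import Fraction
from math import comb


def prob_exactly_k_white(total, white, draws, k):
    """Probability of exactly k white balls in `draws` draws without replacement.

    Pure counting: (ways to pick k white and draws - k non-white balls)
    divided by (ways to pick any `draws` balls from the urn).
    """
    favorable = comb(white, k) * comb(total - white, draws - k)
    possible = comb(total, draws)
    return Fraction(favorable, possible)


# Hypothetical urn: 10 balls, 4 of them white, 3 drawn.
distribution = {k: prob_exactly_k_white(10, 4, 3, k) for k in range(4)}
print(distribution[2])             # 3/10
print(sum(distribution.values()))  # 1
```

Exact fractions are used instead of floats so that every probability stays a ratio of two counts, which is the point of the finite, discrete view.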

Applied math is a good option. But even in applied math the focus will be on parameters and parameter analysis, which we agree leads to over-confidence or fallacy. There are many mathematical tools statisticians use that you’ll only gain a passing familiarity with unless you’re in a stats department, but most of these are in “estimation” theory, which always means estimating unobservable parameters about which we should have very little interest. But if you’re a mathematician, you’ll be able to figure these out when you need them.

Another good option is physics. I’d recommend this as the ideal option, but I’m guessing you haven’t done much undergraduate work in the area, which puts you at a major disadvantage. It’s not a coincidence that many advances in predictive probability and even the philosophy of probability have come from physicists.

Physics used to emphasize observables in all its branches, but even there that trait is beginning to slide. Yet you can avoid the weirdness by staying away from string theory, multiverses and the like.

And then there’s philosophy itself. That would be the toughest place to break in, but only because not a lot of departments do much probability. Many do philosophy of science, naturally, and those would be the places to look. Philosophers are more willing on average, and not surprisingly, to welcome argument on received wisdom. This isn’t so in many other areas of science.

If you want to go to any school or department, first discover the kind of work you think you’d like and then find somebody who is doing it. Write to him and see if he needs students.

The last option is to skip school altogether and teach yourself what you need. It takes a lot of dedication, but it can be done. The problem is that your potential employers, either because they’re too lazy to assess the skills of potential employees or because they are forbidden to do so by government threats (diversity), often want certification.

What does everybody else have to say?

1. James

One thought might be to look into machine learning grad programs (CMU, perhaps?). I’ve only ever self-taught machine learning, but all the books and lessons online are pretty good. They don’t get to Briggs-level criticisms of p-values (they appear to just kind of go “these exist”), though.

What they do well, in my view, is focus a whole lot on the ability to predict and generate useful data. There’s always out of (training) sample testing that goes on, so I think that a student in such a program would get a good appreciation for the problems of overfitting and the need for a good model skill analysis.
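A minimal Python sketch of that out-of-sample habit follows. Everything in it is an assumption made for illustration: the simulated data (a straight line plus noise), the two toy models and their names are not from any particular course or library. One model is a least-squares line; the other simply memorizes the training set and answers with the nearest stored point.

```python
import random


def mse(model, xs, ys):
    """Mean squared error of `model` on the points (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys):
    """Ordinary least-squares straight line, closed form."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_memorizer(xs, ys):
    """Overfit on purpose: return the y of the nearest training x."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


rng = random.Random(0)  # fixed seed so the run is repeatable


def make_data(n):
    """Simulated data: y = 2x + Gaussian noise (an assumed toy process)."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + rng.gauss(0, 3) for x in xs]
    return xs, ys


train_x, train_y = make_data(100)
test_x, test_y = make_data(100)

line = fit_line(train_x, train_y)
memorizer = fit_memorizer(train_x, train_y)

for name, model in [("line", line), ("memorizer", memorizer)]:
    print(name, "train MSE:", mse(model, train_x, train_y),
          "test MSE:", mse(model, test_x, test_y))
```

The memorizer scores a perfect zero error on the data it was trained on, which says nothing about how it predicts. Only the held-out error is a fair measure of skill, and in setups like this one the memorizer typically does worse there than the plain line.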

Plus, some of the problems are really fun, especially machine vision. If you get into recurrent neural nets and the like you can also really amuse yourself by generating ‘random’ pieces of text from various word banks online. Some people have even done it with journal papers (LaTeX source and all!).

As an analytics executive for a large international bank, I feel like most of my peers are looking for credentials. There are a few of us here (maybe 5 of the 50 team owners) that value practical skills (programming, critical reasoning, mathematical aptitude, etc…) over credentials. I have a few PhDs on staff, but no two from the same field; and it’s on purpose! I loathe group-think, so I want very diverse approaches to problem solving.

So I generally advise my junior employees looking at grad school to look at Operations Research, Computer Science, Applied Stats, or something analytically broad that emphasizes real world implementation, because the depth you’ll acquire comes from on-the-job usage. (It’s unlikely that your PhD thesis will ever be something you’d work on outside of academia.) In my experience, you’ll learn more good habits at work by having one “good apple” who stresses sound logic, like Briggs, than you will sitting in classes discussing moments. If you can read “math text” and know how to code it up, you’ll eat well forever… just get enough credentials to get your foot in the door.

P.S. Don’t underestimate professional networking.

3. James

As an addendum to my comment, the other thing I really like about ML is that they don’t treat it as some kind of crazy statistical thing with all kinds of causal meaning. My impression has always been (and I like it because this mirrors my thinking) that statistical and ML modeling are just optimization routines, and they come along with all the caveats of optimization.

I could go on for hours about statistics as optimization and why it makes p-values meaningless, but I won’t waste your time!

4. J,

My biased comments: Your inquiry raises the question of ‘What makes a good statistician?’ Since all science/critical thinking is conducted with assumptions about metaphysics, epistemology, aesthetics and ethics, philosophy plays a key role. What statistics department appreciates the role of philosophy in the subject matter? Also, knowledgeable and skilled statisticians have a solid background in probability theory. Which institutions practice that principle and provide excellent graduate education in probability theory for statistics programs? Lastly, you might find a professor who shares your interests and contact him/her for advice. Since you will write a thesis/dissertation under a supervisor, there is no substitute for a good mentor. – Best wishes.

5. Gary

I’ll offer some general advice. Get to know some of the graduate students in your department and others that use mathematics extensively (Engineering, Economics, even some health-allied disciplines like Pharmacy and Biochemistry). They went through the same evaluation process not too long ago and can describe how they came to a decision. Some will suggest particular schools and some will give you insight into what to look for. Take it all in but realize each comes with some bias – good or bad.

6. Yawrate

Take Briggs’s advice to find a grad school friendly to Bayesian stats. Just apply your critical thinking to everything you do there. Learn both sides of the coin.

7. JH

My two cents.

If you are sick of hearing all the hocus pocus about p-values, go to a graduate school to learn about classic statistics yourself. Don’t let the strange fear stop you. (I have talked to many undergraduate students about graduate degrees in statistics programs; J’s fear is not one I have ever dealt with. It’s a bit strange.)

Indeed, learn both sides of the coin. If you know classical statistics, you will be able to learn Bayesian methods yourself. Vice versa? I am not sure.

Established graduate programs usually offer a large variety of courses in addition to basic, required common core courses in probability and distribution theory. There are courses in Bayesian analysis, which is a useful tool and computationally intensive in the study of advanced modeling. You will acquire statistical computing skills, which are in high demand now. Modeling skills are also important as one simply can’t just arbitrarily apply the binomial distribution whenever s/he feels like it.

Don’t go to a graduate school to mainly study probability unless you enjoy your undergraduate pure math courses. Advanced probability theory involves analysis heavily. Be aware that you’d be neither a statistician nor a mathematician, hence your job prospects in both industry and academia are likely to be bad.

If you have an excellent GPA (and GRE scores) in math or stat, apply to top ranked graduate schools in the area of your interest. A solid math undergraduate degree is highly valued. There is a shortage of US citizens in graduate STEM degrees, so go for it!!!

8. 1) Yes, study physics – find a university with a good experimental physics program and make that your second major field while studying theoretical math, including a major focus on probability and statistics. Remember: you can’t understand what’s wrong with the traditional ideas (frequentist and Bayes) without understanding those ideas.

2) Dr. Briggs, whom I normally agree with, is, I think, dead wrong about philosophy. It’s mostly bunk and you’ll find academic work in the field utterly boring. Come to it when you’re old and have done your own thinking and some of it might make sense, but as a student? Thousands of pages of drek for every worthwhile thought.

3) You might want to consider a school that has a hot Finance group – lots of applied stats, simulation (combinatorics), and prob with (late) real world correction built in. One caveat: if you first study mathematical physics (and there is really no other kind outside the experimental world) you won’t find finance a challenge. Finance jobs, on the other hand, (especially if you do modeling well) are easy to get and highly paid.

9. bill

Briggs hit on a very good point: find a program with good coverage in combinatorics and discrete data. That usually shows up in departments with emphasis in Nonparametrics and in Biostatistics departments. While theoryland is wonderful, the transition to the real world with its non-random samples and discrete data tends to get short-changed.

Nothing dispels the silly interpretations of p-values like (a) learning what an exact distribution is and how to compute it and (b) reading Fisher’s actual comments. I found Hoeffding’s students to be quite good at that. Philosophy is useful for things like logic and counterfactuals.

10. JohnK

Dear Student J,

Your request is alarmingly similar to a student asking Alfred Wegener where to study a geology that included continental drift; or asking Ignaz Semmelweis where to study obstetrics using the Semmelweis methods to reduce puerperal fever.

These guys were also correct. (And if you don’t know who they are, you should look them up). But it didn’t matter how correct they were.

Student J, here are the facts:
Matt is correct;
Matt already is a PhD statistician;
and Matt himself can’t get a full-time job as a statistician.

So, Matt already has everything that you are longing for — all the training, all the knowledge, all the credentials — and he himself can’t get the job you dream of having one day.

What you should do, then, seems painfully clear, distressingly obvious.

PREDICTIVE STATISTICIANS — FOLLOW THESE FIVE SIMPLE STEPS TO GUARANTEED PROFIT !!!

1. Teach YOURSELF everything Matt knows about predictive statistics and the general theory of statistics that Matt has and is developing.
2. Hold your nose and get the CHEAPEST and FASTEST relevant credential in statistics that you can. Doesn’t matter exactly what it is, or where it’s from (though ‘statistics’ in the title might be good). Because all you need is the credential itself: the letters behind your name. After all, Matt has a PhD in Mathematical Statistics from relatively prestigious Cornell, and look where that gets him.
3. Then the hard part is getting MATT a job.
4. After which, he’ll hire you as his research assistant.
5. PROFIT !!!

11. Ken

Pursue a path into Operations Research (e.g. http://www.mit.edu/~orc/ ).

That involves plenty of math (including but not limited to statistics) & analysis that can be broadly applied; there’s plenty of demand for the discipline, and within that demand are plenty of areas where honest & objective analysis is crucial to both the analyst’s and the customer’s mutual success. If, somewhere along the way, interests change, the academic preparation will remain beneficial to other technical pursuits (and a substantial proportion of accumulated credits will transfer, which is important if one’s funds are limited).

Also, NEVER forget the value of a well-rounded personality that comes from breadth of study and experience. This is truer in some professions than in others; analyzing things well is enhanced by understanding how things work, with first-hand experience being particularly valuable. Regarding that, take Frankfurter’s letter, repeated below in its entirety, and post a copy where its message will serve as a constant reminder. While this letter centers on law, the message is broadly applicable: simply cross out the word “law” if you must, and put in whatever you think you’re interested in, because the message is just as important.

Lawyers, by the way, have to analyze a lot (the particulars of what happened, the applicable law, their client’s credibility, the opposition’s analysis and how to rebut it, and their own analysis & how to present it most advantageously). If you think that advice about practicing law isn’t really applicable, consider the dopey decisions coming out of our legal system, made by attorneys having great expertise in “law” but lacking the breadth of experience to apply that niche expertise effectively… or even applying it counter-productively despite best intentions.

In May 1954, 12-year-old M. Paul Claussen Jr. of Alexandria, Va., sent a letter to Felix Frankfurter saying that he was interested in “going into the law as a career” and requested the jurist’s advice as to “some ways to start preparing myself while still in junior high school.” He received this reply:

My Dear Paul:

No one can be a truly competent lawyer unless he is a cultivated man. If I were you, I would forget all about any technical preparation for the law. The best way to prepare for the law is to come to the study of the law as a well-read person. Thus alone can one acquire the capacity to use the English language on paper and in speech and with the habits of clear thinking which only a truly liberal education can give. No less important for a lawyer is the cultivation of the imaginative faculties by reading poetry, seeing great paintings, in the original or in easily available reproductions, and listening to great music. Stock your mind with the deposit of much good reading, and widen and deepen your feelings by experiencing vicariously as much as possible the wonderful mysteries of the universe, and forget about your future career.
With good wishes,

Sincerely yours,

Felix Frankfurter

12. JH

Briggs hit on a very good point: find a program with good coverage in combinatorics and discrete data. That usually shows up in departments with emphasis in Nonparametrics and in Biostatistics departments.

bill, show me one or two such departments! Seriously, the responsibility of advising young people is great, and cannot be not undertaken lightly.

JohnK, Briggs did have a job after leaving school.

13. JH

bill, show me one or two such departments! Seriously, the responsibility of advising young people is great, and cannot be undertaken lightly. (Delete the extra “not”)

14. bill

@JH
UNC Biostatistics. I graduated in 77. Among my teachers were Dana Quade and P.K. Sen. Quade made us figure out the exact distributions for a number of non-parametric statistics and spent a lot of time showing exact derivations. My dissertation, under Kupper, was on pair-matching. No p-values or asymptotics involved (but I did use continuous distributions). Lots of quadrature though.

I would recommend that J look around for areas that interest her and find out where the authors are, and what their teachers and students have done (e.g. the mathematics genealogy pages are great for that). I chose Chapel Hill as a biology major based on recommendations from my biology and stat teachers.

15. bill

@JH,
Neglected to mention: UNC-CH had three stat departments (Biostat, MathStat, and Psychometrics). Down the road, UNC-Raleigh had a department of Biometry, and Duke had a stats department (now a Bayesian center). This offered the opportunity for lots of varying views and a broader education. UNC-Bios also had Grizzle and Koch so one got lots of grounding in discrete data.

For J, this would suggest selecting a department that has a lot of opportunities for cross-fertilization.

16. Come on Paul. Philosophy discussions at 2 AM in the dorm hall made college bearable! As bill points out, philosophy is good for logic.

JohnK: Shouldn’t you have noted that if you have no problem with fluid morals and can play follow the leader, employment should not be a problem? Following the leader may not work for J but it might in the right setting.

17. Joachim

Our (Cognitive Sciences) department doesn’t really teach classical statistics anymore, and an applied statistician with an interest in the mind could find a job there, so I imagine the list that only includes math, physics, and philosophy is a little too short.

18. Francsois

This question is very important. A closely related question, which I have asked in this forum previously but never with an answer that satisfied, is: what do researchers have to do to use statistics correctly for medical research? Many readers like myself use classical statistics not to be malicious but because we believed we were doing the right thing. Briggs points out how staples such as p-values are not to be used. So what should we use for areas such as trials, epidemiological studies, etc.?

19. bill

@Francsois
1. Get a real (bio)statistician on the team. You are a medical researcher and statistics is just as hard to master as is your field. The intro/survey stat courses for life scientists are about as informative as the intro courses you teach to non-majors in your area.
2. Pester him/her incessantly with practical questions about the interpretation and applicability of results and the medical relevance of the assumptions.
3. Make him/her a member of the team. I spent a lot more time/effort on researchers who involved me with their work than I did with those who just wanted “something significant”. As an added bonus, you will become a better user of statistics and gain a better understanding of where it helps or doesn’t help.

To your example, p-values are just fine if you understand them and are interested in what they measure, but are really misleading if you misuse them (e.g. substituting p-values for repeatability, explanatory power, or truth values). Not everything that is mapped into a zero to one range is a probability, even if it can be tortuously twisted into that framework.

20. Briggs

All,

The definition of a p-value is: given the data, given the test statistics, assuming the model and null hypothesis are true, the probability of seeing a larger test statistic (in absolute value) than the observed test statistic given an infinite repetition of the “experiment.”

Any deviation from that is misusing the p-value. Since everybody deviates from that (I’ve yet to see a paper which hasn’t), everybody misuses it.

The p-value adds no information, none, to any decision. P-values are always an act of will (this was Neyman’s critique, too).

Click “Classic posts” and search for p-value.

21. bill

Briggs,
You and I seem to have gone to different schools wrt p-values. In my world, p-values are the rank of the observed statistic among (permissible) permutations of the raw data under a specific set of assumptions (the hypothesized model). See e.g. Fisher (1929, 1935, 1936), and Freedman and Lane (1983), and Pitman’s and Kempthorne’s stuff or Good’s recent book. No infinite repetitions required, probability not required. Fisher liked talking about sampling from the permutations so that he could force a probability interpretation, but it’s not needed and, in my view, is rather pointless. (Why sample from a list when I already have the list?) All the theory stuff is just convenient approximations to that rank. (Fisher commented on that. I can dig up a quote if you like.)

This approach is appealing for agricultural, biological, biomedical, and epidemiological studies where sampling is difficult. Fisher commented on that, too.

I agree that the infinite repetition argument is just confusing to non-statisticians, like most asymptotic arguments.
Bill
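The permutation view in the comment above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the commenter’s own code: the function name is hypothetical, the statistic (absolute difference in group means) is chosen only for the example, and the “as or more extreme” convention is used when ranking the observed split among all possible splits.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means.

    Enumerates every way of splitting the pooled data into groups of the
    original sizes and reports the fraction of splits whose statistic is
    as large as or larger than the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n, n_a = len(pooled), len(group_a)
    total = sum(pooled)

    def abs_mean_diff(indices):
        sum_a = sum(pooled[i] for i in indices)
        return abs(sum_a / n_a - (total - sum_a) / (n - n_a))

    observed = abs_mean_diff(range(n_a))
    as_extreme = 0
    n_splits = 0
    for indices in combinations(range(n), n_a):
        n_splits += 1
        # Small tolerance so floating-point ties count as "as extreme".
        if abs_mean_diff(indices) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / n_splits


# Made-up data: 20 possible splits, 2 of them as extreme as the observed one.
print(permutation_p_value([1, 2, 3], [4, 5, 6]))  # 0.1
```

No sampling and no infinite repetition appears anywhere: the whole list of splits is enumerated and the observed one is simply ranked within it. Full enumeration is only practical for small samples, since the number of splits grows combinatorially.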

22. bill

Briggs,
Found the quote. In Fisher (1936) (Coefficient of Racial Likeness) he describes the permutation method and states:

Actually, the statistician does not carry out this very simple and very tedious process, but his conclusions have no justification beyond the fact that they agree with those which could have been arrived at by this elementary method.

A more recent source is
Ernst, M. E. (2004). Permutation Methods: A Basis for Exact Inference. Statistical Science, 19(4) 676-685

23. JH

Mr. Briggs,

given the data, given the test statistics, assuming the model and null hypothesis are true, the probability of seeing a larger test statistic (in absolute value) than the observed test statistic given an infinite repetition of the “experiment.”

Correction –

…the probability of seeing results as or more extreme as those actually observed …

Read this paper as to why as is essential in the definition.

24. Francsois

Thanks Bill, I appreciate your advice. Problem is, even if I do use a biostatistician, they use p-values etc. too. So the problem is still there. How does one do medical research without p-values, and what methods should we use then? I see (or think I see) what Briggs is saying, but he does not offer practical advice for the researcher. It would be great if Briggs could take a typical medical research paper and redo the stats as he thinks it should be, as an example.

25. JH

Dear Bill,

I see “combinatorics” as a branch of mathematics that studies countable discrete structures, which is different from the combinatorial math involved in finding the exact distribution of a nonparametric test statistic such as the Mann-Whitney statistic. Those nonparametric test statistics can be applied to continuous random variables. So I thought you didn’t know what you were talking about.

Yes, UNC-CH has great graduate programs in statistics. I highly recommend it.

BTW, I think the paper Fisher (1936) (Coefficient of Racial Likeness) is so, so, so properly referenced here. It is a wonderful paper for people to read about Fisher’s view of significance tests… considering that Briggs just asserted that

The p-value adds no information—none—to any decision.

LOL!

26. bill

Hi JH,
I’ve never done the full math track on Combinatorics, but you do stumble across it now and then in statistics, beyond the introductory enumerative stuff. For example: design theory, graph theory (e.g. Bayes nets, aka causal networks) and order theory (posets and other semi-orders as well as lattices in analysing preference, choice, and dominance data – probabilistic index models). One of my career regrets was not paying more attention to this in undergrad and grad school. I’m retired so it doesn’t make any diff now, other than personal curiosity.

I agree with Briggs and others that observed data is always discrete. Continuity is a simplifying assumption. My definition of p-values starts from that. I’m a “bottom-up” sort of statistician. I like to start with the data at hand and build outwards. Briggs’ p-value definition is more of a top down approach, starting with infinities and simple(r) sampling.

Thanks for the Fisher endorsement.

@Francsois
P-values are difficult to escape. (Although Ranking And Selection procedures do go in that direction.) Part of the escape is learning what they are and aren’t. (There are even ‘Bayesian p-values’.) A useful book is “What if there were no Significance Tests?”
I think Matt would love a consulting gig to rework some existing data!