No p-Values at *Political Analysis*?

The main methods journal of political science, *Political Analysis*, has banned p-values. From the announcement:

"In addition, Political Analysis will no longer be reporting p-values in regression tables or elsewhere. There are many principled reasons for this change—most notably that in isolation a p-value simply does not give adequate evidence in support of a given model or the associated hypotheses. There is an extremely large, and at times self-reflective, literature in support of that statement dating back to 1962. I [Jeff Gill] could fill all of the pages of this issue with citations. Readers of Political Analysis have surely read the recent American Statistical Association report on the use and misuse of p-values, and are aware of the resulting public discussion. The key problem from a journal’s perspective is that p-values are often used as an acceptance threshold leading to publication bias. This in turn promotes the poisonous practice of model mining by researchers. Furthermore, there is evidence that a large number of social scientists misunderstand p-values in general and consider them a key form of scientific reasoning. I hope other respected journals in the field follow our lead."

This is misguided in that it identifies a symptom of bad research as the problem itself. Many conclusions based on models published in political science are unreliable (I think; this is a presumption). That is mostly because the models themselves are bad: unvalidated, fragile, and opaque. Yes, evaluating support for "theory" by checking whether a coefficient in a model is "significant" is often a poor way to answer this type of question, since the answer depends on the model and on the assumptions of the variance estimator. But those are the problems, and this policy neither solves nor improves on them. Note that *Political Analysis* didn't ban confidence intervals, and a confidence interval is just an inverted hypothesis test: a 95% interval excludes zero exactly when the corresponding two-sided test rejects at the 5% level.
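To see the duality concretely, here is a minimal sketch (my illustration, nothing from the editorial) using a one-sample t-test: the 95% interval excludes a hypothesized mean exactly when the two-sided p-value for that mean falls below 0.05.

```python
# Minimal sketch of the CI/test duality for a one-sample t-test
# (toy data; nothing here comes from the Political Analysis editorial).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=0.3, scale=1.0, size=50)  # toy sample

n = x.size
mean = x.mean()
se = x.std(ddof=1) / np.sqrt(n)
tcrit = stats.t.ppf(0.975, df=n - 1)           # two-sided 5% critical value
lo, hi = mean - tcrit * se, mean + tcrit * se  # 95% confidence interval

for mu0 in (0.0, 0.25, 0.7):
    p = stats.ttest_1samp(x, popmean=mu0).pvalue
    # These always agree: rejecting at the 5% level <=> mu0 lies outside the CI.
    print(f"mu0={mu0:.2f}  p={p:.3f}  reject={p < 0.05}  outside_ci={not lo <= mu0 <= hi}")
```

Suppressing the p-value while printing the interval hides nothing; the same inference is one subtraction away.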

"Model mining" (which reminded me of Ted Cruz calling basketball hoops "basketball rings") is not inherently bad and should not be discouraged. It should be made explicit, and its statistical properties studied. This is an entire field/class of methods (machine/statistical learning) that he just casually dismisses. Yes, doing model selection via an informal search for small $p$-values for certain "effect" estimates has poor statistical properties. This has been well known for decades and is irrelevant.

Since the original editorial, PA has tweeted:

"Important clarification to PA's recently updated p-value policy: in the case of design based experiments and related procedures (permutation tests, asymptotic approximations, etc.) p-values are appropriate and may be supplied."

What? Aren't we nearly always making an asymptotic approximation to the sampling distribution of a statistic? The p-value attached to an ordinary regression coefficient rests on exactly the kind of approximation the tweet permits, so this clarification makes very little sense.
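To see how thin the line is, here is a small sketch (my example, not PA's) comparing a "permitted" permutation p-value with the ordinary asymptotic t-test p-value for a two-group comparison; the two are typically close.

```python
# Two p-values for the same two-group comparison: the asymptotic
# t-test p-value and a design-based permutation p-value (toy data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
treated = rng.normal(loc=0.5, size=30)   # toy treated outcomes
control = rng.normal(loc=0.0, size=30)   # toy control outcomes

# Asymptotic p-value from a Welch t-test.
p_asym = stats.ttest_ind(treated, control, equal_var=False).pvalue

# Permutation p-value: reshuffle treatment labels under the sharp null
# of no effect and recompute the difference in means each time.
pooled = np.concatenate([treated, control])
observed = treated.mean() - control.mean()
B = 10_000
extreme = 0
for _ in range(B):
    perm = rng.permutation(pooled)
    diff = perm[:30].mean() - perm[30:].mean()
    extreme += abs(diff) >= abs(observed)
p_perm = (extreme + 1) / (B + 1)  # count the observed labeling itself

print(f"asymptotic p = {p_asym:.4f}, permutation p = {p_perm:.4f}")
```

If the second number is admissible and the first is not, the policy is drawing a line through the middle of a single quantity.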

p-values are quite sensible when used appropriately. If the quality of our methods is poor (which, I'd argue, it often is), then our reviewers need technical help and our students need better education. Banning p-values is not going to do anything useful.