The value of p

In biomedical research, statistics are a funny thing, and the post below touches on why. One problem I’ve seen that is bad but not necessarily fraud: many researchers learn a single statistical method from a mentor or a journal and then use only that method, e.g., Student’s t-test. A t-test is perfectly fine for comparing two groups, but with more than two groups you need to switch to an analysis of variance (ANOVA) or another appropriate method. In some fields and journals the problem is bad enough that even reviewers don’t catch the statistical flaw. However, all is not doom and gloom. Many if not all universities have a group of statisticians on hand to help you design your study, including the statistical methods, before you do the research, and many large grants want evidence of this, e.g., in a power calculation.
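To see why running pairwise t-tests across many groups is a problem, here is a minimal back-of-the-envelope sketch. It assumes the tests are independent (a simplification; pairwise comparisons on shared data are correlated), but it shows how fast the chance of at least one spurious “significant” result grows with the number of groups:

```python
# With k groups, all-pairs testing means m = k*(k-1)/2 comparisons.
# If each test has a 5% false-positive rate and we roughly treat the
# tests as independent, the family-wise error rate balloons.

def family_wise_error_rate(k: int, alpha: float = 0.05) -> float:
    """Approximate chance of >= 1 false positive across all pairwise tests."""
    m = k * (k - 1) // 2          # number of pairwise comparisons
    return 1 - (1 - alpha) ** m   # assumes independent tests (a simplification)

for k in (2, 3, 5, 10):
    print(k, round(family_wise_error_rate(k), 3))
```

With 10 groups (45 comparisons) the chance of at least one false “discovery” is about 90%, which is exactly why ANOVA, not a pile of t-tests, is the right first step.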

Here are a few older posts on statistics.

Beer + statistics (science) + history, FTW.

http://goo.gl/EC0Y37

Analysis of Meta-analysis

http://goo.gl/SGAmAu

Bad science → bad headlines

http://goo.gl/ojW213

Here are some examples of publications that don’t quite understand what significance means. http://goo.gl/AgWTRl

#ScienceEveryday  

Originally shared by Joerg Fliege

A peculiar prevalence of p-values

Now what’s a p-value?  In layman’s terms, it is a number saying that a particular hypothesis (“All sheep are black”, “All math teachers are jerks”, “Aspirin helps against cancer”, etc.) is not completely bats… crazy.  It roughly goes like this. You make up a hypothesis (see above) that you really do not like, and you gather some data (e.g. some sheep, or some math teachers). The p-value corresponding to this hypothesis and this data set then tells you how probable it is to randomly stumble upon a data set at least that extreme under the assumption that the hypothesis is true. Small p-values tell you that the hypothesis is probably wrong, which is what you wanted to show anyway.
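That “how probable is it to stumble upon this data” idea can be made concrete with a permutation test. The sketch below uses two small invented samples (the numbers are made up for illustration): under the null hypothesis the group labels don’t matter, so we shuffle the labels many times and count how often pure chance produces a mean difference at least as large as the one we observed. That fraction is the p-value.

```python
import random

random.seed(0)

# Invented data for illustration only.
control   = [4.2, 5.1, 4.8, 5.0, 4.6, 4.9]
treatment = [5.6, 6.0, 5.4, 5.9, 6.2, 5.7]

observed = abs(sum(treatment) / len(treatment) - sum(control) / len(control))

# Under the null hypothesis, labels are exchangeable: shuffle and re-split.
pooled = control + treatment
n = len(control)
trials = 10_000
extreme = 0
for _ in range(trials):
    random.shuffle(pooled)
    diff = abs(sum(pooled[n:]) / n - sum(pooled[:n]) / n)
    if diff >= observed:
        extreme += 1

p_value = extreme / trials
print(f"observed difference: {observed:.2f}, p = {p_value:.4f}")
```

Because the two made-up groups barely overlap, almost no random relabeling matches the observed difference, so the estimated p-value comes out tiny.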

There is big money in small p-values. It’s what you need to ‘show’ that a particular treatment for a particular ailment works in order to bring your pills to the market. [Insert cheap joke about big pharma and certain dysfunctions here.]

Now what’s a small p-value?

Well, Ronald Aylmer Fisher wrote the following in The Journal of the Ministry of Agriculture, in the year of the Lord 1926:

If one in twenty does not seem high enough odds, we may, if we prefer it, draw the line at one in fifty (the 2 per cent. point), or one in a hundred (the 1 per cent. point). Personally, the writer prefers to set a low standard of significance at the 5 per cent point, and ignore entirely all results which fail to reach this level. A scientific fact should be regarded as experimentally established only if a properly designed experiment rarely fails to give this level of significance.

In other words, Fisher pulled the number 1/20 = 0.05 out of his ass, er, thin air, and was honest about it. Since then, people have followed Fisher’s words as reported in the reputable Journal of the Ministry of Agriculture: not by pulling numbers out of various orifices, but by sticking to p = 0.05 as the special value that shows ‘significance’, as if it were scripture.

Now let’s have a look at p-values reported in various recent papers. Say, 3,627 of them. This is where the histogram comes from [1]. (The original paper [2] is behind a paywall. Thank you, Taylor & Francis.)

Look, too many p-values less than 0.05!  And too few above 0.05!  How peculiar, and utterly surprising!

Some possible explanations [1]:

Publication bias. Report a p-value just above 0.05? Referees will shoot you down.

Give up. Found a p-value just above 0.05? Don’t bother writing it up; see above.

Tweaking. Fiddle around with your analysis until you make it below 0.05.

Dynamic sample size. Fiddle with the sample size until you make it below 0.05.

Slice and dice. Only report p-values for ‘appropriate’ subsets of data.

Outliers. Only report outliers.
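The “dynamic sample size” mechanism above is easy to demonstrate in simulation. This is a minimal sketch, not any published analysis: the data are pure standard-normal noise (so the null hypothesis is true), yet if you peek at a simple z-test after every new batch of observations and stop the moment p < 0.05, the false-positive rate climbs well above the nominal 5%.

```python
import math
import random

random.seed(1)

def z_test_p(xs):
    """Two-sided p-value for mean == 0, assuming known unit variance."""
    z = abs(sum(xs) / math.sqrt(len(xs)))
    return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

def experiment(peek_every=10, max_n=200):
    """Collect noise, test after every batch, stop early if 'significant'."""
    xs = []
    while len(xs) < max_n:
        xs.extend(random.gauss(0, 1) for _ in range(peek_every))
        if z_test_p(xs) < 0.05:
            return True               # a false positive: the null is true!
    return False

runs = 2000
false_positives = sum(experiment() for _ in range(runs))
print(f"false-positive rate with peeking: {false_positives / runs:.1%}")
```

With 20 looks at the data instead of one, the realized false-positive rate lands in the low teens rather than at 5%, even though every individual test is computed correctly.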

One item is missing from this list: Fraud.

Caveat: the 3627 reported p-values all come from psychology journals. Be careful before you start laughing and point fingers at that particular discipline. Are you sure this stuff doesn’t happen in your neck of the woods?

[1] http://www.graphpad.com/www/data-analysis-resource-center/blog/a-peculiar-prevalence-of-p-values-just-below-051/

[2] http://www.ncbi.nlm.nih.gov/pubmed/22853650

Comments

  1. Scott Sneddon
    September 30, 2013

    Is that really surprising?  Most papers will spend more time discussing data that are “significant” than data that are not (which would give the discontinuous distribution shown in the plot).

    What is interesting is that 0.05 has become so widely accepted as indicating significance.

    Reply
  2. Chad Haney
    September 30, 2013

    Jeff G, many of us here try to do just that: take breaking science news and distill it.

    Reply
  3. Chad Haney
    September 30, 2013

    Network problems at work. I probably have to wait until I get home to respond here.

    Reply
  4. Rajini Rao
    October 1, 2013

    Your point about most universities having statisticians on hand is excellent, Chad Haney . They are really a wonderful resource to those statistically challenged among us! 

    Reply
  5. Chad Haney
    October 1, 2013

    At UofC the Cancer Center subsidizes the biostatistics core. It was a wonderful resource indeed. I used them a few times to talk about developing a new method for a grant.

    Reply
