P values?

Jey :crab:@dftba.club · 2 years ago

P values?

stravanasu · edited 6 months ago

Please keep in mind that p-values (and null-hypothesis testing) suffer from officially recognized intrinsic flaws:

https://doi.org/10.1080/00031305.2016.1154108 https://doi.org/10.1080/00031305.2019.1583913

That is, they have intrinsic problems even when used “correctly” (for examples see https://doi.org/10.3758/BF03194105). On top of this, they are often misinterpreted and used incorrectly.

A p-value is the probability of observing your particular sample data, or other imagined data that were never actually observed (typically results at least as extreme as yours), assuming that your null hypothesis is true and that additional hypotheses hold (such as underlying Gaussian distributions or the like), which may or may not be the case:

p-value = Pr(observed data or imagined data | null hypothesis and additional assumptions)

Because of the presence of imagined data, a p-value is not the likelihood of the null hypothesis (plus assumptions), which is Pr(observed data | null hypothesis and additional assumptions).
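To make the difference concrete, here is a minimal sketch. The coin-flip setup (12 flips, 9 heads, a fair-coin null, a one-sided test) is my own illustrative assumption and does not come from the references above; the helper `binom_pmf` is likewise just a name chosen for this example.

```python
from math import comb

def binom_pmf(k, n, p=0.5):
    """Pr(exactly k successes in n trials) with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical data: 9 heads in 12 flips, null hypothesis = fair coin.
n, k_obs = 12, 9

# Likelihood of the null: probability of the observed data only.
likelihood = binom_pmf(k_obs, n)

# One-sided p-value: the observed outcome PLUS the "imagined" outcomes
# (10, 11 and 12 heads) that never actually occurred.
p_value = sum(binom_pmf(k, n) for k in range(k_obs, n + 1))

print(f"likelihood = {likelihood:.4f}")
print(f"p-value    = {p_value:.4f}")
```

The two numbers differ only because the p-value adds probability mass for outcomes that were not observed.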

My personal recommendation is that you do your analysis using more modern methods that do not have intrinsic flaws. Many good books are out there (just one example: https://doi.org/10.1201/9780429029608). Of course any method may be misused, but with a self-consistent method we have to worry about only one problem (misuse) rather than two (misuse & intrinsic flaws).

Here is an example of the quirky, unscientific characteristics of p-values. Imagine you design an experiment this way: “I’ll test 10 subjects, and in the meantime I apply for a grant. At the time the 10th subject is tested, I’ll know my application’s outcome. If the outcome is positive, I’ll test 10 more subjects; if it isn’t, I’ll stop”. Not an unrealistic situation.

With this stopping rule, your p-value will depend on the probability that you get the grant. This is not a joke.
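One way to see this is the rough sketch below. It rests on several assumptions of mine that are not in the scenario above: each subject gives a success/failure outcome, the null is a success probability of 0.5, the test statistic is a z-score, and the p-value is computed over the whole design without conditioning on the sample size that was actually reached. The names `z_score`, `tail_prob` and `design_p_value` are made up for this illustration.

```python
from math import comb, sqrt

def z_score(k, n):
    """Standardized number of successes under the null p = 0.5."""
    return (k - n / 2) / sqrt(n / 4)

def tail_prob(n, z_obs):
    """Pr(z-score >= z_obs) for n subjects under the null."""
    hits = sum(comb(n, k) for k in range(n + 1)
               if z_score(k, n) >= z_obs - 1e-12)  # tolerance for float ties
    return hits / 2**n

def design_p_value(k_obs, n_obs, grant_prob):
    """p-value over the full design: 10 subjects if the grant is
    refused, 20 subjects if it is awarded (probability grant_prob)."""
    z_obs = z_score(k_obs, n_obs)
    return ((1 - grant_prob) * tail_prob(10, z_obs)
            + grant_prob * tail_prob(20, z_obs))

# Same observed data every time: grant refused, 8 successes in 10 subjects.
for g in (0.0, 0.5, 0.9):
    print(f"Pr(grant) = {g:.1f}  ->  p-value = {design_p_value(8, 10, g):.4f}")
```

The data are identical in all three lines of output, yet the p-value shrinks as the chance of getting the grant grows, because the set of imagined outcomes changes with it.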

This is a quote from H. Jeffreys, “Theory of Probability” § VII.7.2 (emphasis in the original) https://doi.org/10.1093/oso/9780198503682.001.0001:

“What the use of P implies, therefore, is that a hypothesis that may be true may be rejected because it has not predicted observable results that have not occurred. This seems a remarkable procedure. On the face of it the fact that such results have not occurred might more reasonably be taken as evidence for the law, not against it.”