When testing multiple hypotheses, what does it mean when there are not enough extremes? [closed]

Suppose you are testing a large number of hypotheses, say a million. Unlike the usual situation where you have a lot of very small p-values, in this case all of your p-values are greater than 5%.



What does that imply, and what's the best way to handle something like this?







asked Aug 12 at 4:38 by badmax · edited Aug 12 at 6:28 by kjetil b halvorsen

closed as unclear what you're asking by Martijn Weterings, mdewey, Michael Chernick, kjetil b halvorsen, whuber♦ Aug 12 at 17:10


  • If tests are independent it's possible that there's an issue with the assumptions. If the tests are sufficiently dependent, then it may not indicate any problem. Your question is a bit light on details -- and the details may provide clues that would be needed to give a useful answer.
    – Glen_b♦
    Aug 12 at 5:37

1 Answer

As the sample size ($n$) grows, hypothesis tests gain power and even tiny effects eventually become significant. However, as the number of independent hypothesis tests ($k$) grows, each individual test still behaves the same.



The problem of multiple testing is that the overall chance of a false positive becomes deceptively large: among multiple tests, the chance of at least one false positive exceeds the significance level ($\alpha$). This is why you should apply a multiple-testing correction.



Suppose the effects/differences you are testing for are simply not present in the population, or they are so infinitesimally small that you cannot detect them with your current hypothesis tests. This is essentially what you assume when applying a Bonferroni correction: there are no true effects, so every test can only produce a false positive. There are then $k$ potential false positives and a chance of $1 - (1 - \alpha)^k$ of at least one false positive.
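
To make that arithmetic concrete, here is a minimal sketch in Python (my own addition, not part of the original answer) that evaluates the family-wise error rate $1-(1-\alpha)^k$ and the corresponding Bonferroni-adjusted per-test threshold $\alpha/k$ for a few values of $k$:

```python
# Sketch: family-wise error rate under k independent true-null tests,
# and the Bonferroni-adjusted per-test threshold alpha / k.
alpha = 0.05

for k in (1, 10, 100, 1_000_000):
    fwer = 1 - (1 - alpha) ** k   # P(at least one false positive)
    bonferroni = alpha / k        # per-test threshold that keeps the FWER <= alpha
    print(f"k = {k:>9,}: FWER = {fwer:.6f}, Bonferroni threshold = {bonferroni:.2e}")
```

For $k = 10^6$ the family-wise error rate is numerically indistinguishable from $1$, which is why an uncorrected threshold of $0.05$ is meaningless at this scale.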




So what does it mean when you don't observe extremely small $p$-values? Under the null hypothesis, the $p$-value is uniformly distributed, so even if there are no true effects you would expect the number of values close to $0$ to increase with the number of tests, since you would essentially be drawing $k$ numbers from $\mathsf{Unif}(0,1)$.



If you are running a very large number of tests and don't conclude any nominally significant differences (uncorrected), then perhaps your test is not powerful enough, or your tests are not actually independent. However, if roughly $\alpha \cdot 100\%$ of your $p$-values are nominally significant, then nothing strange is going on. (In your example, you would expect about $50,000$ $p$-values below $0.05$.)
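
As a rough check of that expectation, here is a small simulation sketch (my own addition; it assumes NumPy and SciPy are available) that generates a million test statistics under the null and counts the nominally significant $p$-values:

```python
# Sketch: p-values from one million independent tests with no true effect.
# Under H0, each p-value is Unif(0, 1), so about alpha * k of them fall below alpha.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
k, alpha = 1_000_000, 0.05

z = rng.standard_normal(k)          # z-statistics from pure noise (H0 true everywhere)
p_values = 2 * norm.sf(np.abs(z))   # two-sided p-values

print((p_values < alpha).sum())     # expected around alpha * k = 50,000
```

A count near $50,000$ is what independent null tests look like; seeing essentially no $p$-values below $0.05$ would instead point to low power or strong dependence, as discussed above.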




Lastly, as for what to report: it might be more interesting to give a set of confidence intervals (or credible intervals), so you can say something about the effect sizes. Alternatively, if your sample size is indeed large and you want to demonstrate that there are no meaningful effects, you should run equivalence tests instead.
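
For the equivalence-testing route, one common choice is the two one-sided tests (TOST) procedure. The following is only an illustrative sketch, not the answer's own recipe: the equivalence margin of $\pm 0.1$ is an arbitrary assumption, and it uses `scipy.stats.ttest_1samp` with the `alternative` argument (SciPy >= 1.6):

```python
# Sketch: two one-sided tests (TOST) for equivalence of a single mean to 0,
# with an assumed equivalence margin of +/- 0.1.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(loc=0.0, scale=1.0, size=5_000)   # hypothetical large sample
low, high = -0.1, 0.1                            # assumed equivalence bounds

# TOST null hypothesis: the true mean lies outside [low, high].
p_lower = stats.ttest_1samp(x, low, alternative='greater').pvalue
p_upper = stats.ttest_1samp(x, high, alternative='less').pvalue
p_tost = max(p_lower, p_upper)   # equivalence is supported only if both one-sided tests reject

print(f"TOST p-value: {p_tost:.4f}")   # small p => mean lies within the margin
```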




To elaborate on what Glen_b alluded to in the comments:



If your tests are not actually independent, then neither are your $p$-values. In other words, your collection of $p$-values only behaves like an independent sample from a uniform distribution if you (1) repeatedly draw samples from the same population and test the same (true) null hypothesis, or (2) perform independent tests of different effects. A simple, albeit somewhat contrived, example: perform the very same test multiple times. Every $p$-value is then identical and may well be above the significance threshold.
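
To illustrate that extreme case of dependence (again my own sketch, not from the original answer), compare repeating the very same test with running genuinely independent tests:

```python
# Sketch: perfectly dependent "tests" produce identical p-values, so it is entirely
# possible that none of them falls below 0.05, even with many tests.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
sample = rng.normal(size=30)                    # one sample under H0 (true mean 0)

# Perfectly dependent: the same test repeated 1,000 times.
p_same = stats.ttest_1samp(sample, 0.0).pvalue
dependent = np.full(1_000, p_same)              # 1,000 identical p-values

# Independent: 1,000 fresh samples, one test each.
independent = np.array([
    stats.ttest_1samp(rng.normal(size=30), 0.0).pvalue for _ in range(1_000)
])

print((dependent < 0.05).sum())    # either 0 or 1,000, depending on the single draw
print((independent < 0.05).sum())  # around 0.05 * 1,000 = 50
```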






answered Aug 12 at 5:15 by Frans Rodenburg · edited Aug 12 at 15:23

  • "If you are running a very large number of tests and don't conclude any nominally significant differences (uncorrected), then perhaps your test is not powerful enough" But wouldn't the p-values still be uniformly distributed in this situation, if there is a null effect? I'm just having a hard time wrapping my mind around there being a clustering of p-values above a threshold, since null results would imply a uniform distribution. I haven't done any simulations of this, so I may very well be wrong.
    – Mark White
    Aug 12 at 14:53










  • I'm not sure what you mean, but you would expect $(1-alpha)cdot100%$ of the tests to have a $p$-value below $alpha$. You don't need any 'clustering' for that, that's just $frac1alpha$ of the uniform distribution from $0$ to $1$.
    – Frans Rodenburg
    Aug 12 at 15:00
















  • "If you are running a very large number of tests and don't conclude any nominally significant differences (uncorrected), then perhaps your test is not powerful enough" But wouldn't the p-values still be uniformly distributed in this situation, if there is a null effect? I'm just having a hard time wrapping my mind around there being a clustering of p-values above a threshold, since null results would imply a uniform distribution. I haven't done any simulations of this, so I may very well be wrong.
    – Mark White
    Aug 12 at 14:53










  • I'm not sure what you mean, but you would expect $(1-alpha)cdot100%$ of the tests to have a $p$-value below $alpha$. You don't need any 'clustering' for that, that's just $frac1alpha$ of the uniform distribution from $0$ to $1$.
    – Frans Rodenburg
    Aug 12 at 15:00















"If you are running a very large number of tests and don't conclude any nominally significant differences (uncorrected), then perhaps your test is not powerful enough" But wouldn't the p-values still be uniformly distributed in this situation, if there is a null effect? I'm just having a hard time wrapping my mind around there being a clustering of p-values above a threshold, since null results would imply a uniform distribution. I haven't done any simulations of this, so I may very well be wrong.
– Mark White
Aug 12 at 14:53




"If you are running a very large number of tests and don't conclude any nominally significant differences (uncorrected), then perhaps your test is not powerful enough" But wouldn't the p-values still be uniformly distributed in this situation, if there is a null effect? I'm just having a hard time wrapping my mind around there being a clustering of p-values above a threshold, since null results would imply a uniform distribution. I haven't done any simulations of this, so I may very well be wrong.
– Mark White
Aug 12 at 14:53












I'm not sure what you mean, but you would expect $(1-alpha)cdot100%$ of the tests to have a $p$-value below $alpha$. You don't need any 'clustering' for that, that's just $frac1alpha$ of the uniform distribution from $0$ to $1$.
– Frans Rodenburg
Aug 12 at 15:00




I'm not sure what you mean, but you would expect $(1-alpha)cdot100%$ of the tests to have a $p$-value below $alpha$. You don't need any 'clustering' for that, that's just $frac1alpha$ of the uniform distribution from $0$ to $1$.
– Frans Rodenburg
Aug 12 at 15:00


Comments

Popular posts from this blog

What does second last employer means? [closed]

List of Gilmore Girls characters

Confectionery