How do frequentists address this paradox of hypothesis testing?

Suppose we sample a person from the population. They are a member of US Congress. We define the null hypothesis $H_0$ as "the person is American". We calculate the $p$-value: $P[\text{member of Congress} \mid \text{American}] \ll 0.05$. Since if the null hypothesis holds, the person is very unlikely to be a member of Congress, we reject the null hypothesis and decide that the person is very likely not an American. This conclusion is obviously very wrong, as all members of Congress are American.

Which assumptions of the hypothesis testing did I violate here? In other words, if I encounter a similar (but more obscure) application where this methodology is also not appropriate, how do I identify it?

asked 9 hours ago

rinspy

"The person is an American" is not a null hypothesis: it is a prediction about the value of a random variable. It cannot possibly have a p-value. You aren't doing hypothesis testing at all--and since you haven't explained how you obtained your purported "p-value," it isn't evident what you're doing or what your "methodology" might possibly be. Could you edit your post to explain it?
– whuber♦
59 mins ago

hypothesis-testing bayesian frequentist

3 Answers

Frequentist statistics is meant to make inferences about populations using samples, not about individuals. You first define a population (which you have not done), take a sample, and make inferences about the population from that sample, taking the uncertainty into account.

You have treated your sampled individual as if he were the population and tried to make inferences about him. But frequentist statistics does not apply here: you cannot repeat the sampling process with a population of size 1.

answered 6 hours ago

Knarpie

But is my example not fundamentally the same in this sense as Fisher's lady drinking tea? He had a particular lady, and he wanted to find out if she has "the ability" or not based on how many cups of tea she guessed. So the way I see it, his argument was "If I were to sample from the population of 'no ability' outcomes, a sample with all cups guessed correctly would be very unlikely. Therefore, it is unlikely that the sample I am looking at came from the 'no ability' population".
– rinspy
4 hours ago

In my case, the argument is "If I were to sample from the population of Americans, a sample that is a member of Congress would be very unlikely. Therefore, it is unlikely that the sample I am looking at came from the 'Americans' distribution".
– rinspy
4 hours ago

The population in the case of the lady drinking tea is the set of cups of tea she could possibly drink (which is infinite), not the lady herself. If she does not have the ability, the probability that she guesses right is 0.5. The null hypothesis tested there revolves around this parameter; in your example it revolved around one individual (one teacup, by analogy).
– Knarpie
4 hours ago

The only difference I can see is that in Fisher's example, we implicitly assume that the rare sample is much more likely to have been sampled from some population that is not the null hypothesis population (it doesn't matter which specific population). In my case this assumption does not hold - there is no population that is not the null hypothesis population in which the sample "Member of Congress" is likely.
– rinspy
4 hours ago

@Knarpie I agree with this answer and it is something I overlooked in my answer (and was therefore sloppy about certain things in mine). However, maybe we could consider H0: The population we are sampling from is from the US and H1: The population we are sampling from is not from the US. With the same rejection criterion "Reject if the selected person is a member of congress" I think we still get the same paradox.
– Rohan
3 hours ago


To make this test apply to the population we could change the hypotheses slightly to

H0: The sample is drawn from a population in the US

H1: The sample is drawn from a population not in the US

As far as I can tell, there's nothing wrong with this hypothesis test. For a test to have significance level 0.05 (for example), you need that if the null hypothesis is true, the probability that the test rejects it is at most 0.05.

In this example, think about sampling repeatedly: if the null hypothesis is true, then we will always choose people from the US. Out of those people, only a very small fraction (far less than 0.05) are members of Congress, so you would reject the null hypothesis for far fewer than 5% of them.

So if the test is correct, why does it seem so paradoxical? While it technically satisfies the criterion for a hypothesis test, for any fixed significance level we typically want to choose the rejection criterion that maximizes the power of the test, that is, the probability of rejecting the null hypothesis when it is false. In your case the test is terrible at this: it will never reject the null hypothesis when it is false.

The paradox depends on the rejection criterion being impossible under the alternative hypothesis, or less likely under the alternative hypothesis than under the null. Any such test will have zero or very low power.
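The size and power of this test can be checked with a few lines of arithmetic. The following is only a rough sketch: the population figure is an assumed round number, and the "ignore the data" baseline is added purely for comparison; neither comes from the question.

```python
from fractions import Fraction

# Assumed round figures, for illustration only (not from the original post).
US_POPULATION = 330_000_000   # approximate number of people in the US
CONGRESS_MEMBERS = 535        # voting members of the US Congress


def rejection_probability(members_in_population: int, population_size: int) -> Fraction:
    """P(reject) for the rule 'reject H0 if the sampled person is a member
    of Congress', when one person is drawn uniformly from the population."""
    return Fraction(members_in_population, population_size)


# Under H0 the person is drawn from the US population.
size = rejection_probability(CONGRESS_MEMBERS, US_POPULATION)

# Under H1 the person is drawn from a non-US population, which contains no
# members of Congress; its size is irrelevant, so any positive number works.
power = rejection_probability(0, 1_000_000)

# Baseline that ignores the data: reject with probability alpha no matter what.
alpha = Fraction(5, 100)
trivial_power = alpha

print(f"size  = {float(size):.2e}")   # far below 0.05, so the level is respected
print(f"power = {float(power)}")      # 0.0: H0 is never rejected when it is false
print(power < trivial_power)          # True: worse than ignoring the data entirely
```

Under these assumptions the test respects the 5% level with plenty of room to spare, yet it is beaten on power by a rule that never looks at the sample, which is one way to see why the rejection criterion is a poor choice.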

edited 3 hours ago

answered 5 hours ago

Rohan

Intuitively, the assumption that "the rejection criterion must be (much) more likely under the alternate hypothesis than under the null" seems key. In fact, the more I think about it, the more it seems like an implicit assumption in this kind of statistical testing.
– rinspy
4 hours ago

Agreed. I think it's so implicit because it's usually not a problem: the null hypothesis is typically quite specific and the alternative quite broad (e.g. a specific value of a parameter vs. all others, or independence vs. dependence). But it's still probably not mentioned enough.
– Rohan
3 hours ago

I've altered my answer slightly to take into account the other answer which I think raises a valid point. But the idea hasn't changed.
– Rohan
3 hours ago

I'd say there are at least two additional problems with your "paradox" (I don't even think it is a valid testing problem):

First, "Member of Congress = 1" is an invalid test statistic for your "H0", as it does not measure deviation from H0. A person who is not American automatically has "Member of Congress = 0", which is also the value for most Americans. Let me expand on that. What values can the test statistic take? It is 1 if the person is a member of Congress AND American, and 0 if the person is either American AND not a member of Congress, OR non-American. That means that the test statistic can take on TWO distinct values (0, 1) if the null were true! Both values carry information in favour of H0 (0 for Americans who are not members of Congress, 1 for American members of Congress). But one of these values (0) also carries information against H0. So what does one learn about H0 when the test statistic is 0 or not 0? Thus the test you describe appears invalid.

Second, the p-value is defined as the probability of observing a test statistic that speaks as strongly or more strongly against H0 than the value you have observed in the sample. In other words, it is a tail probability of the distribution of the test statistic under the assumption that the null is true. I have difficulty matching your p-value, which seems to be simply a conditional probability, to that definition. Not every conditional probability that conditions on "American" is automatically the correct p-value; for that, one would have to work out the correct null distribution of the test statistic you propose.
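For contrast, here is a minimal sketch of a p-value computed the way that definition asks, using a simplified tea-tasting setup. The design is hypothetical (8 independent cups, each guessed correctly with probability 0.5 under H0), and the function name is made up for illustration. The point is that here a larger count of correct guesses clearly speaks more strongly against H0, so "as or more extreme" has a well-defined meaning.

```python
from math import comb


def upper_tail_p_value(observed_successes: int, n_trials: int, p_null: float) -> float:
    """P(T >= observed_successes) when T ~ Binomial(n_trials, p_null).

    Large values of T are treated as evidence against H0."""
    return sum(
        comb(n_trials, k) * p_null**k * (1 - p_null) ** (n_trials - k)
        for k in range(observed_successes, n_trials + 1)
    )


# Hypothetical experiment: 8 independent cups, p = 0.5 per cup under H0.
n_cups = 8
p_all_correct = upper_tail_p_value(8, n_cups, 0.5)  # only k = 8 is as extreme
p_seven_plus = upper_tail_p_value(7, n_cups, 0.5)   # k = 7 or k = 8

print(p_all_correct)  # 0.00390625  (= 1/256)
print(p_seven_plus)   # 0.03515625  (= 9/256)
```

The Congress indicator has no comparable ordering: the value 1 cannot occur under the alternative at all, so it cannot count as "more extreme against H0" in this sense.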

edited 49 mins ago

answered 2 hours ago

Momo
