Parametric tests for non-normal data?
I'm trying to polish my stats skills, and it seems to me that you have either parametric tests for normal data or non-parametric tests for non-normal data.
Looking at the t-test, for instance, I don't really see why a similar derivation could not be done for other known distributions. I guess some of them might be hard to do analytically, but since we have computers anyway, that should not really be a problem.
So I'm guessing that such tests exist, but they are not really taught and/or are impractical for one or more reasons. Can someone enlighten me?
parametric
I don't think this is a black or white type of problem. Many parametric tests are robust to minor or even moderate violations of the normality assumption of the data or model residuals and can still be used. It's a matter of quantum!
– Isabella Ghement
1 hour ago
asked 1 hour ago
fbence
1 Answer
If you know that the distribution is in fact normal, then tests derived under the normal model will be optimal. The Z-test (with known variance) has exactly this property for the one-parameter normal distribution.
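To make the known-variance case concrete, here is a minimal sketch of a two-sided one-sample Z-test. The function name `z_test` and its arguments are made up for illustration, and it assumes the standard deviation `sigma` really is known in advance.

```python
import math

def z_test(sample, mu0, sigma):
    """Two-sided one-sample Z-test, assuming sigma (the population SD) is known."""
    n = len(sample)
    mean = sum(sample) / n
    # Standardize the sample mean under H0: mu = mu0
    z = (mean - mu0) / (sigma / math.sqrt(n))
    # Two-sided p-value: P(|Z| > |z|) = erfc(|z| / sqrt(2)) for standard normal Z
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

For example, `z_test([2.0, 2.0, 2.0, 2.0], mu0=1.0, sigma=1.0)` gives z = 2, since the mean is one unit above `mu0` and the standard error is 1/2.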
Parametric tests can be derived for any distribution using maximum likelihood. If the data are Poisson, exponential, etc., a two-sample comparison can be done as a likelihood ratio test with 1 degree of freedom. The link between t-tests and regression with adjustment for a binary group variable extends to generalized linear models, which give two-sample tests for data with a known but non-normal distribution.
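As a rough sketch of what such a test looks like for Poisson counts: the statistic is twice the gap between the log-likelihood maximized with separate group means and with one pooled mean, and it is compared to a chi-square distribution with 1 degree of freedom. The function name `poisson_two_sample_lrt` is hypothetical, and the p-value relies on the usual large-sample chi-square approximation, so treat it as approximate for small counts.

```python
import math

def poisson_two_sample_lrt(x, y):
    """Likelihood ratio test of H0: both samples share one Poisson mean."""
    n1, n2 = len(x), len(y)
    s1, s2 = sum(x), sum(y)
    pooled_mean = (s1 + s2) / (n1 + n2)  # MLE of the mean under H0

    def term(s, n):
        # Contribution s * log(group_mean / pooled_mean), using 0 * log(0) = 0
        if s == 0:
            return 0.0
        return s * math.log((s / n) / pooled_mean)

    # The -n * mean parts of the log-likelihoods cancel between H0 and H1
    stat = max(2.0 * (term(s1, n1) + term(s2, n2)), 0.0)
    # Chi-square(1) tail probability: P(X > stat) = erfc(sqrt(stat / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

The same recipe works for the exponential or any other one-parameter family: swap in that family's log-likelihood and keep the chi-square(1) reference distribution.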
It's far more interesting to think about the case where we don't know the distribution of the data. I mean, if you don't know the mean, what sense does it make to say, "I know this is a 3-parameter bimodal normal mixture model!"?
The t-test has the interesting property that it also performs well for a general class of finite-variance distributions. This is because of the central limit theorem: the sampling distribution of the mean approaches normality, often even in fairly small samples, although strong skew or heavy tails slow the convergence. Another way of describing the t-test is as an asymptotic test, because you are approximating the long-run sampling distribution of the mean.
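One way to see the central limit theorem at work is a small simulation, sketched below under some assumptions: the data are Exp(1) (skewed, true mean 1), the null hypothesis is true, and the two-sided 5% critical value `t_crit` is supplied by the caller from a t table (for example 2.776 for 4 degrees of freedom, 2.045 for 29, 1.972 for 199). The function name `t_rejection_rate` is made up for illustration.

```python
import math
import random

def t_rejection_rate(n, t_crit, reps=2000, seed=0):
    """Fraction of one-sample t-tests of H0: mean = 1 that reject on Exp(1) data."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]  # true mean is 1
        mean = sum(sample) / n
        var = sum((v - mean) ** 2 for v in sample) / (n - 1)  # unbiased variance
        t = (mean - 1.0) / math.sqrt(var / n)
        if abs(t) > t_crit:
            rejections += 1
    return rejections / reps
```

Calling it with increasing `n` (and the matching critical value) should show the rejection rate moving toward the nominal 5%, which is the sense in which the t-test is asymptotically valid for non-normal data.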
Some test statistics, especially minima and maxima, do not converge to normal distributions, so tests based on their limiting distributions are instead compared to exponential (Huzurbazar), Gumbel, or other extreme value distributions as $n \rightarrow \infty$.
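A small worked case, assuming iid Exp(1) data: the maximum of n draws, shifted by log(n), has exact CDF (1 - exp(-x)/n)^n, which tends to the Gumbel CDF exp(-exp(-x)) rather than to a normal. The helper names below are hypothetical; the sketch just compares the exact and limiting CDFs.

```python
import math

def max_exponential_cdf(x, n):
    """Exact P(max of n iid Exp(1) draws - log(n) <= x)."""
    t = x + math.log(n)
    if t <= 0:
        return 0.0  # the maximum of positive draws cannot be below zero
    return (1.0 - math.exp(-t)) ** n

def gumbel_cdf(x):
    """CDF of the standard Gumbel distribution, the limit as n grows."""
    return math.exp(-math.exp(-x))

def max_abs_gap(n, grid=(-1.0, 0.0, 1.0, 2.0)):
    """Largest difference between exact and limiting CDF over a few points."""
    return max(abs(max_exponential_cdf(x, n) - gumbel_cdf(x)) for x in grid)
```

Evaluating `max_abs_gap` for a small and a large `n` shows the gap shrinking, which is the convergence the limiting-distribution tests rely on.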
In general, we would never knowingly apply a parametric test to data of the wrong parametric form; it is simply the less optimal solution.
So essentially, you are saying that a) they do exist, they just don't specifically have a name, b) if the data are not normal, the distribution is usually not something I know anyway, and c) if it is roughly normal, then the t-test will be okay anyway, assuming I have enough data?
– fbence
1 hour ago
@fbence a) Some are named, e.g. the Pearson chi-square test, but certainly not for all cases. b) We rarely know any distribution in the wild, but the context/science gives us some background; for example, I know the distribution of neutrophils is well modeled by the negative binomial because it is a concentration. c) Usually yes. The normal approximation converges fast, and even faster when the sample(s) are unimodal, symmetric, concave, and normokurtic.
– AdamO
54 mins ago
answered 1 hour ago
AdamO