How do the t-distribution and standard normal distribution differ, and why is the t-distribution used more?
For statistical inference (e.g., hypothesis testing or computing confidence intervals), why do we use the t-distribution instead of the standard normal distribution? My class started with the standard normal distribution and shifted to the t-distribution, and I am not fully sure why. Is it because t-distributions can (a) deal with small sample sizes (because they give more weight to the tails) or (b) be more robust to a non-normally distributed sample?
hypothesis-testing mathematical-statistics confidence-interval inference t-distribution
asked Aug 22 at 18:35 by Jane Sully · edited Aug 22 at 18:53 by Alexis
Possibly related: stats.stackexchange.com/questions/285649/…
– Henry, Aug 22 at 23:09
Searches like stats.stackexchange.com/search?q=t-distribution+normal and stats.stackexchange.com/search?q=t-test+normal will turn up a number of relevant posts (and a lot of other hits, so you may need to add further keywords to reduce the clutter).
– Glen_b♦, Aug 23 at 2:18
2 Answers
Accepted answer (score 3) – answered Aug 22 at 19:00 by Alexis, edited Aug 23 at 4:20
The normal distribution (which is almost certainly returning in later chapters of your course) is much easier to motivate than the t distribution for students new to the material. The reason you are learning about the t distribution is more or less your first reason: the t distribution takes a single parameter (sample size minus one) and more correctly accounts for uncertainty due to (small) sample size than the normal distribution when making inferences about a sample mean of normally-distributed data, assuming that the true variance is unknown.
With increasing sample size, the t and standard normal distributions are approximately equally robust with respect to deviations from normality (as sample size increases, the t distribution converges to the standard normal distribution). Nonparametric tests (which I start teaching about halfway through my intro stats course) are generally much more robust to non-normality than either the t or the normal distribution.
Finally, you are likely going to learn tests and confidence intervals for many different distributions by the end of your course ($F$, $\chi^2$, and rank distributions, at least in their tabled p-values, for example).
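To make the small-sample tail correction and the convergence concrete, here is a minimal sketch (my own illustration using Python with scipy, not part of the original answer) comparing two-sided 95% critical values of the t and standard normal distributions as the degrees of freedom grow:

```python
from scipy import stats

# Two-sided 95% critical values: the t value exceeds the normal (z) value
# at small degrees of freedom because the t distribution has heavier tails,
# and it converges to the z value as the degrees of freedom grow.
z_crit = stats.norm.ppf(0.975)  # about 1.96

for df in (2, 5, 10, 30, 100, 1000):
    t_crit = stats.t.ppf(0.975, df)
    print(f"df={df:5d}  t={t_crit:.4f}  z={z_crit:.4f}  excess={t_crit - z_crit:+.4f}")
```

At 2 degrees of freedom the t critical value is about 4.30 versus 1.96 for the normal; by 30 degrees of freedom the gap has shrunk to roughly 0.08, and by 100 to about 0.02, which is why the "n > 30" rule of thumb works reasonably well.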
Thank you so much for this awesome response. I now get that t-distributions can better account for small sample sizes. However, if the sample size is large (> 30), it doesn't matter whether we use a t or standard normal distribution, right?
– Jane Sully, Aug 22 at 19:26

They become very similar as the degrees of freedom rise.
– Bernhard, Aug 22 at 19:37

@JaneSully Sure, but, for inference about means of normal data, it is never wrong to use the t distribution.
– Alexis, Aug 22 at 21:13

(Also, when/if you like an answer enough to say that it has answered your question, you can "accept" it by clicking on the check mark to the top left of the question. :)
– Alexis, Aug 22 at 21:24

I disagree with this statement: "the t distribution takes a single parameter (sample size minus one) and more correctly accounts for uncertainty due to (small) sample size than the normal distribution when making inferences about a sample mean of normally-distributed data." E.g. see this lecture: onlinecourses.science.psu.edu/stat414/node/173 There's no need for the t-distribution on Gaussian data when the standard deviation is known. The key here is whether you do or do not know the variance, not the n-1 adjustment.
– Aksakal, Aug 23 at 3:49
Answer (score 2) – answered Aug 22 at 21:25 by Aksakal, edited Aug 23 at 3:44
The t-distribution is used in inference instead of the normal because the theoretical distribution of some estimators is normal (Gaussian) only when the standard deviation is known; when it is unknown, the theoretical distribution is Student's t.
We rarely know the standard deviation. Usually we estimate it from the sample, so for many estimators it is theoretically more sound to use the Student t distribution rather than the normal.
Some estimators are consistent, i.e., in layman's terms, they get better as the sample size increases. Correspondingly, the Student t distribution becomes normal when the sample size is large.
Example: sample mean
Consider the mean $\mu$ of a sample $x_1, x_2, \dots, x_n$. We can estimate it using the usual average estimator $\bar x = \frac{1}{n}\sum_{i=1}^{n} x_i$, which you may call the sample mean.
If we want to make inference statements about the mean, such as whether the true mean satisfies $\mu < 0$, we can use the sample mean $\bar x$, but we need to know its distribution. It turns out that if we knew the standard deviation $\sigma$ of the $x_i$, then the sample mean would be distributed around the true mean according to a Gaussian, $\bar x \sim \mathcal{N}(\mu, \sigma^2/n)$, for large enough $n$ (and exactly, for any $n$, if the $x_i$ are themselves normal).
The problem is that we rarely know $\sigma$, but we can estimate it from the sample, obtaining $\hat\sigma$. In this case the distribution of the standardized sample mean $(\bar x - \mu)/(\hat\sigma/\sqrt{n})$ is no longer standard normal, but follows the Student t distribution with $n-1$ degrees of freedom.
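As a quick numerical check of this claim, here is a minimal simulation sketch (my own illustration, assuming numpy and scipy are available; it is not part of the original answer): with $\sigma$ known, the standardized mean has standard normal tails, while with $\sigma$ estimated it has the heavier $t_{n-1}$ tails.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, reps = 5, 200_000           # small sample to make the difference visible
mu, sigma = 0.0, 1.0

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)

# z standardizes with the known sigma; t with the sample estimate (ddof=1)
z = (xbar - mu) / (sigma / np.sqrt(n))
t = (xbar - mu) / (x.std(axis=1, ddof=1) / np.sqrt(n))

# Compare empirical two-sided tail probabilities beyond 1.96 with theory
c = 1.96
print("P(|z| > 1.96):", np.mean(np.abs(z) > c), "  normal says:", 2 * stats.norm.sf(c))
print("P(|t| > 1.96):", np.mean(np.abs(t) > c), "  t(n-1) says:", 2 * stats.t.sf(c, n - 1))
```

With n = 5, the t-type statistic exceeds the nominal 5% normal cutoff about 12% of the time; this is exactly the under-coverage you get if you use z critical values with an estimated standard deviation.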