How useful are linear hypotheses?
In a linear model $Y = X\beta + \varepsilon$, one can easily test linear hypotheses of the form $H_0: C\beta = \gamma$, where $C$ is a matrix and $\gamma$ is a vector with dimension equal to the number of rows of $C$. Namely, one can derive a test statistic which has an F distribution under the null and go from there.
Theoretically, these tests are very interesting to me and seem quite flexible, since $C$ and $\gamma$ can be anything.
However, I would like to know how useful these hypotheses are in practical applications, and what are some interesting examples of such applications? (Besides testing whether a single coefficient is zero, or whether all coefficients of the model are zero, which is included in every lm call in R, for example.)
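For reference, here is a minimal R sketch of the F test of $H_0: C\beta = \gamma$ computed by hand from an lm fit; the data and the particular hypothesis are made up for illustration, and the same computation is available pre-packaged, e.g. in car::linearHypothesis.

```r
## Minimal sketch: F test of H_0 : C beta = gamma for an lm() fit.
## Data and hypothesis are made up for illustration.
set.seed(3)
n  <- 100
x1 <- rnorm(n); x2 <- rnorm(n)
y  <- 1 + 2 * x1 + 2 * x2 + rnorm(n)
fit <- lm(y ~ x1 + x2)

# Hypothesis: beta_{x1} = beta_{x2}, i.e. C = (0, 1, -1), gamma = 0
C     <- matrix(c(0, 1, -1), nrow = 1)
gamma <- 0
q     <- nrow(C)                      # number of restrictions

b    <- coef(fit)
XtXi <- summary(fit)$cov.unscaled     # (X'X)^{-1}
s2   <- summary(fit)$sigma^2          # estimate of error variance

d     <- C %*% b - gamma
Fstat <- drop(t(d) %*% solve(C %*% XtXi %*% t(C)) %*% d) / (q * s2)
pval  <- pf(Fstat, q, fit$df.residual, lower.tail = FALSE)
c(F = Fstat, p.value = pval)
```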
hypothesis-testing multiple-regression linear-model
$\gamma$ is the vector with dimension equal to the number of rows of matrix $C$ (not the size of $\beta$).
– a_statistician
8 hours ago
@a_statistician Yes, sorry, I'll edit that.
– Blaza
8 hours ago
edited 8 hours ago
asked 8 hours ago
Blaza
2 Answers
When you fit a linear model, statistical software gives you point estimates, confidence intervals, test statistics, and p-values for the $\beta$s. If you are only interested in the $\beta$s themselves, you can stop there (for example, simple linear regression has just one intercept and one slope, so the $\beta$s are enough). But for a more complicated model you will not be satisfied with the $\beta$s alone: you will want to estimate and test linear combinations of them, and that is where the importance of the $C$ matrix becomes obvious. For complicated models, such as models with interactions, the $C$ matrix must be constructed explicitly.
Example 1: In an ANOVA, suppose one categorical covariate has 3 levels and level 1 is the reference. The two $\beta$s give you the differences of level 2 vs. level 1 and level 3 vs. level 1. If you want the difference between levels 2 and 3, you need the $C$ matrix $(0, 1, -1)$ (the first 0 is for the intercept). If you want to estimate the mean of level 3, $C = (1, 0, 1)$ is needed. (See the R sketch at the end of this answer.)
Example 2: If you want to test multiple hypotheses simultaneously, see Testing the general linear hypothesis: $H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta$. There, $T = C$.
Example 3: If interactions exist, we need the linear relation for each combination (cell) of the interaction. In How to understand the coefficients of a three-way interaction in a regression?, a 16×16 $C$ matrix is used to obtain 8 intercepts and 8 slopes.
You can find more examples on the internet and in textbooks.
In summary, for the linear model, constructing the $C$ matrix amounts to half of the theory of linear models.
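As a minimal sketch of Example 1, assuming the car package is available (the data below are made up for illustration):

```r
## Minimal sketch of Example 1: 3-level factor, level 1 as reference.
## Made-up data; requires the car package.
library(car)

set.seed(1)
group <- factor(rep(1:3, each = 20))
y     <- c(0, 1, 1.5)[group] + rnorm(60)
fit   <- lm(y ~ group)        # coefficients: (Intercept), group2, group3

# Level 2 vs. level 3: C = (0, 1, -1), gamma = 0
linearHypothesis(fit, c(0, 1, -1))

# Mean of level 3: C = (1, 0, 1); point estimate of C beta
drop(c(1, 0, 1) %*% coef(fit))
```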
answered 7 hours ago
a_statistician
These linear hypotheses on the coefficient vector have three main uses:
Testing the existence of relationships: We can test the existence of relationships between some subset of the explanatory variables and the response variable. To do this, let $\mathbf{e}_{\mathcal{S}}$ denote the indicator vector for the subset $\mathcal{S}$ and test the linear hypotheses:
$$H_0: \mathbf{e}_{\mathcal{S}} \boldsymbol{\beta} = 0 \quad \quad \quad H_A: \mathbf{e}_{\mathcal{S}} \boldsymbol{\beta} \neq 0.$$
Testing a specified magnitude for the relationship: We can test the magnitude of the relationship between an explanatory variable and the response variable against some specified value of interest. This is often useful when a particular specified magnitude has practical significance (e.g., it is often useful to test whether the true coefficient is equal to one). To test $\beta_k = b$ we use the linear hypotheses:
$$H_0: \mathbf{e}_k \boldsymbol{\beta} = b \quad \quad \quad H_A: \mathbf{e}_k \boldsymbol{\beta} \neq b.$$
Testing the expected responses for new explanatory variables: We can test the expected values of the responses corresponding to a new set of explanatory variables. Taking new explanatory data $\boldsymbol{X}_{\text{new}}$, we get corresponding expected values $\mathbb{E}(\boldsymbol{Y}_{\text{new}}) = \boldsymbol{X}_{\text{new}} \boldsymbol{\beta}$. This means we can test the hypothesis $\mathbb{E}(\boldsymbol{Y}_{\text{new}}) = \boldsymbol{y}$ via the hypotheses:
$$H_0: \boldsymbol{X}_{\text{new}} \boldsymbol{\beta} = \boldsymbol{y} \quad \quad \quad H_A: \boldsymbol{X}_{\text{new}} \boldsymbol{\beta} \neq \boldsymbol{y}.$$
As you can see, the first use is to test whether some of the coefficients are zero, which is a test of whether those explanatory variables are related to the response in the model. However, you can also undertake more general tests of a specific magnitude for the relationship, and you can use the linear test to test the expected response for new data; a short R sketch of the last two uses follows.
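A minimal sketch of the second and third uses in R, assuming the car package (the data and the tested values are made up for illustration):

```r
## Minimal sketch of uses 2 and 3; made-up data, car package assumed.
library(car)

set.seed(2)
x   <- rnorm(50)
y   <- x + rnorm(50)
fit <- lm(y ~ x)

# Use 2: test a specified magnitude, H_0: beta_x = 1
linearHypothesis(fit, "x = 1")

# Use 3: test the expected response at new data x = 2,
# i.e. H_0: beta_0 + 2 * beta_x = 1.5, so C = (1, 2) and y = 1.5
linearHypothesis(fit, c(1, 2), rhs = 1.5)
```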
answered 7 hours ago
Ben