Why do we need assumptions in linear regression?
In linear regression, we make the following assumptions:

- $E(Y_i)$, at each set of predictor values $(x_{1i}, x_{2i}, \ldots)$, is a linear function of the predictors.
- The errors are independent.
- The errors are normally distributed.
- The errors have equal variance at every set of predictor values.

One of the ways we can solve linear regression is through the normal equations, which we can write as

$$\theta = (X^T X)^{-1} X^T Y$$

From a mathematical standpoint, the above equation only needs $X^T X$ to be invertible. So why do we need these assumptions? I asked a few colleagues, and they said the assumptions are there to get good results, with the normal equations being one algorithm for achieving that. But in that case, how do these assumptions help? How does upholding them help in getting a better model?
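As a concrete sketch of that purely mechanical point, the following NumPy snippet (with made-up data and deliberately skewed, non-normal errors) shows that the normal equations produce estimates whenever $X^T X$ is invertible:

```python
import numpy as np

# Made-up data; the errors are centered but exponential, i.e. deliberately non-normal.
rng = np.random.default_rng(0)
n = 100
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])   # intercept column + one predictor
y = 2.0 + 3.0 * X[:, 1] + (rng.exponential(1.0, n) - 1.0)  # skewed, mean-zero noise

# Solve (X^T X) theta = X^T y directly; np.linalg.solve avoids forming the inverse.
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)  # roughly [2, 3]: the fit goes through regardless of the error distribution
```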
Tags: regression, assumptions

asked Aug 15 at 7:05 by Clock Slave; edited Aug 15 at 12:40 by conjectures
A normal distribution is needed to calculate coefficient confidence intervals using the usual formulas. Other formulas for CI calculation (White's, I think) allow for non-normal distributions.
– keiv.fly
Aug 15 at 8:30
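A minimal sketch of that comparison, assuming the statsmodels package and made-up heteroskedastic data ("HC1" is one of the White-style robust covariance estimators):

```python
import numpy as np
import statsmodels.api as sm

# Made-up data whose error variance grows with x, violating equal variance.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 200)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5 * x)  # noise scale depends on x
X = sm.add_constant(x)

usual = sm.OLS(y, X).fit()                  # classical standard errors
robust = sm.OLS(y, X).fit(cov_type="HC1")   # White-style heteroskedasticity-robust errors

print(usual.conf_int())   # intervals from the usual formulas
print(robust.conf_int())  # intervals that stay asymptotically valid under heteroskedasticity
```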
You do not always need those assumptions for the model to work. Neural networks have linear regressions inside, and they minimize RMSE just like the formula you provided, but most likely none of the assumptions hold: no normal distribution, no equal variance, no linear function, and even the errors can be dependent.
– keiv.fly
Aug 15 at 8:43
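A sketch of that observation, again with made-up data: minimizing the squared error by gradient descent, as a linear layer in a neural network effectively does, lands on the same solution as the normal equations, with no distributional assumptions used anywhere.

```python
import numpy as np

# Made-up data with heavy-tailed (Student-t) errors, so the normality assumption fails.
rng = np.random.default_rng(4)
X = np.column_stack([np.ones(100), rng.uniform(0, 5, 100)])
y = 1.0 + 2.0 * X[:, 1] + rng.standard_t(df=3, size=100)

theta = np.zeros(2)
lr = 0.01
for _ in range(5000):
    grad = 2 * X.T @ (X @ theta - y) / len(y)  # gradient of the mean squared error
    theta -= lr * grad

print(theta)                              # gradient-descent estimate
print(np.linalg.solve(X.T @ X, X.T @ y))  # normal-equations estimate; the two agree
```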
See stats.stackexchange.com/q/16381/35989
– Tim♦
Aug 15 at 12:14
@Alexis The independent variables being i.i.d. definitely isn't an assumption (and the dependent variable being i.i.d. also isn't an assumption; if we assumed the response was i.i.d., it would be pointless to do anything beyond estimating the mean). And "no omitted variables" isn't really an additional assumption, although it's good to avoid omitting variables; the first assumption listed is really what takes care of that.
– Dason
Aug 15 at 14:57
@Dason I think my link provides a pretty strong example of "no omitted variables" being requisite for valid interpretation. I also think i.i.d. (conditional on the predictors, yes) is necessary, with random walks providing an excellent example of where non-i.i.d. estimation can fail (even resorting to estimating only the mean).
– Alexis
Aug 15 at 17:02
5 Answers
You are correct: you do not need to satisfy these assumptions to fit a least-squares line to the points. You need these assumptions to interpret the results. For example, assuming there was no relationship between an input $X_1$ and $Y$, what is the probability of getting a coefficient $\beta_1$ at least as great as the one we saw from the regression?

answered Aug 15 at 7:20 by rinspy (accepted, score 19)
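A hypothetical Monte Carlo version of that question (all numbers made up): simulate data with no relationship between $X_1$ and $Y$, and count how often the fitted slope is at least as large as some observed value.

```python
import numpy as np

rng = np.random.default_rng(2)
n, observed_slope = 50, 0.40  # made-up sample size and observed coefficient

def fitted_slope(x, y):
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.solve(X.T @ X, X.T @ y)[1]

# Simulate the null: x and y are drawn independently, so the true slope is zero.
slopes = [fitted_slope(rng.normal(size=n), rng.normal(size=n)) for _ in range(10_000)]

p_value = np.mean(np.abs(slopes) >= observed_slope)
print(p_value)  # two-sided Monte Carlo p-value; the usual t-test answers this analytically
```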
Try the image of Anscombe's quartet from Wikipedia to get an idea of some of the potential issues with interpreting linear regression when some of those assumptions are clearly false: most of the basic descriptive statistics are the same in all four datasets (and the individual $x_i$ values are identical in all but the bottom-right panel).

answered Aug 15 at 11:41 by Henry (score 17)
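A small sketch of why the quartet is so striking, assuming seaborn (whose load_dataset call fetches its copy of Anscombe's data from the package's online data repository):

```python
import numpy as np
import seaborn as sns

df = sns.load_dataset("anscombe")  # columns: dataset, x, y
for name, g in df.groupby("dataset"):
    slope, intercept = np.polyfit(g["x"], g["y"], 1)
    print(name,
          round(g["x"].mean(), 2), round(g["y"].mean(), 2),  # nearly identical means
          round(g["x"].corr(g["y"]), 3),                     # nearly identical correlation
          round(slope, 3), round(intercept, 2))              # nearly identical fitted line
# The numbers barely differ across the four datasets; only plotting reveals
# the curvature, the outlier, and the leverage point.
```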
I made an illustration following Anscombe showing what violating the no-omitted-variables assumption can look like. Still working on an Anscombe-like illustration of a violation of the i.i.d. assumption.
– Alexis
Aug 15 at 14:19
You don't need those assumptions to fit a linear model. However, your parameter estimates could be biased or could fail to have the minimum variance. Violating the assumptions makes it more difficult to interpret the regression results, for example when constructing a confidence interval.

answered Aug 15 at 8:29 by SmallChess (score 3)
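A simulation sketch of the "not minimum variance" part, under an assumed setup where the error variance grows with the predictor: ordinary least squares stays unbiased, but weighted least squares with the (here known) true weights estimates the slope with less variance.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 200, 2000
x = np.linspace(1, 10, n)
X = np.column_stack([np.ones(n), x])
w = 1.0 / x**2  # inverse-variance weights; the true error s.d. below is proportional to x

ols_slopes, wls_slopes = [], []
for _ in range(reps):
    y = 1.0 + 2.0 * x + rng.normal(0.0, x)  # heteroskedastic errors
    ols_slopes.append(np.linalg.solve(X.T @ X, X.T @ y)[1])
    Xw = X * w[:, None]                     # rows of X scaled by the weights
    wls_slopes.append(np.linalg.solve(X.T @ Xw, Xw.T @ y)[1])

print(np.var(ols_slopes), np.var(wls_slopes))  # both centered near 2, but WLS varies less
```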
OK, the answers so far go like this: if we violate the assumptions, then bad things can happen. I believe the interesting direction is the converse: when all the assumptions we need (actually a little different from the ones above) are met, why and how can we be sure that linear regression is the best model?

I think the answer goes like this: if we make the assumptions as in the answer to this question, then we can compute the conditional density $p(y_i \mid x_i)$. From this we can compute $E[Y_i \mid X_i = x_i]$ (the factorization of the conditional expectation at $x_i$) and see that it is indeed the linear regression function. Then we use this to see that it is the best function with respect to the true risk.

answered Aug 15 at 11:41 by Fabian Werner (score 1)
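To spell out that argument under the usual Gaussian assumptions (a sketch of one standard route): if $Y_i = x_i^T \theta + \varepsilon_i$ with $\varepsilon_i \sim N(0, \sigma^2)$ independent of $x_i$, then

$$p(y_i \mid x_i) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y_i - x_i^T \theta)^2}{2\sigma^2}\right), \qquad E[Y_i \mid X_i = x_i] = x_i^T \theta.$$

For squared-error loss the conditional mean minimizes the true risk, because for any function $f$ the cross term vanishes and

$$E\big[(Y - f(X))^2\big] = E\big[(Y - E[Y \mid X])^2\big] + E\big[(E[Y \mid X] - f(X))^2\big],$$

so no predictor can beat $x \mapsto x^T \theta$.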
The two key assumptions are

- independence of observations
- the mean is not related to the variance

See the discussion in Julian Faraway's book.

If these are both true, OLS is surprisingly resistant to breaches of the other assumptions you have listed.

answered Aug 15 at 23:55 by astaines (score 0)