What is the need of assumptions in linear regression?

In linear regression, we make the following assumptions:

  • The mean of the response, $E(Y_i)$, at each set of values of the predictors, $(x_{1i}, x_{2i}, \dots)$, is a Linear function of the predictors.

  • The errors, $\varepsilon_i$, are Independent.

  • The errors, $\varepsilon_i$, at each set of values of the predictors, $(x_{1i}, x_{2i}, \dots)$, are Normally distributed.

  • The errors, $\varepsilon_i$, at each set of values of the predictors, $(x_{1i}, x_{2i}, \dots)$, have Equal variances (denoted $\sigma^2$).

One of the ways we can solve linear regression is through the normal equations, which we can write as

$$\theta = (X^TX)^{-1}X^TY$$

From a mathematical standpoint, the above equation only needs $X^TX$ to be invertible. So why do we need these assumptions? I asked a few colleagues, and they said that the point is to get good results and that the normal equations are an algorithm for achieving that. But in that case, how do these assumptions help? How does upholding them lead to a better model?
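For readers who want to check the formula numerically, here is a minimal NumPy sketch of solving the normal equations; the data-generating values (intercept 2, slope 3) are made up for illustration:

```python
import numpy as np

# Simulated data (the true values 2 and 3 are illustrative, not from the question)
rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0, 10, size=n)
y = 2 + 3 * x + rng.normal(0, 1, size=n)

# Design matrix with an intercept column
X = np.column_stack([np.ones(n), x])

# Normal equations: theta = (X^T X)^{-1} X^T Y.
# Solving the linear system is numerically safer than forming the inverse.
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)  # close to [2, 3]; note that no distributional assumptions were used
```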







asked Aug 15 at 7:05 by Clock Slave, edited Aug 15 at 12:40 by conjectures
Comments:

• Normal distribution is needed to calculate coefficient confidence intervals using the usual formulas. Other formulas for CI calculation (I think it was White's) allow non-normal distributions. – keiv.fly, Aug 15 at 8:30

• You do not always need those assumptions for the model to work. Neural networks have linear regressions inside, and they minimize RMSE just like the formula you provided, but most likely none of the assumptions hold: no normal distribution, no equal variance, no linear function; even the errors can be dependent. – keiv.fly, Aug 15 at 8:43

• See stats.stackexchange.com/q/16381/35989 – Tim♦, Aug 15 at 12:14

• @Alexis The independent variables being i.i.d. definitely isn't an assumption (and the dependent variable being i.i.d. also isn't an assumption; if we assumed the response were i.i.d., it would be pointless to do anything beyond estimating the mean). And "no omitted variables" isn't really an additional assumption, although it's good to avoid omitting variables; the first assumption listed is really what takes care of that. – Dason, Aug 15 at 14:57

• @Dason I think my link provides a pretty strong example of "no omitted variables" being requisite for valid interpretation. I also think i.i.d. (conditional on the predictors, yes) is necessary, with random walks providing an excellent example of where non-i.i.d. estimation can fail (even resorting to estimating only the mean). – Alexis, Aug 15 at 17:02
















5 Answers
Accepted answer (19 votes) – rinspy, answered Aug 15 at 7:20
You are correct: you do not need to satisfy these assumptions to fit a least-squares line to the points. You need these assumptions to interpret the results. For example: assuming there was no relationship between an input $X_1$ and $Y$, what is the probability of getting a coefficient $\beta_1$ at least as large as the one we saw from the regression?
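As a sketch of how the assumptions enter that calculation (with simulated, illustrative data, not from the answer): under independence, normality, and equal variance, the usual $t$-statistic for $\hat\beta_1$ has a known $t_{n-2}$ distribution, which is what turns a fitted coefficient into a probability statement.

```python
import numpy as np
from scipy import stats

# Simulated data (illustrative only): a weak true relationship
rng = np.random.default_rng(1)
n = 50
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)

# Fit by least squares via the normal equations
X = np.column_stack([np.ones(n), x])
theta = np.linalg.solve(X.T @ X, X.T @ y)

# Classical inference uses the assumptions: independent, equal-variance,
# normal errors => the t-statistic below follows a t distribution with n-2 df.
residuals = y - X @ theta
sigma2_hat = residuals @ residuals / (n - 2)      # unbiased estimate of error variance
cov_theta = sigma2_hat * np.linalg.inv(X.T @ X)   # covariance of the coefficient estimates
t_stat = theta[1] / np.sqrt(cov_theta[1, 1])
# Two-sided p-value: chance of a slope at least this extreme
# if there were no true relationship between x and y
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)
print(f"beta_1 = {theta[1]:.3f}, t = {t_stat:.2f}, p = {p_value:.4f}")
```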






Answer (17 votes) – Henry, answered Aug 15 at 11:41
Take a look at the image of Anscombe's quartet from Wikipedia to get an idea of some of the potential issues with interpreting linear regression when some of those assumptions are clearly false: most of the basic descriptive statistics are the same in all four datasets (and the individual $x_i$ values are identical in all but the bottom-right panel).



      https://upload.wikimedia.org/wikipedia/commons/thumb/e/ec/Anscombe%27s_quartet_3.svg/1280px-Anscombe%27s_quartet_3.svg.png
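For readers who want to reproduce the point numerically rather than visually, here is a small sketch using the quartet's published values: all four least-squares fits return essentially the same line even though the scatterplots look nothing alike.

```python
import numpy as np

# Anscombe's quartet (published values); datasets I-III share the same x's
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
datasets = {
    "I":   (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II":  (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV":  ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
            [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

for name, (x, y) in datasets.items():
    slope, intercept = np.polyfit(x, y, deg=1)  # least-squares line
    print(f"{name}: y = {intercept:.2f} + {slope:.3f} x")
# Each prints roughly y = 3.00 + 0.500 x, despite very different scatterplots.
```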






• I made an illustration following Anscombe showing what violating the no-omitted-variables assumption can look like. Still working on an Anscombe-like illustration of a violation of the i.i.d. assumption. – Alexis, Aug 15 at 14:19
Answer (3 votes) – SmallChess, answered Aug 15 at 8:29
You don't need those assumptions to fit a linear model. However, your parameter estimates may be biased or may not have the minimum variance. Violating the assumptions also makes it harder to interpret the regression results, for example when constructing a confidence interval.
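A rough simulation sketch of the confidence-interval point (all parameter values here are made up for illustration): when the equal-variance assumption fails, the textbook 95% interval for the slope tends to stop covering the true value 95% of the time.

```python
import numpy as np
from scipy import stats

# Monte Carlo check of 95% CI coverage for the slope when the
# equal-variance assumption fails (error spread grows with x).
rng = np.random.default_rng(2)
n, n_sims, true_slope = 50, 2000, 1.0
covered = 0
for _ in range(n_sims):
    x = rng.uniform(0, 10, size=n)
    y = true_slope * x + rng.normal(0, 0.2 + x, size=n)  # heteroscedastic errors
    X = np.column_stack([np.ones(n), x])
    theta = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ theta
    # Textbook standard error, which assumes equal variances
    se = np.sqrt(resid @ resid / (n - 2) * np.linalg.inv(X.T @ X)[1, 1])
    half_width = stats.t.ppf(0.975, df=n - 2) * se
    covered += (theta[1] - half_width <= true_slope <= theta[1] + half_width)
print(covered / n_sims)  # typically noticeably below the nominal 0.95
```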






Answer (1 vote) – Fabian Werner, answered Aug 15 at 11:41
OK, the answers so far go like this: if we violate the assumptions, then bad things can happen. I believe the interesting direction is the converse: when all the assumptions we need (actually slightly different from the ones listed above) are met, why and how can we be sure that linear regression is the best model?

I think the answer goes like this: if we make the assumptions as in the answer to this question, then we can compute the conditional density $p(y_i|x_i)$. From this we can compute $E[Y_i|X_i=x_i]$ (the conditional expectation evaluated at $x_i$) and see that it is indeed the linear regression function. Then we use this to see that it is the best function with respect to the true risk.
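A sketch of that computation, under the usual model $Y_i = x_i^T\beta + \varepsilon_i$ with $\varepsilon_i \sim \mathcal{N}(0, \sigma^2)$ independent of the predictors:

$$p(y_i \mid x_i) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left( -\frac{(y_i - x_i^T\beta)^2}{2\sigma^2} \right) \quad\Longrightarrow\quad E[Y_i \mid X_i = x_i] = x_i^T\beta.$$

Since the conditional mean $E[Y \mid X]$ minimizes the true risk $E[(Y - f(X))^2]$ over all (measurable) functions $f$, the linear regression function is then the best possible predictor under squared-error loss.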






Answer (0 votes) – astaines, answered Aug 15 at 23:55
The two key assumptions are:

1. Independence of observations

2. The mean is not related to the variance

See the discussion in Julian Faraway's book.

If these are both true, OLS is surprisingly resistant to breaches of the other assumptions you have listed.





