Ways of Testing Linearity Assumption in Multiple Regression apart from Residual Plots

I was going through the assumptions of linear regression, and of course one of them is linearity between the dependent and the independent variables. To be precise, the assumption is that the conditional mean of $Y_i$ given $X_i$ is linear in the parameters.
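
For concreteness, the assumption can be written as below. The number of predictors $p$ and the coefficient symbols $\beta_j$ are notation assumed here for illustration, not taken from any particular textbook:

$$E[Y_i \mid X_i] = \beta_0 + \beta_1 X_{i1} + \dots + \beta_p X_{ip}$$

"Linear in the parameters" still allows a regressor to be a transformation of a variable. For example, $E[Y_i \mid X_i] = \beta_0 + \beta_1 X_i + \beta_2 X_i^2$ is linear in the $\beta_j$ although it is not linear in $X_i$. What the residual plot checks, then, is whether the chosen functional form of the regressors is adequate.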



I looked in many textbooks and resources online and all of them suggested to check that assumption through a scatter plot of the residuals versus the fitted values. Although I can see that this is a valid and helpful way, I can't help but notice that it can be a bit arbitrary and subjective in some cases.



My question is whether there is also a statistical test for that assumption. For example, when checking for heteroscedasticity we can look at the residual plot, but we also have Levene's test.



I can see that the question "How can I use the value of $R^2$ to test the linearity assumption in multiple regression analysis?", which is very helpful, states that $R^2$ is not that statistic, but it doesn't mention a viable alternative.



Thanks in advance










      multiple-regression assumptions linearity
















      asked 1 hour ago









      ALEX.VAMVAS





















          1 Answer
          What you can do is fit a model that relaxes the linearity assumption, e.g., using splines, and compare it with the model that assumes linearity. For example, in R, for a linear regression model you can do something like this:



          library("splines")

          # linear effect of age on y
          fm_linear <- lm(y ~ age + sex, data = your_data)

          # nonlinear effect of age on y using natural cubic splines
          fm_non_linear <- lm(y ~ ns(age, 3) + sex, data = your_data)

          # F-test between the two models
          anova(fm_linear, fm_non_linear)
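
The snippet above uses placeholder names (`y`, `age`, `sex`, `your_data`). As a minimal runnable sketch of the same comparison, the block below simulates data in which the effect of age is curved by construction. The data frame `sim_data`, the coefficients, and the seed are all hypothetical choices made only for illustration.

```r
library(splines)

# Hypothetical simulated data: the true effect of age is quadratic,
# so the linearity assumption is violated by design.
set.seed(1)
n   <- 200
age <- runif(n, 20, 80)
sex <- factor(sample(c("female", "male"), n, replace = TRUE))
y   <- 5 + 0.01 * (age - 50)^2 + 1.5 * (sex == "male") + rnorm(n, sd = 1)
sim_data <- data.frame(y, age, sex)

# Model that assumes a linear effect of age
fm_linear <- lm(y ~ age + sex, data = sim_data)

# Model that relaxes linearity with a natural cubic spline (3 df) for age
fm_non_linear <- lm(y ~ ns(age, 3) + sex, data = sim_data)

# F-test of the nested models; the null hypothesis is that the
# extra spline terms are not needed, i.e. the effect of age is linear
cmp <- anova(fm_linear, fm_non_linear)
print(cmp)

# Extract the extra degrees of freedom and the p-value from the second row
extra_df <- cmp$Df[2]
p_value  <- cmp[["Pr(>F)"]][2]
```

A small p-value here is evidence against the linear specification for age. A large p-value only means the test did not detect a departure that this particular spline can capture, so it is a complement to the residual plot rather than a replacement for it.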
















          answered 58 mins ago









          Dimitris Rizopoulos












          • Hello Dimitri. Thanks for the quick response. So, if I understand correctly, unlike the assumptions concerning heteroscedasticity and multicollinearity, which affect the accuracy (for lack of a better word) of the OLS estimators, linearity is an assumption about the relationship between the dependent and the independent variables. If it is violated, we can still use OLS, but rather than a straight-line model we should fit one with splines, and the way to test this would be through ANOVA. Is that a correct conclusion? Also, could we use the R squared instead of ANOVA?
            – ALEX.VAMVAS
            26 mins ago
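
On the R squared part of this comment, a general property of nested OLS fits is relevant: adding regressors can never lower $R^2$, so the spline model will score at least as high as the linear one even when the true relationship is linear. The sketch below illustrates this on hypothetical simulated data (the names `d`, `fit_lin`, `fit_spl` and the coefficients are assumptions made for the example). The spline model is itself still linear in its parameters, which is why `lm` fits it by ordinary least squares.

```r
library(splines)

# Hypothetical simulated data in which the effect of age is truly linear
set.seed(2)
n   <- 200
age <- runif(n, 20, 80)
y   <- 2 + 0.1 * age + rnorm(n)
d   <- data.frame(y, age)

fit_lin <- lm(y ~ age, data = d)          # straight-line model
fit_spl <- lm(y ~ ns(age, 3), data = d)   # natural cubic spline, 3 df

r2_lin <- summary(fit_lin)$r.squared
r2_spl <- summary(fit_spl)$r.squared

# The larger nested model never has a smaller R squared,
# so comparing raw R squared values cannot signal non-linearity
print(c(linear = r2_lin, spline = r2_spl))

# The F-test instead asks whether the gain in fit is larger than
# expected from the extra degrees of freedom alone
print(anova(fit_lin, fit_spl))
```

This is why the comparison is made with the F-test from `anova()`, which accounts for the extra degrees of freedom, rather than with the raw difference in R squared.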















