Dropped 2 Categories in Dummy Variables (Logistic Regression)

I understand that when modeling, a categorical variable with k categories should be encoded as k-1 dummy variables, and the dropped category becomes the baseline. However, I do not know how to interpret the model if feature selection then removes 2 more of those dummies (say the variable has 5 categories: 1 is the baseline, and another 2 were removed after feature selection).

Should I still interpret the remaining coefficients as usual, using the original dropped category as the baseline?

















Tags: logistic feature-selection categorical-encoding














edited 2 hours ago by kjetil b halvorsen
asked 3 hours ago by SuperSaiyan




















          2 Answers

















You should think of the k-1 dummy variables as a "block": either they all stay in the model or they are all eliminated from it during the feature selection process. The reason is that the k-1 dummy variables jointly encode the effect of the original categorical variable that spawned them.
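One way to treat the dummies as a block is to test them jointly rather than one at a time. The following is a minimal sketch, not a definitive implementation, of a likelihood-ratio test for the whole block. It assumes the categorical variable is the only predictor, in which case the fitted probabilities of the logistic model equal the observed category proportions. The counts, the function names, and the 5% significance level are all hypothetical choices made for illustration.

```python
import math

# Hypothetical data: (successes, trials) for each level of a 5-category factor.
counts = {"A": (30, 100), "B": (55, 100), "C": (32, 100),
          "D": (60, 100), "E": (28, 100)}

def binom_loglik(successes, trials, p):
    """Bernoulli log-likelihood of `successes` out of `trials` at probability p."""
    failures = trials - successes
    ll = 0.0
    if successes:
        ll += successes * math.log(p)
    if failures:
        ll += failures * math.log(1 - p)
    return ll

def block_lr_test(counts):
    """Likelihood-ratio statistic and df for dropping all k-1 dummies at once."""
    total_s = sum(s for s, _ in counts.values())
    total_n = sum(n for _, n in counts.values())
    # Null model (intercept only): one common probability for every category.
    ll_null = sum(binom_loglik(s, n, total_s / total_n) for s, n in counts.values())
    # Full model (intercept + k-1 dummies): fitted probability = category proportion.
    ll_full = sum(binom_loglik(s, n, s / n) for s, n in counts.values())
    return 2 * (ll_full - ll_null), len(counts) - 1

stat, df = block_lr_test(counts)
CHI2_CRIT_DF4 = 9.488  # 95th percentile of the chi-square distribution with 4 df
print(f"LR statistic = {stat:.2f} on {df} df; keep the block: {stat > CHI2_CRIT_DF4}")
```

With other predictors in the model the two log-likelihoods would have to come from fitted models with and without the block, but the decision is still made for all k-1 dummies together.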

















answered 3 hours ago by Isabella Ghement






















Suppose the 5 categories are A, B, C, D and E. Suppose that A is the baseline, and that C and E were removed in the process of variable selection.

This means that A, C, and E were found to have no statistically significant difference, so they are effectively combined into one group: A, C, and E together act as the baseline. The coefficients for B and D are then read relative to that combined group.
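The merging can be seen directly in the design matrix. The sketch below is only an illustration with hypothetical helper names (`dummy_row`, `groups`): it builds the 0/1 dummy values for each category and groups together the categories whose rows are identical, since the model cannot tell those apart.

```python
CATEGORIES = ["A", "B", "C", "D", "E"]
BASELINE = "A"

def dummy_row(category, kept):
    """Return the 0/1 indicator values for the dummies listed in `kept`."""
    return tuple(int(category == level) for level in kept)

def groups(kept):
    """Group categories by design-matrix row; identical rows are indistinguishable."""
    by_row = {}
    for category in CATEGORIES:
        by_row.setdefault(dummy_row(category, kept), []).append(category)
    return sorted(by_row.values())

full_block = [c for c in CATEGORIES if c != BASELINE]  # dummies for B, C, D, E
after_selection = ["B", "D"]                           # dummies for C and E dropped

print(groups(full_block))       # [['A'], ['B'], ['C'], ['D'], ['E']]
print(groups(after_selection))  # [['A', 'C', 'E'], ['B'], ['D']]
```

With only the B and D dummies left, A, C and E all have the all-zero row, so the intercept describes the three of them pooled together.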

















answered 3 hours ago by a_statistician



























                               
