What is the reason behind taking log for few continuous variables?

I have been working on a classification problem, and I have read many people's code and tutorials. One thing I've noticed is that many people apply np.log to continuous variables such as loan_amount or applicant_income.

I just want to understand the reason behind it. Does it help improve model prediction accuracy? Is it mandatory, or is there some logic behind it?

Please give me some explanation. Thank you.
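For context, the transformation being asked about typically looks like this (a minimal sketch; the column names come from the question, the values are made up):

```python
import numpy as np
import pandas as pd

# Toy data with the column names mentioned in the question (values invented).
df = pd.DataFrame({
    "loan_amount": [5000, 12000, 250000, 100],
    "applicant_income": [30000, 45000, 1200000, 18000],
})

# log1p computes log(1 + x), which is often preferred over np.log
# because it handles zero values without producing -inf.
df["log_loan_amount"] = np.log1p(df["loan_amount"])
df["log_applicant_income"] = np.log1p(df["applicant_income"])

print(df[["loan_amount", "log_loan_amount"]])
```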










Tags: machine-learning, python, classification, scikit-learn






Asked 49 mins ago by Sai Kumar, a new contributor; edited 44 mins ago.
2 Answers
Mostly because of skewed distributions. The logarithm naturally reduces the dynamic range of a variable, so differences are preserved while the scale is no longer so dramatically skewed. Imagine some people got a 100,000,000 loan, some got 10,000, and some got 0. Any feature scaling will likely put 0 and 10,000 very close to each other, because the biggest number pushes out the boundary. Taking the logarithm solves the issue.
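A quick numeric sketch of this point, using log10 of 1 + x so that a zero loan is handled:

```python
import numpy as np

loans = np.array([0, 10_000, 100_000_000], dtype=float)

# Min-max scaling maps the raw values to [0, 1]; the huge loan dominates,
# so 0 and 10,000 end up almost indistinguishable (0 vs 0.0001).
scaled = (loans - loans.min()) / (loans.max() - loans.min())
print(scaled)   # ≈ [0, 1e-4, 1]

# log10(1 + x) compresses the dynamic range while keeping the ordering
# and the relative differences visible.
logged = np.log10(1 + loans)
print(logged)   # ≈ [0, 4, 8]
```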






answered 37 mins ago by Kasra Manshaei
• Manshaei, so I can use MinMaxScaler or StandardScaler, right? Or is it necessary to take the log?
  – Sai Kumar
  31 mins ago











          • Necessary. If you use scalers they compress small values dramatically. That's what I meant to say.
            – Kasra Manshaei
            30 mins ago










          • I didn't get you here. Can you explain?
            – Sai Kumar
            29 mins ago






• Yes. Take the values 1,000,000,000, 10,000 and 0 into account: in many cases the first one is too big to let the others be seen properly by your model. But if you take the base-10 logarithm you get roughly 9, 4 and 0 respectively. As you see, the dynamic range is reduced while the differences are almost preserved. It comes from any exponential nature in your feature. In those cases you need the logarithm, as the other answer depicted. Hope it helped :)
  – Kasra Manshaei
  26 mins ago






• Well, scaling! Imagine two variables with normal distributions (so there is no need for a logarithm), but one on a scale of tens and the other on a scale of millions. Again, feeding them to the model makes the small one invisible. In this case you use scalers to make their scales comparable.
  – Kasra Manshaei
  20 mins ago
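The two ideas discussed in these comments are complementary and often combined: take the log first to fix the shape, then scale to fix the range. A sketch using scikit-learn (which the question is tagged with), on simulated log-normal incomes:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Simulated log-normal incomes: heavily right-skewed, like real income data.
income = rng.lognormal(mean=10, sigma=1.5, size=1000).reshape(-1, 1)

# Scaling alone preserves the skew: most standardized values sit below 0,
# with a long tail of large positive outliers, so the median is negative.
scaled_raw = StandardScaler().fit_transform(income)

# Log first, then scale: the result is roughly symmetric around 0.
scaled_log = StandardScaler().fit_transform(np.log1p(income))

print(np.median(scaled_raw))   # noticeably below 0
print(np.median(scaled_log))   # close to 0
```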

















This is done when the variables span several orders of magnitude. Income is a typical example: its distribution follows a power law, meaning that the vast majority of incomes are small and very few are big.

This type of "fat-tailed" distribution is studied on a logarithmic scale because of the mathematical properties of the logarithm:

$$\log(x^n) = n \log(x)$$

which implies

$$\log(10^4) = 4 \log(10)$$

and

$$\log(10^3) = 3 \log(10)$$
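In other words, the logarithm turns equal ratios into equal distances: an income of 10,000 and one of 100,000 end up exactly as far apart as 100,000 and 1,000,000. A quick check:

```python
import math

# On a log scale, equal multiplicative factors become equal distances.
a = math.log10(100_000) - math.log10(10_000)      # a 10x ratio
b = math.log10(1_000_000) - math.log10(100_000)   # also a 10x ratio
print(a, b)  # both 1.0
```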






answered 34 mins ago by Duccio Piovani, a new contributor; edited 28 mins ago.
• Nice answer, especially the discussion of exponential distributions.
  – Kasra Manshaei
  16 mins ago









