What is the reason behind taking log for few continuous variables?
I have been working on a classification problem, and I have read many people's code and tutorials. One thing I've noticed is that many people take np.log
or log
of continuous variables like loan_amount
or applicant_income
etc.
I just want to understand the reason behind it. Does it help improve model prediction accuracy? Is it mandatory, or is there some logic behind it?
Please give me some explanation. Thank you.
machine-learning python classification scikit-learn
New contributor
Sai Kumar is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
edited 44 mins ago
asked 49 mins ago
Sai Kumar
2 Answers
Mostly because of skewed distributions. The logarithm naturally reduces the dynamic range of a variable, so the differences are preserved while the scale is not so dramatically skewed. Imagine some people got a 100,000,000 loan, some got 10,000 and some got 0. Any feature scaling will probably put 0 and 10,000 very close to each other, since the biggest number pushes the boundary. The logarithm solves the issue.
answered 37 mins ago
Kasra Manshaei
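A quick numeric sketch of this effect, using the loan amounts from the example. `np.log1p` computes log(1 + x), which conveniently maps the zero loan to zero instead of minus infinity:

```python
import numpy as np

# Skewed loan amounts from the example: one huge outlier dwarfs the rest.
loan_amount = np.array([0.0, 10_000.0, 100_000_000.0])

# log1p(x) = log(1 + x): the zero stays at zero instead of going to -inf.
log_loans = np.log1p(loan_amount)
print(log_loans)  # roughly [0.0, 9.21, 18.42]
```

After the log transform the three values sit at comparable distances from one another, so a downstream scaler no longer crushes the two small ones together.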
Manshaei, so I can use MinMaxScaler or StandardScaler, right? Or is it necessary to take the log?
– Sai Kumar
31 mins ago
Necessary. If you use scalers alone, they compress the small values dramatically. That's what I meant to say.
– Kasra Manshaei
30 mins ago
I didn't get you here. Can you explain?
– Sai Kumar
29 mins ago
Yes. Take the values 1,000,000,000, 10,000 and 0. In many cases the first one is too big to let the others be seen properly by your model. But if you take the base-10 logarithm, you get 9, 4 and 0 respectively. As you can see, the dynamic range is reduced while the differences are almost preserved. This comes from any exponential nature in your feature. In those cases you need the logarithm, as the other answer described. Hope it helped :)
– Kasra Manshaei
26 mins ago
Well, scaling! Imagine two variables with normal distributions (so there is no need for a logarithm), but one on a scale of around 10 and the other on a scale of millions. Again, feeding them to the model makes the small one invisible. In that case you use scalers to make their scales comparable.
– Kasra Manshaei
20 mins ago
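To illustrate the point made in these comments, here is a sketch (assuming scikit-learn is available, per the question's tags) comparing MinMaxScaler applied to the raw skewed values versus the log-transformed ones:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# The skewed feature from the example above, as one column.
X = np.array([[0.0], [10_000.0], [100_000_000.0]])

# Min-max scaling alone: the outlier defines the range, so 10,000 lands at
# 10000 / 100000000 = 0.0001, almost on top of 0.
scaled = MinMaxScaler().fit_transform(X)

# Log first, then scale: the small values remain clearly separated.
log_scaled = MinMaxScaler().fit_transform(np.log1p(X))

print(scaled.ravel())      # roughly [0.0, 0.0001, 1.0]
print(log_scaled.ravel())  # roughly [0.0, 0.5, 1.0]
```

So the scaler by itself only changes the range, not the shape of the distribution; the log is what removes the skew.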
This is done when the variables span several orders of magnitude. Income is a typical example: its distribution follows a "power law", meaning that the vast majority of incomes are small and very few are big.
This type of "fat-tailed" distribution is studied on a logarithmic scale because of the mathematical properties of the logarithm:
$$\log(x^n) = n \log(x)$$
which implies
$$\log(10^4) = 4 \log(10)$$
and
$$\log(10^3) = 3 \log(10)$$
edited 28 mins ago
answered 34 mins ago
Duccio Piovani
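The identities above can be checked numerically, for instance in Python:

```python
import math

# log(x**n) == n * log(x), demonstrated in base 10.
assert math.isclose(math.log10(10**4), 4 * math.log10(10))  # both are 4.0
assert math.isclose(math.log10(10**3), 3 * math.log10(10))  # both are 3.0
```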
New contributor
Duccio Piovani is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
Nice answer, especially the part about exponential distributions.
– Kasra Manshaei
16 mins ago