What is the reason behind taking the log transformation of a few continuous variables?
I have been working on a classification problem, and I have read many people's code and tutorials. One thing I've noticed is that many people take np.log or log of continuous variables like loan_amount or applicant_income, etc.
I just want to understand the reason behind it. Does it help improve our model's prediction accuracy? Is it mandatory, or is there some logic behind it?
Please provide some explanation if possible. Thank you.
machine-learning python classification scikit-learn
asked 20 hours ago by Sai Kumar
6 Answers
13 votes
This is done when the variables span several orders of magnitude. Income is a typical example: its distribution is a "power law", meaning that the vast majority of incomes are small and very few are big.
This type of "fat-tailed" distribution is studied on a logarithmic scale because of the mathematical properties of the logarithm:
$$\log(x^n) = n \log(x)$$
which implies
$$\log(10^4) = 4 \log(10)$$
and
$$\log(10^3) = 3 \log(10)$$
which transforms a huge difference, $10^4 - 10^3$, into a smaller one, $4 - 3$, making the values comparable.
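As a rough sketch of this compression, with hypothetical income values spanning four orders of magnitude:

```python
import numpy as np

# Hypothetical incomes spanning four orders of magnitude
incomes = np.array([1e3, 1e4, 1e5, 1e6, 1e7])

# Raw scale: the largest value is 10,000x the smallest
print(incomes.max() / incomes.min())  # 10000.0

# Log scale: the same values now differ by at most 4
log_incomes = np.log10(incomes)
print(log_incomes)  # [3. 4. 5. 6. 7.]
```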
answered 20 hours ago by Duccio Piovani
Nice answer, especially talking about exponential distributions. – Kasra Manshaei, 20 hours ago
@KasraManshaei I was speaking about power laws in particular (income being a typical example): extreme values in an exponential distribution are by definition very rare. Therefore, data which spans many orders of magnitude is usually power law. – Duccio Piovani, 19 hours ago
But of course in such cases log → ln, which absolutely doesn't change the point of the answer. – Duccio Piovani, 19 hours ago
Yes, I got it. As you said, not much changes. – Kasra Manshaei, 19 hours ago
6 votes
Mostly because of the skewed distribution. The logarithm naturally reduces the dynamic range of a variable, so the differences are preserved while the scale is not so dramatically skewed. Imagine some people got a 100,000,000 loan, some got 10,000, and some 0. Any feature scaling will probably put 0 and 10,000 very close to each other, since the biggest number pushes the boundary anyway. The logarithm solves the issue.
@Manshaei, so I can use MinMaxScaler or StandardScaler, right? Or is it necessary to take the log? – Sai Kumar, 20 hours ago
Necessary. If you use scalers, they compress small values dramatically. That's what I meant to say. – Kasra Manshaei, 20 hours ago
I didn't get you here. Can you explain? – Sai Kumar, 20 hours ago
Yes. Take the values 1,000,000,000 and 10,000 and 0. In many cases, the first one is too big to let the others be seen properly by your model. But if you take the logarithm, you will have 9, 4 and 0 respectively. As you see, the dynamic range is reduced while the differences are almost preserved. It comes from any exponential nature in your feature. In those cases you need the logarithm, as the other answer depicted. Hope it helped :) – Kasra Manshaei, 20 hours ago
Well, scaling! Imagine two variables with a normal distribution (so there is no need for a logarithm), but one on the scale of tens and the other on the scale of millions. Again, feeding them to the model makes the small one invisible. In this case you use scalers to make their scales comparable. – Kasra Manshaei, 20 hours ago
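The contrast between scaling and the log discussed in these comments can be sketched in plain NumPy; the min-max formula below is the computation behind scikit-learn's MinMaxScaler, and the loan amounts are the hypothetical values from the thread:

```python
import numpy as np

# Loan amounts from the thread: 1e9, 1e4, and 0
loans = np.array([1e9, 1e4, 0.0])

# Min-max scaling (the formula behind sklearn's MinMaxScaler):
scaled = (loans - loans.min()) / (loans.max() - loans.min())
print(scaled)  # [1.e+00 1.e-05 0.e+00] -- 0 and 10,000 nearly collide

# log1p, i.e. log(1 + x), is safe at 0 and keeps the gaps visible:
print(np.log1p(loans))  # roughly [20.72, 9.21, 0.]
```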
 |Â
show 5 more comments
3 votes
In addition to the other answers, another side effect of taking $\log x$ is that if $0 < x < \infty$ (again, for example, loans or incomes: basically anything that cannot become negative), the domain becomes $-\infty < \log x < \infty$.
This can be helpful, especially for response variables, if the model you are using is based on assumptions about the distribution of $x$, for example the assumption of normality in linear models.
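A quick simulation of that effect, assuming a lognormal (strictly positive, right-skewed) variable:

```python
import numpy as np

rng = np.random.default_rng(0)

# Strictly positive, right-skewed variable (lognormal, income-like)
x = rng.lognormal(mean=10.0, sigma=1.0, size=100_000)

# Raw scale: skewed, with the mean pulled far above the median
print(np.mean(x) > np.median(x))  # True

# Log scale: approximately normal, mean and median nearly agree
log_x = np.log(x)
print(abs(np.mean(log_x) - np.median(log_x)) < 0.05)  # True
```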
answered 19 hours ago by JAD
1 vote
Yet another reason why logarithmic transformations are useful comes into play for ratio data, due to the fact that $\log(A/B) = -\log(B/A)$. If you plot a distribution of ratios on the raw scale, your points fall in the range $(0, \infty)$. Any ratios less than 1 will be squished into a small area of the plot, and furthermore, the plot will look completely different if you flip the ratio to $B/A$ instead of $A/B$. If you do this on a logarithmic scale, the range is now $(-\infty, +\infty)$, meaning ratios less than 1 and greater than 1 are spread out more equally. If you decide to flip the ratio, you simply flip the plot around 0; otherwise it looks exactly the same. On a log scale, it doesn't really matter whether you show a ratio as 1/10 or 10/1, which is useful when there's no obvious choice about which it should be.
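A small illustration of that symmetry, using arbitrary ratio values:

```python
import numpy as np

# Flipping a ratio on the raw scale changes it completely,
# but on the log scale it only flips the sign:
assert np.isclose(np.log(2.0 / 5.0), -np.log(5.0 / 2.0))

ratios = np.array([1 / 10, 1 / 2, 1.0, 2.0, 10.0])
log_ratios = np.log(ratios)
print(log_ratios)  # symmetric around 0: [-2.303 -0.693  0.  0.693  2.303]
```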
answered 14 hours ago by Nuclear Wang
0 votes
I'd say the main reason is not distributional, but rather the non-linear relationship: logs often capture saturating relationships.
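For instance, a response that is exactly linear in $\log x$ (a hypothetical saturating curve) is recovered by an ordinary straight-line fit on the transformed feature:

```python
import numpy as np

# Hypothetical saturating relationship: y is linear in log(x)
x = np.array([1.0, 10.0, 100.0, 1000.0])
y = 2.0 * np.log(x) + 1.0

# A straight-line fit on log(x) recovers the parameters
slope, intercept = np.polyfit(np.log(x), y, 1)
print(round(slope, 6), round(intercept, 6))  # 2.0 1.0
```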
answered 18 hours ago by seanv507
0 votes
which implies
$$\log(10^4) = 4 \log(10)$$
and
$$\log(10^3) = 3 \log(10)$$
which transforms a huge difference, $10^4 - 10^3$, into a smaller one, $4 - 3$, making the values comparable.
answered 16 hours ago by Tuscano Anson
Did you just copy a part of another answer without even checking if the formatting made sense? – pipe, 19 mins ago
6 Answers
6
active
oldest
votes
6 Answers
6
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
13
down vote
This is done when the variables span several orders of magnitude. Income is a typical example: its distribution is "power law", meaning that the vast majority of incomes are small and very few are big.
This type of "fat tailed" distribution is studied in logarithmic scale because of the mathematical properties of the logarithm:
$$log(x^n)= n log(x)$$
which implies
$$log(10^4) = 4 * log(10)$$
and
$$log(10^3) = 3 * log(10)$$
which transforms a huge difference $$ 10^4 - 10^3 $$ in a smaller one $$ 4 - 3 $$
Making the values comparable.
New contributor
Duccio Piovani is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
2
Nice answer specially talking about exponential distributions.
– Kasra Manshaei
20 hours ago
1
@KasraManshaei I was speaking about power laws in particular (income being a typical example): extreme values in exponential distribution are by definition very rare. Therefore data which spans many orders of magnitude is usually power law.
– Duccio Piovani
19 hours ago
1
but of course in such cases log ---> ln, which absolutely doesnt change the point of the answer.
– Duccio Piovani
19 hours ago
Yes I got it. As you said not much changes.
– Kasra Manshaei
19 hours ago
add a comment |Â
up vote
13
down vote
This is done when the variables span several orders of magnitude. Income is a typical example: its distribution is "power law", meaning that the vast majority of incomes are small and very few are big.
This type of "fat tailed" distribution is studied in logarithmic scale because of the mathematical properties of the logarithm:
$$log(x^n)= n log(x)$$
which implies
$$log(10^4) = 4 * log(10)$$
and
$$log(10^3) = 3 * log(10)$$
which transforms a huge difference $$ 10^4 - 10^3 $$ in a smaller one $$ 4 - 3 $$
Making the values comparable.
New contributor
Duccio Piovani is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
2
Nice answer specially talking about exponential distributions.
– Kasra Manshaei
20 hours ago
1
@KasraManshaei I was speaking about power laws in particular (income being a typical example): extreme values in exponential distribution are by definition very rare. Therefore data which spans many orders of magnitude is usually power law.
– Duccio Piovani
19 hours ago
1
but of course in such cases log ---> ln, which absolutely doesnt change the point of the answer.
– Duccio Piovani
19 hours ago
Yes I got it. As you said not much changes.
– Kasra Manshaei
19 hours ago
add a comment |Â
up vote
13
down vote
up vote
13
down vote
This is done when the variables span several orders of magnitude. Income is a typical example: its distribution is "power law", meaning that the vast majority of incomes are small and very few are big.
This type of "fat tailed" distribution is studied in logarithmic scale because of the mathematical properties of the logarithm:
$$log(x^n)= n log(x)$$
which implies
$$log(10^4) = 4 * log(10)$$
and
$$log(10^3) = 3 * log(10)$$
which transforms a huge difference $$ 10^4 - 10^3 $$ in a smaller one $$ 4 - 3 $$
Making the values comparable.
New contributor
Duccio Piovani is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
This is done when the variables span several orders of magnitude. Income is a typical example: its distribution is "power law", meaning that the vast majority of incomes are small and very few are big.
This type of "fat tailed" distribution is studied in logarithmic scale because of the mathematical properties of the logarithm:
$$log(x^n)= n log(x)$$
which implies
$$log(10^4) = 4 * log(10)$$
and
$$log(10^3) = 3 * log(10)$$
which transforms a huge difference $$ 10^4 - 10^3 $$ in a smaller one $$ 4 - 3 $$
Making the values comparable.
New contributor
Duccio Piovani is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
edited 19 hours ago
New contributor
Duccio Piovani is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
answered 20 hours ago


Duccio Piovani
1314
1314
New contributor
Duccio Piovani is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
New contributor
Duccio Piovani is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
Duccio Piovani is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
2
Nice answer specially talking about exponential distributions.
– Kasra Manshaei
20 hours ago
1
@KasraManshaei I was speaking about power laws in particular (income being a typical example): extreme values in exponential distribution are by definition very rare. Therefore data which spans many orders of magnitude is usually power law.
– Duccio Piovani
19 hours ago
1
but of course in such cases log ---> ln, which absolutely doesnt change the point of the answer.
– Duccio Piovani
19 hours ago
Yes I got it. As you said not much changes.
– Kasra Manshaei
19 hours ago
add a comment |Â
2
Nice answer specially talking about exponential distributions.
– Kasra Manshaei
20 hours ago
1
@KasraManshaei I was speaking about power laws in particular (income being a typical example): extreme values in exponential distribution are by definition very rare. Therefore data which spans many orders of magnitude is usually power law.
– Duccio Piovani
19 hours ago
1
but of course in such cases log ---> ln, which absolutely doesnt change the point of the answer.
– Duccio Piovani
19 hours ago
Yes I got it. As you said not much changes.
– Kasra Manshaei
19 hours ago
2
2
Nice answer specially talking about exponential distributions.
– Kasra Manshaei
20 hours ago
Nice answer specially talking about exponential distributions.
– Kasra Manshaei
20 hours ago
1
1
@KasraManshaei I was speaking about power laws in particular (income being a typical example): extreme values in exponential distribution are by definition very rare. Therefore data which spans many orders of magnitude is usually power law.
– Duccio Piovani
19 hours ago
@KasraManshaei I was speaking about power laws in particular (income being a typical example): extreme values in exponential distribution are by definition very rare. Therefore data which spans many orders of magnitude is usually power law.
– Duccio Piovani
19 hours ago
1
1
but of course in such cases log ---> ln, which absolutely doesnt change the point of the answer.
– Duccio Piovani
19 hours ago
but of course in such cases log ---> ln, which absolutely doesnt change the point of the answer.
– Duccio Piovani
19 hours ago
Yes I got it. As you said not much changes.
– Kasra Manshaei
19 hours ago
Yes I got it. As you said not much changes.
– Kasra Manshaei
19 hours ago
add a comment |Â
up vote
6
down vote
Mostly because of skewed distribution. Logarithm naturally reduces the dynamic range of a variable so the differences are preserved while the scale is not that dramatically skewed. Imagine some people got 100,000,000 loan and some got 10000 and some 0. Any feature scaling will probably put 0 and 10000 so close to each other as the biggest number anyway pushes the boundary. Logarithm solves the issue.
Manshael, So I can use MinMaxScaler or StandardScaler right? or Is it necessary to take log?
– Sai Kumar
20 hours ago
Necessary. If you use scalers they compress small values dramatically. That's what I meant to say.
– Kasra Manshaei
20 hours ago
I didn't get you here. Can you explain?
– Sai Kumar
20 hours ago
2
Yes. If you take values 1000,000,000 and 10000 and 0 into account. In many cases, the first one is too big to let others be seen properly by your model. But if you take logarithm you will have 9, 4 and 0 respectively. As you see the dynamic range is reduced while the differences are almost preserved. It comes from any exponential nature in your feature. In those cases you need logarithm as the other answer depicted. Hope it helped :)
– Kasra Manshaei
20 hours ago
2
Well, scaling! Imagine two variables with normal distribution (so there is no need for logarithm) but one of them in the scale of 10ish and the other in the scale of milions. Again feeding them to the model makes the small one invisible. In this case you use scalers to make their scales reasonable.
– Kasra Manshaei
20 hours ago
 |Â
show 5 more comments
up vote
6
down vote
Mostly because of skewed distribution. Logarithm naturally reduces the dynamic range of a variable so the differences are preserved while the scale is not that dramatically skewed. Imagine some people got 100,000,000 loan and some got 10000 and some 0. Any feature scaling will probably put 0 and 10000 so close to each other as the biggest number anyway pushes the boundary. Logarithm solves the issue.
Manshael, So I can use MinMaxScaler or StandardScaler right? or Is it necessary to take log?
– Sai Kumar
20 hours ago
Necessary. If you use scalers they compress small values dramatically. That's what I meant to say.
– Kasra Manshaei
20 hours ago
I didn't get you here. Can you explain?
– Sai Kumar
20 hours ago
2
Yes. If you take values 1000,000,000 and 10000 and 0 into account. In many cases, the first one is too big to let others be seen properly by your model. But if you take logarithm you will have 9, 4 and 0 respectively. As you see the dynamic range is reduced while the differences are almost preserved. It comes from any exponential nature in your feature. In those cases you need logarithm as the other answer depicted. Hope it helped :)
– Kasra Manshaei
20 hours ago
2
Well, scaling! Imagine two variables with normal distribution (so there is no need for logarithm) but one of them in the scale of 10ish and the other in the scale of milions. Again feeding them to the model makes the small one invisible. In this case you use scalers to make their scales reasonable.
– Kasra Manshaei
20 hours ago
 |Â
show 5 more comments
up vote
6
down vote
up vote
6
down vote
Mostly because of skewed distribution. Logarithm naturally reduces the dynamic range of a variable so the differences are preserved while the scale is not that dramatically skewed. Imagine some people got 100,000,000 loan and some got 10000 and some 0. Any feature scaling will probably put 0 and 10000 so close to each other as the biggest number anyway pushes the boundary. Logarithm solves the issue.
Mostly because of skewed distribution. Logarithm naturally reduces the dynamic range of a variable so the differences are preserved while the scale is not that dramatically skewed. Imagine some people got 100,000,000 loan and some got 10000 and some 0. Any feature scaling will probably put 0 and 10000 so close to each other as the biggest number anyway pushes the boundary. Logarithm solves the issue.
answered 20 hours ago
Kasra Manshaei
3,0411035
3,0411035
Manshael, So I can use MinMaxScaler or StandardScaler right? or Is it necessary to take log?
– Sai Kumar
20 hours ago
Necessary. If you use scalers they compress small values dramatically. That's what I meant to say.
– Kasra Manshaei
20 hours ago
I didn't get you here. Can you explain?
– Sai Kumar
20 hours ago
2
Yes. If you take values 1000,000,000 and 10000 and 0 into account. In many cases, the first one is too big to let others be seen properly by your model. But if you take logarithm you will have 9, 4 and 0 respectively. As you see the dynamic range is reduced while the differences are almost preserved. It comes from any exponential nature in your feature. In those cases you need logarithm as the other answer depicted. Hope it helped :)
– Kasra Manshaei
20 hours ago
2
Well, scaling! Imagine two variables with normal distribution (so there is no need for logarithm) but one of them in the scale of 10ish and the other in the scale of milions. Again feeding them to the model makes the small one invisible. In this case you use scalers to make their scales reasonable.
– Kasra Manshaei
20 hours ago
 |Â
show 5 more comments
Manshael, So I can use MinMaxScaler or StandardScaler right? or Is it necessary to take log?
– Sai Kumar
20 hours ago
Necessary. If you use scalers they compress small values dramatically. That's what I meant to say.
– Kasra Manshaei
20 hours ago
I didn't get you here. Can you explain?
– Sai Kumar
20 hours ago
2
Yes. If you take values 1000,000,000 and 10000 and 0 into account. In many cases, the first one is too big to let others be seen properly by your model. But if you take logarithm you will have 9, 4 and 0 respectively. As you see the dynamic range is reduced while the differences are almost preserved. It comes from any exponential nature in your feature. In those cases you need logarithm as the other answer depicted. Hope it helped :)
– Kasra Manshaei
20 hours ago
2
Well, scaling! Imagine two variables with normal distribution (so there is no need for logarithm) but one of them in the scale of 10ish and the other in the scale of milions. Again feeding them to the model makes the small one invisible. In this case you use scalers to make their scales reasonable.
– Kasra Manshaei
20 hours ago
Manshael, So I can use MinMaxScaler or StandardScaler right? or Is it necessary to take log?
– Sai Kumar
20 hours ago
Manshael, So I can use MinMaxScaler or StandardScaler right? or Is it necessary to take log?
– Sai Kumar
20 hours ago
Necessary. If you use scalers they compress small values dramatically. That's what I meant to say.
– Kasra Manshaei
20 hours ago
Necessary. If you use scalers they compress small values dramatically. That's what I meant to say.
– Kasra Manshaei
20 hours ago
I didn't get you here. Can you explain?
– Sai Kumar
20 hours ago
I didn't get you here. Can you explain?
– Sai Kumar
20 hours ago
2
2
Yes. If you take values 1000,000,000 and 10000 and 0 into account. In many cases, the first one is too big to let others be seen properly by your model. But if you take logarithm you will have 9, 4 and 0 respectively. As you see the dynamic range is reduced while the differences are almost preserved. It comes from any exponential nature in your feature. In those cases you need logarithm as the other answer depicted. Hope it helped :)
– Kasra Manshaei
20 hours ago
Yes. If you take values 1000,000,000 and 10000 and 0 into account. In many cases, the first one is too big to let others be seen properly by your model. But if you take logarithm you will have 9, 4 and 0 respectively. As you see the dynamic range is reduced while the differences are almost preserved. It comes from any exponential nature in your feature. In those cases you need logarithm as the other answer depicted. Hope it helped :)
– Kasra Manshaei
20 hours ago
2
2
Well, scaling! Imagine two variables with normal distribution (so there is no need for logarithm) but one of them in the scale of 10ish and the other in the scale of milions. Again feeding them to the model makes the small one invisible. In this case you use scalers to make their scales reasonable.
– Kasra Manshaei
20 hours ago
Well, scaling! Imagine two variables with normal distribution (so there is no need for logarithm) but one of them in the scale of 10ish and the other in the scale of milions. Again feeding them to the model makes the small one invisible. In this case you use scalers to make their scales reasonable.
– Kasra Manshaei
20 hours ago
 |Â
show 5 more comments
up vote
3
down vote
In addition to the other answers, another side-effect of taking $logx$ is that if $0 < x < infty$, again for example with loans or incomes, basically anything that cannot become negative, the domain becomes $-infty < logx <infty$.
This can be helpful, especially in return variables, if the model you are using is based on assuptions about the distribution of $x$. For example the assumption of normality in linear models.
New contributor
JAD is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
add a comment |Â
up vote
3
down vote
In addition to the other answers, another side-effect of taking $logx$ is that if $0 < x < infty$, again for example with loans or incomes, basically anything that cannot become negative, the domain becomes $-infty < logx <infty$.
This can be helpful, especially in return variables, if the model you are using is based on assuptions about the distribution of $x$. For example the assumption of normality in linear models.
New contributor
JAD is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
add a comment |Â
up vote
3
down vote
up vote
3
down vote
In addition to the other answers, another side-effect of taking $logx$ is that if $0 < x < infty$, again for example with loans or incomes, basically anything that cannot become negative, the domain becomes $-infty < logx <infty$.
This can be helpful, especially in return variables, if the model you are using is based on assuptions about the distribution of $x$. For example the assumption of normality in linear models.
New contributor
JAD is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
In addition to the other answers, another side-effect of taking $logx$ is that if $0 < x < infty$, again for example with loans or incomes, basically anything that cannot become negative, the domain becomes $-infty < logx <infty$.
This can be helpful, especially in return variables, if the model you are using is based on assuptions about the distribution of $x$. For example the assumption of normality in linear models.
New contributor
JAD is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
New contributor
JAD is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
answered 19 hours ago


JAD
13114
13114
New contributor
JAD is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
New contributor
JAD is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
JAD is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
add a comment |Â
add a comment |Â
up vote
1
down vote
Yet another reason why logarithmic transformations are useful comes into play for ratio data, due to the fact that log(A/B) = -log(B/A)
. If you plot a distribution of ratios on the raw scale, your points fall in the range (0, Inf)
. Any ratios less than 1 will be squished into a small area of the plot, and furthermore, the plot will look completely different if you flip the ratio to (B/A)
instead of (A/B)
. If you do this on a logarithmic scale, the range is now (-Inf, +Inf)
, meaning ratios less than 1 and greater than 1 are more equally spread out. If you decide to flip the ratio, you simply flip the plot around 0, otherwise it looks exactly the same. On a log scale, it doesn't really matter if you show a ratio as 1/10 or 10/1
, which is useful when there's not an obvious choice about which it should be.
add a comment |Â
up vote
1
down vote
Yet another reason why logarithmic transformations are useful comes into play for ratio data, due to the fact that log(A/B) = -log(B/A)
. If you plot a distribution of ratios on the raw scale, your points fall in the range (0, Inf)
. Any ratios less than 1 will be squished into a small area of the plot, and furthermore, the plot will look completely different if you flip the ratio to (B/A)
instead of (A/B)
. If you do this on a logarithmic scale, the range is now (-Inf, +Inf)
, meaning ratios less than 1 and greater than 1 are more equally spread out. If you decide to flip the ratio, you simply flip the plot around 0, otherwise it looks exactly the same. On a log scale, it doesn't really matter if you show a ratio as 1/10 or 10/1
, which is useful when there's not an obvious choice about which it should be.
add a comment |Â
up vote
1
down vote
up vote
1
down vote
Yet another reason why logarithmic transformations are useful comes into play for ratio data, due to the fact that log(A/B) = -log(B/A)
. If you plot a distribution of ratios on the raw scale, your points fall in the range (0, Inf)
. Any ratios less than 1 will be squished into a small area of the plot, and furthermore, the plot will look completely different if you flip the ratio to (B/A)
instead of (A/B)
. If you do this on a logarithmic scale, the range is now (-Inf, +Inf)
, meaning ratios less than 1 and greater than 1 are more equally spread out. If you decide to flip the ratio, you simply flip the plot around 0, otherwise it looks exactly the same. On a log scale, it doesn't really matter if you show a ratio as 1/10 or 10/1
, which is useful when there's not an obvious choice about which it should be.
Yet another reason why logarithmic transformations are useful comes into play for ratio data, due to the fact that log(A/B) = -log(B/A)
. If you plot a distribution of ratios on the raw scale, your points fall in the range (0, Inf)
. Any ratios less than 1 will be squished into a small area of the plot, and furthermore, the plot will look completely different if you flip the ratio to (B/A)
instead of (A/B)
. If you do this on a logarithmic scale, the range is now (-Inf, +Inf)
, meaning ratios less than 1 and greater than 1 are more equally spread out. If you decide to flip the ratio, you simply flip the plot around 0, otherwise it looks exactly the same. On a log scale, it doesn't really matter if you show a ratio as 1/10 or 10/1
, which is useful when there's not an obvious choice about which it should be.
edited 5 hours ago
Sai Kumar
1355
1355
answered 14 hours ago
Nuclear Wang
24614
24614
add a comment |Â
add a comment |Â
up vote
0
down vote
I'd say the main reason is not distributional but rather because of the non linear relationship. Logs often capture saturating relationships...
add a comment |Â
up vote
0
down vote
I'd say the main reason is not distributional but rather because of the non linear relationship. Logs often capture saturating relationships...
add a comment |Â
up vote
0
down vote
up vote
0
down vote
I'd say the main reason is not distributional but rather because of the non linear relationship. Logs often capture saturating relationships...
I'd say the main reason is not distributional but rather because of the non linear relationship. Logs often capture saturating relationships...
answered 18 hours ago
seanv507
63439
63439
add a comment |Â
add a comment |Â
up vote
0
down vote
which implies
log(104)=4∗log(10)
and
log(103)=3∗log(10)
which transforms a huge difference104−103
in a smaller one4−3
Making the values comparable.
New contributor
Tuscano Anson is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
1
Did you just copy a part of another answer without even checking if the formatting made sense?
– pipe
19 mins ago