Can you infer causality from correlation in this example of dictator game?
up vote
16
down vote
favorite
I've just had an exam where we were presented with two variables. In a dictator game where a dictator is given 100 USD, and can choose how much to send or keep for himself, there was a positive correlation between age and how much money the participants decided to keep.
My thinking is that you can't infer causality from this because you can't infer causation from correlation. My classmate thinks that you can because if you, for example, split the participants up into three separate groups, you can see how they differ in how much they keep and how much they share, and therefore conclude that age causes them to keep more. Who is correct and why?
correlation causality
8
Normally you can't infer causality from correlation, unless you have a designed experiment.
– user2974951
2 days ago
5
Everything that we know about our world as individuals, we know through correlation. So yes, we can infer causality from correlation as far as it can be said that causality exists at all. Of course, doing it right is tricky.
– Aleksandr Dubinsky
2 days ago
Is this dictator game taking place in a lab, where assignment to be the dictator is random?
– Dimitriy V. Masterov
2 days ago
What was the sample size?
– EngrStudent
2 days ago
4
@DimitriyV.Masterov, most likely all participants were 'assigned' to be dictators & the second player was a plant. However, I'm sure no one was randomly assigned to their age.
– gung♦
2 days ago
edited 17 mins ago
Carlos Cinelli
asked 2 days ago
JonnyBravo
New contributor
9 Answers
up vote
10
down vote
In general you should not assume that correlation implies causality - even in cases where it seems that is the only possible reason.
Consider that there are other things that correlate with age - generational aspects of culture for example. Perhaps these three groups will remain the same even as they all age, but the next generation will buck the trend?
All that being said, you are probably right that older people are more likely to keep a larger amount, but just be aware there are other possibilities.
In addition to the other answers, the current experiment cannot distinguish between the model where the money kept is a function of age and the model where the money kept is a function of year of birth. Note that the second model may be non-linear across history, and that 20-year-olds taken from different historical periods may decide to keep very different amounts of cash.
– NofP
7 hours ago
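The age-versus-birth-year point in the comment above can be sketched with a small simulation. This is only an illustration under stated assumptions, not the exam's data: the survey year (`SURVEY_YEAR`), the rule that the amount kept depends only on birth year, the noise level, and the helper `pearson` are all made up for the example. In a single cross-section, age is just the survey year minus the birth year, so the two are perfectly collinear and the data cannot tell an "age" explanation from a "cohort" explanation.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


SURVEY_YEAR = 2018  # hypothetical: everyone is observed in the same year
random.seed(0)

birth_years = [random.randint(1940, 2000) for _ in range(500)]
ages = [SURVEY_YEAR - b for b in birth_years]

# Assumed data-generating rule: the amount kept depends ONLY on birth cohort
# (earlier cohorts keep more), plus noise, clipped to the 0..100 USD range.
kept = [
    min(100.0, max(0.0, 50 + 0.5 * (1970 - b) + random.gauss(0, 5)))
    for b in birth_years
]

r_age = pearson(ages, kept)            # positive: older participants keep more
r_birth = pearson(birth_years, kept)   # same strength, opposite sign
r_age_birth = pearson(ages, birth_years)  # perfectly collinear in one cross-section

print(f"corr(age, kept)        = {r_age:.3f}")
print(f"corr(birth year, kept) = {r_birth:.3f}")
print(f"corr(age, birth year)  = {r_age_birth:.3f}")
```

Even though age plays no role in how `kept` was generated here, the correlation with age is as strong as the correlation with birth year, which is why repeated surveys in different years (or longitudinal data) would be needed to separate the two.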
up vote
7
down vote
I can postulate several causalities from your data.
1. The age is measured and then the amount of money kept. Older participants prefer to keep more money (maybe they are smarter or less idealistic, but that's not the point).
2. The amount of money kept is measured and then the age. People who keep more money spend more time counting it and are therefore older when the age is measured.
3. Sick people keep more money because they need money for (possibly life-saving) medication or treatment. The actual correlation is between sickness and money kept, but this variable is "hidden" and we therefore jump to the wrong conclusion, because age and likelihood of sickness correlate in the demographic group of persons chosen for the experiment.
(Omitting 143 theories; I need to keep this reasonably short)
147. The experimenter spoke in an old, obscure dialect which the young people did not understand and therefore mistakenly chose the wrong option.
Conclusion: you are correct, but your classmate might claim to be 147 times correcter.
Another famous correlation is between low IQ and hours of TV watched daily. Does watching TV make one dumb, or do dumb people watch more TV? It could even be both.
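The "hidden variable" theory (sickness) from the list above can be sketched as a simulation. Everything here is an assumption made for illustration: the probabilities, the amounts, the sample size, and the helper names `pearson` and `corr_within` are hypothetical and are not taken from the study. In this made-up world age has no direct effect on the amount kept; it only changes the chance of being sick, and sickness alone drives the amount.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


random.seed(1)
rows = []  # (age, sick, amount kept)
for _ in range(5000):
    age = random.uniform(20, 80)
    # Assumption: probability of being sick rises linearly with age (0.1 to 0.9).
    sick = random.random() < 0.1 + 0.8 * (age - 20) / 60
    # Assumption: the amount kept depends only on sickness, never on age itself.
    amount = (70 if sick else 40) + random.gauss(0, 10)
    rows.append((age, sick, min(100.0, max(0.0, amount))))


def corr_within(flag):
    """Correlation of age and amount kept among participants with sick == flag."""
    subset = [(a, k) for a, s, k in rows if s == flag]
    return pearson([a for a, _ in subset], [k for _, k in subset])


overall = pearson([a for a, _, _ in rows], [k for _, _, k in rows])
print(f"overall corr(age, kept):        {overall:.3f}")
print(f"among sick participants only:   {corr_within(True):.3f}")
print(f"among healthy participants only: {corr_within(False):.3f}")
```

The overall correlation is clearly positive, yet it nearly vanishes once sick and healthy participants are looked at separately. With real data the hidden variable is usually not recorded, so this kind of check is not available, which is exactly the problem.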
The young could be under-valuing their worth, suggesting that they are poor at leadership. If they don't understand value, how can they decide strategically, or even just rationally, about it?
– EngrStudent
2 days ago
4
It's not clear what you're getting at with the "classmate might claim to be 147 times correcter". The classmate is wrong - this data does not imply the conclusion that age causes a lack of sharing.
– Nuclear Wang
2 days ago
@NuclearWang I think the point is that when you have 150 equally probable hypotheses, none are probable. It's not strict so much as an attempt at illustration.
– aaaaaa
yesterday
1
Another theory: survivorship bias.
– R..
yesterday
up vote
5
down vote
Correlation is a mathematical concept; causality is a philosophical idea.
On the other hand, spurious correlation is a mostly technical (you won't find it in measure-theoretical probability textbooks) concept that can be defined in a way that's mostly actionable.
This idea is closely related to the idea of falsificationism in science -- where the goal is never to prove things, only to disprove them.
Statistics is to mathematics as medicine is to biology. You're asked to make your best judgement with the support of a wealth of technical knowledge, but this knowledge is never enough to cover the whole world. So if you're going to make judgements as a statistician and present them to others, you need to ensure certain standards of quality are met; i.e. that you're giving sound advice, giving them their money's worth. This also means taking the asymmetry of risks into consideration -- in medical testing, the cost of giving a false negative result (which may prevent people from getting early treatment) may be higher than the cost of giving a false positive (which causes distress).
In practice these standards will vary from field to field -- sometimes it's triple-blind RCTs, sometimes it's instrumental variables and other techniques to control for reverse causation and hidden common causes, sometimes it's Granger causality -- that something in the past consistently correlates with something else in the present, but not in the reverse direction. It might even be rigorous regularization and cross-validation.
up vote
4
down vote
Inferring causation from correlation in general is problematic because there may be a number of other reasons for the correlation. For example, the correlation may be spurious due to confounders or selection bias (e.g., only choosing participants with an income below a certain threshold), or the causal effect may simply go in the other direction (e.g., a thermometer reading is correlated with temperature but certainly does not cause it). In each of these cases, your classmate's procedure might find a causal effect where there is none.
However, if the participants were randomly selected, we could rule out confounders and selection bias. In that case, either age must cause money kept or money kept must cause age. The latter would imply that forcing someone to keep a certain amount of money would somehow change their age. So we can safely assume that age causes money kept.
Note that the causal effect could be "direct" or "indirect". People of different ages will have received a different education, have a different amount of wealth, etc., and for these reasons might choose to keep a different amount of the $100. Causal effects via these mediators are still causal effects but are indirect.
3
In the second paragraph you say that it must be causation. Note that it could still be noise from the random selection (some older participants may spend money [what would they keep it for?] while some younger participants may keep money [I want to retire/buy a house]).
– Llopis
2 days ago
Here I started from the assumption that correlation is established and estimation error due to limited data can be ignored.
– Lucas
2 days ago
1
Is random selection enough? In simple experimental designs, we want random assignment of the "treatment" --- here, age --- for valid judgments regarding causal effects. (Of course, we can't assign people different ages, so this simple experimental design may not be possible to apply.)
– locobro
2 days ago
That's a good question. If we were randomly sampling from a biased pool, random selection wouldn't get rid of this bias. I think the assumption here is that for the same reason that age can't be assigned, there can be no confounding of age (no arrows going into age in the causal diagram). Therefore, observation is as good as assignment (i.e., $p(y \mid \text{do}(age)) = p(y \mid age)$ in the language of do-calculus) when there is no selection bias.
– Lucas
2 days ago
4
You've excluded a possibility. A correlation between A and B can be explained as follows: A might cause B, or B might cause A, or another previously unknown factor C might cause both A and B.
– Tim Randall
yesterday
up vote
3
down vote
The relationship between correlation and causation has stumped philosophers and statisticians alike for centuries. Finally, over the last twenty years or so computer scientists claim to have sorted it all out. This does not seem to be widely known. Fortunately Judea Pearl, a prime mover in this field, has recently published a book explaining this work for a popular audience: The Book of Why.
https://www.amazon.com/Book-Why-Science-Cause-Effect/dp/046509760X
https://bigthink.com/errors-we-live-by/judea-pearls-the-book-of-why-brings-news-of-a-new-science-of-causes
Spoiler alert: You can infer causation from correlation in some circumstances if you know what you are doing. You need to make some causal assumptions to start with (a causal model, ideally based on science). And you need the tools to do counterfactual reasoning (the do-calculus). Sorry I can't distill this down to a few lines (I'm still reading the book myself), but I think the answer to your question is in there.
4
Pearl & his work are quite prominent. It would be an uncommon statistician who's never heard of this. Note that whether he has truly "sorted it all out" is very much open for debate. There is no question that his methods work on paper (when you can guarantee the assumptions are met), but how well it works in real situations is much hazier.
– gung♦
2 days ago
3
I want to give a (+1) and (-1) at the same time, so no vote from me. The (+1) is for mentioning Judea Pearl and his work; his work definitely helped establish the field of causal statistics. The (-1) is for saying it has stumped philosophers and statisticians for centuries but now Pearl solved it. I believe that Pearl's approach is the best way to think about things, but at the same time, if you use this approach (which you should), your answer is "if my untestable assumptions are correct, I've shown a causal relation. Let's cross our fingers about those assumptions".
– Cliff AB
yesterday
1
btw, my last sentence isn't knocking Pearl's approach. Rather, it's recognizing that causal inference is still very hard and you need to be honest about the limitations of your analysis.
– Cliff AB
yesterday
1
Pearl promotes a kind of neo-Bayesianism (following in the footsteps of the great E.T. Jaynes) which is worth knowing. But your own answer says: << You need to make some causal assumptions to start with (a causal model, ideally based on science).>> -- there you go. Jaynes was a prominent critic of mainstream statistics, which shies away from giving explicit priors and instead contrives "objective" systems where causality is lost. Pearl goes further and gives us tools to propagate causality assumptions from priors to posteriors -- which is not causality ex nihilo.
– user8948
yesterday
1
I'm also avoiding a +1 for the overly poetic part at the beginning. I mean, a lot of things have "stumped [intellectuals of some sort] for ages", but such observations tend to be the result of biased sampling and play into this false narrative of human knowledge as though it's some sort of block chain that everyone reads and writes to. But, it seems baseless to assert that no one across the ages has understood a concept merely because it was misunderstood by others. Sorry to rant, just, the initial dramatic language seems to detract from the rest.
– Nat
9 hours ago
up vote
2
down vote
No. There is a one-way logical relationship between causality and correlation.
Consider correlation a property you calculate on some data, e.g. the most common (linear) correlation as defined by Pearson. For this particular definition of correlation you can create random data points that will have a correlation of zero or of one without having any kind of causality between them, just by having certain (a)symmetries.
For any definition of correlation you can create a prescription that will show both behaviours: high values of correlation with no mathematical relation in between and low values of correlation, even if there is a fixed expression.
Yes, the case of "unrelated, but highly correlated" is weaker than "no correlation despite being related". But the only thing (!) the presence of a correlation tells you is that you have to look harder for an explanation for it.
A higher bar than "no correlation" is statistical independence, which implies e.g. P(A|B) = P(A). Indeed, Pearson correlation zero does not imply statistical independence, but e.g. zero distance correlation does.
– user8948
yesterday
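The distinction drawn above between zero Pearson correlation and statistical independence can be checked on a tiny hand-made dataset. This is a minimal sketch with illustrative values; the helper `pearson` and the choice of y = x² on points symmetric around zero are assumptions made for the example. Here y is completely determined by x, yet the linear correlation is zero, while P(A|B) = P(A) visibly fails.

```python
def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


# Points symmetric around zero, with y a deterministic function of x.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

r = pearson(xs, ys)

# Independence would require P(y = 9 | x = 3) == P(y = 9); compare the two.
p_y9 = sum(1 for y in ys if y == 9) / len(ys)
n_x3 = sum(1 for x in xs if x == 3)
p_y9_given_x3 = sum(1 for x, y in zip(xs, ys) if x == 3 and y == 9) / n_x3

print("Pearson r:", r)
print("P(y = 9):", p_y9)
print("P(y = 9 | x = 3):", p_y9_given_x3)
```

The symmetry makes the positive and negative contributions to the covariance cancel, so the linear measure sees nothing even though knowing x fixes y exactly.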
up vote
1
down vote
Causal claim for age would be inappropriate in this case
The problem with claiming causality in your exam question design can be boiled down to one simple fact: aging was not a treatment, age was not manipulated at all. The main reason to do controlled studies is precisely because, due to the manipulation and control over the variables of interest, you can say that the change in one variable causes the change in the outcome (under extremely specific experimental conditions and with a boat-load of other assumptions like random assignment and that the experimenter didn't screw up something in the execution details, which I casually gloss over here).
But that's not what the exam design describes - it simply has two groups of participants, with one known fact that distinguishes them (their age); but you have no way of knowing any of the other ways the groups differ. Due to the lack of control, you cannot know whether it was the difference in age that caused the change in outcome, or whether it is because the 40-year-olds joined the study because they needed the money while the 20-year-olds were students participating for class credit and so had different motivations - or any one of a thousand other possible natural differences between your groups.
Now, the technical terminology for these sorts of things varies by field. Common terms for things like participant age and gender are "participant attribute", "extraneous variable", "attribute independent variable", etc. Ultimately you end up with something that is not a "true experiment" or a "true controlled experiment", because the thing you want to make a claim about - like age - wasn't really in your control to change, so the most you can hope for without far more advanced methods (like causal inference, additional conditions, longitudinal data, etc.) is to claim there is a correlation.
This also happens to be one of the reasons why experiments in social science, and understanding hard-to-control attributes of people, are so tricky in practice - people differ in lots of ways, and when you can't change the things you want to learn about, you tend to need more complex experimental and inferential techniques or a different strategy entirely.
How could you change the design to make a causal claim?
Imagine a hypothetical scenario like this: Group A and B are both made up of participants who are 20 years old.
You have Group A play the dictator game as usual.
For Group B, you take out a Magical Aging Ray of Science (or perhaps have a Ghost frighten them with its horrifying visage), which you have carefully tuned to age all the participants in Group B so that they are now 40 years old while otherwise leaving them unchanged, and then have them play the dictator game just as Group A did.
For extra rigor you could get a Group C of naturally-aged 40-year-olds to confirm the synthetic aging is comparable to natural aging, but let's keep things simple and say we know that artificial aging is just like the real thing based on "prior work".
Now, if Group B keeps more money than Group A, you can claim that the experiment indicates that aging causes people to keep more of the money. Of course there are still approximately a thousand reasons why your claim could turn out to be wrong, but your experiment at least has a valid causal interpretation.
up vote
1
down vote
Generally you can't jump from correlation to causation. For example, there's a well-known social science phenomenon about social status/class, and propensity to spend/save. For many many years it was believed that this showed causation. Last year more intensive research showed it wasn't.
Classic "correlation isn't causation" - in this case, the confounding factor was that growing up in poverty teaches people to use money differently, and spend if there is a surplus, because it may not be there tomorrow even if saved for various reasons.
In your example, suppose the older people all lived through a war, which the younger people didn't. The link might be that people who grew up in social chaos, with real risk of harm and loss of life, learn to prioritise saving resources for themselves and against need, more than those who grow up in happier circumstances where the state, employers, or health insurers will take care of it, and survival isn't an issue that shaped their outlook. Then you would get the same apparent link - older people (including those closer to their generation) keep more, but it would only apparently be linked to age. In reality the causative element is the social situation one spent formative years in, and what habits that taught - not age per se.
up vote
0
down vote
Causality and correlation are different categories of things. That is why correlation alone is not sufficient to infer causality.
For example, causality is directional, while correlation is not. When inferring causality, you need to establish what is cause and what is effect.
There are other things that might interfere with your inference: hidden or third variables, and all the usual statistical questions (sample selection, sample size, etc.).
But assuming that your statistics are properly done, correlation can provide clues about causality. Typically, if you find a correlation, it means that there is some kind of causality somewhere and you should start looking for it.
You can absolutely start with a hypothesis derived from your correlation. But a hypothesis is not a causality, it is merely a possibility of a causality. You then need to test it. If your hypothesis resists sufficient falsification attempts, you may be on to something.
For example, in your age-causes-greed hypothesis, one alternative hypothesis would be that it is not age, but length of being a dictator. So you would look for old, but recently-empowered dictators as a control group, and young-but-dictator-since-childhood as a second one and check the results there.
The top-voted answer above (score 10) was answered 2 days ago by MikeP.
In addition to the other answers, the current experiment cannot discern between the model where the money kept is a function of the age, and where the money kept is a function of the year of birth. Note that the second model may be non-linear across history, and that 20-years old taken from different historical periods may decide to keep very different amounts of cash.
â NofP
7 hours ago
add a comment |Â
In addition to the other answers, the current experiment cannot discern between the model where the money kept is a function of the age, and where the money kept is a function of the year of birth. Note that the second model may be non-linear across history, and that 20-years old taken from different historical periods may decide to keep very different amounts of cash.
â NofP
7 hours ago
In addition to the other answers, the current experiment cannot discern between the model where the money kept is a function of the age, and where the money kept is a function of the year of birth. Note that the second model may be non-linear across history, and that 20-years old taken from different historical periods may decide to keep very different amounts of cash.
â NofP
7 hours ago
In addition to the other answers, the current experiment cannot discern between the model where the money kept is a function of the age, and where the money kept is a function of the year of birth. Note that the second model may be non-linear across history, and that 20-years old taken from different historical periods may decide to keep very different amounts of cash.
â NofP
7 hours ago
add a comment |Â
up vote
7
down vote
I can postulate several causalities from your data.
The age is measured and then the amount of money kept. Older participants prefer to keep more money (maybe they are smarter or less idealistic, but that's not the point).
The amount of money kept is measured and then the age. People who keep more money spend more time time counting it and are therefore older when the age is measured.
Sick people keep more money because they need money for (possibly life-saving) medication or treatment. The actual correlation is between sickness and money kept, but this variable is "hidden" and we therefore jump to the wrong conclusion, because age and likelihood of sickness correlates in the demographic group of persons chosen for experiment.
(Omitting 143 theories; I need to keep this reasonably short)
- The experimenter spoke in an old, obscure dialect which the young people did not understand and therefore mistakenly chose the wrong option.
Conclusion: you are correct, but your classmate might claim to be 147 times correcter.
Another famous correlation is between low IQ and hours of TV watched daily. Does watching TV make one dumb, or do dumb people watch more TV? It could even be both.
New contributor
The young could be under-valuing their worth, suggesting that they are poor at leadership. If they don't understand value, why can they decide strategically or even just rationally about it.
â EngrStudent
2 days ago
4
It's not clear what you're getting at with the "classmate might claim to be 147 times correcter". The classmate is wrong - this data does not imply the conclusion that age causes a lack of sharing.
â Nuclear Wang
2 days ago
@NuclearWang i think the point is that when you have 150 equally probably hypotheses, none are probable. Its not strict, as much as am illustration attempt
â aaaaaa
yesterday
1
Another theory: survivorship bias.
â R..
yesterday
add a comment |Â
edited yesterday by Nick Cox
answered 2 days ago by Klaws
up vote
5
down vote
Correlation is a mathematical concept; causality is a philosophical idea.
On the other hand, spurious correlation is a mostly technical concept (you won't find it in measure-theoretic probability textbooks) that can be defined in a way that's mostly actionable.
This idea is closely related to falsificationism in science, where the goal is never to prove things, only to disprove them.
Statistics is to mathematics as medicine is to biology. You're asked to make your best judgement with the support of a wealth of technical knowledge, but this knowledge is never enough to cover the whole world. So if you're going to make judgements as a statistician and present them to others, you need to ensure that certain standards of quality are met; i.e. that you're giving sound advice, giving them their money's worth. This also means taking the asymmetry of risks into consideration: in medical testing, the cost of giving a false negative result (which may prevent people from getting early treatment) may be higher than the cost of giving a false positive (which causes distress).
In practice these standards will vary from field to field. Sometimes it's triple-blind RCTs; sometimes it's instrumental variables and other techniques to control for reverse causation and hidden common causes; sometimes it's Granger causality, where something in the past consistently correlates with something else in the present, but not in the reverse direction. It might even be rigorous regularization and cross-validation.
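The "past correlates with present, but not the reverse" idea can be sketched with two simulated series. This is only a lagged-correlation illustration under made-up assumptions (the coefficient 0.8 and the noise levels are arbitrary), not a proper Granger test, which would fit regressions that also control for each series' own past.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(1)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
# Assumption: y at time t is driven by x at time t-1 plus independent noise.
y = [rng.gauss(0, 1)] + [0.8 * x[t - 1] + rng.gauss(0, 0.5) for t in range(1, n)]

x_leads_y = pearson(x[:-1], y[1:])  # past x against present y
y_leads_x = pearson(y[:-1], x[1:])  # past y against present x

print(f"past x vs present y: {x_leads_y:.2f}")
print(f"past y vs present x: {y_leads_x:.2f}")
```

The asymmetry between the two numbers is what such a method looks for; a hidden common cause acting with different delays could still produce the same pattern.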
answered 2 days ago by user8948
up vote
4
down vote
Inferring causation from correlation is problematic in general because there may be a number of other reasons for the correlation: confounders may produce a spurious correlation, there may be selection bias (e.g., only choosing participants with an income below a certain threshold), or the causal effect may simply go in the other direction (e.g., a thermometer reading is correlated with temperature but certainly does not cause it). In each of these cases, your classmate's procedure might find a causal effect where there is none.
However, if the participants were randomly selected, we could rule out confounders and selection bias. In that case, either age must cause money kept or money kept must cause age. The latter would imply that forcing someone to keep a certain amount of money would somehow change their age. So we can safely assume that age causes money kept.
Note that the causal effect could be "direct" or "indirect". People of different ages will have received a different education, have a different amount of wealth, etc., and for these reasons might choose to keep a different amount of the $100. Causal effects via these mediators are still causal effects, but they are indirect.
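The selection-bias case mentioned in the first paragraph can be sketched as follows. All numbers are hypothetical: age and money kept are generated independently, and an invented "income score" is assumed to rise with both age and frugality. Admitting only participants below an income threshold then creates a correlation between age and money kept that does not exist in the population.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(42)
population = []
for _ in range(20000):
    age = rng.uniform(18, 80)
    kept = rng.uniform(0, 100)  # generated independently of age
    # Assumption: income rises with both age and frugality (money kept).
    income_score = (age - 18) / 62 + kept / 100 + rng.gauss(0, 0.2)
    population.append((age, kept, income_score))

# Hypothetical recruitment rule: only people below an income threshold take part.
selected = [p for p in population if p[2] < 1.0]

r_population = pearson([p[0] for p in population], [p[1] for p in population])
r_selected = pearson([p[0] for p in selected], [p[1] for p in selected])

print(f"age vs kept, whole population: {r_population:.2f}")
print(f"age vs kept, selected sample:  {r_selected:.2f}")
```

With this particular rule the induced correlation is negative; a different selection rule could just as well induce a positive one, which is why the sampling procedure matters for the exam question.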
3
In the second paragraph you say that it must be causation. Note that it could still be noise from the random selection (some older participants spend money [what would they keep it for?] and some young participants keep money [I want to retire/buy a house]).
– Llopis
2 days ago
Here I started from the assumption that the correlation is established and estimation error due to limited data can be ignored.
– Lucas
2 days ago
1
Is random selection enough? In simple experimental designs, we want random assignment of the "treatment" (here, age) for valid judgments regarding causal effects. (Of course, we can't assign people different ages, so this simple experimental design may not be possible to apply.)
– locobro
2 days ago
That's a good question. If we were randomly sampling from a biased pool, random selection wouldn't get rid of this bias. I think the assumption here is that for the same reason that age can't be assigned, there can be no confounding of age (no arrows going into age in the causal diagram). Therefore, observation is as good as assignment (i.e., $p(y \mid \text{do}(\text{age})) = p(y \mid \text{age})$ in the language of do-calculus) when there is no selection bias.
– Lucas
2 days ago
4
You've excluded a possibility. A correlation between A and B can be explained as follows: A might cause B, or B might cause A, or another previously unknown factor C might cause both A and B.
– Tim Randall
yesterday
answered 2 days ago by Lucas (edited 2 days ago)
up vote
3
down vote
The relationship between correlation and causation has stumped philosophers and statisticians alike for centuries. Over the last twenty years or so, computer scientists claim to have finally sorted it all out. This does not seem to be widely known. Fortunately Judea Pearl, a prime mover in this field, has recently published a book explaining this work for a popular audience: The Book of Why.
https://www.amazon.com/Book-Why-Science-Cause-Effect/dp/046509760X
https://bigthink.com/errors-we-live-by/judea-pearls-the-book-of-why-brings-news-of-a-new-science-of-causes
Spoiler alert: You can infer causation from correlation in some circumstances if you know what you are doing. You need to make some causal assumptions to start with (a causal model, ideally based on science). And you need the tools to do counterfactual reasoning (the do-calculus). Sorry I can't distill this down to a few lines (I'm still reading the book myself), but I think the answer to your question is in there.
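As a rough illustration of what such a tool looks like, one standard result from this framework is the back-door adjustment. Assuming you have a causal diagram in which a set of observed variables $z$ blocks every back-door path from $x$ to $y$ (here $x$, $y$ and $z$ are generic placeholders, and the diagram itself is the untestable causal assumption), the effect of intervening on $x$ can be computed from purely observational quantities:

$$p(y \mid \text{do}(x)) = \sum_z p(y \mid x, z)\, p(z)$$

If the diagram has no back-door paths at all, no adjustment is needed and this reduces to $p(y \mid \text{do}(x)) = p(y \mid x)$. In both cases the formula is only as trustworthy as the assumed diagram.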
4
Pearl & his work are quite prominent. It would be an uncommon statistician who's never heard of this. Note that whether he has truly "sorted it all out" is very much open for debate. There is no question that his methods work on paper (when you can guarantee the assumptions are met), but how well it works in real situations is much hazier.
– gung ♦
2 days ago
3
I want to give a (+1) and (-1) at the same time, so no vote from me. The (+1) is for mentioning Judea Pearl and his work; his work definitely helped establish the field of causal statistics. The (-1) is for saying it has stumped philosophers and statisticians for centuries but now Pearl solved it. I believe that Pearl's approach is the best way to think about things, but at the same time, if you use this approach (which you should), your answer is "if my untestable assumptions are correct, I've shown a causal relation. Let's cross our fingers about those assumptions".
– Cliff AB
yesterday
1
btw, my last sentence isn't knocking Pearl's approach. Rather, it's recognizing that causal inference is still very hard and you need to be honest about the limitations of your analysis.
– Cliff AB
yesterday
1
Pearl promotes a kind of neo-Bayesianism (following in the footsteps of the great E.T. Jaynes) which is worth knowing. But your own answer says: "You need to make some causal assumptions to start with (a causal model, ideally based on science)." There you go. Jaynes was a prominent critic of mainstream statistics, which shies away from giving explicit priors and instead contrives "objective" systems where causality is lost. Pearl goes further and gives us tools to propagate causality assumptions from priors to posteriors, which is not causality ex nihilo.
– user8948
yesterday
1
I'm also avoiding a +1 for the overly poetic part at the beginning. I mean, a lot of things have "stumped [intellectuals of some sort] for ages", but such observations tend to be the result of biased sampling and play into this false narrative of human knowledge as though it's some sort of blockchain that everyone reads and writes to. But it seems baseless to assert that no one across the ages has understood a concept merely because it was misunderstood by others. Sorry to rant; it's just that the initial dramatic language seems to detract from the rest.
– Nat
9 hours ago
up vote
3
down vote
The relationship between correlation and causation has stumped philosophers and statisticians alike for centuries. Finally, over the last twenty years or so computer scientists claim to have sorted it all out. This does not seem to be widely known. Fortunately Judea Pearl, a prime mover in this field, has recently published a book explaining this work for a popular audience: The Book of Why.
https://www.amazon.com/Book-Why-Science-Cause-Effect/dp/046509760X
https://bigthink.com/errors-we-live-by/judea-pearls-the-book-of-why-brings-news-of-a-new-science-of-causes
The relationship between correlation and causation has stumped philosophers and statisticians alike for centuries. Finally, over the last twenty years or so computer scientists claim to have sorted it all out. This does not seem to be widely known. Fortunately Judea Pearl, a prime mover in this field, has recently published a book explaining this work for a popular audience: The Book of Why.
https://www.amazon.com/Book-Why-Science-Cause-Effect/dp/046509760X
https://bigthink.com/errors-we-live-by/judea-pearls-the-book-of-why-brings-news-of-a-new-science-of-causes
Spoiler alert: You can infer causation from correlation in some circumstances if you know what you are doing. You need to make some causal assumptions to start with (a causal model, ideally based on science). And you need the tools to do counterfactual reasoning (the do-calculus). Sorry I can't distill this down to a few lines (I'm still reading the book myself), but I think the answer to your question is in there.
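To give a rough flavour of what those tools look like, here is the standard back-door adjustment, sketched for this question under assumptions that are mine and not part of the exam: X is age, Y is the amount kept, and Z is a hypothetical set of observed variables (for instance birth cohort) that blocks every back-door path from X to Y in the assumed causal model.

$$P(y \mid do(x)) = \sum_z P(y \mid x, z)\, P(z)$$

The ordinary conditional distribution is instead

$$P(y \mid x) = \sum_z P(y \mid x, z)\, P(z \mid x)$$

so the two agree, for example, when Z is independent of X. Whether any real Z satisfies the assumption cannot be checked from the correlation alone.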
edited 7 hours ago by Community♦
answered 2 days ago by gareth
4
Pearl & his work are quite prominent. It would be an uncommon statistician who's never heard of this. Note that whether he has truly "sorted it all out" is very much open for debate. There is no question that his methods work on paper (when you can guarantee the assumptions are met), but how well they work in real situations is much hazier.
– gung♦
2 days ago
3
I want to give a (+1) and (-1) at the same time, so no vote from me. The (+1) is for mentioning Judea Pearl and his work; his work has definitely helped establish the field of causal statistics. The (-1) is for saying it has stumped philosophers and statisticians for centuries but now Pearl solved it. I believe that Pearl's approach is the best way to think about things, but at the same time, if you use this approach (which you should), your answer is "if my untestable assumptions are correct, I've shown a causal relation. Let's cross our fingers about those assumptions".
– Cliff AB
yesterday
1
btw, my last sentence isn't knocking Pearl's approach. Rather, it's recognizing that causal inference is still very hard and you need to be honest about the limitations of your analysis.
– Cliff AB
yesterday
1
Pearl promotes a kind of neo-Bayesianism (following in the footsteps of the great E.T. Jaynes) which is worth knowing. But your own answer says: "You need to make some causal assumptions to start with (a causal model, ideally based on science)." There you go. Jaynes was a prominent critic of mainstream statistics, which shies away from giving explicit priors and instead contrives "objective" systems where causality is lost. Pearl goes further and gives us tools to propagate causality assumptions from priors to posteriors, which is not causality ex nihilo.
– user8948
yesterday
1
I'm also avoiding a +1 for the overly poetic part at the beginning. I mean, a lot of things have "stumped [intellectuals of some sort] for ages", but such observations tend to be the result of biased sampling and play into this false narrative of human knowledge as though it's some sort of blockchain that everyone reads and writes to. But it seems baseless to assert that no one across the ages has understood a concept merely because it was misunderstood by others. Sorry to rant; it's just that the initial dramatic language seems to detract from the rest.
– Nat
9 hours ago
No. There is a one-way logical relationship between causality and correlation.
Consider correlation a property you calculate on some data, e.g. the most common (linear) correlation as defined by Pearson. For this particular definition of correlation you can create random data points that will have a correlation of zero or of one without having any kind of causality between them, just by having certain (a)symmetries.
For any definition of correlation you can create a prescription that will show both behaviours: high correlation with no mathematical relation between the variables, and low correlation even when one variable is a fixed function of the other.
Yes, the step from "highly correlated" to "unrelated" is harder to arrange than the step from "related" to "no correlation". But when a correlation is present, the only thing it indicates (!) is that you have to look harder for an explanation for it.
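Both behaviours are easy to reproduce for Pearson's coefficient. The snippet below is only a small sketch with made-up data: the `pearson` helper, the seed, and the specific points are my own choices for illustration. The first case uses y = x^2 on x values that are symmetric around zero; the second uses just two randomly drawn points, which always lie on a straight line.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Case 1: y is a fixed function of x, yet the symmetry of x around zero
# makes the linear correlation vanish.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]
r_dependent = pearson(xs, ys)

# Case 2: two points drawn at random, with no relation between the
# coordinates. Any two distinct points lie on a line, so |r| is 1.
rng = random.Random(0)  # arbitrary seed, used only for reproducibility
us = [rng.random(), rng.random()]
vs = [rng.random(), rng.random()]
r_unrelated = pearson(us, vs)

print(f"y = x^2 on symmetric x: r = {r_dependent:.3f}")
print(f"two unrelated random points: |r| = {abs(r_unrelated):.3f}")
```

Neither number says anything about causes: the first hides a deterministic relation, the second reports a perfect "relation" that is an artefact of the sample size.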
edited yesterday by Nick Cox
answered 2 days ago by cherub
A higher bar than "no correlation" is statistical independence, which implies e.g. P(A|B) = P(A). Indeed, Pearson correlation zero does not imply statistical independence, but e.g. zero distance correlation does.
– user8948
yesterday
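A small worked example of that gap, using a toy distribution chosen here purely for illustration: let X take the values -1, 0, 1 with probability 1/3 each, and let Y = X^2.

$$E[X] = 0, \qquad E[XY] = E[X^3] = 0, \qquad \operatorname{Cov}(X, Y) = E[XY] - E[X]\,E[Y] = 0$$

$$P(Y = 0) = \tfrac{1}{3}, \qquad P(Y = 0 \mid X = 0) = 1$$

The Pearson correlation is zero, yet P(A|B) = P(A) fails for A = {Y = 0} and B = {X = 0}, so X and Y are not independent.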
Causal claim for age would be inappropriate in this case
The problem with claiming causality in your exam question's design boils down to one simple fact: aging was not a treatment; age was not manipulated at all. The main reason to run controlled studies is that, because you manipulate and control the variables of interest, you can say that a change in one variable causes the change in the outcome (under extremely specific experimental conditions and with a boatload of other assumptions, such as random assignment and the experimenter not botching the execution details, which I casually gloss over here).
But that's not what the exam design describes. It simply has two groups of participants with one known difference between them (their age), and you have no way of knowing any of the other ways the groups differ. Due to the lack of control, you cannot know whether it was the difference in age that caused the change in outcome, or whether, say, the 40-year-olds joined the study because they needed the money while the 20-year-olds were students participating for class credit and so had different motivations, or any one of a thousand other possible natural differences between your groups.
Now, the technical terminology for these sorts of things varies by field. Common terms for things like participant age and gender are "participant attribute", "extraneous variable", "attribute independent variable", etc. Ultimately you end up with something that is not a "true experiment" or a "true controlled experiment", because the thing you want to make a claim about (like age) wasn't really in your control to change. The most you can hope for without far more advanced methods (like causal inference, additional conditions, longitudinal data, etc.) is to claim there is a correlation.
This is also one of the reasons why social science experiments, and understanding hard-to-control attributes of people, are so tricky in practice: people differ in lots of ways, and when you can't change the things you want to learn about, you tend to need more complex experimental and inferential techniques or a different strategy entirely.
How could you change the design to make a causal claim?
Imagine a hypothetical scenario like this: Group A and B are both made up of participants who are 20 years old.
You have Group A play the dictator game as usual.
For Group B, you take out a Magical Aging Ray of Science (or perhaps have a Ghost frighten them with its horrifying visage), which you have carefully tuned to age all the participants in Group B so that they are now 40 years old, but otherwise leave them unchanged, and then have them play the dictator game just as Group A did.
For extra rigor you could get a Group C of naturally aged 40-year-olds to confirm the synthetic aging is comparable to natural aging, but let's keep things simple and say we know that artificial aging is just like the real thing based on "prior work".
Now, if Group B keeps more money than Group A, you can claim that the experiment indicates that aging causes people to keep more of the money. Of course there are still approximately a thousand reasons why your claim could turn out to be wrong, but your experiment at least has a valid causal interpretation.
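The contrast between the two designs can be sketched with a toy simulation. Everything in it is an assumption made up for illustration, not data from any study: the only thing that drives the amount kept is whether a participant needs the money, age has no effect at all, and the probabilities, sample size, and seed are arbitrary.

```python
import random

rng = random.Random(42)  # arbitrary seed, used only for reproducibility


def kept_amount(needs_money):
    """Toy outcome: dollars kept out of 100. Age has NO effect by construction."""
    base = 50 + (20 if needs_money else 0)
    return min(100.0, max(0.0, rng.gauss(base, 10)))


def mean(values):
    return sum(values) / len(values)


def sample_group(n, p_needs_money):
    """Draw n participants; each needs money with probability p_needs_money."""
    return [kept_amount(rng.random() < p_needs_money) for _ in range(n)]


n = 2000

# Observational design: the groups differ in age AND in why they signed up.
# (Assumed here: older volunteers are far more likely to need the money.)
young_obs = sample_group(n, p_needs_money=0.2)
old_obs = sample_group(n, p_needs_money=0.8)
observational_gap = mean(old_obs) - mean(young_obs)

# Hypothetical experiment: both groups come from the same pool of
# 20-year-olds, and the "aging ray" applied to Group B changes nothing
# but age, which has no effect in this toy model.
group_a = sample_group(n, p_needs_money=0.2)
group_b = sample_group(n, p_needs_money=0.2)
experimental_gap = mean(group_b) - mean(group_a)

print(f"observational gap (old - young): {observational_gap:.1f}")
print(f"experimental gap (B - A):        {experimental_gap:.1f}")
```

In this made-up world the observational comparison shows a sizeable gap even though age does nothing, while the manipulated comparison shows a gap near zero. Real studies can of course go wrong in many other ways.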
edited 2 days ago
answered 2 days ago by BrianH
Generally you can't jump from correlation to causation. For example, there's a well-known social science phenomenon about social status/class and propensity to spend or save. For many years it was believed that this showed causation. Last year more intensive research showed it didn't.
Classic "correlation isn't causation": in this case, the confounding factor was that growing up in poverty teaches people to use money differently, and to spend a surplus when there is one, because for various reasons it may not be there tomorrow even if saved.
In your example, suppose the older people all lived through a war, which the younger people didn't. The link might be that people who grew up in social chaos, with real risk of harm and loss of life, learn to prioritise saving resources for themselves and against need, more than those who grow up in happier circumstances where the state, employers, or health insurers will take care of it, and survival isn't an issue that shaped their outlook. Then you would get the same apparent link - older people (including those closer to their generation) keep more, but it would only apparently be linked to age. In reality the causative element is the social situation one spent formative years in, and what habits that taught - not age per se.
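The war scenario can be turned into a small simulation. This is only a sketch under invented assumptions: the age ranges, the amounts, the seed, and the `pearson` helper are all mine, and the amount kept depends only on whether someone lived through the war, never on age itself.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(7)  # arbitrary seed, used only for reproducibility

ages, kept, war = [], [], []
for i in range(2000):
    lived_through_war = i % 2 == 0  # half of the sample, by construction
    # Assumed age ranges: the war cohort is 60-80, the others are 20-40.
    age = rng.uniform(60, 80) if lived_through_war else rng.uniform(20, 40)
    # The amount kept depends on the cohort only, not on age.
    amount = rng.gauss(70 if lived_through_war else 50, 10)
    ages.append(age)
    kept.append(min(100.0, max(0.0, amount)))
    war.append(lived_through_war)

# Correlation over everyone, ignoring the cohort.
r_overall = pearson(ages, kept)

# Correlation within each cohort separately.
r_war = pearson([a for a, w in zip(ages, war) if w],
                [k for k, w in zip(kept, war) if w])
r_no_war = pearson([a for a, w in zip(ages, war) if not w],
                   [k for k, w in zip(kept, war) if not w])

print(f"overall:          r = {r_overall:.2f}")
print(f"war cohort:       r = {r_war:.2f}")
print(f"non-war cohort:   r = {r_no_war:.2f}")
```

Pooled together, age and amount kept are clearly correlated; within either cohort the correlation is close to zero. The pooled number alone cannot tell you which story is true.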
edited yesterday
Nick Cox
37.5k478126
answered yesterday
Stilez
25914
up vote
0
down vote
Causality and correlation are different categories of things. That is why correlation alone is not sufficient to infer causality.
For example, causality is directional, while correlation is not. When inferring causality, you need to establish what is cause and what is effect.
There are other things that might interfere with your inference: hidden or third variables, and all the usual statistical questions (sample selection, sample size, etc.).
But assuming that your statistics are properly done, correlation can provide clues about causality. Typically, if you find a correlation, it means that there is some kind of causality somewhere and you should start looking for it.
You can absolutely start with a hypothesis derived from your correlation. But a hypothesis is not causality; it is merely a possible causal explanation. You then need to test it. If your hypothesis withstands sufficient falsification attempts, you may be on to something.
For example, for your age-causes-greed hypothesis, one alternative hypothesis would be that it is not age but the length of time spent as a dictator that matters. So you would look for old but recently empowered dictators as one control group and young dictators who have held the role since childhood as a second, and check the results there.
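The point about direction can be checked in a few lines. The sketch below uses simulated, made-up data (the slope 0.3 and the noise level are arbitrary assumptions, and `pearson` is a hand-written helper): the correlation of age with amount kept is identical to the correlation of amount kept with age, so the number alone cannot say which variable is the cause.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(42)
# Hypothetical data in which age really does drive the amount kept.
age = [rng.randint(18, 90) for _ in range(1000)]
kept = [50 + 0.3 * a + rng.gauss(0, 10) for a in age]

r_age_kept = pearson(age, kept)
r_kept_age = pearson(kept, age)
# The two values agree: swapping the roles of the variables changes nothing.
print(r_age_kept, r_kept_age)
```

Even though the data were generated with age as the cause, nothing in the coefficient records that; the direction has to come from the study design or from outside knowledge (here, that keeping money cannot change someone's age).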
answered 5 hours ago
Tom
1012
JonnyBravo is a new contributor. Be nice, and check out our Code of Conduct.
8
Normally you can't infer causality from correlation, unless you have a designed experiment.
– user2974951
2 days ago
5
Everything that we know about our world as individuals, we know through correlation. So yes, we can infer causality from correlation as far as it can be said that causality exists at all. Of course, doing it right is tricky.
– Aleksandr Dubinsky
2 days ago
Is this dictator game taking place in a lab, where assignment to be the dictator is random?
– Dimitriy V. Masterov
2 days ago
What was the sample size?
– EngrStudent
2 days ago
4
@DimitriyV.Masterov, most likely all participants were 'assigned' to be dictators & the second player was a plant. However, I'm sure no one was randomly assigned to their age.
– gung♦
2 days ago