Can you infer causality from correlation in this example of dictator game?








up vote
16
down vote

favorite
2












I've just had an exam where we were presented with two variables. In a dictator game where a dictator is given 100 USD, and can choose how much to send or keep for himself, there was a positive correlation between age and how much money the participants decided to keep.



My thinking is that you can't infer causality from this because you can't infer causation from correlation. My classmate thinks that you can because if you, for example, split the participants up into three separate groups, you can see how they differ in how much they keep and how much they share, and therefore conclude that age causes them to keep more. Who is correct and why?

























  • 8




    Normally you can't infer causality from correlation, unless you have a designed experiment.
    – user2974951
    2 days ago






  • 5




    Everything that we know about our world as individuals, we know through correlation. So yes, we can infer causality from correlation as far as it can be said that causality exists at all. Of course, doing it right is tricky.
    – Aleksandr Dubinsky
    2 days ago










  • Is this dictator game taking place in a lab, where assignment to be the dictator is random?
    – Dimitriy V. Masterov
    2 days ago










  • What was the sample size?
    – EngrStudent
    2 days ago






  • 4




    @DimitriyV.Masterov, most likely all participants were 'assigned' to be dictators & the second player was a plant. However, I'm sure no one was randomly assigned to their age.
    – gung♦
    2 days ago
















correlation causality






edited 17 mins ago









Carlos Cinelli










asked 2 days ago









JonnyBravo


















9 Answers

















up vote
10
down vote













In general you should not assume that correlation implies causality - even in cases where it seems that is the only possible reason.



Consider that there are other things that correlate with age - generational aspects of culture for example. Perhaps these three groups will remain the same even as they all age, but the next generation will buck the trend?



All that being said, you are probably right that older people are more likely to keep a larger amount, but just be aware there are other possibilities.


























  • In addition to the other answers, the current experiment cannot distinguish between the model where the money kept is a function of age and the model where the money kept is a function of the year of birth. Note that the second model may be non-linear across history, and that 20-year-olds taken from different historical periods may decide to keep very different amounts of cash.
    – NofP
    7 hours ago

















up vote
7
down vote













I can postulate several causalities from your data.



  1. The age is measured and then the amount of money kept. Older participants prefer to keep more money (maybe they are smarter or less idealistic, but that's not the point).


  2. The amount of money kept is measured and then the age. People who keep more money spend more time counting it and are therefore older when the age is measured.


  3. Sick people keep more money because they need money for (possibly life-saving) medication or treatment. The actual correlation is between sickness and money kept, but this variable is "hidden" and we therefore jump to the wrong conclusion, because age and likelihood of sickness correlate in the demographic group of persons chosen for the experiment.


(Omitting 143 theories; I need to keep this reasonably short)



  147. The experimenter spoke in an old, obscure dialect which the young people did not understand and therefore mistakenly chose the wrong option.

Conclusion: you are correct, but your classmate might claim to be 147 times correcter.



Another famous correlation is between low IQ and hours of TV watched daily. Does watching TV make one dumb, or do dumb people watch more TV? It could even be both.
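Hypothesis 3 can be made concrete with a small constructed data set. The sketch below is purely illustrative: the population, the sickness rates, and the amounts kept (`build_population`, `mean_kept`, 50 and 80 USD, the age cut-off of 50) are assumptions invented for this example, not data from the exam. Money kept depends only on sickness, yet older participants keep more on average because sickness becomes more common with age.

```python
def build_population():
    """Hypothetical data: sickness depends on age, money kept depends only on sickness."""
    people = []
    for age in range(20, 80):
        n_sick = (age - 20) // 6           # 0 of 10 sick at age 20, rising to 9 of 10 at age 79
        for i in range(10):
            sick = i < n_sick
            kept = 80 if sick else 50      # age never enters this line
            people.append((age, sick, kept))
    return people


def mean_kept(people, old, sick=None):
    """Mean amount kept among the old (age >= 50) or the young, optionally within one sickness stratum."""
    rows = [kept for age, is_sick, kept in people
            if (age >= 50) == old and (sick is None or is_sick == sick)]
    return sum(rows) / len(rows)


people = build_population()
print(mean_kept(people, old=True), mean_kept(people, old=False))                        # 71.0 56.0
print(mean_kept(people, old=True, sick=True), mean_kept(people, old=False, sick=True))  # 80.0 80.0
print(mean_kept(people, old=True, sick=False), mean_kept(people, old=False, sick=False))  # 50.0 50.0
```

The overall gap between old and young disappears once the comparison is made within the sick or within the healthy, which is the pattern one would expect if the hidden variable, not age, drives the amount kept.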























  • The young could be under-valuing their worth, suggesting that they are poor at leadership. If they don't understand value, how can they decide strategically, or even just rationally, about it?
    – EngrStudent
    2 days ago







  • 4




    It's not clear what you're getting at with the "classmate might claim to be 147 times correcter". The classmate is wrong - this data does not imply the conclusion that age causes a lack of sharing.
    – Nuclear Wang
    2 days ago










  • @NuclearWang I think the point is that when you have 150 equally probable hypotheses, none are probable. It's not strict, so much as an attempt at illustration.
    – aaaaaa
    yesterday







  • 1




    Another theory: survivorship bias.
    – R..
    yesterday

















up vote
5
down vote













Correlation is a mathematical concept; causality is a philosophical idea.



On the other hand, spurious correlation is a mostly technical (you won't find it in measure-theoretical probability textbooks) concept that can be defined in a way that's mostly actionable.



This idea is closely related to the idea of falsificationism in science -- where the goal is never to prove things, only to disprove them.



Statistics is to mathematics as medicine is to biology. You're asked to make your best judgement with the support of a wealth of technical knowledge, but this knowledge is never enough to cover the whole world. So if you're going to make judgements as a statistician and present them to others, you need to ensure that certain standards of quality are met; i.e. that you're giving sound advice, giving them their money's worth. This also means taking the asymmetry of risks into consideration -- in medical testing, the cost of giving a false negative result (which may prevent people from getting early treatment) may be higher than the cost of giving a false positive (which causes distress).



In practice these standards will vary from field to field -- sometimes it's triple-blind RCTs, sometimes it's instrumental variables and other techniques to control for reverse causation and hidden common causes, sometimes it's Granger causality -- that something in the past consistently correlates with something else in the present, but not in the reverse direction. It might even be rigorous regularization and cross-validation.






























    up vote
    4
    down vote













    Inferring causation from correlation in general is problematic because there may be a number of other reasons for the correlation. For example, the correlation may be spurious due to confounders or selection bias (e.g., only choosing participants with an income below a certain threshold), or the causal effect may simply go in the other direction (e.g., a thermometer reading is correlated with temperature but certainly does not cause it). In each of these cases, your classmate's procedure might find a causal effect where there is none.



    However, if the participants were randomly selected, we could rule out confounders and selection bias. In that case, either age must cause money kept or money kept must cause age. The latter would imply that forcing someone to keep a certain amount of money would somehow change their age. So we can safely assume that age causes money kept.



    Note that the causal effect could be "direct" or "indirect". People of different age will have received a different education, have a different amount of wealth, etc., and for these reasons might choose to keep a different amount of the $100. Causal effects via these mediators are still causal effects but are indirect.
























    • 3




      In the second paragraph you mention that it must be causation. Note that it could still be noise from the random selection (other older participants might spend the money [what would they keep it for?] and other young participants might keep it [I want to retire/buy a house]).
      – Llopis
      2 days ago










    • Here I started from the assumption that correlation is established and estimation error due to limited data can be ignored.
      – Lucas
      2 days ago







    • 1




      Is random selection enough? In simple experimental designs, we want random assignment of the "treatment" --- here, age --- for valid judgments regarding causal effects. (Of course, we can't assign people different ages, so this simple experimental design may not be possible to apply.)
      – locobro
      2 days ago










    • That's a good question. If we were randomly sampling from a biased pool, random selection wouldn't get rid of this bias. I think the assumption here is that for the same reason that age can't be assigned, there can be no confounding of age (no arrows going into age in the causal diagram). Therefore, observation is as good as assignment (i.e., $p(y \mid \text{do}(age)) = p(y \mid age)$ in the language of do-calculus) when there is no selection bias.
      – Lucas
      2 days ago







    • 4




      You've excluded a possibility. A correlation between A and B can be explained as follows: A might cause B, or B might cause A, or another previously unknown factor C might cause both A and B.
      – Tim Randall
      yesterday

















    up vote
    3
    down vote













    The relationship between correlation and causation has stumped philosophers and statisticians alike for centuries. Finally, over the last twenty years or so computer scientists claim to have sorted it all out. This does not seem to be widely known. Fortunately Judea Pearl, a prime mover in this field, has recently published a book explaining this work for a popular audience: The Book of Why.



    https://www.amazon.com/Book-Why-Science-Cause-Effect/dp/046509760X



    https://bigthink.com/errors-we-live-by/judea-pearls-the-book-of-why-brings-news-of-a-new-science-of-causes



    Spoiler alert: You can infer causation from correlation in some circumstances if you know what you are doing. You need to make some causal assumptions to start with (a causal model, ideally based on science). And you need the tools to do counterfactual reasoning (The do-algebra). Sorry I can't distill this down to a few lines (I'm still reading the book myself), but I think the answer to your question is in there.
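One standard result from Pearl's framework gives a flavour of what "causal assumptions plus the right tools" buys you. As a sketch only: assume there is a set of observed variables $Z$ that blocks every back-door path from age to the amount kept and contains no descendants of age. That assumption comes from the causal model, not from the data, and the symbols $a$, $Z$ and $\text{kept}$ are labels chosen here for illustration. Under it, the back-door adjustment formula expresses the interventional distribution in terms of purely observational quantities:

```latex
P(\text{kept} \mid \text{do}(\text{age} = a))
  = \sum_{z} P(\text{kept} \mid \text{age} = a,\; Z = z)\, P(Z = z)
```

If no such $Z$ is available or measured, the formula cannot be applied, and the correlation alone does not identify the causal effect.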
























    • 4




      Pearl & his work are quite prominent. It would be an uncommon statistician who's never heard of this. Note that whether he has truly "sorted it all out" is very much open for debate. There is no question that his methods work on paper (when you can guarantee the assumptions are met), but how well it works in real situations is much hazier.
      – gung♦
      2 days ago






    • 3




      I want to give a (+1) and (-1) at the same time, so no vote from me. The (+1) is for mentioning Judea Pearl and his work; his work definitely helped establish the field of causal statistics. The (-1) is for saying it has stumped philosophers and statisticians for centuries but now Pearl solved it. I believe that Pearl's approach is the best way to think about things, but at the same time, if you use this approach (which you should), your answer is "if my untestable assumptions are correct, I've shown a causal relation. Let's cross our fingers about those assumptions".
      – Cliff AB
      yesterday






    • 1




      btw, my last sentence isn't knocking Pearl's approach. Rather, it's recognizing that causal inference is still very hard and you need to be honest about the limitations of your analysis.
      – Cliff AB
      yesterday






    • 1




      Pearl promotes a kind of neo-Bayesianism (following in the footsteps of the great E.T. Jaynes) which is worth knowing. But your own answer says: << You need to make some causal assumptions to start with (a causal model, ideally based on science).>> -- there you go. Jaynes was a prominent critic of mainstream statistics, which shies away from giving explicit priors and instead contrives "objective" systems where causality is lost. Pearl goes further and gives us tools to propagate causality assumptions from priors to posteriors -- which is not causality ex nihilo.
      – user8948
      yesterday







    • 1




      I'm also avoiding a +1 for the overly poetic part at the beginning. I mean, a lot of things have "stumped [intellectuals of some sort] for ages", but such observations tend to be the result of biased sampling and play into this false narrative of human knowledge as though it's some sort of block chain that everyone reads and writes to. But, it seems baseless to assert that no one across the ages has understood a concept merely because it was misunderstood by others. Sorry to rant, just, the initial dramatic language seems to detract from the rest.
      – Nat
      9 hours ago


















    up vote
    2
    down vote













    No. There is a one-way logical relationship between causality and correlation.



    Consider correlation a property you calculate on some data, e.g. the most common (linear) correlation as defined by Pearson. For this particular definition of correlation you can create random data points that will have a correlation of zero or of one without having any kind of causality between them, just by having certain (a)symmetries.
    For any definition of correlation you can create a prescription that will show both behaviours: high values of correlation with no mathematical relation in between and low values of correlation, even if there is a fixed expression.



    Yes, the relation from "unrelated, but highly correlated" is weaker than "no correlation despite being related". But the only thing (!) the presence of a correlation indicates is that you have to look harder for an explanation for it.
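The "no correlation despite a fixed expression" case is easy to reproduce by hand. The following minimal sketch uses a hand-written `pearson` helper and made-up sample points (both are illustrative assumptions); because the x values are symmetric around zero, y = x² is completely determined by x and still has a Pearson correlation of exactly zero with it.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


xs = [-2, -1, 0, 1, 2]
squares = [x * x for x in xs]      # fully determined by x, but not linearly
line = [2 * x + 1 for x in xs]     # fully determined by x, and linear

print(pearson(xs, squares))  # 0.0
print(pearson(xs, line))     # 1.0 up to floating-point rounding
```

Pearson's coefficient only measures linear association, which is why the symmetric parabola is invisible to it.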




























    • A higher bar than "no correlation" is statistical independence, which implies e.g. P(A|B) = P(A). Indeed, Pearson correlation zero does not imply statistical independence, but e.g. zero distance correlation does.
      – user8948
      yesterday

















    up vote
    1
    down vote













    Causal claim for age would be inappropriate in this case



    The problem with claiming causality in your exam question design can be boiled down to one simple fact: aging was not a treatment, age was not manipulated at all. The main reason to do controlled studies is precisely because, due to the manipulation and control over the variables of interest, you can say that the change in one variable causes the change in the outcome (under extremely specific experimental conditions and with a boat-load of other assumptions like random assignment and that the experimenter didn't screw up something in the execution details, which I casually gloss over here).



    But that's not what the exam design describes - it simply has two groups of participants, with one known fact that differentiates them (their age); but you have no way of knowing any of the other ways the groups differ. Due to the lack of control, you cannot know whether it was the difference in age that caused the change in outcome, or whether it is because the 40-year-olds joined the study because they needed the money while the 20-year-olds were students participating for class credit and so had different motivations - or any one of a thousand other possible natural differences between your groups.



    Now, the technical terminology for these sorts of things varies by field. Common terms for things like participant age and gender are "participant attribute", "extraneous variable", "attribute independent variable", etc. Ultimately you end up with something that is not a "true experiment" or a "true controlled experiment", because the thing you want to make a claim about - like age - wasn't really in your control to change, so the most you can hope for without far more advanced methods (like causal inference, additional conditions, longitudinal data, etc.) is to claim there is a correlation.



    This also happens to be one of the reasons why experiments in social science, and understanding hard-to-control attributes of people, is so tricky in practice - people differ in lots of ways, and when you can't change the things you want to learn about, you tend to need more complex experimental and inferential techniques or a different strategy entirely.



    How could you change the design to make a causal claim?



    Imagine a hypothetical scenario like this: Group A and B are both made up of participants who are 20 years old.



    You have Group A play the dictator game as usual.



    For Group B, you take out a Magical Aging Ray of Science (or perhaps have a Ghost treat them with a horrifying visage), which you have carefully tuned to age all the participants in Group B so that they are now 40 years old, but otherwise leave them unchanged, and then have them play the dictator game just as Group A did.



    For extra rigor you could get a Group C of naturally-aged 40-year-olds to confirm the synthetic aging is comparable to natural aging, but let's keep things simple and say we know that artificial aging is just like the real thing based on "prior work".



    Now, if Group B keeps more money than Group A, you can claim that the experiment indicates that aging causes people to keep more of the money. Of course there are still approximately a thousand reasons why your claim could turn out to be wrong, but your experiment at least has a valid causal interpretation.
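The difference between observing age and assigning it can be sketched with a simulation. Everything here is hypothetical: `simulate`, the hidden `need` variable, and the rule that the "old" group in the observational sample consists of people who need the money are assumptions made up for illustration. By construction, age has no effect at all on the amount kept.

```python
import random


def simulate(n=10_000, seed=0):
    """Return (observational, randomized) differences in mean money kept, old minus young."""
    rng = random.Random(seed)
    observed = {True: [], False: []}   # keyed by "ended up in the old group"
    assigned = {True: [], False: []}
    for _ in range(n):
        need = rng.random()            # hidden motivation between 0 and 1
        kept = 50 + 30 * need          # outcome depends on need only, never on age
        # Observational sample: people who need the money are the ones who show up old.
        observed[need > 0.5].append(kept)
        # Designed experiment: a coin flip decides who gets the hypothetical aging ray.
        assigned[rng.random() < 0.5].append(kept)

    def mean(values):
        return sum(values) / len(values)

    return (mean(observed[True]) - mean(observed[False]),
            mean(assigned[True]) - mean(assigned[False]))


obs_diff, rct_diff = simulate()
print(f"observational difference: {obs_diff:.2f}")
print(f"randomized difference:    {rct_diff:.2f}")
```

In this setup the observational comparison should show a gap of roughly 15 USD that is entirely due to the hidden variable, while the randomized comparison should sit close to zero, which matches the true (null) effect built into the simulation.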



































      up vote
      1
      down vote













      Generally you can't jump from correlation to causation. For example, there's a well-known social science phenomenon about social status/class, and propensity to spend/save. For many many years it was believed that this showed causation. Last year more intensive research showed it wasn't.



      Classic "correlation isn't causation" - in this case, the confounding factor was that growing up in poverty teaches people to use money differently, and spend if there is a surplus, because it may not be there tomorrow even if saved for various reasons.



      In your example, suppose the older people all lived through a war, which the younger people didn't. The link might be that people who grew up in social chaos, with real risk of harm and loss of life, learn to prioritise saving resources for themselves and against need, more than those who grow up in happier circumstances where the state, employers, or health insurers will take care of it, and survival isn't an issue that shaped their outlook. Then you would get the same apparent link - older people (including those closer to their generation) keep more, but it would only apparently be linked to age. In reality the causative element is the social situation one spent formative years in, and what habits that taught - not age per se.
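A single experiment run at one point in time cannot separate "age" from "generation", because birth year is just the survey year minus age. The sketch below uses invented numbers (`SURVEY_YEAR`, the ages and the amounts kept are all illustrative assumptions) to show that the two explanations produce correlations of identical strength and opposite sign.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


SURVEY_YEAR = 2018                      # hypothetical year of the experiment
ages = [20, 30, 40, 50, 60, 70]         # made-up participants
kept = [50, 55, 65, 70, 80, 85]         # made-up amounts kept, in USD
birth_years = [SURVEY_YEAR - age for age in ages]

r_age = pearson(ages, kept)
r_cohort = pearson(birth_years, kept)
print(r_age, r_cohort)                  # same magnitude, opposite sign
```

Telling the two apart would need data in which age and birth year are not locked together, for example the same game repeated in different years.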



































        up vote
        0
        down vote













        Causality and correlation are different categories of things. That is why correlation alone is not sufficient to infer causality.



        For example, causality is directional, while correlation is not. When inferring causality, you need to establish what is cause and what is effect.



        There are other things that might interfere with your inference: hidden or third variables, and all the usual questions of statistics (sample selection, sample size, etc.).



        But assuming that your statistics are properly done, correlation can provide clues about causality. Typically, if you find a correlation, it means that there is some kind of causality somewhere and you should start looking for it.



        You can absolutely start with a hypothesis derived from your correlation. But a hypothesis is not a causality, it is merely a possibility of a causality. You then need to test it. If your hypothesis resists sufficient falsification attempts, you may be on to something.



        For example, in your age-causes-greed hypothesis, one alternative hypothesis would be that it is not age, but length of being a dictator. So you would look for old, but recently-empowered dictators as a control group, and young-but-dictator-since-childhood as a second one and check the results there.
























































          9 Answers
          up vote
          10
          down vote









          In general you should not assume that correlation implies causality - even in cases where it seems that is the only possible reason.



          Consider that there are other things that correlate with age - generational aspects of culture for example. Perhaps these three groups will remain the same even as they all age, but the next generation will buck the trend?



          All that being said, you are probably right that older people are more likely to keep a larger amount, but just be aware there are other possibilities.






          answered 2 days ago









          MikeP

          • In addition to the other answers, the current experiment cannot distinguish between a model in which the money kept is a function of age and one in which it is a function of year of birth. Note that the second model may be non-linear across history, and that 20-year-olds taken from different historical periods may decide to keep very different amounts of cash.
            – NofP
            7 hours ago
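
The age versus year-of-birth point can be illustrated with a small simulation. This is only a sketch under stated assumptions: the study year, the sample size, and the cohort model (money kept depends on year of birth alone) are all hypothetical and not taken from the exam question. Because every participant is tested in the same year, age is just the study year minus the birth year, so the two correlations are mirror images and the data cannot favour one explanation over the other.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


STUDY_YEAR = 2018  # assumption: all participants are tested in the same year
rng = random.Random(0)

birth_years = [rng.randint(1940, 2000) for _ in range(2000)]
ages = [STUDY_YEAR - b for b in birth_years]

# Assumed cohort model: the amount kept depends only on year of birth.
kept = [
    min(100.0, max(0.0, 50.0 - 0.5 * (b - 1970) + rng.gauss(0, 10)))
    for b in birth_years
]

r_age = pearson(ages, kept)
r_birth = pearson(birth_years, kept)
print(f"corr(age, kept)        = {r_age:+.3f}")
print(f"corr(birth year, kept) = {r_birth:+.3f}")
```

Separating the two explanations would need data collected in more than one year, so that people of the same age come from different birth cohorts.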
















          up vote
          7
          down vote









          I can postulate several causalities from your data.

          1. The age is measured and then the amount of money kept. Older participants prefer to keep more money (maybe they are smarter or less idealistic, but that's not the point).

          2. The amount of money kept is measured and then the age. People who keep more money spend more time counting it and are therefore older when the age is measured.

          3. Sick people keep more money because they need money for (possibly life-saving) medication or treatment. The actual correlation is between sickness and money kept, but this variable is "hidden" and we therefore jump to the wrong conclusion, because age and likelihood of sickness correlate in the demographic group of persons chosen for the experiment.

          (Omitting 143 theories; I need to keep this reasonably short)

          147. The experimenter spoke in an old, obscure dialect which the young people did not understand, and they therefore mistakenly chose the wrong option.

          Conclusion: you are correct, but your classmate might claim to be 147 times correcter.

          Another famous correlation is between low IQ and hours of TV watched daily. Does watching TV make one dumb, or do dumb people watch more TV? It could even be both.
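
The hidden-variable scenario in point 3 can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the sample size, the share of sick participants, and the amounts are invented. The simulation only encodes that sick participants happen to be older in this sample and that the amount kept depends on sickness alone. Age is never used when the amount is generated, yet it correlates with the amount kept overall, and the correlation roughly disappears once sick and healthy participants are looked at separately.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(1)
ages, kept, sick = [], [], []
for _ in range(5000):
    is_sick = rng.random() < 0.5
    # Assumption: in this sample, sick participants are older on average.
    age = min(90.0, max(18.0, rng.gauss(60 if is_sick else 40, 10)))
    # The amount kept depends on sickness only; age is not used here.
    amount = min(100.0, max(0.0, (70 if is_sick else 40) + rng.gauss(0, 10)))
    ages.append(age)
    kept.append(amount)
    sick.append(is_sick)


def within(flag):
    """Correlation of age and amount kept inside one sickness group."""
    xs = [a for a, s in zip(ages, sick) if s == flag]
    ys = [k for k, s in zip(kept, sick) if s == flag]
    return pearson(xs, ys)


r_all = pearson(ages, kept)
r_sick = within(True)
r_healthy = within(False)
print(f"all participants: {r_all:+.3f}")
print(f"sick only:        {r_sick:+.3f}")
print(f"healthy only:     {r_healthy:+.3f}")
```

In the real experiment the hidden variable is, by definition, not recorded, so this kind of stratified check would not be available.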






          edited yesterday









          Nick Cox

          answered 2 days ago









          Klaws

          • The young could be under-valuing their worth, suggesting that they are poor at leadership. If they don't understand value, how can they decide strategically, or even just rationally, about it?
            – EngrStudent
            2 days ago







          • 4




            It's not clear what you're getting at with the "classmate might claim to be 147 times correcter". The classmate is wrong - this data does not imply the conclusion that age causes a lack of sharing.
            – Nuclear Wang
            2 days ago










          • @NuclearWang I think the point is that when you have 150 equally probable hypotheses, none of them is probable. It's not meant to be strict so much as an attempt at illustration.
            – aaaaaa
            yesterday







          • 1




            Another theory: survivorship bias.
            – R..
            yesterday
















              up vote
              5
              down vote









              Correlation is a mathematical concept; causality is a philosophical idea.

              On the other hand, spurious correlation is a mostly technical concept (you won't find it in measure-theoretic probability textbooks) that can be defined in a way that's mostly actionable.

              This idea is closely related to the idea of falsificationism in science -- where the goal is never to prove things, only to disprove them.

              Statistics is to mathematics as medicine is to biology. You're asked to make your best judgement with the support of a wealth of technical knowledge, but this knowledge is never enough to cover the whole world. So if you're going to make judgements as a statistician and present them to others, you need to ensure that certain standards of quality are met; i.e. that you're giving sound advice, giving them their money's worth. This also means taking the asymmetry of risks into consideration -- in medical testing, the cost of giving a false negative result (which may prevent people from getting early treatment) may be higher than the cost of giving a false positive (which causes distress).

              In practice these standards will vary from field to field -- sometimes it's triple-blind RCTs, sometimes it's instrumental variables and other techniques to control for reverse causation and hidden common causes, sometimes it's Granger causality -- that something in the past consistently correlates with something else in the present, but not in the reverse direction. It might even be rigorous regularization and cross-validation.






              answered 2 days ago









              user8948

                  up vote
                  4
                  down vote













                  Inferring causation from correlation is problematic in general because there may be a number of other reasons for the correlation: spurious correlation due to confounders, selection bias (e.g., only choosing participants with an income below a certain threshold), or a causal effect that simply goes in the other direction (e.g., a thermometer reading is correlated with temperature but certainly does not cause it). In each of these cases, your classmate's procedure might find a causal effect where there is none.



                  However, if the participants were randomly selected, we could rule out confounders and selection bias. In that case, either age must cause money kept or money kept must cause age. The latter would imply that forcing someone to keep a certain amount of money would somehow change their age. So we can safely assume that age causes money kept.



                  Note that the causal effect could be "direct" or "indirect". People of different ages will have received a different education, have a different amount of wealth, etc., and for these reasons might choose to keep a different amount of the $100. Causal effects via these mediators are still causal effects but are indirect.
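
The selection-bias case mentioned in the first paragraph can be sketched as a simulation. All names and numbers are hypothetical: a "thrift" trait that drives the amount kept, an income score that rises with both age and thrift, and a recruitment cutoff on that score. In the full population age and amount kept are unrelated by construction, but among the recruited participants a correlation appears purely because of who was let into the study. Its sign depends on these invented assumptions, so the sketch shows only that selection alone can create a correlation, not which direction it would take in the actual experiment.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(2)
population = []
for _ in range(20000):
    age = rng.uniform(20, 80)
    thrift = rng.gauss(0, 1)  # hypothetical trait, independent of age
    # The amount kept depends on thrift only, not on age.
    amount = min(100.0, max(0.0, 50 + 15 * thrift + rng.gauss(0, 5)))
    # Assumed income score: rises with both age and thrift.
    income = (age - 50) / 17.3 + thrift + rng.gauss(0, 0.5)
    population.append((age, amount, income))

INCOME_THRESHOLD = 0.0  # hypothetical recruitment cutoff
selected = [p for p in population if p[2] < INCOME_THRESHOLD]

r_population = pearson([p[0] for p in population], [p[1] for p in population])
r_selected = pearson([p[0] for p in selected], [p[1] for p in selected])
print(f"whole population:     {r_population:+.3f}")
print(f"low-income recruits:  {r_selected:+.3f}")
```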






                  • 3




                    In the second paragraph you say that it must be causation. Note that it could still be noise from the random selection (other older participants might have spent the money [what would they keep it for?] and other young participants might have kept it [I want to retire/buy a house]).
                    – Llopis
                    2 days ago










                  • Here I started from the assumption that correlation is established and estimation error due to limited data can be ignored.
                    – Lucas
                    2 days ago







                  • 1




                    Is random selection enough? In simple experimental designs, we want random assignment of the "treatment" --- here, age --- for valid judgments regarding causal effects. (Of course, we can't assign people different ages, so this simple experimental design may not be possible to apply.)
                    – locobro
                    2 days ago










                  • That's a good question. If we were randomly sampling from a biased pool, random selection wouldn't get rid of this bias. I think the assumption here is that for the same reason that age can't be assigned, there can be no confounding of age (no arrows going into age in the causal diagram). Therefore, observation is as good as assignment (i.e., $p(y \mid \text{do}(\text{age})) = p(y \mid \text{age})$ in the language of do-calculus) when there is no selection bias.
                    – Lucas
                    2 days ago







                  • 4




                    You've excluded a possibility. A correlation between A and B can be explained as follows: A might cause B, or B might cause A, or another previously unknown factor C might cause both A and B.
                    – Tim Randall
                    yesterday














                  up vote
                  4
                  down vote













                  Inferring causation from correlation in general is problematic because there may be a number of other reasons for the correlation. For example, spurious correlations due to confounders, selection bias (e.g., only choosing participants with an income below a certain threshold), or the causal effect may simply go the other direction (e.g., a thermometer is correlated with temperature but certainly does not cause it). In each of these cases, your classmate's procedure might find a causal effect where there is none.



                  However, if the participants were randomly selected, we could rule out confounders and selection bias. In that case, either age must cause money kept or money kept must cause age. The latter would imply that forcing someone to keep a certain amount of money would somehow change their age. So we can safely assume that age causes money kept.



                  Note that the causal effect could be "direct" or "indirect". People of different age will have received a different education, have a different amount of wealth, etc., and for these reasons might choose to keep a different amount of the $100. Causal effects via these mediators are still causal effects but are indirect.






                  share|cite|improve this answer


















                  • 3




                    In the second paragraph you mention that it must be a causation. Note that it still could be noise from the random selection (other elder participants spend money [why do they keep them for?] and other young participants kept money [I want to retire/buy a house]).
                    – Llopis
                    2 days ago










                  • Here I started from the assumption that correlation is established and estimation error due to limited data can be ignored.
                    – Lucas
                    2 days ago







                  • 1




                    Is random selection enough? In simple experimental designs, we want random assignment of the "treatment" --- here, age --- for valid judgments regarding causal effects. (Of course, we can't assign people different ages, so this simple experimental design may not be possible to apply.)
                    – locobro
                    2 days ago










                  • That's a good question. If we were randomly sampling from a biased pool, random selection wouldn't get rid of this bias. I think the assumption here is that for the same reason that age can't be assigned, there can be no confounding of age (no arrows going into age in the causal diagram). Therefore, observation is as good as assignment (i.e., $p(y \mid \text{do}(age)) = p(y \mid age)$ in the language of do-calculus) when there is no selection bias.
                    – Lucas
                    2 days ago







                  • 4




                    You've excluded a possibility. A correlation between A and B can be explained as follows: A might cause B, or B might cause A, or another previously unknown factor C might cause both A and B.
                    – Tim Randall
                    yesterday












                  edited 2 days ago

























                  answered 2 days ago









                  Lucas


















                  up vote
                  3
                  down vote













                  The relationship between correlation and causation has stumped philosophers and statisticians alike for centuries. Finally, over the last twenty years or so computer scientists claim to have sorted it all out. This does not seem to be widely known. Fortunately Judea Pearl, a prime mover in this field, has recently published a book explaining this work for a popular audience: The Book of Why.



                  https://www.amazon.com/Book-Why-Science-Cause-Effect/dp/046509760X



                  https://bigthink.com/errors-we-live-by/judea-pearls-the-book-of-why-brings-news-of-a-new-science-of-causes



                  Spoiler alert: You can infer causation from correlation in some circumstances if you know what you are doing. You need to make some causal assumptions to start with (a causal model, ideally based on science), and you need the tools to do counterfactual reasoning (the do-calculus). Sorry I can't distill this down to a few lines (I'm still reading the book myself), but I think the answer to your question is in there.
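
As a hedged illustration of what such tools look like (this formula is not quoted from the answer above), one standard result in Pearl's framework is the back-door adjustment. Assume, hypothetically, that a set of observed variables $Z$ blocks every back-door path from $X$ to $Y$ in the assumed causal diagram and contains no descendant of $X$. Then the effect of intervening on $X$ can be computed from purely observational quantities:

$$P(y \mid \text{do}(x)) = \sum_{z} P(y \mid x, z)\, P(z)$$

If the causal model has no arrows going into $X$ at all, the adjustment set is empty and the formula reduces to $P(y \mid \text{do}(x)) = P(y \mid x)$. The causal conclusion is only as good as the assumed diagram, which the data alone cannot confirm.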
























                  • 4




                    Pearl & his work are quite prominent. It would be an uncommon statistician who's never heard of this. Note that whether he has truly "sorted it all out" is very much open for debate. There is no question that his methods work on paper (when you can guarantee the assumptions are met), but how well it works in real situations is much hazier.
                    – gung♦
                    2 days ago






                  • 3




                    I want to give a (+1) and (-1) at the same time, so no vote from me. The (+1) is for mentioning Judea Pearl and his work; his work has definitely helped establish the field of causal statistics. The (-1) is for saying it has stumped philosophers and statisticians for centuries but now Pearl has solved it. I believe that Pearl's approach is the best way to think about things, but at the same time, if you use this approach (which you should), your answer is "if my untestable assumptions are correct, I've shown a causal relation. Let's cross our fingers about those assumptions".
                    – Cliff AB
                    yesterday






                  • 1




                    btw, my last sentence isn't knocking Pearl's approach. Rather, it's recognizing that causal inference is still very hard and you need to be honest about the limitations of your analysis.
                    – Cliff AB
                    yesterday






                  • 1




                    Pearl promotes a kind of neo-Bayesianism (following in the footsteps of the great E.T. Jaynes) which is worth knowing. But your own answer says: << You need to make some causal assumptions to start with (a causal model, ideally based on science).>> -- there you go. Jaynes was a prominent critic of mainstream statistics, which shies away from giving explicit priors and instead contrives "objective" systems where causality is lost. Pearl goes further and gives us tools to propagate causality assumptions from priors to posteriors -- which is not causality ex nihilo.
                    – user8948
                    yesterday







                  • 1




                    I'm also avoiding a +1 for the overly poetic part at the beginning. I mean, a lot of things have "stumped [intellectuals of some sort] for ages", but such observations tend to be the result of biased sampling and play into this false narrative of human knowledge as though it's some sort of block chain that everyone reads and writes to. But, it seems baseless to assert that no one across the ages has understood a concept merely because it was misunderstood by others. Sorry to rant, just, the initial dramatic language seems to detract from the rest.
                    – Nat
                    9 hours ago















                  edited 7 hours ago









                  Community♦











                  answered 2 days ago









                  gareth








                  • 4




                    Pearl & his work are quite prominent. It would be an uncommon statistician who's never heard of this. Note that whether he has truly "sorted it all out" is very much open for debate. There is no question that his methods work on paper (when you can guarantee the assumptions are met), but how well is works in real situations is much hazier.
                    – gung♦
                    2 days ago






                  • 3




                    I want to give a (+1) and (-1) at the same time, so no vote from me. The (+1) is for mentioning Judea Pearl and his work; his work is definitely helped establish the field of causal statistics. The (-1) for saying it has stumped philosophers and statisticians for centuries but now Pearl solved it. I believe that Pearl approach is the best way to think about things, but at the same time, if you use this approach (which you should), your answer is "if my untestable assumptions are correct, I've shown a causal relation. Let's cross our fingers about those assumptions".
                    – Cliff AB
                    yesterday






                  • 1




                    btw, my last sentence isn't knocking Pearl's approach. Rather, it's recognizing that causal inference is still very hard and you need to be honest about the limitations of your analysis.
                    – Cliff AB
                    yesterday






                  • 1




                    Pearl promotes a kind of neo-Bayesianism (following in the footsteps of the great E.T. Jaynes) which is worth knowing. But your own answer says: << You need to make some causal assumptions to start with (a causal model, ideally based on science).>> -- there you go. Jaynes was a prominent critic of mainstream statistics, which shies away from giving explicit priors and instead contrives "objective" systems where causality is lost. Pearl goes further and gives us tools to propagate causality assumptions from priors to posteriors -- which is not causality ex nihilo.
                    – user8948
                    yesterday







                  • 1




                    I'm also avoiding a +1 for the overly poetic part at the beginning. I mean, a lot of things have "stumped [intellectuals of some sort] for ages", but such observations tend to be the result of biased sampling and play into this false narrative of human knowledge as though it's some sort of block chain that everyone reads and writes to. But, it seems baseless to assert that no one across the ages has understood a concept merely because it was misunderstood by others. Sorry to rant, just, the initial dramatic language seems to detract from the rest.
                    – Nat
                    9 hours ago













                  • 4




                    Pearl & his work are quite prominent. It would be an uncommon statistician who's never heard of this. Note that whether he has truly "sorted it all out" is very much open for debate. There is no question that his methods work on paper (when you can guarantee the assumptions are met), but how well is works in real situations is much hazier.
                    – gung♦
                    2 days ago






                  • 3




                    I want to give a (+1) and (-1) at the same time, so no vote from me. The (+1) is for mentioning Judea Pearl and his work; his work is definitely helped establish the field of causal statistics. The (-1) for saying it has stumped philosophers and statisticians for centuries but now Pearl solved it. I believe that Pearl approach is the best way to think about things, but at the same time, if you use this approach (which you should), your answer is "if my untestable assumptions are correct, I've shown a causal relation. Let's cross our fingers about those assumptions".
                    – Cliff AB
                    yesterday






                  • 1




                    btw, my last sentence isn't knocking Pearl's approach. Rather, it's recognizing that causal inference is still very hard and you need to be honest about the limitations of your analysis.
                    – Cliff AB
                    yesterday






                  • 1




                    Pearl promotes a kind of neo-Bayesianism (following in the footsteps of the great E.T. Jaynes) which is worth knowing. But your own answer says: << You need to make some causal assumptions to start with (a causal model, ideally based on science).>> -- there you go. Jaynes was a prominent critic of mainstream statistics, which shies away from giving explicit priors and instead contrives "objective" systems where causality is lost. Pearl goes further and gives us tools to propagate causality assumptions from priors to posteriors -- which is not causality ex nihilo.
                    – user8948
                    yesterday







                  • 1




                    I'm also avoiding a +1 for the overly poetic part at the beginning. I mean, a lot of things have "stumped [intellectuals of some sort] for ages", but such observations tend to be the result of biased sampling and play into this false narrative of human knowledge as though it's some sort of block chain that everyone reads and writes to. But, it seems baseless to assert that no one across the ages has understood a concept merely because it was misunderstood by others. Sorry to rant, just, the initial dramatic language seems to detract from the rest.
                    – Nat
                    9 hours ago








                  4




                  4




                  Pearl & his work are quite prominent. It would be an uncommon statistician who's never heard of this. Note that whether he has truly "sorted it all out" is very much open for debate. There is no question that his methods work on paper (when you can guarantee the assumptions are met), but how well is works in real situations is much hazier.
                  – gung♦
                  2 days ago




                  Pearl & his work are quite prominent. It would be an uncommon statistician who's never heard of this. Note that whether he has truly "sorted it all out" is very much open for debate. There is no question that his methods work on paper (when you can guarantee the assumptions are met), but how well is works in real situations is much hazier.
                  – gung♦
                  2 days ago




                  3




                  3




                  I want to give a (+1) and (-1) at the same time, so no vote from me. The (+1) is for mentioning Judea Pearl and his work; his work is definitely helped establish the field of causal statistics. The (-1) for saying it has stumped philosophers and statisticians for centuries but now Pearl solved it. I believe that Pearl approach is the best way to think about things, but at the same time, if you use this approach (which you should), your answer is "if my untestable assumptions are correct, I've shown a causal relation. Let's cross our fingers about those assumptions".
                  – Cliff AB
                  yesterday




                  I want to give a (+1) and (-1) at the same time, so no vote from me. The (+1) is for mentioning Judea Pearl and his work; his work is definitely helped establish the field of causal statistics. The (-1) for saying it has stumped philosophers and statisticians for centuries but now Pearl solved it. I believe that Pearl approach is the best way to think about things, but at the same time, if you use this approach (which you should), your answer is "if my untestable assumptions are correct, I've shown a causal relation. Let's cross our fingers about those assumptions".
                  – Cliff AB
                  yesterday




                  1




                  1




                  btw, my last sentence isn't knocking Pearl's approach. Rather, it's recognizing that causal inference is still very hard and you need to be honest about the limitations of your analysis.
                  – Cliff AB
                  yesterday




                  btw, my last sentence isn't knocking Pearl's approach. Rather, it's recognizing that causal inference is still very hard and you need to be honest about the limitations of your analysis.
                  – Cliff AB
                  yesterday




                  1




                  1




                  Pearl promotes a kind of neo-Bayesianism (following in the footsteps of the great E.T. Jaynes) which is worth knowing. But your own answer says: << You need to make some causal assumptions to start with (a causal model, ideally based on science).>> -- there you go. Jaynes was a prominent critic of mainstream statistics, which shies away from giving explicit priors and instead contrives "objective" systems where causality is lost. Pearl goes further and gives us tools to propagate causality assumptions from priors to posteriors -- which is not causality ex nihilo.
                  – user8948
                  yesterday














                  I'm also avoiding a +1 for the overly poetic part at the beginning. I mean, a lot of things have "stumped [intellectuals of some sort] for ages", but such observations tend to be the result of biased sampling and play into this false narrative of human knowledge as though it's some sort of block chain that everyone reads and writes to. But, it seems baseless to assert that no one across the ages has understood a concept merely because it was misunderstood by others. Sorry to rant, just, the initial dramatic language seems to detract from the rest.
                  – Nat
                  9 hours ago
















                  up vote
                  2
                  down vote













                  No. There is a one-way logical relationship between causality and correlation.



                  Consider correlation a property you calculate on some data, e.g. the most common (linear) correlation as defined by Pearson. For this particular definition of correlation you can create random data points that will have a correlation of zero or of one without having any kind of causality between them, just by having certain (a)symmetries.
                  For any definition of correlation you can construct data that show both behaviours: a high correlation with no underlying relation between the variables, and a low correlation even though one variable is a fixed function of the other.



                  Yes, the "unrelated, but highly correlated" case is weaker than the "no correlation despite being related" case. But the only indication (!) a correlation gives you is that you have to look harder for an explanation for it.
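                  A minimal sketch of both behaviours, using made-up data rather than anything from the exam question. The `pearson` helper and the two toy data sets are assumptions chosen for illustration: the first has y as an exact function of x with zero Pearson correlation, the second has two series that share nothing but an upward time trend.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Case 1: y is a fixed function of x, yet the correlation vanishes,
# because the x values are symmetric about zero.
xs = list(range(-50, 51))
ys = [x ** 2 for x in xs]
r_dependent = pearson(xs, ys)

# Case 2: two series that never influence each other, yet correlate
# strongly, because both simply drift upward with time.
random.seed(0)  # fixed seed so the run is repeatable
steps = range(200)
series_a = [t + random.gauss(0, 5) for t in steps]
series_b = [3 * t + random.gauss(0, 5) for t in steps]
r_unrelated = pearson(series_a, series_b)

print(f"y = x^2 on a symmetric grid: r = {r_dependent:.3f}")
print(f"two series sharing only a time trend: r = {r_unrelated:.3f}")
```

                  The first coefficient is essentially zero and the second is close to one, so neither number on its own says anything about whether one variable drives the other.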














                  edited yesterday by Nick Cox
                  answered 2 days ago by cherub











                  • A higher bar than "no correlation" is statistical independence, which implies e.g. P(A|B) = P(A). Indeed, Pearson correlation zero does not imply statistical independence, but e.g. zero distance correlation does.
                    – user8948
                    yesterday


























                  up vote
                  1
                  down vote













                  A causal claim for age would be inappropriate in this case



                  The problem with claiming causality in your exam question design can be boiled down to one simple fact: aging was not a treatment; age was not manipulated at all. The main reason to do controlled studies is precisely that, because you manipulate and control the variables of interest, you can say that the change in one variable causes the change in the outcome (under extremely specific experimental conditions and with a boatload of other assumptions, like random assignment and the experimenter not screwing up something in the execution details, which I casually gloss over here).



                  But that's not what the exam design describes - it simply has two groups of participants, with one known fact that distinguishes them (their age); you have no way of knowing any of the other ways the groups differ. Due to the lack of control, you cannot know whether it was the difference in age that caused the change in outcome, or whether, say, the 40-year-olds joined the study because they needed the money while the 20-year-olds were students participating for class credit and so had different motivations - or any one of a thousand other possible natural differences between your groups.



                  Now, the technical terminology for these sorts of things varies by field. Common terms for things like participant age and gender are "participant attribute", "extraneous variable", "attribute independent variable", etc. Ultimately you end up with something that is not a "true experiment" or a "true controlled experiment", because the thing you want to make a claim about - like age - wasn't really in your control to change, so the most you can hope for without far more advanced methods (like causal inference, additional conditions, longitudinal data, etc.) is to claim there is a correlation.



                  This also happens to be one of the reasons why experiments in social science, and understanding hard-to-control attributes of people, are so tricky in practice - people differ in lots of ways, and when you can't change the things you want to learn about, you tend to need more complex experimental and inferential techniques or a different strategy entirely.



                  How could you change the design to make a causal claim?



                  Imagine a hypothetical scenario like this: Group A and B are both made up of participants who are 20 years old.



                  You have Group A play the dictator game as usual.



                  For Group B, you take out a Magical Aging Ray of Science (or perhaps have a Ghost treat them to its horrifying visage), which you have carefully tuned to age all the participants in Group B so that they are now 40 years old, but otherwise leave them unchanged, and then have them play the dictator game just as Group A did.



                  For extra rigor you could get a Group C of naturally aged 40-year-olds to confirm the synthetic aging is comparable to natural aging, but let's keep things simple and say we know that artificial aging is just like the real thing based on "prior work".



                  Now, if Group B keeps more money than Group A, you can claim that the experiment indicates that aging causes people to keep more of the money. Of course there are still approximately a thousand reasons why your claim could turn out to be wrong, but your experiment at least has a valid causal interpretation.
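                  To make the contrast concrete, here is a small simulation sketch. Everything in it is hypothetical: the `keep_amount` outcome model, the share of participants who need the money, and the dollar figures are invented for illustration, and the true effect of aging is deliberately set to zero. The observational comparison still shows a large gap between old and young, while the randomized "aging ray" comparison does not.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable


def keep_amount(needs_money, aged, age_effect=0.0):
    """Hypothetical outcome model: dollars kept out of 100.

    Needing the money raises the amount kept; aging itself adds
    `age_effect`, which defaults to zero (no causal effect of age).
    """
    base = 50 + (25 if needs_money else 0) + (age_effect if aged else 0)
    return min(100.0, max(0.0, base + random.gauss(0, 10)))


def mean(values):
    return sum(values) / len(values)


N = 20000

# Observational study: age is merely recorded, and older volunteers
# are (by assumption) more likely to take part because they need the money.
obs_old, obs_young = [], []
for _ in range(N):
    old = random.random() < 0.5
    needs_money = random.random() < (0.8 if old else 0.2)
    (obs_old if old else obs_young).append(keep_amount(needs_money, aged=old))
obs_gap = mean(obs_old) - mean(obs_young)

# Randomized experiment: everyone is recruited from the young population,
# and a coin flip decides who gets "aged", so need for money is balanced.
rct_aged, rct_control = [], []
for _ in range(N):
    needs_money = random.random() < 0.2
    aged = random.random() < 0.5
    (rct_aged if aged else rct_control).append(keep_amount(needs_money, aged=aged))
rct_gap = mean(rct_aged) - mean(rct_control)

print(f"observational gap (old - young): {obs_gap:.2f} USD")
print(f"randomized gap (aged - control): {rct_gap:.2f} USD")
```

                  Passing a non-zero `age_effect` to `keep_amount` would make the randomized gap track that value, whereas the observational gap would mix it with the motivation difference.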














                      edited 2 days ago
                      answered 2 days ago by BrianH




















                          up vote
                          1
                          down vote













                          Generally you can't jump from correlation to causation. For example, there's a well-known social science finding linking social status/class to the propensity to spend or save. For many, many years it was believed that this showed causation. Last year, more intensive research showed it didn't.



                          Classic "correlation isn't causation" - in this case, the confounding factor was that growing up in poverty teaches people to use money differently, and spend if there is a surplus, because it may not be there tomorrow even if saved for various reasons.



                          In your example, suppose the older people all lived through a war, which the younger people didn't. The link might be that people who grew up in social chaos, with real risk of harm and loss of life, learn to prioritise saving resources for themselves and against need, more than those who grow up in happier circumstances where the state, employers, or health insurers will take care of it, and survival isn't an issue that shaped their outlook. Then you would get the same apparent link - older people (including those closer to their generation) keep more, but it would only apparently be linked to age. In reality the causative element is the social situation one spent formative years in, and what habits that taught - not age per se.
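                          A rough simulation of that scenario, with every number invented for illustration (the age range, the cutoff of 60 for having lived through the war, and the amounts kept are all assumptions). The amount kept depends only on the formative experience, never on age itself, yet age and amount kept correlate strongly overall and the correlation largely disappears within each cohort.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


random.seed(2)  # fixed seed so the run is repeatable

ages, kept, war_cohort = [], [], []
for _ in range(5000):
    age = random.randint(20, 80)
    lived_through_war = age >= 60  # hypothetical cohort cutoff
    # Dollars kept depend only on the formative experience, not on age.
    # (Amounts are left unclipped to keep the sketch short.)
    amount = (70 if lived_through_war else 40) + random.gauss(0, 10)
    ages.append(age)
    kept.append(amount)
    war_cohort.append(lived_through_war)


def correlation_within(cohort_flag):
    """Age/amount correlation restricted to one cohort."""
    rows = [(a, k) for a, k, w in zip(ages, kept, war_cohort) if w == cohort_flag]
    return pearson([a for a, _ in rows], [k for _, k in rows])


r_all = pearson(ages, kept)
r_war = correlation_within(True)
r_no_war = correlation_within(False)

print(f"all participants:   r = {r_all:.2f}")
print(f"war cohort only:    r = {r_war:.2f}")
print(f"no-war cohort only: r = {r_no_war:.2f}")
```

                          In a single cross-section like this, age and cohort are locked together, which is exactly why the raw correlation cannot tell you which of the two is doing the work.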














                              edited yesterday by Nick Cox
                              answered yesterday by Stilez




















                                  up vote
                                  0
                                  down vote













                                  Causality and correlation are different categories of things. That is why correlation alone is not sufficient to infer causality.



                                  For example, causality is directional, while correlation is not. When inferring causality, you need to establish what is cause and what is effect.



                                  There are other things that might interfere with your inference: hidden or third variables, and all the usual questions of statistics (sample selection, sample size, etc.).



                                  But assuming that your statistics are properly done, correlation can provide clues about causality. Typically, if you find a correlation, it means that there is some kind of causality somewhere and you should start looking for it.



                                  You can absolutely start with a hypothesis derived from your correlation. But a hypothesis is not a causality; it is merely a possibility of a causality. You then need to test it. If your hypothesis resists sufficient falsification attempts, you may be on to something.



                                  For example, in your age-causes-greed hypothesis, one alternative hypothesis would be that it is not age, but length of being a dictator. So you would look for old, but recently-empowered dictators as a control group, and young-but-dictator-since-childhood as a second one and check the results there.






                                  share|cite|improve this answer


































                                      answered 5 hours ago









                                      Tom

                                      1012























