What does < stand for in data.table joins with on=

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP











up vote
6
down vote

favorite












Joining the data tables:



X <- data.table(A = 1:4, B = c(1,1,1,1)) 
# A B
# 1: 1 1
# 2: 2 1
# 3: 3 1
# 4: 4 1

Y <- data.table(A = 4)
# A
# 1: 4


via



X[Y, on = .(A == A)]
# A B
# 1: 4 1


returns the expected result. However, I would expect the line:



X[Y, on = .(A < A)]
# A B
# 1: 4 1
# 2: 4 1
# 3: 4 1


to return



 A B
1: 1 1
2: 2 1
3: 3 1


because the keyword on:




Indicate which columns in x should be joined with which columns in i along with the type of binary operator to join with




according to ?data.table. The way the joining is done is not explicitly mentioned, and certainly it is not as I have guessed. How exactly < joins columns in x with columns in i?










share|improve this question























  • A side note: you can simplify X[Y, on = .(A == A)] to X[Y, on = .(A)] or X[Y, on = "A"]
    – Jaap
    2 hours ago














up vote
6
down vote

favorite












Joining the data tables:



X <- data.table(A = 1:4, B = c(1,1,1,1)) 
# A B
# 1: 1 1
# 2: 2 1
# 3: 3 1
# 4: 4 1

Y <- data.table(A = 4)
# A
# 1: 4


via



X[Y, on = .(A == A)]
# A B
# 1: 4 1


returns the expected result. However, I would expect the line:



X[Y, on = .(A < A)]
# A B
# 1: 4 1
# 2: 4 1
# 3: 4 1


to return



 A B
1: 1 1
2: 2 1
3: 3 1


because the keyword on:




Indicate which columns in x should be joined with which columns in i along with the type of binary operator to join with




according to ?data.table. The way the joining is done is not explicitly mentioned, and certainly it is not as I have guessed. How exactly < joins columns in x with columns in i?










share|improve this question























  • A side note: you can simplify X[Y, on = .(A == A)] to X[Y, on = .(A)] or X[Y, on = "A"]
    – Jaap
    2 hours ago












up vote
6
down vote

favorite









up vote
6
down vote

favorite











Joining the data tables:



X <- data.table(A = 1:4, B = c(1,1,1,1)) 
# A B
# 1: 1 1
# 2: 2 1
# 3: 3 1
# 4: 4 1

Y <- data.table(A = 4)
# A
# 1: 4


via



X[Y, on = .(A == A)]
# A B
# 1: 4 1


returns the expected result. However, I would expect the line:



X[Y, on = .(A < A)]
# A B
# 1: 4 1
# 2: 4 1
# 3: 4 1


to return



 A B
1: 1 1
2: 2 1
3: 3 1


because the keyword on:




Indicate which columns in x should be joined with which columns in i along with the type of binary operator to join with




according to ?data.table. The way the joining is done is not explicitly mentioned, and certainly it is not as I have guessed. How exactly < joins columns in x with columns in i?










share|improve this question















Joining the data tables:



X <- data.table(A = 1:4, B = c(1,1,1,1)) 
# A B
# 1: 1 1
# 2: 2 1
# 3: 3 1
# 4: 4 1

Y <- data.table(A = 4)
# A
# 1: 4


via



X[Y, on = .(A == A)]
# A B
# 1: 4 1


returns the expected result. However, I would expect the line:



X[Y, on = .(A < A)]
# A B
# 1: 4 1
# 2: 4 1
# 3: 4 1


to return



 A B
1: 1 1
2: 2 1
3: 3 1


because the keyword on:




Indicate which columns in x should be joined with which columns in i along with the type of binary operator to join with




according to ?data.table. The way the joining is done is not explicitly mentioned, and certainly it is not as I have guessed. How exactly < joins columns in x with columns in i?







r data.table left-join






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited 2 hours ago









Jaap

52.8k20115123




52.8k20115123










asked 3 hours ago









FOMH

384




384











  • A side note: you can simplify X[Y, on = .(A == A)] to X[Y, on = .(A)] or X[Y, on = "A"]
    – Jaap
    2 hours ago
















  • A side note: you can simplify X[Y, on = .(A == A)] to X[Y, on = .(A)] or X[Y, on = "A"]
    – Jaap
    2 hours ago















A side note: you can simplify X[Y, on = .(A == A)] to X[Y, on = .(A)] or X[Y, on = "A"]
– Jaap
2 hours ago




A side note: you can simplify X[Y, on = .(A == A)] to X[Y, on = .(A)] or X[Y, on = "A"]
– Jaap
2 hours ago












2 Answers
2






active

oldest

votes

















up vote
3
down vote



accepted










When doing a non-equi join like X[Y, on = .(A < A)] data.table returns the A-column from Y (the i-data.table).



To get the desired result, you could do:



X[Y, on = .(A < A), .(A = x.A, B)]


which gives:




 A B
1: 1 1
2: 2 1
3: 3 1



In the next release, data.table will return both A columns. See here for the discussion.






share|improve this answer



























    up vote
    4
    down vote













    You're partially correct. The missing piece of the puzzle is that (currently) when you perform any join, including a non-equi join with <, a single column is returned for the join colum (A in your example). This columns takes the values from the data.table on the right side of the join, in this case the values in A from Y.



    Here's an illustrated example:



    Illustration of current non-equi join behaviour



    We're planning to change this behaviour in a future version of data.table so that both columns will be returned in the case of non-equi joins. See pull requests https://github.com/Rdatatable/data.table/pull/2706 and https://github.com/Rdatatable/data.table/pull/3093.






    share|improve this answer




















      Your Answer





      StackExchange.ifUsing("editor", function ()
      StackExchange.using("externalEditor", function ()
      StackExchange.using("snippets", function ()
      StackExchange.snippets.init();
      );
      );
      , "code-snippets");

      StackExchange.ready(function()
      var channelOptions =
      tags: "".split(" "),
      id: "1"
      ;
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function()
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled)
      StackExchange.using("snippets", function()
      createEditor();
      );

      else
      createEditor();

      );

      function createEditor()
      StackExchange.prepareEditor(
      heartbeatType: 'answer',
      convertImagesToLinks: true,
      noModals: false,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: 10,
      bindNavPrevention: true,
      postfix: "",
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      );



      );













       

      draft saved


      draft discarded


















      StackExchange.ready(
      function ()
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f52793037%2fwhat-does-stand-for-in-data-table-joins-with-on%23new-answer', 'question_page');

      );

      Post as a guest






























      2 Answers
      2






      active

      oldest

      votes








      2 Answers
      2






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes








      up vote
      3
      down vote



      accepted










      When doing a non-equi join like X[Y, on = .(A < A)] data.table returns the A-column from Y (the i-data.table).



      To get the desired result, you could do:



      X[Y, on = .(A < A), .(A = x.A, B)]


      which gives:




       A B
      1: 1 1
      2: 2 1
      3: 3 1



      In the next release, data.table will return both A columns. See here for the discussion.






      share|improve this answer
























        up vote
        3
        down vote



        accepted










        When doing a non-equi join like X[Y, on = .(A < A)] data.table returns the A-column from Y (the i-data.table).



        To get the desired result, you could do:



        X[Y, on = .(A < A), .(A = x.A, B)]


        which gives:




         A B
        1: 1 1
        2: 2 1
        3: 3 1



        In the next release, data.table will return both A columns. See here for the discussion.






        share|improve this answer






















          up vote
          3
          down vote



          accepted







          up vote
          3
          down vote



          accepted






          When doing a non-equi join like X[Y, on = .(A < A)] data.table returns the A-column from Y (the i-data.table).



          To get the desired result, you could do:



          X[Y, on = .(A < A), .(A = x.A, B)]


          which gives:




           A B
          1: 1 1
          2: 2 1
          3: 3 1



          In the next release, data.table will return both A columns. See here for the discussion.






          share|improve this answer












          When doing a non-equi join like X[Y, on = .(A < A)] data.table returns the A-column from Y (the i-data.table).



          To get the desired result, you could do:



          X[Y, on = .(A < A), .(A = x.A, B)]


          which gives:




           A B
          1: 1 1
          2: 2 1
          3: 3 1



          In the next release, data.table will return both A columns. See here for the discussion.







          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered 3 hours ago









          Jaap

          52.8k20115123




          52.8k20115123






















              up vote
              4
              down vote













              You're partially correct. The missing piece of the puzzle is that (currently) when you perform any join, including a non-equi join with <, a single column is returned for the join colum (A in your example). This columns takes the values from the data.table on the right side of the join, in this case the values in A from Y.



              Here's an illustrated example:



              Illustration of current non-equi join behaviour



              We're planning to change this behaviour in a future version of data.table so that both columns will be returned in the case of non-equi joins. See pull requests https://github.com/Rdatatable/data.table/pull/2706 and https://github.com/Rdatatable/data.table/pull/3093.






              share|improve this answer
























                up vote
                4
                down vote













                You're partially correct. The missing piece of the puzzle is that (currently) when you perform any join, including a non-equi join with <, a single column is returned for the join colum (A in your example). This columns takes the values from the data.table on the right side of the join, in this case the values in A from Y.



                Here's an illustrated example:



                Illustration of current non-equi join behaviour



                We're planning to change this behaviour in a future version of data.table so that both columns will be returned in the case of non-equi joins. See pull requests https://github.com/Rdatatable/data.table/pull/2706 and https://github.com/Rdatatable/data.table/pull/3093.






                share|improve this answer






















                  up vote
                  4
                  down vote










                  up vote
                  4
                  down vote









                  You're partially correct. The missing piece of the puzzle is that (currently) when you perform any join, including a non-equi join with <, a single column is returned for the join colum (A in your example). This columns takes the values from the data.table on the right side of the join, in this case the values in A from Y.



                  Here's an illustrated example:



                  Illustration of current non-equi join behaviour



                  We're planning to change this behaviour in a future version of data.table so that both columns will be returned in the case of non-equi joins. See pull requests https://github.com/Rdatatable/data.table/pull/2706 and https://github.com/Rdatatable/data.table/pull/3093.






                  share|improve this answer












                  You're partially correct. The missing piece of the puzzle is that (currently) when you perform any join, including a non-equi join with <, a single column is returned for the join colum (A in your example). This columns takes the values from the data.table on the right side of the join, in this case the values in A from Y.



                  Here's an illustrated example:



                  Illustration of current non-equi join behaviour



                  We're planning to change this behaviour in a future version of data.table so that both columns will be returned in the case of non-equi joins. See pull requests https://github.com/Rdatatable/data.table/pull/2706 and https://github.com/Rdatatable/data.table/pull/3093.







                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered 3 hours ago









                  Scott Ritchie

                  6,79321347




                  6,79321347



























                       

                      draft saved


                      draft discarded















































                       


                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function ()
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f52793037%2fwhat-does-stand-for-in-data-table-joins-with-on%23new-answer', 'question_page');

                      );

                      Post as a guest













































































                      Comments

                      Popular posts from this blog

                      Long meetings (6-7 hours a day): Being “babysat” by supervisor

                      Is the Concept of Multiple Fantasy Races Scientifically Flawed? [closed]

                      Confectionery