Left join without duplicate rows

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;







up vote
6
down vote

favorite
1












I have two tables called record and record_history. For each record, there could be more than one history. They can be joined by id and record_id. I want to get all the record's entries with recent record_history data. I have created the query like,



SELECT rec.id, rec.name, rech1.data AS last_history_data
FROM record rec
LEFT OUTER JOIN record_history rech1 ON (rec.id = rech1.record_id)
LEFT OUTER JOIN record_history rech2 ON (rec.id = rech2.record_id AND rech2.ts > rech1.ts)
WHERE rech2.id IS NULL
ORDER BY rec.id DESC


Here, I am getting the latest one by ts. This works as long as there are no duplicate ts entries. If the recent timestamp is repeated in record_history, this query returns more than one row for a record. How can we apply the limit here on the left join to restrict duplicate rows?







share|improve this question






















  • you're not selecting anything from rech2?
    – Evan Carroll
    Aug 14 at 14:32










  • @EvanCarroll rech2 is used here select the first record_history and my need for rech2 is to be NULL
    – RaR
    Aug 14 at 16:26
















up vote
6
down vote

favorite
1












I have two tables called record and record_history. For each record, there could be more than one history. They can be joined by id and record_id. I want to get all the record's entries with recent record_history data. I have created the query like,



SELECT rec.id, rec.name, rech1.data AS last_history_data
FROM record rec
LEFT OUTER JOIN record_history rech1 ON (rec.id = rech1.record_id)
LEFT OUTER JOIN record_history rech2 ON (rec.id = rech2.record_id AND rech2.ts > rech1.ts)
WHERE rech2.id IS NULL
ORDER BY rec.id DESC


Here, I am getting the latest one by ts. This works as long as there are no duplicate ts entries. If the recent timestamp is repeated in record_history, this query returns more than one row for a record. How can we apply the limit here on the left join to restrict duplicate rows?







share|improve this question






















  • you're not selecting anything from rech2?
    – Evan Carroll
    Aug 14 at 14:32










  • @EvanCarroll rech2 is used here select the first record_history and my need for rech2 is to be NULL
    – RaR
    Aug 14 at 16:26












up vote
6
down vote

favorite
1









up vote
6
down vote

favorite
1






1





I have two tables called record and record_history. For each record, there could be more than one history. They can be joined by id and record_id. I want to get all the record's entries with recent record_history data. I have created the query like,



SELECT rec.id, rec.name, rech1.data AS last_history_data
FROM record rec
LEFT OUTER JOIN record_history rech1 ON (rec.id = rech1.record_id)
LEFT OUTER JOIN record_history rech2 ON (rec.id = rech2.record_id AND rech2.ts > rech1.ts)
WHERE rech2.id IS NULL
ORDER BY rec.id DESC


Here, I am getting the latest one by ts. This works as long as there are no duplicate ts entries. If the recent timestamp is repeated in record_history, this query returns more than one row for a record. How can we apply the limit here on the left join to restrict duplicate rows?







share|improve this question














I have two tables called record and record_history. For each record, there could be more than one history. They can be joined by id and record_id. I want to get all the record's entries with recent record_history data. I have created the query like,



SELECT rec.id, rec.name, rech1.data AS last_history_data
FROM record rec
LEFT OUTER JOIN record_history rech1 ON (rec.id = rech1.record_id)
LEFT OUTER JOIN record_history rech2 ON (rec.id = rech2.record_id AND rech2.ts > rech1.ts)
WHERE rech2.id IS NULL
ORDER BY rec.id DESC


Here, I am getting the latest one by ts. This works as long as there are no duplicate ts entries. If the recent timestamp is repeated in record_history, this query returns more than one row for a record. How can we apply the limit here on the left join to restrict duplicate rows?









share|improve this question













share|improve this question




share|improve this question








edited Aug 14 at 14:13









Andriy M

15.3k53470




15.3k53470










asked Aug 14 at 13:12









RaR

1334




1334











  • you're not selecting anything from rech2?
    – Evan Carroll
    Aug 14 at 14:32










  • @EvanCarroll rech2 is used here select the first record_history and my need for rech2 is to be NULL
    – RaR
    Aug 14 at 16:26
















  • you're not selecting anything from rech2?
    – Evan Carroll
    Aug 14 at 14:32










  • @EvanCarroll rech2 is used here select the first record_history and my need for rech2 is to be NULL
    – RaR
    Aug 14 at 16:26















you're not selecting anything from rech2?
– Evan Carroll
Aug 14 at 14:32




you're not selecting anything from rech2?
– Evan Carroll
Aug 14 at 14:32












@EvanCarroll rech2 is used here select the first record_history and my need for rech2 is to be NULL
– RaR
Aug 14 at 16:26




@EvanCarroll rech2 is used here select the first record_history and my need for rech2 is to be NULL
– RaR
Aug 14 at 16:26










1 Answer
1






active

oldest

votes

















up vote
6
down vote



accepted










Unless you are in a very old version of Postgres, you don't need the double join. You can get the same result by using a LATERAL join.



The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id. With the LATERAL join method, the use of LIMIT is avoiding it anyway. There can be only 1 row returned from the lateral subquery. We can add a second condition so the choice is deterministic (from the two or more rows with same timestamp):



SELECT rec.id, rec.name, rech.data AS last_history_data
FROM record AS rec
LEFT OUTER JOIN LATERAL
( SELECT rech.data
FROM record_history AS rech
WHERE rec.id = rech.record_id
ORDER BY rech.ts DESC
-- ,rech.id DESC -- optional
LIMIT 1
) AS rech
ON TRUE
ORDER BY rec.id DESC ;



Regarding how to do this with the original method (2 joins and IS NULL check), you could change the ON condition - assuming there is an id column in history table so that (id) or at least (ts, id) is unique:



LEFT OUTER JOIN record_history rech2 
ON rec.id = rech2.record_id
AND (rech2.ts > rech1.ts OR rech2.ts = rech1.ts AND rech2.id > rech1.id)


By the way, you could replace that second LEFT join and IS NULL check with a NOT EXISTS subquery with same results and possibly similar efficiency (or even with a NOT IN subquery although that needs extra care for nullable columns, not recommended).






share|improve this answer






















  • Great! That worked. Didn't know about LATERAL. It is a good learning for me. Thanks!
    – RaR
    Aug 14 at 16:23










  • @ypercube The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id What condition we can add here to avoid duplicate?
    – RaR
    Aug 14 at 16:29






  • 1




    Why do a left join lateral .. on (true) rather than CROSS JOIN LATERAL ()?
    – Evan Carroll
    Aug 14 at 16:34






  • 3




    @Evan because there might be rows in record without any related row in record_history.
    – ypercubeᵀᴹ
    Aug 14 at 16:35






  • 1




    @EvanCarroll, sorry, I didn't mention clearly. I wanted all record's entries even if there are no history entries for some.
    – RaR
    Aug 14 at 17:11










Your Answer







StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "182"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);













 

draft saved


draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f214885%2fleft-join-without-duplicate-rows%23new-answer', 'question_page');

);

Post as a guest






























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
6
down vote



accepted










Unless you are in a very old version of Postgres, you don't need the double join. You can get the same result by using a LATERAL join.



The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id. With the LATERAL join method, the use of LIMIT is avoiding it anyway. There can be only 1 row returned from the lateral subquery. We can add a second condition so the choice is deterministic (from the two or more rows with same timestamp):



SELECT rec.id, rec.name, rech.data AS last_history_data
FROM record AS rec
LEFT OUTER JOIN LATERAL
( SELECT rech.data
FROM record_history AS rech
WHERE rec.id = rech.record_id
ORDER BY rech.ts DESC
-- ,rech.id DESC -- optional
LIMIT 1
) AS rech
ON TRUE
ORDER BY rec.id DESC ;



Regarding how to do this with the original method (2 joins and IS NULL check), you could change the ON condition - assuming there is an id column in history table so that (id) or at least (ts, id) is unique:



LEFT OUTER JOIN record_history rech2 
ON rec.id = rech2.record_id
AND (rech2.ts > rech1.ts OR rech2.ts = rech1.ts AND rech2.id > rech1.id)


By the way, you could replace that second LEFT join and IS NULL check with a NOT EXISTS subquery with same results and possibly similar efficiency (or even with a NOT IN subquery although that needs extra care for nullable columns, not recommended).






share|improve this answer






















  • Great! That worked. Didn't know about LATERAL. It is a good learning for me. Thanks!
    – RaR
    Aug 14 at 16:23










  • @ypercube The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id What condition we can add here to avoid duplicate?
    – RaR
    Aug 14 at 16:29






  • 1




    Why do a left join lateral .. on (true) rather than CROSS JOIN LATERAL ()?
    – Evan Carroll
    Aug 14 at 16:34






  • 3




    @Evan because there might be rows in record without any related row in record_history.
    – ypercubeᵀᴹ
    Aug 14 at 16:35






  • 1




    @EvanCarroll, sorry, I didn't mention clearly. I wanted all record's entries even if there are no history entries for some.
    – RaR
    Aug 14 at 17:11














up vote
6
down vote



accepted










Unless you are in a very old version of Postgres, you don't need the double join. You can get the same result by using a LATERAL join.



The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id. With the LATERAL join method, the use of LIMIT is avoiding it anyway. There can be only 1 row returned from the lateral subquery. We can add a second condition so the choice is deterministic (from the two or more rows with same timestamp):



SELECT rec.id, rec.name, rech.data AS last_history_data
FROM record AS rec
LEFT OUTER JOIN LATERAL
( SELECT rech.data
FROM record_history AS rech
WHERE rec.id = rech.record_id
ORDER BY rech.ts DESC
-- ,rech.id DESC -- optional
LIMIT 1
) AS rech
ON TRUE
ORDER BY rec.id DESC ;



Regarding how to do this with the original method (2 joins and IS NULL check), you could change the ON condition - assuming there is an id column in history table so that (id) or at least (ts, id) is unique:



LEFT OUTER JOIN record_history rech2 
ON rec.id = rech2.record_id
AND (rech2.ts > rech1.ts OR rech2.ts = rech1.ts AND rech2.id > rech1.id)


By the way, you could replace that second LEFT join and IS NULL check with a NOT EXISTS subquery with same results and possibly similar efficiency (or even with a NOT IN subquery although that needs extra care for nullable columns, not recommended).






share|improve this answer






















  • Great! That worked. Didn't know about LATERAL. It is a good learning for me. Thanks!
    – RaR
    Aug 14 at 16:23










  • @ypercube The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id What condition we can add here to avoid duplicate?
    – RaR
    Aug 14 at 16:29






  • 1




    Why do a left join lateral .. on (true) rather than CROSS JOIN LATERAL ()?
    – Evan Carroll
    Aug 14 at 16:34






  • 3




    @Evan because there might be rows in record without any related row in record_history.
    – ypercubeᵀᴹ
    Aug 14 at 16:35






  • 1




    @EvanCarroll, sorry, I didn't mention clearly. I wanted all record's entries even if there are no history entries for some.
    – RaR
    Aug 14 at 17:11












up vote
6
down vote



accepted







up vote
6
down vote



accepted






Unless you are in a very old version of Postgres, you don't need the double join. You can get the same result by using a LATERAL join.



The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id. With the LATERAL join method, the use of LIMIT is avoiding it anyway. There can be only 1 row returned from the lateral subquery. We can add a second condition so the choice is deterministic (from the two or more rows with same timestamp):



SELECT rec.id, rec.name, rech.data AS last_history_data
FROM record AS rec
LEFT OUTER JOIN LATERAL
( SELECT rech.data
FROM record_history AS rech
WHERE rec.id = rech.record_id
ORDER BY rech.ts DESC
-- ,rech.id DESC -- optional
LIMIT 1
) AS rech
ON TRUE
ORDER BY rec.id DESC ;



Regarding how to do this with the original method (2 joins and IS NULL check), you could change the ON condition - assuming there is an id column in history table so that (id) or at least (ts, id) is unique:



LEFT OUTER JOIN record_history rech2 
ON rec.id = rech2.record_id
AND (rech2.ts > rech1.ts OR rech2.ts = rech1.ts AND rech2.id > rech1.id)


By the way, you could replace that second LEFT join and IS NULL check with a NOT EXISTS subquery with same results and possibly similar efficiency (or even with a NOT IN subquery although that needs extra care for nullable columns, not recommended).






share|improve this answer














Unless you are in a very old version of Postgres, you don't need the double join. You can get the same result by using a LATERAL join.



The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id. With the LATERAL join method, the use of LIMIT is avoiding it anyway. There can be only 1 row returned from the lateral subquery. We can add a second condition so the choice is deterministic (from the two or more rows with same timestamp):



SELECT rec.id, rec.name, rech.data AS last_history_data
FROM record AS rec
LEFT OUTER JOIN LATERAL
( SELECT rech.data
FROM record_history AS rech
WHERE rec.id = rech.record_id
ORDER BY rech.ts DESC
-- ,rech.id DESC -- optional
LIMIT 1
) AS rech
ON TRUE
ORDER BY rec.id DESC ;



Regarding how to do this with the original method (2 joins and IS NULL check), you could change the ON condition - assuming there is an id column in history table so that (id) or at least (ts, id) is unique:



LEFT OUTER JOIN record_history rech2 
ON rec.id = rech2.record_id
AND (rech2.ts > rech1.ts OR rech2.ts = rech1.ts AND rech2.id > rech1.id)


By the way, you could replace that second LEFT join and IS NULL check with a NOT EXISTS subquery with same results and possibly similar efficiency (or even with a NOT IN subquery although that needs extra care for nullable columns, not recommended).







share|improve this answer














share|improve this answer



share|improve this answer








edited Aug 14 at 16:40

























answered Aug 14 at 13:36









ypercubeᵀᴹ

71.4k11118194




71.4k11118194











  • Great! That worked. Didn't know about LATERAL. It is a good learning for me. Thanks!
    – RaR
    Aug 14 at 16:23










  • @ypercube The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id What condition we can add here to avoid duplicate?
    – RaR
    Aug 14 at 16:29






  • 1




    Why do a left join lateral .. on (true) rather than CROSS JOIN LATERAL ()?
    – Evan Carroll
    Aug 14 at 16:34






  • 3




    @Evan because there might be rows in record without any related row in record_history.
    – ypercubeᵀᴹ
    Aug 14 at 16:35






  • 1




    @EvanCarroll, sorry, I didn't mention clearly. I wanted all record's entries even if there are no history entries for some.
    – RaR
    Aug 14 at 17:11
















  • Great! That worked. Didn't know about LATERAL. It is a good learning for me. Thanks!
    – RaR
    Aug 14 at 16:23










  • @ypercube The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id What condition we can add here to avoid duplicate?
    – RaR
    Aug 14 at 16:29






  • 1




    Why do a left join lateral .. on (true) rather than CROSS JOIN LATERAL ()?
    – Evan Carroll
    Aug 14 at 16:34






  • 3




    @Evan because there might be rows in record without any related row in record_history.
    – ypercubeᵀᴹ
    Aug 14 at 16:35






  • 1




    @EvanCarroll, sorry, I didn't mention clearly. I wanted all record's entries even if there are no history entries for some.
    – RaR
    Aug 14 at 17:11















Great! That worked. Didn't know about LATERAL. It is a good learning for me. Thanks!
– RaR
Aug 14 at 16:23




Great! That worked. Didn't know about LATERAL. It is a good learning for me. Thanks!
– RaR
Aug 14 at 16:23












@ypercube The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id What condition we can add here to avoid duplicate?
– RaR
Aug 14 at 16:29




@ypercube The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id What condition we can add here to avoid duplicate?
– RaR
Aug 14 at 16:29




1




1




Why do a left join lateral .. on (true) rather than CROSS JOIN LATERAL ()?
– Evan Carroll
Aug 14 at 16:34




Why do a left join lateral .. on (true) rather than CROSS JOIN LATERAL ()?
– Evan Carroll
Aug 14 at 16:34




3




3




@Evan because there might be rows in record without any related row in record_history.
– ypercubeᵀᴹ
Aug 14 at 16:35




@Evan because there might be rows in record without any related row in record_history.
– ypercubeᵀᴹ
Aug 14 at 16:35




1




1




@EvanCarroll, sorry, I didn't mention clearly. I wanted all record's entries even if there are no history entries for some.
– RaR
Aug 14 at 17:11




@EvanCarroll, sorry, I didn't mention clearly. I wanted all record's entries even if there are no history entries for some.
– RaR
Aug 14 at 17:11

















 

draft saved


draft discarded















































 


draft saved


draft discarded














StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f214885%2fleft-join-without-duplicate-rows%23new-answer', 'question_page');

);

Post as a guest













































































Comments

Popular posts from this blog

Long meetings (6-7 hours a day): Being “babysat” by supervisor

Is the Concept of Multiple Fantasy Races Scientifically Flawed? [closed]

Confectionery