Left join without duplicate rows
Clash Royale CLAN TAG#URR8PPP
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;
up vote
6
down vote
favorite
I have two tables called record
and record_history
. For each record, there could be more than one history. They can be joined by id
and record_id
. I want to get all the record
's entries with recent record_history
data. I have created the query like,
SELECT rec.id, rec.name, rech1.data AS last_history_data
FROM record rec
LEFT OUTER JOIN record_history rech1 ON (rec.id = rech1.record_id)
LEFT OUTER JOIN record_history rech2 ON (rec.id = rech2.record_id AND rech2.ts > rech1.ts)
WHERE rech2.id IS NULL
ORDER BY rec.id DESC
Here, I am getting the latest one by ts
. This works as long as there are no duplicate ts
entries. If the recent timestamp is repeated in record_history
, this query returns more than one row for a record. How can we apply the limit here on the left join to restrict duplicate rows?
sql postgresql greatest-n-per-group
add a comment |Â
up vote
6
down vote
favorite
I have two tables called record
and record_history
. For each record, there could be more than one history. They can be joined by id
and record_id
. I want to get all the record
's entries with recent record_history
data. I have created the query like,
SELECT rec.id, rec.name, rech1.data AS last_history_data
FROM record rec
LEFT OUTER JOIN record_history rech1 ON (rec.id = rech1.record_id)
LEFT OUTER JOIN record_history rech2 ON (rec.id = rech2.record_id AND rech2.ts > rech1.ts)
WHERE rech2.id IS NULL
ORDER BY rec.id DESC
Here, I am getting the latest one by ts
. This works as long as there are no duplicate ts
entries. If the recent timestamp is repeated in record_history
, this query returns more than one row for a record. How can we apply the limit here on the left join to restrict duplicate rows?
sql postgresql greatest-n-per-group
you're not selecting anything from rech2?
â Evan Carroll
Aug 14 at 14:32
@EvanCarrollrech2
is used here select the first record_history and my need for rech2 is to beNULL
â RaR
Aug 14 at 16:26
add a comment |Â
up vote
6
down vote
favorite
up vote
6
down vote
favorite
I have two tables called record
and record_history
. For each record, there could be more than one history. They can be joined by id
and record_id
. I want to get all the record
's entries with recent record_history
data. I have created the query like,
SELECT rec.id, rec.name, rech1.data AS last_history_data
FROM record rec
LEFT OUTER JOIN record_history rech1 ON (rec.id = rech1.record_id)
LEFT OUTER JOIN record_history rech2 ON (rec.id = rech2.record_id AND rech2.ts > rech1.ts)
WHERE rech2.id IS NULL
ORDER BY rec.id DESC
Here, I am getting the latest one by ts
. This works as long as there are no duplicate ts
entries. If the recent timestamp is repeated in record_history
, this query returns more than one row for a record. How can we apply the limit here on the left join to restrict duplicate rows?
sql postgresql greatest-n-per-group
I have two tables called record
and record_history
. For each record, there could be more than one history. They can be joined by id
and record_id
. I want to get all the record
's entries with recent record_history
data. I have created the query like,
SELECT rec.id, rec.name, rech1.data AS last_history_data
FROM record rec
LEFT OUTER JOIN record_history rech1 ON (rec.id = rech1.record_id)
LEFT OUTER JOIN record_history rech2 ON (rec.id = rech2.record_id AND rech2.ts > rech1.ts)
WHERE rech2.id IS NULL
ORDER BY rec.id DESC
Here, I am getting the latest one by ts
. This works as long as there are no duplicate ts
entries. If the recent timestamp is repeated in record_history
, this query returns more than one row for a record. How can we apply the limit here on the left join to restrict duplicate rows?
sql postgresql greatest-n-per-group
edited Aug 14 at 14:13
Andriy M
15.3k53470
15.3k53470
asked Aug 14 at 13:12
RaR
1334
1334
you're not selecting anything from rech2?
â Evan Carroll
Aug 14 at 14:32
@EvanCarrollrech2
is used here select the first record_history and my need for rech2 is to beNULL
â RaR
Aug 14 at 16:26
add a comment |Â
you're not selecting anything from rech2?
â Evan Carroll
Aug 14 at 14:32
@EvanCarrollrech2
is used here select the first record_history and my need for rech2 is to beNULL
â RaR
Aug 14 at 16:26
you're not selecting anything from rech2?
â Evan Carroll
Aug 14 at 14:32
you're not selecting anything from rech2?
â Evan Carroll
Aug 14 at 14:32
@EvanCarroll
rech2
is used here select the first record_history and my need for rech2 is to be NULL
â RaR
Aug 14 at 16:26
@EvanCarroll
rech2
is used here select the first record_history and my need for rech2 is to be NULL
â RaR
Aug 14 at 16:26
add a comment |Â
1 Answer
1
active
oldest
votes
up vote
6
down vote
accepted
Unless you are in a very old version of Postgres, you don't need the double join. You can get the same result by using a LATERAL
join.
The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id
. With the LATERAL
join method, the use of LIMIT
is avoiding it anyway. There can be only 1 row returned from the lateral subquery. We can add a second condition so the choice is deterministic (from the two or more rows with same timestamp):
SELECT rec.id, rec.name, rech.data AS last_history_data
FROM record AS rec
LEFT OUTER JOIN LATERAL
( SELECT rech.data
FROM record_history AS rech
WHERE rec.id = rech.record_id
ORDER BY rech.ts DESC
-- ,rech.id DESC -- optional
LIMIT 1
) AS rech
ON TRUE
ORDER BY rec.id DESC ;
Regarding how to do this with the original method (2 joins and IS NULL
check), you could change the ON
condition - assuming there is an id
column in history table so that (id)
or at least (ts, id)
is unique:
LEFT OUTER JOIN record_history rech2
ON rec.id = rech2.record_id
AND (rech2.ts > rech1.ts OR rech2.ts = rech1.ts AND rech2.id > rech1.id)
By the way, you could replace that second LEFT
join and IS NULL
check with a NOT EXISTS
subquery with same results and possibly similar efficiency (or even with a NOT IN
subquery although that needs extra care for nullable columns, not recommended).
Great! That worked. Didn't know aboutLATERAL
. It is a good learning for me. Thanks!
â RaR
Aug 14 at 16:23
@ypercubeThe duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id
What condition we can add here to avoid duplicate?
â RaR
Aug 14 at 16:29
1
Why do aleft join lateral .. on (true)
rather thanCROSS JOIN LATERAL ()
?
â Evan Carroll
Aug 14 at 16:34
3
@Evan because there might be rows inrecord
without any related row inrecord_history
.
â ypercubeáµÂá´¹
Aug 14 at 16:35
1
@EvanCarroll, sorry, I didn't mention clearly. I wanted allrecord
's entries even if there are no history entries for some.
â RaR
Aug 14 at 17:11
 |Â
show 1 more comment
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
6
down vote
accepted
Unless you are in a very old version of Postgres, you don't need the double join. You can get the same result by using a LATERAL
join.
The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id
. With the LATERAL
join method, the use of LIMIT
is avoiding it anyway. There can be only 1 row returned from the lateral subquery. We can add a second condition so the choice is deterministic (from the two or more rows with same timestamp):
SELECT rec.id, rec.name, rech.data AS last_history_data
FROM record AS rec
LEFT OUTER JOIN LATERAL
( SELECT rech.data
FROM record_history AS rech
WHERE rec.id = rech.record_id
ORDER BY rech.ts DESC
-- ,rech.id DESC -- optional
LIMIT 1
) AS rech
ON TRUE
ORDER BY rec.id DESC ;
Regarding how to do this with the original method (2 joins and IS NULL
check), you could change the ON
condition - assuming there is an id
column in history table so that (id)
or at least (ts, id)
is unique:
LEFT OUTER JOIN record_history rech2
ON rec.id = rech2.record_id
AND (rech2.ts > rech1.ts OR rech2.ts = rech1.ts AND rech2.id > rech1.id)
By the way, you could replace that second LEFT
join and IS NULL
check with a NOT EXISTS
subquery with same results and possibly similar efficiency (or even with a NOT IN
subquery although that needs extra care for nullable columns, not recommended).
Great! That worked. Didn't know aboutLATERAL
. It is a good learning for me. Thanks!
â RaR
Aug 14 at 16:23
@ypercubeThe duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id
What condition we can add here to avoid duplicate?
â RaR
Aug 14 at 16:29
1
Why do aleft join lateral .. on (true)
rather thanCROSS JOIN LATERAL ()
?
â Evan Carroll
Aug 14 at 16:34
3
@Evan because there might be rows inrecord
without any related row inrecord_history
.
â ypercubeáµÂá´¹
Aug 14 at 16:35
1
@EvanCarroll, sorry, I didn't mention clearly. I wanted allrecord
's entries even if there are no history entries for some.
â RaR
Aug 14 at 17:11
 |Â
show 1 more comment
up vote
6
down vote
accepted
Unless you are in a very old version of Postgres, you don't need the double join. You can get the same result by using a LATERAL
join.
The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id
. With the LATERAL
join method, the use of LIMIT
is avoiding it anyway. There can be only 1 row returned from the lateral subquery. We can add a second condition so the choice is deterministic (from the two or more rows with same timestamp):
SELECT rec.id, rec.name, rech.data AS last_history_data
FROM record AS rec
LEFT OUTER JOIN LATERAL
( SELECT rech.data
FROM record_history AS rech
WHERE rec.id = rech.record_id
ORDER BY rech.ts DESC
-- ,rech.id DESC -- optional
LIMIT 1
) AS rech
ON TRUE
ORDER BY rec.id DESC ;
Regarding how to do this with the original method (2 joins and IS NULL
check), you could change the ON
condition - assuming there is an id
column in history table so that (id)
or at least (ts, id)
is unique:
LEFT OUTER JOIN record_history rech2
ON rec.id = rech2.record_id
AND (rech2.ts > rech1.ts OR rech2.ts = rech1.ts AND rech2.id > rech1.id)
By the way, you could replace that second LEFT
join and IS NULL
check with a NOT EXISTS
subquery with same results and possibly similar efficiency (or even with a NOT IN
subquery although that needs extra care for nullable columns, not recommended).
Great! That worked. Didn't know aboutLATERAL
. It is a good learning for me. Thanks!
â RaR
Aug 14 at 16:23
@ypercubeThe duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id
What condition we can add here to avoid duplicate?
â RaR
Aug 14 at 16:29
1
Why do aleft join lateral .. on (true)
rather thanCROSS JOIN LATERAL ()
?
â Evan Carroll
Aug 14 at 16:34
3
@Evan because there might be rows inrecord
without any related row inrecord_history
.
â ypercubeáµÂá´¹
Aug 14 at 16:35
1
@EvanCarroll, sorry, I didn't mention clearly. I wanted allrecord
's entries even if there are no history entries for some.
â RaR
Aug 14 at 17:11
 |Â
show 1 more comment
up vote
6
down vote
accepted
up vote
6
down vote
accepted
Unless you are in a very old version of Postgres, you don't need the double join. You can get the same result by using a LATERAL
join.
The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id
. With the LATERAL
join method, the use of LIMIT
is avoiding it anyway. There can be only 1 row returned from the lateral subquery. We can add a second condition so the choice is deterministic (from the two or more rows with same timestamp):
SELECT rec.id, rec.name, rech.data AS last_history_data
FROM record AS rec
LEFT OUTER JOIN LATERAL
( SELECT rech.data
FROM record_history AS rech
WHERE rec.id = rech.record_id
ORDER BY rech.ts DESC
-- ,rech.id DESC -- optional
LIMIT 1
) AS rech
ON TRUE
ORDER BY rec.id DESC ;
Regarding how to do this with the original method (2 joins and IS NULL
check), you could change the ON
condition - assuming there is an id
column in history table so that (id)
or at least (ts, id)
is unique:
LEFT OUTER JOIN record_history rech2
ON rec.id = rech2.record_id
AND (rech2.ts > rech1.ts OR rech2.ts = rech1.ts AND rech2.id > rech1.id)
By the way, you could replace that second LEFT
join and IS NULL
check with a NOT EXISTS
subquery with same results and possibly similar efficiency (or even with a NOT IN
subquery although that needs extra care for nullable columns, not recommended).
Unless you are in a very old version of Postgres, you don't need the double join. You can get the same result by using a LATERAL
join.
The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id
. With the LATERAL
join method, the use of LIMIT
is avoiding it anyway. There can be only 1 row returned from the lateral subquery. We can add a second condition so the choice is deterministic (from the two or more rows with same timestamp):
SELECT rec.id, rec.name, rech.data AS last_history_data
FROM record AS rec
LEFT OUTER JOIN LATERAL
( SELECT rech.data
FROM record_history AS rech
WHERE rec.id = rech.record_id
ORDER BY rech.ts DESC
-- ,rech.id DESC -- optional
LIMIT 1
) AS rech
ON TRUE
ORDER BY rec.id DESC ;
Regarding how to do this with the original method (2 joins and IS NULL
check), you could change the ON
condition - assuming there is an id
column in history table so that (id)
or at least (ts, id)
is unique:
LEFT OUTER JOIN record_history rech2
ON rec.id = rech2.record_id
AND (rech2.ts > rech1.ts OR rech2.ts = rech1.ts AND rech2.id > rech1.id)
By the way, you could replace that second LEFT
join and IS NULL
check with a NOT EXISTS
subquery with same results and possibly similar efficiency (or even with a NOT IN
subquery although that needs extra care for nullable columns, not recommended).
edited Aug 14 at 16:40
answered Aug 14 at 13:36
ypercubeáµÂá´¹
71.4k11118194
71.4k11118194
Great! That worked. Didn't know aboutLATERAL
. It is a good learning for me. Thanks!
â RaR
Aug 14 at 16:23
@ypercubeThe duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id
What condition we can add here to avoid duplicate?
â RaR
Aug 14 at 16:29
1
Why do aleft join lateral .. on (true)
rather thanCROSS JOIN LATERAL ()
?
â Evan Carroll
Aug 14 at 16:34
3
@Evan because there might be rows inrecord
without any related row inrecord_history
.
â ypercubeáµÂá´¹
Aug 14 at 16:35
1
@EvanCarroll, sorry, I didn't mention clearly. I wanted allrecord
's entries even if there are no history entries for some.
â RaR
Aug 14 at 17:11
 |Â
show 1 more comment
Great! That worked. Didn't know aboutLATERAL
. It is a good learning for me. Thanks!
â RaR
Aug 14 at 16:23
@ypercubeThe duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id
What condition we can add here to avoid duplicate?
â RaR
Aug 14 at 16:29
1
Why do aleft join lateral .. on (true)
rather thanCROSS JOIN LATERAL ()
?
â Evan Carroll
Aug 14 at 16:34
3
@Evan because there might be rows inrecord
without any related row inrecord_history
.
â ypercubeáµÂá´¹
Aug 14 at 16:35
1
@EvanCarroll, sorry, I didn't mention clearly. I wanted allrecord
's entries even if there are no history entries for some.
â RaR
Aug 14 at 17:11
Great! That worked. Didn't know about
LATERAL
. It is a good learning for me. Thanks!â RaR
Aug 14 at 16:23
Great! That worked. Didn't know about
LATERAL
. It is a good learning for me. Thanks!â RaR
Aug 14 at 16:23
@ypercube
The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id
What condition we can add here to avoid duplicate?â RaR
Aug 14 at 16:29
@ypercube
The duplicate results can be avoided in your method by adding a second condition besides the rec.id = rech2.record_id
What condition we can add here to avoid duplicate?â RaR
Aug 14 at 16:29
1
1
Why do a
left join lateral .. on (true)
rather than CROSS JOIN LATERAL ()
?â Evan Carroll
Aug 14 at 16:34
Why do a
left join lateral .. on (true)
rather than CROSS JOIN LATERAL ()
?â Evan Carroll
Aug 14 at 16:34
3
3
@Evan because there might be rows in
record
without any related row in record_history
.â ypercubeáµÂá´¹
Aug 14 at 16:35
@Evan because there might be rows in
record
without any related row in record_history
.â ypercubeáµÂá´¹
Aug 14 at 16:35
1
1
@EvanCarroll, sorry, I didn't mention clearly. I wanted all
record
's entries even if there are no history entries for some.â RaR
Aug 14 at 17:11
@EvanCarroll, sorry, I didn't mention clearly. I wanted all
record
's entries even if there are no history entries for some.â RaR
Aug 14 at 17:11
 |Â
show 1 more comment
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f214885%2fleft-join-without-duplicate-rows%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
you're not selecting anything from rech2?
â Evan Carroll
Aug 14 at 14:32
@EvanCarroll
rech2
is used here select the first record_history and my need for rech2 is to beNULL
â RaR
Aug 14 at 16:26