Warnings: Operation caused residual IO versus Key Lookups
I've seen this warning in SQL Server 2017 execution plans:
Warning: Operation caused residual IO. The actual number of rows read was (a high number), but the number of rows returned was 40.
Here is a snippet from SQLSentry Plan Explorer:
To improve the code, I've added a non-clustered index so that SQL Server can get to the relevant rows directly.
It works fine, but normally there would be too many (big) columns to include in the index. It looks like this:
If I only add the index without include columns, and force its use, it looks like this:
Obviously, SQL Server thinks the key lookup is much more expensive than the residual IO.
I have a test setup without much test data (yet), but when the code goes into production it needs to work with much more data, so I'm fairly sure that some sort of non-clustered index is needed.
Are key lookups really so expensive on SSDs that I have to create full-fat indexes (with a lot of include columns)?
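For context, the two alternatives under discussion can be sketched as below. The table and column names are hypothetical stand-ins; only the index name IX_BatchNo_DeviceNo_CreatedUTC comes from the linked plan:

```sql
-- Covering ("full-fat") index: the query is satisfied entirely from the
-- index, but every included column is duplicated in the index pages and
-- must be maintained on every update.
CREATE NONCLUSTERED INDEX IX_BatchNo_DeviceNo_CreatedUTC_Covering
ON dbo.Readings (BatchNo, DeviceNo, CreatedUTC)
INCLUDE (Payload, Status, OperatorName);  -- hypothetical wide columns

-- Narrow index: small and cheap to maintain, but every matching row
-- costs an extra key lookup into the clustered index at runtime.
CREATE NONCLUSTERED INDEX IX_BatchNo_DeviceNo_CreatedUTC
ON dbo.Readings (BatchNo, DeviceNo, CreatedUTC);
```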
sql-server sql-server-2017 nonclustered-index
Question for you - Based on your last paragraph, why would the cost of a lookup be less based on the hardware? (Assumedly the same hardware that the non-clustered index will be running on) I'm unclear on that.
– George.Palacios
59 mins ago
It's estimated to be 76.9% of the cost of that plan. That doesn't mean it's expensive. Look at the I/O cost of 0.06 compared to your original plan with I/O cost over 10. I think you'll be better off, but you should test with actual plans against enough data that really simulates what production will be like (and if the query runs long enough for us to collect data from sys.dm_exec_query_profiles, we'll re-cost it from actual costs vs. estimated). Stop using the estimated cost % as some absolute indicator of cost - it's relative and it's often out to lunch.
– Aaron Bertrand♦
45 mins ago
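The DMV mentioned in the comment above can be queried while a statement is executing; a minimal sketch, in which the session_id filter is a hypothetical value for illustration:

```sql
-- Per-operator actual vs. estimated row counts for a query currently
-- running in session 53. Requires runtime statistics collection for that
-- session (e.g. an actual-plan request or lightweight query profiling).
SELECT qp.physical_operator_name,
       qp.node_id,
       qp.row_count,          -- actual rows produced so far
       qp.estimate_row_count  -- optimizer's estimate
FROM sys.dm_exec_query_profiles AS qp
WHERE qp.session_id = 53      -- hypothetical session id
ORDER BY qp.node_id;
```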
@George.Palacios See brentozar.com/pastetheplan/?id=SJtiRte2X, but it is part of a long stored procedure. Look for IX_BatchNo_DeviceNo_CreatedUTC.
– Henrik Staun Poulsen
42 mins ago
@AaronBertrand The estimated cost of the key lookups is 31.0. Are you telling me that SQL Server does not know the cost of the residual IO?
– Henrik Staun Poulsen
31 mins ago
@AaronBertrand If I'm not mistaken, those % values are still the estimates even when looking at the actual plan. Which, in my opinion, makes them even more unreliable.
– Jacob H
26 mins ago
asked 1 hour ago
Henrik Staun Poulsen
1 Answer
The cost model used by the optimizer is exactly that: a model. It produces generally good results over a wide range of workloads, on a wide range of database designs, on a wide range of hardware.
You should generally not assume that individual cost estimates will strongly correlate with runtime performance on a particular hardware configuration. The point of costing is to allow the optimizer to make an educated choice between candidate physical alternatives for the same logical operation.
When you really get into the details, a skilled database professional (with the time to spare on tuning an important query) can often do better. To that extent, you can think of the optimizer's plan selection as a good starting point. In most cases, that starting point will also be the ending point, since the solution found is good enough.
Included columns are great for read-heavy OLTP workloads, where the trade-off between index space usage and update cost versus runtime read performance makes sense.
All that said: yes, lookups can be expensive even on SSDs, and ultimately even when reading exclusively from memory. Traversing B-tree structures is not free, and the cost mounts as you do more of them.
Only you can decide if the trade-off is worth it in your case. Test both alternatives on a representative data sample, and make an informed choice.
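One way to make that comparison concrete is to measure logical reads and elapsed time for both alternatives instead of relying on estimated cost percentages. A sketch, with the table name, predicate values, and hinted index name as hypothetical placeholders:

```sql
SET STATISTICS IO, TIME ON;

-- Alternative 1: force the narrow index, paying for key lookups.
SELECT *
FROM dbo.Readings WITH (INDEX (IX_BatchNo_DeviceNo_CreatedUTC))
WHERE BatchNo = 42 AND DeviceNo = 7;

-- Alternative 2: let the optimizer choose (e.g. a covering index, if present).
SELECT *
FROM dbo.Readings
WHERE BatchNo = 42 AND DeviceNo = 7;

SET STATISTICS IO, TIME OFF;
-- Compare the "logical reads" and "elapsed time" figures in the Messages tab.
```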
answered 18 mins ago
Paul White♦