Take a look at this query plan.
Yes, that arrow indicates a single row. This is an Index Seek, returning a single row. And yet it's rubbish. That's right – it's rubbish!
In fact, I had to provide a hint for it to use this index. A table scan would've been better, and this is what happens without the index hint.
Let's look at the query. I can promise you there's no bug here.
SELECT Name, ProductNumber FROM Production.Product WHERE DaysToManufacture < 4 AND ReorderPoint < 100 AND Color = 'Red';
And the covering index is defined as follows:
CREATE INDEX rf_ix_Covering ON Production.Product(DaysToManufacture) INCLUDE (Name, ProductNumber, Size, ReorderPoint, Color) WITH (FILLFACTOR=30);
Try it yourself on AdventureWorks, you should see the same.
Of course, there's a bit of trickery going on. For example, I purposely spaced out the NCIX using FILLFACTOR=30. But despite that, most DBAs would tell you that the first plan is better than the second. But they'd be wrong.
Let's look at the tooltip of each of those Index operations. First, the Clustered Index Scan.
There's no real surprise here. We know there's only one row being returned, I told you that before. There's a predicate listed which is tested on every row, and the Name and ProductNumber are the two fields which are being outputted. Estimated Subtree Cost of 0.0127253.
How about that Seek then, which the Query Optimizer only chose when its hand was forced by a hint?
You'll probably notice immediately that the Estimated Subtree Cost is higher! 0.0141 instead of 0.0127. The reason why is in the Predicate and Seek Predicate sections. Sure, it can do a Seek – it can seek on the first key column of the index, on DaysToManufacture. But having found those records, the data is still very much in DaysToManufacture order. We can't search on those predicates very easily at all, and we're left with them being handled as a Residual Predicate.
To use a phone-book example, this is like searching for people called "Rob F." – I can easily find people whose last name starts with F, but would then have to go through all of them looking for people called Rob. That residual check could be painful. Actually, this particular example is more like saying "Find me anyone called Rob whose surname is < Z", as the Seek Predicate involved returns most of the table.
Unfortunately, the query plan doesn't show us how many rows go through the Residual Predicate check. If it did, we'd be able to see that it's over 400 rows (in a table of 504 rows), and we might rethink our indexing strategy.
In fact, I can easily make an index which causes the same query to perform a full Index Scan, plus Lookup, and which is half the cost of either of the plans we've seen so far.
CREATE INDEX rf_ix_NotCovering ON Production.Product(DaysToManufacture) WHERE DaysToManufacture < 4 AND ReorderPoint < 100 AND Color = 'Red';
Of course, now I'm really cheating, by using a Filtered Index which would only contain a single row. And I could've eliminated the Lookup by including the other columns, halving the cost yet again.
CREATE INDEX rf_ix_Covering2 ON Production.Product(DaysToManufacture) INCLUDE (Name, ProductNumber) WHERE DaysToManufacture < 4 AND ReorderPoint < 100 AND Color = 'Red';
Interestingly, the tooltips for both of these Filtered Index operations don't show the predicates – the Query Optimizer knows that the index has already done the necessary filtering, and in this case, it doesn't need to apply any further predicates, whether Seek or Residual.
So… just because an index is covering a query, doesn't mean it's necessarily the right choice. The better question is "Is this index good for this query?"
(Incidentally, this is the kind of thing that I'd go through in my pre-conference seminar at SQLPASS if chosen, and in the talk on "Joins, SARGability and the Evils of Residualiciousness" if that one gets picked)