Monday, January 30, 2012

Indexing - take the hint and leave it to the experts



The most common T-SQL command in use has to be the SELECT statement, it is the bedrock of any SQL Professional's day. Sometimes it's used to snatch some data from a table or two while some quick investigation is done, other times it is at the heart of a stored procedure or view that will inform business decisions for coming months or even years.
The latter purpose means you should spend some time making sure it is as efficient as possible. Not endless hours to save a millisecond or two (I mentioned that in my last blog on balancing effort with expected reward) but a little while making sure it is pretty good.
During this time you might read through the MSDN details on the SELECT statement (http://msdn.microsoft.com/en-us/library/ms176104.aspx ) and that might lead you to the Query Hint help (http://msdn.microsoft.com/en-us/library/ms181714.aspx). Now as a diligent DBA you are looking at your execution plans and may want to guide the query to use a particular index that you know exists on the table but for some stupid reason the query optimiser isn't using. This might be a bad idea. It might be a good idea but, it might be a very bad idea. It isn't often that the average DBA knows better than the optimiser.
Lets look at a simple table and a simple query.
CREATE TABLE [dbo].[AccountFlags]
    (
      [AccountsFlagID] [int] NOT NULL
                             IDENTITY(1, 1) ,
      [Account_Ref] [char](7) NOT NULL ,
      [FlagID] [int] NOT NULL ,
      [FlagValue] [bit] NULL
    )
        ALTER TABLE [dbo].[AccountFlags]        ADD CONSTRAINT [PK_ACCOUNTSFLAGS]        PRIMARY KEY CLUSTERED  ([AccountsFlagID]) ON [PRIMARY]
                CREATE UNIQUE NONCLUSTERED INDEX [IX_ACCOUNTSFLAGS]                ON [dbo].[AccountFlags]([Account_Ref],[FlagID]) ON [PRIMARY]
        CREATE NONCLUSTERED INDEX [IX_ACCOUNTSFLAGS2]        ON [dbo].[AccountFlags] ([Account_Ref],[FlagID] DESC)ON [PRIMARY]
    

    CREATE NONCLUSTERED INDEX [IX_ACCOUNTSFLAGS_IDValue]    ON [dbo].[AccountFlags] ([FlagID], [FlagValue])    INCLUDE ([Account_Ref]) ON [PRIMARY]
NOTE: Now this post isn't about what indexes to create, how to create them or any such intricacies, it is purely to show how an query hint suggesting a certain index is used can disrupt TSQL execution immediately and in an on-going fashion mislead your decisions and as a DBA. If you want explanations of what indexes to create then you can find plenty of that advise in other blogs and Microsoft content.
On our table we have plenty of indexes covering the columns we are querying so we should be seeing really fast query execution. However, if you have a query that has an query hint in it as follows
SELECT  FlagValueFROM    dbo.[AccountFlags] AS af WITH ( INDEX ( PK_AccountsFLAGS ) )WHERE   FlagID IN ( 1, 5 )
        AND Account_Ref = 'HAYNE04'
then the results of running the query in SSMS will be misleading and the effect of running it within a production system will draw heavily on your server's resources
IndexHints03_MissingindexFor a start the missing index suggestion will show in the results pane and that missing index suggestion is always the same.
CREATE NONCLUSTERED INDEX []
   
ON[dbo].[AccountFlags]
([Account_Ref],[FlagID])
Despite there being an index that matches this exactly.By using the query hint you are forcing the query to run in a particular way and the table doesn't get evaluated for existing indexes so the suggestion is always made to have an index created that would help.
Now the query above has a cost of 10.0609, if we remove the query hint however the cost drops to 0.0075, some 1300 times less effort. This is born out by the execution times of 1196ms for the querywith the hint and 26ms for the one without. The results of using SET STATISTICS IO ON is another compelling argument too:
With query hint -
Table 'AccountFlags'. Scan count 5, logical reads 9461, physical reads 167, read-ahead reads 9397, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Without query hint -
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'AccountFlags'. Scan count 1, logical reads 3, physical reads 3, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. 
This is clearly causing a lot of problems:
  1. The index usage statistics show a high occurrences of Primary Key scans.
    Having the index related DMVs in SQL Server has been a godsend to anyone working on indexes, knowing how indexes are being used (or not) and what indexes might be useful if created based on actual usage statistics is great. If the collection of those statistics is skewed by inappropriate query hints then all bets are off. You have no idea what indexes would actually be any use.
  2. The missing index DMVs recommend an index that is already in existence but there is nothing preventing that index being duplicated.
    As explained above, having bad statistics is worse than having no statistics. You wont know your  DMV statistics are being skewed by Query hints unless you go and find them so you may well be getting it wrong already.
  3. The database is storing an awful lot more data than it might need to. If you are following the advice of the missing index information in the query plan then you could be creating index upon index on your tables. These will be having no effect on improving your performance as the query hint will prevent them being used and they will be taking up a lot of space. This is space that is being updated every time an INSERT, UPDATE or DELETE takes place and also affects your Backup and Restore times and may even force you onto more hardware in order to cope.
    IndexHints03_Size
    IndexHints03_Size_1IndexIndexHints03_Size_2Index
    IndexHints03_Size_3Index 
  4. The query is still performing badly.
    In this case it is only (!) 1300 times worse than it might be. If you have more complex tables then you will suffer even more.
My advice? Leave Query Hints to extreme cases and apply them with advice from an index expert. Going it alone means you could be making hard work of your data access. Do you hate your servers that much?

Sources:
http://www.simple-talk.com/community/blogs/jonathanallen/archive/2012/01/10/105306.aspx

No comments:

Post a Comment