Solved

query huge table with WHERE on one column - index question

Posted on 2013-10-25
Last Modified: 2013-11-06
There is a table with 57M records. It has a date field with only 50 distinct date values (so approx. 1.13M rows per date on average).

There is a query that does a SELECT from this table, with its only condition on the date field:
SELECT 6 columns from table where field = '2011-09-30'

Is a nonclustered index on that field recommended? Will it help, given that there are so many records and so few distinct dates for the WHERE condition to filter on?
Question by:25112
8 Comments
 
LVL 73

Assisted Solution

by:sdstuber
sdstuber earned 84 total points
ID: 39601276
What's the data distribution within the table?

Are your 50 values distributed more or less uniformly throughout the table?

That is, if you read your table block-by-block from disk will at least one of those 1.13 million rows be in each block or nearly so?

If so, then an index won't help, even if the optimizer tries to use one, because you'll still be reading the whole table (or nearly so) anyway.  In this case the index will actually make things worse, because you have to process the index in addition to reading the whole table.
 
LVL 9

Assisted Solution

by:COANetwork
COANetwork earned 83 total points
ID: 39601284
It will help, somewhat, since an index seek is, as a rule, much faster than a table scan.  You can run an estimated execution plan to see what MS suggests.  On the other hand, if your table gets a lot of inserts and/or updates, those operations will be slowed by the index.  Also, if you have so few distinct values, the index may be ignored if the optimizer finds it faster to scan than seek.
Indexed views would probably offer best performance, but in your case it may be cumbersome - creating 50 of them.
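
One way to get the estimated plan mentioned above without running the query is sketched here. The table and column names (dbo.SeasonData, SeasonDate, Col1 to Col6) are placeholders, since the real schema was not posted, and GO is the batch separator used by SSMS/sqlcmd rather than a T-SQL statement.

```sql
-- Hypothetical names: dbo.SeasonData, SeasonDate, Col1..Col6 are placeholders.
-- SET SHOWPLAN_XML must be the only statement in its batch, hence the GO lines.
SET SHOWPLAN_XML ON;
GO

-- With SHOWPLAN_XML on, this is compiled but not executed;
-- the estimated plan is returned as XML instead of rows.
SELECT Col1, Col2, Col3, Col4, Col5, Col6
FROM dbo.SeasonData
WHERE SeasonDate = '20110930';  -- yyyymmdd form of the date in the question
GO

SET SHOWPLAN_XML OFF;
GO
```

If the optimizer thinks an index is missing, the suggestion normally shows up inside that plan XML. Treat it as a hint to evaluate, not a final answer.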
 
LVL 39

Assisted Solution

by:lcohan
lcohan earned 167 total points
ID: 39601336
You MUST be careful to match EXACTLY the data type in the table column to ALL the code variables for the index to be used properly - I.E. datetime <> smalldatetime.

I would INCLUDE in the index the KEY of that row as well.
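
A minimal sketch of the type-matching advice, assuming the date column is declared as datetime (the actual type was not posted). dbo.SeasonData, SeasonDate and Col1 to Col6 are hypothetical names.

```sql
-- Assumption: SeasonDate is a datetime column in the hypothetical dbo.SeasonData.
-- Declare the variable with the same type as the column, so the comparison
-- does not depend on an implicit conversion between date types.
DECLARE @SeasonDate datetime = '20110930';

SELECT Col1, Col2, Col3, Col4, Col5, Col6
FROM dbo.SeasonData
WHERE SeasonDate = @SeasonDate;
```

On the point about the row key: when the table has a clustered index, SQL Server already stores the clustering key in every nonclustered index row, so it may not need to be listed explicitly.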
 
LVL 5

Author Comment

by:25112
ID: 39601400
>>What's the data distribution within the table?
 It is in order. It is a date value: a season of data has one date, and after a season all the records get the next date, and so forth.
 
 There are only bulk inserts into this table. No updates or deletes.

 
LVL 50

Assisted Solution

by:Lowfatspread
Lowfatspread earned 83 total points
ID: 39605434
Your nonclustered date index is probably not going to be used, as the data is likely to be spread evenly across the underlying table, so a table scan will be deemed quickest...

You may wish to consider using filtered indexes, however, or including the 6 columns you want in the index...

Can you explain your scenario in more detail?

Are your queries generally on the latest set(s) of data?
    If so, setting up filtered indexes for the sets that are actually used may be appropriate.

Is it always the same 6 columns to be extracted?
Would it make sense to change the clustering key?

...
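
A filtered index along the lines suggested above might look like the following sketch. It assumes queries mostly target the most recent dates; the names (dbo.SeasonData, SeasonDate, Col1 to Col6, IX_SeasonData_Recent) and the cut-off date are placeholders, not details from the question.

```sql
-- Hypothetical names and cut-off date; filtered indexes need SQL Server 2008 or later.
-- Only rows on or after the cut-off are stored in the index, which keeps it
-- much smaller than an index over all 57M rows.
CREATE NONCLUSTERED INDEX IX_SeasonData_Recent
ON dbo.SeasonData (SeasonDate)
INCLUDE (Col1, Col2, Col3, Col4, Col5, Col6)
WHERE SeasonDate >= '20110930';
```

One caveat: the optimizer can only use a filtered index when it can prove the query's predicate falls inside the filter, so queries that pass the date as a parameter may not pick it up.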
 
LVL 69

Assisted Solution

by:Scott Pletcher
Scott Pletcher earned 83 total points
ID: 39607075
There's virtually no chance SQL would use a nonclustered index on date to satisfy that query.

So, a nonclustered index will help only if it is a "covering index": so it would have to include all 6 selected columns and be keyed by the date column.

Depending on other queries to the table, you may need to cluster that table on the date column.  Technically a covering index will do fewer reads than a clustered index, but SQL must constantly maintain the extra covering index, and you must modify the index every time the SELECT query changes -- for example, if you add a 7th column to the SELECT, you must add it as another included column in the index.
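
The covering index described above could be sketched like this. The real column names were not posted, so dbo.SeasonData, SeasonDate and Col1 to Col6 are placeholders.

```sql
-- Hypothetical names: keyed on the date column, with the 6 selected
-- columns carried at the leaf level so the query never touches the base table.
CREATE NONCLUSTERED INDEX IX_SeasonData_SeasonDate_Covering
ON dbo.SeasonData (SeasonDate)
INCLUDE (Col1, Col2, Col3, Col4, Col5, Col6);

-- Alternative mentioned above: cluster the table on the date column instead.
-- This only applies if the table has no clustered index yet (or the existing
-- one is dropped first), and it rewrites the whole table.
-- CREATE CLUSTERED INDEX CIX_SeasonData_SeasonDate
-- ON dbo.SeasonData (SeasonDate);
```

If a 7th column is later added to the SELECT, the nonclustered index would need to be recreated with that column in the INCLUDE list to stay covering.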
 
LVL 39

Accepted Solution

by:lcohan
lcohan earned 167 total points
ID: 39609134
Aside from all the good advice above, in my opinion 57M rows is not quite a huge table unless you have a "huge record" with lots of columns of character types (ntext is the worst) and/or datetime data types. If the table structure is not confidential, could you post it here as well?

FYI - I have quite a few tables in my SQL database(s) with over half a billion rows, not partitioned, and everything works great. In my experience, the effort a query requires, its execution time and the pressure it puts on your hardware depend on many different aspects.

SQL Server's own Performance Dashboard reports could help you on that server in general, not just with this query in particular, in addition to the query execution plan.
 
LVL 5

Author Comment

by:25112
ID: 39627664
thanks.. the covering index seems to do much better.


Question has a verified solution.
