query huge table with WHERE on one column - index question

Posted on 2013-10-25
Last Modified: 2013-11-06
There is a table with 57M records. It has a date field with 50 distinct date values (so approximately 1.13M records per date on average).

There is a query that does a SELECT from this table, with the only condition on the date field:
SELECT 6 columns from table where field = '2011-09-30'

Is a nonclustered index recommended on that field? Will it help, given that there are so many records and so few distinct dates for the WHERE condition in the SELECT query to filter on?
0
Question by:25112
8 Comments
 
LVL 74

Assisted Solution

by:sdstuber
sdstuber earned 336 total points
ID: 39601276
What's the data distribution within the table?

Are your 50 values distributed more or less uniformly throughout the table?

That is, if you read your table block-by-block from disk will at least one of those 1.13 million rows be in each block or nearly so?

If so, then an index won't help, even if the optimizer tries to use one, because you'll still be reading the whole table (or nearly so) anyway.  In this case the index will actually make things worse, because you have to process the index in addition to reading the whole table.
0
 
LVL 9

Assisted Solution

by:COANetwork
COANetwork earned 332 total points
ID: 39601284
It will help, somewhat, since an index seek is, as a rule, much faster than a table scan.  You can run an estimated execution plan to see what MS suggests.  On the other hand, if your table gets a lot of inserts and/or updates, those operations will be slowed by the index.  Also, if you have so few distinct values, the index may be ignored if the optimizer finds it faster to scan than seek.
Indexed views would probably offer best performance, but in your case it may be cumbersome - creating 50 of them.
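A rough way to compare seek versus scan for this query, as a sketch only: the table name dbo.SeasonData and the columns SnapshotDate and Col1..Col6 are hypothetical stand-ins, since the real table structure was not posted. Running the query with I/O and time statistics switched on, before and after creating a candidate index, shows how many pages are actually read.

```sql
-- Hypothetical names: dbo.SeasonData, SnapshotDate and Col1..Col6
-- stand in for the real table and its six selected columns.
SET STATISTICS IO ON;    -- report logical and physical reads per table
SET STATISTICS TIME ON;  -- report CPU time and elapsed time

SELECT Col1, Col2, Col3, Col4, Col5, Col6
FROM dbo.SeasonData
WHERE SnapshotDate = '2011-09-30';

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The read counts appear on the Messages tab in Management Studio. The estimated plan itself can be viewed there with "Display Estimated Execution Plan" without running the query.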
0
 
LVL 40

Assisted Solution

by:lcohan
lcohan earned 668 total points
ID: 39601336
You MUST be careful to match the data type of the table column EXACTLY in ALL the code variables for the index to be used properly, i.e. datetime <> smalldatetime.

I would INCLUDE in the index the KEY of that row as well.
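As a sketch of the data type point, assuming hypothetical names (dbo.SeasonData, SnapshotDate, Col1..Col6) and assuming the column turns out to be datetime: look up the declared type first, then declare the variable with that same type.

```sql
-- Look up the declared type of the (hypothetical) date column.
SELECT c.name AS column_name, t.name AS type_name
FROM sys.columns AS c
JOIN sys.types AS t ON t.user_type_id = c.user_type_id
WHERE c.object_id = OBJECT_ID(N'dbo.SeasonData')
  AND c.name = N'SnapshotDate';

-- Assumption: the lookup above returned datetime, so the variable
-- is declared as datetime too (not smalldatetime, not varchar).
DECLARE @SnapshotDate datetime = '2011-09-30';

SELECT Col1, Col2, Col3, Col4, Col5, Col6
FROM dbo.SeasonData
WHERE SnapshotDate = @SnapshotDate;
```

If the lookup returns a different type, such as date or smalldatetime, the DECLARE should be changed to match it.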
0
 
LVL 5

Author Comment

by:25112
ID: 39601400
>>What's the data distribution within the table?
 It is in order. It is a date value: a season of data has one date, and after a season all the records get the next date, and so forth.
 
 There are only bulk inserts into this table. No updates or deletes.
0
 
LVL 50

Assisted Solution

by:Lowfatspread
Lowfatspread earned 332 total points
ID: 39605434
Your nonclustered date index is probably not going to be used, as the data is likely to be spread evenly across the underlying table, so a table scan will be deemed quickest...

You may wish to consider using filtered indexes, however, or including the 6 columns you want in the index...

Can you explain your scenario in more detail?

Are your queries generally on the latest set(s) of data?
    Setting up filtered indexes for the sets that are used may be appropriate.

Is it always the same 6 columns to be extracted?
Would it make sense to change the clustering key?

...
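A filtered index for one frequently queried date might look like the sketch below. The names dbo.SeasonData, SnapshotDate and Col1..Col6 are hypothetical, and filtered indexes require SQL Server 2008 or later.

```sql
-- Hypothetical names throughout; the filter covers a single date value.
-- The six selected columns are included so the index also covers the query.
CREATE NONCLUSTERED INDEX IX_SeasonData_20110930
ON dbo.SeasonData (SnapshotDate)
INCLUDE (Col1, Col2, Col3, Col4, Col5, Col6)
WHERE SnapshotDate = '2011-09-30';
```

One caveat worth checking in the execution plan: when the date is passed as a variable or parameter rather than a literal, the optimizer may not be able to match the query to the filtered index.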
0
 
LVL 70

Assisted Solution

by:Scott Pletcher
Scott Pletcher earned 332 total points
ID: 39607075
There's virtually no chance SQL would use a nonclustered index on date to satisfy that query.

So, a nonclustered index will help only if it is a "covering index": so it would have to include all 6 selected columns and be keyed by the date column.

Depending on other queries to the table, you may need to cluster that table on the date column.  Technically a covering index will do fewer reads than a clustered index, but SQL must constantly maintain the extra covering index, and you must modify the index every time the SELECT query changes -- for example, if you add a 7th column to the SELECT, you must add it as another included column in the index.
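The two options described above could be sketched as follows. The table and column names (dbo.SeasonData, SnapshotDate, Col1..Col6) are hypothetical placeholders for the real structure.

```sql
-- Option 1: covering index keyed by the date column, with the six
-- selected columns carried as included (non-key) columns.
CREATE NONCLUSTERED INDEX IX_SeasonData_SnapshotDate_Covering
ON dbo.SeasonData (SnapshotDate)
INCLUDE (Col1, Col2, Col3, Col4, Col5, Col6);

-- Option 2 (alternative, left commented out): cluster the table on the date.
-- Assumes the table has no clustered index yet; an existing one
-- would have to be dropped or rebuilt first.
-- CREATE CLUSTERED INDEX CIX_SeasonData_SnapshotDate
-- ON dbo.SeasonData (SnapshotDate);
```

With Option 1, a 7th column added to the SELECT has to be added to the INCLUDE list as well, otherwise the index no longer covers the query.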
0
 
LVL 40

Accepted Solution

by:lcohan
lcohan earned 668 total points
ID: 39609134
Aside from all the good advice above, in my opinion 57M (million rows, to be specific) is not quite a huge table unless you have a "huge record" with lots of columns of character type (ntext is the worst) and/or datetime data type. If the table structure is not confidential, could you post it here as well?

FYI - I have quite a few tables in my SQL database(s) with over half a billion rows, not partitioned, and everything works great. Therefore, from my experience, the effort and execution time that queries require, and the pressure they put on your hardware, depend on many different aspects.

SQL Server's own Performance Dashboard reports could help you in general on that server, not just with this query in particular, in addition to the query execution plan.
0
 
LVL 5

Author Comment

by:25112
ID: 39627664
thanks.. the covering index seems to do much better.
0
