Solved

query huge table with WHERE on one column - index question

Posted on 2013-10-25
Last Modified: 2013-11-06
There is a table with 57M records. It has a date field with 50 distinct date values (so approx. 1.13M rows per date on average).

There is a query that does a SELECT from this table with the only condition on the date field:
SELECT 6 columns from table where field = '2011-09-30'

Is a nonclustered index on that field recommended? Will it help, given that there are so many records and so few distinct dates for the WHERE condition in the SELECT query?
Question by:25112
8 Comments
 
LVL 74

Assisted Solution

by:sdstuber
sdstuber earned 84 total points
ID: 39601276
What's the data distribution within the table?

Are your 50 values distributed more or less uniformly throughout the table?

That is, if you read your table block-by-block from disk will at least one of those 1.13 million rows be in each block or nearly so?

If so, then an index won't help, even if the optimizer tries to use one, because you'll still be reading the whole table (or nearly so) anyway.  In this case the index will actually make things worse, because you have to process the index in addition to reading the whole table.
 
LVL 9

Assisted Solution

by:COANetwork
COANetwork earned 83 total points
ID: 39601284
It will help, somewhat, since an index seek is, as a rule, much faster than a table scan.  You can run an estimated execution plan to see what MS suggests.  On the other hand, if your table gets a lot of inserts and/or updates, those operations will be slowed by the index.  Also, if you have so few distinct values, the index may be ignored if the optimizer finds it faster to scan than seek.
Indexed views would probably offer best performance, but in your case it may be cumbersome - creating 50 of them.
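
For reference, the estimated plan can also be requested from a query window instead of the SSMS toolbar. This is only a minimal sketch: the table and column names (dbo.SeasonData, SeasonDate, Col1..Col6) are placeholders, not the asker's real schema.

```sql
-- Hypothetical names: dbo.SeasonData, SeasonDate and Col1..Col6 stand in for the real schema.
-- SET SHOWPLAN_XML must be the only statement in its batch, hence the GO separators.
SET SHOWPLAN_XML ON;
GO

-- The query is compiled but not executed; the estimated plan comes back as XML.
SELECT Col1, Col2, Col3, Col4, Col5, Col6
FROM dbo.SeasonData
WHERE SeasonDate = '2011-09-30';
GO

SET SHOWPLAN_XML OFF;
GO
```

Running this once before and once after creating a candidate index shows whether the optimizer picks a scan or a seek for this predicate.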
 
LVL 40

Assisted Solution

by:lcohan
lcohan earned 167 total points
ID: 39601336
You MUST be careful to match EXACTLY the data type of the table column in ALL the code variables compared against it for the index to be used properly, i.e. datetime <> smalldatetime.

I would INCLUDE in the index the KEY of that row as well.
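
A rough illustration of the type-matching point, assuming the date column is declared as datetime (the real column type was not posted, and all names here are placeholders):

```sql
-- Assumption: SeasonDate is a datetime column in the hypothetical table dbo.SeasonData.
-- Declare the variable with the same type as the column so the comparison
-- needs no implicit conversion on either side.
DECLARE @SeasonDate datetime = '2011-09-30';

SELECT Col1, Col2, Col3, Col4, Col5, Col6
FROM dbo.SeasonData
WHERE SeasonDate = @SeasonDate;
```

If the column were smalldatetime or date instead, the variable would be declared with that type.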
 
LVL 5

Author Comment

by:25112
ID: 39601400
>>What's the data distribution within the table?
 It is in order. It is a date value: a season of data has one date, and after a season all the records get the next date, and so forth.
 
 There are only bulk inserts into this table, no updates or deletes.
 
LVL 50

Assisted Solution

by:Lowfatspread
Lowfatspread earned 83 total points
ID: 39605434
Your nonclustered date index is probably not going to be used, as the data is likely to be spread evenly across the underlying table, so a table scan will be deemed quickest...

You may wish to consider using filtered indexes, however, or including the 6 columns you want in the index...

Can you explain your scenario in more detail?

Are your queries generally on the latest set(s) of data?
    Setting up filtered indexes for the sets in use may be appropriate.

Is it always the same 6 columns to be extracted?
Would it make sense to change the clustering key?

...
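
A filtered index for one date could look roughly like this. It is a sketch only: it needs SQL Server 2008 or later, and the table, column, and index names are made up for illustration.

```sql
-- Hypothetical names throughout; one such index would be needed per date that is queried.
CREATE NONCLUSTERED INDEX IX_SeasonData_20110930
ON dbo.SeasonData (SeasonDate)
INCLUDE (Col1, Col2, Col3, Col4, Col5, Col6)   -- the 6 selected columns
WHERE SeasonDate = '2011-09-30';               -- only rows for this date are stored in the index
```

Because the filter is a fixed literal, this suits the case where only the latest set(s) of data are queried and the query uses a matching literal date.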
 
LVL 69

Assisted Solution

by:Scott Pletcher
Scott Pletcher earned 83 total points
ID: 39607075
There's virtually no chance SQL would use a nonclustered index on date to satisfy that query.

So, a nonclustered index will help only if it is a "covering index": so it would have to include all 6 selected columns and be keyed by the date column.

Depending on other queries to the table, you may need to cluster that table on the date column.  Technically a covering index will do fewer reads than a clustered index, but SQL must constantly maintain the extra covering index, and you must modify the index every time the SELECT query changes -- for example, if you add a 7th column to the SELECT, you must add it as another included column in the index.
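
As a sketch of the covering index described above, with placeholder names since the real table definition was not posted:

```sql
-- Hypothetical names: dbo.SeasonData, SeasonDate and Col1..Col6 are stand-ins.
-- Keyed by the date column, with the 6 selected columns carried at the leaf level,
-- so the query can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_SeasonData_SeasonDate_Covering
ON dbo.SeasonData (SeasonDate)
INCLUDE (Col1, Col2, Col3, Col4, Col5, Col6);

-- The query it is meant to cover:
SELECT Col1, Col2, Col3, Col4, Col5, Col6
FROM dbo.SeasonData
WHERE SeasonDate = '2011-09-30';
```

If a 7th column were later added to the SELECT, it would also have to be added to the INCLUDE list for the index to keep covering the query.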
 
LVL 40

Accepted Solution

by:lcohan
lcohan earned 167 total points
ID: 39609134
Aside from all the good advice above, in my opinion 57M (million rows, to be specific) is not quite a huge table unless you have a "huge record" with lots of columns of character type (ntext is the worst) and/or datetime data type. If the table structure is not confidential, could you post that here as well?

FYI - I have quite a few tables in my SQL database(s) with over half a billion rows, not partitioned, and everything works great. Therefore, from my experience, the effort a query requires, its execution time, and the pressure it puts on your hardware depend on many different aspects.

SQL Server's own Performance Dashboard reports could help you with that server in general, not just with this query in particular, in addition to the query execution plan.
 
LVL 5

Author Comment

by:25112
ID: 39627664
Thanks. The covering index seems to do much better.

Question has a verified solution.
