Solved

Optimizing a Simple Select

Posted on 2013-01-03
10
368 Views
Last Modified: 2013-01-24
I have a very simple select statement:

SELECT COUNT(*) as varTotal
FROM tbl_stat_master
WHERE stat_account = 173
AND stat_created 
BETWEEN '2012-11-01 00:00:00' 
AND '2012-12-01 00:00:00'

Open in new window


It's running on a table that has 36 Million records. I'd like to it be faster.

I have two indexes setup, one on stat_account and one on stat_created.

So first of all, is there something I should do to make the query faster? Like maybe COUNT(*) isn't a good idea, or there something better than BETWEEN?

And secondly, should I change my indexes? Make a combo index that includes account and created?

Or maybe you have another idea?

Thank you!
0
Comment
Question by:cnxmax
  • 3
  • 3
  • 2
  • +2
10 Comments
 
LVL 45

Assisted Solution

by:Kdo
Kdo earned 167 total points
ID: 38741631
Hi cnx,

That query is very reasonable, and a very popular kind of query in the real world.  :)

The BETWEEN clause is just a shorthand way of writing the condition.  SQL actually translates that to "WHERE (A <= value1 AND A >= value2).

A composite index over stat_account and stat_created will probably help, depending on the database design.  It can't hurt to create that index and try it.


Good Luck,
Kent
0
 
LVL 13

Assisted Solution

by:MrBullwinkle
MrBullwinkle earned 167 total points
ID: 38741636
Consider converting DateTime to integer, or double (days since 1/1/1900 for example). It may not make a difference, just thinking aloud.

Also make sure indices aren't fragmented.
0
 
LVL 9

Assisted Solution

by:sognoct
sognoct earned 111 total points
ID: 38742026
create clustered index with stat_created and also stat_account

Create the clustered index before creating any nonclustered indexes.

I think that using <= and => is better than between (no need for engine to translate)

as much as I know count(*), count(0), count(1) have the same processing speed ... so use count(*)

if stat_created is a datetime ... you must convert the value of the where in datetime format with a convert, or system will have to convert the values from the table in the same format (slower)
0
 
LVL 13

Assisted Solution

by:MrBullwinkle
MrBullwinkle earned 167 total points
ID: 38742151
If you're calling this query multiple times with different criteria, then I think the engine can optimize the repeat calls if you use @parameters rather than a string literal.  It'll also meet the suggestion by sognoct for the type casting.

I think those are imperceptible differences in this case though.
0
 
LVL 45

Assisted Solution

by:Kdo
Kdo earned 167 total points
ID: 38742580
Ok.  Before this train gets too far off the track.

Do NOT create a clustered index.  Yet.  That will relocate all 35,000,000 rows in a way that may make this query faster, but may also have a negative impact on other queries.  You need to make that decision before you go down this path.

There is NO difference in using BETWEEN or the <= and >= syntax.  They evaluate into exactly the same execution.  Use the one that makes the most sense to your SQL.


The other posters are correct about the dates though.  You might want to add a column with a date in numeric form.

Kent
0
Maximize Your Threat Intelligence Reporting

Reporting is one of the most important and least talked about aspects of a world-class threat intelligence program. Here’s how to do it right.

 
LVL 9

Assisted Solution

by:sognoct
sognoct earned 111 total points
ID: 38743073
Kdo is right

can try it on a db copy and evalutate if there is a performance issue to other queries ...
it depends on how much delete and insert are done on this table, I guess this is a statistic table in which stat_account is the id of the stat creator and stat_created is when it was created  so maybe there are no DELETE queries if there are 36 Million records. If I'm right you can (in the debug db) try to remove current indexes and add the clustered couple ...

For getting better performance INT can give half time in SELECT, you can watch differences in this article:
INT vs DATETIME vs TIMESTAMP

For the BETWEEN is true performance are the same (can evaluate with "explain" command)
0
 
LVL 31

Assisted Solution

by:awking00
awking00 earned 55 total points
ID: 38744101
Also, be sure to gather statistics on your table and indexes.
0
 

Author Comment

by:cnxmax
ID: 38744215
Thank you for all your responses! I have a lot of good information to go off of.

To clarify a few things:

We never do deletes.

We insert about 1000/records per minute during peak times and average about 100/minute over all.

We're considering purging old data (might be a good separate question on the most efficient way to delete records older than x days on a regular basis).




A few questions...

To type cast, is this the best way:

CONVERT('2012-11-01 00:00:00' , datetime)

Open in new window




Also, if I transform the dates to integers for existing and new incoming records, what would be the best way to do: '2012-11-01 12:59:02'

I'm assuming since I only want month granularity I can just use:

201211

And put an index on that column? That seems to me like it would probably lead to a big performance boost.



@MrBullwinkle-- you said "I think the engine can optimize the repeat calls if you use @parameters rather than a string literal". I'm not sure what @parameters are?



Thanks again for all your help!
0
 
LVL 45

Accepted Solution

by:
Kdo earned 167 total points
ID: 38744276
Hi Max,

If it were my application, I'd store the integer date as yyyymmdd (year * 10000 + month * 100 + day).  You may only need month granularity now, but having the full date gives you more options, particularly for items where you need to search date windows.

And integer operations (compare, join, calculate) are the fastest of all operations.  Your queries will be noticeably faster!

And since this is a transaction logging system, don't use a clustered index.  It may offer advantages for some reporting, but you could see a performance hit during inserts when the cluster needs some minor reorganizing.


Good Luck,
Kent
0
 
LVL 13

Assisted Solution

by:MrBullwinkle
MrBullwinkle earned 167 total points
ID: 38744320
I'm not sure what language you're using, but an example in VB.NET would be
			Using cmd As New SqlCommand()
				cmd.CommandText = "SELECT * FROM tblTest WHERE ID=@ID"
				cmd.Connection=cnn
				cmd.Parameters.Add(New SqlParameter("@ID", myID))
				
				retVal=cmd.ExecuteScalar()
				cmd.Connection.Close()
			End Using
		return retVal

Open in new window


Notice how I don't have to cast my parameters type to the query. It does it automatically based on the type of the variable myID.


It does sound like converting those date fields to an integer field would be a huge gain for you.  You just need to change all the calls that reference that to convert to whatever you're doing.  If its Int(datevalue.Format("YYYYMM")) or something else.  I would suggest using "YYYYMMDD" just because integer can handle it (its no more overhead), and you won't kick yourself later if you want the granularity of a day.  If there's any chance you'll want less than that, you could do   "YYMMDDHHmm", and you'd be ok with year ranges from 2000 to 2021.  Max Int value is 2147483647 (2^(8*4-1))
0

Featured Post

Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

Join & Write a Comment

This article describes some very basic things about SQL Server filegroups.
Using SQL Scripts we can save all the SQL queries as files that we use very frequently on our database later point of time. This is one of the feature present under SQL Workshop in Oracle Application Express.
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
Polish reports in Access so they look terrific. Take yourself to another level. Equations, Back Color, Alternate Back Color. Write easy VBA Code. Tighten space to use less pages. Launch report from a menu, considering criteria only when it is filled…

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

17 Experts available now in Live!

Get 1:1 Help Now