Solved

How can I speed up this query?

Posted on 2009-05-07
44
478 Views
Last Modified: 2012-09-11
Hi,

I have a query that is based on a view that takes forever to run (and usually errors out due to TEMPDB running out of space.

In this query the view gets joined to itself and is used to filter by the maxof routercallsequence#

The base query selecting the view by date returns the days worth of data in about 4 - minutes. Certainly a usable time as the base table contains several million records and the query returns about 45k records.

So in essence I'm trying to join 45k records to themselves. there are no indices on the view and I am trying to get them added.

I can get to the tables via Access so if an Access solution applies so be it.

Thanks

Craig
0
Comment
Question by:Cbobchin
  • 18
  • 16
  • 6
  • +3
44 Comments
 
LVL 44

Expert Comment

by:GRayL
ID: 24328324
Put your 45k records into a temp table in Access and go from there.  Can you post the query that produces the 45k records and the self-join query.  Can you explain what this means >there are no indices on the view and I am trying to get them added.<??
0
 
LVL 1

Expert Comment

by:j1akey
ID: 24328382
I'm not much of an SQL guy so I'm going to look at this from a hardware perspective.  Do you know the specs of the server this is running on?  More specifically are there nice fast drives running on the server?  I've seen really nice servers that are running with a couple terabytes of space but they have relatively slow SATA drives in them so getting information off them can take a while since it's not hard for the hard disks to be your bottleneck.

0
 

Author Comment

by:Cbobchin
ID: 24328412
GRayL,

Wierd i posted the SQL statement I'm using but it didn't show up. Trying it again.

As for indices The SQL server view does not have any indexes and I'm trying to get the DBAdmin to add some to speed up the queries I'll be running from the view.



SELECT SurveyID, DateTime, Variable5, Var5_1st12, DNIS, InstrumentPortNumber, AgentPeripheralNumber, ANI,  Max(RouterCallKeySequenceNumber) AS MaxOfRouterCallKeySequenceNumber

FROM dbo.vw_TCD INNER JOIN dbo.vw_TCD AS dbo.vw_TCD_1 ON dbo.vw_TCD.Var5_1st12 = dbo.vw_TCD_1.Var5_1st12

Where dbo.vw_TCD.DateTime >= '04/15/2009 00:00:00' and dbo.vw_TCD.DateTime <= '04/15/2009 23:59:59'

GROUP BY SurveyID, DateTime, Variable5, Var5_1st12, DNIS, InstrumentPortNumber, AgentPeripheralNumber, ANI;

Open in new window

0
 

Author Comment

by:Cbobchin
ID: 24328481
i1akey, unfortunately I don't know, nor have any influence on the hardware. so it is a moot point.
0
 
LVL 44

Expert Comment

by:GRayL
ID: 24328484
Am I to under stand this is the query that reduces the record count from several million to 45 K.  If so, stick the result in a temp table named myTemp as below:

SELECT
SurveyID,
DateTime,
Variable5,
Var5_1st12,
DNIS,
InstrumentPortNumber,
AgentPeripheralNumber,
ANI,  
Max(RouterCallKeySequenceNumber) AS MaxOfRouterCallKeySequenceNumber  INTO myTemp
FROM dbo.vw_TCD INNER JOIN dbo.vw_TCD AS dbo.vw_TCD_1 ON dbo.vw_TCD.Var5_1st12 = dbo.vw_TCD_1.Var5_1st12
Where dbo.vw_TCD.DateTime >= '04/15/2009 00:00:00' and dbo.vw_TCD.DateTime <= '04/15/2009 23:59:59'
GROUP BY SurveyID, DateTime, Variable5, Var5_1st12, DNIS, InstrumentPortNumber, AgentPeripheralNumber, ANI;
0
 

Author Comment

by:Cbobchin
ID: 24328593
The portion that filters down from several million to 45k is this:

I've tried putting it into a temp table in Access  and just joining the temp tables but I get the same results..

SELECT SurveyID, DateTime, Variable5, Var5_1st12, DNIS, InstrumentPortNumber, AgentPeripheralNumber, ANI,  Max(RouterCallKeySequenceNumber) AS MaxOfRouterCallKeySequenceNumber

FROM dbo.vw_TCD 

Where dbo.vw_TCD.DateTime >= '04/15/2009 00:00:00' and dbo.vw_TCD.DateTime <= '04/15/2009 23:59:59'

GROUP BY SurveyID, DateTime, Variable5, Var5_1st12, DNIS, InstrumentPortNumber, AgentPeripheralNumber, ANI;
 

Access query code:
 

SELECT [4-15 termcall detail_1].SurveyID, [4-15 termcall detail_1].DateTime, [4-15 termcall detail_1].Variable5, [4-15 termcall detail_1].Var5_1st12, [4-15 termcall detail_1].DNIS, [4-15 termcall detail_1].InstrumentPortNumber, [4-15 termcall detail_1].AgentPeripheralNumber, [4-15 termcall detail_1].ANI, Max([4-15 termcall detail_1].RouterCallKeySequenceNumber) AS MaxOfRouterCallKeySequenceNumber

FROM [4-15 termcall detail] INNER JOIN [4-15 termcall detail] AS [4-15 termcall detail_1] ON [4-15 termcall detail].Var5_1st12 = [4-15 termcall detail_1].Var5_1st12

GROUP BY [4-15 termcall detail_1].SurveyID, [4-15 termcall detail_1].DateTime, [4-15 termcall detail_1].Variable5, [4-15 termcall detail_1].Var5_1st12, [4-15 termcall detail_1].DNIS, [4-15 termcall detail_1].InstrumentPortNumber, [4-15 termcall detail_1].AgentPeripheralNumber, [4-15 termcall detail_1].ANI, [4-15 termcall detail].Var5_1st12;

Open in new window

0
 
LVL 60

Expert Comment

by:chapmandew
ID: 24328601
are the indexes on the fields you're joining on and the fields in which you're setting criteria on in your where clause?
0
 

Author Comment

by:Cbobchin
ID: 24328736
Chapmandew, No indexes are on the view. I'm trying to get some added.
0
 
LVL 44

Expert Comment

by:GRayL
ID: 24328838
What is the point of joining a table to itself using the same fields in the ON clause?  All that does is create a 45k recordset identical to either of the two tables in the join.  Try removing the join and do the query and group by on just the table, no join.  Is the Max() function any different and more importantly, is the execution a light-year faster?  
0
 

Author Comment

by:Cbobchin
ID: 24328906
Reason for joining the table to itself is that there can be multiple records that are the same expect for the Routercallsequence number. of those that are duplicated I  need to find the records that have the highest sequence number. as well as the single records that are not duplicates.
0
 
LVL 44

Expert Comment

by:GRayL
ID: 24329021
IMHO, you do not need the JOIN.  Did you try it without?
0
 
LVL 60

Expert Comment

by:chapmandew
ID: 24329046
NO....don't add them to the views just yet.  How complex are the views?  Are there indexes on the underlying tables?
0
 

Author Comment

by:Cbobchin
ID: 24329340
GRayL, yes I did try it and it did not work.

Chap, there are some indexes on the underlying table but not on the field I'm joining on which only exists in the view (it's the 1st 12 characters of another field).
0
 
LVL 44

Expert Comment

by:GRayL
ID: 24329367
What happened?
0
 
LVL 44

Expert Comment

by:GRayL
ID: 24329394
Will you show us the query you tried it with?
0
 
LVL 44

Expert Comment

by:GRayL
ID: 24330118
I know for sure that if you join a table to itself on a common field, if it contains unique values, you will wind up with a record set identical to the original table.  If any of the common fields have two or more records, you will wind up with a table which multiples of  those records, however none of the values will have changed.  If you compare Max(fldOfInterest) in the original table against Max(fldOfInterest) in the JOINed tables you will wind up with the same value.  
0
 
LVL 26

Expert Comment

by:Zberteoc
ID: 24333038
Question, how come you don't get ambiguous column error? Your columns are not qualified with the aliases.

0
 

Author Comment

by:Cbobchin
ID: 24336670
GRayl, the queries i've posted are the ones I'm using. What i'm getting is still the multiple records. I want to filter it down to just one record (with the max(routercallsequencenumber))

As to why i'm not getting the ambiguous collumn error I don't know.
0
 
LVL 44

Expert Comment

by:GRayL
ID: 24344406
If you won't answer my question, I can't help you.  Bye.
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 24345992
>>the queries i've posted are the ones I'm using. <<
Unfortunately, you have posted quite a few and we would just like to see the current one.

It would also help to see some sample data, current output and desired output.

>>What i'm getting is still the multiple records<<
Define "multiple records", what columns make a "multiple record".
0
 

Author Comment

by:Cbobchin
ID: 24355276
GRayL I was off line most of Fri. and I did not take my laptop home with me. So sorry for not getting back to you immediately.

I can post some sample Data and  current and hoped for output.

It will take me a bit of time but I'll post it later this morning.
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 24356750
I somehow suspect that you may have seen the last of GRayL in this thread.
0
IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 

Author Comment

by:Cbobchin
ID: 24357418
acperkins, I'm sorry to hear that.  in any event, here is the sample data:

1st: sample Raw Data:

                 The current output  from the SQL query I posted above never completes or if it does (when I choose  like 5 records) is the same as the current output.

I had an idea over the weekend that doing a subquery in the where clause might allow me to get the max router sequence number and then filter on that in the main query to get the final results.

What do you think?

Raw Data Which is the same as current output:
 

SurveyID	DateTime	Var5_1st12	AgentPeripheralNumber	InstrumentPortNumber	RouterCallKeySequenceNumber

0032BBDC-290311DE-B143AD29-22247AEC	4/15/2009 8:53	149123111961	45677	65677	3

0032C334-290611DE-84E8AD29-22247AEC	4/15/2009 9:12	149123133286	45433	65433	3

0034A8CA-290911DE-99118017-7702D3C8	4/15/2009 9:35	149123155989	46296	66296	3

0037E064-292811DE-91B38017-7702D3C8	4/15/2009 13:15	149123380683	46316	66316	3

00380715-291C11DE-80EA8017-7702D3C8	4/15/2009 11:47	149123291565	45564	65564	3

003C977C-291B11DE-924EAD29-22247AEC	4/15/2009 11:42	149123284308	45359	65359	3

003C977C-291B11DE-924EAD29-22247AEC	4/15/2009 11:47	149123284308	46298	66298	4

0040D6E1-28F911DE-B845AD29-22247AEC	4/15/2009 7:39	149123489820	46049	66049	3

00441960-294211DE-B1C8AD29-22247AEC	4/15/2009 16:29	149123530242	45260	65260	3

00456D64-292611DE-BA608017-7702D3C8	4/15/2009 13:12	149123366024	45187	65187	3

00456D64-292611DE-BA608017-7702D3C8	4/15/2009 13:05	149123366024	45510	65510	4

0046922D-294511DE-BD3FAD29-22247AEC	4/15/2009 16:44	149123538980	46296	66296	3

0047FEEF-291311DE-9D51AD29-22247AEC	4/15/2009 10:47	149123229093	45732	65732	3

0048C95E-290F11DE-83A5AD29-22247AEC	4/15/2009 10:17	149123200689	45222	65222	3

0049B3AD-28F411DE-A31FAD29-22247AEC	4/15/2009 7:01	149123266440	45357	65357	3

0053650D-291911DE-9E4E8017-7702D3C8	4/15/2009 11:27	149123270353	45463	65463	3

0053D65F-293011DE-99CFAD29-22247AEC	4/15/2009 14:09	149123438529	45408	65408	3

00561AAD-292C11DE-81438017-7702D3C8	4/15/2009 13:49	149123409958	55211	75211	3

0057AB8E-295411DE-A8EBAD29-22247AEC	4/15/2009 18:36	149123565536	46001	66001	3

0057AB8E-295411DE-A8EBAD29-22247AEC	4/15/2009 18:35	149123565536	45700	65700	4

0063469E-293D11DE-AB088017-7702D3C8	4/15/2009 15:56	149123512371	46240	66240	3

00635140-292511DE-94E2AD29-22247AEC	4/15/2009 12:52	149123358621	45403	65403	3

00641A22-28F911DE-A4778017-7702D3C8	4/15/2009 7:42	149123489840	45041	65041	3

0065F187-291A11DE-A9798017-7702D3C8	4/15/2009 11:34	149123277168	47361	67361	3

0067B5D6-290D11DE-B556AD29-22247AEC	4/15/2009 10:03	149123186245	45362	65362	3

006C666D-291111DE-904CAD29-22247AEC	4/15/2009 10:28	149123215045	46099	66099	3
 

Desired Output:

SurveyID	DateTime	Var5_1st12	AgentPeripheralNumber	InstrumentPortNumber	RouterCallKeySequenceNumber

0032BBDC-290311DE-B143AD29-22247AEC	4/15/2009 8:53	149123111961	45677	65677	3

0032C334-290611DE-84E8AD29-22247AEC	4/15/2009 9:12	149123133286	45433	65433	3

0034A8CA-290911DE-99118017-7702D3C8	4/15/2009 9:35	149123155989	46296	66296	3

0037E064-292811DE-91B38017-7702D3C8	4/15/2009 13:15	149123380683	46316	66316	3

00380715-291C11DE-80EA8017-7702D3C8	4/15/2009 11:47	149123291565	45564	65564	3

003C977C-291B11DE-924EAD29-22247AEC	4/15/2009 11:47	149123284308	46298	66298	4

0040D6E1-28F911DE-B845AD29-22247AEC	4/15/2009 7:39	149123489820	46049	66049	3

00441960-294211DE-B1C8AD29-22247AEC	4/15/2009 16:29	149123530242	45260	65260	3

00456D64-292611DE-BA608017-7702D3C8	4/15/2009 13:05	149123366024	45510	65510	4

0046922D-294511DE-BD3FAD29-22247AEC	4/15/2009 16:44	149123538980	46296	66296	3

0047FEEF-291311DE-9D51AD29-22247AEC	4/15/2009 10:47	149123229093	45732	65732	3

0048C95E-290F11DE-83A5AD29-22247AEC	4/15/2009 10:17	149123200689	45222	65222	3

0049B3AD-28F411DE-A31FAD29-22247AEC	4/15/2009 7:01	149123266440	45357	65357	3

0053650D-291911DE-9E4E8017-7702D3C8	4/15/2009 11:27	149123270353	45463	65463	3

0053D65F-293011DE-99CFAD29-22247AEC	4/15/2009 14:09	149123438529	45408	65408	3

00561AAD-292C11DE-81438017-7702D3C8	4/15/2009 13:49	149123409958	55211	75211	3

0057AB8E-295411DE-A8EBAD29-22247AEC	4/15/2009 18:35	149123565536	45700	65700	4

0063469E-293D11DE-AB088017-7702D3C8	4/15/2009 15:56	149123512371	46240	66240	3

00635140-292511DE-94E2AD29-22247AEC	4/15/2009 12:52	149123358621	45403	65403	3

00641A22-28F911DE-A4778017-7702D3C8	4/15/2009 7:42	149123489840	45041	65041	3

0065F187-291A11DE-A9798017-7702D3C8	4/15/2009 11:34	149123277168	47361	67361	3

0067B5D6-290D11DE-B556AD29-22247AEC	4/15/2009 10:03	149123186245	45362	65362	3

006C666D-291111DE-904CAD29-22247AEC	4/15/2009 10:28	149123215045	46099	66099	3

Open in new window

0
 
LVL 44

Accepted Solution

by:
GRayL earned 500 total points
ID: 24358024
Now that you've finally shown us want you want, this assumes the table [4-15 termcall detail] contains your 45k records.

SELECT a.SurveyID, a.DateTime, a.Var5_1st12, AgentPeripheralNumber, InstrumentPortNumber, a.RouterCallKeySequenceNumber FROM [4-15 TermCall Detail] as a Where a.RouterCallKeySequenceNumber IN (SELECT Max(b.RouterCallKeySequenceNumber) FROM [4-15 TermCall Detail] AS b WHERE b.Var5_1st12 = a.Var5_1st12);

With 45k records working with a sub-query you would really benefit from indexing the table on Var5_1st12

0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 24358507
If the query is as simple as that, then you may be better off using a derived table as it should be more performant.  If on the other hand, as I suspect, you still need to group more columns then you may want to indicate which are those columns or show how the query results need to change.
0
 

Author Comment

by:Cbobchin
ID: 24366587
Actually it seems to work (at least in Access with a limited amount of Data). But when I converted it to SQL server going against the defined view, it is now going on 2 hrs of run time and still no results.

0
 
LVL 44

Expert Comment

by:GRayL
ID: 24375344
Are you running it with the 45k record dataset or the original table with millions of records.  If the latter, anytime you have an sub query in a query, it is going to take time.  
0
 

Author Comment

by:Cbobchin
ID: 24375759
I ran it on both. On the smaller data set in ran very quickly. On the SQL server dataset it ran for over 15 hours and never gave me a response.




0
 
LVL 44

Expert Comment

by:GRayL
ID: 24376907
With millions of records in a table, unless the server has gobs of memory, I imagine it has to page to disk regardless of the OS, for those huge tables.  The instant that starts to happen, do the brakes ever go on.  I think the method of significantly reducing the db size before applying a query with a sub-query is the way to go.
0
 
LVL 44

Expert Comment

by:GRayL
ID: 24376917
BTW, that was what I suggested in my first post.
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 24380604
With only 14 million rows and such a small resultset there is no earthly reason why that simple query on a well indexed table should not return your results in under 5 minutes.  There has to be something else going on.  Either the hardware/software for SQL Server is subpar, the network is lousy or something else.
0
 
LVL 44

Expert Comment

by:GRayL
ID: 24380935
There may be no 'earthly' reason but the 'devil' is in the details.  Why do you think a query with a subquery should run on a table with over 14 meg records in 'minutes' on a server, whose credentials are totally unknown in terms of memory, and cpu power?
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 24381112
I believe I indicated what I perceived were the problems.  Hopefully the intended recipient of my comments (the author) grasped my intent.  Otherwise I will be happy to spell it out.
0
 

Author Comment

by:Cbobchin
ID: 24385540
AC, no I did not grasp the intent, but I'd love to hear it.
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 24387087
You need to troubleshoot the problem.  At the moment you have not got a clue as to where the problem lies.  Is it SQL Server or is it in the network due to the size of the resultset returned or is it the way you are retrieving that data using ODBC (?) from MS Access?

In order to do that you need to isolate the problem.  First you need to take MS Access out of the picture and execute that query from within Query Analyzer or SSMS.  Does it take a long time.  If the answer is yes, then examine the execution plan and determine what is the problem.  More than likely the table is not indexed appropriately.  If it is indexed, then perhaps it has been a while since the statistics have been updated.

After doing all of that and it is still slow, see if the problem is the size of the resultset.  Add a TOP 100 to the query (without any ORDER BY) or a SET ROWCOUNT 100 prior to executing the query.  This may help you identify if the network is the problem.

Once you have done all that and have the time down to an acceptable amount only then should you add MS Access to the mix.
0
 
LVL 44

Expert Comment

by:GRayL
ID: 24389612
Cbobchin:  I'm all ears.  Waiting to hear the results of your tests.  Don't forget to compare results where first the queries are executed on the server containing the data, and then remotely, elsewhere in the network all by SQL Server.  There may very well be a network speed problem.
0
 

Author Comment

by:Cbobchin
ID: 24389634
I'm a working on it. I've had a couple of other items come up that were a bit higher priority today.
0
 

Author Comment

by:Cbobchin
ID: 24390626
Well one thing I can tell you all, is that the base table this whole thing is built on has just shy of 700 million records.

I'm running a query that will tell me how many records are in the view I had created for filtering some of those down.

I also know that the 3 main fields I use in the selection criteria are not indexed, though i'm working on getting them indexed.

more info as it becomes available.
0
 

Author Comment

by:Cbobchin
ID: 24396297
Well the filtered recordset that I use for my view has almost 6 million records in it.

So I'm trying to select 1 day's worth of records from a lot of records on fields that are unindexed. No wonder my query to select just 10 records has been running over 17 hours now and still has no results.
0
 
LVL 44

Expert Comment

by:GRayL
ID: 24397236
Rather than a 'filtered' recordset, have you tried to copy the filtered result to a temp table and then apply the query with the subquery to the temp table?
0
 

Author Comment

by:Cbobchin
ID: 24397307
No, because I don't have those permissions on the database.
0
 
LVL 44

Expert Comment

by:GRayL
ID: 24397452
Is a filtered table of 6 million records from 700 million in reality still a 700 million record table when it comes to processing a query?  Can you summarize where we are at now?  You started off saying several million records from which a day's worth reduced to 45K records.  Now we're at 700 million from which a day reduces to 6 million?
0
 

Author Comment

by:Cbobchin
ID: 24397592
The base table is 700 million records. The 6 million records is from 1/1/09 through yesterday based on my base filter (NOT using the subselect). A day's worth of records is still in the range of 30-50k.
0
 

Author Closing Comment

by:Cbobchin
ID: 31579109
Of no fault of GRayL, the solution is only partially complete. Turns out the database needs some minor work and the fields I was told initially held the correct data turns out not to.
0

Featured Post

6 Surprising Benefits of Threat Intelligence

All sorts of threat intelligence is available on the web. Intelligence you can learn from, and use to anticipate and prepare for future attacks.

Join & Write a Comment

Suggested Solutions

In this article I will describe the Backup & Restore method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
A simple tool to export all objects of two Access files as text and compare it with Meld, a free diff tool.
Viewers will learn how to use the SELECT statement in SQL and will be exposed to the many uses the SELECT statement has.
Viewers will learn how to use the UPDATE and DELETE statements to change or remove existing data from their tables. Make a table: Update a specific column given a specific row using the UPDATE statement: Remove a set of values using the DELETE s…

746 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now