Solved

# Performance on Sum / Group By in a sql expression

Posted on 2010-08-27
387 Views
I have a Sum/Group By query - without the sum, it takes about 4 seconds to execute in query analyzer, with the sum it takes 23 seconds (and I assume will get far worse as the date range increases).

Can anyone help me figure out if there is a better way to construct this query to improve the performance of the "SUM"

Code

select T_3.[ACCOUNTNUMBER],T_3.[DESCRIPTION],Sum(T_1.AMOUNT)

From DBO.GL7TRANSACTIONS AS GL7TRANSACTIONS
LEFT OUTER JOIN DBO.GL7TRANSACTIONDISTRIBUTIONS AS T_1 ON GL7TRANSACTIONS.GL7TRANSACTIONSID = T_1.GL7TRANSACTIONSID
LEFT OUTER JOIN DBO.tf_GL7SECUREPROJECTS(1, 1, 0) AS T_2 ON T_1.GL7PROJECTSID = T_2.GL7PROJECTSID
LEFT OUTER JOIN DBO.tf_GL7SECUREACCOUNTS(1, 1, 0) AS T_3 ON GL7TRANSACTIONS.GL7ACCOUNTSID = T_3.GL7ACCOUNTSID
LEFT OUTER JOIN DBO.GL7ACCOUNTCODES AS T_4 ON T_3.GL7ACCOUNTCODESID = T_4.GL7ACCOUNTCODESID
INNER JOIN DBO.GL7BATCHES AS T_5 ON GL7TRANSACTIONS.GL7BATCHESID = T_5.GL7BATCHESID

Where ((GL7TRANSACTIONS.[POSTDATE] BETWEEN '2010-06-01' AND '2010-07-31 23:59:59.997'
AND (T_4.ACCOUNTCODE) like '6%')
AND GL7TRANSACTIONS.[GL7GENERALINFOID]  =  1
AND T_5.[STATUS] NOT IN (8,9))

group by T_3.[Accountnumber],T_3.[Description]
0
Question by:akessel
[X]
###### Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

• Help others & share knowledge
• Earn cash & points
• 3
• 2
• 2

LVL 57

Assisted Solution

Raja Jegan R earned 150 total points
ID: 33542675
Your query can't be tuned further..
Probably you can try rewriting those two table valued functions into a single one for better performance.
And the proper construct would be
``````select T_3.[ACCOUNTNUMBER],T_3.[DESCRIPTION],Sum(T_1.AMOUNT)
From DBO.GL7TRANSACTIONS AS GL7TRANSACTIONS
INNER JOIN DBO.GL7BATCHES AS T_5 ON GL7TRANSACTIONS.GL7BATCHESID = T_5.GL7BATCHESID
LEFT OUTER JOIN DBO.GL7TRANSACTIONDISTRIBUTIONS AS T_1 ON GL7TRANSACTIONS.GL7TRANSACTIONSID = T_1.GL7TRANSACTIONSID
LEFT OUTER JOIN DBO.tf_GL7SECUREPROJECTS(1, 1, 0) AS T_2 ON T_1.GL7PROJECTSID = T_2.GL7PROJECTSID
LEFT OUTER JOIN DBO.tf_GL7SECUREACCOUNTS(1, 1, 0) AS T_3 ON GL7TRANSACTIONS.GL7ACCOUNTSID = T_3.GL7ACCOUNTSID
LEFT OUTER JOIN DBO.GL7ACCOUNTCODES AS T_4 ON T_3.GL7ACCOUNTCODESID = T_4.GL7ACCOUNTCODESID
Where ((GL7TRANSACTIONS.[POSTDATE] BETWEEN '2010-06-01' AND '2010-07-31 23:59:59.997'
AND (T_4.ACCOUNTCODE) like '6%')
AND GL7TRANSACTIONS.[GL7GENERALINFOID]  =  1
AND T_5.[STATUS] NOT IN (8,9))
group by T_3.[Accountnumber],T_3.[Description]
``````
0

LVL 41

Accepted Solution

ID: 33542704
Why you need to join to those functions? can you just join to the tables directly? that will certainly improve your performance. That will also make you be able to take advantage of indexes that you need to create for the columns you have in your joins and the where and group by clauses.
0

Author Comment

ID: 33543398
I am using as a base, sql that is generated from a tool the vendor supplies in their built in querying tool - in their client sofware.  But ....  I did extricate the query from this join:

LEFT OUTER JOIN DBO.tf_GL7SECUREPROJECTS(1, 1, 0) AS T_2 ON T_1.GL7PROJECTSID = T_2.GL7PROJECTSID

by using the GL7PROJECTTSID directly in the where clasue, and that seems to have made a huge difference.

Frankly I'm not to sure what's going on in the tf.dbo tables but its a complex database, with someone limited visibility into what is going on.  I'm just playing it safe and doing it the way they do it in their query tools.

I will take a look at DBO.tf_GL7SECUREACCOUNTS(1, 1, 0) and see if there is a straight table that gives me the same info - I only need the account number and description and the middle segment of the account description to execute the query and get the results I need, you would think I wouldn't need a derived table, but ...
0

LVL 57

Expert Comment

ID: 33572581
akessel,

Have you tried using the table directly and are you able to see some good performance on that..
0

Author Closing Comment

ID: 33573248
Both commens suggesting using the table directly were on the right track, when I took that approach, which I was able to do easily, it cut execution speed 60%.

What the heck is a table valued function, anyway?
0

LVL 41

Expert Comment

ID: 33576761
>>What the heck is a table valued function, anyway?<<
a Table-valued function, is just a function that returns a table as a result. Think of it as a view with steroids (because you can do more things in a function than in a view). More info here.
http://msdn.microsoft.com/en-us/library/ms191165.aspx
0

Author Comment

ID: 33577030
Ah ... I see them out there .. these in particular were designed to hide availability of some data from some users who do input, not necessary for the purpose I am using them for since it is read-only.
0

## Featured Post

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article describes how to use the timestamp of existing data in a database to allow Tableau to calculate the prior work day instead of relying on case statements or if statements to calculate the days of the week.
Ever needed a SQL 2008 Database replicated/mirrored/log shipped on another server but you can't take the downtime inflicted by initial snapshot or disconnect while T-logs are restored or mirror applied? You can use SQL Server Initialize from Backup…
In this video we outline the Physical Segments view of NetCrunch network monitor. By following this brief how-to video, you will be able to learn how NetCrunch visualizes your network, how granular is the information collected, as well as where to f…
Michael from AdRem Software explains how to view the most utilized and worst performing nodes in your network, by accessing the Top Charts view in NetCrunch network monitor (https://www.adremsoft.com/). Top Charts is a view in which you can set seve…
###### Suggested Courses
Course of the Month6 days, 11 hours left to enroll