Solved

Questioning SQL efficiency, JOINs vs IN

Posted on 2012-04-11
Last Modified: 2012-04-23
Can anyone tell me which of these two queries would be processed more efficiently? I'm running MS SQL Server 2008.

Case 1
SELECT EF.ExtractableEmailVendorFeedName vendor
       ,COUNT(DISTINCT A.KeyID) companyCount
       ,COUNT(DISTINCT EF.DistinctID) emailCount
FROM PSExtract.dbo.vwCompany AS A
INNER JOIN PSExtract.dbo.vwExecutiveAndExecutiveFunction AS EF
ON A.KeyID = EF.KeyID AND EF.ExtractableEmailVendorFeedID IS NOT NULL AND
EF.OSFunctionID IN (30, 330, 333) AND
EF.OSLevelID IN (20, 30, 40, 50, 60, 85) AND
A.IndustryGroupID IN (220501, 220502, 220503, 220504, 220505, 220506, 220507, 220508, 220509, 220510, 220511, 220512,
220513, 220514, 220515, 220516, 220517, 220518, 220579, 220519, 220520, 220521, 220522, 220523, 220524, 220525,
220529, 220528, 220527, 220530, 220531, 220532, 220533, 220534, 220535, 220605, 220536, 220537, 220538, 220539,
220540, 220541, 220542, 220543, 220544, 220545, 220546, 220547, 220548, 220549, 220550, 220551, 220553, 220554,
220555, 220556, 220557, 220558, 220559, 220561, 220562, 220563, 220603, 220564, 220565, 220567, 220568, 220569,
220570, 220571, 220572, 220573, 220574, 220575, 220576, 220577, 220578, 220580, 220581, 220582, 220583, 220584,
220585, 220586, 220587, 220588, 220589, 220590, 220592, 220593, 220594, 220595, 220596, 220597, 220598, 220599,
220600, 220601, 220602) AND
A.NationalRegionID IN (10110, 10130, 10140) AND
A.Employees BETWEEN 50 AND 250
GROUP BY EF.ExtractableEmailVendorFeedName
ORDER BY EF.ExtractableEmailVendorFeedName ASC


Case 2
SELECT EF.ExtractableEmailVendorFeedName vendor
       ,COUNT(DISTINCT A.KeyID) companyCount
       ,COUNT(DISTINCT EF.DistinctID) emailCount
FROM PSExtract.dbo.vwCompany AS A
INNER JOIN PSExtract.dbo.vwExecutiveAndExecutiveFunction AS EF
ON A.KeyID = EF.KeyID AND EF.ExtractableEmailVendorFeedID IS NOT NULL AND
EF.OSFunctionID IN (30, 330, 333) AND
EF.OSLevelID IN (20, 30, 40, 50, 60, 85) AND
A.NationalRegionID IN (10110, 10130, 10140) AND
A.Employees BETWEEN 50 AND 250
INNER JOIN PSExtract.temp.RJZ_EssentialNet_IndustryGroupIDs AS IG
ON A.IndustryGroupID = IG.IndustryGroupID
GROUP BY EF.ExtractableEmailVendorFeedName
ORDER BY EF.ExtractableEmailVendorFeedName ASC

PSExtract.temp.RJZ_EssentialNet_IndustryGroupIDs is loaded with the IndustryGroupIDs you see in Case 1. They are loaded as ints and are not indexed in any way.

Rich
Question by:RichNH
5 Comments
 
LVL 23

Assisted Solution

by:wdosanjos
wdosanjos earned 100 total points
ID: 37834731
I suggest that you view the execution plan of each query in SSMS (SQL Server Management Studio) to evaluate which one is less expensive.
 
LVL 75

Accepted Solution

by:Anthony Perkins
Anthony Perkins earned 250 total points
ID: 37835653
"PSExtract.temp.RJZ_EssentialNet_IndustryGroupIDs is loaded with the IndustryGroupIDs you see in case 1, they are loaded as ints, they are not indexed in any way."
Why not index it? If you do, it should be a no-brainer.
I would, however, rewrite the query as follows, with the join conditions in the ON clauses and the filters in the WHERE clause:
SELECT	EF.ExtractableEmailVendorFeedName vendor,
	COUNT(DISTINCT A.KeyID) companyCount,
	COUNT(DISTINCT EF.DistinctID) emailCount
FROM	PSExtract.dbo.vwCompany AS A
	INNER JOIN PSExtract.dbo.vwExecutiveAndExecutiveFunction AS EF ON A.KeyID = EF.KeyID
	INNER JOIN PSExtract.temp.RJZ_EssentialNet_IndustryGroupIDs AS IG ON A.IndustryGroupID = IG.IndustryGroupID
WHERE	EF.ExtractableEmailVendorFeedID IS NOT NULL
	AND EF.OSFunctionID IN (30, 330, 333)
	AND EF.OSLevelID IN (20, 30, 40, 50, 60, 85)
	AND A.NationalRegionID IN (10110, 10130, 10140)
	AND A.Employees BETWEEN 50 AND 250
GROUP BY
	EF.ExtractableEmailVendorFeedName
ORDER BY
	EF.ExtractableEmailVendorFeedName ASC
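
The indexing suggestion above could be sketched roughly as follows. This is only an illustration: the index name is hypothetical, and the UNIQUE keyword assumes the table holds each IndustryGroupID at most once. If duplicates are possible, drop UNIQUE.

```sql
-- Hypothetical index name; assumes no duplicate IndustryGroupID values.
-- A clustered index also stores the table itself in IndustryGroupID order.
CREATE UNIQUE CLUSTERED INDEX IX_RJZ_IndustryGroupIDs
    ON PSExtract.temp.RJZ_EssentialNet_IndustryGroupIDs (IndustryGroupID);
```

Declaring the column unique also tells the optimizer that the join to this table cannot multiply rows, which may help it choose a cheaper plan.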

 
LVL 51

Assisted Solution

by:HainKurt
HainKurt earned 150 total points
ID: 37835800
Run both and check the execution time.
I don't see much difference, but the second one is easier to maintain.
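
One way to compare the two runs, sketched here under the assumption that the queries are run from an SSMS query window, is to turn on the session statistics options. As far as I know, these need only permission to execute the queries themselves, not SHOWPLAN permission.

```sql
SET STATISTICS TIME ON;  -- reports parse/compile and execution CPU and elapsed time
SET STATISTICS IO ON;    -- reports logical and physical reads per table

-- Run Case 1 here, then Case 2, and compare the figures in the Messages tab.

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Run each query more than once before comparing, since the first run often pays for reading data from disk that later runs find already cached.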
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 37835859
Absolutely!
 
LVL 1

Author Comment

by:RichNH
ID: 37838713
Thanks for the replies, folks. To answer the questions in turn:

I don't have the privs to run an execution plan right now.

This was a one-shot solution, as pretty much all the solutions I generate are; the question was more to figure out whether there was a more efficient way. I did try running both samples and they came in neck and neck. For the data I pulled, you couldn't really tell which one won. My understanding is that, depending on the amount of data we pull and the operations we do, our queries sometimes run for hours. This query took between 4 and 5 minutes.

My understanding from one of the books I was reading, Beginning Microsoft SQL Server 2008 Programming by Robert Vieira, was that if you can put a constraint into the JOIN, it's more efficient in execution. That said, I do agree that from a maintenance point of view the query suggested in ACPerkins' note is much better.
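
On the ON-versus-WHERE point: for an INNER JOIN, a filter returns the same rows whether it sits in the ON clause or the WHERE clause, and the optimizer is generally free to treat the two forms alike. Placement changes the result only for outer joins. A minimal sketch, using hypothetical temp tables that are not part of the original schema:

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE #Company (KeyID int, Employees int);
CREATE TABLE #Exec (KeyID int, OSLevelID int);
INSERT INTO #Company VALUES (1, 100), (2, 500);
INSERT INTO #Exec VALUES (1, 20), (2, 20);

-- INNER JOIN: filter in ON and filter in WHERE return the same rows (KeyID 1 only).
SELECT C.KeyID
FROM #Company AS C
INNER JOIN #Exec AS E ON C.KeyID = E.KeyID AND C.Employees BETWEEN 50 AND 250;

SELECT C.KeyID
FROM #Company AS C
INNER JOIN #Exec AS E ON C.KeyID = E.KeyID
WHERE C.Employees BETWEEN 50 AND 250;

-- LEFT JOIN: placement matters. With the filter in ON, both companies come back
-- (KeyID 2 with a NULL OSLevelID); moving it to WHERE would drop KeyID 2.
SELECT C.KeyID, E.OSLevelID
FROM #Company AS C
LEFT JOIN #Exec AS E ON C.KeyID = E.KeyID AND C.Employees BETWEEN 50 AND 250;

DROP TABLE #Company, #Exec;
```

This is consistent with the two cases in the question running neck and neck: both use only inner joins, so moving filters between ON and WHERE is mostly a readability choice.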