Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

Questioning SQL efficiency, JOINs vs IN

Posted on 2012-04-11
5
Medium Priority
?
310 Views
Last Modified: 2012-04-23
Can anyone tell me which of these would be more efficient in processing?  I'm running MS SQL Server 2008.

Case 1
SELECT EF.ExtractableEmailVendorFeedName vendor
       ,COUNT(DISTINCT A.KeyID) companyCount
       ,COUNT(DISTINCT EF.DistinctID) emailCount
FROM PSExtract.dbo.vwCompany AS A
INNER JOIN PSExtract.dbo.vwExecutiveAndExecutiveFunction AS EF
ON A.KeyID = EF.KeyID AND EF.ExtractableEmailVendorFeedID IS NOT NULL AND
EF.OSFunctionID IN (30, 330, 333) AND
EF.OSLevelID IN (20, 30, 40, 50, 60, 85) AND
A.IndustryGroupID IN (220501, 220502, 220503, 220504, 220505, 220506, 220507, 220508, 220509, 220510, 220511, 220512,
220513, 220514, 220515, 220516, 220517, 220518, 220579, 220519, 220520, 220521, 220522, 220523, 220524, 220525,
220529, 220528, 220527, 220530, 220531, 220532, 220533, 220534, 220535, 220605, 220536, 220537, 220538, 220539,
220540, 220541, 220542, 220543, 220544, 220545, 220546, 220547, 220548, 220549, 220550, 220551, 220553, 220554,
220555, 220556, 220557, 220558, 220559, 220561, 220562, 220563, 220603, 220564, 220565, 220567, 220568, 220569,
220570, 220571, 220572, 220573, 220574, 220575, 220576, 220577, 220578, 220580, 220581, 220582, 220583, 220584,
220585, 220586, 220587, 220588, 220589, 220590, 220592, 220593, 220594, 220595, 220596, 220597, 220598, 220599,
220600, 220601, 220602) AND
A.NationalRegionID IN (10110, 10130, 10140) AND
A.Employees BETWEEN 50 AND 250
GROUP BY EF.ExtractableEmailVendorFeedName
ORDER BY EF.ExtractableEmailVendorFeedName ASC


Case 2
SELECT EF.ExtractableEmailVendorFeedName vendor
       ,COUNT(DISTINCT A.KeyID) companyCount
       ,COUNT(DISTINCT EF.DistinctID) emailCount
FROM PSExtract.dbo.vwCompany AS A
INNER JOIN PSExtract.dbo.vwExecutiveAndExecutiveFunction AS EF
ON A.KeyID = EF.KeyID AND EF.ExtractableEmailVendorFeedID IS NOT NULL AND
EF.OSFunctionID IN (30, 330, 333) AND
EF.OSLevelID IN (20, 30, 40, 50, 60, 85) AND
A.NationalRegionID IN (10110, 10130, 10140) AND
A.Employees BETWEEN 50 AND 250
INNER JOIN PSExtract.temp.RJZ_EssentialNet_IndustryGroupIDs AS IG
ON A.IndustryGroupID = IG.IndustryGroupID
GROUP BY EF.ExtractableEmailVendorFeedName
ORDER BY EF.ExtractableEmailVendorFeedName ASC

PSExtract.temp.RJZ_EssentialNet_IndustryGroupIDs  is loaded with the IndustryGroupIDs you see in case 1, they are loaded as ints, they are not indexed in any way.

Rich
0
Comment
Question by:RichNH
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
5 Comments
 
LVL 23

Assisted Solution

by:wdosanjos
wdosanjos earned 400 total points
ID: 37834731
I suggest that you view the execution plan of each query on SSME to evaluate which one is less expensive.
0
 
LVL 75

Accepted Solution

by:
Anthony Perkins earned 1000 total points
ID: 37835653
PSExtract.temp.RJZ_EssentialNet_IndustryGroupIDs  is loaded with the IndustryGroupIDs you see in case 1, they are loaded as ints, they are not indexed in any way.
Why not index it?  If you do it should be a no brainer.
I would however re-write it as follows:
SELECT	EF.ExtractableEmailVendorFeedName vendor,
	COUNT(DISTINCT A.KeyID) companyCount,
	COUNT(DISTINCT EF.DistinctID) emailCount
FROM	PSExtract.dbo.vwCompany AS A
	INNER JOIN PSExtract.dbo.vwExecutiveAndExecutiveFunction AS EF ON A.KeyID = EF.KeyID
	INNER JOIN PSExtract.temp.RJZ_EssentialNet_IndustryGroupIDs AS IG ON A.IndustryGroupID = IG.IndustryGroupID
WHERE	EF.ExtractableEmailVendorFeedID IS NOT NULL
	AND EF.OSFunctionID IN (30, 330, 333)
	AND EF.OSLevelID IN (20, 30, 40, 50, 60, 85)
	AND A.NationalRegionID IN (10110, 10130, 10140)
	AND A.Employees BETWEEN 50 AND 250
GROUP BY
	EF.ExtractableEmailVendorFeedName
ORDER BY
	EF.ExtractableEmailVendorFeedName ASC

Open in new window

0
 
LVL 59

Assisted Solution

by:HainKurt
HainKurt earned 600 total points
ID: 37835800
run both and check the execution time...
I don't see much difference but second one is more easy to maintain...
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 37835859
Absolutely!
0
 
LVL 1

Author Comment

by:RichNH
ID: 37838713
Thanks for the replies folks, to answer questions in turn:

I don't have the privs to run an execution plan right now.

This was a one shot solution as all the solutions I generate pretty much are, the question was more to figure out if there was a more efficient way.  I did try running both samples and they came in neck and neck.   For the data I pulled you couldn't really tell who won.  My understanding is that sometimes the amount of data we pull and the operations we do will run hours.  This query lasted between 4 & 5 minutes.

My understanding from one of the books I was reading was that if you can put a constraint into the JOIN, it's more efficient in execution.  I read this in Beginning Microsoft SQL Server 2008 Programming by Robert Vieira.  Although I do agree that from a maintenance point of view the suggested query in ACPerkins note is much better.
0

Featured Post

Efficient way to get backups off site to Azure

This user guide provides instructions on how to deploy and configure both a StoneFly Scale Out NAS Enterprise Cloud Drive virtual machine and Veeam Cloud Connect in the Microsoft Azure Cloud.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Ever needed a SQL 2008 Database replicated/mirrored/log shipped on another server but you can't take the downtime inflicted by initial snapshot or disconnect while T-logs are restored or mirror applied? You can use SQL Server Initialize from Backup…
When trying to connect from SSMS v17.x to a SQL Server Integration Services 2016 instance or previous version, you get the error “Connecting to the Integration Services service on the computer failed with the following error: 'The specified service …
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.

721 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question