Link to home
Start Free TrialLog in
Avatar of vb7guy
vb7guyFlag for United States of America

asked on

Temporary table vs using nested query

I have the following query that pulls records from two tables and returns the data.  I'm using #temp table to accomplish this.  I'm inserting records into temp table from one query and then inserting some more records into the same temp table using the second query.  I'm returning the count and the list of records.  Is there a better/faster to accomplish the same result without using temp tables? Examples would be nice.


SELECT CompanyId,EmployeeId INTO #EmployeeList  
      FROM dbo.Employees l
      WHERE l.CompanyId > 0 AND EXISTS  
            (SELECT * FROM dbo.Filings f WHERE (f.CompanyId = l.CompanyId AND f.EmployeeId   = l.EmployeeId ) AND f.EmployeeId IS NOT NULL
                  AND f.Clientid IS NULL AND f.AddDateTime >= '01/01/2012' ) ;  
INSERT INTO #EmployeeList  
      SELECT DISTINCT  f.CompanyId,lof.EmployeeId  FROM dbo.Filings f  
            INNER JOIN dbo.EmployeesOnFiling lof ON lof.FilingId = f.FilingId  WHERE f.CompanyId > 0 AND f.ClientId IS NOT NULL
            AND f.EmployeeId IS NULL AND f.AddDateTime >= '01/01/2012'  ;

SELECT COUNT(*) as thecount FROM (SELECT DISTINCT CompanyId, EmployeeId       FROM #EmployeeList) as Tbl  ;
SELECT DISTINCT CompanyId, EmployeeId FROM #EmployeeList
ASKER CERTIFIED SOLUTION
Avatar of momi_sabag
momi_sabag
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
You could use Union instead of the temp table to aggregate your employees and then do your count.

select count(*) as thecount from
(select distinct   ... from  dbo.employees1 where ...
union
select distinct  ... from dbo.filings  inner join ... where ...)
Avatar of Scott Pletcher
You can avoid a temp table using technique below.  The only difference is that the count comes after the list.


SET NOCOUNT ON

DECLARE @rowcount int

SELECT DISTINCT CompanyId,EmployeeId
      FROM dbo.Employees l
      WHERE l.CompanyId > 0 AND EXISTS  
            (SELECT * FROM dbo.Filings f WHERE (f.CompanyId = l.CompanyId AND f.EmployeeId   = l.EmployeeId ) AND f.EmployeeId IS NOT NULL
                  AND f.Clientid IS NULL AND f.AddDateTime >= '01/01/2012' )
UNION
SELECT DISTINCT f.CompanyId,lof.EmployeeId  FROM dbo.Filings f  
      INNER JOIN dbo.EmployeesOnFiling lof ON lof.FilingId = f.FilingId  WHERE f.CompanyId > 0 AND f.ClientId IS NOT NULL
      AND f.EmployeeId IS NULL AND f.AddDateTime >= '01/01/2012'
ORDER BY CompanyId,EmployeeId

SET @rowcount = @@ROWCOUNT

SELECT @rowcount
Hi vb7guy,

Typos not withstanding (the embedded semicolons need to be removed) momi_sabag's query is superior to the others.  In his query, the CompanyID and EmployeeId are pulled from both tables, and the UNION operator reduces the combined sub-queries into a distinct list.

The other examples needlessly apply the DISTINCT operator on each of the sub-queries before applying the UNION operator.  That's potentially 2 additional sorts and filters.  If you've got a large database that could be a performance killer.


Kent
The other examples needlessly apply the DISTINCT operator on each of the sub-queries before applying the UNION operator.  That's potentially 2 additional sorts and filters.  If you've got a large database that could be a performance killer.

I intentionally added DISTINCT to each query.  It's faster to sort n rows, then m rows than it is to sort (n + m) rows all at once.

Also, separate DISTINCTs immediately reduces the number of rows each query has to process.  The more dups there are, the more important it is to include the DISTINCT.

When doing the UNION, SQL will know that the DISTINCT lists are already sorted and "unique'd" and only has to merge them.  Merging previously sorted lists is very fast.

I'd be willing to make a big wager that with larger result sizes, esp. ones with lots of dups in the individual queries, that the queries with DISTINCT would be much faster.
Also, I'm not sure it's safe to take the count from the first query only as the final count.  To me it looked like the count was required to be an overall count, not just a count for the first query.
momi_sabag's query is also missing the req'd GROUP BY on the first query.  GROUP BY is more overhead than a DISTINCT.

Admittedly he did not put a DISTINCT nor GROUP BY on the second query, but as I noted earlier, I consider that a minus not a plus.