asked on

Temporary table vs using nested query

I have the following query that pulls records from two tables and returns the data. I'm using #temp table to accomplish this. I'm inserting records into temp table from one query and then inserting some more records into the same temp table using the second query. I'm returning the count and the list of records. Is there a better/faster to accomplish the same result without using temp tables? Examples would be nice.

SELECT CompanyId,EmployeeId INTO #EmployeeList
      FROM dbo.Employees l
      WHERE l.CompanyId > 0 AND EXISTS
            (SELECT * FROM dbo.Filings f WHERE (f.CompanyId = l.CompanyId AND f.EmployeeId = l.EmployeeId ) AND f.EmployeeId IS NOT NULL
                  AND f.Clientid IS NULL AND f.AddDateTime >= '01/01/2012' ) ;
INSERT INTO #EmployeeList
      SELECT DISTINCT f.CompanyId,lof.EmployeeId FROM dbo.Filings f
            INNER JOIN dbo.EmployeesOnFiling lof ON lof.FilingId = f.FilingId WHERE f.CompanyId > 0 AND f.ClientId IS NOT NULL
            AND f.EmployeeId IS NULL AND f.AddDateTime >= '01/01/2012' ;

SELECT COUNT(*) as thecount FROM (SELECT DISTINCT CompanyId, EmployeeId       FROM #EmployeeList) as Tbl ;
SELECT DISTINCT CompanyId, EmployeeId FROM #EmployeeList

ASKER CERTIFIED SOLUTION

momi_sabag

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

Michael Carrillo

You could use Union instead of the temp table to aggregate your employees and then do your count.

select count(*) as thecount from
(select distinct ... from dbo.employees1 where ...
union
select distinct ... from dbo.filings inner join ... where ...)

Scott Pletcher

You can avoid a temp table using technique below. The only difference is that the count comes after the list.

SET NOCOUNT ON

DECLARE @rowcount int

SELECT DISTINCT CompanyId,EmployeeId
FROM dbo.Employees l
WHERE l.CompanyId > 0 AND EXISTS
(SELECT * FROM dbo.Filings f WHERE (f.CompanyId = l.CompanyId AND f.EmployeeId = l.EmployeeId ) AND f.EmployeeId IS NOT NULL
AND f.Clientid IS NULL AND f.AddDateTime >= '01/01/2012' )
UNION
SELECT DISTINCT f.CompanyId,lof.EmployeeId FROM dbo.Filings f
INNER JOIN dbo.EmployeesOnFiling lof ON lof.FilingId = f.FilingId WHERE f.CompanyId > 0 AND f.ClientId IS NOT NULL
AND f.EmployeeId IS NULL AND f.AddDateTime >= '01/01/2012'
ORDER BY CompanyId,EmployeeId

SET @rowcount = @@ROWCOUNT

SELECT @rowcount

Kent Olsen

Hi vb7guy,

Typos not withstanding (the embedded semicolons need to be removed) momi_sabag's query is superior to the others. In his query, the CompanyID and EmployeeId are pulled from both tables, and the UNION operator reduces the combined sub-queries into a distinct list.

The other examples needlessly apply the DISTINCT operator on each of the sub-queries before applying the UNION operator. That's potentially 2 additional sorts and filters. If you've got a large database that could be a performance killer.

Kent

Scott Pletcher

The other examples needlessly apply the DISTINCT operator on each of the sub-queries before applying the UNION operator. That's potentially 2 additional sorts and filters. If you've got a large database that could be a performance killer.

I intentionally added DISTINCT to each query. It's faster to sort n rows, then m rows than it is to sort (n + m) rows all at once.

Also, separate DISTINCTs immediately reduces the number of rows each query has to process. The more dups there are, the more important it is to include the DISTINCT.

When doing the UNION, SQL will know that the DISTINCT lists are already sorted and "unique'd" and only has to merge them. Merging previously sorted lists is very fast.

I'd be willing to make a big wager that with larger result sizes, esp. ones with lots of dups in the individual queries, that the queries with DISTINCT would be much faster.

Scott Pletcher

Also, I'm not sure it's safe to take the count from the first query only as the final count. To me it looked like the count was required to be an overall count, not just a count for the first query.

Scott Pletcher

momi_sabag's query is also missing the req'd GROUP BY on the first query. GROUP BY is more overhead than a DISTINCT.

Admittedly he did not put a DISTINCT nor GROUP BY on the second query, but as I noted earlier, I consider that a minus not a plus.