slow subquery

I am no expert, so this is probably embarrassingly easy for the experts out there, but why is the simple 1st statement so much quicker than the simple 2nd one (with the subquery)?

1st:

DECLARE @tmpDueDate datetime
SET @tmpDueDate = '20020718'
DECLARE @PolicyID int
SET @PolicyID = 20
DECLARE @AccountID int

SELECT @AccountID = (SELECT top 1 AccountID FROM tblAccounts
WHERE DueDate > @tmpDueDate AND PolicyID = @PolicyID
AND (TransactionTypeID = 1 or transactiontypeid = 19)
AND TransactionStatusID <> 2 and TransactionstatusID <> 4
AND Contra = 0 And paymentmethodid = 1 ORDER BY DueDate)

UPDATE tblAccounts SET TransactionStatusID = 1
WHERE AccountID = @AccountID

2nd:

DECLARE @tmpDueDate datetime
SET @tmpDueDate = '20020718'
DECLARE @PolicyID int
SET @PolicyID = 20
DECLARE @AccountID int

UPDATE tblAccounts SET TransactionStatusID = 1
WHERE AccountID = (SELECT top 1 AccountID FROM tblAccounts
WHERE DueDate > @tmpDueDate AND PolicyID = @PolicyID
AND (TransactionTypeID = 1 or transactiontypeid = 19)
AND TransactionStatusID <> 2 and TransactionstatusID <> 4
AND Contra = 0 And paymentmethodid = 1 ORDER BY DueDate)

tblAccounts is indexed on PolicyID, AccountID and DueDate and contains about 3 million rows. The 1st takes 1 second and the 2nd takes 150 seconds, both to do a single update! For the 2nd, the estimated execution plan shows 34% of the query cost going to a Hash Match/Inner Join (whatever that is; all rows are read) and 25% to a sort.

tnewc59

The following select operation will experience the same problem:

SELECT myTable.Column1, myTable.Column2
FROM myTable
WHERE myTable.Column1 IN (
SELECT mySecondTable.Column1
FROM mySecondTable
WHERE mySecondTable.Column2 < 1000)

This same query could be rewritten more efficiently as:

SELECT myTable.Column1, myTable.Column2
FROM myTable INNER JOIN
(
SELECT mySecondTable.Column1
FROM mySecondTable
WHERE mySecondTable.Column2 < 1000
) as mySubQueryTable on myTable.Column1 = mySubQueryTable.Column1

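The second is more efficient because the subquery is executed and assembled only once, whereas in the first example the subquery may be executed 'x' times, where 'x' is the number of rows in myTable. Note that the two forms return the same rows only if mySecondTable.Column1 has no duplicates in the subquery result; otherwise the join repeats the matching myTable rows.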

This is the same concept that is slowing down your second update.
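
Applied to the update in the question, the derived-table idea could be sketched roughly as follows. This is an untested sketch, not a guaranteed fix. It reuses the table, column and variable names from the question. The aliases a and nextDue are made up for illustration, and the date is written as a quoted string on the assumption that 18 July 2002 was intended. Whether it actually beats the two-step version depends on the plan the optimizer picks.

```sql
DECLARE @tmpDueDate datetime
SET @tmpDueDate = '20020718'   -- assumed intended date, quoted so it is not evaluated as integer division
DECLARE @PolicyID int
SET @PolicyID = 20

-- Find the single target row once in a derived table, then join back to update it.
UPDATE a
SET TransactionStatusID = 1
FROM tblAccounts AS a
INNER JOIN (
    SELECT TOP 1 AccountID
    FROM tblAccounts
    WHERE DueDate > @tmpDueDate AND PolicyID = @PolicyID
      AND (TransactionTypeID = 1 OR TransactionTypeID = 19)
      AND TransactionStatusID <> 2 AND TransactionStatusID <> 4
      AND Contra = 0 AND PaymentMethodID = 1
    ORDER BY DueDate
) AS nextDue ON a.AccountID = nextDue.AccountID
```

If this still produces a hash match that reads all rows, the two-statement version from the question (select into @AccountID, then update) remains a reasonable choice.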

tnewc59

The sort time is the time it takes to execute the 'order by' portion of the query.

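The problem with using a 'top' with an 'order by' is that, unless an index already supplies the rows in that order, your query cannot return until the full result set has been sorted on the 'order by' columns.

When possible, I try to write my queries that use 'top' without an 'order by' clause, bearing in mind that without an 'order by' the row that 'top' returns is arbitrary.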
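
One way to keep the 'order by' and still avoid the sort is a composite index that leads with PolicyID followed by DueDate, assuming tblAccounts does not already have one (the question does not say whether the three columns are indexed separately or together). With an equality test on PolicyID, such an index can supply rows already in DueDate order, so 'top 1' may be able to stop at the first row that passes the remaining filters. The index name below is hypothetical.

```sql
-- Hypothetical index name; table and column names are taken from the question.
-- Rows for one PolicyID are stored in DueDate order, which matches the ORDER BY.
CREATE INDEX IX_tblAccounts_PolicyID_DueDate
    ON tblAccounts (PolicyID, DueDate)
```

Comparing the estimated execution plan before and after creating the index would show whether the sort step actually disappears.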

dlisk

ASKER

Thanx for your time tnewc59.