We have query that joins 4 tables. Each table is a multi million row table.
sample query to give you an idea (real tables names and column were replaced with generics).
FROM Table1 t1 with (NOLOCK)
INNER JOIN Table2 t2 with (NOLOCK)
t1.pk = t2.fk and (t2.col = 'W' or (t2.col3 = 'S' and t2.col4 = 5))
INNER JOIN Table3 t3 with (NOLOCK)
on t3.fk = t1.pk and t3.col6 = 1
and (t3.col3 is null or (t1.pk = t3.col5 and t3.col2 = 1))
and (t3.col7 = 'Y' or (7 > 0 and t3.col9 <= getDate() + 7))
INNER JOIN Table4 t4 with (NOLOCK)
t1.pk = t4.fk
WHERE t1.col9 = 'E'
and t1.col10 <> 1
and t1.col11 IS NOT NULL;
there is an index on every field that is used in every table in this query.
the 3rd join is very complex and creates internally some tables and results in some table scans (probably b/c of the OR conditions).
this query on average returns from 5-100 rows of data.
to optimize the query without changing anything all I did was add Top 1000 after the select keyword. Performance improvement is 500% to the original. Query plan does not show any table scans, only index seeks.
I cannot explain this behavior. Can someone help me understand why there is such a dramatic improvement? Isn't the top clause supposed to operate as the last step of the process, so it wouldn't effect performance at all?