LeTay
asked on
Optimise a SQL query
I have two SQL tables, says T1 and T2
T1 has PK1 as primary key and a column C1
T2 has K1 as foreign key PK1 in T1 and a column C1
I need to count the number of T1.PK1 for which T1.C1 has a given value (3) and for which there is at least on T2.C1 > 0
My current select is
select count(distinct T1.PK1)
from T1, T2
where T1.PK1 = T2.K1
and T1.C1 = 3
and T2.C1 > 0
I think that it is possible to optimise this as I presume that the entire T2 table is scanned for those rows where T2.C1 = T1.C1 (and T1.C1 = 3) ...
T1 has PK1 as primary key and a column C1
T2 has K1 as foreign key PK1 in T1 and a column C1
I need to count the number of T1.PK1 for which T1.C1 has a given value (3) and for which there is at least on T2.C1 > 0
My current select is
select count(distinct T1.PK1)
from T1, T2
where T1.PK1 = T2.K1
and T1.C1 = 3
and T2.C1 > 0
I think that it is possible to optimise this as I presume that the entire T2 table is scanned for those rows where T2.C1 = T1.C1 (and T1.C1 = 3) ...
ASKER
Thanks
Will SQL stop looking in T2 as soon as it finds one T2.C1 > 0 for each PK1 candidate ?
Will SQL stop looking in T2 as soon as it finds one T2.C1 > 0 for each PK1 candidate ?
Depending on the cardinalities involved, you need the follwoing two indices: T1(C1) and T2(C1) to support the predicates for T1.C1=3 and T2.C1>0
ASKER
No problem with the predicates
But my question remains : will SQL stop looking in T2 as soon as it sees a T2.C1 > 0 for each PK involved ?
But my question remains : will SQL stop looking in T2 as soon as it sees a T2.C1 > 0 for each PK involved ?
ASKER
For info, the table T1 is very small and the number of rows in T2 for each T1.PK is large (2.000 per PK)
The primary key in T2 is (K1,C1)
The primary key in T2 is (K1,C1)
Sure.
After rereading your post again, the index on T2 should be T2(K1, C1). Cause it should give you an index seek instead of a table scan.
When T2(K1) is indexed, a common practice, then you should get a index seek on this index with an key lookup for C1, which is slower, cause it requires the additional lookup.
After rereading your post again, the index on T2 should be T2(K1, C1). Cause it should give you an index seek instead of a table scan.
When T2(K1) is indexed, a common practice, then you should get a index seek on this index with an key lookup for C1, which is slower, cause it requires the additional lookup.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
normally you would use a FROM T1 INNER JOIN T2 ON T1.PK1 = T2.K1 instead of using a CROSS JOIN and a WHERE. But modern query machines should find that out by themselves and create an identical execution plan in both cases.
Instead of creating the WHERE at the end you could reduce the number of rows before joining like this:
Open in new window
But I think also this is something the query engine will find out by itself and optimize on it's own. Just create an execution plan for both and look at the result to see if there's any difference. You can also activate the execution time statistics (both in case of SQL Server, don't know which database you use).
Cheers,
Christian