Optimize Complex Query
Posted on 2006-11-05
I'll try to simplify this to only the relevant information and hope that I don't inadvertently omit something important. (-:
1. A table of payroll deductions. Let's say its fields are (amount DECIMAL, person_id CHAR(9), deduction_date DATE). This table is fairly big (probably around a million rows and growing).
2. A table of EXPECTED deductions -- common amounts that we'd expect to see. Let's say its only field is (amount DECIMAL). This table is tiny (probably no more than 40 rows ever).
I wish to produce a report that...
1. Includes only UNUSUAL deductions (i.e. amounts that do not appear in expected_deductions)
2. Includes only items with a particular date.
3. Is ordered by the TOTAL of all deductions for a particular person on that date.
For example, I might have...
Person A $790
Person A $30
Person B $800
Person C $500
Notice that Person A's deductions appear first because the TOTAL of 790 and 30 is greater than Person B's 800. They're still listed as separate lines, however, so just saying "throw in a GROUP BY" isn't gonna cut it. (-:
I've tried a bunch of different queries, but the best execution time I've gotten so far (around 20 seconds) was for this:
SELECT * FROM deductions
INNER JOIN (SELECT SUM(amount) total, BUID FROM deductions WHERE payroll_date='2006-11-03' GROUP BY person_id) tblX
ON (tblX.person_id = deductions.person_id)
LEFT JOIN expected_deductions ed ON (ed.amount = DOE.amount)
WHERE deduction_date = '2006-11-03' AND ed.amount IS NULL
ORDER BY tblX.total DESC, amount DESC;
Any query that obtains the result I want, exploiting any MySQL 5.0 features (including perhaps a new index that I've not yet tried) will do it. I'll settle, if necessary, for a solution that does some work in PHP too, but at this point I'm being stubborn and I'm determined to get the DBMS to tell me what I want in under 5 seconds. (-: