Link to home
Start Free TrialLog in
Avatar of Azaniah
AzaniahFlag for United Kingdom of Great Britain and Northern Ireland

asked on

Combining two distinct counts on single large table into one query.

On a large-ish table in testing of around 4 Million rows.
I have two similar counts
Firstly:
SELECT SUPPLIER, PRODUCT, COUNT(*)
FROM RECORDS WHERE INVDATE=20080810 AND TYPE='I' AND CLIENT='BOB'
GROUP BY SUPPLIER, PRODUCT;

Secondly:
SELECT SUPPLIER, PRODUCT, COUNT(*)
FROM RECORDS WHERE AMENDED='Y' AND TYPE='I' AND CLIENT='BOB'
GROUP BY SUPPLIER, PRODUCT;

Both counts are contained within indexes (but different ones) on the table.

I've used:
SELECT SUPPLIER, PRODUCT
,sum(case when INVDATE=20080810 THEN 1 else 0 end) num_inv
,sum(case when AMENDED='Y'  THEN 1 else 0 end) num_amended
FROM RECORDS WHERE AMENDED='Y' AND TYPE='I' AND CLIENT='BOB'
GROUP BY SUPPLIER, PRODUCT;

This does give me what I want (albiet with lots of empty counts), but it takes several seconds, as from EXPLAIN it would appear no possible keys exist.  Using a union joined the results but as a single total column not as two individual ones.

Is there a more effecient to do a single query for this, and with the ability to extend the nubmer of different counts.

I have considered using a summary table, however with 100,000 INSERTS / UPDATES per day on the table I'm not sure that would be very effecient. (Still very new to MySQL).

Incidently is there much difference between COUNT(*) and SUM(1)?

Thanks in advance.




Avatar of Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3]
Flag of Luxembourg image

what about this:
SELECT SUPPLIER, PRODUCT
,sum(case when INVDATE=20080810 THEN 1 else 0 end) num_inv
,sum(case when AMENDED='Y'  THEN 1 else 0 end) num_amended
FROM RECORDS WHERE TYPE='I' AND CLIENT='BOB'
GROUP BY SUPPLIER, PRODUCT;

Open in new window

Avatar of Azaniah

ASKER

Angelll unfortunately not, as the date search (num_inv) is the number of invoices received 'as of' or on the date. The amended status is an overall view across all dates.

Essentially the client would need to see a list of supplier+products, invoice received to date, and those that have been amended at any point as a seperate count.

>The amended status is an overall view across all dates.
I have to disagree!

anyhow, you might add this:
SELECT SUPPLIER, PRODUCT
,sum(case when INVDATE=20080810 THEN 1 else 0 end) num_inv
,sum(case when AMENDED='Y'  THEN 1 else 0 end) num_amended
FROM RECORDS WHERE TYPE='I' AND CLIENT='BOB'
AND ( INVDATE=20080810 OR  AMENDED='Y' )
GROUP BY SUPPLIER, PRODUCT;

Open in new window

Avatar of Azaniah

ASKER

It's a bit faster 10 seconds vs 15, however the two seperate counts are 0.01secs so clearly no-where as effecient. Explain is at least listing possible keys, if not using any.

The number of possible rows returned for either set will be quite small, 40 max per count, so it might be more sensible to do the two counts and join the records later.

I might be mis-understanding something, but when you say you don't agree, if an client logs and and the two most important counts they need are new invoices (i.e. todays) and the total number of invoices that have been amended (so they can be addressed) I am not sure why you don't agree.

I don't agree, because the case expression for the SUM() will "filter/count" as you requested, not having the condition in the where clause does not matter for that. in other terms, the query I suggested will return the same data/counts as your initial query.

now, the matter is performance.
what indexes do you have, actually, on the table?

if you don't have any index with CLIENT and TYPE fields being listed first, try to reorder the index columns in the index so that CLIENT is first, and TYPE second (and INVDATE eventually third)...

otherwise, you might use this syntax (assuming MySQL 5.x or higher)


SELECT COALESCE(l1.SUPPLIER, l2.SUPPLIER) SUPPLIER
, COALESCE(l1.PRODUCT, l2.PRODUCT) PRODUCT
, count_invdate
, count_amended
FROM (
SELECT SUPPLIER, PRODUCT, COUNT(*) count_invdate
FROM RECORDS WHERE INVDATE=20080810 AND TYPE='I' AND CLIENT='BOB'
GROUP BY SUPPLIER, PRODUCT
) l1
FULL OUTER JOIN (
SELECT SUPPLIER, PRODUCT, COUNT(*) amended
FROM RECORDS WHERE AMENDED='Y' AND TYPE='I' AND CLIENT='BOB'
GROUP BY SUPPLIER, PRODUCT
) l2
 ON l1.SUPPLIER = l2.SUPPLIER
AND l1.PRODUCT = l2.PRODUCT

Open in new window

Avatar of Azaniah

ASKER

Angellll,

Looks like the version I am testing with will not allow this JOIN (Community Server 5.0.51b), however I can see what the logic is so I will look at updating the test version of MySQL.

Thanks again for you help.
well, the FULL OUTER JOIN is maybe not even needed in your case, if all product/supplier are present in both subqueries?---
Avatar of Azaniah

ASKER

They are not. I did try looking at doing a LEFT and RIGHT join and doing a UNION - but my feeble brain refused to cope today.

Avatar of Azaniah

ASKER

The following seems to work, but I am not sure if it has any impacts that I might be missing

SELECT SUPPLIER, PRODUCT, COUNT(*) count_invdate, SUM(0) count_amended
FROM INVOICES WHERE INVDATE=20080420 AND TYPE='I' AND CLIENT='BOB'
GROUP BY SUPPLIER, PRODUCT
UNION
SELECT SUPPLIER, PRODUCT, SUM(0) count_invdate, COUNT(*) count_amended
FROM INVOICES WHERE AMENDED='Y' AND TYPE='I' AND CLIENT='BOB'
GROUP BY SUPPLIER, PRODUCT

Open in new window

Avatar of Azaniah

ASKER

Annoyingly that doesn't work correctly. Returns all the correct data but not joined as needed.

ASKER CERTIFIED SOLUTION
Avatar of Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3]
Flag of Luxembourg image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Azaniah

ASKER

That works correctly and is much faster,certainly appears to have resolved my issue.
Thanks for your help again Angelll