Link to home
Start Free TrialLog in
Avatar of JohnnyBCJ
JohnnyBCJFlag for Canada

asked on

How do I make this query faster and more MySQL friendly?

I am curious on how do I make this query more MySQL friendly. This query displays the correct information but it takes a long time to display the results. This query will only work on the MySQL Workbench but not in my program because it times out.

What am I able to do to speed up this query?
SELECT    c.LastOffice, c.LNAME, c.FNAME, ms.SIN, ms.YEAR, ms.CSCH1, ms.CREVREC, ms.CPBACK, ms.CSCOL AS CLREC, ms.ADJ2, ms.AR FROM         
(SELECT     
SIN, YEAR, SUM(CSCH1) AS CSCH1, SUM(CREVREC) + SUM(INT_AMT2) AS CREVREC, SUM(INT_AMT2) AS CINT, SUM(PBack2) AS CPBACK, SUM('CLIENT.REC2') AS CL, SUM('CLIENT.SHORT2') AS CS, SUM('SPOUSE.SHORT2') AS SS, SUM(ADJUSTMENT2) AS ADJ2, SUM('CLIENT.REC2') + SUM('CLIENT.SHORT2') + SUM('SPOUSE.SHORT2') AS CSCOL, SUM(COALESCE (CSCH1, 0))                                                + SUM(COALESCE (PBack2, 0)) - SUM(COALESCE (CREVREC, 0)) - SUM(COALESCE ('CLIENT.REC2', 0)) - SUM(COALESCE ('CLIENT.SHORT2', 0)) - SUM(COALESCE ('SPOUSE.SHORT2', 0)) + SUM(COALESCE (ADJUSTMENT2, 0)) AS AR                        
FROM          
(SELECT     SIN, YEAR, 0 AS AMT_REC, 0 AS INT_AMT, SUM(CASE WHEN 'Adinfo' = 'P' THEN COALESCE (CKAMT, 0) ELSE 0 END) AS PBack2,   SUM(COALESCE (C_SCH1_REFUND, 0)) AS CSCH1, 0 AS AMT_REC2, 0 AS INT_AMT2, 0 AS CREVREC, 0 AS 'CLIENT.REC2', 0 AS 'CLIENT.SHORT2', 0 AS 'SPOUSE.SHORT2', 0 AS ADJUSTMENT2 FROM Disbursements AS Disbursements_1 WHERE (CKDATE <= STR_TO_DATE('10/31/2010', '%m/%d/%Y')) GROUP BY SIN, YEAR UNION ALL SELECT     SIN, YEAR, SUM(COALESCE (AMT_REC, 0)) AS AMT_REC, SUM(COALESCE (INT_AMT, 0)) AS INT_AMT, 0 AS PBack2, 0 AS CSCH1, 0 AS AMT_REC2, 0 AS INT_AMT2, 0 AS CREVREC, 0 AS 'CLIENT.REC2', 0 AS 'CLIENT.SHORT2', 0 AS 'SPOUSE.SHORT2', 0 AS ADJUSTMENT2 FROM Receipts 
WHERE (DATE_REC <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))                                               
GROUP BY SIN, YEAR                                                
UNION ALL                                                
SELECT     SIN, YEAR, 0 AS AMT_REC, 0 AS INT_AMT, 0 AS PBack2, 0 AS CSCH1, SUM(COALESCE (AMT_REC, 0)) AS AMT_REC2, SUM(COALESCE (INT_AMT, 0)) AS INT_AMT2, SUM(CASE WHEN 'Type' = 'RV' THEN COALESCE (AMT_REC, 0) - COALESCE (INT_AMT, 0) ELSE 0 END) AS CREVREC, SUM(CASE WHEN 'Type' = 'CL' THEN COALESCE (AMT_REC, 0) ELSE 0 END) AS 'CLIENT.REC2', SUM(CASE WHEN 'Type' = 'CS' THEN COALESCE (AMT_REC, 0) ELSE 0 END) AS 'CLIENT.SHORT2', SUM(CASE WHEN 'Type' = 'SS' THEN COALESCE (AMT_REC, 0) ELSE 0 END) AS 'SPOUSE.SHORT2', 0 AS ADJUSTMENT2 
FROM         Receipts AS Receipts_1
WHERE     (DATE_REC <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))                                                
GROUP BY SIN, YEAR                                                
UNION ALL                                                
SELECT     strSIN, strYear, 0 AS AMT_REC, 0 AS INT_AMT, 0 AS PBack2, 0 AS CSCH1, 0 AS AMT_REC2, 0 AS INT_AMT2, 0 AS CREVREC, 0 AS 'CLIENT.REC2', 0 AS 'CLIENT.SHORT2', 0 AS 'SPOUSE.SHORT2', SUM(decAMT) AS ADJUSTMENT2                                                FROM         tblAdjustments AS tblAdjustments_1                                                
WHERE     (Date <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))                                               
GROUP BY strSIN, strYear) AS HI                        
GROUP BY SIN, YEAR                        
HAVING      (SUM(COALESCE (CSCH1, 0)) + SUM(COALESCE (PBack2, 0)) - SUM(COALESCE (CREVREC, 0)) - SUM(COALESCE ('CLIENT.REC2', 0)) - SUM(COALESCE ('CLIENT.SHORT2', 0)) - SUM(COALESCE ('SPOUSE.SHORT2', 0)) + SUM(COALESCE (ADJUSTMENT2, 0)) <> 0))                        AS ms 
INNER JOIN Clients AS c ON c.SIN = ms.SIN 
GROUP BY c.LNAME, c.FNAME, ms.SIN, c.LastOffice, ms.YEAR, ms.CSCH1, ms.CREVREC, ms.CINT, ms.CPBACK, ms.CL, ms.CS, ms.SS, ms.ADJ2, ms.CSCOL, ms.AR 
ORDER BY c.LNAME, c.FNAME, ms.SIN;

Open in new window

Avatar of Kent Olsen
Kent Olsen
Flag of United States of America image

Hi John,

The query looks reasonable.

Can you describe it?  How many rows are being processed?  How many returned?  Have you seen an explain plan?  Are the appropriate indexes in place?  Can you add indexes if necessary?


Kent
Avatar of JohnnyBCJ

ASKER

The query returns any clients who's balance does not equal zero. There are 25,000+ clients at the moment.  The query returns about 2000 rows but it goes through probably close to 100,000 records (If you include all the tables it goes through to get this result).

I haven't seen an explain plan. I don't have any indexes in place but I can add indexes if needed.





Is this the explain plan you mentioned? It doesn't make a whole lot of sense to me.


1    PRIMARY    <derived2>    ALL                    19828    Using temporary; Using filesort

1    PRIMARY    c    ALL                    23830    Using where; Using join buffer

2    DERIVED    <derived3>    ALL                    61116    Using temporary; Using filesort

3    DERIVED    Disbursements_1    ALL                    22446    Using where; Using temporary; Using filesort

4    UNION    Receipts    ALL                    20724    Using where; Using temporary; Using filesort

5    UNION    Receipts_1    ALL                    20724    Using where; Using temporary; Using filesort

6    UNION    tblAdjustments_1    ALL                    1806    Using where; Using temporary; Using filesort
    UNION RESULT    <union3,4,5,6>    ALL                        


Each of the subqueries filters on the date of the record.  If you'll put an index on that column in each of the tables, the query will be able to read only the rows in the correct date range.  As it is now, the query is doing a full table scan on all of the tables.

CREATE INDEX idx001 on Receipts (DATE_REC);
CREATE INDEX idx002 on tblAdjustments (Date);


See if that doesn't help.

Kent
Yes, you are correct. Each of the sub queries filters on the date of the record. The only problem is the vast majority of the time, the date filter does nothing. It wasn't until a very rare problem happened that caused me to add in the by date filter.

I did what you suggested and got the result in 164 seconds where I previously got 167 seconds.

You usually look at this report using the current year end date to see what people owe right now. The reason why I had to put the date filter in was if you were to look at what a person owed in the past (last month, year, week, etc).

If I need to explain anything, please ask away.
Would this query work faster if it was a view?
Making the query a view won't change anything.  The DBMS will run the exact same SQL.

Run each of the 3 subqueries (separated by the UNION ALL statements) as see what the timing is for each of these 3.  I suspect that their timings will be reasonable and that significant time is being spent accessing the derived tables.


Kent
Good idea. I'll do that now and I'll get back to you.
I did that and each of the sub queries take under 1 second but when I run the entire thing it takes over 3 minutes.
Here are my results:

23734 rows returned           0.015 sec / 0.266 sec
19796 rows returned           0.109 sec / 0.047 sec
  1719 rows returned           0.016 sec / 0.000 sec
45,000 rows isn't a lot of data.  Reprocessing those rows shouldn't take very long.

So let's look at the inner join.  Is there a cartesian product being produced?  (Can there be more than 1 row with the same value of SIN?)


Kent
Every table can have multiple rows except the clients table.
Now hold on. Please let me back track.

The final product of this query is pretty much this:

Last Office, LName, FName, SIN, Year, Sum(Invoices), Sum(Receipts for each type), Sum(Adjustments).

So each of the joins gives me the sum of either invoices, receipts or adjustments, meaning each one gives me back 1 record.


I wasn't able to use a join in this situation because a person may have an invoice but no receipt, or an invoice and a receipt but no adjustment. If a person never had an invoice, receipt and an adjustment, it wouldn't show. It's possible that a person has an adjustment but no invoice. I couldn't figure out how to do the proper join (if there is one in this situation).
Thats the reason why I went with union all. This shows me all the information I need regardless what the client has.






You could put it all on one line using OUTER JOIN instead of UNION ALL.

I've taken the liberty of reformatting the SQL just to be easier to read.  If you'll note, all of the aggregated values in the inner subqueries are modified by COALESCE so no NULL values are possible.

When you select the items from the next-outer query, the HAVING clause filters the aggregate data.  That query assumes non-NULL.  Remove the COALESCE function from the HAVING clause.

Also, run the second query below.  What is the count?


Kent

SELECT    
  c.LastOffice, c.LNAME, c.FNAME, ms.SIN, ms.YEAR, ms.CSCH1, ms.CREVREC, ms.CPBACK, ms.CSCOL AS CLREC, ms.ADJ2, ms.AR 
FROM         
(
  SELECT     
    SIN, YEAR, SUM(CSCH1) AS CSCH1, SUM(CREVREC) + SUM(INT_AMT2) AS CREVREC, SUM(INT_AMT2) AS CINT, SUM(PBack2) AS CPBACK, SUM('CLIENT.REC2') AS CL, SUM('CLIENT.SHORT2') AS CS, SUM('SPOUSE.SHORT2') AS SS, SUM(ADJUSTMENT2) AS ADJ2, SUM('CLIENT.REC2') + SUM('CLIENT.SHORT2') + SUM('SPOUSE.SHORT2') AS CSCOL, SUM(COALESCE (CSCH1, 0))                                                + SUM(COALESCE (PBack2, 0)) - SUM(COALESCE (CREVREC, 0)) - SUM(COALESCE ('CLIENT.REC2', 0)) - SUM(COALESCE ('CLIENT.SHORT2', 0)) - SUM(COALESCE ('SPOUSE.SHORT2', 0)) + SUM(COALESCE (ADJUSTMENT2, 0)) AS AR                        
  FROM          
  (
    SELECT     SIN, YEAR, 0 AS AMT_REC, 0 AS INT_AMT, SUM(CASE WHEN 'Adinfo' = 'P' THEN COALESCE (CKAMT, 0) ELSE 0 END) AS PBack2,   SUM(COALESCE (C_SCH1_REFUND, 0)) AS CSCH1, 0 AS AMT_REC2, 0 AS INT_AMT2, 0 AS CREVREC, 0 AS 'CLIENT.REC2', 0 AS 'CLIENT.SHORT2', 0 AS 'SPOUSE.SHORT2', 0 AS ADJUSTMENT2 FROM Disbursements AS Disbursements_1 WHERE (CKDATE <= STR_TO_DATE('10/31/2010', '%m/%d/%Y')) GROUP BY SIN, YEAR UNION ALL SELECT     SIN, YEAR, SUM(COALESCE (AMT_REC, 0)) AS AMT_REC, SUM(COALESCE (INT_AMT, 0)) AS INT_AMT, 0 AS PBack2, 0 AS CSCH1, 0 AS AMT_REC2, 0 AS INT_AMT2, 0 AS CREVREC, 0 AS 'CLIENT.REC2', 0 AS 'CLIENT.SHORT2', 0 AS 'SPOUSE.SHORT2', 0 AS ADJUSTMENT2 
    FROM Receipts 
    WHERE (DATE_REC <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))                                               
    GROUP BY SIN, YEAR                                                

    UNION ALL                                                

    SELECT     SIN, YEAR, 0 AS AMT_REC, 0 AS INT_AMT, 0 AS PBack2, 0 AS CSCH1, SUM(COALESCE (AMT_REC, 0)) AS AMT_REC2, SUM(COALESCE (INT_AMT, 0)) AS INT_AMT2, SUM(CASE WHEN 'Type' = 'RV' THEN COALESCE (AMT_REC, 0) - COALESCE (INT_AMT, 0) ELSE 0 END) AS CREVREC, SUM(CASE WHEN 'Type' = 'CL' THEN COALESCE (AMT_REC, 0) ELSE 0 END) AS 'CLIENT.REC2', SUM(CASE WHEN 'Type' = 'CS' THEN COALESCE (AMT_REC, 0) ELSE 0 END) AS 'CLIENT.SHORT2', SUM(CASE WHEN 'Type' = 'SS' THEN COALESCE (AMT_REC, 0) ELSE 0 END) AS 'SPOUSE.SHORT2', 0 AS ADJUSTMENT2 
    FROM         Receipts AS Receipts_1
    WHERE     (DATE_REC <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))                                                
    GROUP BY SIN, YEAR                                                

    UNION ALL                                                

    SELECT     strSIN, strYear, 0 AS AMT_REC, 0 AS INT_AMT, 0 AS PBack2, 0 AS CSCH1, 0 AS AMT_REC2, 0 AS INT_AMT2, 0 AS CREVREC, 0 AS 'CLIENT.REC2', 0 AS 'CLIENT.SHORT2', 0 AS 'SPOUSE.SHORT2', SUM(decAMT) AS ADJUSTMENT2                                                
    FROM         tblAdjustments AS tblAdjustments_1                                                
    WHERE     (Date <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))                                               
    GROUP BY strSIN, strYear
  ) AS HI
  GROUP BY SIN, YEAR                        
  HAVING      (SUM(COALESCE (CSCH1, 0)) + SUM(COALESCE (PBack2, 0)) - SUM(COALESCE (CREVREC, 0)) - SUM(COALESCE ('CLIENT.REC2', 0)) - SUM(COALESCE ('CLIENT.SHORT2', 0)) - SUM(COALESCE ('SPOUSE.SHORT2', 0)) + SUM(COALESCE (ADJUSTMENT2, 0)) <> 0)
) AS ms 
INNER JOIN Clients AS c 
  ON c.SIN = ms.SIN 
GROUP BY c.LNAME, c.FNAME, ms.SIN, c.LastOffice, ms.YEAR, ms.CSCH1, ms.CREVREC, ms.CINT, ms.CPBACK, ms.CL, ms.CS, ms.SS, ms.ADJ2, ms.CSCOL, ms.AR 
ORDER BY c.LNAME, c.FNAME, ms.SIN;

--
--
--

SELECT count(*)
FROM         
(
  SELECT     
    SIN, YEAR, SUM(CSCH1) AS CSCH1, SUM(CREVREC) + SUM(INT_AMT2) AS CREVREC, SUM(INT_AMT2) AS CINT, SUM(PBack2) AS CPBACK, SUM('CLIENT.REC2') AS CL, SUM('CLIENT.SHORT2') AS CS, SUM('SPOUSE.SHORT2') AS SS, SUM(ADJUSTMENT2) AS ADJ2, SUM('CLIENT.REC2') + SUM('CLIENT.SHORT2') + SUM('SPOUSE.SHORT2') AS CSCOL, SUM(COALESCE (CSCH1, 0))                                                + SUM(COALESCE (PBack2, 0)) - SUM(COALESCE (CREVREC, 0)) - SUM(COALESCE ('CLIENT.REC2', 0)) - SUM(COALESCE ('CLIENT.SHORT2', 0)) - SUM(COALESCE ('SPOUSE.SHORT2', 0)) + SUM(COALESCE (ADJUSTMENT2, 0)) AS AR                        
  FROM          
  (
    SELECT     SIN, YEAR, 0 AS AMT_REC, 0 AS INT_AMT, SUM(CASE WHEN 'Adinfo' = 'P' THEN COALESCE (CKAMT, 0) ELSE 0 END) AS PBack2,   SUM(COALESCE (C_SCH1_REFUND, 0)) AS CSCH1, 0 AS AMT_REC2, 0 AS INT_AMT2, 0 AS CREVREC, 0 AS 'CLIENT.REC2', 0 AS 'CLIENT.SHORT2', 0 AS 'SPOUSE.SHORT2', 0 AS ADJUSTMENT2 FROM Disbursements AS Disbursements_1 WHERE (CKDATE <= STR_TO_DATE('10/31/2010', '%m/%d/%Y')) GROUP BY SIN, YEAR UNION ALL SELECT     SIN, YEAR, SUM(COALESCE (AMT_REC, 0)) AS AMT_REC, SUM(COALESCE (INT_AMT, 0)) AS INT_AMT, 0 AS PBack2, 0 AS CSCH1, 0 AS AMT_REC2, 0 AS INT_AMT2, 0 AS CREVREC, 0 AS 'CLIENT.REC2', 0 AS 'CLIENT.SHORT2', 0 AS 'SPOUSE.SHORT2', 0 AS ADJUSTMENT2 
    FROM Receipts 
    WHERE (DATE_REC <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))                                               
    GROUP BY SIN, YEAR                                                

    UNION ALL                                                

    SELECT     SIN, YEAR, 0 AS AMT_REC, 0 AS INT_AMT, 0 AS PBack2, 0 AS CSCH1, SUM(COALESCE (AMT_REC, 0)) AS AMT_REC2, SUM(COALESCE (INT_AMT, 0)) AS INT_AMT2, SUM(CASE WHEN 'Type' = 'RV' THEN COALESCE (AMT_REC, 0) - COALESCE (INT_AMT, 0) ELSE 0 END) AS CREVREC, SUM(CASE WHEN 'Type' = 'CL' THEN COALESCE (AMT_REC, 0) ELSE 0 END) AS 'CLIENT.REC2', SUM(CASE WHEN 'Type' = 'CS' THEN COALESCE (AMT_REC, 0) ELSE 0 END) AS 'CLIENT.SHORT2', SUM(CASE WHEN 'Type' = 'SS' THEN COALESCE (AMT_REC, 0) ELSE 0 END) AS 'SPOUSE.SHORT2', 0 AS ADJUSTMENT2 
    FROM         Receipts AS Receipts_1
    WHERE     (DATE_REC <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))                                                
    GROUP BY SIN, YEAR                                                

    UNION ALL                                                

    SELECT     strSIN, strYear, 0 AS AMT_REC, 0 AS INT_AMT, 0 AS PBack2, 0 AS CSCH1, 0 AS AMT_REC2, 0 AS INT_AMT2, 0 AS CREVREC, 0 AS 'CLIENT.REC2', 0 AS 'CLIENT.SHORT2', 0 AS 'SPOUSE.SHORT2', SUM(decAMT) AS ADJUSTMENT2                                                
    FROM         tblAdjustments AS tblAdjustments_1                                                
    WHERE     (Date <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))                                               
    GROUP BY strSIN, strYear
  ) AS HI
  GROUP BY SIN, YEAR                        
  HAVING      (SUM(COALESCE (CSCH1, 0)) + SUM(COALESCE (PBack2, 0)) - SUM(COALESCE (CREVREC, 0)) - SUM(COALESCE ('CLIENT.REC2', 0)) - SUM(COALESCE ('CLIENT.SHORT2', 0)) - SUM(COALESCE ('SPOUSE.SHORT2', 0)) + SUM(COALESCE (ADJUSTMENT2, 0)) <> 0)
) AS ms 
INNER JOIN Clients AS c 
  ON c.SIN = ms.SIN 

Open in new window

I decided to upgrade to the latest version of MySQL WorkBench and it's been a mistake. I haven't got workbench to work properly since. I'll give you your results as soon as I can.
I ran your queries.

19828 rows returned      164.291 sec / 0.188 sec
1 rows returned                 32.609 sec / 0.000 sec

19828 is the count.
Wow.....

The two queries are essentially the same.  The only difference is that the faster queries counts the objects to be grouped, and the slower one selects, groups, and sorts in 5 times the time it takes to count.  Unbelievable.  132 seconds to group and sort 19,828 rows.

Is the GROUP BY clause necessary?  It looks like a very strange set of conditions would have to be met for the GROUP BY to actually have an effect.


Kent
I don't know any other way to get the same results without grouping by SIN and Year.

I'm trying to get the balance of all clients (SIN), and by year. I don't want a receipt for the year 2000 to be applied to an 2004 invoice for example. If a client has a balance, we need to know what year it applies to. Also if a client has a balance for more than one year, I need to have multiple rows for that client but with separate years.

Is there something in MySQL that I can configure that may change these results? I'm fairly new with MySQL but I have a background in MS SQL. It doesn't make any sense to me why it would take so long to do this query. I have several queries that are like this (they don't work because they time out). Yes, I'm able to extend the time out but the real issue is why does this take so long to begin with?
Just going to chime in quickly to second the motion on Kent's last question.  Looking at the query, it would appear that you don't really need all the inner GROUP BY on the UNION since you are using GROUP BY on the other query.

Or even look at changing some of how the query is structured to remove some of the need for UNION.

For example, the first two queries in the UNION go against the same table.  Without deep inspection of why, that would stick out to me as a potential change.

The outer query doesn't explicitly aggregate any column, but you still perform the GROUP BY.  Just curious...

Another good catch, Kent.
I wouldn't think this would impact performance, but just a side note that SUM() will optimize out NULLs or another way of saying it is not adding anything to 5 or adding 0 is still only 5, so COALESCE on fields involved in a SUM does nothing unless you were defaulting to a non-zero value.  Now with AVG or COUNT this may make a difference , but not a SUM.
Thanks mwvisa1 for chiming in.

The 1st Receipts sums up all the receipts and interest (regardless of type) for the client per year.

By the looks of it, the 2nd one does the same thing the first one does and also splits them into their separate type. I don't think there is a need for the 1st Receipt query?


Kdo, running the query without the 2nd group by clause returns in 164.547 seconds.


I remember that you can do that with MySQL (GROUP BY with no specific aggregation), but I don't remember the exact effect.  Perhaps it's a DISTINCT operation?

Oh -- and there are really 4 subqueries that are combined with UNION ALL.  I did a lousy job of reformatting the query.  :(

It appears that 2 of the queries can be made into a single query, but I don't think that that is where the performance issue lies.


Kent

The default value for the fields that are COALESCE are null. So if I remove COALESCE from my query, it should still work the same?

If I run this same query under MS SQL, It displays in under 5 seconds, if that.
You did a good job Kdo. It beats my crappy code post in my question!!
And for clarity, here's the query reformatted in the LONG form....


SELECT    
  c.LastOffice, 
  c.LNAME, 
  c.FNAME, 
  ms.SIN, 
  ms.YEAR, 
  ms.CSCH1, 
  ms.CREVREC, 
  ms.CPBACK, 
  ms.CSCOL AS CLREC, 
  ms.ADJ2, 
  ms.AR 
FROM         
(
  SELECT     
    SIN, YEAR, SUM(CSCH1) AS CSCH1, SUM(CREVREC) + SUM(INT_AMT2) AS CREVREC, SUM(INT_AMT2) AS CINT, SUM(PBack2) AS CPBACK, SUM('CLIENT.REC2') AS CL, SUM('CLIENT.SHORT2') AS CS, SUM('SPOUSE.SHORT2') AS SS, SUM(ADJUSTMENT2) AS ADJ2, SUM('CLIENT.REC2') + SUM('CLIENT.SHORT2') + SUM('SPOUSE.SHORT2') AS CSCOL, SUM(COALESCE (CSCH1, 0))                                                + SUM(COALESCE (PBack2, 0)) - SUM(COALESCE (CREVREC, 0)) - SUM(COALESCE ('CLIENT.REC2', 0)) - SUM(COALESCE ('CLIENT.SHORT2', 0)) - SUM(COALESCE ('SPOUSE.SHORT2', 0)) + SUM(COALESCE (ADJUSTMENT2, 0)) AS AR                        
  FROM          
  (
    SELECT 
      SIN, 
      YEAR, 
      0 AS AMT_REC, 
      0 AS INT_AMT, 
      SUM(CASE WHEN 'Adinfo' = 'P' THEN COALESCE (CKAMT, 0) ELSE 0 END) AS PBack2,   
      SUM(COALESCE (C_SCH1_REFUND, 0)) AS CSCH1, 
      0 AS AMT_REC2, 
      0 AS INT_AMT2, 
      0 AS CREVREC, 
      0 AS 'CLIENT.REC2', 
      0 AS 'CLIENT.SHORT2', 
      0 AS 'SPOUSE.SHORT2', 
      0 AS ADJUSTMENT2 
    FROM Disbursements AS Disbursements_1 
    WHERE (CKDATE <= STR_TO_DATE('10/31/2010', '%m/%d/%Y')) 
    GROUP BY SIN, YEAR 
    
    UNION ALL 
    
    SELECT     
      SIN, 
      YEAR, 
      SUM(COALESCE (AMT_REC, 0)) AS AMT_REC, 
      SUM(COALESCE (INT_AMT, 0)) AS INT_AMT, 
      0 AS PBack2, 
      0 AS CSCH1, 
      0 AS AMT_REC2, 
      0 AS INT_AMT2, 
      0 AS CREVREC, 
      0 AS 'CLIENT.REC2', 
      0 AS 'CLIENT.SHORT2', 
      0 AS 'SPOUSE.SHORT2', 
      0 AS ADJUSTMENT2     
    FROM Receipts 
    WHERE (DATE_REC <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))                                               
    GROUP BY SIN, YEAR                                                

    UNION ALL                                                

    SELECT     
      SIN, 
      YEAR, 
      0 AS AMT_REC, 
      0 AS INT_AMT, 
      0 AS PBack2, 
      0 AS CSCH1, 
      SUM(COALESCE (AMT_REC, 0)) AS AMT_REC2, 
      SUM(COALESCE (INT_AMT, 0)) AS INT_AMT2, 
      SUM(CASE WHEN 'Type' = 'RV' THEN COALESCE (AMT_REC, 0) - COALESCE (INT_AMT, 0) ELSE 0 END) AS CREVREC, SUM(CASE WHEN 'Type' = 'CL' THEN COALESCE (AMT_REC, 0) ELSE 0 END) AS 'CLIENT.REC2', 
      SUM(CASE WHEN 'Type' = 'CS' THEN COALESCE (AMT_REC, 0) ELSE 0 END) AS 'CLIENT.SHORT2', 
      SUM(CASE WHEN 'Type' = 'SS' THEN COALESCE (AMT_REC, 0) ELSE 0 END) AS 'SPOUSE.SHORT2', 
      0 AS ADJUSTMENT2 
    FROM         Receipts AS Receipts_1
    WHERE     (DATE_REC <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))                                                
    GROUP BY SIN, YEAR                                                

    UNION ALL                                                

    SELECT     
      strSIN, strYear, 
      0 AS AMT_REC, 
      0 AS INT_AMT, 
      0 AS PBack2, 
      0 AS CSCH1, 
      0 AS AMT_REC2, 
      0 AS INT_AMT2, 
      0 AS CREVREC, 
      0 AS 'CLIENT.REC2', 
      0 AS 'CLIENT.SHORT2', 
      0 AS 'SPOUSE.SHORT2', 
      SUM(decAMT) AS ADJUSTMENT2                                                
    FROM         tblAdjustments AS tblAdjustments_1                                                
    WHERE     (Date <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))                                               
    GROUP BY strSIN, strYear
  ) AS HI
  GROUP BY SIN, YEAR                        
  HAVING      (SUM(COALESCE (CSCH1, 0)) + SUM(COALESCE (PBack2, 0)) - SUM(COALESCE (CREVREC, 0)) - SUM(COALESCE ('CLIENT.REC2', 0)) - SUM(COALESCE ('CLIENT.SHORT2', 0)) - SUM(COALESCE ('SPOUSE.SHORT2', 0)) + SUM(COALESCE (ADJUSTMENT2, 0)) <> 0)
) AS ms 
INNER JOIN Clients AS c 
  ON c.SIN = ms.SIN 
GROUP BY c.LNAME, c.FNAME, ms.SIN, c.LastOffice, ms.YEAR, ms.CSCH1, ms.CREVREC, ms.CINT, ms.CPBACK, ms.CL, ms.CS, ms.SS, ms.ADJ2, ms.CSCOL, ms.AR 
ORDER BY c.LNAME, c.FNAME, ms.SIN;

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of Kevin Cross
Kevin Cross
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Guess I missed a whole bunch of posts while I was analyzing the code snippet for my last posting.  Sorry if it is no longer relevant.
No, your post is still relevant. I missed out on that completely. That is going to help me a lot with the other queries that are giving me problems!


Hi John,

I'm wondering if it's a tuning problem.  There's nothing in the query that's that complex and we're not dealing with very much data.

Perhaps MySQL has too little memory available to it and it's doing a lot of paging?

What's your host system and memory configuration?


Kent

Removing the 1st Receipts table query and replacing it with what mwvisa1 said gives me 2229 rows returned in 18.844 seconds.

Unfortunately the real/correct information only gives me 1554 rows.
With that said, it seems like it may only be a syntax error as I'm not getting any results for CPBack or CLRec. Give me a few minutes to figure out where I went wrong.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Seems like we are getting somewhere if that dropped you down to 18 seconds, but I actually had same thought as Kent and was hunting down this article link for you.  Especially look at the section on MySQL configuration and setting for join buffer since that is used on one of the bigger joins here.

3 Ways to Speed Up MySQL by gr8gonzo
Hi John,

You may be able to combine the two sub-queries of Recipients, but the other items won't combine.

And that will affect the row count as all of the recipient data for a single entity will be on one line.


Kent
I'm converting my old program that used a different database to MySQL. I know that the numbers in my old program are correct. I just got to make my new program's numbers match my old one.
Got you!
Thanks Kdo. I did that and I am now down to 18 seconds but I'm getting different results from the correct information. I'm sure it's just a matter of a syntax error somewhere.

Right now I'm not getting any results for CPBack or the CLREC columns. I don't think that will be a big deal tho. I'll figure out where I fooled up.
I already found out where I fooled up. It was what Mvisa mentioned about the ' and the `.

I'm now down to 1729 rows when the correct information says I should be down to 1546.
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial

That would be a 'yes', huh?  :)
Here is what I got after spotting the you had more ' around columns in the code.  See if matches up to what you have.
SELECT    
  c.LastOffice, 
  c.LNAME, 
  c.FNAME, 
  ms.SIN, 
  ms.YEAR, 
  ms.CSCH1, 
  ms.CREVREC, 
  ms.CPBACK, 
  ms.CSCOL AS CLREC, 
  ms.ADJ2, 
  ms.AR 
FROM         
(
  SELECT     
    SIN, YEAR, SUM(CSCH1) AS CSCH1, SUM(CREVREC) + SUM(INT_AMT2) AS CREVREC
    , SUM(INT_AMT2) AS CINT, SUM(PBack2) AS CPBACK, SUM(`CLIENT.REC2`) AS CL
    , SUM(`CLIENT.SHORT2`) AS CS, SUM(`SPOUSE.SHORT2`) AS SS, SUM(ADJUSTMENT2) AS ADJ2
    , SUM(`CLIENT.REC2`) + SUM(`CLIENT.SHORT2`) + SUM(`SPOUSE.SHORT2`) AS CSCOL, SUM(CSCH1)                                                + SUM(COALESCE (PBack2, 0)) - SUM(COALESCE (CREVREC, 0)) - SUM(COALESCE ('CLIENT.REC2', 0)) - SUM(COALESCE ('CLIENT.SHORT2', 0)) - SUM(COALESCE ('SPOUSE.SHORT2', 0)) + SUM(COALESCE (ADJUSTMENT2, 0)) AS AR                        
  FROM          
  (
    SELECT 
      SIN, 
      YEAR, 
      0 AS AMT_REC, 
      0 AS INT_AMT, 
      SUM(CASE `Adinfo` WHEN 'P' THEN CKAMT END) AS PBack2,   
      SUM(COALESCE (C_SCH1_REFUND, 0)) AS CSCH1, 
      0 AS AMT_REC2, 
      0 AS INT_AMT2, 
      0 AS CREVREC, 
      0 AS `CLIENT.REC2`, 
      0 AS `CLIENT.SHORT2`, 
      0 AS `SPOUSE.SHORT2`, 
      0 AS ADJUSTMENT2 
    FROM Disbursements AS Disbursements_1 
    WHERE (CKDATE <= STR_TO_DATE('10/31/2010', '%m/%d/%Y')) 
    GROUP BY SIN, YEAR 
    
    UNION ALL 
    
    SELECT SIN, YEAR
     , SUM(AMT_REC) AS AMT_REC
     , SUM(INT_AMT) AS INT_AMT
     , 0 AS PBack2, 0 AS CSCH1
     , SUM(AMT_REC) AS AMT_REC2, SUM(INT_AMT) AS INT_AMT2
     , SUM(CASE `Type` WHEN 'RV' THEN COALESCE(AMT_REC, 0) - COALESCE(INT_AMT, 0) END) AS CREVREC
     , SUM(CASE `Type` WHEN 'CL' THEN AMT_REC END) AS `CLIENT.REC2`
     , SUM(CASE `Type` WHEN 'CS' THEN AMT_REC END) AS `CLIENT.SHORT2`
     , SUM(CASE `Type` WHEN 'SS' THEN AMT_REC END) AS `SPOUSE.SHORT2`
     , 0 AS ADJUSTMENT2 
   FROM Receipts 
   WHERE (DATE_REC <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))                                               
   GROUP BY SIN, YEAR                                                 

    UNION ALL                                                

    SELECT     
      strSIN, strYear, 
      0 AS AMT_REC, 
      0 AS INT_AMT, 
      0 AS PBack2, 
      0 AS CSCH1, 
      0 AS AMT_REC2, 
      0 AS INT_AMT2, 
      0 AS CREVREC, 
      0 AS `CLIENT.REC2`, 
      0 AS `CLIENT.SHORT2`, 
      0 AS `SPOUSE.SHORT2`, 
      SUM(decAMT) AS ADJUSTMENT2                                                
    FROM         tblAdjustments AS tblAdjustments_1                                                
    WHERE     (Date <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))                                               
    GROUP BY strSIN, strYear
  ) AS HI
  GROUP BY SIN, YEAR                        
  HAVING ((SUM(CSCH1) + SUM(PBack2) - SUM(CREVREC) - SUM(`CLIENT.REC2`) 
       - SUM(`CLIENT.SHORT2`) - SUM(`SPOUSE.SHORT2`) + SUM(ADJUSTMENT2)) <> 0)
) AS ms 
INNER JOIN Clients AS c 
  ON c.SIN = ms.SIN 
ORDER BY c.LNAME, c.FNAME, ms.SIN;

Open in new window

That actually occurs in several places in the SQL.  Check all of the sub-queries carefully.


Kent
Yes indeed Kdo. Thank you! :-)

mwvisa1, I ran your query and it ran in 13.422 seconds and gave me 1542 rows! Not quite 1546 but close!!

This has came a long ways from where I was this morning. You guys made my day!

I can't say thank you enough!
I noticed Kdo. You won't believe how much this is going to help me with the other 100+ Reports I have to fix. Thanks again!!
You are most welcome.  Hopefully, the missing 4 rows is somewhere within the fact that we consolidated some things and hopefully you find what tweaking needs to be done to resolve that final piece of the puzzle.

Best regards,

Kevin
By the way, I almost forgot, here is what I meant by my earlier thought on the inner GROUP BY -- not sure if my thinking is correct that performing this operation once is better than the 4 times.  Might be gaining efficiency since each of the UNION ALL statements is grouped into smaller result set before the UNION and GROUP BY , but worth the shot to see if it is more efficient to simply select the rows and then aggregate at once.  Sometimes I find this works better as a lot of the processing time often is the collation and ordering, etc. of the data which is why you saw that the COUNT() of your query was dramatically less time to process than the actual results since the results had to be sorted whereas the count only cares about existence.
SELECT    
  c.LastOffice, 
  c.LNAME, 
  c.FNAME, 
  ms.SIN, 
  ms.YEAR, 
  ms.CSCH1, 
  ms.CREVREC, 
  ms.CPBACK, 
  ms.CSCOL AS CLREC, 
  ms.ADJ2, 
  ms.AR 
FROM         
(
  SELECT     
    SIN, YEAR, SUM(CSCH1) AS CSCH1, SUM(CREVREC) + SUM(INT_AMT2) AS CREVREC
    , SUM(INT_AMT2) AS CINT, SUM(PBack2) AS CPBACK, SUM(`CLIENT.REC2`) AS CL
    , SUM(`CLIENT.SHORT2`) AS CS, SUM(`SPOUSE.SHORT2`) AS SS, SUM(ADJUSTMENT2) AS ADJ2
    , SUM(`CLIENT.REC2`) + SUM(`CLIENT.SHORT2`) + SUM(`SPOUSE.SHORT2`) AS CSCOL, SUM(CSCH1)                                                + SUM(COALESCE (PBack2, 0)) - SUM(COALESCE (CREVREC, 0)) - SUM(COALESCE ('CLIENT.REC2', 0)) - SUM(COALESCE ('CLIENT.SHORT2', 0)) - SUM(COALESCE ('SPOUSE.SHORT2', 0)) + SUM(COALESCE (ADJUSTMENT2, 0)) AS AR                        
  FROM          
  (
    SELECT 
      SIN, 
      YEAR, 
      0 AS AMT_REC, 
      0 AS INT_AMT, 
      CASE `Adinfo` WHEN 'P' THEN CKAMT END AS PBack2,   
      C_SCH1_REFUND AS CSCH1, 
      0 AS AMT_REC2, 
      0 AS INT_AMT2, 
      0 AS CREVREC, 
      0 AS `CLIENT.REC2`, 
      0 AS `CLIENT.SHORT2`, 
      0 AS `SPOUSE.SHORT2`, 
      0 AS ADJUSTMENT2 
    FROM Disbursements AS Disbursements_1 
    WHERE (CKDATE <= STR_TO_DATE('10/31/2010', '%m/%d/%Y')) 
    
    UNION ALL 
    
    SELECT SIN, YEAR
     , AMT_REC
     , INT_AMT
     , 0 AS PBack2, 0 AS CSCH1
     , AMT_REC AS AMT_REC2, INT_AMT AS INT_AMT2
     , CASE `Type` WHEN 'RV' THEN COALESCE(AMT_REC, 0) - COALESCE(INT_AMT, 0) END AS CREVREC
     , CASE `Type` WHEN 'CL' THEN AMT_REC END AS `CLIENT.REC2`
     , CASE `Type` WHEN 'CS' THEN AMT_REC END AS `CLIENT.SHORT2`
     , CASE `Type` WHEN 'SS' THEN AMT_REC END AS `SPOUSE.SHORT2`
     , 0 AS ADJUSTMENT2 
   FROM Receipts 
   WHERE (DATE_REC <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))                                                                                             

    UNION ALL                                                

    SELECT     
      strSIN, strYear, 
      0 AS AMT_REC, 
      0 AS INT_AMT, 
      0 AS PBack2, 
      0 AS CSCH1, 
      0 AS AMT_REC2, 
      0 AS INT_AMT2, 
      0 AS CREVREC, 
      0 AS `CLIENT.REC2`, 
      0 AS `CLIENT.SHORT2`, 
      0 AS `SPOUSE.SHORT2`, 
      decAMT AS ADJUSTMENT2                                                
    FROM         tblAdjustments AS tblAdjustments_1                                                
    WHERE     (Date <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))                                               
  ) AS HI
  GROUP BY SIN, YEAR                        
  HAVING ((SUM(CSCH1) + SUM(PBack2) - SUM(CREVREC) - SUM(`CLIENT.REC2`) 
       - SUM(`CLIENT.SHORT2`) - SUM(`SPOUSE.SHORT2`) + SUM(ADJUSTMENT2)) <> 0)
) AS ms 
INNER JOIN Clients AS c 
  ON c.SIN = ms.SIN 
ORDER BY c.LNAME, c.FNAME, ms.SIN;

Open in new window

mwvisa1, that gives me 1542 rows in 15.328 seconds vs 13.828 seconds.

Great suggestion tho. I never quite understood what you were trying to say but now I do.
I am glad and thanks for sharing the results.  Confirms that though less efficient looking from a code maintenance standpoint that the consolidation of records each level has to deal with is slightly more efficiency by having the multiple group by statements.  

Well let us know if we can do anything further to assist in finding the missing rows ...
(might be good to rerun the SQL server version and see if the 4 rows went away their two and it is just a timing thing ... another thought that may help in the migration process is to create a linked server to your MySQL system from MS SQL and then on the MS SQL side you can use EXCEPT keyword to get a quick result set of the rows that are different between the two queries -- http://msdn.microsoft.com/en-us/library/ms188055.aspx)
I can open up another question on this if you like as it's really not the same question as the original but I'm going to ask it because you two have been so unbelievably helpful to me!

I thought that with 'Union all', it would union all the records regardless if there is a record from table A that doesn't have a record in table B. That doesn't seem to be the case.

I found the missing records. They both have Invoices (CSCH1) but they don't have anything else associated with them. I can look them up in the client file and they're both identical.

If I run the following query, I actually find the missing records.

    SELECT  
      SIN,  
      YEAR,  
      0 AS AMT_REC,  
      0 AS INT_AMT,  
      SUM(CASE `Adinfo` WHEN 'P' THEN CKAMT END) AS PBack2,    
      SUM(COALESCE (C_SCH1_REFUND, 0)) AS CSCH1,  
      0 AS AMT_REC2,  
      0 AS INT_AMT2,  
      0 AS CREVREC,  
      0 AS `CLIENT.REC2`,  
      0 AS `CLIENT.SHORT2`,  
      0 AS `SPOUSE.SHORT2`,  
      0 AS ADJUSTMENT2  
    FROM Disbursements AS Disbursements_1  
    WHERE (CKDATE <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))  
    GROUP BY SIN, YEAR  


but if I run this query, the two records disappear.

SELECT  
      SIN,  
      YEAR,  
      0 AS AMT_REC,  
      0 AS INT_AMT,  
      SUM(CASE `Adinfo` WHEN 'P' THEN CKAMT END) AS PBack2,    
      SUM(COALESCE (C_SCH1_REFUND, 0)) AS CSCH1,  
      0 AS AMT_REC2,  
      0 AS INT_AMT2,  
      0 AS CREVREC,  
      0 AS `CLIENT.REC2`,  
      0 AS `CLIENT.SHORT2`,  
      0 AS `SPOUSE.SHORT2`,  
      0 AS ADJUSTMENT2  
    FROM Disbursements AS Disbursements_1  
    WHERE (CKDATE <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))  
    GROUP BY SIN, YEAR  
       UNION all  
    SELECT SIN, YEAR
     , SUM(AMT_REC) AS AMT_REC
     , SUM(INT_AMT) AS INT_AMT
     , 0 AS PBack2, 0 AS CSCH1
     , SUM(AMT_REC) AS AMT_REC2, SUM(INT_AMT) AS INT_AMT2
     , SUM(CASE `Type` WHEN 'RV' THEN COALESCE(AMT_REC, 0) - COALESCE(INT_AMT, 0) END) AS CREVREC
     , SUM(CASE `Type` WHEN 'CL' THEN AMT_REC END) AS `CLIENT.REC2`
     , SUM(CASE `Type` WHEN 'CS' THEN AMT_REC END) AS `CLIENT.SHORT2`
     , SUM(CASE `Type` WHEN 'SS' THEN AMT_REC END) AS `SPOUSE.SHORT2`
     , 0 AS ADJUSTMENT2
   FROM Receipts
   WHERE (DATE_REC <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))  
   GROUP BY SIN, YEAR  


Any suggestions?
Hi John,
This may be a "precedence" issue.  The final GROUP BY clause may not be operating as we think.

Try the query below.  It forces the final GROUP BY to affect only the low sub-query.


Kent

SELECT  
     SIN,  
     YEAR,  
     0 AS AMT_REC,  
     0 AS INT_AMT,  
     SUM(CASE `Adinfo` WHEN 'P' THEN CKAMT END) AS PBack2,    
     SUM(COALESCE (C_SCH1_REFUND, 0)) AS CSCH1,  
     0 AS AMT_REC2,  
     0 AS INT_AMT2,  
     0 AS CREVREC,  
     0 AS `CLIENT.REC2`,  
     0 AS `CLIENT.SHORT2`,  
     0 AS `SPOUSE.SHORT2`,  
     0 AS ADJUSTMENT2  
   FROM Disbursements AS Disbursements_1  
   WHERE (CKDATE <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))  
   GROUP BY SIN, YEAR  

      UNION all  
SELECT * FROM
(
   SELECT SIN, YEAR
     , SUM(AMT_REC) AS AMT_REC
     , SUM(INT_AMT) AS INT_AMT
     , 0 AS PBack2, 0 AS CSCH1
     , SUM(AMT_REC) AS AMT_REC2, SUM(INT_AMT) AS INT_AMT2
     , SUM(CASE `Type` WHEN 'RV' THEN COALESCE(AMT_REC, 0) - COALESCE(INT_AMT, 0) END) AS CREVREC
     , SUM(CASE `Type` WHEN 'CL' THEN AMT_REC END) AS `CLIENT.REC2`
     , SUM(CASE `Type` WHEN 'CS' THEN AMT_REC END) AS `CLIENT.SHORT2`
     , SUM(CASE `Type` WHEN 'SS' THEN AMT_REC END) AS `SPOUSE.SHORT2`
     , 0 AS ADJUSTMENT2
   FROM Receipts
   WHERE (DATE_REC <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))  
  GROUP BY SIN, YEAR  
) t0

Open in new window

I tested your query. The rows are still missing.
The GROUP BY operator should take precedence over the UNION ALL, so it may be the GROUP BY on the whole UNION result:

 GROUP BY SIN, YEAR                        
  HAVING ((SUM(CSCH1) + SUM(PBack2) - SUM(CREVREC) - SUM(`CLIENT.REC2`)
       - SUM(`CLIENT.SHORT2`) - SUM(`SPOUSE.SHORT2`) + SUM(ADJUSTMENT2)) <> 0)

If you run JUST the UNION portion , do the records still show up and if so are there any other rows for the same SIN, YEAR that could possibly make the formula in the HAVING clause equate to 0 ??
I can't figure out if the records show up without the group by as it's 3 clients that aren't showing up.

I'm going to run the query again with

((SUM(CSCH1) + SUM(PBack2) - SUM(CREVREC) - SUM(`CLIENT.REC2`)
       - SUM(`CLIENT.SHORT2`) - SUM(`SPOUSE.SHORT2`) + SUM(ADJUSTMENT2)) = 0)

(instead of <> 0) to see if the 3 records show up there. If they do than we know where our problem is.
The records don't show up when I attempt the query with = 0 instead of <> 0.
Hi John,

If you take your last query and run the upper and lower queries separately, do the "missing" lines appear from the upper or lower query?


Kent
If I run the top query it does. If I run the lower query it does not.

The top query is the only part that contains any information on the missing records. The clients have invoices and nothing else.

The lower query doesn't contain any information on the missing records.
There's no guaranteed sort order of the resulting rows.  Have you piped the output through `grep` or checked it with an editor?


Kent
I checked the rows by the costs of the invoices (they are unique numbers).

No I haven't piped the output through `grep`. I have no idea what you're referring to unfortunately.
Ok.  Let's "adjust" the query a bit to guarantee the right order.

All of the rows for every value of SIN should be adjacent.
SELECT * FROM
(
SELECT  
     SIN,  
     YEAR,  
     0 AS AMT_REC,  
     0 AS INT_AMT,  
     SUM(CASE `Adinfo` WHEN 'P' THEN CKAMT END) AS PBack2,    
     SUM(COALESCE (C_SCH1_REFUND, 0)) AS CSCH1,  
     0 AS AMT_REC2,  
     0 AS INT_AMT2,  
     0 AS CREVREC,  
     0 AS `CLIENT.REC2`,  
     0 AS `CLIENT.SHORT2`,  
     0 AS `SPOUSE.SHORT2`,  
     0 AS ADJUSTMENT2  
   FROM Disbursements AS Disbursements_1  
   WHERE (CKDATE <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))  
   GROUP BY SIN, YEAR  
      UNION all  
   SELECT SIN, YEAR
     , SUM(AMT_REC) AS AMT_REC
     , SUM(INT_AMT) AS INT_AMT
     , 0 AS PBack2, 0 AS CSCH1
     , SUM(AMT_REC) AS AMT_REC2, SUM(INT_AMT) AS INT_AMT2
     , SUM(CASE `Type` WHEN 'RV' THEN COALESCE(AMT_REC, 0) - COALESCE(INT_AMT, 0) END) AS CREVREC
     , SUM(CASE `Type` WHEN 'CL' THEN AMT_REC END) AS `CLIENT.REC2`
     , SUM(CASE `Type` WHEN 'CS' THEN AMT_REC END) AS `CLIENT.SHORT2`
     , SUM(CASE `Type` WHEN 'SS' THEN AMT_REC END) AS `SPOUSE.SHORT2`
     , 0 AS ADJUSTMENT2
   FROM Receipts
   WHERE (DATE_REC <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))  
  GROUP BY SIN, YEAR  
) t0
order by 1, 2

Open in new window

I used your query and the 3 rows are still missing.
It is very strange for UNION ALL to removing rows.  Even with UNION, it would only remove duplicates and so if the rows only exist in the first query and not the second then a UNION of the two should still contain those rows.  Something very strange is occurring there...
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
I agree. This certainly is a challenge that is pretty odd to figure out.

All 3 rows that are missing only have invoices but no receipts associated with them. That is the only common factor that I can see.
Kent,

That was probably my fault, I think I confused SUM() and COUNT(), but probably because the assumption is that there is at least one row with a valid value.  So adding 5 and 0 resulted the same as 5 and null where null was optimized out.  However, if all the rows are null then the result is indeed null, so I may have been wrong in recommendation of that -- again I figured the outer group by doing aggregate on the NULL and the 0 literals of the other UNION'd queries would result in 0.

This shouldn't cause the issue, but in your formula where you check :
((SUM(CSCH1) + SUM(PBack2) - SUM(CREVREC) - SUM(`CLIENT.REC2`)
       - SUM(`CLIENT.SHORT2`) - SUM(`SPOUSE.SHORT2`) + SUM(ADJUSTMENT2)) = 0)

It could make a difference as NULL + anything will be NULL which can't be directly compared and so will be excluded from the final result set.

Kevin
I would recommend putting this on the outside instead though :

((COALESCE(SUM(CSCH1), 0) + COALESCE(SUM(PBack2), 0)
       - COALESCE(SUM(CREVREC), 0) - COALESCE(SUM(`CLIENT.REC2`) , 0)
       - COALESCE(SUM(`CLIENT.SHORT2`), 0) - COALESCE(SUM(`SPOUSE.SHORT2`), 0)
      + COALESCE(SUM(ADJUSTMENT2), 0)) = 0)
Kdo, you solved the 2nd mystery.

I did the changes as you suggested and the 3 records decided to show themselves.

THANK YOU VERY MUCH!!

I hope to keep in contact with you two if I have any more questions on this. It's pretty amazing how you guys can 'pick and poke' at the problem until you solve it. I may have spent a week trying to figure out this problem on my own.

THANK YOU AGAIN!!
Actually, when I reformat the lower query to match the upper,4 more rows have the same potential to return NULL in certain data columns.

I'm probably chasing our tails, but NULLS do have weird effects on things.



Kent

SELECT * FROM
(
SELECT  
     SIN,  
     YEAR,  
     0 AS AMT_REC,  
     0 AS INT_AMT,  
     SUM(CASE `Adinfo` WHEN 'P' THEN CKAMT END) AS PBack2,    
     SUM(COALESCE (C_SCH1_REFUND, 0)) AS CSCH1,  
     0 AS AMT_REC2,  
     0 AS INT_AMT2,  
     0 AS CREVREC,  
     0 AS `CLIENT.REC2`,  
     0 AS `CLIENT.SHORT2`,  
     0 AS `SPOUSE.SHORT2`,  
     0 AS ADJUSTMENT2  
   FROM Disbursements AS Disbursements_1  
   WHERE (CKDATE <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))  
   GROUP BY SIN, YEAR  
      UNION all  
   SELECT 
     SIN, 
     YEAR, 
     SUM(AMT_REC) AS AMT_REC, 
     SUM(INT_AMT) AS INT_AMT, 
     0 AS PBack2, 
     0 AS CSCH1, 
     SUM(AMT_REC) AS AMT_REC2, 
     SUM(INT_AMT) AS INT_AMT2, 
     SUM(CASE `Type` WHEN 'RV' THEN COALESCE(AMT_REC, 0) - COALESCE(INT_AMT, 0) ELSE 0 END) AS CREVREC, 
     SUM(CASE `Type` WHEN 'CL' THEN AMT_REC ELSE 0 END) AS `CLIENT.REC2`, 
     SUM(CASE `Type` WHEN 'CS' THEN AMT_REC ELSE 0 END) AS `CLIENT.SHORT2`, 
     SUM(CASE `Type` WHEN 'SS' THEN AMT_REC ELSE 0 END) AS `SPOUSE.SHORT2`, 
     0 AS ADJUSTMENT2
   FROM Receipts
   WHERE (DATE_REC <= STR_TO_DATE('10/31/2010', '%m/%d/%Y'))  
  GROUP BY SIN, YEAR  
) t0
order by 1, 2

Open in new window

Yep.  NULLS do have weird behavior.....


I never ever thought in all of my programming career I would have so much troubles with nothing at all.

This isn't my first battle with nulls, nor is it my last.

Thank you again!!
Thank you very much!