How do I compare row values in a MySQL table so that I can count how many have increased or decreased within the grouped values?

I have the following table which records attainment levels of a Client in different areas on a review date.

clientcode  review_date  area1_level  area2_level  area3_level  area4_level
C1013       2013-01-14             5            8            5            4
C1013       2013-04-04             5            8            5            5
C1013       2013-06-25             5            8            5            5
C1028       2014-07-07             9            5           10            9
C1031       2013-10-25             5            7            5            8
C1031       2014-01-21             3            2            5            4
C1061       2012-09-07             8           10            3            7
C1061       2012-12-04             8           10            3            7
C1061       2013-03-05             7           10            3            7
C1061       2013-06-13             7           10            4            7
C1061       2013-09-12             7           10            4            7
C1068       2012-12-17             8            1            8            9
C1068       2013-03-07             8            1            8            9
C1118       2014-03-11             8            5            8            7

Each client has a unique client code.

I would like to be able to
1) Compare the first review date values with the latest review date values so that I can count how many clients have ‘improved’, ‘worsened’ or shown ‘no change’ in each area.
2) Compare the latest review date values with the previous review date values in the same manner.

If there was only one review then I need to count this as a ‘no change’ in each area.
I wonder if the best way to do this is to join the first review columns to the latest review value columns, then compare the values with a comparison column for each area and then count the result.

The table above is the result of a complex query select statement already.

I have the following questions.
1. Can this be done easily with a single, simpler query, without having to join the complex query which created the above with itself?
2. If I have a series of review dates, I can get the latest review date values with “GROUP BY clientcode, then ORDER BY review_date DESC LIMIT 1”, but how do I select the date before the latest review date?
3. If there aren't any previous review date values, how can I instead take the one review's values and use them again, thereby showing ‘no change’?

Thanks for your help.

EICT

ASKER

I've resolved No. 1 by:

1. selecting the results to get the table as above, but ordering by review_date DESC
2. then selecting from this table with a GROUP BY clause to receive the first row of each clientcode group (the latest review), called table1
3. selecting the results to get the table as above, but ordering by review_date ASC
4. then selecting from this table with a GROUP BY clause to receive the first row of each clientcode group (the first review), called table2
5. joining subqueries table1 and table2
6. selecting the result of all this and then using IF() expressions in the SELECT clause to compare the first and last values.

I'm not sure how to complete No. 2, though, and get the review before the last review, if that makes sense.
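
The steps above depend on which row MySQL happens to keep for each group. A possible alternative, offered only as a sketch: find each client's first and latest review dates with MIN() and MAX(), then join back to the rows on those dates. It assumes the data sits in a table or view called Results (a placeholder name), that each client has at most one row per review_date, and that a higher level counts as ‘improved’. Only area1 is shown; the other areas would repeat the same three expressions.

```sql
-- Sketch only: "Results" is a placeholder name, and "higher level = improved"
-- is an assumption about the data.
SELECT
      SUM(l.area1_level > f.area1_level) AS area1_improved   -- MySQL comparisons yield 1 or 0
    , SUM(l.area1_level < f.area1_level) AS area1_worsened
    , SUM(l.area1_level = f.area1_level) AS area1_no_change
FROM (
      -- one row per client with the first and latest review dates
      SELECT
            clientcode
          , MIN(review_date) AS first_date
          , MAX(review_date) AS last_date
      FROM Results
      GROUP BY
            clientcode
     ) d
      -- f = row of the first review, l = row of the latest review
      INNER JOIN Results f ON  f.clientcode  = d.clientcode
                           AND f.review_date = d.first_date
      INNER JOIN Results l ON  l.clientcode  = d.clientcode
                           AND l.review_date = d.last_date
;
```

A client with a single review has first_date equal to last_date, so f and l are the same row and the client is counted under ‘no change’. Against the sample rows in the question, area1 would come out as 0 improved, 2 worsened (C1031 and C1061) and 4 no change.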

PortletPaul

How close or useful is this?

| CLIENTCODE | AREA | NUM_RECORDS |   EARLIEST |     LATEST | MIN_LEVEL | MAX_LEVEL | DELTA |
|------------|------|-------------|------------|------------|-----------|-----------|-------|
|      C1013 |    1 |           3 | 2013-01-14 | 2013-06-25 |         5 |         5 |     0 |
|      C1013 |    2 |           3 | 2013-01-14 | 2013-06-25 |         8 |         8 |     0 |
|      C1013 |    3 |           3 | 2013-01-14 | 2013-06-25 |         5 |         5 |     0 |
|      C1013 |    4 |           3 | 2013-01-14 | 2013-06-25 |         4 |         5 |     1 |
|      C1028 |    1 |           1 | 2014-07-07 | 2014-07-07 |         9 |         9 |     0 |
|      C1028 |    2 |           1 | 2014-07-07 | 2014-07-07 |         5 |         5 |     0 |
|      C1028 |    3 |           1 | 2014-07-07 | 2014-07-07 |        10 |        10 |     0 |
|      C1028 |    4 |           1 | 2014-07-07 | 2014-07-07 |         9 |         9 |     0 |
|      C1031 |    1 |           2 | 2013-10-25 | 2014-01-21 |         3 |         5 |     2 |
|      C1031 |    2 |           2 | 2013-10-25 | 2014-01-21 |         2 |         7 |     5 |
|      C1031 |    3 |           2 | 2013-10-25 | 2014-01-21 |         5 |         5 |     0 |
|      C1031 |    4 |           2 | 2013-10-25 | 2014-01-21 |         4 |         8 |     4 |
|      C1061 |    1 |           5 | 2012-09-07 | 2013-09-12 |         7 |         8 |     1 |
|      C1061 |    2 |           5 | 2012-09-07 | 2013-09-12 |        10 |        10 |     0 |
|      C1061 |    3 |           5 | 2012-09-07 | 2013-09-12 |         3 |         4 |     1 |
|      C1061 |    4 |           5 | 2012-09-07 | 2013-09-12 |         7 |         7 |     0 |
|      C1068 |    1 |           2 | 2012-12-17 | 2013-03-07 |         8 |         8 |     0 |
|      C1068 |    2 |           2 | 2012-12-17 | 2013-03-07 |         1 |         1 |     0 |
|      C1068 |    3 |           2 | 2012-12-17 | 2013-03-07 |         8 |         8 |     0 |
|      C1068 |    4 |           2 | 2012-12-17 | 2013-03-07 |         9 |         9 |     0 |
|      C1118 |    1 |           1 | 2014-03-11 | 2014-03-11 |         8 |         8 |     0 |
|      C1118 |    2 |           1 | 2014-03-11 | 2014-03-11 |         5 |         5 |     0 |
|      C1118 |    3 |           1 | 2014-03-11 | 2014-03-11 |         8 |         8 |     0 |
|      C1118 |    4 |           1 | 2014-03-11 | 2014-03-11 |         7 |         7 |     0 |

It was produced by the following query:

SELECT
      clientcode
    , area
    , COUNT(*)                                 AS num_records
    , date_format(MIN(review_date),'%Y-%m-%d') AS earliest
    , date_format(MAX(review_date),'%Y-%m-%d') AS latest
    , MIN(level)                               AS min_level
    , MAX(level)                               AS max_level
    , MAX(level) - MIN(level)                  AS delta
FROM (
      SELECT
            clientcode
          , review_date
          , 1           AS area
          , area1_level AS level
      FROM Results
      UNION ALL
      SELECT
            clientcode
          , review_date
          , 2           AS area
          , area2_level AS level
      FROM Results
      UNION ALL
      SELECT
            clientcode
          , review_date
          , 3           AS area
          , area3_level AS level
      FROM Results
      UNION ALL
      SELECT
            clientcode
          , review_date
          , 4           AS area
          , area4_level AS level
      FROM Results
     ) x
GROUP BY
      clientcode
    , area
;

Note: most of the query uses UNION ALL to "unpivot" your data into a normalized form, which makes calculations much simpler. I do understand this query isn't exactly what you asked for, but what you have asked for needs more than one query.

ASKER CERTIFIED SOLUTION

PortletPaul

This solution is only available to members.

EICT

ASKER

Hi PortletPaul,
The second query is exactly what I want to get the LEVEL_AT_MIN_DT and LEVEL_AT_MAX_DT.

Before I close this question, do you have any idea how I can pick the Maximum Date and the Date Before? I need this as well as the Max/Min Dates.

For example, using my data above for client C1061, I want data from the rows with review dates 2013-06-13 and 2013-09-12.

Thanks

EICT

ASKER

I suppose I could do a separate subquery to find all the dates less than the Max date (i.e. all the dates less than 2013-09-12) and then pick the max from this result.

Regards,
Matt
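
The subquery idea above could look roughly like the sketch below. It is only an illustration: Results is a placeholder for the table or view holding the rows, and previous_date and latest_date are made-up column aliases. The LEFT JOIN keeps clients that have just one review, and COALESCE() then reuses the latest date for them so that they later compare as ‘no change’.

```sql
-- Sketch only: "Results" and the column aliases are placeholders.
SELECT
      m.clientcode
    , m.latest_date
      -- the newest date that is older than the latest one;
      -- falls back to the latest date when no earlier review exists
    , COALESCE(MAX(r.review_date), m.latest_date) AS previous_date
FROM (
      -- latest review date per client
      SELECT
            clientcode
          , MAX(review_date) AS latest_date
      FROM Results
      GROUP BY
            clientcode
     ) m
      LEFT JOIN Results r ON  r.clientcode  = m.clientcode
                          AND r.review_date < m.latest_date
GROUP BY
      m.clientcode
    , m.latest_date
;
```

For client C1061 in the sample data this pairs 2013-09-12 with 2013-06-13; for C1028, which has a single review, both columns hold 2014-07-07. Joining Results back on each of the two dates would then give the level columns to compare.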

PortletPaul

>>" how I can pick the Maximum Date and the Date Before"

You need the equivalent of row_number() over(partition by ... order by ...). Regrettably, this "analytic function" isn't available in MySQL versions before 8.0.

There was a nice blog on this topic which went missing, but here is a link to it that still works

In essence you attach two @variables to the rows of a query using a cross join. Then you set values into those variables on each row so that you establish a "row number" starting at 1, ordered by whatever fields you choose. This is "partitioned" (which is similar to "group by") so each partition gets to start at 1.

In the end you can filter for row numbers 1 and 2.

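SELECT
        clientcode
      , review_date
      , RowNumber
FROM (
      SELECT
            -- restart at 1 whenever the client changes, otherwise increment
            @row_num := IF(@prev_value = r.clientcode, @row_num + 1, 1) AS RowNumber
          , r.clientcode
          , r.review_date
            -- remember this row's client for the comparison on the next row
          , @prev_value := r.clientcode AS prev_client
      FROM Results r
            CROSS JOIN (
                        -- initialise both variables once
                        SELECT @row_num := 1, @prev_value := ''
                       ) vars
      ORDER BY
            r.clientcode
          , r.review_date DESC
     ) sq
WHERE RowNumber < 3   -- keep the latest review and the one before it
;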

http://sqlfiddle.com/#!9/9d789/18
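
For readers on MySQL 8.0 or later, where window functions are available, the same numbering can be sketched without user variables. This is not part of the original answer; it assumes the same Results table, and rn is just an illustrative alias.

```sql
-- Sketch for MySQL 8.0+ only; "rn" is an illustrative alias.
SELECT
        clientcode
      , review_date
      , rn
FROM (
      SELECT
            clientcode
          , review_date
            -- number each client's reviews from newest (1) to oldest
          , ROW_NUMBER() OVER (PARTITION BY clientcode
                               ORDER BY review_date DESC) AS rn
      FROM Results
     ) sq
WHERE rn < 3   -- the latest review and the one before it
;
```

Clients with a single review return only the rn = 1 row, so the ‘no change’ case still has to be handled when the two rows are compared.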

EICT

ASKER

Thanks for your help.