Link to home
Start Free TrialLog in
Avatar of rwheeler23
rwheeler23Flag for United States of America

asked on

SQL Syntax sum by highest version needed

I have a table with a structure very similar to this:

ORDRNMBR,ORDRVRSN,ORDRDATE,ORDRAMNT where in this case each order may have multiple versions and hence the need for the ORDRVRSN field.

If it were not for that field a simple SELECT SUM(ORDRAMNT) would do. However for the 5% of the orders that do have multiple versions when it comes to summing I only want the order amount associated with the highest version number. So my question is what is the proper syntax to only include the highest version number for these orders while also including all the other orders in the total.
SOLUTION
Avatar of Russell Fox
Russell Fox
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of rwheeler23

ASKER

The issue is I only want one of the amounts included in the total. For example:

ORDRNMBR,ORDRVRSN,ORDRDATE,ORDRAMNT
1000,0,12/31/15,1000
1001,0,12/30/15,2000
1002,0,12/29/15,3000
1002,1,12/29/15,4000

So the sum needs to include only 1000,2000 and 4000 because order 1002 has a second version so I only want the highest version amount which is 4000. The 3000 is not to be included because it is the first version of this order.
That should work then. If you just want the sum, remove the other fields:
; WITH cteOrders
AS (
    SELECT 
        ORDRNMBR
        ,ORDRVRSN
        ,ORDRDATE
        ,ORDRAMNT
        , ROW_NUMBER() OVER (PARTITION BY ORDRNMBR ORDER BY ORDRVRSN DESK) AS VersionRank
    FROM ORDER_TABLE
)
SELECT SUM(ORDRAMNT) AS ORDRAMNT_TOTAL
FROM cteOrders
WHERE VersionRank = 1

Open in new window

If you want to see what the ROW_NUMBER is producing, select everything from the CTE to make sure the new VersionRank field is really giving you what you want to see:
; WITH cteOrders
AS (
    SELECT 
        ORDRNMBR
        ,ORDRVRSN
        ,ORDRDATE
        ,ORDRAMNT
        , ROW_NUMBER() OVER (PARTITION BY ORDRNMBR ORDER BY ORDRVRSN DESK) AS VersionRank
    FROM ORDER_TABLE
)
SELECT *
FROM cteOrders
ORDER BY ORDRNMBR, ORDRVRSN, VersionRank

Open in new window

Avatar of PortletPaul
"DESK" should be DESC in the above

It isn't essential (or in my view even desirable) to use a CTE because a simple "derived table" provides the same result.
SELECT
      *
FROM (
      SELECT
            ORDRNMBR
          , ORDRVRSN
          , ORDRDATE
          , ORDRAMNT
          , ROW_NUMBER() OVER (PARTITION BY ORDRNMBR 
                              ORDER BY ORDRVRSN DESC) AS VersionRank
      FROM ORDER_TABLE
    ) AS derived
ORDER BY
      ORDRNMBR
    , ORDRVRSN
    , VersionRank

Open in new window

no points please
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
The query by Russell Fox which does - correctly - include the where clause for versionrank = 1

I forgot to include it - sorry
all I wanted to point out is that a derived table can be used, and that a CTE isn't mandatory.

RANK() can result in more than one row (within a single partition) sharing the same value

ROW_NUMBER() does not do that. I would stay with row_number myself.
Thank you all for your input on this. Depending on who is working on any given day, this query may be run repeatedly and has to parse thousands of records.  Is there any appreciable performance difference between these two approaches?
A better approach might be to add a flag to the table, IsActive, BIT, default = 1, and when you create a new version of an order you deactivate any prior versions. Then in your query just add "WHERE IsActive = 1".
Let me investigate the ramifications of doing that.