select max of rows for each dimension

Table : THE_TABLE
Columns:
THE_TABLE_ID int
DATE_ID date_id  - (day since Jan 1 1970)
TIME_ID time_id - subset of DATE_ID  (the minute of the day (1 to 60 minutes *24))
DIMENSION_ID - the dimension to show the latest value for each unique entry

How do I select the row with the highest DATE_ID and TIME_ID for each DIMENSION_ID? All ID columns are indexed, so I do not want to use a function. I need this to work on DB2, MSSQL,. and Oracle.

This code example works to find the latest of all rows, but I need it to return a row for each unique DIMESNION_ID.

CONSTRAINTS:
SQL must work in Oracle, DB2, and MSSQL
SQL must use the indexes on DATE_ID and TIME_ID columns.
SELECT *
  FROM (SELECT t.*
            , ROW_NUMBER() OVER (ORDER BY DATE_ID DESC, TIME_ID DESC) r
         FROM the_table t
      ) sq
WHERE sq.r <= (select count(distinct DIMENSION_ID) from the_table)

Open in new window

mbevilacquaAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

JestersGrindCommented:
Try this.

Greg




;WITH CTE
AS
(
    SELECT DIMENSION_ID, DATE_ID, TIME_ID,
          ROW_NUMBER() OVER (PARTITION BY DIMENSION_ID ORDER BY DATE_ID DESC, TIME_ID DESC) AS Row
    FROM THE_TABLE       
)
SELECT DIMENSION_ID, DATE_ID, TIME_ID
FROM CTE
WHERE Row = 1

Open in new window

0
cminearCommented:
By my reading, you'd like the maximum date_id for each dimension_id, and then get the maximum time_id from that set for each dimension, and finally include the_table_id for rows that have those max. dates and times for each dimension.  If so, I believe the query below should work, in general.  It should be generic SQL which should work with each database you stated, but I have not tested it specifically on any of them.  As to your second constraint, well, that would depend on the query optimizers for each database engine.  There's no reason why they wouldn't use the indices for these queries (as there is no external functions involved), but I obviously cannot say for certain.

In the SQL, here is what I mean for each of the subselect abbreviations:
  mtpdd -- max time per date and dimension
  mdpd  -- max date per dimension
  mdtpd -- max time and date per dimension

Starting with the first subselect (mtpdd), we do a select which determines the maximum time for each dimension-date pair.  Yes, this is more than what you want, but please be patient.  The next subselect (mdpd) determines the maximum date for each dimension.  When we do a(n INNER) JOIN of mtpdd with mdpd, we get the maximum date and time for each dimension (mdtpd).  If all you want is just this information, then you're done.  But my interpretation of your question made me think you wanted the the_table_id for each row which has the max date and time for a dimension.  So we have perform the JOIN between the_table and mdtpd.

The caveat is that you would get more than one row for any dimension, if there are more than one row with the maximum date and time for a dimension.  So if you have code assuming only one row per dimension, that may break or not work as expected.  But if your data is such that that would never be the case, then of course you'd only get one row per dimension.
SELECT t.* FROM
the_table t 
JOIN
(SELECT mtpdd.dimension_id, mtpdd.date_id, mtpdd.max_time AS time_id FROM
   (SELECT dimension_id, date_id, max(time_id) as max_time 
    FROM the_table 
    GROUP BY dimension_id, date_id) AS mtpdd
   JOIN 
   (SELECT dimension_id, max(date_id) as max_date 
    FROM the_table 
    GROUP BY dimension_id) AS mdpd
   ON (mtpdd.dimension_id = mdpd.dimension_id
       AND mtpdd.date_id = mdpd.max_date)) 
AS mdtpd
ON (t.dimension_id = mdtpd.dimension_id AND t.date_id = mdtpd.date_id
    AND t.time_id = mdtpd.time_id)
ORDER BY dimension_id

Open in new window

0
cminearCommented:
After a little extra reading, you may want to replace all occurrences of "JOIN" in my query above with "INNER JOIN".  DB2 may not like "JOIN" alone, and we don't want an OUTER JOIN to be assumed.
0
cminearCommented:
With a little tweaking, I got model-specific versions of my query working with Oracle 10g and SQL Server 2005.  In addition to the 'INNER JOIN' change I suggested before, it would appear that Oracle doesn't like the 'AS' between the subselect and the label for the subselect.  Hopefully, this query would also easily work with DB2 (to which I do not have convenient access).
SELECT t.* FROM
the_table t 
INNER JOIN
(SELECT mtpdd.dimension_id, mtpdd.date_id, mtpdd.max_time AS time_id FROM
   (SELECT dimension_id, date_id, max(time_id) as max_time 
    FROM the_table 
    GROUP BY dimension_id, date_id) mtpdd
   INNER JOIN 
   (SELECT dimension_id, max(date_id) as max_date 
    FROM the_table 
    GROUP BY dimension_id) mdpd
   ON (mtpdd.dimension_id = mdpd.dimension_id
       AND mtpdd.date_id = mdpd.max_date)) mdtpd
ON (t.dimension_id = mdtpd.dimension_id AND t.date_id = mdtpd.date_id
    AND t.time_id = mdtpd.time_id)
ORDER BY dimension_id

Open in new window

0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Query Syntax

From novice to tech pro — start learning today.