Need to group (sum) within a subset of data

I need to sum up a set of numbers that have a particular status - however, I do not have enough data to merely use a group by on it.  Here is the original data:

CarID      Order    Date/Time                         Miles      Status
5             546       5/2/2009  01:30:00              5             Empty
5             546       5/2/2009  02:00:00             10            Empty
5             546       5/2/2009  03:30:00             200          Full
5             546       5/2/2009  06:30:00              5             Empty
5             546       5/2/2009  07:30:00             25            Empty
48           503       5/2/2009  01:00:00              3             Empty
48           503       5/2/2009  02:00:00             20            Full
48           503       5/2/2009  03:00:00             30            Full
48           503       5/2/2009  04:00:00             10            Empty

I need to sum the miles based on the carID, OrderID, and status, but also keep the latest date/time.  Done correctly, the data would look like this:
CarID      Order    Date/Time                         Miles         Status
5             546       5/2/2009  02:00:00             15            Empty
5             546       5/2/2009  03:30:00             200          Full
5             546       5/2/2009  07:30:00             30            Empty
48           503       5/2/2009  01:00:00              3             Empty
48           503       5/2/2009  03:00:00             50            Full
48           503       5/2/2009  04:00:00             10            Empty

I cannot group on CarID, OrderID, and Status, as that would sum ALL empty miles.  I need to sum all the Empty miles for a car & order until I get to a full status, then sum all the full miles until I get to an empty statuts, etc.  Essentially, I'm summing the empty and full miles in the date/time order that they occured.

Having a brain fart - or maybe this is one of those 'simple to explain, hard to execute' situations?
LVL 1
LogisticsOneAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

JR2003Commented:
You just need a group by and Max(DateTime)
SELECT [CarID], [Order], Max([DateTime]), [Miles], [Status]
  FROM myTable
 GROUP BY [CarID], [Order], [Miles], [Status]

Open in new window

0
JR2003Commented:
On second reading of your problem I see the problem. Personally, I would write a stored procedure with a cursor in it to get the data and test each row against the previous one. However, you can however do it with a single SQL statement given below. There is an article describing how it works here: http://www.sqlteam.com/article/detecting-runs-or-streaks-in-your-data

SELECT A.[CarID], A.[Order], Max(A.[DateTime]) [DateTime], Sum(A.[Miles]), A.[Status]
  FROM (SELECT [CarID],
               [Order],
               [DateTime],
               [Miles],
               [Status],
               (SELECT Count(*)
                  FROM myTable T1
                 WHERE T1.Status <> T2.Status
                   AND T1.[Datetime] <= T2.[Datetime]
                   AND T1.CarId = T2.CarId) as Group1
          FROM myTable T2) A
  GROUP BY A.[CarID], A.[Order],  A.[Status], A.Group1
  ORDER BY CarId, [DateTime]

Open in new window

0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
LogisticsOneAuthor Commented:
Okay, I've got the query put together, but I have a meeting to go to (Arrrgh!) - I can't wait to try it out after the meeting!
0
The Ultimate Tool Kit for Technolgy Solution Provi

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy for valuable how-to assets including sample agreements, checklists, flowcharts, and more!

LogisticsOneAuthor Commented:
JR2003. IS. THE. BOMB.

Had to tweak the query just a bit, but the query you provided, along with the link, was exactly the direction I needed!!

SELECT A.[CarID], A.[Order], Max(A.[DateTime]), Sum(A.[Miles]), A.[Status]
  FROM (SELECT [CarID],
               [Order],
               [DateTime],
               [Miles],
               [Status],
               (SELECT Count(*)
                  FROM myTable T1
                 WHERE T1.Status <> T2.Status
                   AND T1.[Datetime] <= T2.[Datetime]
                   AND T1.CarId = T2.CarId) as Group1
          FROM myTable T2) A
  GROUP BY A.[CarID], A.[Order],  A.[Status], A.Group1
  ORDER BY CarId, Max(A.[DateTime])
0
LogisticsOneAuthor Commented:
Awesome!
0
JR2003Commented:
Just watch out this query doesn't bite you from behind.
It will work well on very small amounts of data but if there are a lot of rows then a cursor and a bit of row by row processing will be a lot quicker.
0
LogisticsOneAuthor Commented:
When you say 'small amounts' - about how many records is that?
0
JR2003Commented:
For each row returned in the reults it executes the query
(SELECT Count(*)
                  FROM myTable T1
                 WHERE T1.Status <> T2.Status
                   AND T1.[Datetime] <= T2.[Datetime]
                   AND T1.CarId = T2.CarId) as Group1  
Depending how long this takes and how many rows you receive back will affect the performance.
So a lot will depend on how many rows there are for each CarId. If there are not too many rows per CardId it will probably be ok.
I can't say how many rows will cause unacceptable performance as that would depend on a lot of variables. The best thing to do is try the query on real data and see if its performance is acceptable. The addition of indexes will also help in the example you gave an index on myTable(CarId, DataTime, Status) would help the performance.
The advantage of implmenting it with a cursor is that you would have a normal query and just store the previous row to compare to the current row so it is a more linear process and so more scalable.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Databases

From novice to tech pro — start learning today.