We help IT Professionals succeed at work.

We've partnered with Certified Experts, Carl Webster and Richard Faulkner, to bring you two Citrix podcasts. Learn about 2020 trends and get answers to your biggest Citrix questions!Listen Now

x

Query with a running total

GnarOlak
GnarOlak asked
on
Medium Priority
584 Views
Last Modified: 2008-02-01
I have a table that includes a start and stop time for a series of events.  What I need to know is at any arbitrary time how many of these events are "ON".  That's the easy part.  I can just count the number of events where the start and stop time stradles the given time.

Now what I need to do is determine for a given day the maximum number of events that are "ON" at the same time.

Here is my scheme and I need some help.  However, if you can think of a better way to do this then by all means speak up.

I'm selecting all of the start times along with a +1 delta column and unioning that with all of the end times with a -1 delta column.  This is then ordered by the time so I get a list of times along with a +1 or -1 value.  I've also included a priority column with 'a' for ON and 'b' for OFF so that ON times are sorted before OFF times when they are the same.  Now the number of events that are ON at any of the "transition times" is just the running sum of the delta column.  What I don't know how to do (other than in a loop) is calculate that running sum.  Is there a function to do that?

Once I have that running sum I can just select max from the result and get my value.

Here is a sample of the data (time, delta, running total) with the running total added manually:

2006-10-06-12.55.10.000000           1 1
2006-10-06-13.01.04.000000          -1 0
2006-10-06-14.23.26.000000           1 1
2006-10-06-14.32.04.000000           1 2
2006-10-06-14.32.04.000000          -1 1
2006-10-06-14.34.11.000000           1 2
2006-10-06-14.34.11.000000           1 3
2006-10-06-14.34.11.000000          -1 2
2006-10-06-14.34.11.000000          -1 1
2006-10-06-14.34.21.000000           1 2
2006-10-06-14.34.21.000000           1 3
2006-10-06-14.34.21.000000          -1 2
2006-10-06-14.34.22.000000          -1 1
2006-10-06-14.36.08.000000           1 2
2006-10-06-14.36.08.000000           1 3
2006-10-06-14.36.08.000000          -1 2
2006-10-06-14.36.08.000000          -1 1
2006-10-06-14.36.21.000000           1 2
2006-10-06-14.36.21.000000           1 3
2006-10-06-14.36.21.000000          -1 2
2006-10-06-14.36.22.000000          -1 1
2006-10-06-14.36.25.000000           1 2
2006-10-06-14.36.25.000000           1 3
Comment
Watch Question

Commented:
if im not mistaken you already posted a similar type of question where the answer was to use temp tables to select a min or max value:
with table1 as (query 1)
table2 as (query2)
etc etc then
(select max value from table1, table2 where table1.col=table2.col)
i think this is simple way without getting into very heavy sql

other way is via stored procedure where you can loop through fetch

Not the solution you were looking for? Getting a personalized solution is easy.

Ask the Experts
Hi

i have a solution for you, but i made some assumptions :
1) each event has an event_id
2) an event can not start and end at the exact same timestamp
3) on the data that i tested, no two rows had the same timestamp value, but it should not matter as long as rule 2 applies

the logic :
join your table with itself and for every event that is on, you select all the event that started before it,
and has not finished yet
the sql for that :
select  t1.start_time, t1.event_id , count(*) + 1
from momi.tt t1, momi.tt t2
where t1.event_id <> t2.event_id
and t1.event_action='ON'
and t1.start_time > t2.start_time
and t2.event_action='ON'
and not exists (select 1 from momi.tt t3
                       where t3.event_id = t2.event_id and t3.event_action = 'CLOSE' and t3.start_time <= t1.start_time ) group by t1.start_time, t1.event_id order by 3 desc, 1 asc

that will return a table that contains for each event_id, it's start time and a count of all the event that were opened when it started (you add 1 for the currrent event you check) ordered by the open events count and the time at which that happend

to find out what you wanted to know, all you need to do is select the first row from the above table
select * from
(select  t1.start_time, t1.event_id , count(*) + 1
from momi.tt t1, momi.tt t2
where t1.event_id <> t2.event_id
and t1.event_action='ON'
and t1.start_time > t2.start_time
and t2.event_action='ON'
and not exists (select 1 from momi.tt t3
                       where t3.event_id = t2.event_id and t3.event_action = 'CLOSE' and t3.start_time <= t1.start_time ) group by t1.start_time, t1.event_id order by 3 desc, 1 asc) as ttt
fetch first row only

momi sabag

Author

Commented:
Momi_sabaq, rule 2 unfortunately does not hold.  That is why I had to inject an artificial ordering mechanism (a/b).

What I was really hoping for was a built in function similar to SUM() that would total the column up to that point in the sequence.  Something like this:

select
    event_time
    ,running_sum(delta)
from
(
   select
      start_time as event_time
      ,'a' as priority -- so start times come before end times
      ,1 as delta
   from
      my_event_table
   union all
   select
      end_time as event_time
      ,'b' as priority -- so start times come before end times
      ,-1 as delta
   from
      my_event_table
)
order by
   event_time
   ,priority
;

I tried using SUM() and various kinds of grouping schemes but nothing worked.  I have written a solution using a cursor and keeping track of the running total myself but it's not an elegant solution for, what would seem to me to be, a common requirement.  This almost identical to a "year-to-date" or "month-to-date" kind of problem and it seems like a good candidate for an automated solution.  But if there is none then I will use my solution.  I'll leave this question open for a while an see if anyone comes up with a solution.
hi

come to think of it , you can ignore rule number 2
as long as you do this modification :

select  t1.start_time, t1.event_id , count(*) + 1
from momi.tt t1, momi.tt t2
where t1.event_id <> t2.event_id
and t1.event_action='ON'
and t1.start_time >= t2.start_time
and t2.event_action='ON'
and not exists (select 1 from momi.tt t3
                       where t3.event_id = t2.event_id and t3.event_action = 'CLOSE' and t3.start_time <= t1.start_time ) group by t1.start_time, t1.event_id order by 3 desc, 1 asc

and the second sql will be

select * from
(select  t1.start_time, t1.event_id , count(*) + 1
from momi.tt t1, momi.tt t2
where t1.event_id <> t2.event_id
and t1.event_action='ON'
and t1.start_time >= t2.start_time
and t2.event_action='ON'
and not exists (select 1 from momi.tt t3
                       where t3.event_id = t2.event_id and t3.event_action = 'CLOSE' and t3.start_time <= t1.start_time ) group by t1.start_time, t1.event_id order by 3 desc, 1 asc) as ttt
fetch first row only


i believe this is a better solution than using a functino in terms of performance
anyway, can you try it on your data and tell me if it works ?
i tried it on my version of your data, but i want to be sure it works
thanks

momi sabag
Hi

another thing,
you can not have a funtion as the one you want since,
when you write a built in function
if it is a column function, then it works on the entire result set
if it is a scalar function, then it works on a single row only
so i don't think anyone will be able to give you the function that you want

momi
Kent OlsenData Warehouse / Database Architect
CERTIFIED EXPERT
Commented:

I'm going to agree with ghp7000 that writing a stored procedure to walk through the data is probably the easiest and fastest.

The issue is the underlying mathematics.  You need to compute a lot of sums and compare them.  Algebraically, it's written as:

MAX
(
  delta(1),
  SUM (delta(1..2)),
  SUM (delta(1..3)),
....
  SUM (delta(1..n))
)

And SQL will try to solve it that way.  A horribly inefficient approach.

What you want is recursive, though we think of it as iterative.  In a list of N items, generate N sums where each sum is the sum of the current item plus the previous sum.

MAX
(
  delta(1) as sum1,
  sum1 + delta(2) as sum2,
  sum2 + delta(3) as sum3
....
  sum[N-1] + delta[N] as sumN
)


Unfortunately, SQL doesn't have a good mechanism to do this without resorting to a stored procedure and chasing a cursor.

That said, the OLAP extensions may help here, if you don't have too much data.  Let me do a bit of research and get back on this....

Kent
 

Author

Commented:
I'm going to close this question tomorrow and split the points among everyone who offered suggestions.  I have written the query two differenct way.  The first is in a loop tracking the various values and keeping running totals.  The second is far more involved and makes use of the RANK() function and nested subqueries to calculate the running query in real time.  On a small data set they perform about the same.  I am going to test it on a larger data set.  Even if the real time solution is slower I will go with it unless the difference is "large".  It is more of a style issue.  I prefer to do as much in queries rather than in post processing code either in SQL within a stored procedure or in a separate program executing SQL.

Thanks for all your help and suggestions.  If nothing else it has offered alternative views on my problem.
Access more of Experts Exchange with a free account
Thanks for using Experts Exchange.

Create a free account to continue.

Limited access with a free account allows you to:

  • View three pieces of content (articles, solutions, posts, and videos)
  • Ask the experts questions (counted toward content limit)
  • Customize your dashboard and profile

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.