Solved

Query with a running total

Posted on 2006-10-23
7
520 Views
Last Modified: 2008-02-01
I have a table that includes a start and stop time for a series of events.  What I need to know is at any arbitrary time how many of these events are "ON".  That's the easy part.  I can just count the number of events where the start and stop time stradles the given time.

Now what I need to do is determine for a given day the maximum number of events that are "ON" at the same time.

Here is my scheme and I need some help.  However, if you can think of a better way to do this then by all means speak up.

I'm selecting all of the start times along with a +1 delta column and unioning that with all of the end times with a -1 delta column.  This is then ordered by the time so I get a list of times along with a +1 or -1 value.  I've also included a priority column with 'a' for ON and 'b' for OFF so that ON times are sorted before OFF times when they are the same.  Now the number of events that are ON at any of the "transition times" is just the running sum of the delta column.  What I don't know how to do (other than in a loop) is calculate that running sum.  Is there a function to do that?

Once I have that running sum I can just select max from the result and get my value.

Here is a sample of the data (time, delta, running total) with the running total added manually:

2006-10-06-12.55.10.000000           1 1
2006-10-06-13.01.04.000000          -1 0
2006-10-06-14.23.26.000000           1 1
2006-10-06-14.32.04.000000           1 2
2006-10-06-14.32.04.000000          -1 1
2006-10-06-14.34.11.000000           1 2
2006-10-06-14.34.11.000000           1 3
2006-10-06-14.34.11.000000          -1 2
2006-10-06-14.34.11.000000          -1 1
2006-10-06-14.34.21.000000           1 2
2006-10-06-14.34.21.000000           1 3
2006-10-06-14.34.21.000000          -1 2
2006-10-06-14.34.22.000000          -1 1
2006-10-06-14.36.08.000000           1 2
2006-10-06-14.36.08.000000           1 3
2006-10-06-14.36.08.000000          -1 2
2006-10-06-14.36.08.000000          -1 1
2006-10-06-14.36.21.000000           1 2
2006-10-06-14.36.21.000000           1 3
2006-10-06-14.36.21.000000          -1 2
2006-10-06-14.36.22.000000          -1 1
2006-10-06-14.36.25.000000           1 2
2006-10-06-14.36.25.000000           1 3
0
Comment
Question by:GnarOlak
7 Comments
 
LVL 13

Assisted Solution

by:ghp7000
ghp7000 earned 166 total points
ID: 17791097
if im not mistaken you already posted a similar type of question where the answer was to use temp tables to select a min or max value:
with table1 as (query 1)
table2 as (query2)
etc etc then
(select max value from table1, table2 where table1.col=table2.col)
i think this is simple way without getting into very heavy sql

other way is via stored procedure where you can loop through fetch
0
 
LVL 37

Accepted Solution

by:
momi_sabag earned 167 total points
ID: 17798258
Hi

i have a solution for you, but i made some assumptions :
1) each event has an event_id
2) an event can not start and end at the exact same timestamp
3) on the data that i tested, no two rows had the same timestamp value, but it should not matter as long as rule 2 applies

the logic :
join your table with itself and for every event that is on, you select all the event that started before it,
and has not finished yet
the sql for that :
select  t1.start_time, t1.event_id , count(*) + 1
from momi.tt t1, momi.tt t2
where t1.event_id <> t2.event_id
and t1.event_action='ON'
and t1.start_time > t2.start_time
and t2.event_action='ON'
and not exists (select 1 from momi.tt t3
                       where t3.event_id = t2.event_id and t3.event_action = 'CLOSE' and t3.start_time <= t1.start_time ) group by t1.start_time, t1.event_id order by 3 desc, 1 asc

that will return a table that contains for each event_id, it's start time and a count of all the event that were opened when it started (you add 1 for the currrent event you check) ordered by the open events count and the time at which that happend

to find out what you wanted to know, all you need to do is select the first row from the above table
select * from
(select  t1.start_time, t1.event_id , count(*) + 1
from momi.tt t1, momi.tt t2
where t1.event_id <> t2.event_id
and t1.event_action='ON'
and t1.start_time > t2.start_time
and t2.event_action='ON'
and not exists (select 1 from momi.tt t3
                       where t3.event_id = t2.event_id and t3.event_action = 'CLOSE' and t3.start_time <= t1.start_time ) group by t1.start_time, t1.event_id order by 3 desc, 1 asc) as ttt
fetch first row only

momi sabag
0
 
LVL 6

Author Comment

by:GnarOlak
ID: 17798428
Momi_sabaq, rule 2 unfortunately does not hold.  That is why I had to inject an artificial ordering mechanism (a/b).

What I was really hoping for was a built in function similar to SUM() that would total the column up to that point in the sequence.  Something like this:

select
    event_time
    ,running_sum(delta)
from
(
   select
      start_time as event_time
      ,'a' as priority -- so start times come before end times
      ,1 as delta
   from
      my_event_table
   union all
   select
      end_time as event_time
      ,'b' as priority -- so start times come before end times
      ,-1 as delta
   from
      my_event_table
)
order by
   event_time
   ,priority
;

I tried using SUM() and various kinds of grouping schemes but nothing worked.  I have written a solution using a cursor and keeping track of the running total myself but it's not an elegant solution for, what would seem to me to be, a common requirement.  This almost identical to a "year-to-date" or "month-to-date" kind of problem and it seems like a good candidate for an automated solution.  But if there is none then I will use my solution.  I'll leave this question open for a while an see if anyone comes up with a solution.
0
Do You Know the 4 Main Threat Actor Types?

Do you know the main threat actor types? Most attackers fall into one of four categories, each with their own favored tactics, techniques, and procedures.

 
LVL 37

Expert Comment

by:momi_sabag
ID: 17798563
hi

come to think of it , you can ignore rule number 2
as long as you do this modification :

select  t1.start_time, t1.event_id , count(*) + 1
from momi.tt t1, momi.tt t2
where t1.event_id <> t2.event_id
and t1.event_action='ON'
and t1.start_time >= t2.start_time
and t2.event_action='ON'
and not exists (select 1 from momi.tt t3
                       where t3.event_id = t2.event_id and t3.event_action = 'CLOSE' and t3.start_time <= t1.start_time ) group by t1.start_time, t1.event_id order by 3 desc, 1 asc

and the second sql will be

select * from
(select  t1.start_time, t1.event_id , count(*) + 1
from momi.tt t1, momi.tt t2
where t1.event_id <> t2.event_id
and t1.event_action='ON'
and t1.start_time >= t2.start_time
and t2.event_action='ON'
and not exists (select 1 from momi.tt t3
                       where t3.event_id = t2.event_id and t3.event_action = 'CLOSE' and t3.start_time <= t1.start_time ) group by t1.start_time, t1.event_id order by 3 desc, 1 asc) as ttt
fetch first row only


i believe this is a better solution than using a functino in terms of performance
anyway, can you try it on your data and tell me if it works ?
i tried it on my version of your data, but i want to be sure it works
thanks

momi sabag
0
 
LVL 37

Expert Comment

by:momi_sabag
ID: 17798603
Hi

another thing,
you can not have a funtion as the one you want since,
when you write a built in function
if it is a column function, then it works on the entire result set
if it is a scalar function, then it works on a single row only
so i don't think anyone will be able to give you the function that you want

momi
0
 
LVL 45

Assisted Solution

by:Kdo
Kdo earned 167 total points
ID: 17804208

I'm going to agree with ghp7000 that writing a stored procedure to walk through the data is probably the easiest and fastest.

The issue is the underlying mathematics.  You need to compute a lot of sums and compare them.  Algebraically, it's written as:

MAX
(
  delta(1),
  SUM (delta(1..2)),
  SUM (delta(1..3)),
....
  SUM (delta(1..n))
)

And SQL will try to solve it that way.  A horribly inefficient approach.

What you want is recursive, though we think of it as iterative.  In a list of N items, generate N sums where each sum is the sum of the current item plus the previous sum.

MAX
(
  delta(1) as sum1,
  sum1 + delta(2) as sum2,
  sum2 + delta(3) as sum3
....
  sum[N-1] + delta[N] as sumN
)


Unfortunately, SQL doesn't have a good mechanism to do this without resorting to a stored procedure and chasing a cursor.

That said, the OLAP extensions may help here, if you don't have too much data.  Let me do a bit of research and get back on this....

Kent
 
0
 
LVL 6

Author Comment

by:GnarOlak
ID: 17806835
I'm going to close this question tomorrow and split the points among everyone who offered suggestions.  I have written the query two differenct way.  The first is in a loop tracking the various values and keeping running totals.  The second is far more involved and makes use of the RANK() function and nested subqueries to calculate the running query in real time.  On a small data set they perform about the same.  I am going to test it on a larger data set.  Even if the real time solution is slower I will go with it unless the difference is "large".  It is more of a style issue.  I prefer to do as much in queries rather than in post processing code either in SQL within a stored procedure or in a separate program executing SQL.

Thanks for all your help and suggestions.  If nothing else it has offered alternative views on my problem.
0

Featured Post

Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

Join & Write a Comment

Recursive SQL in UDB/LUW (you can use 'recursive' and 'SQL' in the same sentence) A growing number of database queries lend themselves to recursive solutions.  It's not always easy to spot when recursion is called for, especially for people una…
Recursive SQL in UDB/LUW (it really isn't that hard to do) Recursive SQL is most often used to convert columns to rows or rows to columns.  A previous article described the process of converting rows to columns.  This article will build off of th…
Sending a Secure fax is easy with eFax Corporate (http://www.enterprise.efax.com). First, Just open a new email message.  In the To field, type your recipient's fax number @efaxsend.com. You can even send a secure international fax — just include t…
This tutorial demonstrates a quick way of adding group price to multiple Magento products.

757 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now