Total Coverage Time over specified Period

I would like to write a query that returns the total amount of time over a specified period that a particular tool was in use.  The records I have show when specific jobs using that tool started and stopped.  I can't just add up these times, however, because sometimes the jobs overlap in time, e.g. in the case where a tool has a large capacity and can work on more than one part at a time.

My table is like this:
Job_ID
Tool_ID
Part_ID
Job_Start_DT
Job_Complete_DT

Given Tool_ID and a Period_Begin_DT and Period_End_DT I would like the query to return the total amount of time the tool was in use.

The best solution I could come up with was to create another table with a row for each slice of time between Period_Begin_DT and Period_End_DT (my slices were 6minutes).  I then created a query that counted how many times each slice was used, and then converted all non-zero numbers to 1 and summed up the slices used.  This kinda sorts works, but it takes a long time to run and is only as accurate as my time slices.

Any better ideas?
koughdurAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

slightwv (䄆 Netminder) Commented:
Please provide sample data and expected results.

Time for a tool used should be a simple sum per the job it was used on.  I don't understand what you mean by jobs overlapping.
0
PortletPaulfreelancerCommented:
and, what is the unit of measure? tool-days? tool-hours? smaller?

if the using is less than a day is it based on 'working day'? (and what is the working day here)

are weekends &/or holidays to be excluded?

{+ edit}
>>"The best solution I could come up with was to create another table with a row for each slice of time between Period_Begin_DT and Period_End_DT (my slices were 6minutes)"

sorry: seems that the unit of measure may be 6 minutes.

That technique (having a set of rows for each needed time unit) is how I would approach it too. Did you use a CTE?

Perhaps you could show us that query?
0
sdstuberCommented:
Are you looking for something like this?


  SELECT tool_id, SUM(group_end - group_start) total_days_used
    FROM (  SELECT tool_id, group_start, MAX(job_complete_dt) group_end
              FROM (SELECT tool_id,
                           job_start_dt,
                           job_complete_dt,
                           LAST_VALUE(
                               CASE WHEN job_start_dt <= prev_end THEN NULL ELSE job_start_dt END IGNORE NULLS)
                           OVER(PARTITION BY tool_id
                                ORDER BY job_start_dt
                                ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
                               group_start
                      FROM (SELECT tool_id,
                                   job_start_dt,
                                   job_complete_dt,
                                   LAG(job_complete_dt) OVER(PARTITION BY tool_id ORDER BY job_start_dt) prev_end
                              FROM tool_usage))
          GROUP BY tool_id, group_start)
GROUP BY tool_id
ORDER BY tool_id


since DATE math is in terms of days the result represents days.
If you want to convert it to hours then multiply by 24
minutes by 1440
seconds by 86400

if you want to convert it to an interval  try something like for the first line
  SELECT tool_id, numtodsinterval(SUM(group_end- group_start),'day') total_time_used


I don't know what kind of performance you were looking for,  I only mocked up 16000 rows but it finished in 51 milliseconds for me.
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

sdstuberCommented:
this was my test case

please extend or shrink as needed to help demonstrate the requirements

CREATE TABLE tool_usage
(
    Job_ID            INTEGER,
    Tool_ID           INTEGER,
    Part_ID           INTEGER,
    Job_Start_DT      DATE,
    Job_Complete_DT   DATE
);

INSERT INTO tool_usage
    SELECT job_id,
           tool_id,
           part_id,
           job_start_dt,
           job_start_dt + DBMS_RANDOM.VALUE(1, 180) / 1440 job_complete_dt
      FROM (SELECT job_id,
                   tool_id,
                   part_id,
                   SYSDATE - DBMS_RANDOM.VALUE(1, 50) job_start_dt
              FROM (    SELECT LEVEL job_id
                          FROM DUAL
                    CONNECT BY LEVEL < 100) jobs,
                   (    SELECT LEVEL tool_id
                          FROM DUAL
                    CONNECT BY LEVEL < 20) tools,
                   (    SELECT LEVEL part_id
                          FROM DUAL
                    CONNECT BY LEVEL < 10) parts);

COMMIT;

Open in new window

0
koughdurAuthor Commented:
To answer questions:  I'm OK with everything being in units of days.  I can convert to whatever I need.  Also, the choice of 6min was a compromise between speed and accuracy.  I figured 1/10th of a minute was a good choice between the two.

For each job there are six cases to consider:
1) Job starts and ends before time period:  0 days
2) Job starts and ends after time period:  0 days
3) Job starts and ends within time period:  job_end - job_start
4) Job starts before time period and ends after:  period_end - period_start
5) Job starts before time period and ends with time period:  job_end - period_start
6) Job starts within time period and ends after time period:  period_end - job_start

I created a function that allows me to pass in the four date values and return the overlap time and it seems to work rather quick, so no problems there.  The problem is when the jobs overlap.  I don't want to double or triple count the time overlap.

Here is some sample data and what I expect to get back:
Tool_Id = 1
Period Start = 9/1/2015 0:00
Period End = 9/2/2015 0:00

Job1 Start = 9/1/2015  1:00;   End = 9/1/2015  2:00
Job2 Start = 8/31/2015  23:00;   End = 9/1/2015  3:00
Job 3 Start = 9/1/2015  4:00;    End = 9/1/2015  7:00
Job 4 Start = 9/1/2015  22:00;    End = 9/2/2015  3:00
Job 5 Start = 9/1/2015  21:00;   End = 9/1/2015  23:00

What I want to get is 0.375 (i.e. 3/8 = 9/24)

If I just sum up the total hours I get  11/24ths of an hour because I'm double counting a couple of the hours.

sdstuber:  Your query seems like it's heading in the right direction, but I don't see how to specify period_begin_dt and period_end_dt.  Can you clarify?
0
koughdurAuthor Commented:
11/24ths of an hour should be 11/24ths of a day.

Also sdstuber:  Why use LAG() with no offset?
0
koughdurAuthor Commented:
sdstuber;

I ran your query against my table and for a one week period I got 612 hours for one tool.  That is over 4X the number of hours in a week.

I see what you're trying to do with the query.  I believe that if I incorporate my overlap algorithm in with the basic structure of your query I may be able to get what I want.

BTW, the WHERE clause I added was:

job_start_dt <= period_end AND job_complete_dt >= period_start

because I want to include jobs that used any of the period time even those that started before the period or ended after the period.
0
sdstuberCommented:
Adding an external range makes it a little more complicated because it effectively alters the start/end points of some of your ranges.

But still doable...

This returns 0.375 using your sample data


  SELECT tool_id, SUM(group_end - group_start) total_days_used
    FROM (  SELECT tool_id, group_start, MAX(job_complete_dt) group_end
              FROM (SELECT tool_id,
                           job_start_dt,
                           job_complete_dt,
                           LAST_VALUE(
                               CASE WHEN job_start_dt <= prev_end THEN NULL ELSE job_start_dt END IGNORE NULLS)
                           OVER(PARTITION BY tool_id
                                ORDER BY job_start_dt
                                ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
                               group_start
                      FROM (SELECT tool_id,
                                   GREATEST(job_start_dt, period_start) job_start_dt,
                                   LEAST(job_complete_dt, period_end) job_complete_dt,
                                   LAG(job_complete_dt) OVER(PARTITION BY tool_id ORDER BY job_start_dt) prev_end
                              FROM tool_usage,
                                   (SELECT TO_DATE('9/1/2015  0:00', 'mm/dd/yyyy  hh24:mi') period_start,
                                           TO_DATE('9/2/2015  0:00', 'mm/dd/yyyy  hh24:mi') period_end
                                      FROM DUAL) inputs
                             WHERE job_complete_dt >= period_start AND job_start_dt <= period_end))
          GROUP BY tool_id, group_start)
GROUP BY tool_id
ORDER BY tool_id

Open in new window

0
sdstuberCommented:
LAG with no offset  defaults to an offset of 1

if you want to change

       LAG(job_complete_dt)
to
       LAG(job_complet_dt,1)  

that's fine
0
koughdurAuthor Commented:
sdstuber:

I modified my current method to be accurate down to the minute and compared the results to your query and came up with this table:  (The period was from 9/1/2015 to 9/8/2015)

Tool_ID      sdstuber Method (days)      Current Method (days)      Diff in Minutes
1        2.164      2.170      -8.3
2        7.924      6.576      1941.3
3        10.426      7.000      4933.5
4        1.058      1.065      -10.2
5        1.497      1.499      -3.2
6        0.641      0.649      -11.5
7        3.512      3.326      268.8
8        1.110      1.162      -74.9
10       1.096      1.106      -14.1
11         0.420      0.448      -39.5
12      2.130      2.031      142.8
14        2.800      2.803      -4.3
15        0.121      0.126      -7.1
16       0.831      0.835      -5.5
17        1.889      1.892      -3.4
18        6.532      6.533      -1.2
19       1.095      1.135      -56.6
23        1.777      1.781      -5.7
30       0.368      0.373      -7.7
31        1.947      1.963      -23.4
32       1.192      1.207      -21.7
33        0.167      0.232      -93.2
34       1.498      1.526      -40.5
35        4.949      4.963      -20.9
36       2.883      2.903      -27.9
37       0.334      0.335      -1.1
38        0.777      0.781      -6.7
39       0.082      0.083      -1.0

I believe I understand what your query is trying to do.  It comes close in a lot of cases, but I believe it is not handling all possible overlap situations.  In the case of tool #3, there were 4 jobs that lasted the full 7 days and a number of other jobs that if added to 7 came very close to 10.426.  So your algorithm was correct in not quadruple counting the four jobs that covered the entire time period, but somehow failed to take into account that the other jobs were also within the same time period.

Regardless, I've learned a lot from your query.  I've been spending too much time in MS Access Land and my query skills have been atrophying.
0
sdstuberCommented:
please post some sample data that demonstrates the problem
0
koughdurAuthor Commented:
Here is the data for the first five tools.
JobData.csv
0
sdstuberCommented:
With that data, I get the following results :

1      0.5173611111111111111111111111111111111111
2      0.8444444444444444444444444444444444444445
3      1.65555555555555555555555555555555555556
4      0.1645833333333333333333333333333333333333


What are you expecting?

I used this query, essentially the same as what  I posted above except I changed the table and column names from your original names to what you have in the csv file.
Also, I change the LAG slightly to include the LEAST criteria for truncating end points by the period range

  SELECT tool_id, SUM(group_end - group_start) total_days_used
    FROM (  SELECT tool_id, group_start, MAX(job_end) group_end
              FROM (SELECT tool_id,
                           job_st,
                           job_end,
                           LAST_VALUE(CASE WHEN job_st <= prev_end THEN NULL ELSE job_st END IGNORE NULLS)
                               OVER(PARTITION BY tool_id ORDER BY job_st ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
                               group_start
                      FROM (SELECT tool_id,
                                   GREATEST(job_st, period_start) job_st,
                                   LEAST(job_end, period_end) job_end,
                                   LAG(LEAST(job_end, period_end)) OVER(PARTITION BY tool_id ORDER BY job_st) prev_end
                              FROM jobdata,
                                   (SELECT TO_DATE('9/1/2015  0:00', 'mm/dd/yyyy  hh24:mi') period_start,
                                           TO_DATE('9/2/2015  0:00', 'mm/dd/yyyy  hh24:mi') period_end
                                      FROM DUAL) inputs
                             WHERE job_end >= period_start AND job_st <= period_end))
          GROUP BY tool_id, group_start)
GROUP BY tool_id
ORDER BY tool_id

Open in new window

0
PortletPaulfreelancerCommented:
May we see your current query please?
0
sdstuberCommented:
nevermind my posts above
I knew there was something I didn't like about them
They are wrong.  I've solved this before...

try this instead

  SELECT tool_id, SUM(job_end - job_st) total_days_used
    FROM (  SELECT tool_id, job_st, MAX(job_end) job_end
              FROM (SELECT CASE
                               WHEN job_st IS NULL
                               THEN
                                   MAX(
                                       job_st)
                                   OVER(PARTITION BY tool_id
                                        ORDER BY oldjob_st
                                        RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
                               ELSE
                                   job_st
                           END
                               job_st,
                           CASE
                               WHEN job_end IS NULL
                               THEN
                                   MIN(
                                       job_end)
                                   OVER(PARTITION BY tool_id
                                        ORDER BY oldjob_st
                                        RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
                               ELSE
                                   job_end
                           END
                               job_end,
                           tool_id
                      FROM (SELECT job_st,
                                   job_end,
                                   tool_id,
                                   oldjob_st,
                                   oldjob_end
                              FROM (SELECT CASE
                                               WHEN LAG(job_end, 1) OVER(PARTITION BY tool_id ORDER BY job_st) IS NULL
                                               THEN
                                                   job_st
                                               WHEN job_st > LAG(job_end, 1) OVER(PARTITION BY tool_id ORDER BY job_st)
                                               THEN
                                                   job_st
                                               ELSE
                                                   NULL
                                           END
                                               job_st,
                                           CASE
                                               WHEN LEAD(job_end, 1) OVER(PARTITION BY tool_id ORDER BY job_end) IS NULL
                                               THEN
                                                   job_end
                                               WHEN job_end < LEAD(job_st, 1) OVER(PARTITION BY tool_id ORDER BY job_end)
                                               THEN
                                                   job_end
                                               ELSE
                                                   NULL
                                           END
                                               job_end,
                                           tool_id,
                                           job_st oldjob_st,
                                           job_end oldjob_end
                                      FROM (SELECT *
                                              FROM (SELECT MIN(
                                                               job_st)
                                                           OVER(PARTITION BY tool_id
                                                                ORDER BY job_st
                                                                RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
                                                               minbegin,
                                                           MAX(
                                                               job_end)
                                                           OVER(PARTITION BY tool_id
                                                                ORDER BY job_st
                                                                RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
                                                               maxend,
                                                           tool_id,
                                                           job_st,
                                                           job_end
                                                      FROM (SELECT tool_id,
                                                                   GREATEST(job_st, period_start) job_st,
                                                                   LEAST(job_end, period_end) job_end
                                                              FROM jobdata,
                                                                   (SELECT TO_DATE('9/1/2015  0:00', 'mm/dd/yyyy  hh24:mi')
                                                                               period_start,
                                                                           TO_DATE('9/2/2015  0:00', 'mm/dd/yyyy  hh24:mi')
                                                                               period_end
                                                                      FROM DUAL)
                                                             WHERE job_end >= period_start AND job_st <= period_end))
                                             WHERE job_st <= minbegin OR job_end >= maxend))
                             WHERE job_st IS NOT NULL OR job_end IS NOT NULL))
             WHERE job_st IS NOT NULL
          GROUP BY job_st, tool_id)
GROUP BY tool_id
ORDER BY tool_id

Open in new window


using your sample code this produces these results...


1      0.5173611111111111111111111111111111111111
2      0.575
3      1
4      0.1645833333333333333333333333333333333333
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
sdstuberCommented:
sorry the code is so complex, but I'm at least comfortable with this one


depending on how you define your time ranges you might want to change > to >= and < to <=.
Otherwise ranges that butt up to each other but do not over lap will be considered separate.  Which may or may not be what you want.  You haven't specified either way.
0
koughdurAuthor Commented:
Your SQL is indeed complex, but sometimes that's the only thing that works.  I've learned a lot from your code besides this being a good solution for the problem I'm trying to solve, i.e. determine our tool utilization and see if we can squeeze any more out of them.

This is probably the most useful solution I've ever received to any of my questions on Experts Exchange.

Thanks again for all your efforts.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Query Syntax

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.