• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 534
  • Last Modified:

A complex Oracle PL/SQL query

I have data as follows:

ColKey           ColStartDt                                ColEnd Dt
1                  01-JAN-2012                   03-JAN-2012 11:59:59 PM
2                  04-JAN-2012                   06-JAN-2012 11:59:59 PM
3                  07-JAN-2012                   09-JAN-2012 11:59:59 PM
3                  06-JAN-2012                   11-JAN-2012 11:59:59 PM
4                  12-JAN-2012                   14-JAN-2012 11:59:59 PM
4                  14-JAN-2012                   17-JAN-2012 11:59:59 PM
5                  18-JAN-2012                   20-JAN-2012 11:59:59 PM

As you can see, some ColKey values fall in multiple date ranges. I need to have an output such as:

ColKey, count(*)
1             1
2             1
3             2
4             2
5             1

I am trying to come up with the query but any help would be appreciated.
0
soccerplayer
Asked:
soccerplayer
  • 12
  • 5
  • 4
  • +3
1 Solution
 
slightwv (䄆 Netminder) Commented:
You can get your output with:
select ColKey,count(*) from mytable group by ColKey;

I'm not understanding the date ranges.
0
 
soccerplayerAuthor Commented:
If only it was that simple.

We need to do a count for ColKey that falls in multiple overlapping date ranges in the entire table. Let's say there are 10 date ranges in the whole table, some of them are overlapping and some of them are not.

If ColKey 4 falls in two date ranges and they are overlapping then the count for ColKey 4 will be 2.
0
 
sdstuberCommented:
can you provide sufficient sample data to show what you mean?
0
VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

 
slightwv (䄆 Netminder) Commented:
>>We need to do a count for ColKey that falls in multiple overlapping date ranges in the entire table

I don't understand what this means.

In your sample data I see:
4                  12-JAN-2012                   14-JAN-2012 11:59:59 PM
4                  14-JAN-2012                   17-JAN-2012 11:59:59 PM

I see you want a count of '2' for key 4.

I don't understand where the 2 dates overlap.
0
 
soccerplayerAuthor Commented:
Thank you sdstuber and slightvw for your help.

If you have a specific question about the sample data I have provided, can you please let me know so that I can explain? Here is some additional data which might help.


ColKey           ColStartDt                                ColEnd Dt
1                  01-JAN-2012                   03-JAN-2012 11:59:59 PM
2                  04-JAN-2012                   06-JAN-2012 11:59:59 PM
3                  07-JAN-2012                   09-JAN-2012 11:59:59 PM
3                  06-JAN-2012                   11-JAN-2012 11:59:59 PM
4                  12-JAN-2012                   14-JAN-2012 11:59:59 PM
4                  14-JAN-2012                   17-JAN-2012 11:59:59 PM
5                  18-JAN-2012                   20-JAN-2012 11:59:59 PM
2                  07-JAN-2012                   09-JAN-2012 11:59:59 PM

For example, ColKey 3 falls in two different date ranges and they are overlapping. The count for ColKey 3 will be 2. ColKey 2 falls in two different date ranges as well but they are NOT overlapping so the count for ColKey 2 will be 1.

Please let me know if you have more questions and I'll gladly answer.
0
 
joaoalmeidaCommented:
Try:


select t1.colkey, count(0)
from mytable t1, mytable t2
where t1.colkey=t2.colkey and not (t1.start_date > t2.end_date and t1.end_date < t2.start_date)
group by t1.colkey
0
 
sdstuberCommented:
what are the expected results from your last set of data?
0
 
soccerplayerAuthor Commented:
Expected results from my last set of data are:

ColKey, count(*)
1             1
2             1
3             2
4             2
5             1

joaoalmeida, I'll try your solution and let you know.
0
 
slightwv (䄆 Netminder) Commented:
While I wait for the expected results asked for by sdstuber

>>Please let me know if you have more questions and I'll gladly answer.

In the original results you have a count of 1 for key 1.  Since key1 has only one entry, how can it overlap itself?
0
 
soccerplayerAuthor Commented:
joaoalmeida, I tried your solution but the count is coming 40+ times larger than maximum possible count.
0
 
soccerplayerAuthor Commented:
Answer to slightvw.

It cannot that's why the count is 1. I am only interested in ColKeys with count>1.
0
 
slightwv (䄆 Netminder) Commented:
>>Expected results from my last set of data are:

1             1

>>I am only interested in ColKeys with count>1.

Then why is "1    1" in your expected results?
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
as the question is about ranges, would this article help?
http://www.experts-exchange.com/Database/Miscellaneous/A_3952-ranges-gaps-overlaps-for-numbers-and-date-ranges.html

I think you want to check for data, where for the same colkey you have another row which "overlaps"
the question here is: do you have a primary key in this table?
0
 
soccerplayerAuthor Commented:
That was just to show everyone. I'll do a SELECT from the results to filter >1 rows only.
0
 
soccerplayerAuthor Commented:
angelIII, yes there is a PK but ColKey is not the PK. It shouldn't make a difference though.
0
 
soccerplayerAuthor Commented:
One important point. All records obviously overlap with itself exactly. That would not count.
0
 
Guy Hengel [angelIII / a3]Billing EngineerCommented:
question: what if for 1 value of colkey, you have 4 ranges, of which 2 pairs are overlapping?
what should be the result?

anyhow, if you are only interested in the rows that have some overlapping:

select t.*
  from yourtable t
 where exists(select null from yourtable o
             where o.ColKey       = t.ColKey
                and o.ColStartDt   <= t.ColEndDt
                and o.ColStartEndtd >= t.ColStartDt
                and o.pk <> t.pk
  )

Open in new window


and go on from there ...
0
 
joaoalmeidaCommented:
Try this small change.

select t1.colkey, count(0)
from mytable t1, mytable t2
where t1.colkey=t2.colkey and not (t1.start_date > t2.end_date or t1.end_date < t2.start_date)
group by t1.colkey
0
 
awking00Commented:
select colkey, sum(colcount) colcount from
(select
  colkey
 ,colstartdt
 ,colenddt
 ,case when lead(colstartdt) over (partition by colkey order by colstartdt) > colenddt
       then 0
       else 1
  end colcount
 from yourtable)
group by colkey
order by colkey;
0
 
awking00Commented:
SQL> select * from yourtable;

    COLKEY COLSTARTD COLENDDT
---------- --------- ---------
         1 01-JAN-12 03-JAN-12
         2 04-JAN-12 06-JAN-12
         3 07-JAN-12 09-JAN-12
         3 06-JAN-12 11-JAN-12
         4 12-JAN-12 14-JAN-12
         4 14-JAN-12 17-JAN-12
         5 18-JAN-12 20-JAN-12
         2 07-JAN-12 09-JAN-12

SQL> select colkey, sum(colcount) colcount from
  2  (select
  3    colkey
  4   ,colstartdt
  5   ,colenddt
  6   ,case when lead(colstartdt) over (partition by colkey order by colstartdt)
 > colenddt
  7         then 0
  8         else 1
  9    end colcount
 10   from yourtable)
 11  group by colkey
 12  order by colkey;

    COLKEY   COLCOUNT
---------- ----------
         1          1
         2          1
         3          2
         4          2
         5          1
0
 
soccerplayerAuthor Commented:
joaoalmeida, the counts dropped significantly but they are still twice as much as the maximum possible.
0
 
soccerplayerAuthor Commented:
awking00, your SQL seems to be working. Let me do some more tests.
0
 
awking00Commented:
My main concern was that there might be the case when one set of begin-end dates overlaps with two other sets of begin-end dates but those two sets do not overlap each other. If that case never exists, then I think my query should work.
0
 
soccerplayerAuthor Commented:
There is no way for me to tell if such a situation exists as the number of records is large. How would I find out if such a situation exists? Can you give me an idea of such a query?
0
 
awking00Commented:
The example below shows one record covering the entire month of January, one for the range of the 5th to the 10th and another from the 15th to the 20th. The second range would compute as though there was no overlap because there is no overlap with the following range. I'm not sure how you would research that nor what you would want your count to be, but I think it might be managed by adding a lag fucntion and testing for an either or scenario.
    COLKEY COLSTARTD COLENDDT    COLCOUNT
---------- --------- --------- ----------
         6 01-JAN-12 31-JAN-12          1
         6 05-JAN-12 10-JAN-12          0
         6 15-JAN-12 20-JAN-12          1
0
 
soccerplayerAuthor Commented:
awking00, can you explain this piece of the SQL little bit?

select
colkey,colstartdt,colenddt
,case when lead(colstartdt) over (partition by colkey order by colstartdt)> colenddt
then 0 else 1 end colcount
from yourtable

We are doing a partition by ColKey after ordering by ColStartDt. By doing a lead(ColStartDt)> ColEndDt, are we saying that if the ColStartDt of the current row is greater than the ColEndDt of current row or next row (since default offset is 1)?
0
 
awking00Commented:
That subquery produces the following output from your example:

    COLKEY COLSTARTD COLENDDT    COLCOUNT
---------- --------- --------- ----------
         1 01-JAN-12 03-JAN-12          1
         2 04-JAN-12 06-JAN-12          0
         2 07-JAN-12 09-JAN-12          1
         3 06-JAN-12 11-JAN-12          1
         3 07-JAN-12 09-JAN-12          1
         4 12-JAN-12 14-JAN-12          1
         4 14-JAN-12 17-JAN-12          1
         5 18-JAN-12 20-JAN-12          1
It's saying that when the ColStartDt of the next row is greater than the ColEndDt of the current row, then it is deemed to be not overlapping so, when the outer query does the sum, in the case of ColKey = 2, it is only "counted" as 1.
0

Featured Post

VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

  • 12
  • 5
  • 4
  • +3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now