Query to find gaps in date ranges

Hi,
I have data like this:

Position_tcd   start_date (date)   end_date (date)
101            Jan-01-2000         Feb-20-2003
101            Feb-21-2003         Nov-11-2011
101            Nov-12-2011         Dec-31-9999

102            Jan-01-2000         Feb-20-2003
102            Nov-2-2011          Dec-11-2012
102            Dec-12-2011         Dec-31-9999

I want to run a validation query to see whether a position_tcd has gaps between its date ranges.

How can I do that?

I want the query to report the date gaps for each position_tcd, something like this:

102  Feb-21-2003    Nov-1-2011  missing data
Asked by sam2929
PortletPaul (freelancer) commented:
Using the LAG() function we can compare across rows. Below, we compare the "previous" end_date to "this" start_date within each position_tcd; if the difference between those two dates is greater than 1 day, we list the missing dates.
This result:
| POSITION_TCD | MISSED_START | MISSED_END |
|--------------|--------------|------------|
|          102 |   2003-02-21 | 2011-11-01 |

is produced by the following query:
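SELECT
        position_tcd
      , to_char(prev_end_date + 1,'YYYY-MM-DD') missed_start   -- first day not covered
      , to_char(start_date - 1,'YYYY-MM-DD') missed_end        -- last day not covered
FROM (
      SELECT
              position_tcd
              -- end_date of the previous row for the same position_tcd (NULL on the first row)
            , lag(end_date,1) over (partition BY position_tcd ORDER BY start_date) AS prev_end_date
            , start_date
            , end_date
      FROM s_position
     )
-- date subtraction returns days; more than 1 means at least one day is missing
WHERE start_date - prev_end_date > 1
;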


-- sample data (DATE literals avoid depending on the session's NLS_DATE_FORMAT)
CREATE TABLE S_POSITION
	("POSITION_TCD" int, "START_DATE" date, "END_DATE" date)
;

INSERT ALL 
	INTO S_POSITION ("POSITION_TCD", "START_DATE", "END_DATE")
		 VALUES (101, DATE '2000-01-01', DATE '2003-02-20')
	INTO S_POSITION ("POSITION_TCD", "START_DATE", "END_DATE")
		 VALUES (101, DATE '2003-02-21', DATE '2011-11-11')
	INTO S_POSITION ("POSITION_TCD", "START_DATE", "END_DATE")
		 VALUES (101, DATE '2011-11-12', DATE '9999-12-31')
	INTO S_POSITION ("POSITION_TCD", "START_DATE", "END_DATE")
		 VALUES (102, DATE '2000-01-01', DATE '2003-02-20')
	INTO S_POSITION ("POSITION_TCD", "START_DATE", "END_DATE")
		 VALUES (102, DATE '2011-11-02', DATE '2012-12-11')
	INTO S_POSITION ("POSITION_TCD", "START_DATE", "END_DATE")
		 VALUES (102, DATE '2011-12-12', DATE '9999-12-31')
SELECT * FROM dual
;
-- http://sqlfiddle.com/#!4/6ce5c/8
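
One caveat with LAG() is that it only looks at the immediately preceding row. If the ranges for a position_tcd can overlap, an earlier row may end later than the previous one, and a comparison against the previous row alone could report a gap that is actually covered. A possible variant, offered only as a sketch and not as part of the original answer, compares each start_date with the latest end_date seen on any earlier row. The alias prev_max_end is a name introduced here for illustration.

```sql
SELECT
        position_tcd
      , to_char(prev_max_end + 1,'YYYY-MM-DD') missed_start
      , to_char(start_date - 1,'YYYY-MM-DD') missed_end
FROM (
      SELECT
              position_tcd
            , start_date
              -- latest end_date among all earlier rows of the same position_tcd;
              -- the window is empty on the first row, so the result is NULL there
            , MAX(end_date) OVER (PARTITION BY position_tcd
                                  ORDER BY start_date
                                  ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_max_end
      FROM s_position
     )
-- NULL on the first row fails the comparison, so that row is skipped
WHERE start_date - prev_max_end > 1
;
```

On the sample data above this should list the same single gap for 102 as the LAG() query; the two only differ when an earlier range extends past the end of the row just before the current one.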

Mark Geerlings (Database Administrator) commented:
I think the lag function suggested by PortletPaul is the best way to solve this problem.  Your question is another example of the hardest kind of query to write in Oracle, basically:
"give me a report of what is *NOT* in the database."

Some common business examples are: invoices not paid, orders not shipped, POs not received, etc.  These are always more complex to write (and almost always slower for the database to answer) than queries of what *IS* in the database.
awking00 commented:
Can we assume that the start_date for the 3rd 102 record is a typo and should be Dec-12-2012?
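
As posted, the third 102 row starts (Dec-12-2011) before the second one ends (Dec-11-2012), so the two ranges overlap. A validation run could flag that kind of row with the same LAG() approach by reversing the test. This is a minimal sketch assuming the s_position table from the earlier comment; the column aliases are illustrative only.

```sql
SELECT
        position_tcd
      , to_char(prev_end_date,'YYYY-MM-DD') previous_end
      , to_char(start_date,'YYYY-MM-DD') this_start
FROM (
      SELECT
              position_tcd
              -- end_date of the previous row for the same position_tcd
            , lag(end_date,1) over (partition BY position_tcd ORDER BY start_date) AS prev_end_date
            , start_date
      FROM s_position
     )
-- a row that starts on or before the previous row's end overlaps it
WHERE start_date <= prev_end_date
;
```

With the sample data as posted, this should flag only the 102 row that starts on 2011-12-12, whose previous end is 2012-12-11. If the start date is corrected to Dec-12-2012, no rows should be returned.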
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.