Link to home
Start Free TrialLog in
Avatar of nQuote
nQuoteFlag for United States of America

asked on

Trying to resolve a complex PL/SQL query

I have a table with columns and data as follows:

ColKey      ColDt1                  ColDt2
A            04/02/2014   05/06/2018 11:59:59 PM
A            12/06/2016   03/02/2017 11:59:59 PM  --exception row
A            05/07/2018   02/03/2019 11:59:59 PM
B            09/02/2012   11/03/2014 11:59:59 PM
B            11/04/2014   12/11/2016 11:59:59 PM
....
The dates continue based on ColKey. However, there are exception situations, for example, for ColKey=A. The total number of rows of all exception situations might be much larger but the exception situation will always be between 3 rows, i.e., two continuous flowing rows and a exception row in the middle based on date. There might be 11 rows for a ColKey but exception situation might happen between row 5 and row 7 or might be between rows 8 and 10 among 15 rows.

I am trying to find the total count of all such exception situations in the whole table. I can explain further if needed. Thanks for your help.
Avatar of Kent Olsen
Kent Olsen
Flag of United States of America image

Will an exception that generates multiple lines always have exactly the same time recorded for all rows associated with the exception?

If so, select ColKey and ColDt2, and group by the two items. (Or distinct.)  Filter by ColDt2 is not null.   The total number of rows returned will be the number of exceptions recorded.


Avatar of nQuote

ASKER

Kent, thank you for your response.

No, the date will be different but the timestamp will be the same. However, the timestamp for all dates in ColDt2 is the same.
Yeah.  I didn't phrase that well.  :(

If ColDt2 is used to link rows of an event, that'll work.

SELECT count(*)
FROM
(
  SELECT distinct ColKey, ColDt2
  FROM MyTable
  WHERE ColDt2 is not NULL
) t0;

Open in new window


Avatar of nQuote

ASKER

Kent, thank you for your response.
Unfortunately, that won't work. Let me give a more detailed scenario of the data:

ColKey      ColDt1                  ColDt2
A            04/02/2014   05/06/2018 11:59:59 PM
A            12/06/2016   03/02/2017 11:59:59 PM  --exception row
A            05/07/2018   02/03/2019 11:59:59 PM
A            02/04/2019   03/05/2019 11:59:59 PM
A            03/06/2019   08/01/2019 11:59:59 PM
A            08/02/2019   11/12/2019 11:59:59 PM
A            11/13/2019    12/31/2099
B            09/02/2012   11/03/2014 11:59:59 PM
B            11/04/2014   12/11/2016 11:59:59 PM
B            12/12/2016   12/31/2099
C            05/06/2005   06/05/2006 11:59:59 PM
C            06/06/2006   07/07/2007 11:59:59 PM
C            07/08/2007   02/03/2008 11:59:59 PM
C            09/02/2009   11/12/2011 11:59:59 PM  --exception row
C            02/04/2008   05/01/2012 11:59:59 PM
C            05/02/2012   06/04/2016 11:59:59 PM
C            06/05/2016   12/31/2099
....
For the data above, the count will be 2 as there are two exception situations. Please let me know if you have any question.
How do you identify those 2 rows as exceptions?
Avatar of nQuote

ASKER

So basically the logic to determine if there is an exception row between two rows is that the ColDt2 of a row is not one second less than ColDt1 of the next row. We need to pick every three rows and then check. So if the ColDt2 of row 1 (among the three) is not one second less than than ColDt1 of the next row, that counts as 1 instance. In that case, we don't need to consider the third row of the set anymore. Then we pick the next set of three rows and so on.
Hmmm...  Still not quite following:

1 - C            05/06/2005   06/05/2006 11:59:59 PM
2 - C            06/06/2006   07/07/2007 11:59:59 PM
3 - C            07/08/2007   02/03/2008 11:59:59 PM
4 - C            09/02/2009   11/12/2011 11:59:59 PM  --exception row
5 - C            02/04/2008   05/01/2012 11:59:59 PM
6 - C            05/02/2012   06/04/2016 11:59:59 PM

If row 4 is an exception because it's not a second different from line 5, why isn't row 2 and exception because it's not a second off from row 3?
Avatar of nQuote

ASKER

The ColDt2 of row 2 is one second less than the ColDt1 of row 3. That makes them valid. So, for each row, its ColDt2 must be one second less than the ColDt1 of the next row.
Sorry.  I was focused just on the second date column.  :)  But I'm zeroing in....

Is the time value in the second column always 23:59:59?  If so,

SELECT count(*)
FROM
(
  SELECT t1.*
  FROM MyTable t0
  INNER JOIN MyTable t1
  ON t0.ColKey = t1.ColKey
   AND t0.ColDt1 = (trunc(t1.ColDt2)+1)
) t0;

Otherwise, we'll need to do a bit more time math...


Avatar of nQuote

ASKER

This does not give the right count. Remember that the ColDt2 must one second less than ColDt1 of the next row in an ORDER BY situation for this to be correct. This way the dates are sequential. The display I gave you is an output of an ORDER BY clause of the table.
Avatar of nQuote

ASKER

We might not a cursor for this as we have to consider three rows at a time. I was hoping to do it without it.
We'll need to know how to put the rows in the order that you show them.  Nothing in these 3 columns suggests that just the date columns are sufficient.

Avatar of nQuote

ASKER

select ColKey, ColDt1, ColDt2
from Tbl1
order by ColKey, ColDt1
Avatar of nQuote

ASKER

Thanks. I was able to resolve it by using a cursor.
ASKER CERTIFIED SOLUTION
Avatar of nQuote
nQuote
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Apologies, soccerplayer.  I had to step away.

This actually codes pretty easily without a cursor, too.

SELECT ColKey, ColDt1, ColDt2, trunc(ColDt2)+1 ColDtx, row_number () over (partition by Colkey, order by Colkey, Coldt1)
FROM mytable.

That will return the rows to look like this:
ColKey      ColDt1                  ColDt2
A, 04/02/2014, 05/06/2018 11:59:59 PM, 05/07/2018, 1
A, 12/06/2016, 03/02/2017 11:59:59 PM, 03/03/2017, 2  --exception row
A, 05/07/2018, 02/03/2019 11:59:59 PM, 02/04/2019, 3
A, 02/04/2019, 03/05/2019 11:59:59 PM, 03/06/2019, 4
A, 03/06/2019, 08/01/2019 11:59:59 PM, 08/02/2019, 5
A, 08/02/2019, 11/12/2019 11:59:59 PM, 11/13/2019, 6
A, 11/13/2019, 12/31/2099, 1/1/2100, 7
B, 09/02/2012, 11/03/2014 11:59:59 PM, 11/04/2014, 1
B, 11/04/2014, 12/11/2016 11:59:59 PM, 12/12/2016, 2
B, 12/12/2016, 12/31/2099, 1/1/2100, 3
C, 05/06/2005, 06/05/2006 11:59:59 PM, 06/06/2006, 1
C, 06/06/2006, 07/07/2007 11:59:59 PM, 07/08/2007, 2
C, 07/08/2007, 02/03/2008 11:59:59 PM, 02/03/2008, 3
C, 02/04/2008, 05/01/2012 11:59:59 PM, 05/02/2012, 4
C, 09/02/2009, 11/12/2011 11:59:59 PM, 11/13/2011, 5  --exception row
C, 05/02/2012, 06/04/2016 11:59:59 PM, 06/05/2016, 6
C, 06/05/2016, 12/31/2099, 1/1/2100, 7

This has the original 3 columns, the +1 value of ColDt2, and the row number relative to the ColKey.  it's pretty easy to self join this so that you can compare the rows to find the exceptions.

:)