newtoperlpgm

Oracle Regular Expressions and Dates

I want to use an Oracle regular expression to extract dates from a varchar2 column.
I need to convert the extracted text to a DATE so that I can compare it with today's date and filter out older dates.

regexp_like(batch_id,'[0-9]{2}.[0-9]{2}.[0-9]{2}') works in my where clause and brings back all the data, but I also need to convert it and compare it to SYSDATE to filter out old dates.

The data is in one varchar2 column and looks like this:


Student Term 01.31.19
Student Term 09.15.18
Student Term 09.30.18
Student Term 11.30.18
STAFF 08/31/18-08/15/19
EXTRA 8.31.18-12.21.18
EXTRAS END 08.31.18
AUGMENT END 08.31.18


All help is greatly appreciated.
slightwv (Netminder)

Another problem you will have is if you have a string:  01/02/18

Is that January 2nd or February 1st?

That said, what are your expected results from the above data?

For example, what do you want back with "STAFF 08/31/18-08/15/19"?

Is a dash the only allowed separator?  What about "I started on 11/12/13 and took a break on 11/15/16 and started again on 12/31/20"?

What about data like 99999999?  It matches your regex but cannot be converted.
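
To illustrate the point about 99999999: an unescaped `.` in a regular expression matches any character, including a digit, so the original pattern accepts eight consecutive digits. The query below is a minimal sketch that compares the original pattern with one that puts the separators in a bracket expression, assuming only `.` and `/` are used as separators. The sample strings are inlined through DUAL and the column aliases are made up for illustration.

```sql
-- Sample strings are inlined via DUAL, so no table is needed.
WITH samples AS (
  SELECT '99999999' AS s FROM dual UNION ALL
  SELECT 'Student Term 01.31.19' FROM dual UNION ALL
  SELECT 'STAFF 08/31/18-08/15/19' FROM dual
)
SELECT s,
       -- Original pattern: "." matches any single character, digits included.
       CASE WHEN REGEXP_LIKE(s, '[0-9]{2}.[0-9]{2}.[0-9]{2}')
            THEN 'Y' ELSE 'N' END AS any_char_sep,
       -- Inside [...] the dot is literal, so only "." or "/" is accepted.
       CASE WHEN REGEXP_LIKE(s, '[0-9]{2}[./][0-9]{2}[./][0-9]{2}')
            THEN 'Y' ELSE 'N' END AS literal_sep
FROM samples;
```

With the bracketed separators, 99999999 should no longer match, while the two date strings still do. A value such as 99.99.99 would still pass, so the pattern alone does not guarantee a convertible date.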
>>For example, what do you want back with "STAFF 08/31/18-08/15/19"?<<
Good question, especially since one date is prior to sysdate and one is after it. It's also the only row shown that uses a slash as the separator instead of a period. How would you know which format mask to use when converting to a date for comparison purposes? I think some more detail about what you have and what you want to accomplish with your comparisons would be most helpful.
newtoperlpgm (ASKER)

Thanks for the questions, very good observations.  The dates are in month, day, year order (MM/DD/YY as shown in the data), so 01/02/18 is January 02, 2018.

Also, I want the ending date from the string.  For example, from "STAFF 08/31/18-08/15/19" I expect to get 08/15/19.

>>Is a dash the only allowed separator?  What about "I started on 11/12/13 and took a break on 11/15/16 and started again on 12/31/20"?<<
I will never have 99999999 or "I started on 11/12/13 and took a break on 11/15/16 and started again on 12/31/20", because a human enters the dates. If I ever do, those rows will simply not be converted.  At this time the dash is the only separator between two dates, although it really could be anything. If the date information is worse than what you see in my example, we will not expect to get that data back without first having it fixed.

Thank you so much for any help you can provide.
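
For readers following along, here is a hedged sketch of one way to meet the requirement as stated above: take the date at the end of the string, convert it, and keep only rows whose ending date is today or later. It is not the accepted solution, which is not shown on this page. It assumes Oracle 12.2 or later (for `DEFAULT NULL ON CONVERSION ERROR`), that the ending date is the last thing in the string with no trailing spaces, and that two-digit years should be read with the `RR` mask. The `batches` and `parsed` names are made up; only `batch_id` comes from the question.

```sql
-- "batches" stands in for the real table; rows are copied from the question.
WITH batches AS (
  SELECT 'Student Term 01.31.19' AS batch_id FROM dual UNION ALL
  SELECT 'STAFF 08/31/18-08/15/19' FROM dual UNION ALL
  SELECT 'EXTRA 8.31.18-12.21.18' FROM dual UNION ALL
  SELECT 'AUGMENT END 08.31.18' FROM dual
),
parsed AS (
  SELECT batch_id,
         TO_DATE(
           -- "$" anchors the match at the end, so the ending date is taken.
           -- REPLACE normalises "/" to "." to line up with the format mask.
           REPLACE(
             REGEXP_SUBSTR(batch_id, '[0-9]{1,2}[./][0-9]{1,2}[./][0-9]{2}$'),
             '/', '.')
           -- Unconvertible text becomes NULL instead of raising an error.
           DEFAULT NULL ON CONVERSION ERROR,
           'MM.DD.RR') AS end_date
  FROM batches
)
SELECT batch_id, end_date
FROM parsed
WHERE end_date >= TRUNC(SYSDATE);  -- keep today and later; NULLs drop out
```

Because every sample date is from 2018 or 2019, the filter will likely return no rows today; removing the WHERE clause shows the parsed `end_date` values. Rows with no recognizable trailing date, or with an impossible date, end up with a NULL `end_date` and are excluded by the comparison.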
ASKER CERTIFIED SOLUTION
slightwv (Netminder)

This solution is only available to members.

You might consider modifying your regular expression to only allow values of 0 or 1 for the first digit of the month portion and values of 0-3 for the first digit of the day portion to further eliminate invalid entries (although it wouldn't prevent February 30th, for example).
regexp_substr(batch_id,'[0-1][0-9][./][0-3][0-9][./][0-9]{2}')
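
A small sketch of how far a digit-restricted pattern goes, written with balanced brackets and assuming the month comes first: it rejects obviously wrong digits but still lets through values such as 02.30.18 or 13.01.18. `VALIDATE_CONVERSION` (Oracle 12.2 or later) can then flag which candidates really convert. The sample strings and column aliases below are made up for illustration.

```sql
WITH samples AS (
  SELECT 'EXTRAS END 08.31.18' AS s FROM dual UNION ALL
  SELECT 'BAD DAY 02.30.18' FROM dual UNION ALL
  SELECT 'BAD MONTH 13.01.18' FROM dual
),
candidates AS (
  -- First digit of the month limited to 0-1, first digit of the day to 0-3.
  SELECT s,
         REGEXP_SUBSTR(s, '[0-1][0-9][./][0-3][0-9][./][0-9]{2}') AS candidate
  FROM samples
)
SELECT s,
       candidate,
       -- 1 when the text converts with this mask, 0 when it does not.
       -- Note: a NULL candidate also yields 1, so check for NULL separately.
       VALIDATE_CONVERSION(candidate AS DATE, 'MM.DD.RR') AS is_valid
FROM candidates;
```

All three samples pass the pattern, but only the first should be reported as convertible, which is why a conversion check is still worthwhile after the regular expression.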
Thank you for your help.