Link to home
Start Free TrialLog in
Avatar of Sam OZ
Sam OZFlag for Australia

asked on

Check patter of string and return first part

I have a Docuemnt No in following sequence
     DocuemntType-AreaNo-DepartMentCode-SequenceNo

    Typically a  DocuemntNumber wil be    Type1-01-001-123
     But there is bad data also so some of them may be just 01-003-234  ( No type) or 01/004/123 ( with /)  ..e.t.c
     I need the oracle query for following
          Check if the pattern of docuemnt number is correct  ( That is 3 hiphens) .
             If the pattern is correct , return the Docuemnt type  (First part of the DocuemntNo)
             Else return ''
ASKER CERTIFIED SOLUTION
Avatar of Alex [***Alex140181***]
Alex [***Alex140181***]
Flag of Germany image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of Sean Stuber
Sean Stuber

What identifies a document type?  The assumption in the post above is it's anything before the first hyphen as long as there are 3 hyphens and no non-numerics after the first hyphen
So  'this---'  would return 'this' as the answer using the query above.  Is that correct?

If it is, then I suggest removing the extra parentheses in the return value because sub-expressions are expensive.
Also, I would eliminate the case statement entirely as it's simply extra processing for no functional value.

So, combining both of those suggestions, the doc_return function call would simply be:

REGEXP_SUBSTR(doc_no, '^(.*)-\d*-\d*-\d*$', 1,1,NULL,1)

Again though, this assumes the "document type" is as I've described it.  If that's not the case, please specify and provide sample data and expected results.

I will take a guess that maybe you want to ensure there are at least one digit in each of the AreaNo-DepartMentCode-SequenceNo portions and one character in the document type
If so then replace the * with +

REGEXP_SUBSTR(doc_no, '^(.+)-\d+-\d+-\d+$', 1,1,NULL,1)
I agree with the "useless" REGEXP_COUNT ;-)
But you know, you're still using the expensive sub-expressions with "REGEXP_SUBSTR(doc_no, '^(.*)-\d*-\d*-\d*$', 1,1,NULL,1)" or "REGEXP_SUBSTR(doc_no, '^(.+)-\d+-\d+-\d+$', 1,1,NULL,1)".
As always: if this isn't huge big data whole lotta noise data processing and stuff, the expensive part is irrelevant. BUT, if you need to process huge amounts of data (maybe even in a time critical manner), then you should re-think the overall approach ;-)
If the pattern is truly simply three hyphens -
select decode(regexp_count(doc_no,'-'),3,substr(doc_no,1,instr(doc_no,'-') - 1))
from yourtable;
Avatar of Sam OZ

ASKER

Hi  Alex (Alexander Eßer)
    Your query works. But I can have  values like Type1-01-001-123/01   which are also good.  Can you please modify query ?

Thansk all for your reply
regexp_substr(doc_no, '^(.*)(-\d*-\d*-\d*)', 1, 1, null, 1)

Open in new window

BUT it all depends on what you want to achieve. I strongly recommend that you take one step back from your current level/view of detailness and re-think (like stdstuber already mentioned in a similar manner).
Your first question stated a file pattern abc, now you say "no, it's also xyz..." That's NOT the way you develop!
>>But I can have  values like Type1-01-001-123/01   which are also good<<
My query works with that value so the question is, "Is the desired result truly, for any documentno with three hyphens, return the substring up to the first hyphen"? If not, we would need to know all of the patterns that qualify for returning the first part.
Avatar of Sam OZ

ASKER

Hi Alex and all,
   Sorry for delay in getting back
       There is only one condition  . If  Document pattern is  DocuemntType-AreaNo-DepartMentCode-SequenceNo  return DocumentType
Else return ''

Only thing is Type1-01-001-123/01 is a good pattern . I am checking only the hiphens and not other characters
I'll reiterate, my query will treat Type1-01-001-123/01 as a good pattern and will retrieve the first part of every column that has 3 hyphens. What else do you need?
I'm really looking forward to how this is going to end...