Link to home
Start Free TrialLog in
Avatar of GurcanK
GurcanK

asked on

How to Use REGEXP_SUBSTR with PL/SQL

Dear Experts,

I have an Oracle  table with a DETAIL column as VARCHAR2(15). The DETAIL column contains values such as:

"ZENT=OZA NA 1234"
"ZENT=011231234567"
"ZENT=2341234567"
"ZENT=2341234567,"
"ZENT=2341234567B"
"ZENT=0123412345,67"
"ZENT=0123412345B67"

I want to achieve three things with the values of this column:

1-Extract the string excluding "ZENT=" part. For example "ZENT=011231234567" will be "011231234567". This will be called as "Extracted String" from now on.  

2-The extracted string will have no characters other than numeric. For example, "011231234567," will be "011231234567" (comma eliminated); "011231B234567" will be "011231234567" (B eliminated).

3-The extracted string will have at least 10 numeric characters in it.

How can I fulfill above requirements using REGEXP_SUBSTR and other PL/SQL functions?

Best Regards
Avatar of Rgonzo1971
Rgonzo1971

Hi,

You could replace all non numeric and then test for length

SELECT REGEXP_REPLACE('ZENT=OZA NA 1234', '[^0-9]+', '')
FROM Table

Open in new window

Regards
with org as
 (select 'ZENT=OZA NA 1234' val
    from dual
  union
  select 'ZENT=011231234567'
    from dual
  union
  select 'ZENT=2341234567'
    from dual
  union
  select 'ZENT=2341234567,'
    from dual
  union
  select 'ZENT=2341234567B'
    from dual
  union
  select 'ZENT=0123412345,67'
    from dual
  union
  select 'ZENT=0123412345B67'
    from dual)
select val,
       regexp_substr(val, '\d{10}') val_regexp
  from org;

Open in new window


Should the 'ZENT=OZA NA 1234' be left-padded with zeros or what do you expect?
Avatar of GurcanK

ASKER

I would like to locate "ZENT=" part first, then extract right of the "ZENT=" part till space, and then remove non-numeric. How can I achieve this?  

For example:
"RRRRRRRR MZX ZENT=012341234567,  BBBBBBBZ"
First I will get  only "012341234567," then remove comma and get: "012341234567"
Sorry, I missed the "least":

with org as
 (select 'ZENT=OZA NA 1234' val
    from dual
  union
  select 'ZENT=011231234567'
    from dual
  union
  select 'ZENT=2341234567'
    from dual
  union
  select 'ZENT=2341234567,'
    from dual
  union
  select 'ZENT=2341234567B'
    from dual
  union
  select 'ZENT=0123412345,67'
    from dual
  union
  select 'ZENT=0123412345B67'
    from dual)
select val,
       regexp_substr(val, '\d+') val_regexp
  from org;

Open in new window

And if you're just interested in the numeric values, don't mess around with the 'ZENT=" bla or whatever, just go for the digits ;-)
Avatar of GurcanK

ASKER

The column value can be: "RRRRRRRR MZX ZENT=012341234567B  BBBBBBBZ"

I will get: "012341234567" as a result.
Avatar of GurcanK

ASKER

For example: "RR123RRR 001 ZENT=012341234567B  BBBBBBBZ" will be "012341234567"
with org as
 (select 'RRRRRRRR MZX ZENT=012341234567B  BBBBBBBZ' val
    from dual)
select val,
       regexp_substr(val, '\d+') val_regexp
  from org;

Open in new window

Btw: check this site for testing regular expressions ;-)

https://regex101.com/
if you want "at least 10"  then you need {10,}  as modifier for your digit select

WITH org
     AS (SELECT 'ZENT=OZA NA 1234' val FROM DUAL
         UNION ALL
         SELECT 'ZENT=011231234567' FROM DUAL
         UNION ALL
         SELECT 'ZENT=2341234567' FROM DUAL
         UNION ALL
         SELECT 'ZENT=2341234567,' FROM DUAL
         UNION ALL
         SELECT 'ZENT=2341234567B' FROM DUAL
         UNION ALL
         SELECT 'ZENT=0123412345,67' FROM DUAL
         UNION ALL
         SELECT 'ZENT=0123412345B67' FROM DUAL
         UNION ALL
         SELECT 'RRRRRRRR MZX ZENT=012341234567B  BBBBBBBZ' FROM DUAL
         UNION ALL
         SELECT 'RR123RRR 001 ZENT=012341234567B  BBBBBBBZ' FROM DUAL)
SELECT val, REGEXP_SUBSTR(val, '[0-9]{10,}') val_regexp, REGEXP_SUBSTR(val, '\d{10,}') other_method
  FROM org;
Hi,

pls try ( My code doesn't stop at non numeric but before the first space and the delete the non numeric)
WITH org
      AS (SELECT 'ZENT=OZA NA 1234' val FROM DUAL
          UNION ALL
          SELECT 'ZENT=011231234567' FROM DUAL
          UNION ALL
          SELECT 'ZENT=2341234567' FROM DUAL
          UNION ALL
          SELECT 'ZENT=2341234567,' FROM DUAL
          UNION ALL
          SELECT 'ZENT=2341234567B' FROM DUAL
          UNION ALL
          SELECT 'ZENT=0123412345,67' FROM DUAL
          UNION ALL
          SELECT 'ZENT=0123412345B67 1sas' FROM DUAL
          UNION ALL
          SELECT 'RRRRRRRR MZX ZENT=012341234567B  BBBBBBBZ' FROM DUAL
          UNION ALL
          SELECT 'RR123RRR 001 ZENT=012341234567B  BBBBBBBZ' FROM DUAL)
SELECT CASE WHEN LENGTH(REGEXP_REPLACE(REGEXP_SUBSTR(val,'ZENT=\S+'),'\D','')) >=  10 THEN regexp_replace(REGEXP_SUBSTR(val,'ZENT=\S+'),'\D','') ELSE '' END
FROM org

Open in new window

Regards
SOLUTION
Avatar of Rgonzo1971
Rgonzo1971

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
WTF??? Are you serious regarding your points' distribution???
@ALex140181

Your solution does not cover the Problem where a letter is in the middle of the code

like in "011231B234567"

See question will be "011231234567" (B eliminated).
omg, this can't be true...
this is just ridiculous! what a pitty...
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of GurcanK

ASKER

Congrulations sdstuber, thanks too much.
@sdstuder
Nice one but could give problems if code is
ZENT=01234DDFFFFFF1  then length is less than 10
>>>> Congrulations sdstuber, thanks too much.

does this mean you wish to change your accepted answer or at least continue the discussion about the final requirements?


>>> then length is less than 10

yes, but the request was for items of length 10 or more, which that text was.

if the requirement is only for data that is 10 after cleanup, then I agree it wouldn't work reliably.
I don't know if that sample data was representative of real data though.
does this mean you wish to change your accepted answer or at least continue the discussion about the final requirements?

lol (don't get me wrong), but NOW I'm really uber-excited what's going to happen *big smile*
Avatar of GurcanK

ASKER

Yes, I can change answer.
actually taking a little from both of the last two answers we can simplify a bit more with one of these...


REGEXP_REPLACE(REGEXP_SUBSTR(val, 'ZENT=[^ ]{10,}'), '[^0-9]')
REGEXP_REPLACE(REGEXP_SUBSTR(val, 'ZENT=\S{10,}'), '\D+')


using {10,} in the expression put the length check before cleaning up the numbers
but rgonzo's also didn't include the subexpression which my previous post did unnecessarily.
A word of terminology....

This is not a PL/SQL question.  This is a SQL question.  PL/SQL is Oracle's proprietary language used for procedures/functions/packages/triggers/etc.  If you put PL/SQL into the question, then some people are going to skip over it because that is not their area of expertise.  I also commonly skip over these questions when I don't have the time to root through procedure code to help someone.  Using the correct terminology will get the best people to look at your question.