Oracle Regular Expression Looking for Missing Parenthesis

I am looking to find missing parens, both right or left, in an Oracle column of varchar. So far I have

select  testcol,  'MISSING LEFT PAREN' NOTE
from test a
where
regexp_like(a.testcol, '[^(].*[)]')
;

What I need to do is find any missing parens that should be there to match an existing one. Either right or left and if there are multiples.
Thanks much,
CaputoAsked:
Who is Participating?

Improve company productivity with a Business Account.Sign Up

x
 
sdstuberConnect With a Mentor Commented:
unfortunately this is not solvable for an unknown number of parentheses pairs with regular expressions (at least not most regexp engines including Oracle's)

if you know you will have at most N pairs, you can build a patterns to handle those N

if N is unknown though, then you will have to resort to other means


one relatively easy way to look for N pairs is strip all non-parentheses characters then replace all '()' strings with null N times
if you're left with a null string afterwards then you had good pairing, if you end up with non-null string then you had bad pairing

here's an example where N = 3.





SELECT   s,
         NVL2(
             REGEXP_REPLACE(
                 REGEXP_REPLACE(
                     REGEXP_REPLACE(REGEXP_REPLACE(s, '[^()]'), '\([^()]*\)'),
                     '\([^()]*\)'
                 ),
                 '\([^()]*\)'
             ),
             'Bad pairs',
             'Good pairs'
         )
  FROM   (SELECT   'nopairs' s FROM DUAL
          UNION ALL
          SELECT   'bad ( left' s FROM DUAL
          UNION ALL
          SELECT   'bad ) right' s FROM DUAL
          UNION ALL
          SELECT   '(good) and (bad' s FROM DUAL
          UNION ALL
          SELECT   'one (pair)' s FROM DUAL
          UNION ALL
          SELECT   '(two) (pairs)' s FROM DUAL
          UNION ALL
          SELECT   '((two pairs nested))' s FROM DUAL
          UNION ALL
          SELECT   '(three) ( ) (pairs)' FROM DUAL
          UNION ALL
          SELECT   '(((three pairs nested)))' FROM DUAL)

Open in new window

0
 
sdstuberCommented:
the expressions above can be simplified with ...


SELECT   s,
         NVL2(
            REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(s, '[^()]'), '\(\)'), '\(\)'),'\(\)'),
             'Bad pairs',
             'Good pairs'
         )



here's a function that will iterate through any number of pairs
if it returns NULL then your string had good pairs, if it returns not-null then you have uneven pairs
CREATE OR REPLACE FUNCTION good_pairs(p_string IN VARCHAR2)
    RETURN VARCHAR2
IS
    v_temp   VARCHAR2(32767) := REGEXP_REPLACE(p_string, '[^()]', NULL);
    v_new    VARCHAR2(32767) := REGEXP_REPLACE(v_temp, '\(\)');
BEGIN
    WHILE v_temp != NVL(v_new, '-')
    LOOP
        v_temp   := v_new;
        v_new    := REGEXP_REPLACE(v_new, '\(\)');
    END LOOP;
 
    RETURN v_temp;
END;

Open in new window

0
 
awking00Commented:
See attached.
test-parens.txt
0
Get your problem seen by more experts

Be seen. Boost your question’s priority for more expert views and faster solutions

 
sdstuberCommented:
if I'm reading that right, doesn't test_parens only check the total count of left and right?

so  ' a)))(  b ( c ('   would check out ok
0
 
awking00Commented:
sdstuber,
As usual, you're right. They do should be evaluated as pairs, which that function doesn't do. Back to the drawing board.
0
 
sdstuberCommented:
here are two other versions that use a stack type algorithm,  push for ( and pop for )

version 2 iterates through all characters
version 3 strips all non parentheses characters first (using regular expression) then iterates through just the parentheses

version 3 is faster than version 2, but neither is as good as the regular expression looping from post 24013818  (for 110,000 iterations)

9 seconds for version 1
30 seconds for version 2,
14 seconds for version 3

but the best I found is the first post with the hard limit of 3,  
that ran in 531 milliseconds for 110,000 iterations

so, if you can identify a maximum number of parentheses to check I recommend doing it that way since invoking a pl/sql function is MUCH slower, even in the best case
0
 
sdstuberCommented:
here are two other versions that use a stack type algorithm,  push for ( and pop for )

version 2 iterates through all characters
version 3 strips all non parentheses characters first (using regular expression) then iterates through just the parentheses

version 3 is faster than version 2, but neither is as good as the regular expression looping from post 24013818  (for 110,000 iterations)

9 seconds for version 1
30 seconds for version 2,
14 seconds for version 3

but the best I found is the first post with the hard limit of 3,  
that ran in 531 milliseconds for 110,000 iterations

so, if you can identify a maximum number of parentheses to check I recommend doing it that way since invoking a pl/sql function is MUCH slower, even in the best case
CREATE OR REPLACE FUNCTION good_pairs2(p_string IN VARCHAR2)
    RETURN VARCHAR2
IS
    v_cnt   INTEGER := 0;
    v_index INTEGER := 1;
BEGIN
    WHILE v_index <= LENGTH(p_string)
    LOOP
        IF SUBSTR(p_string, v_index, 1) = '('
        THEN
            v_cnt := v_cnt + 1;
        ELSIF SUBSTR(p_string, v_index, 1) = ')'
        THEN
            v_cnt := v_cnt - 1;
        END IF;
 
        IF v_cnt < 0
        THEN
            EXIT;
        END IF;
 
        v_index := v_index + 1;
    END LOOP;
 
    IF v_cnt = 0
    THEN
        RETURN 'Good Pairs';
    ELSE
        RETURN 'Bad Pairs';
    END IF;
END;
 
 
CREATE OR REPLACE FUNCTION good_pairs3(p_string IN VARCHAR2)
    RETURN VARCHAR2
IS
    v_cnt   INTEGER := 0;
    v_index INTEGER := 1;
    v_string varchar2(32767) := regexp_replace(p_string,'[^()]');
BEGIN
    WHILE v_index <= LENGTH(v_string)
    LOOP
        IF SUBSTR(v_string, v_index, 1) = '('
        THEN
            v_cnt := v_cnt + 1;
        ELSIF SUBSTR(v_string, v_index, 1) = ')'
        THEN
            v_cnt := v_cnt - 1;
        END IF;
 
        IF v_cnt < 0
        THEN
            EXIT;
        END IF;
 
        v_index := v_index + 1;
    END LOOP;
 
    IF v_cnt = 0
    THEN
        RETURN 'Good Pairs';
    ELSE
        RETURN 'Bad Pairs';
    END IF;
END;

Open in new window

0
 
CaputoAuthor Commented:
Thanks guys, I think the first answer will suffice, however the others are certainly interesting and worth some review.
0
 
sdstuberCommented:
glad I could help
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.