?
Solved

Oracle Regular Expression Looking for Missing Parenthesis

Posted on 2009-03-29
9
Medium Priority
?
669 Views
Last Modified: 2013-12-18
I am looking to find missing parens, both right or left, in an Oracle column of varchar. So far I have

select  testcol,  'MISSING LEFT PAREN' NOTE
from test a
where
regexp_like(a.testcol, '[^(].*[)]')
;

What I need to do is find any missing parens that should be there to match an existing one. Either right or left and if there are multiples.
Thanks much,
0
Comment
Question by:Caputo
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 6
  • 2
9 Comments
 
LVL 74

Accepted Solution

by:
sdstuber earned 2000 total points
ID: 24013620
unfortunately this is not solvable for an unknown number of parentheses pairs with regular expressions (at least not most regexp engines including Oracle's)

if you know you will have at most N pairs, you can build a patterns to handle those N

if N is unknown though, then you will have to resort to other means


one relatively easy way to look for N pairs is strip all non-parentheses characters then replace all '()' strings with null N times
if you're left with a null string afterwards then you had good pairing, if you end up with non-null string then you had bad pairing

here's an example where N = 3.





SELECT   s,
         NVL2(
             REGEXP_REPLACE(
                 REGEXP_REPLACE(
                     REGEXP_REPLACE(REGEXP_REPLACE(s, '[^()]'), '\([^()]*\)'),
                     '\([^()]*\)'
                 ),
                 '\([^()]*\)'
             ),
             'Bad pairs',
             'Good pairs'
         )
  FROM   (SELECT   'nopairs' s FROM DUAL
          UNION ALL
          SELECT   'bad ( left' s FROM DUAL
          UNION ALL
          SELECT   'bad ) right' s FROM DUAL
          UNION ALL
          SELECT   '(good) and (bad' s FROM DUAL
          UNION ALL
          SELECT   'one (pair)' s FROM DUAL
          UNION ALL
          SELECT   '(two) (pairs)' s FROM DUAL
          UNION ALL
          SELECT   '((two pairs nested))' s FROM DUAL
          UNION ALL
          SELECT   '(three) ( ) (pairs)' FROM DUAL
          UNION ALL
          SELECT   '(((three pairs nested)))' FROM DUAL)

Open in new window

0
 
LVL 74

Expert Comment

by:sdstuber
ID: 24013818
the expressions above can be simplified with ...


SELECT   s,
         NVL2(
            REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(s, '[^()]'), '\(\)'), '\(\)'),'\(\)'),
             'Bad pairs',
             'Good pairs'
         )



here's a function that will iterate through any number of pairs
if it returns NULL then your string had good pairs, if it returns not-null then you have uneven pairs
CREATE OR REPLACE FUNCTION good_pairs(p_string IN VARCHAR2)
    RETURN VARCHAR2
IS
    v_temp   VARCHAR2(32767) := REGEXP_REPLACE(p_string, '[^()]', NULL);
    v_new    VARCHAR2(32767) := REGEXP_REPLACE(v_temp, '\(\)');
BEGIN
    WHILE v_temp != NVL(v_new, '-')
    LOOP
        v_temp   := v_new;
        v_new    := REGEXP_REPLACE(v_new, '\(\)');
    END LOOP;
 
    RETURN v_temp;
END;

Open in new window

0
 
LVL 32

Expert Comment

by:awking00
ID: 24019127
See attached.
test-parens.txt
0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 74

Expert Comment

by:sdstuber
ID: 24019256
if I'm reading that right, doesn't test_parens only check the total count of left and right?

so  ' a)))(  b ( c ('   would check out ok
0
 
LVL 32

Expert Comment

by:awking00
ID: 24020972
sdstuber,
As usual, you're right. They do should be evaluated as pairs, which that function doesn't do. Back to the drawing board.
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 24021540
here are two other versions that use a stack type algorithm,  push for ( and pop for )

version 2 iterates through all characters
version 3 strips all non parentheses characters first (using regular expression) then iterates through just the parentheses

version 3 is faster than version 2, but neither is as good as the regular expression looping from post 24013818  (for 110,000 iterations)

9 seconds for version 1
30 seconds for version 2,
14 seconds for version 3

but the best I found is the first post with the hard limit of 3,  
that ran in 531 milliseconds for 110,000 iterations

so, if you can identify a maximum number of parentheses to check I recommend doing it that way since invoking a pl/sql function is MUCH slower, even in the best case
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 24021544
here are two other versions that use a stack type algorithm,  push for ( and pop for )

version 2 iterates through all characters
version 3 strips all non parentheses characters first (using regular expression) then iterates through just the parentheses

version 3 is faster than version 2, but neither is as good as the regular expression looping from post 24013818  (for 110,000 iterations)

9 seconds for version 1
30 seconds for version 2,
14 seconds for version 3

but the best I found is the first post with the hard limit of 3,  
that ran in 531 milliseconds for 110,000 iterations

so, if you can identify a maximum number of parentheses to check I recommend doing it that way since invoking a pl/sql function is MUCH slower, even in the best case
CREATE OR REPLACE FUNCTION good_pairs2(p_string IN VARCHAR2)
    RETURN VARCHAR2
IS
    v_cnt   INTEGER := 0;
    v_index INTEGER := 1;
BEGIN
    WHILE v_index <= LENGTH(p_string)
    LOOP
        IF SUBSTR(p_string, v_index, 1) = '('
        THEN
            v_cnt := v_cnt + 1;
        ELSIF SUBSTR(p_string, v_index, 1) = ')'
        THEN
            v_cnt := v_cnt - 1;
        END IF;
 
        IF v_cnt < 0
        THEN
            EXIT;
        END IF;
 
        v_index := v_index + 1;
    END LOOP;
 
    IF v_cnt = 0
    THEN
        RETURN 'Good Pairs';
    ELSE
        RETURN 'Bad Pairs';
    END IF;
END;
 
 
CREATE OR REPLACE FUNCTION good_pairs3(p_string IN VARCHAR2)
    RETURN VARCHAR2
IS
    v_cnt   INTEGER := 0;
    v_index INTEGER := 1;
    v_string varchar2(32767) := regexp_replace(p_string,'[^()]');
BEGIN
    WHILE v_index <= LENGTH(v_string)
    LOOP
        IF SUBSTR(v_string, v_index, 1) = '('
        THEN
            v_cnt := v_cnt + 1;
        ELSIF SUBSTR(v_string, v_index, 1) = ')'
        THEN
            v_cnt := v_cnt - 1;
        END IF;
 
        IF v_cnt < 0
        THEN
            EXIT;
        END IF;
 
        v_index := v_index + 1;
    END LOOP;
 
    IF v_cnt = 0
    THEN
        RETURN 'Good Pairs';
    ELSE
        RETURN 'Bad Pairs';
    END IF;
END;

Open in new window

0
 

Author Comment

by:Caputo
ID: 24029330
Thanks guys, I think the first answer will suffice, however the others are certainly interesting and worth some review.
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 24029557
glad I could help
0

Featured Post

Veeam Task Manager for Hyper-V

Task Manager for Hyper-V provides critical information that allows you to monitor Hyper-V performance by displaying real-time views of CPU and memory at the individual VM-level, so you can quickly identify which VMs are using host resources.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Checking the Alert Log in AWS RDS Oracle can be a pain through their user interface.  I made a script to download the Alert Log, look for errors, and email me the trace files.  In this article I'll describe what I did and share my script.
From implementing a password expiration date, to datatype conversions and file export options, these are some useful settings I've found in Jasper Server.
Video by: Steve
Using examples as well as descriptions, step through each of the common simple join types, explaining differences in syntax, differences in expected outputs and showing how the queries run along with the actual outputs based upon a simple set of dem…
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function
Suggested Courses

800 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question