Link to home
Start Free TrialLog in
Avatar of premkumar somasundaram palani
premkumar somasundaram palani

asked on

Regexp to get exact word

I am trying to identify the rows which match the below conditions from the Source String

Source Strings
1) Match
Should get selected.
2) Matched
Should not get selected as it has "ed" at the end.
3)  Match
Should get selected even though there is a space before the keyword.
4) RO_Match
Should not get selected as the word is not exactly same as "Match"
6) RMatch
Should not get selected as the word is not exactly same as "Match"
7).Match
Should get selected as  "." character is allowed before the  Keyword.

How to do this thru regexp_instr?
Avatar of slightwv (䄆 Netminder)
slightwv (䄆 Netminder)

Do you have a list of 'allowed' and 'not allowed' characters?

For example:  '_' isn't allowed but '.' is.

Given the explanations above:
RO.Match
Should get selected
Avatar of premkumar somasundaram palani

ASKER

Allowed characters
1) .  (Period)- Zero or one occurence
2) Space - Zero or more occurence

Not allowed
1) Any alphanumeric character
2)  Alphanumberic in combination with Underscore. Ex: RO_Match
3) Only Underscore

RO.Match is not allowed as it has "RO" before "."

I am trying to pick the database object names from Source String.

Database objects list:
1) RO_MATCH
2) MATCH
3) TEST_Package

Source String is "XXXX Match XXXX Test_Package".
Only Match and Test_Package should be selected. But this query regexp_instr(<Source String>,'[^_]*'||dba_objects.object_name||'[^_]*) gives RO_Match in the result.
how about this?

regexp_instr(source_string,' *.?Match.? *)

or maybe simpler but not completely functionally equivalent


regexp_instr(source_string,'[ .]Match[ .]')
A full solution is going to depend on the search space. (i.e. all source strings)

There are expressions that can match your rules for some strings but not others.
regexp_instr(source_string,' *.?Match.? *)  ==> still prints RO_MATCH in output
regexp_instr(source_string,'[ .]Match[ .]') ==> NO Values are getting printed out.
I reread the query I think I understand better now I'll try again
SELECT REGEXP_INSTR(
           'XXXX Match XXXX Test_Package',
           '(^|[. ])'||object_name||'([. ]|$)',
           1,
           1,
           0,
           'i'
       )
  FROM (SELECT 'RO_MATCH' object_name FROM DUAL
        UNION ALL
        SELECT 'MATCH' FROM DUAL
        UNION ALL
        SELECT 'TEST_Package' FROM DUAL)


you would use your real query from dba_objects instead of the inline view I used
When i substitute the values in Your query, it gives me correct result. But my query does not print "Match".

My Exact Source String:  "INSERT INTO R_MATCH SELECT xxxx FROM  TEST_PACKAGE"

Query:  regexp_instr(TM.SQL_STMT,'(^|[. ])'||do.object_name||'([. ]|$)')

If i use the query regexp_instr(TM.SQL_STMT,'[^_]*'||do.object_name||'[^_]*')    both R_Match and Match gets printed. I need only R_Match in the output.

Since my source string is multi-lined, not sure this would be the reason.
>>> But my query does not print "Match".

why would it?  you asked to use regexp_instr which returns a number


>>> Query:  regexp_instr(TM.SQL_STMT,'(^|[. ])'||do.object_name||'([. ]|$)')

that's not a query, that's a function call
please post your query that you are using where you are trying to "print" these results
 
>> If i use the query regexp_instr(TM.SQL_STMT,'[^_]*'||do.object_name||'[^_]*')  

I already posted a different expression an hour prior to your reply, try that one


>>> My Exact Source String:  "INSERT INTO R_MATCH SELECT xxxx FROM  TEST_PACKAGE"
>>> Since my source string is multi-lined, not sure this would be the reason.


These two statements contradict each other.

Is the first line really the source string, or is the second line correct and you have a multi-line statement?
If the latter, then please provide a real example.
you have been mixing case in your examples and explanations with contradictory results

also,  this example doesn't match previous examples

If i use the query regexp_instr(TM.SQL_STMT,'[^_]*'||do.object_name||'[^_]*')    both R_Match and Match gets printed. I need only R_Match in the output.

Why would R_Match   be found when your inputs  are

Database objects list:
1) RO_MATCH
2) MATCH
3) TEST_Package

TEST_Package should match but none of the others should using the last expression I posted.
However it won't match either if you want case-sensitive matching, which, as noted above, you seem to have vacillated on.

Your question will probably be easy to answer if you can provide a complete, representative and consistent test case of sample data and expected results (both in rows and columns, not descriptions.)
ASKER CERTIFIED SOLUTION
Avatar of Sean Stuber
Sean Stuber

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Your last solution it worked.I used instr to check if such a object is present and substr to extract the same.