Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

Oracle PL/SQL - Repetitive Special Character Search

Posted on 2013-11-04
14
Medium Priority
?
1,382 Views
Last Modified: 2013-11-04
Hi,

I have a table with a text field and I need to search for records with repetitive characters/special characters.

For example, I need to find names that have 'XXX XXX XXXX' or '1111111'.
I also need to find repetitive string with special characters such as '%%$$$$$@@'.

Could anyone help me how I can construct the query to find such patterns?

Thank you.
0
Comment
Question by:iamnamja
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 7
  • 4
  • 3
14 Comments
 
LVL 32

Expert Comment

by:awking00
ID: 39622088
Can you provide more descriptive examples of what your search should fine and the kind of data that would contain such characters? I suspect we can use regular expressions to find repeating and special characters.
0
 

Author Comment

by:iamnamja
ID: 39622211
Hi,

For example, in my customer name field, I have strings such as:

John Brown
Paul Smith
Mary Cohen
Susan XXXXX
11111111

My search should find 'Susan XXXXX', and '11111111'.

I tried using regular expressions in my where clause : regexp_like(Cust_Name, '[X]'), but this finds all the names that contain 'X'. I need to find strings that have any characters that show up repetitively whether it's alphanumeric or special characters.
0
 
LVL 77

Expert Comment

by:slightwv (䄆 Netminder)
ID: 39622268
See if this works for you.  The 'flaw' is legitimate repeating characters.   Note in the following test case I added 'Allen Jackson' and it is returned because of the two 'l's in Allen.

drop table tab1 purge;
create table tab1(col1 varchar2(40));

insert into tab1 values('John Brown');
insert into tab1 values('Paul Smith');
insert into tab1 values('Mary Cohen');
insert into tab1 values('Susan XXXXX');
insert into tab1 values('11111');
insert into tab1 values('Allen Jackson');
commit;


select col1 from tab1 
where regexp_like(col1,'(.)\1')
/

Open in new window

0
Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 

Author Comment

by:iamnamja
ID: 39622333
Thank you for the comment. However, my search shouldn't catch 'Allen Jackson' because that is a legitimate customer name. I'm trying to find any strings that contain data quality issues.
Is there any way to increase the number of occurance in the regexp_like search? For example, to return only the names with more than 5+ repeating characters.
0
 
LVL 32

Expert Comment

by:awking00
ID: 39622339
To avoid also getting rows like Allen or Miller, you can modify the regexp to
select * from rpt
where regexp_like(str,'(.)\1{3}');
which should only retrieve where characters repeat at least 3 times.
0
 
LVL 32

Expert Comment

by:awking00
ID: 39622346
Sorry  iposted after your comment, to retrieve characters repeating at least 5 times -
select * from rpt
where regexp_like(str,'(.)\1{5}');
0
 
LVL 77

Expert Comment

by:slightwv (䄆 Netminder)
ID: 39622366
>>to retrieve characters repeating at least 5 times

Slight correction:
regexp_like(str,'(.)\1{4}')

The backreference counts as one.  So to get 5 repeating, you need the number 4.

If you run it against my test case with a 5, you won't get 'Susan XXXXX'
0
 
LVL 32

Expert Comment

by:awking00
ID: 39622370
I just discovered the qualifier inside the braces is >. I always thought it was >= so for 5-time repeats
select * from rpt
where regexp_like(str,'(.)\1{4}');
0
 
LVL 32

Expert Comment

by:awking00
ID: 39622382
Thanks, slightwv. I forgot about the backreference having to be counted.
0
 

Author Comment

by:iamnamja
ID: 39622465
Thank you so much. I think 'select * from rpt where regexp_like(str,'(.)\1{4}'); ' will work for my search. I have one more related question:
Is there a way to find a string with combination of repeating special characters such as '%%$$$$$@@'? Not treating %, $, @ differently and to catch any kind of special non-alphanumeric characters if they are repeating more than 5+ occurances?
0
 
LVL 77

Assisted Solution

by:slightwv (䄆 Netminder)
slightwv (䄆 Netminder) earned 800 total points
ID: 39622536
Off the top of my head, the way I can think of:
replace all non-alpha, non-number or a space with the same character, then do the check as already posted.

drop table tab1 purge;
create table tab1(col1 varchar2(40));

insert into tab1 values('John Brown');
insert into tab1 values('Paul Smith');
insert into tab1 values('Mary Cohen');
insert into tab1 values('Susan XXXXX');
insert into tab1 values('111111111111111111');
insert into tab1 values('Allen Jackson');
insert into tab1 values('%%$$$$@@');
commit;


with mydata as
(
select col1, regexp_replace(col1,'[^0-9a-zA-Z ]','#') fudged from tab1 
)
select col1 from mydata
where regexp_like(fudged,'(.)\1{4,}')
/

Open in new window

0
 
LVL 32

Expert Comment

by:awking00
ID: 39622546
You might use the translate function on your string first
where regexp_like(translate(str,'#@$%','####'),'(.)\1{4}');

This example changes # to #. @ to #, $ to #, and % to #. You can expand the list to include other special characters that may exist by adding a '#' to the third translate parameter for each special character added.
0
 
LVL 32

Accepted Solution

by:
awking00 earned 1200 total points
ID: 39622558
SQL> select * from rpt;

STR
--------------------
aaaaaaaaa
bb
cccc
ddddddd
11
22222
@@@@@
123456
%%
eee%%$$$$@@

SQL> select * from rpt
  2  where regexp_like(translate(str,'#@$%','####'),'(.)\1{4}');

STR
--------------------
aaaaaaaaa
ddddddd
22222
@@@@@
eee%%$$$$@@
0
 

Author Closing Comment

by:iamnamja
ID: 39622664
Thank you both awking00 and slightwv for your comments. This returns exactly what I need to search. Thanks!
0

Featured Post

Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

This article started out as an Experts-Exchange question, which then grew into a quick tip to go along with an IOUG presentation for the Collaborate confernce and then later grew again into a full blown article with expanded functionality and legacy…
Configuring and using Oracle Database Gateway for ODBC Introduction First, a brief summary of what a Database Gateway is.  A Gateway is a set of driver agents and configurations that allow an Oracle database to communicate with other platforms…
Via a live example show how to connect to RMAN, make basic configuration settings changes and then take a backup of a demo database
This video shows how to copy a database user from one database to another user DBMS_METADATA.  It also shows how to copy a user's permissions and discusses password hash differences between Oracle 10g and 11g.

688 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question