Solved

Oracle PL/SQL - Repetitive Special Character Search

Posted on 2013-11-04
14
1,173 Views
Last Modified: 2013-11-04
Hi,

I have a table with a text field and I need to search for records with repetitive characters/special characters.

For example, I need to find names that have 'XXX XXX XXXX' or '1111111'.
I also need to find repetitive string with special characters such as '%%$$$$$@@'.

Could anyone help me how I can construct the query to find such patterns?

Thank you.
0
Comment
Question by:iamnamja
  • 7
  • 4
  • 3
14 Comments
 
LVL 31

Expert Comment

by:awking00
ID: 39622088
Can you provide more descriptive examples of what your search should fine and the kind of data that would contain such characters? I suspect we can use regular expressions to find repeating and special characters.
0
 

Author Comment

by:iamnamja
ID: 39622211
Hi,

For example, in my customer name field, I have strings such as:

John Brown
Paul Smith
Mary Cohen
Susan XXXXX
11111111

My search should find 'Susan XXXXX', and '11111111'.

I tried using regular expressions in my where clause : regexp_like(Cust_Name, '[X]'), but this finds all the names that contain 'X'. I need to find strings that have any characters that show up repetitively whether it's alphanumeric or special characters.
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 39622268
See if this works for you.  The 'flaw' is legitimate repeating characters.   Note in the following test case I added 'Allen Jackson' and it is returned because of the two 'l's in Allen.

drop table tab1 purge;
create table tab1(col1 varchar2(40));

insert into tab1 values('John Brown');
insert into tab1 values('Paul Smith');
insert into tab1 values('Mary Cohen');
insert into tab1 values('Susan XXXXX');
insert into tab1 values('11111');
insert into tab1 values('Allen Jackson');
commit;


select col1 from tab1 
where regexp_like(col1,'(.)\1')
/

Open in new window

0
 

Author Comment

by:iamnamja
ID: 39622333
Thank you for the comment. However, my search shouldn't catch 'Allen Jackson' because that is a legitimate customer name. I'm trying to find any strings that contain data quality issues.
Is there any way to increase the number of occurance in the regexp_like search? For example, to return only the names with more than 5+ repeating characters.
0
 
LVL 31

Expert Comment

by:awking00
ID: 39622339
To avoid also getting rows like Allen or Miller, you can modify the regexp to
select * from rpt
where regexp_like(str,'(.)\1{3}');
which should only retrieve where characters repeat at least 3 times.
0
 
LVL 31

Expert Comment

by:awking00
ID: 39622346
Sorry  iposted after your comment, to retrieve characters repeating at least 5 times -
select * from rpt
where regexp_like(str,'(.)\1{5}');
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 39622366
>>to retrieve characters repeating at least 5 times

Slight correction:
regexp_like(str,'(.)\1{4}')

The backreference counts as one.  So to get 5 repeating, you need the number 4.

If you run it against my test case with a 5, you won't get 'Susan XXXXX'
0
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

 
LVL 31

Expert Comment

by:awking00
ID: 39622370
I just discovered the qualifier inside the braces is >. I always thought it was >= so for 5-time repeats
select * from rpt
where regexp_like(str,'(.)\1{4}');
0
 
LVL 31

Expert Comment

by:awking00
ID: 39622382
Thanks, slightwv. I forgot about the backreference having to be counted.
0
 

Author Comment

by:iamnamja
ID: 39622465
Thank you so much. I think 'select * from rpt where regexp_like(str,'(.)\1{4}'); ' will work for my search. I have one more related question:
Is there a way to find a string with combination of repeating special characters such as '%%$$$$$@@'? Not treating %, $, @ differently and to catch any kind of special non-alphanumeric characters if they are repeating more than 5+ occurances?
0
 
LVL 76

Assisted Solution

by:slightwv (䄆 Netminder)
slightwv (䄆 Netminder) earned 200 total points
ID: 39622536
Off the top of my head, the way I can think of:
replace all non-alpha, non-number or a space with the same character, then do the check as already posted.

drop table tab1 purge;
create table tab1(col1 varchar2(40));

insert into tab1 values('John Brown');
insert into tab1 values('Paul Smith');
insert into tab1 values('Mary Cohen');
insert into tab1 values('Susan XXXXX');
insert into tab1 values('111111111111111111');
insert into tab1 values('Allen Jackson');
insert into tab1 values('%%$$$$@@');
commit;


with mydata as
(
select col1, regexp_replace(col1,'[^0-9a-zA-Z ]','#') fudged from tab1 
)
select col1 from mydata
where regexp_like(fudged,'(.)\1{4,}')
/

Open in new window

0
 
LVL 31

Expert Comment

by:awking00
ID: 39622546
You might use the translate function on your string first
where regexp_like(translate(str,'#@$%','####'),'(.)\1{4}');

This example changes # to #. @ to #, $ to #, and % to #. You can expand the list to include other special characters that may exist by adding a '#' to the third translate parameter for each special character added.
0
 
LVL 31

Accepted Solution

by:
awking00 earned 300 total points
ID: 39622558
SQL> select * from rpt;

STR
--------------------
aaaaaaaaa
bb
cccc
ddddddd
11
22222
@@@@@
123456
%%
eee%%$$$$@@

SQL> select * from rpt
  2  where regexp_like(translate(str,'#@$%','####'),'(.)\1{4}');

STR
--------------------
aaaaaaaaa
ddddddd
22222
@@@@@
eee%%$$$$@@
0
 

Author Closing Comment

by:iamnamja
ID: 39622664
Thank you both awking00 and slightwv for your comments. This returns exactly what I need to search. Thanks!
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Join & Write a Comment

Have you ever had to make fundamental changes to a table in Oracle, but haven't been able to get any downtime?  I'm talking things like: * Dropping columns * Shrinking allocated space * Removing chained blocks and restoring the PCTFREE * Re-or…
This post first appeared at Oracleinaction  (http://oracleinaction.com/undo-and-redo-in-oracle/)by Anju Garg (Myself). I  will demonstrate that undo for DML’s is stored both in undo tablespace and online redo logs. Then, we will analyze the reaso…
This video shows how to Export data from an Oracle database using the Original Export Utility.  The corresponding Import utility, which works the same way is referenced, but not demonstrated.
This video shows how to copy an entire tablespace from one database to another database using Transportable Tablespace functionality.

707 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

15 Experts available now in Live!

Get 1:1 Help Now