Go Premium for a chance to win a PS4. Enter to Win

x
?
Solved

Oracle PL/SQL - Repetitive Special Character Search

Posted on 2013-11-04
14
Medium Priority
?
1,422 Views
Last Modified: 2013-11-04
Hi,

I have a table with a text field and I need to search for records with repetitive characters/special characters.

For example, I need to find names that have 'XXX XXX XXXX' or '1111111'.
I also need to find repetitive string with special characters such as '%%$$$$$@@'.

Could anyone help me how I can construct the query to find such patterns?

Thank you.
0
Comment
Question by:iamnamja
  • 7
  • 4
  • 3
14 Comments
 
LVL 32

Expert Comment

by:awking00
ID: 39622088
Can you provide more descriptive examples of what your search should fine and the kind of data that would contain such characters? I suspect we can use regular expressions to find repeating and special characters.
0
 

Author Comment

by:iamnamja
ID: 39622211
Hi,

For example, in my customer name field, I have strings such as:

John Brown
Paul Smith
Mary Cohen
Susan XXXXX
11111111

My search should find 'Susan XXXXX', and '11111111'.

I tried using regular expressions in my where clause : regexp_like(Cust_Name, '[X]'), but this finds all the names that contain 'X'. I need to find strings that have any characters that show up repetitively whether it's alphanumeric or special characters.
0
 
LVL 78

Expert Comment

by:slightwv (䄆 Netminder)
ID: 39622268
See if this works for you.  The 'flaw' is legitimate repeating characters.   Note in the following test case I added 'Allen Jackson' and it is returned because of the two 'l's in Allen.

drop table tab1 purge;
create table tab1(col1 varchar2(40));

insert into tab1 values('John Brown');
insert into tab1 values('Paul Smith');
insert into tab1 values('Mary Cohen');
insert into tab1 values('Susan XXXXX');
insert into tab1 values('11111');
insert into tab1 values('Allen Jackson');
commit;


select col1 from tab1 
where regexp_like(col1,'(.)\1')
/

Open in new window

0
Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

 

Author Comment

by:iamnamja
ID: 39622333
Thank you for the comment. However, my search shouldn't catch 'Allen Jackson' because that is a legitimate customer name. I'm trying to find any strings that contain data quality issues.
Is there any way to increase the number of occurance in the regexp_like search? For example, to return only the names with more than 5+ repeating characters.
0
 
LVL 32

Expert Comment

by:awking00
ID: 39622339
To avoid also getting rows like Allen or Miller, you can modify the regexp to
select * from rpt
where regexp_like(str,'(.)\1{3}');
which should only retrieve where characters repeat at least 3 times.
0
 
LVL 32

Expert Comment

by:awking00
ID: 39622346
Sorry  iposted after your comment, to retrieve characters repeating at least 5 times -
select * from rpt
where regexp_like(str,'(.)\1{5}');
0
 
LVL 78

Expert Comment

by:slightwv (䄆 Netminder)
ID: 39622366
>>to retrieve characters repeating at least 5 times

Slight correction:
regexp_like(str,'(.)\1{4}')

The backreference counts as one.  So to get 5 repeating, you need the number 4.

If you run it against my test case with a 5, you won't get 'Susan XXXXX'
0
 
LVL 32

Expert Comment

by:awking00
ID: 39622370
I just discovered the qualifier inside the braces is >. I always thought it was >= so for 5-time repeats
select * from rpt
where regexp_like(str,'(.)\1{4}');
0
 
LVL 32

Expert Comment

by:awking00
ID: 39622382
Thanks, slightwv. I forgot about the backreference having to be counted.
0
 

Author Comment

by:iamnamja
ID: 39622465
Thank you so much. I think 'select * from rpt where regexp_like(str,'(.)\1{4}'); ' will work for my search. I have one more related question:
Is there a way to find a string with combination of repeating special characters such as '%%$$$$$@@'? Not treating %, $, @ differently and to catch any kind of special non-alphanumeric characters if they are repeating more than 5+ occurances?
0
 
LVL 78

Assisted Solution

by:slightwv (䄆 Netminder)
slightwv (䄆 Netminder) earned 800 total points
ID: 39622536
Off the top of my head, the way I can think of:
replace all non-alpha, non-number or a space with the same character, then do the check as already posted.

drop table tab1 purge;
create table tab1(col1 varchar2(40));

insert into tab1 values('John Brown');
insert into tab1 values('Paul Smith');
insert into tab1 values('Mary Cohen');
insert into tab1 values('Susan XXXXX');
insert into tab1 values('111111111111111111');
insert into tab1 values('Allen Jackson');
insert into tab1 values('%%$$$$@@');
commit;


with mydata as
(
select col1, regexp_replace(col1,'[^0-9a-zA-Z ]','#') fudged from tab1 
)
select col1 from mydata
where regexp_like(fudged,'(.)\1{4,}')
/

Open in new window

0
 
LVL 32

Expert Comment

by:awking00
ID: 39622546
You might use the translate function on your string first
where regexp_like(translate(str,'#@$%','####'),'(.)\1{4}');

This example changes # to #. @ to #, $ to #, and % to #. You can expand the list to include other special characters that may exist by adding a '#' to the third translate parameter for each special character added.
0
 
LVL 32

Accepted Solution

by:
awking00 earned 1200 total points
ID: 39622558
SQL> select * from rpt;

STR
--------------------
aaaaaaaaa
bb
cccc
ddddddd
11
22222
@@@@@
123456
%%
eee%%$$$$@@

SQL> select * from rpt
  2  where regexp_like(translate(str,'#@$%','####'),'(.)\1{4}');

STR
--------------------
aaaaaaaaa
ddddddd
22222
@@@@@
eee%%$$$$@@
0
 

Author Closing Comment

by:iamnamja
ID: 39622664
Thank you both awking00 and slightwv for your comments. This returns exactly what I need to search. Thanks!
0

Featured Post

Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Cursors in Oracle: A cursor is used to process individual rows returned by database system for a query. In oracle every SQL statement executed by the oracle server has a private area. This area contains information about the SQL statement and the…
When it comes to protecting Oracle Database servers and systems, there are a ton of myths out there. Here are the most common.
This video explains at a high level about the four available data types in Oracle and how dates can be manipulated by the user to get data into and out of the database.
This video shows how to copy an entire tablespace from one database to another database using Transportable Tablespace functionality.
Suggested Courses

963 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question