I need to find credit card numbers hidden in text columns, using Oracle 11g and SQLPlus

During an audit it was discovered that users where hiding customers credit card info in text columns like a description or notes column vs. where it belongs. We can't filter it on the front-end because it's 3rd party software and we are unable to make changes and the vender refuses to make changes based on our needs.
So I need to use some SQL commands to scan columns for matches. Been looking at the REGEXP_SUBSTR commands but I have to write something that will scan all the columns in a particular schema with applying the expression.

Has anyone accomplished this task already ? If so any direction or examples are greatly appreciated.
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

slightwv (䄆 Netminder) Commented:
How much data are you talking about?

I would look towards regexp_instr but it is the same principle.

The problems you'll have is how to search for the number.  Off the top of my head, I can think of at least 3 ways:  series of numbers with spaces, without spaces and with dashes.

Depending on how much data you are talking, I would probably create a pl/sql loop, take each row and convert it to XML using DBMS_XMLGEN.GETXML and perform the regexp_instr on the row.
Mark GeerlingsDatabase AdministratorCommented:
I agree with the recommendation to use a PL\SQL procedure for this with a cursor loop to retrieve the possible varchar2 (and/or number?) columns that may contain credit card numbers.  Then exactly which evaluation technique you use for each value retrieved may depend a bit on your data volumes and on which technique(s) you may have some experience with.  I don't see how converting the rows to XML would help (but then I'm no expert in XML, so I don't know what operators might be available in XML).  Regexp_instr may be the most efficient operator to use for searching through varchar2 data in PL\SQL, but I'm no expert with this either.  It does look like a good idea but I can't help you with the syntax for that.
slightwv (䄆 Netminder) Commented:
>>I don't see how converting the rows to XML would help

It would convert the entire row to a 'string'.  Then one call to regexp_instr would check ALL columns at once.

That way you don't have to worry about conversions, string concatenation, individual columns, etc...

Here's an example:
drop table tab1 purge;

create table tab1(id number primary key, col1 varchar2(20), col2 varchar2(20));

insert into tab1 values(1,'No card here','Hello');
insert into tab1 values(2,'1111222233334444','Hello');
insert into tab1 values(3,'Hello','1111 2222 3333 4444');
insert into tab1 values(4,'Hello','1111-2222-3333-4444');

	myrow clob;
	for i in (select id from tab1) loop
		select dbms_xmlgen.getxml('select * from tab1 where id=' || i.id) into myrow from dual;
		if regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}') > 0 then
			dbms_output.put_line('May have found one: ' || to_char(i.id));
		end if;
	end loop;

Open in new window

Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

partyonAuthor Commented:
I'm first searching all tables and columns in a particular schema that has data types of 'CHAR', 'VARCHAR2', 'NCHAR' 'NVARCHAR2', 'CLOB' and "NCLOB'  these would be columns that could hold a credit card number in them.

So, If I use the regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}') > 0  will it return credit card numbers for all the variations ? meaning Visa, MC, Amex, Discover ..... ?  So my output should be something like this as a pattern 1234-1234-1234-1234 correct ?
slightwv (䄆 Netminder) Commented:
>>If I use the regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}') > 0  will it return credit card numbers for all the variations ? meaning Visa, MC, Amex, Discover ..... ?  

It looks for 4 numbers followed by an optional space or dash then another set of 4 numbers... etc...

It matches the patterns:
1111 2222 3333 4444

I cannot say ALL credit card numbers are 12 digits but all mine are...  Memory tells me that there are some that are not.

It also doesn't guarantee the number it finds is a credit card number.  It could be just a number that fits the 'pattern'.
Mark GeerlingsDatabase AdministratorCommented:
Amex account numbers can be in the format: 1111 222222 33333 (or: 4 digits space 6 digits space 5 digits) for a total of 15 digits, not the 16 digits that are common for: MC,Visa, Disconver, etc.  So, maybe that means you need to adjust your search to be:
regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}') > 0
or regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{6}[ -]?[0-9]{5}') > 0

(I say "maybe", because I'm not an expert on the syntax for regexp_instr.)
slightwv (䄆 Netminder) Commented:
You can 'or' the patterns.

So, using your new pattern, you should be able to do it all with a single call:

regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}|[0-9]{4}[ -]?[0-9]{6}[ -]?[0-9]{5}') > 0

There is a '|' in there.  Honest!

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Steve WalesSenior Database AdministratorCommented:
This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Oracle Database

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.