[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 917
  • Last Modified:

I need to find credit card numbers hidden in text columns, using Oracle 11g and SQLPlus

During an audit it was discovered that users where hiding customers credit card info in text columns like a description or notes column vs. where it belongs. We can't filter it on the front-end because it's 3rd party software and we are unable to make changes and the vender refuses to make changes based on our needs.
So I need to use some SQL commands to scan columns for matches. Been looking at the REGEXP_SUBSTR commands but I have to write something that will scan all the columns in a particular schema with applying the expression.

Has anyone accomplished this task already ? If so any direction or examples are greatly appreciated.
0
partyon
Asked:
partyon
2 Solutions
 
slightwv (䄆 Netminder) Commented:
How much data are you talking about?

I would look towards regexp_instr but it is the same principle.

The problems you'll have is how to search for the number.  Off the top of my head, I can think of at least 3 ways:  series of numbers with spaces, without spaces and with dashes.

Depending on how much data you are talking, I would probably create a pl/sql loop, take each row and convert it to XML using DBMS_XMLGEN.GETXML and perform the regexp_instr on the row.
0
 
Mark GeerlingsDatabase AdministratorCommented:
I agree with the recommendation to use a PL\SQL procedure for this with a cursor loop to retrieve the possible varchar2 (and/or number?) columns that may contain credit card numbers.  Then exactly which evaluation technique you use for each value retrieved may depend a bit on your data volumes and on which technique(s) you may have some experience with.  I don't see how converting the rows to XML would help (but then I'm no expert in XML, so I don't know what operators might be available in XML).  Regexp_instr may be the most efficient operator to use for searching through varchar2 data in PL\SQL, but I'm no expert with this either.  It does look like a good idea but I can't help you with the syntax for that.
0
 
slightwv (䄆 Netminder) Commented:
>>I don't see how converting the rows to XML would help

It would convert the entire row to a 'string'.  Then one call to regexp_instr would check ALL columns at once.

That way you don't have to worry about conversions, string concatenation, individual columns, etc...

Here's an example:
drop table tab1 purge;

create table tab1(id number primary key, col1 varchar2(20), col2 varchar2(20));


insert into tab1 values(1,'No card here','Hello');
insert into tab1 values(2,'1111222233334444','Hello');
insert into tab1 values(3,'Hello','1111 2222 3333 4444');
insert into tab1 values(4,'Hello','1111-2222-3333-4444');
commit;


declare
	myrow clob;
begin
	for i in (select id from tab1) loop
		select dbms_xmlgen.getxml('select * from tab1 where id=' || i.id) into myrow from dual;
		if regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}') > 0 then
			dbms_output.put_line('May have found one: ' || to_char(i.id));
		end if;
	end loop;
end;
/

Open in new window

0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
partyonAuthor Commented:
I'm first searching all tables and columns in a particular schema that has data types of 'CHAR', 'VARCHAR2', 'NCHAR' 'NVARCHAR2', 'CLOB' and "NCLOB'  these would be columns that could hold a credit card number in them.

So, If I use the regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}') > 0  will it return credit card numbers for all the variations ? meaning Visa, MC, Amex, Discover ..... ?  So my output should be something like this as a pattern 1234-1234-1234-1234 correct ?
0
 
slightwv (䄆 Netminder) Commented:
>>If I use the regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}') > 0  will it return credit card numbers for all the variations ? meaning Visa, MC, Amex, Discover ..... ?  

It looks for 4 numbers followed by an optional space or dash then another set of 4 numbers... etc...

It matches the patterns:
1112223334444
1111 2222 3333 4444
1111-2222-3333-4444

I cannot say ALL credit card numbers are 12 digits but all mine are...  Memory tells me that there are some that are not.

It also doesn't guarantee the number it finds is a credit card number.  It could be just a number that fits the 'pattern'.
0
 
Mark GeerlingsDatabase AdministratorCommented:
Amex account numbers can be in the format: 1111 222222 33333 (or: 4 digits space 6 digits space 5 digits) for a total of 15 digits, not the 16 digits that are common for: MC,Visa, Disconver, etc.  So, maybe that means you need to adjust your search to be:
regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}') > 0
or regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{6}[ -]?[0-9]{5}') > 0

(I say "maybe", because I'm not an expert on the syntax for regexp_instr.)
0
 
slightwv (䄆 Netminder) Commented:
You can 'or' the patterns.

So, using your new pattern, you should be able to do it all with a single call:

regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}|[0-9]{4}[ -]?[0-9]{6}[ -]?[0-9]{5}') > 0

There is a '|' in there.  Honest!
0
 
Steve WalesSenior Database AdministratorCommented:
This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Tackle projects and never again get stuck behind a technical roadblock.
Join Now