I need to find credit card numbers hidden in text columns, using Oracle 11g and SQLPlus

Posted on 2014-08-18
Last Modified: 2015-02-08
During an audit it was discovered that users where hiding customers credit card info in text columns like a description or notes column vs. where it belongs. We can't filter it on the front-end because it's 3rd party software and we are unable to make changes and the vender refuses to make changes based on our needs.
So I need to use some SQL commands to scan columns for matches. Been looking at the REGEXP_SUBSTR commands but I have to write something that will scan all the columns in a particular schema with applying the expression.

Has anyone accomplished this task already ? If so any direction or examples are greatly appreciated.
Question by:partyon
    LVL 76

    Expert Comment

    by:slightwv (䄆 Netminder)
    How much data are you talking about?

    I would look towards regexp_instr but it is the same principle.

    The problems you'll have is how to search for the number.  Off the top of my head, I can think of at least 3 ways:  series of numbers with spaces, without spaces and with dashes.

    Depending on how much data you are talking, I would probably create a pl/sql loop, take each row and convert it to XML using DBMS_XMLGEN.GETXML and perform the regexp_instr on the row.
    LVL 34

    Expert Comment

    I agree with the recommendation to use a PL\SQL procedure for this with a cursor loop to retrieve the possible varchar2 (and/or number?) columns that may contain credit card numbers.  Then exactly which evaluation technique you use for each value retrieved may depend a bit on your data volumes and on which technique(s) you may have some experience with.  I don't see how converting the rows to XML would help (but then I'm no expert in XML, so I don't know what operators might be available in XML).  Regexp_instr may be the most efficient operator to use for searching through varchar2 data in PL\SQL, but I'm no expert with this either.  It does look like a good idea but I can't help you with the syntax for that.
    LVL 76

    Expert Comment

    by:slightwv (䄆 Netminder)
    >>I don't see how converting the rows to XML would help

    It would convert the entire row to a 'string'.  Then one call to regexp_instr would check ALL columns at once.

    That way you don't have to worry about conversions, string concatenation, individual columns, etc...

    Here's an example:
    drop table tab1 purge;
    create table tab1(id number primary key, col1 varchar2(20), col2 varchar2(20));
    insert into tab1 values(1,'No card here','Hello');
    insert into tab1 values(2,'1111222233334444','Hello');
    insert into tab1 values(3,'Hello','1111 2222 3333 4444');
    insert into tab1 values(4,'Hello','1111-2222-3333-4444');
    	myrow clob;
    	for i in (select id from tab1) loop
    		select dbms_xmlgen.getxml('select * from tab1 where id=' || into myrow from dual;
    		if regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}') > 0 then
    			dbms_output.put_line('May have found one: ' || to_char(;
    		end if;
    	end loop;

    Open in new window


    Author Comment

    I'm first searching all tables and columns in a particular schema that has data types of 'CHAR', 'VARCHAR2', 'NCHAR' 'NVARCHAR2', 'CLOB' and "NCLOB'  these would be columns that could hold a credit card number in them.

    So, If I use the regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}') > 0  will it return credit card numbers for all the variations ? meaning Visa, MC, Amex, Discover ..... ?  So my output should be something like this as a pattern 1234-1234-1234-1234 correct ?
    LVL 76

    Expert Comment

    by:slightwv (䄆 Netminder)
    >>If I use the regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}') > 0  will it return credit card numbers for all the variations ? meaning Visa, MC, Amex, Discover ..... ?  

    It looks for 4 numbers followed by an optional space or dash then another set of 4 numbers... etc...

    It matches the patterns:
    1111 2222 3333 4444

    I cannot say ALL credit card numbers are 12 digits but all mine are...  Memory tells me that there are some that are not.

    It also doesn't guarantee the number it finds is a credit card number.  It could be just a number that fits the 'pattern'.
    LVL 34

    Assisted Solution

    Amex account numbers can be in the format: 1111 222222 33333 (or: 4 digits space 6 digits space 5 digits) for a total of 15 digits, not the 16 digits that are common for: MC,Visa, Disconver, etc.  So, maybe that means you need to adjust your search to be:
    regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}') > 0
    or regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{6}[ -]?[0-9]{5}') > 0

    (I say "maybe", because I'm not an expert on the syntax for regexp_instr.)
    LVL 76

    Accepted Solution

    You can 'or' the patterns.

    So, using your new pattern, you should be able to do it all with a single call:

    regexp_instr(myrow,'[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}|[0-9]{4}[ -]?[0-9]{6}[ -]?[0-9]{5}') > 0

    There is a '|' in there.  Honest!
    LVL 22

    Expert Comment

    by:Steve Wales
    This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.

    Write Comment

    Please enter a first name

    Please enter a last name

    We will never share this with anyone.

    Featured Post

    PRTG Network Monitor: Intuitive Network Monitoring

    Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

    Background In several of the companies I have worked for, I noticed that corporate reporting is off loaded from the production database and done mainly on a clone database which needs to be kept up to date daily by various means, be it a logical…
    Configuring and using Oracle Database Gateway for ODBC Introduction First, a brief summary of what a Database Gateway is.  A Gateway is a set of driver agents and configurations that allow an Oracle database to communicate with other platforms…
    This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function
    This video shows how to copy an entire tablespace from one database to another database using Transportable Tablespace functionality.

    737 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    19 Experts available now in Live!

    Get 1:1 Help Now