Program to update millions of record

I have a Oracle table which consist of 11 million records...
And i need to encrypt a field for every single record...thus i wrote the following program.

But it's taking a very very very long time to finish updating.

Any suggestions/advice how can i go about do this fast?
public static void main(String[] args){
	    try {
	        DesEncrypter encrypter = new DesEncrypter("this is my passphrase");
		    Connection con=null;
		    Class.forName("oracle.jdbc.driver.OracleDriver");
		    con=DriverManager.getConnection("jdbc:oracle:thin:@localhost:mydb","dbuser","oracle123");
		      
		    PreparedStatement s= con.prepareStatement("SELECT field1, field2 FROM TableA");
		    PreparedStatement pstmt = con.prepareStatement("UPDATE TableA set field2= ? WHERE field1 = ? ");
 
 
		    String x = "";
		    String y = "";
		    
		    ResultSet rs = s.executeQuery();
		    while(rs.next()){
			    x = rs.getString(1);    //value of field1
			    y = rs.getString(2);    //value of field2
 
			    
	            pstmt.setString(1, encrypter.encrypt(y));   //encrypt value of field2
	            pstmt.setString(2, x);
	            pstmt.executeUpdate(); 
			    
		    }
		    
		    rs.close();
		    con.close();
	        
	    } catch (Exception e) {
	    	e.printStackTrace();
	    }

Open in new window

edisonffsAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Mick BarryJava DeveloperCommented:
use a batch update

0
Mick BarryJava DeveloperCommented:
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
CEHJCommented:
Do you *really* want *all* data encrypted? If so, let the db do the encryption, preferably tranparently ( http://www.oracle.com/technology/oramag/oracle/05-sep/o55security.html ) or at the very least using db functions
0
Big Business Goals? Which KPIs Will Help You

The most successful MSPs rely on metrics – known as key performance indicators (KPIs) – for making informed decisions that help their businesses thrive, rather than just survive. This eBook provides an overview of the most important KPIs used by top MSPs.

mrjoltcolaCommented:
Is this a one time operation, or a maintenance operation that will be repeated?

Have you any log output that tells you how many records per minute you are current processing?

The approach you took is probably the slowest option of all, but may be the most direct way, and if it is a one time thing, you might just let it run to the finish, however, be VERY sure your query is making use of indexing or you will be waiting for a long time. I assume you are sure field1 is a primary key / unique index.

However, if this is an operation that must be repeated, periodically, I recommend several things.
1) Use parallelism. How many CPUs do you have? You can run multiple instances of the above program, with different key ranges, or even one program with several threads, and each instance will only work on a given range.  This depends on how your primary key is dispersed. What is "field1" ? If it is an incrementing integer, you could distribute by range, or you could use modulus of a number of workers.
2) Use partitions to enhance the effect of parallelism. Run a worker for each partition. The Oracle IO will be physically separate.
3) Consider using DBMS_CRYPTO Oracle package instead of your Java API. DBMS_CRYPTO supports DES as well as AES. So you can accomplish it without the external Java program entirely and skip the JDBC bottleneck.

http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14258/d_crypto.htm

0
mrjoltcolaCommented:
And yes, I also agree with CEHJ, consider using Oracle Wallet (TDE) / encrypted columns.

Using single DES on 11 million records, with the same passphrase has 2 security flaws.

1) Single DES is not that strong. DES was considered "cracked" in the late 90s. At least 3DES/CBC should be used, but AES is considered the successor to DES.
2) Regardless of cipher, if one record is cracked, all of your records are cracked. For this reason, if you are going to encrypt 11 million records with the same key, at least use a strong cipher or public/private key system, such as RSA. This is implemented in Oracle Wallet (TDE).

0
edisonffsAuthor Commented:
objects:

my pstmt.executeUpdate() is wrapped around a while loop, which means i am updating record by record....

I've looked the example u gave me, and this is a revised version of my codes (just the while loop)

Is this better?
                   int count = 1;
 
                    while(rs.next()){
                            x = rs.getString(1);    //value of field1
                            y = rs.getString(2);    //value of field2
 
                            
                            pstmt.setString(1, encrypter.encrypt(y));   //encrypt value of field2 
                            pstmt.setString(2, x);
                            pstmt.addBatch();
 
                            if(count == 1000){
                                int [] updateCounts = pstmt.executeBatch(); //Execute batch
                                count = 1
                            } else {
                                 count ++;
                            }
  
                    }

Open in new window

0
Mick BarryJava DeveloperCommented:
yes that should be faster

0
edisonffsAuthor Commented:
thanks for the tips. Oracle Wallet seems interesting, but it's rather new to me. Is there any example of java codes to make use of this? (ie. to decrypt the encrypted value)
0
CEHJCommented:
>>Is there any example of java codes to make use of this? (ie. to decrypt the encrypted value)

You don't need to make use of it. It's transparent, so you don't need to do anything
0
mrjoltcolaCommented:
If you are only interested in a Java solution, then I'm not sure what else you are looking for. You are limiting your performance, however. One of the problems with your Java approach, or any other host language, is that you are selecting from a result set of 11 million rows, fetching each one, then submitting an update statement 11 million times, which is a soft parse.

If you are on Oracle 10g or up, you can do the same operation within Oracle PL/SQL completely, and it will probably perform an order of magnitude faster. Depends on if you want the crypto-text to be raw or hex/ascii. Here is a sample that will encrypt / decrypt the PASSWORD field in all rows of table TEST, with a single SQL statement.

-- Encrypt with DES
DECLARE
  bits INTEGER := DBMS_CRYPTO.ENCRYPT_DES + DBMS_CRYPTO.CHAIN_CBC + DBMS_CRYPTO.PAD_PKCS5;
  rawkey RAW(128);
BEGIN
  rawkey := utl_i18n.string_to_raw('this is the key', 'AL32UTF8');
  UPDATE TEST SET PASSWORD = dbms_crypto.encrypt(utl_i18n.string_to_raw(PASSWORD, 'AL32UTF8'), bits, rawkey);
END;
/
 
-- Decrypt DES
DECLARE
  bits INTEGER := DBMS_CRYPTO.ENCRYPT_DES + DBMS_CRYPTO.CHAIN_CBC + DBMS_CRYPTO.PAD_PKCS5;
  rawkey RAW(128);
BEGIN
  rawkey := utl_i18n.string_to_raw('this is the key', 'AL32UTF8');
  UPDATE TEST SET PASSWORD = utl_i18n.raw_to_char(dbms_crypto.decrypt(PASSWORD, bits, rawkey), 'AL32UTF8');
END;
/

Open in new window

0
edisonffsAuthor Commented:
CEHJ:

The requirement given is that this value must only be readable to the web application (which means only the web application can decrypt and use the value), which means no user can just simply do SELECTstatement in SQLPlus to see the actual value.

Does Oracle wallet serves this purpose?

hope u can enlighten me on this.
0
CEHJCommented:
>>
Does Oracle wallet serves this purpose?

hope u can enlighten me on this.
>>

I think mrjoltcola might be more informed on this than i am
0
mrjoltcolaCommented:
Hi. Regarding Oracle Wallet, or more specifically, in this case, Transparent Data Encryption, it sounds like what you want is protection from DBAs and people with generate SQL access, which is not what TDE does. TDE hides the encryption behind the Oracle session, so the encryption happens at the storage level. If someone is connected to Oracle, with privileges, he can query the table.

The downside to using Oracle Advanced Security is cost, if you don't already have it licensed.

I think your current approach can be improved upon, please see my suggestions from http:#24470585 as well as the PL/SQL scripts that can do the encryption from the DB layer. Try them on some sample data.

So my questions still stand from http:#24470585 regarding the nature of this operation. Is it a recurring thing? Must the key be changed every so often?
0
edisonffsAuthor Commented:
it will be monthly process.
the key will be changed every month...

anyway i have tried object's suggested approach
time taken to update 100,000 records is an estimated of 1 hour and 45 minutes...
0
Mick BarryJava DeveloperCommented:
yes, jdbc is going to be pretty slow at this
you'll probably find that just the select takes forever by itself

0
CEHJCommented:
>>time taken to update 100,000 records is an estimated of 1 hour and 45 minutes...

Yes, batching is unlikely to make a significant difference
0
CEHJCommented:
edisonffs, perhaps you can tell us, for the benefit of future visitors to this question, what you gained from using batch updates?
0
mrjoltcolaCommented:
I find it regretful that you came to experts for help solving a performance problem and then did not both try the various solutions offered, and settled for & accepted a solution that you are not happy with. Please be aware that when you post in multiple zones, you will receive expertise from different aspects. Obviously you are a Java programmer, and were looking for a Java solution, nothing wrong with that, however, it is not always the fastest tool for the job. To really approach the performance you want, you are going to need to get closer to the particular database, and forego the portability of Java/JDBC. You may also need to rethink your whole approach. There may be better solutiosn that to update 11million records monthly with a shared, low strength key.
0
CEHJCommented:
>>There may be better solutiosn that to update 11million records monthly with a shared, low strength key.

... especially when there seems to be no current mechanism apparent for avoiding updating the same record(s) multiple times  ;-)
0
edisonffsAuthor Commented:
mrjoltcola:

Not trying to disrespect the experts here or anything. Just that due to some constraints, i can't try out all the solutions suggested, or some is just a little hard for me understand.

As what you've commented, i believe that's the most Java can do with updating millions of records. I
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Java

From novice to tech pro — start learning today.