Jim Youmans (United States) asked:

Importing DB2 BLOBs into SQL 2016 database very slow

We have a DB2 10.5 system (on Windows) that we export to a SQL Server 2016 data warehouse daily using SSIS 2013.  The issue I am running into is that some of the DB2 tables have BLOB columns (picture data), and importing them into SQL Server with SSIS is very slow.  The main DB2 table gets about 25,000 new records per day, and it takes about 3 hours to move those from DB2 to SQL Server.

The main bottleneck is that SSIS has to write each BLOB to disk, then read it back and store it in SQL Server.  I am averaging about 130 rows per second with both systems on solid-state drives.
My SSIS data flow is set up like so...

OLE DB Data Source connected to DB2 10.5
Data Conversion task to change the DB2 BLOB data type to Unicode Text Stream (DT_NTEXT)
Derived Column to fix the DB2 date to SQL date issue
OLE DB Data Destination to SQL Server 2016

I had the SSIS BLOB storage on C:\temp at first, and that was even slower (about 75 rows per second); I also read a blog post about why not to do that.  Moving the BLOB storage to a different drive roughly doubled the speed, but at 130 rows per second it still takes about 3 hours to process 25,000 rows.

Is there any way to speed this up?  Is there any way to use a memory location to read and write the BLOBs to?

Any suggestions would be much appreciated!

Gary Patterson, CISSP (United States) replied:

You might want to try using a RAM disk as the temp storage location for this process: you allocate some memory to emulate a disk drive.  We have used this technique from time to time to improve the performance of processes like this.
Jim Youmans replied:
Thank you for your suggestions.  I was excited about the RAM disk until I spoke to our server team: while setting up a RAM disk is possible on a cluster member, the configuration is not recommended or supported while the cluster service is up.  So that will not work.

As for real-time replication from DB2 to staging tables on SQL Server, that is something I will look into, but I still need to do the initial load in less than a week.

I was hoping to use DefaultBufferSize and make it large enough (I think 100 MB is the limit) that there would be no need to spool to disk, but from what I can tell, SSIS will always write BLOB data to disk because it cannot tell the size beforehand.  Is that correct?

Any other suggestions?

Hi Jim.  Lots of possible solutions, but nothing I can think of that doesn't require some development work.  In a cluster environment, a RAM disk is definitely a problem, since it is instance-specific.
I have pretty much given up on using SSIS for large tables with BLOB fields.  It runs OK for a bit, then starts throwing warnings about locked buffers and slows down to around 300 rows per second.  When you have 55 million rows, that works out to 48 hours or more.  I had set the buffer size to 100 MB (the SSIS limit, I believe) but still hit the same issues.  It ends up writing thousands of small files to disk no matter what I do.

Ultimately, the performance trick is to keep the BLOB in memory, without an intermediate disk write.

I'm not an SSIS expert, so I'm going to go an alternate route.

I would write an application that retrieves each BLOB from DB2 into a variable and inserts or updates it directly into SQL Server, without intermediate storage to disk.  Assuming there is enough memory available to your app, performance should improve significantly.
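As a sketch of that approach, here is a minimal Python example using generic DB-API cursors.  The connection strings, table names, and column names below are placeholders (not values from the systems discussed here); in practice you would connect with a driver such as pyodbc, using a DB2 ODBC driver on the source side and a SQL Server ODBC driver on the destination side.  Rows, including the BLOB column, are fetched in batches and written straight to the destination without ever being spooled to disk.

```python
def copy_blobs(src_cur, dst_cur, select_sql, insert_sql, batch_size=500):
    """Stream rows (BLOBs included) from a source DB-API cursor to a
    destination cursor in batches, keeping everything in memory.

    Returns the number of rows copied.  The caller owns the connections
    and is responsible for committing on the destination side.
    """
    src_cur.execute(select_sql)
    copied = 0
    while True:
        rows = src_cur.fetchmany(batch_size)   # each row holds the BLOB bytes
        if not rows:
            break
        dst_cur.executemany(insert_sql, rows)  # direct insert, no temp files
        copied += len(rows)
    return copied

# Hypothetical wiring with pyodbc (DSNs, credentials, and object names
# are illustrative placeholders):
#
#   import pyodbc
#   src = pyodbc.connect("DSN=DB2PROD;UID=user;PWD=secret")
#   dst = pyodbc.connect("DRIVER={SQL Server Native Client 11.0};"
#                        "SERVER=dwhost;DATABASE=dw;Trusted_Connection=yes")
#   n = copy_blobs(src.cursor(), dst.cursor(),
#                  "SELECT id, picture FROM myschema.pics",
#                  "INSERT INTO dbo.pics (id, picture) VALUES (?, ?)")
#   dst.commit()
```

The batch size trades memory for round trips: with about 25,000 rows per day, even a few hundred rows per batch keeps memory use modest while avoiding the per-row disk spooling that SSIS does for LOB columns.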