r_pat72

asked on

Help on SQL SSIS Task Optimization

Hi,

I have an Excel file that contains around 1,200 columns and 100,000 rows. I have created an SSIS package to export this dataset to a SQL table. I am using a Data Flow task to load the data into the SQL table, but it takes a very long time, more than 1 hour. The client expects this task to complete within 1 minute.

Can someone suggest a way to optimize this process, or a better approach altogether?

Thank you for your help!
Ryan Chong

Have you tried using the BULK INSERT / OPENROWSET methods for importing the data?

Import Bulk Data by Using BULK INSERT or OPENROWSET(BULK...) (SQL Server)
https://docs.microsoft.com/en-us/sql/relational-databases/import-export/import-bulk-data-by-using-bulk-insert-or-openrowset-bulk-sql-server
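
For illustration, a minimal BULK INSERT sketch, assuming the Excel data has been saved as a .csv and a matching staging table already exists (the path, table name, and delimiters below are placeholders):

BULK INSERT dbo.StagingWideTable          -- hypothetical target table; must already exist with matching columns
FROM 'C:\ImportFiles\wide_export.csv'     -- hypothetical path to the exported .csv
WITH (
    FIRSTROW = 2,                         -- skip the header row
    FIELDTERMINATOR = ',',                -- column delimiter in the file
    ROWTERMINATOR = '\n',                 -- row delimiter
    TABLOCK,                              -- bulk-update lock allows a faster, minimally logged load
    BATCHSIZE = 50000                     -- commit in batches instead of one huge transaction
);

Note that BULK INSERT reads delimited text files, not .xlsx, which is another reason to have the file delivered as .csv.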
r_pat72

ASKER

Thank you for your reply. Question: how do I create the table on the fly using the Bulk Insert task? It only shows existing tables for the SQL connection.
Does it have any option to create tables based on the file selected as the source?
ASKER CERTIFIED SOLUTION
Ryan Chong

>I have an Excel file that contains around 1,200 columns and 100,000 rows.
>The client expects this task to complete within 1 minute.
For starters, it is obvious that your client is not an expert in SSIS or ETL in general: Excel is an extremely poor choice of source data format, because users can edit it in thousands of ways that will cause an SSIS mapping error in a Data Flow task between source and destination. Far better would be to have them save this file as a .csv or some other text format.

>but it takes a very long time, more than 1 hour.
Have you tried going into the Advanced Properties of your source connection (data source?) and setting the data types to minimize the footprint of each column to what is actually needed, rather than whatever SSIS guessed as the defaults?

Also, is this 100k-row file a 'full load' file of all rows, or an 'incremental' file that only has changes since the last time the file was provided? Incremental files are always much smaller and therefore quicker to load.
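
If an incremental file is an option, one common pattern (not necessarily what you have in place) is to bulk-load it into a staging table and then apply the changes with a MERGE; the table, key, and column names below are purely hypothetical:

MERGE dbo.TargetWideTable AS tgt                      -- hypothetical destination table
USING dbo.StagingWideTable AS src                     -- staging table holding the incremental rows
    ON tgt.RecordID = src.RecordID                    -- assumed business key
WHEN MATCHED THEN
    UPDATE SET tgt.Col001 = src.Col001,
               tgt.Col002 = src.Col002                -- repeat only for the columns you actually load
WHEN NOT MATCHED BY TARGET THEN
    INSERT (RecordID, Col001, Col002)
    VALUES (src.RecordID, src.Col001, src.Col002);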

Also, is it really (c'mon, really?) necessary for you to import all 1,200 columns? If not, you still have to define them in the source, but not pumping them into the destination will save time.
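
And if you do have to read the .xlsx directly and want the table created on the fly (per your earlier question), a SELECT ... INTO over OPENROWSET is one way to do it. This is a sketch only: it assumes the Microsoft ACE OLE DB provider is installed on the SQL Server and 'Ad Hoc Distributed Queries' is enabled, and the path, sheet name, and column names are placeholders:

SELECT [Col1], [Col2]                                  -- pick only the columns you actually need
INTO dbo.NewWideTable                                  -- SELECT ... INTO creates the table on the fly
FROM OPENROWSET(
    'Microsoft.ACE.OLEDB.12.0',                        -- Excel OLE DB provider (must be installed on the server)
    'Excel 12.0 Xml;HDR=YES;Database=C:\ImportFiles\wide_export.xlsx',
    'SELECT * FROM [Sheet1$]'                          -- hypothetical sheet name
);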