Link to home
Start Free TrialLog in
Avatar of marrowyung
marrowyung

asked on

using SSIS to import a CSV file to a table while 3 x field must be unique`

Dear all,

we are running SQL server 2005 and we want to import files to a table and then make sure that 3 x unique column can proof that the data is good quality, this mean there are only 1 record base on the combination of 3 x column, how SSIS help on this.
Avatar of Brendt Hess
Brendt Hess
Flag of United States of America image

I see two possible conditions here:

1)  If there is a duplicate, no records are to be imported.

2)  If there is a duplicate, import only the first record (or the last record, depending on specification).


In either case, the issue is that you do not know if there are duplicates until the second duplicate record is loaded.  However, you don't want the duplicate data loaded.

In case (2), then importing with an error handler for duplicate rows should work fine.  The first insert works, but subsequent inserts of duplicate rows (based on the unique columns) fails.  The import continues.

In case (1), I would use a different process.  I would import into a copy of the destination table first (a staging table), then delete any duplicated records, then finally copy records from the staging table to the final destination.
Avatar of marrowyung
marrowyung

ASKER

it seems 2 is better, but does SSIS provide any tools for us that allow we specify the 3 x colume is unique that based on this, no row can be duplicate?
should I use select distinct column A, B and C on that target table using SSIS? anyway to do it ?

How about adding one more unique column in the target table, this column must not exist in ithe original csv file?

ASKER CERTIFIED SOLUTION
Avatar of marrowyung
marrowyung

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
please close this question and have the point refund.