[Last Call] Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 394
  • Last Modified:

How to deduplicate in SSIS

Hi I have data that comes in as a file and I load it in a table and it has lot of duplicates, is there a task that helps me to deduplicate it ?pls suggest
0
sqlcurious
Asked:
sqlcurious
2 Solutions
 
Jim HornMicrosoft SQL Server Developer, Architect, and AuthorCommented:
You could throw in an aggregate task in your data flow and group by everything that you would consider a duplicate, but under normal circumstances I'd write an SP that handles that, either by executing the SP to delete dups, or use the SP as a source for subsequent data flows.
0
 
Anthony PerkinsCommented:
This type of question comes up all the time.  And the answer is the same regardless:  Never, ever import data that has not been validated into production tables.  Always import the data into staging tables you can then do whatever you like:
1. Delete the duplicates.
2. Only import the unique values.
0
 
Vikas GargBusiness Intelligence DeveloperCommented:
Hi,

If you can't remove duplicate from source and want to remove duplicates using SSIS transformation then SORT transformation is better to remove duplicates.


http://www.experts-exchange.com/Database/MS-SQL-Server/SQL_Server_2008/Q_28485575.html
0
NFR key for Veeam Agent for Linux

Veeam is happy to provide a free NFR license for one year.  It allows for the non‑production use and valid for five workstations and two servers. Veeam Agent for Linux is a simple backup tool for your Linux installations, both on‑premises and in the public cloud.

 
sqlcuriousAuthor Commented:
Thanks!
0
 
Jim HornMicrosoft SQL Server Developer, Architect, and AuthorCommented:
Okay,  wild curiosity overwhelms me.  Please explain in great detail how a sort transform eliminates duplicates.
0
 
Vikas GargBusiness Intelligence DeveloperCommented:
Jim,

You can go through the link I provided in which I have shown how Sort can remove duplicates
0
 
Vikas GargBusiness Intelligence DeveloperCommented:
Sort
0
 
Jim HornMicrosoft SQL Server Developer, Architect, and AuthorCommented:
Fair enough.  Changing answer to reflect new info which wasn't stated in the original accepted answer.
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Tackle projects and never again get stuck behind a technical roadblock.
Join Now