Filter duplicate rows in SSIS data flow task
Posted on 2010-09-14
I am retrieving data from oracle and inserting into sql server.
I have 2 tables in oracle called Person and Contacts.
Person_ID, Forename, Surname, Initials, Title, AddressId, TelNo1, TelNo2, Email1, Email2
Forename, Surname, Initials, Title, AddressId, TelNo1, TelNo2, Email1, Email2
In SSIS, DataFlow, I am retrieving data from 2 tables and UNION ALL and assigning uniqueID for PersonID(sequence number)
If I have a row in Person table and 3 rows in contacts table, after UNION ALL I am getting 4 rows for a person.
All contacts will be inserted to Telecom table and Person details will be inserted to People table.
But in People table I want only 1 row to be inserted(no duplicates).
If AddressID is different for the same person, its not called as duplicate. so all rows
are inserted to People table.
How can I do this in SSIS data flow task after UNION ALL and assigning unique PersonID?
I mean tracking duplicate records and not inserting into People table.
I can't do UNIQUE INDEX with NO_DUP_VAL as the PersonID is generating after UNION ALL
and is assigned to each contact i.e added to Telecom table.
So, I want to do some logic after UNION ALL.