We help IT Professionals succeed at work.

Filter duplicate rows in SSIS data flow task

rajvja
rajvja asked
on
1,914 Views
Last Modified: 2012-05-10
Hi Experts,

   I am retrieving data from oracle and inserting into sql server.
   I have 2 tables in oracle called Person and Contacts.
   Person:
      Person_ID, Forename, Surname, Initials, Title, AddressId, TelNo1, TelNo2, Email1, Email2
   Contact:
        Forename, Surname, Initials, Title, AddressId, TelNo1, TelNo2, Email1, Email2
        
      In SSIS, DataFlow, I am retrieving data from 2 tables and UNION ALL and assigning uniqueID for PersonID(sequence number)
      
      If I have a row in Person table and 3 rows in contacts table, after UNION ALL I am getting 4 rows for a person.
      All contacts will be inserted to Telecom table and Person details will be inserted to People table.
      But in People table I want only 1 row to be inserted(no duplicates).
      
      If AddressID is different for the same person, its not called as duplicate. so all rows
      are inserted to People table.
      
      How can I do this in SSIS data flow task after UNION ALL and assigning unique PersonID?
      I mean tracking duplicate records and not inserting into People table.
      
      I can't do UNIQUE INDEX with NO_DUP_VAL as the PersonID is generating after UNION ALL
      and is assigned to each contact i.e added to Telecom table.
      
      So, I want to do some logic after UNION ALL.
      
      Any suggestion?
Comment
Watch Question

Bhavesh ShahLead Analyst
CERTIFIED EXPERT
Top Expert 2010

Commented:
Hi,

Can u use max(addressid)?
  1. Do you have to do a UNION ALL instead of a UNION?  (DO you know the difference between those two? ;-)
  2. If you have to use the UNION ALL, then you might want to insert that data into a staging table and then select distinct rows from the staging table.
Remember, SSIS lets you have multiple steps if you need them and it looks like you may need them. ;-)
 

Author

Commented:
Hi,

 thanks for the replies. Even I used UNION I am getting the same records.
There is a column in Person table called PERSON_ID(it is unique per row) and I need to retrieve that column coz based on that column I am generate surrogate keys.

Any solution? I am really strucked.
Unlock this solution and get a sample of our free trial.
(No credit card required)
UNLOCK SOLUTION

Author

Commented:
thanks

Gain unlimited access to on-demand training courses with an Experts Exchange subscription.

Get Access
Why Experts Exchange?

Experts Exchange always has the answer, or at the least points me in the correct direction! It is like having another employee that is extremely experienced.

Jim Murphy
Programmer at Smart IT Solutions

When asked, what has been your best career decision?

Deciding to stick with EE.

Mohamed Asif
Technical Department Head

Being involved with EE helped me to grow personally and professionally.

Carl Webster
CTP, Sr Infrastructure Consultant
Empower Your Career
Did You Know?

We've partnered with two important charities to provide clean water and computer science education to those who need it most. READ MORE

Ask ANY Question

Connect with Certified Experts to gain insight and support on specific technology challenges including:

  • Troubleshooting
  • Research
  • Professional Opinions
Unlock the solution to this question.
Thanks for using Experts Exchange.

Please provide your email to receive a sample view!

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.