Posted on 2009-12-17
I'm still a bit new to SSIS and am looking for some advice. I have a datawarehouse that requires data to be moved around. Obviously my first thought is "SSIS is perfect for this" but there are maybe some techniques I need to refine a bit.
What I want to be able to do is load data from one table to the next, but only if that data does not already exist in the target table. I've done something using a lookup transformation and a conditional split, but I wonder if there is a better, more efficient, way of doing it. For my way to work I have to override the errors produced by the lookup when it doesn't find anything.
As an example of what I am doing Imagine the following two tables.
FirstName, LastName, Gender, DOB
Bob, Fleming, M, 01/01/1978
Rose, Cutting, F, 31/12/1985
Hugo, First, M, 12/12/2000
PeopleID, FirstName, LastName, Gender, DOB
1001, Bob, Fleming, M, 01/01/1978
As you can see in the stage table, Bob Fleming is already there, so I don't want to load him, I do want to load the others though. In SQL I would do this like this
INSERT INTO STG_PEOPLE
SELECT rp.FirstName, rp.LastName, rp.Gender, rp.DOB
FROM RAW_PEOPLE rp LEFT OUTER JOIN STG_PEOPLE sp
ON rp.FirstName = sp.FirstName AND rp.LastName = sp.LastName AND rp.DOB = sp.DOB
WHERE sp.PeopleID IS NULL
Is there a nice elegant way of achiving this using SSIS?