
How do I insert unique reference values with lookup + ole db command?

Medium Priority
Last Modified: 2013-11-30

I am trying to insert data from an Excel file to a SQL Server database, using SSIS packages. (I have simplified the actual package, to narrow in the problem, so it's like a school book example).
The Excel file has three columns: Name, Product, Supplier.
The database has a table TestTable with three columns: Id (identity int), Name, SupplierId
The SupplierId is mapped to a reference table named Supplier, with columns: Id, Name.

Short description of the SSIS package:
Excel Source --> Sort (on Name) --> Derived Column (adding SupplierId with dummy key 1) --> Lookup (trying to match Supplier names with reference table, replacing SupplierId if found)

If the Lookup finds the supplier, the row moves on with its SupplierId. Sort on SupplierId.
If not, insert new suppliers in reference table Suppliers.

Lookup to get the SupplierId of the newly inserted suppliers.
Sort on SupplierId.
Merge and Lookup to either insert with OLE DB Destination or update with OLE DB Command.

The problem is:
When inserting new suppliers, duplicates are inserted as well. If, e.g., the Excel file has two records with supplier "FOX LTD", and the supplier does not yet exist, two records are added to the reference table. I of course want only unique suppliers to be added.

I tried to add a Sort before the insert OLE DB Command, removing rows with duplicate sort values, but then I of course will lose the records that I want to move on with...

This must be an easy thing to solve, but I really can't figure out how.
Is there anyone in here who can help me out?

If any code or further explanation is needed, I will provide it.

Well... give us more details...
... I also suppose you understand SSIS correctly, given that you set 250 points...



Concerning the points being awarded, I don't know how to rate the question, and I couldn't find any guidelines. I probably don't understand SSIS as completely as I should, being a total newbie. However, since I couldn't find an answer to the question anywhere else, I thought I'd try here. Do you think the point reward is too high or too low?

More details...
What kind of details would be useful?

The insert of any new suppliers is only one step on the way in a larger process. That means I need the original dataset (with SupplierId added as a column, of course) with me "downstream", for later update of a fact table.

1. Lookup
2. If match - proceed to 4
3. If no match - insert new Suppliers and proceed to 4
4. Join the two datasets (same records as before, but now with SupplierId added, both previously existing and newly inserted)
5. Proceed working with the dataset, e.g. inserting it into a fact table...

Regarding the points: I normally create an SSIS example for those who set the question at 500 points, and give guidelines and suggestions to those who give fewer than 500 points, because I assume those persons are familiar with SSIS and I don't need to create the example... only for that reason. You are welcome here!

OK... now...
Are you looking up data by SupplierName or SupplierId?
Could you attach an example of the input data and the output you want? Some details to better explain your issue!



OK, thanks for your explanation.

I've attached an image, I hope it helps explaining what I'm trying to do.

From "upstream", a dataset is running, containing some columns and records, among others SupplierName.
In "Derived Column" I add a SupplierId column (int) and set it to 1.
The "Lookup" is checking the Supplier table in the database (columns Id, Name), trying to match SupplierName with Name in table, replacing SupplierId if it matches. So far everything is ok.

The error output from "Lookup" contains Suppliers that don't yet exist in the database, and thus need to be added.
This is supposed to be done in "OLE DB Command".
The SqlCommand in "OLE DB Command" is "INSERT INTO [Supplier] (Name) VALUES (?)".
The result of this data flow transformation is that it inserts every record's SupplierName, duplicates included. I of course want only unique Suppliers to be created.
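Since the OLE DB Command executes its statement once per incoming row, every duplicate supplier name triggers its own insert. One possible guard (a sketch only, not the actual SSIS component; SQLite stands in for SQL Server here, with the table name taken from above) is to make the insert a no-op when the name already exists:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Supplier (Id INTEGER PRIMARY KEY, Name TEXT)")

# The per-row command would insert "FOX LTD" twice with a plain INSERT.
# Guarding with NOT EXISTS makes the second execution do nothing.
guarded_insert = """
    INSERT INTO Supplier (Name)
    SELECT ? WHERE NOT EXISTS (SELECT 1 FROM Supplier WHERE Name = ?)
"""
for supplier in ["FOX LTD", "FOX LTD"]:
    conn.execute(guarded_insert, (supplier, supplier))

rows = conn.execute("SELECT Name FROM Supplier").fetchall()
print(rows)  # only one FOX LTD row remains
```

The same NOT EXISTS pattern could in principle be used as the SqlCommand of the OLE DB Command itself, though it still fires once per row.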

In "Lookup 1", after the insert, I do the same check as in the previous lookup, matching supplier names with the supplier ids (that now exist in the database).

Let's say that the dataset contains the following data (with header/column name on top)
Name        Supplier
Mr X          Home Depot
Mrs Y        Walmart
Mr Z          Walmart
Mr W         Home Depot
Ms Q         Walgreens

I want the Supplier table to be filled with three suppliers, not five.

I tried to add a Sort component before the insert, checking "Remove rows with duplicate sort values", but as I'm using the pass-through dataset "downstream" (of course I want to insert/update all facts), this doesn't work for me.

Does that explain my problem better?


Does Lookup 1 find the supplier names inserted in the previous step?
Try using an Aggregate component to get all distinct supplier names...

Or you can split the data flow into two: one to insert suppliers, and another to do the lookups and insert into the destination...
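The Aggregate suggestion amounts to grouping on the supplier name before the insert branch. A minimal sketch of that effect in plain Python, using the sample rows from this thread:

```python
# Deduplicate supplier names before inserting, the way an SSIS
# Aggregate component with a "Group by" on Supplier would.
rows = [
    ("Mr X", "Home Depot"),
    ("Mrs Y", "Walmart"),
    ("Mr Z", "Walmart"),
    ("Mr W", "Home Depot"),
    ("Ms Q", "Walgreens"),
]

# dict.fromkeys keeps the first occurrence of each name, in order
distinct_suppliers = list(dict.fromkeys(supplier for _, supplier in rows))
print(distinct_suppliers)  # three suppliers, not five
```

Because the Aggregate collapses the rows, it belongs on a branched copy of the stream: the full dataset still flows on unchanged for the later fact insert.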



Yes, the Lookup 1 finds all the recently (in the step before) added suppliers.

Splitting the data flow in two... Isn't that exactly what I'm doing?
The flow with the not-yet-existing suppliers takes a branched stream to insert the suppliers... When done, it merges into the "main" flow again, to insert all data into the destination.

I still think the problem is that the OLE DB Command is inserting all supplier rows, duplicates included, instead of unique suppliers.

In other words, the stream should look like this:

[Dataset] to process, coming from upstream:
PName      Supplier
Mr X          Home Depot
Mrs Y        Walmart
Mr Z          Walmart
Mr W         Home Depot
Ms Q         Walgreens

1. INSERT INTO Supplier (Name)
    SELECT DISTINCT Supplier FROM [Dataset]

2. Do a Lookup to get SupplierId for suppliers in [Dataset]. The [Dataset] now looks like this:
PName      Supplier          SupplierId
Mr X          Home Depot    1
Mrs Y        Walmart          2
Mr Z          Walmart          2
Mr W         Home Depot    1
Ms Q         Walgreens      3

3. Do a Lookup to see which records in [Dataset] to insert and which to update.

4. Have one OLE DB Command update the destination table, and one OLE DB Destination insert into the destination table. The destination table would finally look as follows:

PName      SupplierId
Mr X          1
Mrs Y        2
Mr Z          2
Mr W         1
Ms Q         3
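The numbered steps above can be sketched end to end. This is only an illustration under assumptions (SQLite in place of SQL Server, a temporary table standing in for the SSIS pipeline buffer); the DISTINCT plus NOT IN guard is what keeps the Supplier table at three rows:

```python
import sqlite3

# Sample dataset from the thread: five rows, three distinct suppliers
dataset = [
    ("Mr X", "Home Depot"),
    ("Mrs Y", "Walmart"),
    ("Mr Z", "Walmart"),
    ("Mr W", "Home Depot"),
    ("Ms Q", "Walgreens"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Supplier (Id INTEGER PRIMARY KEY, Name TEXT)")
conn.execute(
    "CREATE TABLE TestTable (Id INTEGER PRIMARY KEY, Name TEXT, SupplierId INTEGER)"
)
conn.execute("CREATE TEMP TABLE Dataset (PName TEXT, Supplier TEXT)")
conn.executemany("INSERT INTO Dataset VALUES (?, ?)", dataset)

# Step 1: insert only distinct, not-yet-existing supplier names
conn.execute("""
    INSERT INTO Supplier (Name)
    SELECT DISTINCT Supplier FROM Dataset
    WHERE Supplier NOT IN (SELECT Name FROM Supplier)
""")

# Steps 2-4 collapsed: resolve each row's SupplierId via join ("Lookup")
# and write all five rows to the destination with the resolved id
conn.execute("""
    INSERT INTO TestTable (Name, SupplierId)
    SELECT d.PName, s.Id
    FROM Dataset d JOIN Supplier s ON s.Name = d.Supplier
""")

supplier_count = conn.execute("SELECT COUNT(*) FROM Supplier").fetchone()[0]
dest_count = conn.execute("SELECT COUNT(*) FROM TestTable").fetchone()[0]
print(supplier_count, dest_count)  # 3 suppliers, 5 destination rows
```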

I'm a little confused... I will start from
PName      Supplier
Mr X          Home Depot
Mrs Y        Walmart
Mr Z          Walmart
Mr W         Home Depot
Ms Q         Walgreens

and make an example for you... I think it's better!



Hi again, I apologize for my late response; the summer holidays came in between...

OK, performing it as two tasks, one for inserting/updating dimensions and one for inserting/updating the actual facts... That makes sense. It may take some more time if the data source is very large, but it's a clean way to do it, and as I can't see any other way to solve it (with a data flow transformation component, e.g.), I will try it out! I'll be back with the result shortly.

OK... I'll be waiting, and this time be faster, OK? :-)


Hi again!

It worked like a charm!
I accepted your solution, and raised the point value, I think you put a lot of time into this, which I'm grateful for. I hope you're ok with that. :o)

Thank you very much for your help!