?
Solved

How do I insert unique reference values with lookup + ole db command?

Posted on 2009-07-06
12
Medium Priority
?
716 Views
Last Modified: 2013-11-30
Hi,

I am trying to insert data from an Excel file to a SQL Server database, using SSIS packages. (I have simplified the actual package, to narrow in the problem, so it's like a school book example).
The Excel file has three columns: Name, Product, Supplier.
The database has a table TestTable with three columns: Id (identity int), Name, SupplierId
The SupplierId is mapped to a reference table named Supplier, with columns: Id, Name.

Short description of the SSIS package:
Excel Source --> Sort (on Name) --> Derived Column (adding SupplierId with dummy key 1) --> Lookup (trying to match Supplier names with reference table, replacing SupplierId if found)

If the Lookup finds the supplier, move with a SupplierId. Sort on SupplierId.
If not, insert new suppliers in reference table Suppliers.
** THIS IS WHERE THE PROBLEM IS. **

Lookup to get the SupplierId of the newly inserted suppliers.
Sort on SupplierId.
Merge and Lookup to either insert with OLE DB Destination or update with OLE DB Command.


The problem is:
When making the insert of new suppliers, not only unique suppliers are inserted. If, e.g., the Excel file has two records with supplier "FOX LTD", and the supplier does not yet exist, it adds to records to the reference table. I of course want it to add unique suppliers.

I tried to add a Sort before the insert OLE DB Command, removing rows with duplicate sort values, but then I of course will lose the records that I want to move on with...


This must be an easy thing so solve, but I really can't find out how.
Is there anyone in here who can help me out?

If any code or further explanation is needed, I will provide it.
0
Comment
Question by:sennen
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 7
  • 5
12 Comments
 
LVL 22

Expert Comment

by:PedroCGD
ID: 24785890
well.... give us more details...
... I also supose you understand correctly SSIS for setting 250 points...

regards,
Pedro
www.pedrocgd.blogspot.com
0
 

Author Comment

by:sennen
ID: 24786036
Concerning the points being awarded, I don't know how to rate the question, and I couldn't find any guidelines. I probably don't understand SSIS completely as I should, being a total newbie. However, when I couldn't find an answer to the question anywhere, I thought I'd try here. Do you suggest the points reward too high or too low??


More details...
What kind of details would be useful?

The insert of any new suppliers is only one step on the way in a larger process. That means I need the original dataset (with SupplierId added as a column, of course) with me "downstream", for later update of a fact table.

1. Lookup
2. If match - proceed to 4
3. If no match - insert new Suppliers and proceed to 3
4. Join the two datasets (same records as before, but now with SupplierId added, both previously existing and newly inserted)
5. Proceed working with the dataset, e.g. inserting it into a fact table...

0
 
LVL 22

Expert Comment

by:PedroCGD
ID: 24786127
relating the points I normally create an SSIS example for those who set the question with 500 points, and give guiedlines and suggestions to those who give less than 500poinst, because I understand that those persons are familiar with SSIS and I dont need to create the example... only for that reason. You are welcome here!

Ok... now...
You are looking data for SupplierName or supplierID?
Could you attach and example of input data and output with what you want... some details to better explain your issue!

regards!
Pedro
0
Free Backup Tool for VMware and Hyper-V

Restore full virtual machine or individual guest files from 19 common file systems directly from the backup file. Schedule VM backups with PowerShell scripts. Set desired time, lean back and let the script to notify you via email upon completion.  

 

Author Comment

by:sennen
ID: 24812467
OK, thanks for your explanation.

I've attached an image, I hope it helps explaining what I'm trying to do.

From "upstream", a dataset is running, containing some columns and records, a.o. SupplierName.
In "Derived Column" I add a SupplierId column (int) and set it to 1.
The "Lookup" is checking the Supplier table in the database (columns Id, Name), trying to match SupplierName with Name in table, replacing SupplierId if it matches. So far everything is ok.

The error output from "Lookup" contains Suppliers that don't yet exist in the database, and thus need to be added.
This is supposed to be done in "OLE DB Command".
The SqlCommand in "OLE DB Command" is "INSERT INTO [Supplier] (Name) VALUES (?)".
The result of this data flow transformation is that it inserts all the records' SupplierName. I of course want unique Suppliers to be created.

In "Lookup 1" after the insert, I do the came check as in the previous lookup, matching supplier names with supplier id (that now exist in the database).


Let's say that the dataset contains the following data (with header/column name on top)
Name        Supplier
Mr X          Home Depot
Mrs Y        Walmart
Mr Z          Walmart
Mr W         Home Depot
Ms Q         Walgreens

I want the Supplier table to be filled with three suppliers, not five.

I tried to add a Sort component before the insert, checking "Remove rows with duplicate sort values", but as I'm using the pass-through dataset "downstream" (of course I want to insert/update all facts), this doesn't work for me.


Does that explain my problem better?
2009-07-09-130530.jpg
0
 
LVL 22

Expert Comment

by:PedroCGD
ID: 24812911

The lookup 1 found the suppliers names inserted in the previous step?
Try to use aggregate component to have all distinct suplliers names...

or you can also split the dataflow in 2 dataflow... one to insert suppliers and other to do the lookups and insert into destination...
helped?

Pedro
www.pedrocgd.blogspot.com
0
 

Author Comment

by:sennen
ID: 24813199
Yes, the Lookup 1 finds all the recently (in the step before) added suppliers.

Splitting the dataflow in 2... Isn't that exactly what I'm doing?
The flow with the not-yet-existing suppliers is taking a branched stream, to insert suppliers... When done, it merges into the "main" flow again, to insert all data into the destination.

I still think the problem is that the OLE DB Command is inserting all suppliers and duplicates, not unique suppliers.

In other words, the stream should look like this:

[Dataset] to process, coming from upstream:
PName      Supplier
Mr X          Home Depot
Mrs Y        Walmart
Mr Z          Walmart
Mr W         Home Depot
Ms Q         Walgreens

1. INSERT INTO Supplier (Name)
    SELECT Supplier FROM [Dataset]

2. Do a Lookup to get SupplierId for suppliers in [Dataset]. The [Dataset] now looks like this:
PName      Supplier          SupplierId
Mr X          Home Depot    1
Mrs Y        Walmart          2
Mr Z          Walmart          2
Mr W         Home Depot    1
Ms Q         Walgreens      3

2. Do a Lookup to see which records in [Dataset] to insert and which to update.

3. Have one OLE DB Command update the destination table, and one OLE DB Destination to insert into the destination table. The destination table would finally look as follows:

PName      SupplierId
Mr X          1
Mrs Y        2
Mr Z          2
Mr W         1
Ms Q         3


0
 
LVL 22

Expert Comment

by:PedroCGD
ID: 24813239
I'm a little confused... I will start from
PName      Supplier
Mr X          Home Depot
Mrs Y        Walmart
Mr Z          Walmart
Mr W         Home Depot
Ms Q         Walgreens

and make an example for you.. I think it's better!
Pedro
0
 
LVL 22

Accepted Solution

by:
PedroCGD earned 2000 total points
ID: 24813639
I I wrote split the dataflow is soplit in two different dataflow... one to insert dimensions (Supplier) and other to insert facts... check attached images...

Helped?
Regards,
Pedro
www.pedrocgd.blogspot.com
SSIS-Interface.JPG
SSIS-Dimensions.JPG
SSIS-Facts.JPG
0
 
LVL 22

Expert Comment

by:PedroCGD
ID: 24823408
improvements?
0
 

Author Comment

by:sennen
ID: 25088411
Hi again, I apologize for my late response, the summer holidays came in between...

OK, performing it as two tasks, one for inserting/updating dimensions and one for inserting/updating the actual facts... That makes sense. It may take some more time, if the data source is very large, but it's a clean way to do it, and as I can't see any other way to solve it (with a data flow transformation control. e.g.), I will try it out! I'll be back with the result shortly.
0
 
LVL 22

Expert Comment

by:PedroCGD
ID: 25088657
ok.. I'll be waiting and this time more fast, ok? :-)
regards,
Pedro
www.pedrocgd.blogspot.com
0
 

Author Comment

by:sennen
ID: 25089032
Hi again!

It worked like a charm!
I accepted your solution, and raised the point value, I think you put a lot of time into this, which I'm grateful for. I hope you're ok with that. :o)

Thank you very much for your help!
0

Featured Post

Get real performance insights from real users

Key features:
- Total Pages Views and Load times
- Top Pages Viewed and Load Times
- Real Time Site Page Build Performance
- Users’ Browser and Platform Performance
- Geographic User Breakdown
- And more

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

The Delta outage: 650 cancelled flights, more than 1200 delayed flights, thousands of frustrated customers, tens of millions of dollars in damages – plus untold reputational damage to one of the world’s most trusted airlines. All due to a catastroph…
It is possible to export the data of a SQL Table in SSMS and generate INSERT statements. It's neatly tucked away in the generate scripts option of a database.
This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.
Via a live example, show how to backup a database, simulate a failure backup the tail of the database transaction log and perform the restore.
Suggested Courses

752 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question