Solved

SSIS Lookup

Posted on 2013-12-03
421 Views
Last Modified: 2016-02-10
Hi,

I need to create a package that imports data from an Excel file into a MSSQL database. The package needs to look up codes from the database using descriptions in the Excel file. If a lookup does not match, I would like to email an end user about the non-matching row and stop the database import.

So far I have added a lookup and set it to redirect error rows, which works fine. But I am unable to stop the database import.

Is this possible?

Thanks, Greg
Question by:spen_lang
13 Comments
 
LVL 65

Expert Comment

by:Jim Horn
ID: 39692558
<knee-jerk reaction>

>So far I have added a lookup and set it to redirect error rows
Add a RowCount transform between this and the next step, and assign the count to a variable.

Then in your control flow, change the arrow that goes from this data pump to the next task to be conditional: Success AND @your_variable == 0.

If the variable is zero, execution continues; if it's greater than zero, it will not proceed to the next step, i.e. it stops there.
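
A minimal sketch of that precedence constraint, assuming a package variable named User::ErrorRowCount (the name is just for illustration) that the Row Count transform writes to: set the Evaluation operation to "Expression and Constraint", the Value to Success, and use an expression such as:

    @[User::ErrorRowCount] == 0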
 

Author Comment

by:spen_lang
ID: 39692581
The lookup is connected to an OLE DB Destination, which does not seem to give me this option... unless I am looking in the wrong place (mind block).
 
LVL 65

Expert Comment

by:Jim Horn
ID: 39692594
>The lookup is connected to an OLE DB Destination
Please send me a screen shot of this.

Normally this is done with a separate Lookup Transform between source and destination, which you can then branch off of for errors / missing rows.
 

Author Comment

by:spen_lang
ID: 39692629
(screenshot attached: Data Flow)
 
LVL 65

Expert Comment

by:Jim Horn
ID: 39692674
So create a variable somewhere, and create five Row Count transforms between the script task and the Union All (you could even set the value in the script tasks since it's there, and avoid the Row Count transforms).

>But I am unable to stop the database import.
Ok, I get it.  Slightly different dang deal.  Add a Conditional Split between the final lookup and the transform, double-click on it, and create two paths: one where @the_variable == 0 that leads to the destination, and one where @the_variable != 0 that leads to the email or whatever you want to do other than insert.

Then in the control flow, same instructions as above.
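
As a rough sketch (same illustrative User::ErrorRowCount variable as above), the Conditional Split could define a single case routed to the email/error path, with the default output carrying on to the destination:

    @[User::ErrorRowCount] != 0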
 
LVL 37

Expert Comment

by:ValentinoV
ID: 39692706
"But I am unable to stop the database import."

Just wondering: why would you want this?  Is it an "all or nothing" matter?  After all, it's only the "good" records that will make it through to the destination...

If you really, really (really) need that then you'll need to work with a staging table.  In your Data Flow, records get inserted into the staging table.  You'll also need to keep track of the number of records coming out of your source, which can be done as explained by Jim (Row Count).  The Row Count component should be located directly under the source, because at that moment the flow still contains all records.

The next step in your control flow compares the number of records in the staging table with the Row Count variable: when they are equal, the records should get written to the final destination table.  This can all be done through an Execute SQL Task...
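
A minimal sketch of that Execute SQL Task, with purely illustrative table and column names, assuming an OLE DB connection and the Row Count variable mapped to parameter 0:

    -- ? is mapped to the SSIS Row Count variable (OLE DB parameter syntax)
    DECLARE @SourceRowCount int = ?;

    IF (SELECT COUNT(*) FROM dbo.StagingImport) = @SourceRowCount
    BEGIN
        INSERT INTO dbo.FinalImport (Code, Description)  -- illustrative column list
        SELECT Code, Description
        FROM dbo.StagingImport;
    END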

 
LVL 37

Expert Comment

by:ValentinoV
ID: 39692739
Looks like Jim posted his reply while I was writing mine :)

Jim: "One if @the_variable == 0 that leads to the destination, and one if @the_variable != 0 that leads to the email"

Are you sure that will have the desired outcome? As SSIS processes data in batches, let's say there's an issue in batch number 3: wouldn't the data of the first two batches already be written to the destination, since the counter only changes from zero in batch 3?
 
LVL 65

Accepted Solution

by:
Jim Horn earned 250 total points
ID: 39692768
Hey VV..

>As SSIS uses batches, let's say there's an issue
Yeah I see what you mean.  Rows could be written to the destination before an exception is detected, defeating the 'all or nothing' requirement.   Maybe this should be two separate data pumps, one for the validation and one for the conditional execution, unless you see a way to pull this off in the same data flow.   Is there a batch setting that can be edited?

>If you really really (really) need that then you'll need to work with a staging table.
Didn't go down that road, but excellent point.  If we're talking a wompload of data, it's usually better to accept what you can; that way the prod support people only have an 'I have 10 rows not accounted for' conversation (and volume level) as opposed to a 'The entire file was not accepted because of a small number of rows' conversation.

spen_lang - I ran an ETL group for two years, so if at all possible take VV's advice: accept all rows, then frame up the ones that fail the lookup for subsequent handling. That would cut down on the prod support time to fix + screaming level.

Trust me on this one.  Been there, dealt with the screaming.
 
LVL 37

Assisted Solution

by:ValentinoV
ValentinoV earned 250 total points
ID: 39692934
(hey Jim) :)

An alternative approach to my proposal (and actually the preferred one; Jim's previous comment triggered it) could be this: all incoming records (incl. the bad ones) go into the staging table.  The bad records have additional fields filled in that detail what went wrong with them.  Only the good ones make it further down to PRD.

This makes it a lot easier to troubleshoot error situations, as you'll have all problems sitting in the staging table.

Oh, and don't forget to store the file name in one of the columns!
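
Purely as an illustration of that idea (all names invented here), the staging table could look something like:

    CREATE TABLE dbo.StagingImport
    (
        Description    nvarchar(255)  NULL,     -- description as read from the Excel file
        Code           int            NULL,     -- looked-up code; NULL when the lookup failed
        ErrorMessage   nvarchar(1000) NULL,     -- detail of what went wrong; NULL for good rows
        SourceFileName nvarchar(260)  NULL,     -- the Excel file the row came from
        LoadDate       datetime       NOT NULL DEFAULT (GETDATE())
    );

Only rows where ErrorMessage is NULL would then be moved on to the production table.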
 

Author Comment

by:spen_lang
ID: 39692959
I have to use a staging table for this project in any case, so this solution sounds spot on, thanks.

On another note, what is the easiest way to get all data from a "union all" into an email as an attachment? Will I have to export it to a csv file and then pick it up again?

Thanks
 
LVL 37

Expert Comment

by:ValentinoV
ID: 39692963
"Maybe this should be two separate data pumps, one for the validation and one for the conditional execution, unless you see a way to pull this off in the same data flow.   Is there a batch setting that can be edited?"

Not that I'm aware of.  I can think of a control flow like this (sketched a bit more concretely below):

1. start transaction
2. data flow (fail in case of problem)
3. commit transaction if all okay

But I'm not really in favor of this option because it requires too much tweaking, such as setting the RetainSameConnection property of the connection manager to True (which is not the default)...

And the built-in transaction support, well, I stay away from that...
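
For completeness, a sketch of what those three steps could look like, assuming the OLE DB connection manager has RetainSameConnection = True so the Execute SQL Tasks and the Data Flow share one session (again, more tweaking than I'd like):

    -- Execute SQL Task 1: start the transaction
    BEGIN TRANSACTION;

    -- (Data Flow Task runs here, loading the destination over the same connection)

    -- Execute SQL Task 2: only reached when the Data Flow succeeds
    COMMIT TRANSACTION;

    -- optional Execute SQL Task on the failure path
    ROLLBACK TRANSACTION;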
 
LVL 65

Expert Comment

by:Jim Horn
ID: 39692993
>what is the easiest way to get all data from a "union all" into an email as an attachment?
Hmm.  Knew I should have written that article.

That would be a Send Mail Task (I'd use the Task Factory Advanced Email and SMS Task if that's available to you), where the body variable is populated with an HTML-based table, similar to what this answer by SQL expert ScottPletcher does.

There is a limitation of 4,000 characters in an email, and this is a fair amount of work to pull off, so as a quick-and-dirty solution it would be a better idea to output these rows into a text file, then send your email with that text file as an attachment.
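
If Database Mail happens to be configured on the server, yet another option (different from the Send Mail Task route above, and reusing the illustrative staging table names from earlier) is to let an Execute SQL Task attach the failed rows as a query result:

    EXEC msdb.dbo.sp_send_dbmail
        @profile_name = 'MailProfile',             -- assumed Database Mail profile name
        @recipients   = 'end.user@example.com',    -- illustrative recipient
        @subject      = 'Rows that failed the lookup',
        @query        = 'SELECT Description, ErrorMessage, SourceFileName
                         FROM dbo.StagingImport
                         WHERE ErrorMessage IS NOT NULL',
        @attach_query_result_as_file = 1,
        @query_attachment_filename   = 'failed_lookups.csv';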
 

Author Comment

by:spen_lang
ID: 39693045
Thanks for all your help, all very good points which I have made a start on and will continue with tomorrow. Will share the points out tomorrow...
