Link to home
Start Free TrialLog in
Avatar of ITMikeK
ITMikeKFlag for United States of America

asked on

How would one write a set based query to replace a cursor to handle this situation?

Given a flat file that looks like this:

AcctNum Date1 Value1 Date2 Value2 Date3 Value3 ...... through Value 13
This is a rolling data file supplied by a customer for the last 13 months of data. For example: in the August file Date1 would probably be something like 20130811, Date2 would be the previous month, 20130705, etc.

When we would get the September file, Date1 would be 20130915, Date2, 20130811, Date3 20130705. See the pattern?

What I am trying to do is create a small subset of Value records for each account/date. Example : AcctNum: 0001, TransDate 201308, Value 5555 AcctNum: 0001, TransDate 201309, Value 6666

I've been able to do this with a cursor, going through each Date/Value pair from the input file (13 times per record), determining if the AcctNum/TransDate already exists in our subset, if not, insert it.

Needless to say, this is extremely slow and we have at least 750K records per month to process. Is there any way to handle this without using a cursor?

I'm trying to be as concise as possible, but let me know if I can clarify anything.

Thank you!

sql-server-2012
Avatar of Kevin Cross
Kevin Cross
Flag of United States of America image

Have you tried this using UNPIVOT, or another such approach like UNION SELECT AcctNum, DateX, ValueX?  Both avoid the cursor, so may reduce the overhead of the process.

P.S. How are you bringing in the flat file?  If it is via SSIS, you can use the unpivot transformation, so that the data is normalized in your staging table (before insert into production table).

SSIS transform: http://technet.microsoft.com/en-us/library/ms141723.aspx
T-SQL operator: http://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx
Avatar of ITMikeK

ASKER

In other words I want to transform :

     AccountNum  Date1 Value1 Date2 Value2 Date3 Value3 .......Date13 Value3 into

     New Usage File

     Rec1: AccountNum
               Contents of Date1
               Contents of Value 1

     Rec2: AccountNum
               Contents of Date 2
               Contents of Value2

                 .................

Remember, every month Date/Value pairs will roll to the right with the newest months data in the Date 1/Value1 columns.  The previous Date1/Value1 pair will now be in Date2/Value2 with the previous information in Date13/Value13 "dropping off" (hence, a 13 month history)

There are 400K+ records in the flat file to per processed every month.  Also, if a new customer was added, we (might) have their entire 13 month history record added to the flat file.

Hope this helps.
@ITMikeK,

I was editing my post when you were typing, so please re-read above.  I understood the movement of the data in my original post.  The point is it will be computationally more difficult if you are doing 13 operations versus flattening the data and performing one.

Hence, my recommendation is to look at UNPIVOT either in SSIS or T-SQL.
Avatar of ITMikeK

ASKER

I do have access to SSIS for SQL Server 2012.  Any suggestions for best examples before I start googling everything?
ASKER CERTIFIED SOLUTION
Avatar of Kevin Cross
Kevin Cross
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Correction to my note/code above.

If you can have NULL values, you may want a LEFT JOIN here.

Yes, LEFT JOIN would work, but the flaw is that ROW_NUMBER() will sequence the non-NULL values, making your matching invalid.  Hence, if you do have NULL values, you need to change more than the JOIN type.  Therefore, a better solution to get "rn" to match dates and values is...

rn = REPLACE(date_num, 'date', '') /* REPLACE(value_num, 'value', '') */

Open in new window


This works if the suffix behind 'date' and 'value' are the same by pair and not just numbers based on the order they appear.

Let me know if you have any questions.
Avatar of ITMikeK

ASKER

Kevin,

   The UNION option worked great!  Now what were you saying about MERGE?  We are using 2012, but I haven't tried that yet.  The ultimate goal is to take next months file run this same UNION query and whatever doesn't exist in the first set from the new file, go ahead and insert it based on AccountNum and First_Date, like you mentioned.  Is there a quick way to complete this?
Avatar of ITMikeK

ASKER

Actually, I looked it up and it is very easy to understand.  Thank you for your help!
You are most welcome!  Yes, I found MERGE easy to adopt.  

Best regards and happy coding,

Kevin