<

How to use Script Component as Asynchronous Transformation

Published on
17,331 Points
10,631 Views
2 Endorsements
Last Modified:
Approved
Community Pick
Suppose you have a flat file which is not delimited or fixed width or Ragged right. the structure of data is not well formed for Flat File Connection manager to read,

This is an example of data file:

date              : 04/14/2010
customer name     : John doe
address           : 2014 apple st. Chicago, IT 30000
sale order        : 04/15/2010
confirmation tag  : abcdefghi
date              : 12/14/2010
customer name     : reza rad
address           : Tehran Iran
sale order        : 02/15/2010
confirmation tag  : rererererere
date              : 06/14/2009
customer name     : Abolfazl rad goudarzi
address           : Tehran Iran
sale order        : 02/05/2010
confirmation tag  : fasgfsagas

Open in new window


You need to fetch data and fill them in appropriate fields in a Database table.

As you may know you can not use Flat File manager without any handy works to change data in appropriately manner. in fact you need a Transformation between Flat File Source and the Destination.

But how can you do this?

the answer is ...

With Asynchronous Transformation.

Let me talk about what the Asynchronous transformation is exactly:

there are two types of transformation in SSIS .

1- Synchronous Transformation

in this kind of transformation , the input data will process ROW BY ROW . and output is synchronize with the input. a good example of Synchronous transformation is the Derived Column Transformation.

2- Asynchronous Transformation

in this kind of transformation, the input data will fetch all at first, ALL ROWS will read in first step, and then output will generate. there is no synchronization between input and output. an example of Asynchronous Transformation is the Aggregate Transformation.

Let's back to sample data file above, We need to fetch data of 5 rows and combine them and make one output row. combination of input data needs to be done by a Script Component Transformation. And this is obvious that this script component should be Asynchronous Transformation.

Now we will implement a Script Component which work Asynchronously . Let me explain in steps:

1- Create new SSIS package,and add a Data Flow task in it. then add a Flat File Source in the Data Flow.

2- Configure the Flat File source, add a Flat File connection manager, connect it to the sample datafile above, let's name it as ee.txt . in the Columns tab, set Column Delimiter as Colon {:} .
 1.jpg

this Column delimiter will separates field names and field values . this means that there will be two column : Column 0 and Column 1. column 0 will consists of field names in flat file: date  - customer name - address - sale order - confirmation tag . and column 1 will consists of field values for each field name.

3- Add a Script Component Transformation and set it as transformation. double click on it and configure it.

4- in Input Columns tab, add two columns : column 0 , column 1
 2.jpg

5- in Input and Outputs tab, select the Output 0, and rename it as myFlatOutput. set the SynchronousInputID as None. this will change your script component to asynchronous.
 3.jpg

6- add these output columns under myFlatOutput :

column name           DataType

date                      date [DT_DATE]

customername         string [DT_STR]

address                  string [DT_STR]

saleorder                date [DT_DATE]

confirmationtag       string [DT_STR]

 4.jpg

 7- in script tab, set Script Language as Microsoft Visual C# 2008, and Edit Script.

8- override the Input0_ProcessInput method to fetch all data till end of file, as below:
 
public override void Input0_ProcessInput(Input0Buffer Buffer)
    {
        while (Buffer.NextRow())
        {
            Input0_ProcessInputRow(Buffer);
        }

        if (Buffer.EndOfRowset())
        {
            myFlatOutputBuffer.SetEndOfRowset();
        }
    }

Open in new window


9- override the Input0_ProcessInputRow method to add new rows to output, and combine values and make a full row, as below:
 
public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        if(Row.Column0.ToLower().Trim() == "date")
        {
            myFlatOutputBuffer.AddRow();
        }

        switch(Row.Column0.ToLower().Trim())
        {
            case "date":
                myFlatOutputBuffer.date = DateTime.ParseExact(Row.Column1.Trim(), "MM/dd/yyyy", System.Globalization.CultureInfo.InvariantCulture);
                break;
            case "customer name":
                myFlatOutputBuffer.customername = Row.Column1.Trim();
                break;
            case "address":
                myFlatOutputBuffer.address = Row.Column1.Trim();
                break;
            case "sale order":
                myFlatOutputBuffer.saleorder = DateTime.ParseExact(Row.Column1.Trim(), "MM/dd/yyyy", System.Globalization.CultureInfo.InvariantCulture);
                break;
            case "confirmation tag":
                myFlatOutputBuffer.confirmationtag = Row.Column1.Trim();
                break;

        }
    }

Open in new window


Note that you should use AddRow() method to add new rows to the output of transformation.

10- Build the script and exit. this is the whole script:
 
/* Microsoft SQL Server Integration Services Script Component
*  Write scripts using Microsoft Visual C# 2008.
*  ScriptMain is the entry point class of the script.*/

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    public override void PreExecute()
    {
        base.PreExecute();
        /*
          Add your code here for preprocessing or remove if not needed
        */
    }

    public override void PostExecute()
    {
        base.PostExecute();
        /*
          Add your code here for postprocessing or remove if not needed
          You can set read/write variables here, for example:
          Variables.MyIntVar = 100
        */
    }

    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        if(Row.Column0.ToLower().Trim() == "date")
        {
            myFlatOutputBuffer.AddRow();
        }

        switch(Row.Column0.ToLower().Trim())
        {
            case "date":
                myFlatOutputBuffer.date = DateTime.ParseExact(Row.Column1.Trim(), "MM/dd/yyyy", System.Globalization.CultureInfo.InvariantCulture);
                break;
            case "customer name":
                myFlatOutputBuffer.customername = Row.Column1.Trim();
                break;
            case "address":
                myFlatOutputBuffer.address = Row.Column1.Trim();
                break;
            case "sale order":
                myFlatOutputBuffer.saleorder = DateTime.ParseExact(Row.Column1.Trim(), "MM/dd/yyyy", System.Globalization.CultureInfo.InvariantCulture);
                break;
            case "confirmation tag":
                myFlatOutputBuffer.confirmationtag = Row.Column1.Trim();
                break;

        }
    }

    public override void Input0_ProcessInput(Input0Buffer Buffer)
    {
        while (Buffer.NextRow())
        {
            Input0_ProcessInputRow(Buffer);
        }

        if (Buffer.EndOfRowset())
        {
            myFlatOutputBuffer.SetEndOfRowset();
        }
    }



    public override void CreateNewOutputRows()
    {
        /*
          Add rows by calling the AddRow method on the member variable named "<Output Name>Buffer".
          For example, call MyOutputBuffer.AddRow() if your output was named "MyOutput".
        */
    }

}

Open in new window


11- Great Done. and now, I want to use a Recordset destination and a DataViewer to show the results, Note that you can use your OLEDB destination or any other destination you want here.

Create a variable and name it as Variable, set datatype as object.

add a RecordSet Destination, double click on it, set variable name as User::Variable,

in Input columns tab, add all input columns.

right click on precedence constraint between script component and recordset destination, and select DataViewer.

hit add , and in next window hit ok.

12- run the package.

this is a schema of DataFlow :
 5.jpg

and the result:
 6.jpg

That's all.
2
Comment
Author:Reza Rad
0 Comments

Featured Post

Introducing Cloud Class® training courses

Tech changes fast. You can learn faster. That’s why we’re bringing professional training courses to Experts Exchange. With a subscription, you can access all the Cloud Class® courses to expand your education, prep for certifications, and get top-notch instructions.

Join & Write a Comment

This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.

Keep in touch with Experts Exchange

Tech news and trends delivered to your inbox every month