Community Pick: Many members of our community have endorsed this article.
Editor's Choice: This article has been selected by our editors as an exceptional contribution.

PIVOT Transformation - SSIS - Complete Tutorial

Reza RadConsultant, Trainer
CERTIFIED EXPERT
Published:
Updated:
Sometimes, you need to use PIVOT in SSIS to ensure your data matches the output requirements of your users. So, what is PIVOT and what is SSIS and how can that help ?

Firstly a quick explanation of those acronyms as described by Microsoft :
 
Microsoft SQL Server Integration Services (SSIS) is a platform for building high performance data integration solutions, including the extraction, transformation, and loading (ETL) of packages for data warehousing. SSIS is the new name assigned to the component formerly branded as Data Transformation Services (DTS).
 
PIVOT rotates a table-valued expression by turning the unique values from one column in the expression into multiple columns in the output, and performs aggregations where they are required on any remaining column values that are wanted in the final output
Sound OK so far ? So, what we will be doing is using the SSIS Tool to perform a Query and transform the output into a PIVOT result. This tutorial provides a complete pivot sample on AdventureWorks sample SQL Server database with SSIS 2008... The database is downloadable from the Microsoft website.

But first, we do need some kind of "real world" scenario as a basis of our tutorial. So, let us assume the requirement that your Logistics manager wants to find order quantities for each product by year and have the output provided as a seperate output file (maybe a spreadsheet).

You may use the query to get the result:
 
select
                          Product.ProductID,
                          Product.Name,
                          year(SalesOrderHeader.OrderDate) as OrderYear,
                          sum(SalesOrderDetail.OrderQty) as OrderQuantity
                      from Sales.SalesOrderDetail
                      inner join Production.Product
                      on SalesOrderDetail.ProductID=Product.ProductID
                      inner join Sales.SalesOrderHeader
                      on SalesOrderDetail.SalesOrderID=SalesOrderHeader.SalesOrderID
                      group by Product.ProductID,Product.Name,year(SalesOrderHeader.OrderDate)
                      order by Product.Name,year(SalesOrderHeader.OrderDate)

Open in new window


and this is result of query above:
1.jpg

Which is fine - all the raw data is there, but we really want to look at the years going across the page. This is when we need to use the PIVOT functionality.

Now, we do assume some basic knowledge of SSIS, and if you dont, then not to worry too much, but recommend you first have a look at some of the material on : http://msdn.microsoft.com/en-us/sqlserver/cc510302.aspx

Right, now we are ready to begin with SSIS, and the very first step is to create a new SSIS package.

Add a Data Flow Task and in the dataflow tab :

Add a OLE DB Source and connect it to AdventureWorks database in your SQL Server

Write the query in SQL command Text of oledb data source,

Then add a PIVOT Transformation after OLE DB Data Source
  2.jpg

In Pivot transformation advanced editor , go to input columns tab and select all input columns
  3.jpg

Then go to input and output properties tab, select pivot default input, under input columns you will find all columns which you selected in previous tab.

The only property you must set for each input column is the PivotUsage. This describe the values you can use in PivotUsage property:
  4.jpg

So , in this example , PivotUsage property for each input column will be as below:

ProductID           0
Name                 1
OrderYear           2
OrderQuantity     3

Notice that you must have at least one input column with PivotUsage 2, one input column with PivotUsage 3,one input column with PivotUsage 0 OR 1 at least in your pivot transformation.
  5.jpg

After setting input columns, go to Pivot Default output , and under output columns add these columns:

ID, Name , 2001 ,2002 , 2003 , 2004

ID and Name columns will show the exact values form ProductID and Name input columns.

So, set SourceColumn property of ID output column to LineAgeID of ProductID input column.

Note: this is critical to use the lineageID value of input columns not the ID value. if you use ID value you will face error.

But how can you find lineageID of input columns? Simply select ProductID under input columns and see the lineageID value there.

Note: the lineageID is different on each machine, so you can not use exact lineage numbers used in this example. You must use your lineageID values.

this is lineageID of ProductID input column:
  6.jpg

and this is the SourceColumn property in ID output column:
  7.jpg

leave the PivotKeyValue of ID and Name output column as empty.

Do above steps for Name output column with Name input column too

Now it's Pivot columns turn, select 2001 under output columns this column will show the results for the year 2001, so enter 2001 in PivotKeyValue property. And set SourceColumn with lineageID of OrderQuantity input column.

In fact you must set SourceColumn property with the lineageID of the input column which has the PivotUsage 3 value.

in this example , lineageID of OrderQuantity is 69 (Remember you must use your own lineageIDs)
  8.jpg

So, we set SourceColumn with 69 in 2001 output column:
  9.jpg

Now , do the same for 2002,2003,2004 columns. The only difference is that you must set PivotKeyValue to 2002,2003,2004 in these columns. But lineageID will be the same in all 2001 - 2004 columns.

Pivot configuration finished now.

Just add a destination, and fill the result of Pivot in destination. I used a RecordSet Destination with a Grid Data Viewer to show the result, you can add Grid dataviewer by right click on green arrow between pivot transformation and  your destination, then select data viewers, then add Grid. Your own destination may well be an external file (like a spreadsheet), maybe you datawarehouse, or even another SQL Server.

this is the full desing of my DataFlow Task:
  10.jpg

and this is the final Result:
  11.jpg
 
3
16,752 Views
Reza RadConsultant, Trainer
CERTIFIED EXPERT

Comments (1)

ZolfSoftware

Commented:
Could not view/open the images for this article

Have a question about something in this article? You can receive help directly from the article author. Sign up for a free trial to get started.