Sometimes, you need to use PIVOT in SSIS to ensure your data matches the output requirements of your users. So, what is PIVOT and what is SSIS and how can that help ?
Firstly a quick explanation of those acronyms as described by Microsoft :
Microsoft SQL Server Integration Services (SSIS) is a platform for building high performance data integration solutions, including the extraction, transformation, and loading (ETL) of packages for data warehousing. SSIS is the new name assigned to the component formerly branded as Data Transformation Services (DTS).
PIVOT rotates a table-valued expression by turning the unique values from one column in the expression into multiple columns in the output, and performs aggregations where they are required on any remaining column values that are wanted in the final output
Sound OK so far ? So, what we will be doing is using the SSIS Tool to perform a Query and transform the output into a PIVOT result. This tutorial provides a complete pivot sample on AdventureWorks sample SQL Server database with SSIS 2008... The database is downloadable from the Microsoft website.
But first, we do need some kind of "real world" scenario as a basis of our tutorial. So, let us assume the requirement that your Logistics manager wants to find order quantities for each product by year and have the output provided as a seperate output file (maybe a spreadsheet).
You may use the query to get the result:
year(SalesOrderHeader.OrderDate) as OrderYear,
sum(SalesOrderDetail.OrderQty) as OrderQuantity
inner join Production.Product
inner join Sales.SalesOrderHeader
group by Product.ProductID,Product.Name,year(SalesOrderHeader.OrderDate)
order by Product.Name,year(SalesOrderHeader.OrderDate)
and this is result of query above:
Which is fine - all the raw data is there, but we really want to look at the years going across the page. This is when we need to use the PIVOT functionality.
Now, we do assume some basic knowledge of SSIS, and if you dont, then not to worry too much, but recommend you first have a look at some of the material on : http://msdn.microsoft.com/en-us/sqlserver/cc510302.aspx
Right, now we are ready to begin with SSIS, and the very first step is to create a new SSIS package.
Add a Data Flow Task and in the dataflow tab :
Add a OLE DB Source and connect it to AdventureWorks database in your SQL Server
Write the query in SQL command Text of oledb data source,
Then add a PIVOT Transformation after OLE DB Data Source
In Pivot transformation advanced editor , go to input columns tab and select all input columns
Then go to input and output properties tab, select pivot default input, under input columns you will find all columns which you selected in previous tab.
The only property you must set for each input column is the PivotUsage. This describe the values you can use in PivotUsage property:
So , in this example , PivotUsage property for each input column will be as below:
Notice that you must have at least one input column with PivotUsage 2, one input column with PivotUsage 3,one input column with PivotUsage 0 OR 1 at least in your pivot transformation.
After setting input columns, go to Pivot Default output , and under output columns add these columns:
ID, Name , 2001 ,2002 , 2003 , 2004
ID and Name columns will show the exact values form ProductID and Name input columns.
So, set SourceColumn property of ID output column to LineAgeID of ProductID input column.
Note: this is critical to use the lineageID value of input columns not the ID value. if you use ID value you will face error.
But how can you find lineageID of input columns? Simply select ProductID under input columns and see the lineageID value there.
Note: the lineageID is different on each machine, so you can not use exact lineage numbers used in this example. You must use your lineageID values.
this is lineageID of ProductID input column:
and this is the SourceColumn property in ID output column:
leave the PivotKeyValue of ID and Name output column as empty.
Do above steps for Name output column with Name input column too
Now it's Pivot columns turn, select 2001 under output columns this column will show the results for the year 2001, so enter 2001 in PivotKeyValue property. And set SourceColumn with lineageID of OrderQuantity input column.
In fact you must set SourceColumn property with the lineageID of the input column which has the PivotUsage 3 value.
in this example , lineageID of OrderQuantity is 69 (Remember you must use your own lineageIDs)
So, we set SourceColumn with 69 in 2001 output column:
Now , do the same for 2002,2003,2004 columns. The only difference is that you must set PivotKeyValue to 2002,2003,2004 in these columns. But lineageID will be the same in all 2001 - 2004 columns.
Pivot configuration finished now.
Just add a destination, and fill the result of Pivot in destination. I used a RecordSet Destination with a Grid Data Viewer to show the result, you can add Grid dataviewer by right click on green arrow between pivot transformation and your destination, then select data viewers, then add Grid. Your own destination may well be an external file (like a spreadsheet), maybe you datawarehouse, or even another SQL Server.
this is the full desing of my DataFlow Task:
and this is the final Result: