Dynamic Pivot Procedure for SQL Server

AID: 653
  • Status: Published

29667 points

  • Bymark_wills
  • TypeTutorial
  • Posted on2009-05-15 at 12:32:05
Awards
  • Community Pick
  • Experts Exchange Approved
  • Editor's Choice

by Mark Wills

PIVOT is a great facility and solves many an EAV (Entity - Attribute - Value) type transformation where we need the information held as data within a column to become columns in their own right. Now, in some cases that is relatively easy to do, other times it can be a challenge (and that is why we are here).

Let's have a quick look at the PIVOT function...

SELECT
          <display_column_list>

FROM

          (SELECT <source_columns> as Column_Source
                         ,<column_to_be_aggregated> as Column_Value
                         ,<column_with_new_column_names> as Column_List
           FROM <datasource> ) as DataSource

PIVOT
          (<aggregate_function>(Column_Value)  FOR  Column_List  IN
          ([<new_column_1_heading>],[<new_column_2_heading>],...,[<new_column_N_heading>]) ) PivotTable

ORDER BY <column_number_or_name>;

That looks pretty straight forward, except for one or two small details:

1) First up, we need to know the <display_column_list>
    easy enough, just do a Select * instead and problem solved (except that it does control display sequence)

2) Secondly, we need to know and hard code the new column headings as in :
     ([<new_column_1_heading>],[<new_column_2_heading>],...,[<new_column_N_heading>])

And that last point is often the big challenge. Not so bad if we are doing something static like "months in a year", just list out those months in sequence, and make sure you can cast the column which contains those new headings accordingly e.g. datename (month,<column_with_new_column_names>)

But, what about a moving target - like "last 3 months" ? Or, an EAV table with unknown attribute names ? Normally, that means we need to rewrite our query every month, or after every change of data.

There is a way, and that involves some Dynamic SQL, more importantly, we can make it a procedure which can handle any "simple" dynamic pivot table.

So, lets get started... but first we need a fairly simple example, so we will create some data accordingly. Feeling generous, we will do two. One is a classic "rolling periods"  and the other a typical EAV

CREATE TABLE tst_CustSales (
   TCS_ID INT Identity Primary Key Clustered,
   TCS_Customer varchar(60),
   TCS_Date DATETIME,
   TCS_Quantity INT,
   TCS_Value MONEY )
GO

CREATE TABLE tst_EAV_Data (
   TED_ID INT Identity Primary Key Clustered,
   TED_Entity varchar(60),
   TED_Attribute varchar(60),
   TED_Value varchar(60) )
GO

-- now let's populate our tst_* tables

INSERT tst_CustSales (TCS_Customer, TCS_Date, TCS_Quantity, TCS_Value) 
SELECT * FROM (
SELECT 'Customer 1' as Customer,'20090101' as Date, 11 as Qty, 1001.00 as Val union all    
SELECT 'Customer 1','20090201',12, 1002.00 union all
SELECT 'Customer 1','20090301',13, 1003.00 union all
SELECT 'Customer 1','20090401',14, 1004.00 union all
SELECT 'Customer 2','20090101',21, 2001.00 union all
SELECT 'Customer 2','20090201',22, 2002.00 union all
SELECT 'Customer 2','20090301',23, 2003.00 union all
SELECT 'Customer 2','20090401',24, 2004.00 union all
SELECT 'Customer 3','20090101',31, 3001.00 union all
SELECT 'Customer 4','20090201',42, 4002.00 union all
SELECT 'Customer 5','20090301',53, 5003.00 ) as src
GO
-- notice I do not mention the Identity Column - SQL will manage that for me
-- notice the yyyymmdd "style 112" format - implicitly converts to datetime

-- now our EAV table, again imagine some diverse attributes

INSERT tst_EAV_Data (TED_Entity, TED_Attribute, TED_Value)
SELECT * FROM (
SELECT 'Customer 1' as Customer,'Phone' as Attr,'+61299991234' as Data_Val union all
SELECT 'Customer 1','Address','24 Somewhere Street' union all
SELECT 'Customer 1','Building','The ReallyTall One' union all
SELECT 'Customer 1','Contact','Marcus Aurelius' union all
SELECT 'Customer 2','Phone','+61288881234' union all
SELECT 'Customer 2','Contact','Ritesh Shah' union all
SELECT 'Customer 3','Address','1600 Pennsylvania Avenue' union all
SELECT 'Customer 3','Building','The WhiteHouse' union all
SELECT 'Customer 4','Phone','+61277771234' union all
SELECT 'Customer 4','Address','1 Nile Way' union all
SELECT 'Customer 4','Building','The Pyramids' union all
SELECT 'Customer 4','Contact','Cleo Patra' union all
SELECT 'Customer 5','Phone','+61277771222' union all
SELECT 'Customer 5','Friend','Cleo Patra' ) as src
GO
                                  
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
26:
27:
28:
29:
30:
31:
32:
33:
34:
35:
36:
37:
38:
39:
40:
41:
42:
43:
44:
45:
46:
47:
48:
49:
50:
51:
52:
53:

Select allOpen in new window



Now we can get down and dirty with the Pivot.
First we will construct a properly formed one so you can "see" the pivot in action.

SELECT TCS_Customer, [01 Feb 2009],[01 Mar 2009],[01 Apr 2009]
FROM
  (select TCS_Customer, TCS_Date, TCS_Value from tst_CustSales ) sourcedata
PIVOT
  (sum(TCS_Value) for TCS_Date in ([01 Feb 2009],[01 Mar 2009],[01 Apr 2009])) pivottable
GO
                                  
1:
2:
3:
4:
5:
6:

Select allOpen in new window



You can see from the above that the column_list and headings are all hard coded...
Also note how SQL is dynamically converting the datetime to those column headings that is because dd MMM yyyy can be implicitly converted to datetime when used in a date context. But "Style 106" is language dependant, so you do need to take care.

In this case, amazingly, can handle the "hard coded" column names (note we have no time component in our test data, we show you below how to remove time).

Now let us have a look at some Dynamic SQL for the EAV table, again in "long hand".  The dynamic bit is getting those column names so we do not have to hard code them...

DECLARE @Columns varchar(8000)
DECLARE @SQL varchar(8000)

SET @Columns = substring((select ',['+TED_Attribute+']' from tst_EAV_Data group by TED_Attribute for xml path('')),2,8000)
SET @SQL = 'SELECT * FROM 
  (Select TED_Entity as Cust,TED_Attribute,TED_Value from tst_EAV_Data) sourcedata
PIVOT
  (max(TED_Value) for TED_Attribute in ('+@Columns+')) pivottable'
EXEC(@sql)

GO
                                  
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:

Select allOpen in new window



Let's have a look at the above, all we really did was to generate the column list.
You can try it again replacing the EXEC(@SQL) with Print @SQL
You will see pretty much the same command structure as the earlier pivot.

Now to create a Procedure so we can simply keep using a stored procedure rather than having to write code all the time.

So lets get into it...

CREATE PROCEDURE uDynamicPivot(
                 @sourcedata varchar(8000),
                 @Pivot_On_Source_Column varchar(2000), 
                 @Pivot_Value_Aggregate varchar(10), 
                 @Pivot_Value_Column varchar(2000), 
                 @Pivot_Column_List varchar(2000),
                 @Pivot_Column_Style_Code varchar(4))  -- used in convert for style code
AS 
BEGIN
-- we really should put in some error checking, e.g. if anything is NULL it will crash.
   declare @columns varchar(max)
   declare @sql nvarchar(max)

   set @sql = N'set @columns = substring((select '', [''+convert(varchar,'+@Pivot_Column_List+@Pivot_Column_Style_Code+')+'']'' from '+@sourcedata+' group by '+@Pivot_Column_List+' for xml path('''')),2,8000)'
   execute sp_executesql @sql,
                         N'@columns varchar(max) output',
                         @columns=@columns output 

   set @sql = N'SELECT * FROM 
       (SELECT '+@Pivot_On_Source_Column+','+@Pivot_Column_List+','+@Pivot_Value_Column+' from '+@sourcedata+') src
       PIVOT
       ('+@Pivot_Value_Aggregate+'('+@Pivot_Value_Column+') FOR '+@Pivot_Column_List+' IN ('+@columns+') ) pvt
       ORDER BY 1'
   execute sp_executesql @sql

END

GO
                                  
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17:
18:
19:
20:
21:
22:
23:
24:
25:
26:
27:
28:

Select allOpen in new window



Now, let's use that procedure by plugging the parameters needed for a PIVOT function

uDynamicPivot 'tst_CustSales','TCS_customer','sum','TCS_Value','TCS_Date',',106'

and the EAV

uDynamicPivot 'tst_EAV_Data','TED_Entity as Cust','max','TED_Value','TED_Attribute',''

We can even include some "where" clauses for simple requirements

uDynamicPivot 'tst_CustSales where TCS_Date >= convert(varchar(6),dateadd(month,-3,getdate()),112)+''01''','TCS_customer','sum','TCS_Value','TCS_Date',',106'

But that is getting pretty ugly, and that is where the VIEW comes into play...

VIEWS allow data to be presented in a similar way in which we use a table.
A view is a good way to present data that does need some kind of transformation. It also allows a certain detachment from the underlying table.
Views are really a pointer or script to the actual data and does not contain data itself more so the "rules" on how to get/show the data.
Once created, it is part of the database and can be re-used as often as you like.

CREATE VIEW vw_last_3_months AS
SELECT TCS_Customer as Customer
,      TCS_Value
,      DATEADD(day, DATEDIFF(day, 0, TCS_Date),0) as Date    
,      DATEADD(month, DATEDIFF(month, 0, TCS_Date),0) as Start_Of_Month  
,      DATEADD(month, DATEDIFF(month, -1, TCS_Date), -1) as End_Of_Month 
FROM   tst_CustSales
WHERE  TCS_Date >= convert(varchar(6),dateadd(month,-3,getdate()),112)+'01'

GO
                                  
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:

Select allOpen in new window



Note how we are using date functions to transform our date data to remove time and generate a start and end of month.
Now we can do our "simple" function call using our View.

uDynamicPivot 'vw_last_3_months','customer','sum','TCS_Value','End_Of_Month',',106'

And that as they say is that.
Please take care when running on your machine make sure you check table names, double check your code, go step at a time. And hope you have some fun with it.

Asked On
2009-05-15 at 12:32:05ID653
Tags

SQL2005 sql2008 PIVOT

Topic

SQL Server 2005

Views
8514

Comments

Expert Comment

by: DanielWilson on 2009-06-09 at 10:06:42ID: 1493

Very cool, Mark.

The downside of having it use dynamic SQL is worth it to have a generic solution like this.  This is definitely one I will keep in mind.

Expert Comment

by: cs97jjm3 on 2009-06-18 at 00:10:57ID: 1677

like this thanks another bookmark :-)

Expert Comment

by: swatighare on 2009-06-20 at 02:43:12ID: 1725

Nice article. Really good one

Author Comment

by: mark_wills on 2009-06-20 at 06:22:27ID: 1728

Thanks very much, all of you, I really do appreciate your comments.

Expert Comment

by: karthik_842005 on 2009-06-24 at 02:42:49ID: 1761

Thanks a lot.It really works superb.

Expert Comment

by: dbaSQL on 2009-07-16 at 06:09:53ID: 2105

Fabulous, Mr. Wills.  Extremely ingenious solution.  

Expert Comment

by: Inteqam on 2010-12-03 at 19:00:54ID: 21828

Excellent. any performance issues?


Author Comment

by: mark_wills on 2010-12-03 at 19:31:29ID: 21830

Thanks Inteqam,

Well, there can be performance issues. It really depends on the structure and size of your underlying tables.

Using a view over the top of a table should still respect the indexes of the underlying tables, so, can use some of the SQL tools to check the performance of your data source and that will help.

Then there is size... If tons of rows and quite a few columns, then it can have a performance hit, but guess that is similarly true for any *involved* query dealing with large datasets.

Cheers,
Mark Wills

Expert Comment

by: Inteqam on 2010-12-03 at 19:42:44ID: 21831

Thanks for the quick response,

what i meant is that delays might be caused only for the use of pivot tables, so, if a view takes let's say 10 seconds to return 3 million record for 3 vendors, and i used pivot tables to sort vendors prices in columns, the dynamic SQL should return 1 million record with 3 extra fields and again 1 less field (price field). But in how much time? approximated.

Author Comment

by: mark_wills on 2010-12-03 at 19:55:04ID: 21832

Tough question...

The PIVOT functionality does have a few overheads compared to a straight query, and not so sure if I have ever directly compared the results, simply because if I need a pivot solution, there is not too many choices, and as far as a pivot result is concerned then the pivot function does work very well.

But if you have a look at my Bio / Profile there is a means to contact and I can always run up your query as a test and feedback the results if you would like. I certainly dont mind doing the needful work if it can generate a useful comparison.

Expert Comment

by: Inteqam on 2010-12-03 at 20:30:50ID: 21834

Thanks for the offer :)

But, I followed your steps, I dont have a complex query, I was just wondering how pivot tables affect performance, your answer was very satisfying.

I'll drop a visit to your profile either way, sure it worth it.

Thanks again

Add your Comment

Please Sign up or Log in to comment on this article.

Loading Advertisement...

Top MS SQL Server 2005 Experts

  1. TempDBA

    65,566

    Master

    800 points yesterday

    Profile
    Rank: Sage
  2. dtodd

    63,587

    Master

    0 points yesterday

    Profile
    Rank: Genius
  3. acperkins

    58,447

    Master

    0 points yesterday

    Profile
    Rank: Genius
  4. ScottPletcher

    58,352

    Master

    0 points yesterday

    Profile
    Rank: Genius
  5. mwvisa1

    50,316

    Master

    0 points yesterday

    Profile
    Rank: Genius
  6. matthewspatrick

    45,668

    0 points yesterday

    Profile
    Rank: Savant
  7. jogos

    39,619

    2,000 points yesterday

    Profile
    Rank: Sage
  8. lcohan

    33,652

    0 points yesterday

    Profile
    Rank: Genius
  9. angelIII

    27,498

    0 points yesterday

    Profile
    Rank: Elite
  10. tim_cs

    23,664

    0 points yesterday

    Profile
    Rank: Wizard
  11. anujnb

    22,114

    0 points yesterday

    Profile
    Rank: Guru
  12. EugeneZ

    21,264

    0 points yesterday

    Profile
    Rank: Genius
  13. jimhorn

    19,907

    0 points yesterday

    Profile
    Rank: Genius
  14. Lowfatspread

    19,564

    0 points yesterday

    Profile
    Rank: Genius
  15. Buttercup1

    14,852

    2,000 points yesterday

    Profile
    Rank: Master
  16. huslayer

    14,800

    0 points yesterday

    Profile
    Rank: Sage
  17. ralmada

    14,400

    0 points yesterday

    Profile
    Rank: Genius
  18. ValentinoV

    12,286

    0 points yesterday

    Profile
    Rank: Sage
  19. pratima_mcs

    11,380

    0 points yesterday

    Profile
    Rank: Genius
  20. HainKurt

    10,600

    0 points yesterday

    Profile
    Rank: Genius
  21. Cenjoy100

    10,000

    0 points yesterday

    Profile
  22. lludden

    9,700

    0 points yesterday

    Profile
    Rank: Wizard
  23. rrjegan17

    9,400

    0 points yesterday

    Profile
    Rank: Genius
  24. dqmq

    9,368

    0 points yesterday

    Profile
    Rank: Genius
  25. santhimurthyd

    9,240

    0 points yesterday

    Profile
    Rank: Guru

Hall Of Fame