<

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x

Broaden Your Horizons : Pivot Again!

Published on
18,484 Points
5,484 Views
5 Endorsements
Last Modified:
Awarded
Community Pick
Kevin Cross
Father, husband and general problem solver who loves coding SQL, C#, Salesforce Apex or whatever.
Like to compare data horizontally?  Absolutely!  Well, have I got a GREAT trick for you, so have a read and enjoy...

Pivoting data is quite handy and as you know SQL Server 2005 or higher now has a nice PIVOT keyword to do the heavy lifting for us.  To that end, I am going to show you how to code multiple PIVOT statements in a SQL statement so we can effectively pivot on multiple columns and, thus, to "broaden your (data) horizons".

Consequently, I will proceed with the presumption that you have an existing knowledge of how to use PIVOT and its challenges; however, since this article builds on my previous tip/trick on handling rolling periods, a good starting point is here: Pivot Rolling Periods Without Using Dynamic T-SQL.

We will take this path to getting horizontal, but please feel free to slide to the section you are most interested in:

The Data.
Refresher: Rolling Period Pivot Trick(s).
Laying It Across The Table.
Let's Spread Out Some More.

1. The Test Data.


The following attachment is a SQL script to create the table structure and data we will be working with throughout this article.  
bookings-data.sql.txtFor those comfortable in just running the script, you can grab the file and move on to the next section, otherwise, I will go through briefly the code segments.


Important: If you already have a table in your system called "bookings", please change ALL references to the name "bookings" in the following T-SQL statements.


1. DDL to create table.
create table bookings (
   bk_id uniqueidentifier default (newid()) primary key ,
   bk_date datetime not null default (getdate()) ,
   bk_quantity int not null default (1),
   bk_amount decimal(19,4) not null ,
   bk_region nvarchar(10) not null default ('EE-RDMS') ,
   bk_salesperson nvarchar(50) ,
   bk_customer nvarchar(10) not null
)
;

Open in new window

2. Fill initial data, utilizing some of the defaults set above to make things easier.
insert into bookings( bk_amount, bk_salesperson, bk_customer )
select 1234.56, 'angeliii', 'C104'
union select 567.89, 'chapmandew', 'C567'
union select 987.65, 'mark_wills', 'C543'
union select 234.50, 'aneeshattingal', 'C123'
union select 5234.56, 'angeliii', 'C123'
union select 569.50, 'chapmandew', 'C890'
union select 98.765, 'mark_wills', 'C1011'
union select 1234.50, 'aneeshattingal', 'C1234'
union select 567.89, 'mark_wills', 'C568'
union select 987.65, 'mark_wills', 'C543'
union select 1234.75, 'aneeshattingal', 'C123'
union select 3234.15, 'aneeshattingal', 'C246'
union select 5690.50, 'chapmandew', 'C543'
union select 980.765, 'chapmandew', 'C1011'
union select 12345.00, 'angeliii', 'C1234'
union select 67.89, 'angeliii', 'C567'
union select 87.65, 'mark_wills', 'C543'
union select 34.50, 'aneeshattingal', 'C123'
union select 234.56, 'angeliii', 'C123'
union select 69.50, 'chapmandew', 'C890'
union select 8.765, 'mark_wills', 'C1011'
union select 234.50, 'aneeshattingal', 'C1234'
union select 67.89, 'mark_wills', 'C568'
union select 87.65, 'angeliii', 'C543'
union select 234.75, 'aneeshattingal', 'C123'
union select 234.15, 'mark_wills', 'C246'
union select 690.50, 'chapmandew', 'C543'
union select 80.765, 'mwvisa1', 'C1011'
union select 2345.00, 'angeliii', 'C1234'
union select 345.00, 'mwvisa1', 'C1234'
;

Open in new window

3. Let's add some additional data values by inserts based on manipulating existing data.
while (select count(*) from bookings) < 400
insert into bookings( bk_date, bk_amount, bk_salesperson, bk_customer)
select bk_date+16 ,
bk_amount * 2.00, bk_salesperson, bk_customer
from bookings
union
select bk_date+4 ,
bk_amount * 1.25, bk_salesperson, bk_customer
from bookings
union
select bk_date-5 ,
bk_amount * 0.80, bk_salesperson, bk_customer
from bookings
;

Open in new window

4. Let's spread our data chronologically and add some quantities by updating our test data.  
(To make things easier, we will take advantage of common table expression(s) (CTE), ntile() function and over() analytical statement(s) here; however, I will not be covering these in more depth, so please see the linked Microsoft references for more information.)
with bks
as
(
   select bk_id, bk_date, bk_quantity, 
   (bk_amount / 500) x, -- create factor for each row by amount
   -- assign different number to each record
   ntile( 12+month(getdate())-1 ) 
	  -- get random ordering to make a little more real world
      over ( order by rand( checksum( newid() ) ) ) n
   from bookings
)
update bks
set bk_date = dateadd(month, -n, bk_date+n) ,
bk_quantity = n * rand(x) * x + bk_quantity
;

Open in new window

Hopefully this is a little more realistic without getting too (or more) anal-retentive.  Please note, due to the "random" nature of building the data, your own examples might yield different results to those shown here.


What's next...

2. Refresher: Rolling Period Pivot Trick(s).


Remember the key to our success from my previous article was simply: convert constantly changing data like relative dates to a finite set of values like 'one month ago', 'this year', etc.

To accomplish this, we looked at the following T-SQL that takes advantage of DATEDIFF.
datediff(month, bk_date, getdate())

Open in new window

Now we have a nice integer representing the number of months ago a date in our data is from today.

Similarly (for number of quarters or years):
datediff(quarter, bk_date, getdate())

Open in new window

datediff(year, bk_date, getdate())

Open in new window

And so forth for week, day, and other valid dateparts.

When dealing with one year's worth of data or analyzing data in aggregate (i.e., you don't care for differentiation of the year), these also come in handy:
datename(month, bk_date)   -- e.g., September

Open in new window

datename(quarter, bk_date) -- e.g., 3 (3rd Quarter)

Open in new window

Got it!

But, what happens if we have two or more of those datediff type columns? They can end up with the same values - for example current month is [0], and current year is also [0].  So, when we start PIVOT on multiple columns, we really have to take care with our naming conventions.  You will see more clearly below when we add the 'm' and 'y' prefixes.


Now let's move on...

3. Laying It Across The Table.


We know how to pivot data and we know how to use some easy tricks to avoid having to use dynamic SQL. By you being here, I am figuring you don't know how to use two (you heard me correctly, two) pivots in one SQL statement, so let's get to it.

1. Prepare our values (column names).
We are going to compare month-to-date, last month actual, two months prior actual, and year-to-date totals using our months and years ago trick above.  Since our values for months ago and years ago all become column names per our desire to pivot on multiple columns, we must pay attention to the fact that these values now need to be unique.

Simple approach: convert each from integer to varchar  and add an identifying prefix.
'm' + convert(varchar, datediff(month, bk_date, getdate()))

Open in new window

'y' + convert(varchar, datediff(year, bk_date, getdate()))

Open in new window

Now we can tell 0 months ago values from those 0 years ago.  This also simplifies our life as  we won't have to deal with bracketing number only column names (e.g., [0]) as we can simply use m0 or y0.  Furthermore, the removal of this ambiguity, alleviates us from having to type out a table name or alias prefix every time we want to reference one of these pivoted columns.

Since we are analyzing the same bookings over different periods, we can simply repeat the "bk_amount" value and just alias with different column names (e.g., amount_m and amount_y - again for easy identification).
select bk_salesperson as salesperson,
bk_amount as amount_m, 
  'm' + convert(varchar, datediff(month, bk_date, getdate())) as monthsago,
bk_amount as amount_y, 
  'y' + convert(varchar, datediff(year, bk_date, getdate())) as yearsago
from dbo.bookings
where bk_date >= dateadd(year, datediff(year, 0, getdate()), 0)
and bk_date < datediff(day, 0, getdate()) + 1
;

Open in new window

Now we will try our first pivot, using the above data.  I will use a CTE, but a derived table could also be used.  Temp table and other approaches are possible, but a benefit to CTE or derived table approach is you can utilize the demonstrated queries directly in a view.

2. First attempt...
with bks(salesperson, amount_m, monthsago, amount_y, yearsago)
as
(
   select bk_salesperson,
   bk_amount, 'm' + convert(varchar, datediff(month, bk_date, getdate())),
   bk_amount, 'y' + convert(varchar, datediff(year, bk_date, getdate()))
   from dbo.bookings
   where bk_date >= dateadd(year, datediff(year, 0, getdate()), 0)
   and bk_date < datediff(day, 0, getdate()) + 1
)
select Salesperson, 
   m0 as CurrentMonth, -- Current month: 0 months ago.
   m1 as LastMonth,    -- 1 month ago.
   m2 as TwoMonthsAgo, -- 2 months ago.
   y0 as CurrentYear   -- Current year: 0 years ago.
from bks
pivot( sum(amount_m) for monthsago in (m0, m1, m2) ) pvt_m
pivot( sum(amount_y) for yearsago in (y0) ) pvt_y
;

Open in new window

We have now successfully pivoted on multiple columns.  Have a look.
Pivot Initial ResultsOr download the spreadsheet :
pivot-initial-results.xls

However, are these the results you expected?

Probably not!  Since we have now pivoted by multiple values and pivot groups by the columns not specified in its clause, you will get a row for every combination of the values.  

3. Grouping it all together.

It would appear that the salesperson is not grouped properly.  Let's add a GROUP BY and some proper aggregates.

with bks(salesperson, amount_m, monthsago, amount_y, yearsago)
as
(
   select bk_salesperson,
   bk_amount, 'm' + convert(varchar, datediff(month, bk_date, getdate())),
   bk_amount, 'y' + convert(varchar, datediff(year, bk_date, getdate()))
   from dbo.bookings
   where bk_date >= dateadd(year, datediff(year, 0, getdate()), 0)
   and bk_date < datediff(day, 0, getdate()) + 1
)
select Salesperson, 
   sum(m0) as CurrentMonth, -- Current month: 0 months ago.
   sum(m1) as LastMonth,    -- 1 month ago.
   sum(m2) as TwoMonthsAgo, -- 2 months ago.
   sum(y0) as CurrentYear   -- Current year: 0 years ago.
from bks
pivot( sum(amount_m) for monthsago in (m0, m1, m2) ) pvt_m
pivot( sum(amount_y) for yearsago in (y0) ) pvt_y
group by salesperson
;

Open in new window

Much better ! (or is it ?)

4. Robustness Required    (added 12 Oct 2009)
The code above could be used in most cases where the data is clean, predictable and well populated across every element without duplication. However, I have found that it can be possible to exclude some dollar values from the second pivot when the first pivot groups by columns not involved in the pivot (i.e., occurrence of data other than those discrete months, such as multiple bookings per month not reflected in the year pivot).

For example, let's have a quick look at this simple query :
select bk_salesperson, sum(bk_amount) as year_total
from dbo.bookings
where bk_date >= dateadd(year, datediff(year, 0, getdate()), 0)
and bk_date < datediff(day, 0, getdate()) + 1
group by bk_salesperson
;

Open in new window

Notice the difference now?  That "current year" in the previous example isn't really the full current year as shown above (well, using my random data population at least)... And therein lies our "trap" using pivot, it does not quite pivot the elements according to both of those pivot requirements.

So, there is a more robust method available to us...

What we have done below is to use derived tables (i.e., a sub-query on bks) to pull out data specific to each pivot; apply the pivot syntax (which does the sum(amount) ); and then join the (potentially) disparate pivots together using an inner join.  You will note another difference in this revised code is that bk_amount does not have to be repeated, because we are now (more simply) using the bk_date separately for months and year.  By using the common table expression (or CTE) we can extract the raw data components and then pivot on those separately (as derived tables) and avoid having an overriding "group by".

with bks( salesperson, amount, monthsago, yearsago )
as
(
   select bk_salesperson, bk_amount
        , 'm'+convert(char(10), datediff(month, bk_date, getdate()))
        , 'y'+convert(char(10), datediff(year, bk_date, getdate()))
   from dbo.bookings
   where bk_date >= dateadd(year, datediff(year, 0, getdate()), 0)
   and bk_date < datediff(day, 0, getdate()) + 1
)
select m.Salesperson
     , m0 as CurrentMonth -- Current month: 0 months ago.
     , m1 as LastMonth    -- 1 month ago.
     , m2 as TwoMonthsAgo -- 2 months ago.
     , y0 as CurrentYear  -- Current year: 0 years ago.
from (select salesperson, amount, monthsago from bks) b1
pivot( sum(amount) for monthsago in (m0, m1, m2) ) m
inner join (select salesperson, amount, yearsago from bks) b2
pivot( sum(amount) for yearsago in (y0) ) y
   on y.salesperson = m.salesperson
order by salesperson
;

Open in new window

Now look at these results.
Pivot with Group By ResultsOr download the spreadsheet :
pivot-groupby-results.xls

Note using this approach, it is also technically feasible to utilize raw integer values for monthsago and yearsago since each pivot is now applying against separate data sources (i.e., derived tables from individual sub-queries) and being joined afterwards. However, for other join scenarios (e.g., cross join) you will find that the ambiguity of columns will come into play again, so I still recommend you get into habit of using the prefix in your column names OR at least be aware of its usage and pitfalls if you don't.


Now, let's have some more fun and extend our new found "pivotabilities"...

4. Let's Spread Out Some More.


Nice pivot results above, but back to trying to be more real world, what about different values?  Well, the good news it is as simple as duplicating column values with different column names as we showed with amount above, but keeping in mind that pivoted values also need to be unique.

So to dazzle and amaze you, we will now pivot SIX columns at once.  But first to reemphasize the ease in which a CTE can be converted to a view and re-enforce the fact that pivot source can be any pivot-ready data.

1. Create comparative bookings view.
Sometimes, it is a lot more convenient to create a VIEW over calculated data. A VIEW is simply a runtime presentation of data, and works much the same way as a Table, but really just points to the data in the table and doesn't hold data itself (well, there are some exceptions, but beyond this topic for now).
create view vw_comparative_bookings
as
with bks( salesperson, amount, quantity, monthsago, yearsago )
as
(
   select bk_salesperson, bk_amount, bk_quantity
        , 'm'+convert(char(10), datediff(month, bk_date, getdate()))
        , 'y'+convert(char(10), datediff(year, bk_date, getdate()))
   from dbo.bookings
   where bk_date >= dateadd(year, datediff(year, 0, getdate())-1, 0)
   and bk_date < datediff(day, 0, getdate()) + 1
)
select * 
from bks
;

Open in new window


2. Pivot view.
Now that we have our view, we can use that a lot more conveniently in our Pivot statement.
select pvt_aM.Salesperson
     , aM0 as [Current MTD $]    -- Current month-to-date.
     , aM12 as [PY This Month $] -- Prior year actual for same month.
     , aY0 as [CY Total $$]      -- Current year-to-date.
     , aY1 as [PY Total $$]      -- Prior year actual (entire year).
     , qM0 as [Current MTD # Units]        
     , qM12 as [PY This Month # Units]
     , qY0 as [CY # Units]
     , qY1 as [PY # Units]
     , cM0 as [Current MTD # Orders]
     , cM12 as [PY This Month # Orders]
     , cY0 as [CY # Orders]
     , cY1 as [PY # Orders]
from 
-- dollars
(select salesperson, amount, 'a'+monthsago as monthsago 
   from vw_comparative_bookings) b1
pivot( sum(amount) for monthsago in (aM0, aM12) ) pvt_aM
inner join 
(select salesperson, amount, 'a'+yearsago as yearsago
   from vw_comparative_bookings) b2
pivot( sum(amount) for yearsago in (aY0, aY1) ) pvt_aY
   on pvt_aM.salesperson = pvt_aY.salesperson
   
-- units
inner join
(select salesperson, quantity, 'q'+monthsago as monthsago
   from vw_comparative_bookings) b3
pivot( sum(quantity) for monthsago in (qM0, qM12) ) pvt_qM
   on pvt_aM.salesperson = pvt_qM.salesperson
inner join 
(select salesperson, quantity, 'q'+yearsago as yearsago
   from vw_comparative_bookings) b4
pivot( sum(quantity) for yearsago in (qY0, qY1) ) pvt_qY
   on pvt_qM.salesperson = pvt_qY.salesperson
   
-- number of orders
inner join
(select salesperson, 1 as cnt, 'c'+monthsago as monthsago
   from vw_comparative_bookings) b5
pivot( count(cnt) for monthsago in (cM0, cM12) ) pvt_cM
   on pvt_aM.salesperson = pvt_cM.salesperson
inner join 
(select salesperson, 1 as cnt, 'c'+yearsago as yearsago
   from vw_comparative_bookings) b6
pivot( count(cnt) for yearsago in (cY0, cY1) ) pvt_cY
   on pvt_cM.salesperson = pvt_cY.salesperson
;

Open in new window

And our new results.
Pivot Final ResultsOr download the spreadsheet (bonus data shown which is included in final source code below) :
pivot-final-results.xls

Important: (12 Oct 2009)
For previous readers, this section's code snippets have been updated to coincide with the added "more robust" multiple pivot technique.


Now that our horizons have been made much wider, please enjoy this new found pivoting trick as you venture into creating good business intelligence and analytical reports.

Thanks for taking another journey with me.

Happy coding!

Best regards,

Kevin (aka MWVisa1)


Downloads: (updated: 12 Oct 2009)
bookings-query-20091012.sql.txt

References:

Pivot Rolling Periods Without Using Dynamic T-SQL
http://www.experts-exchange.com/A_654 .html

Common Table Expression(s)
http://msdn.microsoft.com/en-us/library/ms190766(SQL.90).aspx

NTILE() Function
http://msdn.microsoft.com/en-us/library/ms175126(SQL.90).aspx

OVER() Clause
http://msdn.microsoft.com/en-us/library/ms189461(SQL.90).aspx

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
If you found this article helpful, please click the Yes button after the question just below. This will give me (the author) a few points and might encourage me to write more articles.

If you didn't or otherwise feel the need to vote No, please first leave a comment to give me a chance to answer and perhaps to improve this article.

Thank you!
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
5
Comment
Author:Kevin Cross
  • 2
2 Comments
LVL 61

Author Comment

by:Kevin Cross
Special thanks to Mark Wills for his guidance on this article!

Moreover, thank you to those who have taken the time out of their days to read this.  It has come in handy for me and hope it does the same for you.

Best regards,
Kevin
0
LVL 61

Author Comment

by:Kevin Cross
For those that like to learn other practical applications as well as the subtle nuances of different applications, here is a Q&A example requiring multiple pivots as well as dynamic column headings: http://www.experts-exchange.com/Q_26162698.html.

Thanks again to all that read my articles.  
It is very much appreciated!

Best regards and happy coding,
Kevin
0

Featured Post

The 7 Worst Nightmares of a Sysadmin

Fear not! To defend your business’ IT systems we’re going to shine a light on the seven most sinister terrors that haunt sysadmins. That way you can be sure there’s nothing in your stack waiting to go bump in the night.

Join & Write a Comment

Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
Via a live example, show how to extract insert data into a SQL Server database table using the Import/Export option and Bulk Insert.

Keep in touch with Experts Exchange

Tech news and trends delivered to your inbox every month