|
[x]
Link Generator
|
||
|
[x]
Link Generator
|
||
| ID: 1537 |
|
[x]
Attachment Details
|
||
|
[x]
Thank you for voting!
|
||
| Thank you for voting! Helpful votes let the author know what you think about their article. For every 'yes' vote that the article gets, the author receives 50 points! | ||
|
[x]
Community Pick
|
||
Community Pick is worth 500 points, assigned to Articles that earned 10 net helpful votes, and/or deemed helpful by a Page Editor. |
||
|
[x]
EE Approved
|
||
EE Approval is worth 4,000 points, assigned to Articles that are considered a valuable resource in the zone they're published. |
||
The following attachment is a SQL script to create the table structure and data we will be working with throughout this article.
For those comfortable in just running the script, you can grab the file and move on to the next section, otherwise, I will go through briefly the code segments.
Important: If you already have a table in your system called "bookings", please change ALL references to the name "bookings" in the following T-SQL statements.
1. DDL to create table.
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: | create table bookings (
bk_id uniqueidentifier default (newid()) primary key ,
bk_date datetime not null default (getdate()) ,
bk_quantity int not null default (1),
bk_amount decimal(19,4) not null ,
bk_region nvarchar(10) not null default ('EE-RDMS') ,
bk_salesperson nvarchar(50) ,
bk_customer nvarchar(10) not null
)
;
|
2. Fill initial data, utilizing some of the defaults set above to make things easier.
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: 30: 31: 32: | insert into bookings( bk_amount, bk_salesperson, bk_customer ) select 1234.56, 'angeliii', 'C104' union select 567.89, 'chapmandew', 'C567' union select 987.65, 'mark_wills', 'C543' union select 234.50, 'aneeshattingal', 'C123' union select 5234.56, 'angeliii', 'C123' union select 569.50, 'chapmandew', 'C890' union select 98.765, 'mark_wills', 'C1011' union select 1234.50, 'aneeshattingal', 'C1234' union select 567.89, 'mark_wills', 'C568' union select 987.65, 'mark_wills', 'C543' union select 1234.75, 'aneeshattingal', 'C123' union select 3234.15, 'aneeshattingal', 'C246' union select 5690.50, 'chapmandew', 'C543' union select 980.765, 'chapmandew', 'C1011' union select 12345.00, 'angeliii', 'C1234' union select 67.89, 'angeliii', 'C567' union select 87.65, 'mark_wills', 'C543' union select 34.50, 'aneeshattingal', 'C123' union select 234.56, 'angeliii', 'C123' union select 69.50, 'chapmandew', 'C890' union select 8.765, 'mark_wills', 'C1011' union select 234.50, 'aneeshattingal', 'C1234' union select 67.89, 'mark_wills', 'C568' union select 87.65, 'angeliii', 'C543' union select 234.75, 'aneeshattingal', 'C123' union select 234.15, 'mark_wills', 'C246' union select 690.50, 'chapmandew', 'C543' union select 80.765, 'mwvisa1', 'C1011' union select 2345.00, 'angeliii', 'C1234' union select 345.00, 'mwvisa1', 'C1234' ; |
3. Let's add some additional data values by inserts based on manipulating existing data.
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: | while (select count(*) from bookings) < 400 insert into bookings( bk_date, bk_amount, bk_salesperson, bk_customer) select bk_date+16 , bk_amount * 2.00, bk_salesperson, bk_customer from bookings union select bk_date+4 , bk_amount * 1.25, bk_salesperson, bk_customer from bookings union select bk_date-5 , bk_amount * 0.80, bk_salesperson, bk_customer from bookings ; |
4. Let's spread our data chronologically and add some quantities by updating our test data.
(To make things easier, we will take advantage of common table expression(s) (CTE), ntile() function and over() analytical statement(s) here; however, I will not be covering these in more depth, so please see the linked Microsoft references for more information.)
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: | with bks
as
(
select bk_id, bk_date, bk_quantity,
(bk_amount / 500) x, -- create factor for each row by amount
-- assign different number to each record
ntile( 12+month(getdate())-1 )
-- get random ordering to make a little more real world
over ( order by rand( checksum( newid() ) ) ) n
from bookings
)
update bks
set bk_date = dateadd(month, -n, bk_date+n) ,
bk_quantity = n * rand(x) * x + bk_quantity
;
|
Hopefully this is a little more realistic without getting too (or more) anal-retentive. Please note, due to the "random" nature of building the data, your own examples might yield different results to those shown here.
Remember the key to our success from my previous article was simply: convert constantly changing data like relative dates to a finite set of values like 'one month ago', 'this year', etc.
To accomplish this, we looked at the following T-SQL that takes advantage of DATEDIFF.
1: | datediff(month, bk_date, getdate()) |
Now we have a nice integer representing the number of months ago a date in our data is from today.
Similarly (for number of quarters or years):
1: | datediff(quarter, bk_date, getdate()) |
1: | datediff(year, bk_date, getdate()) |
And so forth for week, day, and other valid dateparts.
When dealing with one year's worth of data or analyzing data in aggregate (i.e., you don't care for differentiation of the year), these also come in handy:
1: | datename(month, bk_date) -- e.g., September |
1: | datename(quarter, bk_date) -- e.g., 3 (3rd Quarter) |
Got it!
But, what happens if we have two or more of those datediff type columns? They can end up with the same values - for example current month is [0], and current year is also [0]. So, when we start PIVOT on multiple columns, we really have to take care with our naming conventions. You will see more clearly below when we add the 'm' and 'y' prefixes.
We know how to pivot data and we know how to use some easy tricks to avoid having to use dynamic SQL. By you being here, I am figuring you don't know how to use two (you heard me correctly, two) pivots in one SQL statement, so let's get to it.
1. Prepare our values (column names).
We are going to compare month-to-date, last month actual, two months prior actual, and year-to-date totals using our months and years ago trick above. Since our values for months ago and years ago all become column names per our desire to pivot on multiple columns, we must pay attention to the fact that these values now need to be unique.
Simple approach: convert each from integer to varchar and add an identifying prefix.
1: | 'm' + convert(varchar, datediff(month, bk_date, getdate())) |
1: | 'y' + convert(varchar, datediff(year, bk_date, getdate())) |
Now we can tell 0 months ago values from those 0 years ago. This also simplifies our life as we won't have to deal with bracketing number only column names (e.g., [0]) as we can simply use m0 or y0. Furthermore, the removal of this ambiguity, alleviates us from having to type out a table name or alias prefix every time we want to reference one of these pivoted columns.
Since we are analyzing the same bookings over different periods, we can simply repeat the "bk_amount" value and just alias with different column names (e.g., amount_m and amount_y - again for easy identification).
1: 2: 3: 4: 5: 6: 7: 8: 9: | select bk_salesperson as salesperson, bk_amount as amount_m, 'm' + convert(varchar, datediff(month, bk_date, getdate())) as monthsago, bk_amount as amount_y, 'y' + convert(varchar, datediff(year, bk_date, getdate())) as yearsago from dbo.bookings where bk_date >= dateadd(year, datediff(year, 0, getdate()), 0) and bk_date < datediff(day, 0, getdate()) + 1 ; |
Now we will try our first pivot, using the above data. I will use a CTE, but a derived table could also be used. Temp table and other approaches are possible, but a benefit to CTE or derived table approach is you can utilize the demonstrated queries directly in a view.
2. First attempt...
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: | with bks(salesperson, amount_m, monthsago, amount_y, yearsago) as ( select bk_salesperson, bk_amount, 'm' + convert(varchar, datediff(month, bk_date, getdate())), bk_amount, 'y' + convert(varchar, datediff(year, bk_date, getdate())) from dbo.bookings where bk_date >= dateadd(year, datediff(year, 0, getdate()), 0) and bk_date < datediff(day, 0, getdate()) + 1 ) select Salesperson, m0 as CurrentMonth, -- Current month: 0 months ago. m1 as LastMonth, -- 1 month ago. m2 as TwoMonthsAgo, -- 2 months ago. y0 as CurrentYear -- Current year: 0 years ago. from bks pivot( sum(amount_m) for monthsago in (m0, m1, m2) ) pvt_m pivot( sum(amount_y) for yearsago in (y0) ) pvt_y ; |
We have now successfully pivoted on multiple columns. Have a look.
Or download the spreadsheet :
However, are these the results you expected?
Probably not! Since we have now pivoted by multiple values and pivot groups by the columns not specified in its clause, you will get a row for every combination of the values.
3. Grouping it all together.
It would appear that the salesperson is not grouped properly. Let's add a GROUP BY and some proper aggregates.
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: | with bks(salesperson, amount_m, monthsago, amount_y, yearsago) as ( select bk_salesperson, bk_amount, 'm' + convert(varchar, datediff(month, bk_date, getdate())), bk_amount, 'y' + convert(varchar, datediff(year, bk_date, getdate())) from dbo.bookings where bk_date >= dateadd(year, datediff(year, 0, getdate()), 0) and bk_date < datediff(day, 0, getdate()) + 1 ) select Salesperson, sum(m0) as CurrentMonth, -- Current month: 0 months ago. sum(m1) as LastMonth, -- 1 month ago. sum(m2) as TwoMonthsAgo, -- 2 months ago. sum(y0) as CurrentYear -- Current year: 0 years ago. from bks pivot( sum(amount_m) for monthsago in (m0, m1, m2) ) pvt_m pivot( sum(amount_y) for yearsago in (y0) ) pvt_y group by salesperson ; |
Much better ! (or is it ?)
4. Robustness Required (added 12 Oct 2009)
The code above could be used in most cases where the data is clean, predictable and well populated across every element without duplication. However, I have found that it can be possible to exclude some dollar values from the second pivot when the first pivot groups by columns not involved in the pivot (i.e., occurrence of data other than those discrete months, such as multiple bookings per month not reflected in the year pivot).
For example, let's have a quick look at this simple query :
1: 2: 3: 4: 5: 6: | select bk_salesperson, sum(bk_amount) as year_total from dbo.bookings where bk_date >= dateadd(year, datediff(year, 0, getdate()), 0) and bk_date < datediff(day, 0, getdate()) + 1 group by bk_salesperson ; |
Notice the difference now? That "current year" in the previous example isn't really the full current year as shown above (well, using my random data population at least)... And therein lies our "trap" using pivot, it does not quite pivot the elements according to both of those pivot requirements.
So, there is a more robust method available to us...
What we have done below is to use derived tables (i.e., a sub-query on bks) to pull out data specific to each pivot; apply the pivot syntax (which does the sum(amount) ); and then join the (potentially) disparate pivots together using an inner join. You will note another difference in this revised code is that bk_amount does not have to be repeated, because we are now (more simply) using the bk_date separately for months and year. By using the common table expression (or CTE) we can extract the raw data components and then pivot on those separately (as derived tables) and avoid having an overriding "group by".
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: | with bks( salesperson, amount, monthsago, yearsago )
as
(
select bk_salesperson, bk_amount
, 'm'+convert(char(10), datediff(month, bk_date, getdate()))
, 'y'+convert(char(10), datediff(year, bk_date, getdate()))
from dbo.bookings
where bk_date >= dateadd(year, datediff(year, 0, getdate()), 0)
and bk_date < datediff(day, 0, getdate()) + 1
)
select m.Salesperson
, m0 as CurrentMonth -- Current month: 0 months ago.
, m1 as LastMonth -- 1 month ago.
, m2 as TwoMonthsAgo -- 2 months ago.
, y0 as CurrentYear -- Current year: 0 years ago.
from (select salesperson, amount, monthsago from bks) b1
pivot( sum(amount) for monthsago in (m0, m1, m2) ) m
inner join (select salesperson, amount, yearsago from bks) b2
pivot( sum(amount) for yearsago in (y0) ) y
on y.salesperson = m.salesperson
order by salesperson
;
|
Now look at these results.
Or download the spreadsheet :
Note using this approach, it is also technically feasible to utilize raw integer values for monthsago and yearsago since each pivot is now applying against separate data sources (i.e., derived tables from individual sub-queries) and being joined afterwards. However, for other join scenarios (e.g., cross join) you will find that the ambiguity of columns will come into play again, so I still recommend you get into habit of using the prefix in your column names OR at least be aware of its usage and pitfalls if you don't.
Nice pivot results above, but back to trying to be more real world, what about different values? Well, the good news it is as simple as duplicating column values with different column names as we showed with amount above, but keeping in mind that pivoted values also need to be unique.
So to dazzle and amaze you, we will now pivot SIX columns at once. But first to reemphasize the ease in which a CTE can be converted to a view and re-enforce the fact that pivot source can be any pivot-ready data.
1. Create comparative bookings view.
Sometimes, it is a lot more convenient to create a VIEW over calculated data. A VIEW is simply a runtime presentation of data, and works much the same way as a Table, but really just points to the data in the table and doesn't hold data itself (well, there are some exceptions, but beyond this topic for now).
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: | create view vw_comparative_bookings
as
with bks( salesperson, amount, quantity, monthsago, yearsago )
as
(
select bk_salesperson, bk_amount, bk_quantity
, 'm'+convert(char(10), datediff(month, bk_date, getdate()))
, 'y'+convert(char(10), datediff(year, bk_date, getdate()))
from dbo.bookings
where bk_date >= dateadd(year, datediff(year, 0, getdate())-1, 0)
and bk_date < datediff(day, 0, getdate()) + 1
)
select *
from bks
;
|
2. Pivot view.
Now that we have our view, we can use that a lot more conveniently in our Pivot statement.
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: 30: 31: 32: 33: 34: 35: 36: 37: 38: 39: 40: 41: 42: 43: 44: 45: 46: 47: 48: | select pvt_aM.Salesperson
, aM0 as [Current MTD $] -- Current month-to-date.
, aM12 as [PY This Month $] -- Prior year actual for same month.
, aY0 as [CY Total $$] -- Current year-to-date.
, aY1 as [PY Total $$] -- Prior year actual (entire year).
, qM0 as [Current MTD # Units]
, qM12 as [PY This Month # Units]
, qY0 as [CY # Units]
, qY1 as [PY # Units]
, cM0 as [Current MTD # Orders]
, cM12 as [PY This Month # Orders]
, cY0 as [CY # Orders]
, cY1 as [PY # Orders]
from
-- dollars
(select salesperson, amount, 'a'+monthsago as monthsago
from vw_comparative_bookings) b1
pivot( sum(amount) for monthsago in (aM0, aM12) ) pvt_aM
inner join
(select salesperson, amount, 'a'+yearsago as yearsago
from vw_comparative_bookings) b2
pivot( sum(amount) for yearsago in (aY0, aY1) ) pvt_aY
on pvt_aM.salesperson = pvt_aY.salesperson
-- units
inner join
(select salesperson, quantity, 'q'+monthsago as monthsago
from vw_comparative_bookings) b3
pivot( sum(quantity) for monthsago in (qM0, qM12) ) pvt_qM
on pvt_aM.salesperson = pvt_qM.salesperson
inner join
(select salesperson, quantity, 'q'+yearsago as yearsago
from vw_comparative_bookings) b4
pivot( sum(quantity) for yearsago in (qY0, qY1) ) pvt_qY
on pvt_qM.salesperson = pvt_qY.salesperson
-- number of orders
inner join
(select salesperson, 1 as cnt, 'c'+monthsago as monthsago
from vw_comparative_bookings) b5
pivot( count(cnt) for monthsago in (cM0, cM12) ) pvt_cM
on pvt_aM.salesperson = pvt_cM.salesperson
inner join
(select salesperson, 1 as cnt, 'c'+yearsago as yearsago
from vw_comparative_bookings) b6
pivot( count(cnt) for yearsago in (cY0, cY1) ) pvt_cY
on pvt_cM.salesperson = pvt_cY.salesperson
;
|
And our new results.
Or download the spreadsheet (bonus data shown which is included in final source code below) :
Important: (12 Oct 2009)
For previous readers, this section's code snippets have been updated to coincide with the added "more robust" multiple pivot technique.
|
[x]
Posted via EE Mobile
|
||
Search, ask, and monitor your questions on the go with EE Mobile. Visit Experts Exchange from your mobile device and never be out of touch again. |
||
|
[x]
Posted via EE Mobile
|
||
Search, ask, and monitor your questions on the go with EE Mobile. Visit Experts Exchange from your mobile device and never be out of touch again. |
||
| 05/05/10 03:44 PM, ID: 14101 |
Advertisement
Moreover, thank you to those who have taken the time out of their days to read this. It has come in handy for me and hope it does the same for you.
Best regards,
Kevin