Link to home
Start Free TrialLog in
Avatar of raymurphy
raymurphy

asked on

Summarising Historical Data

Have table named hist_exch_rates holding exchange rates info with following structure :

ex_date          char(8)
fromc              char(3)
toc                  char(3)
rate_mul          float
rate_div          float

Table holds historical exchange rate info going back to 2007 (starting with ex_date value of
20070101) and up to current date (with ex_date value of 20110601). Typical contents for Jan 2007 would be :

ex_date         fromc  toc   rate_mul  rate_div
20070101      AED    CHF   0.33        3.01
20070101      ARS    CHF   0.44        2.51

Typical contents for Feb 2007 would be :

ex_date         fromc  toc   rate_mul  rate_div
20070201      AED    CHF   0.55        3.01
20070201      ARS    CHF   0.66        2.51

I am only interested in dealing with entries where to_currency is CHF.

For each month, there can be 65 entries where the toc would be CHF (fromc would be different
as in the above data samples).

I've been asked to prepare a historical report - essentially showing all entries where
toc is CHF for the last five years (from ex_date of 20070101 through to ex_date of 20110601)
but the output needs to summarised in the following sort of format :
                                                   
fromc toc      JAN07rate_mul    JAN07rate_div       FEB07rate_mul     FEB07rate_div

AED   CHF      0.33                     3.01                       0.55                      3.01
ARS   CHF      0.44                     2.51                      0.66                       2.51

So on the output the JAN07ratemul and JAN07rate_div columns would go all the way along to MAY11rate_mul and MAY11rate_div.

Database is SQL Server 2005. Hope I've explained what I need clearly, and would really
appreciate some assistance in getting the data into the required outputput format.


Avatar of ralmada
ralmada
Flag of Canada image

I guess you the period will be variable right? so you can use dynamic SQL then. Check the below
declare @cols1 varchar(max)
declare @cols2 varchar(max)
declare @cols3 varchar(max)
declare @strSQL varchar(max)


set @cols1 = stuff((select distinct '], [' + ex_date + '_mul' + ', [' + ex_date + '_div'
		from hist_exch_rates
		where ex_date between '20070101' and '20110601'  --adjust date accordingly
		for xml path('')
		), 1, 2, '') + ']'


set @cols2 = stuff((select distinct '], [' + ex_date + '_mul'
		from hist_exch_rates
		where ex_date between '20070101' and '20110601' --adjust date accordingly
		for xml path('')
		), 1, 2, '') + ']'


set @cols3 = stuff((select distinct '], [' + ex_date + '_div'
		from hist_exch_rates
		where ex_date between '20070101' and '20110601' --adjust date accordingly
		for xml path('')
		), 1, 2, '') + ']'


set @strSQL = 'select fromc, toc, ' + @cols1 +
		' from (
			select 	ex_date + '_mul' as ex_date1,        
				ex_date + '_div' as ex_date2,        
				fromc,
				toc,
				rate_mul,
				rate_div
			from hist_exch_rates
                            where ex_date between '20070101' and '20110601' --adjust date accordingly
		) o
		pivot (max(rate_mul) for ex_date1 in (' + @cols2 + ')) as p1
		pivot (max(rate_div) for ex_date2 in (' + @cols3 + ')) as p2'
		
exec(@strSQL)

Open in new window

Avatar of raymurphy
raymurphy

ASKER

Thanks for the prompt reply and your suggestion, ralmada ...

Just tried it out but got syntax error :
 
     Msg 102, Level 15, State 1, Line 30
      Incorrect syntax near '_mul'.

This relates to the line select ex_date + '_mul' as ex_date1,  in the set @strSQL code segment - anu ideas what's wron there ?
oops, the quotes should be escaped there

check now
declare @cols1 varchar(max)
declare @cols2 varchar(max)
declare @cols3 varchar(max)
declare @strSQL varchar(max)


set @cols1 = stuff((select distinct '], [' + ex_date + '_mul' + ', [' + ex_date + '_div'
		from hist_exch_rates
		where ex_date between '20070101' and '20110601'  --adjust date accordingly
		for xml path('')
		), 1, 2, '') + ']'


set @cols2 = stuff((select distinct '], [' + ex_date + '_mul'
		from hist_exch_rates
		where ex_date between '20070101' and '20110601' --adjust date accordingly
		for xml path('')
		), 1, 2, '') + ']'


set @cols3 = stuff((select distinct '], [' + ex_date + '_div'
		from hist_exch_rates
		where ex_date between '20070101' and '20110601' --adjust date accordingly
		for xml path('')
		), 1, 2, '') + ']'


set @strSQL = 'select fromc, toc, ' + @cols1 +
		' from (
			select 	ex_date + ''_mul'' as ex_date1,        
				ex_date + ''_div'' as ex_date2,        
				fromc,
				toc,
				rate_mul,
				rate_div
			from hist_exch_rates
                            where ex_date between ''20070101'' and ''20110601'' --adjust date accordingly
		) o
		pivot (max(rate_mul) for ex_date1 in (' + @cols2 + ')) as p1
		pivot (max(rate_div) for ex_date2 in (' + @cols3 + ')) as p2'
		
exec(@strSQL)

Open in new window

That's cured the quoting problem, ralmada, but have now got the following error :

Msg 325, Level 15, State 1, Line 11
Incorrect syntax near 'pivot'. You may need to set the compatibility level of the current database to a higher value to enable this feature. See help for the stored procedure sp_dbcmptlevel.

The database I'm using is SQL Server 2005, so not sure what this error is ?
You need to adjust your compatibility level as the message says. (compatibility level 90)

 EXEC sp_dbcmptlevel yourdatabase, 90;
GO
Sorry, never came across having to adjust the compatibility level, so thanks for that ...

Query is now running, but not yet producing results, as I'm now getting a whole pile of Invalid column name errors ...

Msg 207, Level 16, State 1, Line 1
Invalid column name '20071201_mul, [20071201_div'.
Msg 207, Level 16, State 1, Line 1
Invalid column name '20100301_mul, [20100301_div'.
ralmada, thanks for your assistance so far ...
I wonder if giving a specific data example might clarify what I need ...
Taking my sample data again, and looking a very small example set of data ...

For a fromc of AED and toc of CHF, I might have the following values in my table

ex_date         fromc  toc   rate_mul  rate_div
20070101      AED    CHF   0.33        3.01
20070201      AED    CHF   0.55        3.01

Given that sample data, the format I need to have the output in is so that I see :

fromc toc [rate_mul_20070101] [rate_div_20070101]  [rate_mul for 20070201]  [rate_div for 20070201]
AED   CHF 0.33                           3.01                            0.55                                 3.01  

Is that a better explanation of what I need ?
can you do me a favour, can you change the last line

exec(@strSQL)

to

print @strSQL

so I can debug the query?
In the mean time try this
declare @cols1 varchar(max)
declare @cols2 varchar(max)
declare @cols3 varchar(max)
declare @strSQL varchar(max)


set @cols1 = stuff((select distinct '], [' + ex_date + '_mul], [' + ex_date + '_div'
		from hist_exch_rates
		where ex_date between '20070101' and '20110601'  --adjust date accordingly
		for xml path('')
		), 1, 2, '') + ']'


set @cols2 = stuff((select distinct '], [' + ex_date + '_mul'
		from hist_exch_rates
		where ex_date between '20070101' and '20110601' --adjust date accordingly
		for xml path('')
		), 1, 2, '') + ']'


set @cols3 = stuff((select distinct '], [' + ex_date + '_div'
		from hist_exch_rates
		where ex_date between '20070101' and '20110601' --adjust date accordingly
		for xml path('')
		), 1, 2, '') + ']'


set @strSQL = 'select fromc, toc, ' + @cols1 +
		' from (
			select 	ex_date + ''_mul'' as ex_date1,        
				ex_date + ''_div'' as ex_date2,        
				fromc,
				toc,
				rate_mul,
				rate_div
			from hist_exch_rates
                            where ex_date between ''20070101'' and ''20110601'' --adjust date accordingly
		) o
		pivot (max(rate_mul) for ex_date1 in (' + @cols2 + ')) as p1
		pivot (max(rate_div) for ex_date2 in (' + @cols3 + ')) as p2'
		
exec(@strSQL)

Open in new window

Printing @strSQL on original query shows :
select fromc, toc,  [20071201_mul, [20071201_div], [20100301_mul, [20100301_div], [20090901_mul, [20090901_div], [20080901_mul, [20080901_div], [20101101_mul, [20101101_div], [20080801_mul, [20080801_div], [20110501_mul, [20110501_div], [20080201_mul, [20080201_div], [20090301_mul, [20090301_div], [20070201_mul, [20070201_div], [20070501_mul, [20070501_div], [20110201_mul, [20110201_div], [20070301_mul, [20070301_div], [20090601_mul, [20090601_div], [20081101_mul, [20081101_div], [20070401_mul, [20070401_div], [20080501_mul, [20080501_div], [20091101_mul, [20091101_div], [20080401_mul, [20080401_div], [20080601_mul, [20080601_div], [20080701_mul, [20080701_div], [20090201_mul, [20090201_div], [20090501_mul, [20090501_div], [20110301_mul, [20110301_div], [20100601_mul, [20100601_div], [20100501_mul, [20100501_div], [20081001_mul, [20081001_div], [20070801_mul, [20070801_div], [20091201_mul, [20091201_div], [20070701_mul, [20070701_div], [20100201_mul, [20100201_div], [20080101_mul, [20080101_div], [20071101_mul, [20071101_div], [20080301_mul, [20080301_div], [20091001_mul, [20091001_div], [20101001_mul, [20101001_div], [20100901_mul, [20100901_div], [20100701_mul, [20100701_div], [20110101_mul, [20110101_div], [20100801_mul, [20100801_div], [20100401_mul, [20100401_div], [20110401_mul, [20110401_div], [20070901_mul, [20070901_div], [20100101_mul, [20100101_div], [20070101_mul, [20070101_div], [20101201_mul, [20101201_div], [20071001_mul, [20071001_div], [20090101_mul, [20090101_div], [20081201_mul, [20081201_div], [20090801_mul, [20090801_div], [20070601_mul, [20070601_div], [20090701_mul, [20090701_div], [20110601_mul, [20110601_div], [20090401_mul, [20090401_div] from (

                        select  ex_date + '_mul' as ex_date1,        

                                ex_date + '_div' as ex_date2,        

                                fromc,

                                toc,

                                rate_mul,

                                rate_div

                        from hist_exch_rates

                            where ex_date between '20070101' and '20110601' --adjust date accordingly

                ) o

                pivot (max(rate_mul) for ex_date1 in ( [20100401_mul], [20080801_mul], [20090301_mul], [20090201_mul], [20080101_mul], [20100201_mul], [20071101_mul], [20080201_mul], [20091001_mul], [20070301_mul], [20070601_mul], [20080301_mul], [20070201_mul], [20110501_mul], [20090601_mul], [20090101_mul], [20080501_mul], [20091201_mul], [20070701_mul], [20101201_mul], [20100501_mul], [20110401_mul], [20091101_mul], [20080901_mul], [20080701_mul], [20081001_mul], [20081201_mul], [20071201_mul], [20071001_mul], [20100901_mul], [20080401_mul], [20090401_mul], [20070501_mul], [20100701_mul], [20090901_mul], [20070801_mul], [20081101_mul], [20090701_mul], [20100801_mul], [20100301_mul], [20101001_mul], [20070101_mul], [20090801_mul], [20110201_mul], [20090501_mul], [20110101_mul], [20070901_mul], [20110601_mul], [20080601_mul], [20110301_mul], [20100601_mul], [20101101_mul], [20070401_mul], [20100101_mul])) as p1

                pivot (max(rate_div) for ex_date2 in ( [20091101_div], [20100201_div], [20110601_div], [20071101_div], [20081201_div], [20090201_div], [20080801_div], [20070701_div], [20080601_div], [20100901_div], [20101001_div], [20080101_div], [20090301_div], [20090801_div], [20080401_div], [20070601_div], [20090401_div], [20071001_div], [20101101_div], [20100701_div], [20070401_div], [20100401_div], [20080501_div], [20100101_div], [20100601_div], [20091001_div], [20100501_div], [20070901_div], [20080301_div], [20080901_div], [20110101_div], [20090901_div], [20081001_div], [20110201_div], [20090101_div], [20070101_div], [20070301_div], [20090601_div], [20110301_div], [20081101_div], [20080201_div], [20110501_div], [20110401_div], [20070201_div], [20090701_div], [20070501_div], [20091201_div], [20101201_div], [20100801_div], [20090501_div], [20080701_div], [20071201_div], [20100301_div], [20070801_div])) as p2
Trying the 'in the meantime can you try' solution :

-- Returned rows looking like the format I want, but with NULLS in columns :

fromc   toc     20100901_mul      20100901_div     .......
AED      AUD      NULL                NULL                 .......    
could it be because we are missing to filter by CHF?

Check now
declare @cols1 varchar(max)
declare @cols2 varchar(max)
declare @cols3 varchar(max)
declare @strSQL varchar(max)


set @cols1 = stuff((select distinct '], [' + ex_date + '_mul], [' + ex_date + '_div'
		from hist_exch_rates
		where ex_date between '20070101' and '20110601'  --adjust date accordingly
		for xml path('')
		), 1, 2, '') + ']'


set @cols2 = stuff((select distinct '], [' + ex_date + '_mul'
		from hist_exch_rates
		where ex_date between '20070101' and '20110601' --adjust date accordingly
		for xml path('')
		), 1, 2, '') + ']'


set @cols3 = stuff((select distinct '], [' + ex_date + '_div'
		from hist_exch_rates
		where ex_date between '20070101' and '20110601' --adjust date accordingly
		for xml path('')
		), 1, 2, '') + ']'


set @strSQL = 'select fromc, toc, ' + @cols1 +
		' from (
			select 	ex_date + ''_mul'' as ex_date1,        
				ex_date + ''_div'' as ex_date2,        
				fromc,
				toc,
				rate_mul,
				rate_div
			from hist_exch_rates
                            where ex_date between ''20070101'' and ''20110601'' --adjust date accordingly
				and toc = ''CHF''
		) o
		pivot (max(rate_mul) for ex_date1 in (' + @cols2 + ')) as p1
		pivot (max(rate_div) for ex_date2 in (' + @cols3 + ')) as p2'
		
exec(@strSQL)

Open in new window

Tried latest code with CHF filter but still getting NULLS as before ...
ASKER CERTIFIED SOLUTION
Avatar of ralmada
ralmada
Flag of Canada image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
That's excellent, ralmada, and it's now producing the output almost exactly as I need so thanks for that.

The only remaining niggle is that I run it on my live data, I get the column headings produced as
for example :

fromc   toc  20070101_mul    20070101_div    20100901_mul  20100901_div

Although not a showstopper, it would be perfect if the columns could come out in ascending sequence - so is that possible ?    
That can be done by adjusting the @cols1 variable



set @cols1 = stuff((select distinct ', max([' + ex_date + '_mul]) as [' + ex_date + '_mul], max([' + ex_date + '_div]) as [' + ex_date + '_div]'
            from #hist_exch_rates
            where ex_date between '20070101' and '20110601'  --adjust date accordingly
            for xml path('')  order by ex_date -- desc or asc
            ), 1, 2, '')

 
Unfortunately, adding the order by ex_date to the setting of the @cols1 variable throws up :

Incorrect syntax near the keyword 'order'.

yeah, They are inverted, order by comes first and for xml path after. I'm not used to type on smartphones. lol

set @cols1 = stuff((select distinct ', max([' + ex_date + '_mul]) as [' + ex_date + '_mul], max([' + ex_date + '_div]) as [' + ex_date + '_div]'
            from #hist_exch_rates
            where ex_date between '20070101' and '20110601'  --adjust date accordingly
            order by ex_date -- desc or asc
            for xml path('')  
            ), 1, 2, '')
Thanks for your patience and sticking with this, ralmada:.......

Putting the order by ex_date before the for xml_path gives the error :

          ORDER BY items must appear in the select list if SELECT DISTINCT is specified.

But don't worry about this as having the columns come out in ascending sequence is a nice-to-have rather than a got-to-have. Not even sure if it would be possible as the code would would be trying to reference ex_date in the ORDER BY clause, but of course the columns now have names such as [20071001_mul], [20081001_mul] etc .....

I'm more than happy with your solution and output result, and so am gladly awarding the points. If possible, what I would finally appreciate is a step-by-step breakdown of the final code solution as it involves a few techniques (stuff ... for xml path ...) which I've not used before ....

Thanks again.
then change

order by ex_date -- desc or asc

with

order by 1

So here's a brief explanation of what the code does:

1) We are using dynamic SQL to create a PIVOT query on the fly which is stored in the @strSQL variable. This will then be executed by using

exec(@StrSQL)

The basic structure of the PIVOT we are creating is

select fromc, toc, ....                                                 -- .... gets populated with @cols1
from ( select query ) o
pivot (max(rate_mul) for ex_date1 in (....) as p1     -- .... gets populated with @cols2
pivot (max(rate_div) for ex_date2 in (...) as p2       -- .... gets populated with @cols3
group by fromc, toc

2) We have three variables @cols1, @cols2 and @cols3 they are used to populate  the missing columns above.

The first one @cols1 creates columns with this format:

max([20071201_mul]) as [20071201_mul], max([20071201_div]) as [20071201_div],

Please note that you can change the column name (in bold) with the name of your preference.

@cols2 and @cols3 are used in the pivot section and simply create values like [20071201_mul] and [20071201_div]

To do this concatenation we are using an XML trick with the FOR XML PATH. That simply list all rows and concatenates them in a comma separated value. try the below query for an example of what I'm talking about:

select distinct '], [' + ex_date + '_mul'
from #hist_exch_rates
for xml path('')

This produces something like this:

], [20071201_mul], [20080101_mul] ... and so on

As you see we have an extra " ], " there, to get rid of it we use the stuff function. the main structure is:

stuff (yourstring, 1, 2, '')  

That will select the susbstring from 1 to 2 and replace it with '', that basically will remove the " ], " part

3) And lastly, as you can see we are using pivot twice, that's because you need to create a set of two columns, one for mul and one for div, but in order to avoid the null values we have encountered before we use a final group by so we have everything on the same row.

Hope this helps.

Excellent - that gives me a goof reference point for any future variations on this technique. Thanks again for all of your help ....