Retrieving filtered records for pivot

I'm trying to create a stored procedures that will allow me to input two parameters consisting of two different ids.  All the records in each dataversion dataset has have records with the same metric names but with different values in those field.  All the records are in a single table called metrics.  I am trying to retrieve the difference between metric values for each dataversion Id.  I can do this simply enough via CTE as shown below.  However I also need to pivot this data.  The provlem is that I  need to use Dynamic sql and Dynamic SQL can't use CTE.  Others hiave proposed using a vew and then using that view in the pivot process.  however, you can't pass parameters to a view.  Of course, I could place the dataversion criteria in the dynamic sql but that would mean first retrieving all the records via the ctes and then apply the criteria in the pivoting stage.  We're talking serveral hundred thousand records here.  
-- Retrieves data
Create PROCEDURE up_PrimaryBenefitsTraditionalMetricsReport 
@DVID1 as NVARCHAR(MAX), 
@DVID2 as NVARCHAR(MAX) 
         
AS 
Declare @strSQL varchar(max)           
DECLARE @cols varchar(2000)     
 
BEGIN 
        -- SET NOCOUNT ON added to prevent extra result sets from 
        -- interfering with SELECT statements. 
        SET NOCOUNT ON; 
 
        WITH C1 as 
        ( 
                SELECT DV.ForecastYear, MetricName, MetricValue, M1.DataVersionRowID, M1.DataVersionID FROM Metrics M1 
                INNER JOIN DataVersion DV 
                ON M1.DataVersionRowID = DV.DataVersionRowID 
                WHERE DV.DataVersionID = @DVID1
         ), 
          
         C2 AS 
         ( 
                SELECT DV.ForecastYear, MetricName , MetricValue, M2.DataVersionRowID, M2.DataVersionID FROM Metrics M2 
                INNER JOIN DataVersion DV 
                ON M2.DataVersionRowID = DV.DataVersionRowID 
                WHERE DV.DataVersionID = @DVID2
        ), 
         
        C3 AS 
        ( 
                SELECT  C1.Forecastyear, C1.MetricName,  C1.MetricValue,  C1.MetricValue-C2.METRICVALUE DELTA FROM C1 
                INNER JOIN C2  
                ON C1.MetricName = C2.MetricName  
        ) 
 SELECT * FROM C3

Open in new window

Juan VelasquezAsked:
Who is Participating?
 
Brendt HessSenior DBACommented:
Part 1 - Yes, that is exactly the reason.  The @@SPID value will be unique to each connection, so the data from multiple users will be kept separate within the working table.  I've seen enough of these types of issues crop up that lead to weird, untrackable issues with reports, that I look for them automatically now.  :)

Part 2 - Common table expressions were not in fact created for this purpose.  There intent (quoting here):

"A Common Table Expression(CTE) is an ANSI SQL-99  temporary result set that can be declared once and referenced multiple times in a query."

So, if you simply are referencing a table or tableset once in a query, a CTE is actually overkill.  If, on the other hand, you were building a query where (for example) you were summing up multiple different columns into differing totals, and needed to take separate actions based on differing combinations of values, a CTE would be sensible, e.g.:

WITH CashFlow AS (
   SELECT invID,
       Sum(GrossIncome) as GrossIncome,
       Sum(NetIncome) as NetIncome,
       Min(UnitCost) as MinCost,
       Max(UnitCost) as MaxCost,
       Min(Added_Date) as DateAdded,
       Count(Distinct OrderID) as Orders,
       Sum(qtyShipped) as QtySold,
       Sum(QtyShipped * UnitCost) as NetCosts,
       Sum(QtyShipped * UnitPrice) as NetSales
   FROM <a bunch of different tables>
   WHERE <your conditions>
   )
SELECT inv.invID,
    inv.ProductName,
    CASE WHEN DateAdded > @StartDate
        THEN 1
        ELSE 0
    END As NewThisYear,
    CASE
        WHEN NetCosts >= NetSales - (NetSales * .025)  -- 2.5% assumed overhead on sales
            THEN 'Loss Leader',
        WHEN NetSales < (SELECT SalesCutoff FROM SalesThresholds WHERE LowCost < MinCost AND CutoffCost >= MinCost)
            THEN 'Low Seller'
        WHEN .....
FROM CashFlow
INNER JOIN Inventory inv
    ON CashFlow.invID = Inventory.invID
....

This is a fairly simplistic example, as this *could* be done with a simple derived table.  If you needed to do a series of queries using UNION that all needed to reference the data in the CTE, that would be a better example, as you could use the reference to the CTE multiple times without recalculating the data.  Another use would be to easily reference such things as ranking values in multiple different contexts.

In your case, all you are doing is trying to use a CTE to encapsulate a simple query as if it were an object.  It's really not the reason they were developed, and it is (IMO) a bad habit to get into.  In a true enterprise situation, you will frequently need to spend time optimizing queries for speed.  Using constructs that slow things down simply guarantees in that context that the SQL will need to be redesigned later.  Personally, I would discourage it.
0
 
Brendt HessSenior DBACommented:
Actually, if you check the internal query flow, you will find that Views DO use filters if you apply them correctly.  I would rewrite this for the view as:

        WITH C1 as
        (
                SELECT DV.ForecastYear, MetricName, MetricValue, M1.DataVersionID
                FROM Metrics M1
                INNER JOIN DataVersion DV
                ON M1.DataVersionRowID = DV.DataVersionRowID
                WHERE DV.DataVersionID = @DVID1
         ),
         
         C2 AS
         (
                SELECT DV.ForecastYear, MetricValue, M2.DataVersionID
                FROM Metrics M2
                INNER JOIN DataVersion DV
                ON M2.DataVersionRowID = DV.DataVersionRowID
--                WHERE DV.DataVersionID = @DVID2
        ),
         
        C3 AS
        (
                SELECT  C1.Forecastyear, C1.MetricName,  C1.MetricValue,  C1.MetricValue-C2.METRICVALUE DELTA, C1.DataVersionID as DVID1, C2.DataVersionID as DVID2
                FROM C1
                INNER JOIN C2  
                ON C1.MetricName = C2.MetricName  
        )
 SELECT * FROM C3

----------------------------------

When you reference the view, be sure to include the two DVIDs as criteria, e.g.:

SELECT *
FROM MyView
WHERE DVID1 = @DVID1
    AND DVID2 = @DVID2

Check the execution plan, and you will see that filtering is going on in the view query.  Whether it is filtering enough is a different question, but this could be an answer.

Another method would be to forego CTEs entirely.  I don't think you gain anything by using them here when compared to this code:

SELECT C1.Forecastyear, C1.MetricName,  C1.MetricValue,  C1.MetricValue-C2.METRICVALUE DELTA
FROM Metrics C1
INNER JOIN DataVersion DV1
    ON C1.DataVersionRowID = DV1.DataVersionRowID
INNER JOIN Metrics C2
    ON C1.MetricName = c2.MetricName
INNER JOIN DataVersion DV2
    ON C2.DataVersionRowID = DV2.DataVersionRowID
WHERE DV1.DataVersionID = @dvid1
    AND DV2.DataVersionID = @dvid2

Embedding this block as a derived table should allow your dynamic SQL to function fine.
0
 
Juan VelasquezAuthor Commented:
I came up with the following code and it seems to work
ALTER PROCEDURE up_PrimaryBenefitsTraditionalMetricsReport 
@DVID1 as NVARCHAR(MAX), 
@DVID2 as NVARCHAR(MAX) 
         
AS 
   
 
BEGIN 
        -- SET NOCOUNT ON added to prevent extra result sets from 
        -- interfering with SELECT statements. 
        SET NOCOUNT ON; 
        
TRUNCATE table tblReportMetrics;
 
        WITH C1 as 
        ( 
                SELECT DV.ForecastYear, MetricName, MetricValue, M1.DataVersionRowID, M1.DataVersionID FROM Metrics M1 
                INNER JOIN DataVersion DV 
                ON M1.DataVersionRowID = DV.DataVersionRowID 
                WHERE DV.DataVersionID = @DVID1
         ), 
          
         C2 AS 
         ( 
                SELECT DV.ForecastYear, MetricName , MetricValue, M2.DataVersionRowID, M2.DataVersionID FROM Metrics M2 
                INNER JOIN DataVersion DV 
                ON M2.DataVersionRowID = DV.DataVersionRowID 
                WHERE DV.DataVersionID = @DVID2
        ), 
         
        C3 AS 
        ( 
                SELECT  C1.Forecastyear, C1.MetricName,  C1.MetricValue,  C1.MetricValue-C2.METRICVALUE DELTA FROM C1 
                INNER JOIN C2  
                ON C1.MetricName = C2.MetricName  
        ) 




-- SELECT ForecastYear, MetricName, DELTA  INTO REPORTMETRICS  FROM C3
INSERT INTO tblReportMetrics(ForecastYear, MetricName, DELTA)
SELECT ForecastYear, MetricName, DELTA   FROM C3
 
GO;
   
Declare @strSQL varchar(max)           
DECLARE @cols varchar(2000)      
         
        SELECT  @cols = STUFF(( SELECT DISTINCT             
                                '],[' + cast(ForecastYear as varchar)            
                        From tblReportMetrics   
                        ORDER BY 1            
                        FOR XML PATH('')            
                      ), 1, 2, '') + ']'            
       
         
set @strSQL = 'SELECT MetricName, ' + @cols +    
                ' from (   
                        SELECT  MetricName, ForecastYear, Delta                                      
                        FROM dbo.tblReportMetrics  
                        ) o   
                pivot(sum(Delta) for ForecastYear in (' + @cols + '))p'   
     select @strSQL 
   exec(@strSQL)    
         
END 

Open in new window

0
Cloud Class® Course: CompTIA Cloud+

The CompTIA Cloud+ Basic training course will teach you about cloud concepts and models, data storage, networking, and network infrastructure.

 
Brendt HessSenior DBACommented:
This has one limitation - the query can not be run simultaneously on two stations without data conflicts in the working table - and one possible limitation - it may be slower than the simple JOIN syntax - that is, the CTEs may be less efficient than the standard SQL.

One workaround for the first problem would be to include an additional field in the work table for the current SPID, and pass that into the dynamic SQL string.  Below is a sample of that fix:

ALTER PROCEDURE up_PrimaryBenefitsTraditionalMetricsReport 
@DVID1 as NVARCHAR(MAX), 
@DVID2 as NVARCHAR(MAX) 
         
AS 
 
BEGIN 
        -- SET NOCOUNT ON added to prevent extra result sets from 
        -- interfering with SELECT statements. 
        SET NOCOUNT ON; 

DECLARE @spid int
SET @spid = @@SPID
        
DELETE FROM tblReportMetrics
WHERE spid = @spid;
 
        WITH C1 as 
        ( 
                SELECT DV.ForecastYear, MetricName, MetricValue, M1.DataVersionRowID, M1.DataVersionID FROM Metrics M1 
                INNER JOIN DataVersion DV 
                ON M1.DataVersionRowID = DV.DataVersionRowID 
                WHERE DV.DataVersionID = @DVID1
         ), 
          
         C2 AS 
         ( 
                SELECT DV.ForecastYear, MetricName , MetricValue, M2.DataVersionRowID, M2.DataVersionID FROM Metrics M2 
                INNER JOIN DataVersion DV 
                ON M2.DataVersionRowID = DV.DataVersionRowID 
                WHERE DV.DataVersionID = @DVID2
        ), 
         
        C3 AS 
        ( 
                SELECT  C1.Forecastyear, C1.MetricName,  C1.MetricValue,  C1.MetricValue-C2.METRICVALUE DELTA FROM C1 
                INNER JOIN C2  
                ON C1.MetricName = C2.MetricName  
        ) 




-- SELECT ForecastYear, MetricName, DELTA  INTO REPORTMETRICS  FROM C3
INSERT INTO tblReportMetrics(spid, ForecastYear, MetricName, DELTA)
SELECT @spid, ForecastYear, MetricName, DELTA   FROM C3
 
GO;
   
Declare @strSQL varchar(max)           
DECLARE @cols varchar(2000)      
         
        SELECT  @cols = STUFF(( SELECT DISTINCT             
                                '],[' + cast(ForecastYear as varchar)            
                        From tblReportMetrics   
                        ORDER BY 1            
                        FOR XML PATH('')            
                      ), 1, 2, '') + ']'            
       
         
set @strSQL = 'SELECT MetricName, ' + @cols +    
                ' from (   
                        SELECT  MetricName, ForecastYear, Delta                                      
                        FROM dbo.tblReportMetrics 
                        WHERE spid = ' + Cast(@spid as varchar(10)) + '
                        ) o   
                pivot(sum(Delta) for ForecastYear in (' + @cols + '))p'   
     select @strSQL 
   exec(@strSQL)    
         
END 

Open in new window

0
 
Juan VelasquezAuthor Commented:
Hello bhess1,

Thank you for bringing up those two points.  I just want to be sure I understand the issues you raised rather than just implementing the suggestions you make without understanding them.  
@@SPID returns the session ID of the current user process. By storing that value in the working table each user will only be retrieving or working with data corresponding to the dataversion Id's he or she selected.  You know I hadn't even thought of that.  I've been developing in a dev environment an at present I only have one user working in a QA environment.  As a result this issue didn't come up.  I'm glad you brought it up.  As for the point that "CTEs may be less efficient than the standard SQL.".  I'll try both solutions and see which one is faster.  I'll let you know what I find out today.
0
 
Juan VelasquezAuthor Commented:
From what I understand the benefits of using table expressions are related to the logical aspects of the code and not to performance.  Basically I can use a more modular approach using CTEs thus making it easier to understand.  Am I correct.  I just want to be sure that I actually understand the nature of CTE correctly
0
 
Juan VelasquezAuthor Commented:
I'd like to run test and compare both approaches - CTE and non-CTE.  I'v created another version of the code by embedding the block as a derived table as you suggested.  It seems to compile properly but I keep getting the following error when I try to execute it.  I've isolated the error to the first block, where I Select @col  = STUFF...................  The only difference between this portion of the code and the corresponding portion of the CTE version is that I've replaced tblReportMetrics with the derived table.  I need a second pair of eyes to see what I am doing wrong.  I'm pretty sure it's something simple

Msg 8155, Level 16, State 2, Line 17
No column name was specified for column 1 of 'x'.
Msg 207, Level 16, State 1, Line 8
Invalid column name 'ForecastYear'.

DECLARE @dvid1 as nvarchar(MAX)= '2011-EEPort-NA-BAU-B-1 -NEMS-None-Mar 30 2010  9:09AM'
DECLARE @dvid2 as nvarchar(MAX) = '2011-EEPort-NA-BAU-T-1 -NEMS-None-Mar 30 2010  8:53AM'


Declare @strSQL varchar(max)           
DECLARE @cols varchar(2000)     

		SELECT  @cols = STUFF((SELECT DISTINCT '],[' + cast(x.ForecastYear as varchar )            
		From (
				SELECT 
				dv1.ForecastYear 
				FROM Metrics C1
				INNER JOIN DataVersion DV1 ON C1.DataVersionRowID = DV1.DataVersionRowID
				INNER JOIN Metrics C2 ON C1.MetricName = c2.MetricName
				INNER JOIN DataVersion DV2 ON C2.DataVersionRowID = DV2.DataVersionRowID
				WHERE DV1.DataVersionID = @dvid1 AND DV2.DataVersionID = @dvid2 								
				ORDER BY ForecastYear  FOR XML PATH('')	 	 )as x), 1, 2, '') + ']'

        select @cols  
       
         
set @strSQL = 'SELECT MetricName, ' + @cols +    
                ' from (   
                        SELECT  MetricName, ForecastYear, Delta                                      
                        FROM SELECT DV1.Forecastyear, C1.MetricName,  C1.MetricValue,  C1.MetricValue-C2.METRICVALUE DELTA
						FROM Metrics C1
						INNER JOIN DataVersion DV1
						ON C1.DataVersionRowID = DV1.DataVersionRowID
						INNER JOIN Metrics C2
						ON C1.MetricName = c2.MetricName
						INNER JOIN DataVersion DV2
						ON C2.DataVersionRowID = DV2.DataVersionRowID
						WHERE DV1.DataVersionID = ' + @dvid1 + '
						AND DV2.DataVersionID = ' + @dvid2 + ')o 
                        
                pivot(sum(Delta) for ForecastYear in (' + @cols + '))p'   
     --select @strSQL 
   exec(@strSQL)  

Open in new window

0
 
Juan VelasquezAuthor Commented:
Correction ORDER BY ForecastYear  should be Order By 1.  I still get the same error
0
 
Juan VelasquezAuthor Commented:
I'm still working on it and still haven't found the problem
0
 
Brendt HessSenior DBACommented:
Hmmm.... what is the need for the  FOR XML PATH('') in the x query?  I believe that it is what is causing the issue with the no fieldname.... what do you see when you run just that part of the query solo?
0
 
Juan VelasquezAuthor Commented:
I'm am usind it to create a xml string that contains all the forecast years.  This "single" value is necessary since I am using a subquery.
0
 
Juan VelasquezAuthor Commented:
I found your explanation of CTE very useful.  I wasn't able to get the code working using subqueries so I was forced to use CTEs'  I'll revisit that code at a later date.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.