Link to home
Start Free TrialLog in
Avatar of Dave KIlby
Dave KIlbyFlag for Canada

asked on

MS SQL Query - Compare First Day Of Week

I have a table that has  3 fields - Date, RecordBucket, RecordCount- every day there are new entries.  I want to run a query that compares data in the RecordCount field week over week.

So it looks for the data for every Monday and compares it to the previous Monday and then has a calculated field showing what percentage of increase/decrease.

Is this possible ?
Avatar of ste5an
ste5an
Flag of Germany image

hmm, your data is day-by-day, but you only care for Mondays?

Using LAG():

WITH Data
AS ( SELECT T.* ,
            LAG(T.RecordCount, 7, NULL) OVER ( ORDER BY T.RecordDate ASC ) AS PreviousRecordCount
     FROM   yourTable T )
SELECT * ,
       ( T.RecordCount - D.PreviousRecordCount ) / D.PreviousRecordCount AS RecordCountChange
FROM   Data D;

-- or

WITH Data
AS ( SELECT T.* ,
            LAG(T.RecordCount, 7, NULL) OVER ( PARTITION BY T.RecordBucket
                                               ORDER BY T.RecordDate ASC ) AS PreviousRecordCount
     FROM   yourTable T )
SELECT * ,
       ( T.RecordCount - D.PreviousRecordCount ) / D.PreviousRecordCount AS RecordCountChange
FROM   Data D;

Open in new window


For Mondays only, add an appropriate WHERE, e.g.

SELECT *
FROM   yourTable T
WHERE  DATEDIFF(DAY, 0, T.RecordDate) % 7 = 0;

Open in new window

Avatar of Dave KIlby

ASKER

I am getting the below error

The multi-part identifier "T.RecordCount" could not be bound.

If I run the first query i get results but when i try to run full query i get the above error.

SELECT T.* ,
            LAG(T.RecordCount, 7, NULL) OVER ( ORDER BY T.RecordDate ASC ) AS PreviousRecordCount
     FROM   yourTable T WHERE DATEDIFF(DAY, 0, T.RecordDate) % 7 = 0 

Open in new window

ASKER CERTIFIED SOLUTION
Avatar of ste5an
ste5an
Flag of Germany image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
To get a %, as stated, you'll need to multiply the count before doing the calculation, otherwise you'll always just get 0 as the difference (unless there's a 100% or more increase).

SELECT * ,
      CAST( ( D.RecordCount - D.PreviousRecordCount )  * 100.0 / D.PreviousRecordCount AS decimal(6, 2)) AS [Record_Count_%_Change]
FROM   Data D;

If you're going to run this often for historical data, I'd suggest adding a column to contain the previous weekly count in the current row's data.  Then you can do a simple calc without constantly having to read and sequence the entire data stream.
Thank you for your help and sorry for not posting concise examples.

The query runs now - i am using the 2nd example you sent, and I added the where clause for Monday, but the PreviousRecordCount and RecordCountchange fields are both null values - i have attached the output from the query.  Not sure if I have put the where clause in the right place

Query I am  using

WITH Data
AS ( SELECT T.* ,
            LAG(T.RecordCount, 7, NULL) OVER ( PARTITION BY T.RecordBucket
                                               ORDER BY T.RecordDate ASC ) AS PreviousRecordCount
     FROM   yourTable T WHERE DATEDIFF(DAY, 0, T.RecordDate ) % 7 = 0 )
SELECT * ,
       ( D.RecordCount - D.PreviousRecordCount ) / D.PreviousRecordCount AS RecordCountChange
FROM   Data D;

Open in new window

SQLOutput.csv
E.g.

DECLARE @Sample TABLE (
    RecordBucket CHAR(1) ,
    RecordCount INT ,
    RecordDate DATE
);


INSERT INTO @Sample ( RecordBucket ,
                      RecordCount ,
                      RecordDate )
VALUES ( 'A', 1, '2019-12-02' ) ,
       ( 'A', 1, '2019-12-03' ) ,
       ( 'A', 1, '2019-12-04' ) ,
       ( 'A', 1, '2019-12-05' ) ,
       ( 'A', 1, '2019-12-06' ) ,
       ( 'A', 1, '2019-12-07' ) ,
       ( 'A', 1, '2019-12-08' ) ,
       ( 'A', 12, '2019-12-09' ) ,
       ( 'B', 12, '2019-12-02' ) ,
       ( 'B', 1, '2019-12-03' ) ,
       ( 'B', 1, '2019-12-04' ) ,
       ( 'B', 1, '2019-12-05' ) ,
       ( 'B', 1, '2019-12-06' ) ,
       ( 'B', 1, '2019-12-07' ) ,
       ( 'B', 1, '2019-12-08' ) ,
       ( 'B', 1, '2019-12-09' );

WITH Data
AS ( SELECT T.* ,
            LAG(T.RecordCount, 7, NULL) OVER ( PARTITION BY T.RecordBucket
                                               ORDER BY T.RecordDate ASC ) AS PreviousRecordCount
     FROM   @Sample T )
SELECT * ,
       ( D.RecordCount - D.PreviousRecordCount ) * 1.0 / D.PreviousRecordCount AS RecordCountChange
FROM   Data D
WHERE  DATEDIFF(DAY, 0, D.RecordDate) % 7 = 0;

Open in new window

You said every day. Thus my solution uses this assumption. One row per day and partition column.
This code will compare the first value for each week and compare it, not directly checking days.  For example, if the first value for a week is Monday (at 2AM), it will use that.  If the first value for a given week doesn't appear until Tuesday (at 3PM), it will use that.

I adjusted the sample data to remove selected Mondays to show the results.

Note, though, that even if both weeks' data don't start until Tuesday, Monday will still be the "WeekStartingDate", since the logic is intended to be Monday to Monday.  Naturally we can adjust to also show the actual date(time) used if you want that in the output as well.


IF OBJECT_ID('tempdb.dbo.#data') IS NOT NULL DROP TABLE #data;
CREATE TABLE #data (
    RecordBucket char(1) NOT NULL,
    RecordCount int NULL,
    Date datetime NOT NULL
);
CREATE CLUSTERED INDEX data__CL ON #data ( RecordBucket, Date ) WITH ( FILLFACTOR = 98 );
INSERT INTO #data
VALUES ( 'A',  1, '2019-12-02' ) ,
       ( 'A',  2, '2019-12-03' ) ,
       ( 'A',  4, '2019-12-04' ) ,
       ( 'A',  8, '2019-12-05' ) ,
       ( 'A', 16, '2019-12-06' ) ,
       ( 'A', 32, '2019-12-07' ) ,
       ( 'A', 64, '2019-12-08' ) ,
       ( 'A',128, '2019-12-10' ) ,
       ( 'B',  8, '2019-12-03' ) ,
       ( 'B', 16, '2019-12-04' ) ,
       ( 'B', 32, '2019-12-05' ) ,
       ( 'B', 64, '2019-12-06' ) ,
       ( 'B',128, '2019-12-07' ) ,
       ( 'B',256, '2019-12-08' ) ,
       ( 'B',512, '2019-12-09' );
GO


;WITH cte_get_first_value_for_each_week AS (
    SELECT *
    FROM (
        SELECT *, DATEDIFF(DAY, 0, Date) / 7 AS week#,
            ROW_NUMBER() OVER(PARTITION BY RecordBucket, DATEDIFF(DAY, 0, Date) / 7
            ORDER BY Date) AS row_num
        FROM #data
        --WHERE Date >= '<starting_date>'
    ) AS derived
    WHERE row_num = 1
)
SELECT curr.RecordBucket, DATEADD(DAY, 7 * curr.week#, 0) AS WeekStartingDate,
    curr.RecordCount AS RecordCountCurrent, prev.RecordCount AS RecordCountPrevious,
    (curr.RecordCount - prev.RecordCount) * 100.0 / prev.RecordCount AS PercentChange
FROM cte_get_first_value_for_each_week curr
LEFT OUTER JOIN cte_get_first_value_for_each_week prev ON
    prev.RecordBucket = curr.RecordBucket AND
    prev.week# = curr.week# - 1 AND
    prev.row_num = 1 AND curr.row_num = 1