asked on

Query That Resets ROW_NUMBER Value

I am using SQL Server 2012. I basically have a set of data that has an account number, date, account type and balance.
I am trying to write a query that will identify when the balance has not changed within a specified number of days (threshold) within a given 'review period'. For example, looking back 30 days, is there a period of where the balance hasn't changed within a 5 day threshold.

One thing I've noticed is if I use ROW_NUMBER() with the following syntax:

SELECT	account_id,
	account_type,
	balance_date,
	account_balance,
	ROW_NUMBER() OVER ( PARTITION BY account_id, account_type, account_balance ORDER BY account_id, account_type, balance_date DESC ) AS ranking
FROM	account
ODER BY account_id, security_id, last_pricing_date, mark_to_market_price;

Open in new window

If the balance for this account is (commas separate by date in ascending order):
45,000, 45,000, 45,000, 44,000, 45000, 45000,... (see account_id 3700)
ROW_NUMBER function returns values of 1, 2, 3, 1, 4, 5...
I want it to return 1, 2, 3, 1, 1, 2...
Looking at each individual account by type, I want to look at the dates sequentially, and EVERY time the balance changes, reset the ranking to 1.

Here is some sample data:

CREATE TABLE acct_temp
(account_id  BIGINT,
account_type CHAR(3),
balance_date DATE,
account_balance MONEY)

INSERT INTO acct_temp
VALUES
(3700,'SAV','2015-09-24',45000),
(3700,'SAV','2015-09-25',45000),
(3700,'SAV','2015-09-28',45000),
(3700,'SAV','2015-09-29',45000),
(3700,'SAV','2015-09-30',45000),
(3700,'SAV','2015-10-01',45000),
(3700,'SAV','2015-10-02',45000),
(3700,'SAV','2015-10-05',45000),
(3700,'SAV','2015-10-06',45000),
(3700,'SAV','2015-10-07',43000),
(3700,'SAV','2015-10-08',45000),
(3700,'SAV','2015-10-09',45000),
(3700,'SAV','2015-10-12',45000),
(740,'SAV','2015-10-02',45000),
(740,'SAV','2015-10-05',41000),
(740,'SAV','2015-10-06',45000),
(740,'SAV','2015-10-07',45000),
(740,'SAV','2015-10-08',45000),
(740,'SAV','2015-10-09',45000),
(740,'SAV','2015-10-12',45000),
(858,'SAV','2015-09-24',45000),
(858,'SAV','2015-09-25',45000),
(858,'SAV','2015-09-28',45100),
(858,'SAV','2015-09-29',45000),
(858,'SAV','2015-09-30',45300),
(858,'SAV','2015-10-01',45000),
(858,'SAV','2015-10-02',45000),
(858,'SAV','2015-10-05',45700),
(858,'SAV','2015-10-06',45000),
(858,'SAV','2015-10-07',45000),
(858,'SAV','2015-10-08',46100),
(858,'SAV','2015-10-09',45000),
(858,'SAV','2015-10-12',45000)

Open in new window

Jim Horn

>I want it to return 1, 2, 3, 1, 1, 2...
This fits the description of DENSE_RANK() instead of ROW_NUMBER, where there are no gaps in rankings, meaning if there's a two-way tie for first place, both are 1, and then 3rd place is 2.

PortletPaul

what does this actually mean?

..., looking back 30 days, is there a period of where the balance hasn't changed within a 5 day threshold.

interpretation:
a.) go back 25 to 35 days
b.) are there variations in the balance during that period

I would need to see the "expected result" from your sample data

D B

ASKER

For clarification, ranking is done on the entire set throughout the review period (e.g. 30 days) with no concern of the threshold period (5 days). The threshold value comes into play in a subsequent query.

Jim,
I already had tried dense_rank. It returns the same result as row_number. It is not that I am trying to close gaps, I want to start numbering again at 1.

Paul,
Today is 10/20/2015. The review period is 30 days, which means I am looking back to 9/20/2015. The threshold is 5 days. Think of it as a sliding window, so from 9/30 to today, where there and time periods where, for 5 consecutive days where the balance remained the same.

You stated you would need to see the expected results. I think I stated that in my original post
"If the balance for this account is (commas separate by date in ascending order):
45,000, 45,000, 45,000, 44,000, 45000, 45000,... (see account_id 3700)
ROW_NUMBER function returns values of 1, 2, 3, 1, 4, 5...
I want it to return 1, 2, 3, 1, 1, 2..."

If the balance changes, ANY subsequent balance that is not the same value returns a value of 1, even if it is the same as a previously posted balance:

Balance      Rank
45,000      1
45,000      2
45,000      3
45,000      4
45,000      5
46,000      1
46,000      2
45,000      1
45,000      2
44,000      1
45,000      1

PortletPaul

I have not been able to run this at all. However I'm hoping it will assist.

declare @today date = '20151020'
declare @period int = 30
declare @start date = dateadd(day,-@period,@today)

--select @start, @period, @today

select
       a.*
     , ca.ranking
from acct_temp as a
cross apply (
        select
              1 + count(distinct account_balance ) 
        from acct_temp
        where account_id = a.account_id
        and balance_date >= dateadd(day,-5,a.balance_date)
        and balance_date < a.balance_date
        and account_balance <> a.account_balance
        ) ca (ranking)
where balance_date >= @start
and balance_date < dateadd(day,1,@today)
order by 
      account_id
    , balance_date

Open in new window

D B

ASKER

Paul,
Results are not what I would expect. Here is one account. I would expect to see values of:
1,1,1,2,3,4 instead of 1,2,2,2,2,2 as shown from your query.

account_id	account_type	balance_date	account_balance	ranking
740	SAV	2015-10-02	45000.00	1
740	SAV	2015-10-05	41000.00	2
740	SAV	2015-10-06	45000.00	2
740	SAV	2015-10-07	45000.00	2
740	SAV	2015-10-08	45000.00	2
740	SAV	2015-10-09	45000.00	2

Open in new window

Brian Crowe

This is probably not the most efficient way to do this but it seems to satisfy your requirements.

IF OBJECT_ID('tempdb..#acct_temp') IS NOT NULL
	DROP TABLE #acct_temp;
GO

CREATE TABLE #acct_temp
(
	account_id		BIGINT,
	account_type	CHAR(3),
	balance_date	DATE,
	account_balance	MONEY
);

INSERT INTO #acct_temp
VALUES
	(3700,'SAV','2015-09-24',45000),
	(3700,'SAV','2015-09-25',45000),
	(3700,'SAV','2015-09-28',45000),
	(3700,'SAV','2015-09-29',45000),
	(3700,'SAV','2015-09-30',45000),
	(3700,'SAV','2015-10-01',45000),
	(3700,'SAV','2015-10-02',45000),
	(3700,'SAV','2015-10-05',45000),
	(3700,'SAV','2015-10-06',45000),
	(3700,'SAV','2015-10-07',43000),
	(3700,'SAV','2015-10-08',45000),
	(3700,'SAV','2015-10-09',45000),
	(3700,'SAV','2015-10-12',45000),
	(740,'SAV','2015-10-02',45000),
	(740,'SAV','2015-10-05',41000),
	(740,'SAV','2015-10-06',45000),
	(740,'SAV','2015-10-07',45000),
	(740,'SAV','2015-10-08',45000),
	(740,'SAV','2015-10-09',45000),
	(740,'SAV','2015-10-12',45000),
	(858,'SAV','2015-09-24',45000),
	(858,'SAV','2015-09-25',45000),
	(858,'SAV','2015-09-28',45100),
	(858,'SAV','2015-09-29',45000),
	(858,'SAV','2015-09-30',45300),
	(858,'SAV','2015-10-01',45000),
	(858,'SAV','2015-10-02',45000),
	(858,'SAV','2015-10-05',45700),
	(858,'SAV','2015-10-06',45000),
	(858,'SAV','2015-10-07',45000),
	(858,'SAV','2015-10-08',46100),
	(858,'SAV','2015-10-09',45000),
	(858,'SAV','2015-10-12',45000);

WITH cteAccount AS
(
	SELECT account_id, account_type, balance_date, account_balance,
		ROW_NUMBER() OVER(PARTITION BY account_id, account_type ORDER BY balance_date) AS RowNumber,
		LAG(account_balance, 1, NULL) OVER(PARTITION BY account_id, account_type ORDER BY balance_date) AS previous_balance,
		LEAD(account_balance, 1, NULL) OVER(PARTITION BY account_id, account_type ORDER BY balance_date) AS next_balance
	FROM #acct_temp
),
cteBalanceGroup AS
(
	SELECT account_id, account_type, balance_date, account_balance,
		RowNumber,
		CASE
			WHEN previous_balance IS NULL THEN RowNumber
			WHEN account_balance <> previous_balance THEN RowNumber
		END AS balance_ranking
	FROM cteAccount
),
cteBalanceRanking AS
(
	SELECT R1.account_id, R1.account_type, R1.balance_date, R1.account_balance,
		MAX(ISNULL(R1.Balance_ranking, R2.balance_ranking)) AS ranking
	FROM cteBalanceGroup AS R1
	LEFT OUTER JOIN cteBalanceGroup AS R2
		ON R1.account_id = R2.account_id
		AND R1.account_type = R2.account_type
		AND R1.RowNumber > R2.RowNumber
		AND R2.balance_ranking IS NOT NULL
	GROUP BY R1.account_id, R1.account_type, R1.balance_date, R1.account_balance
)
SELECT account_id, account_type, balance_date, account_balance,
	DENSE_RANK() OVER(PARTITION BY account_id, account_type ORDER BY ranking) AS ranking
FROM cteBalanceRanking

Open in new window

D B

ASKER

Sorry about the delay. I am at PASS Summit this week and will get to this next week.

Brian Crowe

I'm jealous...trying to arrange it for next year.