Query That Resets ROW_NUMBER Value

I am using SQL Server 2012. I basically have a set of data that has an account number, date, account type and balance.
I am trying to write a query that will identify when the balance has not changed within a specified number of days (threshold) within a given 'review period'. For example, looking back 30 days, is there a period of where the balance hasn't changed within a 5 day threshold.

One thing I've noticed is if I use ROW_NUMBER() with the following syntax:
SELECT	account_id,
	account_type,
	balance_date,
	account_balance,
	ROW_NUMBER() OVER ( PARTITION BY account_id, account_type, account_balance ORDER BY account_id, account_type, balance_date DESC ) AS ranking
FROM	account
ODER BY account_id, security_id, last_pricing_date, mark_to_market_price;

Open in new window

If the balance for this account is (commas separate by date in ascending order):
45,000, 45,000, 45,000, 44,000, 45000, 45000,... (see account_id 3700)
ROW_NUMBER function returns values of 1, 2, 3, 1, 4, 5...
I want it to return 1, 2, 3, 1, 1, 2...
Looking at each individual account by type, I want to look at the dates sequentially, and EVERY time the balance changes, reset the ranking to 1.

Here is some sample data:
CREATE TABLE acct_temp
(account_id  BIGINT,
account_type CHAR(3),
balance_date DATE,
account_balance MONEY)

INSERT INTO acct_temp
VALUES
(3700,'SAV','2015-09-24',45000),
(3700,'SAV','2015-09-25',45000),
(3700,'SAV','2015-09-28',45000),
(3700,'SAV','2015-09-29',45000),
(3700,'SAV','2015-09-30',45000),
(3700,'SAV','2015-10-01',45000),
(3700,'SAV','2015-10-02',45000),
(3700,'SAV','2015-10-05',45000),
(3700,'SAV','2015-10-06',45000),
(3700,'SAV','2015-10-07',43000),
(3700,'SAV','2015-10-08',45000),
(3700,'SAV','2015-10-09',45000),
(3700,'SAV','2015-10-12',45000),
(740,'SAV','2015-10-02',45000),
(740,'SAV','2015-10-05',41000),
(740,'SAV','2015-10-06',45000),
(740,'SAV','2015-10-07',45000),
(740,'SAV','2015-10-08',45000),
(740,'SAV','2015-10-09',45000),
(740,'SAV','2015-10-12',45000),
(858,'SAV','2015-09-24',45000),
(858,'SAV','2015-09-25',45000),
(858,'SAV','2015-09-28',45100),
(858,'SAV','2015-09-29',45000),
(858,'SAV','2015-09-30',45300),
(858,'SAV','2015-10-01',45000),
(858,'SAV','2015-10-02',45000),
(858,'SAV','2015-10-05',45700),
(858,'SAV','2015-10-06',45000),
(858,'SAV','2015-10-07',45000),
(858,'SAV','2015-10-08',46100),
(858,'SAV','2015-10-09',45000),
(858,'SAV','2015-10-12',45000)

Open in new window

LVL 15
dbbishopAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Jim HornMicrosoft SQL Server Developer, Architect, and AuthorCommented:
>I want it to return 1, 2, 3, 1, 1, 2...
This fits the description of DENSE_RANK() instead of ROW_NUMBER, where there are no gaps in rankings, meaning if there's a two-way tie for first place, both are 1, and then 3rd place is 2.
0
PortletPaulfreelancerCommented:
what does this actually mean?

..., looking back 30 days, is there a period of where the balance hasn't changed within a 5 day threshold.

interpretation:
a.) go back 25 to 35 days
b.) are there variations in the balance during that period

I would need to see the "expected result" from your sample data
0
dbbishopAuthor Commented:
For clarification, ranking is done on the entire set throughout the review period (e.g. 30 days) with no concern of the threshold period (5 days). The threshold value comes into play in a subsequent query.

Jim,
I already had tried dense_rank. It returns the same result as row_number. It is not that I am trying to close gaps, I want to start numbering again at 1.

Paul,
Today is 10/20/2015. The review period is 30 days, which means I am looking back to 9/20/2015. The threshold is 5 days. Think of it as a sliding window, so from 9/30 to today, where there and time periods where, for 5 consecutive days where the balance remained the same.

You stated you would need to see the expected results. I think I stated that in my original post
"If the balance for this account is (commas separate by date in ascending order):
45,000, 45,000, 45,000, 44,000, 45000, 45000,... (see account_id 3700)
ROW_NUMBER function returns values of 1, 2, 3, 1, 4, 5...
I want it to return 1, 2, 3, 1, 1, 2..."

If the balance changes, ANY subsequent balance that is not the same value returns a value of 1, even if it is the same as a previously posted balance:

Balance      Rank
45,000      1
45,000      2
45,000      3
45,000      4
45,000      5
46,000      1
46,000      2
45,000      1
45,000      2
44,000      1
45,000      1
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

PortletPaulfreelancerCommented:
I have not been able to run this at all. However I'm hoping it will assist.
declare @today date = '20151020'
declare @period int = 30
declare @start date = dateadd(day,-@period,@today)

--select @start, @period, @today

select
       a.*
     , ca.ranking
from acct_temp as a
cross apply (
        select
              1 + count(distinct account_balance ) 
        from acct_temp
        where account_id = a.account_id
        and balance_date >= dateadd(day,-5,a.balance_date)
        and balance_date < a.balance_date
        and account_balance <> a.account_balance
        ) ca (ranking)
where balance_date >= @start
and balance_date < dateadd(day,1,@today)
order by 
      account_id
    , balance_date
 

Open in new window

0
dbbishopAuthor Commented:
Paul,
Results are not what I would expect. Here is one account. I would expect to see values of:
1,1,1,2,3,4 instead of 1,2,2,2,2,2 as shown from your query.
account_id	account_type	balance_date	account_balance	ranking
740	SAV	2015-10-02	45000.00	1
740	SAV	2015-10-05	41000.00	2
740	SAV	2015-10-06	45000.00	2
740	SAV	2015-10-07	45000.00	2
740	SAV	2015-10-08	45000.00	2
740	SAV	2015-10-09	45000.00	2

Open in new window

0
Brian CroweDatabase AdministratorCommented:
This is probably not the most efficient way to do this but it seems to satisfy your requirements.

IF OBJECT_ID('tempdb..#acct_temp') IS NOT NULL
	DROP TABLE #acct_temp;
GO

CREATE TABLE #acct_temp
(
	account_id		BIGINT,
	account_type	CHAR(3),
	balance_date	DATE,
	account_balance	MONEY
);

INSERT INTO #acct_temp
VALUES
	(3700,'SAV','2015-09-24',45000),
	(3700,'SAV','2015-09-25',45000),
	(3700,'SAV','2015-09-28',45000),
	(3700,'SAV','2015-09-29',45000),
	(3700,'SAV','2015-09-30',45000),
	(3700,'SAV','2015-10-01',45000),
	(3700,'SAV','2015-10-02',45000),
	(3700,'SAV','2015-10-05',45000),
	(3700,'SAV','2015-10-06',45000),
	(3700,'SAV','2015-10-07',43000),
	(3700,'SAV','2015-10-08',45000),
	(3700,'SAV','2015-10-09',45000),
	(3700,'SAV','2015-10-12',45000),
	(740,'SAV','2015-10-02',45000),
	(740,'SAV','2015-10-05',41000),
	(740,'SAV','2015-10-06',45000),
	(740,'SAV','2015-10-07',45000),
	(740,'SAV','2015-10-08',45000),
	(740,'SAV','2015-10-09',45000),
	(740,'SAV','2015-10-12',45000),
	(858,'SAV','2015-09-24',45000),
	(858,'SAV','2015-09-25',45000),
	(858,'SAV','2015-09-28',45100),
	(858,'SAV','2015-09-29',45000),
	(858,'SAV','2015-09-30',45300),
	(858,'SAV','2015-10-01',45000),
	(858,'SAV','2015-10-02',45000),
	(858,'SAV','2015-10-05',45700),
	(858,'SAV','2015-10-06',45000),
	(858,'SAV','2015-10-07',45000),
	(858,'SAV','2015-10-08',46100),
	(858,'SAV','2015-10-09',45000),
	(858,'SAV','2015-10-12',45000);

WITH cteAccount AS
(
	SELECT account_id, account_type, balance_date, account_balance,
		ROW_NUMBER() OVER(PARTITION BY account_id, account_type ORDER BY balance_date) AS RowNumber,
		LAG(account_balance, 1, NULL) OVER(PARTITION BY account_id, account_type ORDER BY balance_date) AS previous_balance,
		LEAD(account_balance, 1, NULL) OVER(PARTITION BY account_id, account_type ORDER BY balance_date) AS next_balance
	FROM #acct_temp
),
cteBalanceGroup AS
(
	SELECT account_id, account_type, balance_date, account_balance,
		RowNumber,
		CASE
			WHEN previous_balance IS NULL THEN RowNumber
			WHEN account_balance <> previous_balance THEN RowNumber
		END AS balance_ranking
	FROM cteAccount
),
cteBalanceRanking AS
(
	SELECT R1.account_id, R1.account_type, R1.balance_date, R1.account_balance,
		MAX(ISNULL(R1.Balance_ranking, R2.balance_ranking)) AS ranking
	FROM cteBalanceGroup AS R1
	LEFT OUTER JOIN cteBalanceGroup AS R2
		ON R1.account_id = R2.account_id
		AND R1.account_type = R2.account_type
		AND R1.RowNumber > R2.RowNumber
		AND R2.balance_ranking IS NOT NULL
	GROUP BY R1.account_id, R1.account_type, R1.balance_date, R1.account_balance
)
SELECT account_id, account_type, balance_date, account_balance,
	DENSE_RANK() OVER(PARTITION BY account_id, account_type ORDER BY ranking) AS ranking
FROM cteBalanceRanking

Open in new window

0
dbbishopAuthor Commented:
Sorry about the delay. I am at PASS Summit this week and will get to this next week.
0
Brian CroweDatabase AdministratorCommented:
I'm jealous...trying to arrange it for next year.
0
Vitor MontalvãoMSSQL Senior EngineerCommented:
dbbishop, do you still need help with this question?
0
dbbishopAuthor Commented:
Sorry for the delay folks. I was kind of overwhelmed after being out of the office for a week. I was very fortunate in that I was able to talk to one of the Microsoft guys at PASS who indicated he LOVED challenges. Although he did not post the solution here, he came up with one that does exactly what I was looking for.

Brian, yours was 'close' but I would have needed to do some additional coding to count the number of distinct values of ranking with grouping to make it work. The one thing I don't like about your solution is the LEFT JOIN on a bunch of unindexed columns, as there can be a LOT of data involved (i.e. millions of rows).

Here is the solution from the Microsoft support center that was set up at PASS:
;WITH WorkingTable (account_id, account_type, balance_date, account_balance, IsSameAsPrevBal) AS (
select *
,CASE WHEN account_balance * 2 = SUM(Account_balance) OVER (PARTITION BY account_id ORDER BY balance_date ROWS BETWEEN 1 PRECEDING AND CURRENT ROW ) THEN 0 ELSE 1 END AS IsSameAsPrevBal
from acct_temp
)
SELECT account_id, balance_date, account_balance, IsSameAsPrevBal,
      1 + SUM(IsSameAsPrevBal)
      OVER (PARTITION BY account_id  ORDER BY balance_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS StraightRank
INTO #AlmostThere
FROM WorkingTable
order by account_id, balance_date
 
SELECT account_id, balance_date, account_balance,
DENSE_RANK() OVER(PARTITION BY account_id, StraightRank ORDER BY balance_date)
FROM #AlmostThere
order by account_id, balance_date

Open in new window

I will likely accept it as the answer, but will leave this open for a couple of days for comments, suggestions or better solutions.
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Vitor MontalvãoMSSQL Senior EngineerCommented:
dbbishop, I think you can close it now, right?
0
dbbishopAuthor Commented:
Most appropriate answer.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.