Link to home
Create AccountLog in
Avatar of Lia Nungaray
Lia NungarayFlag for United States of America

asked on

How to use row_number() over (partition by)

Hello Experts, I need help understanding the row_number() over (partition by) clause. I have read several explanations but I don't understand it when it is used in my code. I have seen it in other tables to get the highest paid employee, most items sold, but in this code it doesn't make sense: (using Oracle DB)


ROW_NUMBER() OVER(PARTITION BY PROV_LAST_NAME, PROV_FIRST_NAME,PROV_MIDDLE_INITIAL ORDER BY PROV_FIRST_NAME)


Any help will be appreciated, thanks.

Avatar of Kent Olsen
Kent Olsen
Flag of United States of America image

Hi Lia,


Consider these data rows:


SMITH, BOB, A

SMITH, BOB, B

JONES, FRED, C

SMITH, BOB, B


The row_number() example that you've shown will add a column to the results that applies a number to each row based on the number of times it has occurred.


1,SMITH, BOB, A

1,SMITH, BOB, B

1,JONES, FRED, C

2,SMITH, BOB, B


There were two rows with "SMITH, BOB, B" as the name.  The first occurrence is 1, second is 2, etc.  That process is applied to all values in the queries result.


Kent


ASKER CERTIFIED SOLUTION
Avatar of slightwv (䄆 Netminder)
slightwv (䄆 Netminder)

Link to home
membership
Create an account to see this answer
Signing up is free. No credit card required.
Create Account

I think you may be asking "Why has someone used this?", which of course is really difficult to answer.

The snippet is (I think) unlike the other examples given ( highest paid employee, most items sold etc.)
I suspect that whoever introduced that snippet into a query may have been looking for "possible duplicates" in a table of names etc. and in such use cases often the row numbers greater than 1 are the rows of most interest e.g.:

1,SMITH, BOB, A

1,SMITH, BOB, B

2,SMITH, BOB, B   <-- could this be a duplicate? are there other columns that assist here (e.g. address? age?)

1,JONES, FRED, C


Of course this is pure conjecture on my part, but hopefully it supplies a possible rationale.

Hi,

First thing first. What is it that you are trying to achive with your query ?

The row_number() analytic clause adds a increasing number starting from 1 to N to the rows within each partition group of columns that you provide and in the order of the order by column provided.
In your query above and if I use the data provided by Kent and add some more data for clarification you can see that this query gives you the number of duplicate rows in each partition group that you set up. See this dbfiddle example for more clarification https://dbfiddle.uk/dPHK71Eu

Also, take a look at these links below to understand how the analytical window function can be used.

https://oracle-base.com/articles/misc/analytic-functions
https://oracle-base.com/articles/misc/row_number-analytic-function#row_number
https://modern-sql.com/caniuse/over_partition_by

Best regards,
    Tomas Helgi