asked on

Oracle's dense_rank() in PostgreSQL

I'm trying to convert a bit of SQL that I got in the Oracle forum to PostgreSQL. My primary DBMS is Oracle but I'd like to see this run in PostgreSQL if possible.

Here is my scenario:

--Given:
--create table xref (
-- combined varchar(20),
-- memberid varchar(20)
--);
--
--insert into xref(combined, memberid) values ('1', '111');
--insert into xref(combined, memberid) values ('1', '222');
--insert into xref(combined, memberid) values ('1', '333');
--insert into xref(combined, memberid) values ('1', '444');
--insert into xref(combined, memberid) values ('2', '555');
--insert into xref(combined, memberid) values ('2', '666');
--insert into xref(combined, memberid) values ('2', '777');
--insert into xref(combined, memberid) values ('2', '888');
--insert into xref(combined, memberid) values ('2', '999');
--insert into xref(combined, memberid) values ('3', '987');

SELECT newid, oldid
FROM (
SELECT memberid oldid,
last_value(memberid) over (
PARTITION BY combined ORDER BY memberid
ROWS BETWEEN CURRENT ROW AND unbounded
following
) newid
FROM xref
);

--NEWID OLDID
--------------- -------------
--444 111
--444 222
--444 333
--444 444
--999 555
--999 666
--999 777
--999 888
--999 999
--987 987
--10 rows selected
--
--

Any ideas?

Thanks!

ASKER CERTIFIED SOLUTION

earth man2

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

cmulcahy

ASKER

Of course I'm an optimist. If I were to face reality all the time, I am not sure I could get out of bed each day.

Is there a way to produce the output I'm after without dense_rank()? I want to take the input shown, of all of the similar COMBINED fields, take the alphabetically last one as the NEWID and use all of the values like it as OLDID?

I'm thinking subselects. Now that I think about it a bit more, I think I may have the answer but will need to do some testing. I still welcome answers here!

Thanks!

cmulcahy

ASKER

I would still like to get dense_rank()-like functionality out of PostgreSQL, but..

select sNew.newMemberID, xref.memberid
from xref, (select combined, max(memberid) as newMemberID
from xref
group by combined) sNew
where xref.combined = sNew.combined;

This produces the output achieved with dense_rank() in Oracle, albeit a bit less efficiently, which is a bit of a problem with my 14 million row data set but at least it produces that output.

If there were an add-on module or plpgsql script that would produce the output I get from dense_rank(), I would be a very happy guy.

Thanks!

rjkimble

What exactly does dense_rank() do? FWIW, you can build your own aggregate functions using PL/pgSQL and CREATE AGGREGATE. Is that what you're looking for?

earth man2

You can improve performance for this type of problem using set returning functions. The programming goes on behind the stored procedure interface instead of within the SQL statement.

cmulcahy

ASKER

rank() and dense_rank() in Oracle assign a ranking to each row in a result set:

Given the data (arbitrarily chosen):
Keyid Value
1 Blue
9 Green
2 Apple

select dense_rank(), value, keyid from the_data order by value;

1 Apple 2
2 Blue 1
3 Green 9

It just numbers, sequentially, the rows in the result set. If I had two Blue values in the result set, it would have returned 2 Blue twice. It will duplicate the rank for duplicated values.

rank() does much the same thing, except when it encounters duplicates, it will assign the second value the same rank as the first, but skip the rank that it would have had. For example, rank 2 would have been duplicated but 3 would have been skipped.

See:
http://www.oracle-base.com/articles/9i/RankDenseRankFirstLastAnalyticFunctions.php

I'm a decent C programmer, though I'm much better with C++. Anyone care to at least get me started writing a function? Of course, a plpgsql function is always welcome!

Thanks!