Oracle's dense_rank() in PostgreSQL

Posted on 2004-11-03
Medium Priority
Last Modified: 2008-01-09
I'm trying to convert a bit of SQL that I got in the Oracle forum to PostgreSQL.  My primary DBMS is Oracle but I'd like to see this run in PostgreSQL if possible.

Here is my scenario:

--create table xref (
--  combined  varchar(20),
--  memberid     varchar(20)
--insert into xref(combined, memberid) values ('1', '111');
--insert into xref(combined, memberid) values ('1', '222');
--insert into xref(combined, memberid) values ('1', '333');
--insert into xref(combined, memberid) values ('1', '444');
--insert into xref(combined, memberid) values ('2', '555');
--insert into xref(combined, memberid) values ('2', '666');
--insert into xref(combined, memberid) values ('2', '777');
--insert into xref(combined, memberid) values ('2', '888');
--insert into xref(combined, memberid) values ('2', '999');
--insert into xref(combined, memberid) values ('3', '987');

SELECT newid, oldid
  FROM (
         SELECT memberid oldid,
                last_value(memberid) over (
                                         PARTITION BY combined ORDER BY memberid
                                         ROWS BETWEEN CURRENT ROW AND unbounded
                                        ) newid
           FROM xref

--NEWID         OLDID
--------------- -------------
--444           111
--444           222
--444           333
--444           444
--999           555
--999           666
--999           777
--999           888
--999           999
--987           987
--10 rows selected

Any ideas?

Question by:cmulcahy
  • 3
  • 2
LVL 22

Accepted Solution

earth man2 earned 1000 total points
ID: 12485489
You're a born optimist !  This level of analytic functionality is not currently built into Postgresql.  That's what you pay your money for !

Author Comment

ID: 12492810
Of course I'm an optimist.  If I were to face reality all the time, I am not sure I could get out of bed each day.

Is there a way to produce the output I'm after without dense_rank()?  I want to take the input shown, of all of the similar COMBINED fields, take the alphabetically last one as the NEWID and use all of the values like it as OLDID?

I'm thinking subselects.  Now that I think about it a bit more, I think I may have the answer but will need to do some testing.  I still welcome answers here!


Author Comment

ID: 12493726
I would still like to get dense_rank()-like functionality out of PostgreSQL, but..

select sNew.newMemberID, xref.memberid
  from xref, (select combined, max(memberid) as newMemberID
                   from xref
                  group by combined) sNew
 where xref.combined = sNew.combined;

This produces the output achieved with dense_rank() in Oracle, albeit a bit less efficiently, which is a bit of a problem with my 14 million row data set but at least it produces that output.

If there were an add-on module or plpgsql script that would produce the output I get from dense_rank(), I would be a very happy guy.

Microsoft Certification Exam 74-409

Veeam® is happy to provide the Microsoft community with a study guide prepared by MVP and MCT, Orin Thomas. This guide will take you through each of the exam objectives, helping you to prepare for and pass the examination.


Expert Comment

ID: 12494937
What exactly does dense_rank() do? FWIW, you can build your own aggregate functions using PL/pgSQL and CREATE AGGREGATE. Is that what you're looking for?
LVL 22

Expert Comment

by:earth man2
ID: 12495930
You can improve performance for this type of problem using set returning functions.   The programming goes on behind the stored procedure interface instead of within the SQL statement.

Author Comment

ID: 12496853
rank() and dense_rank() in Oracle assign a ranking to each row in a result set:

Given the data (arbitrarily chosen):
  Keyid     Value
  1            Blue
  9            Green
  2            Apple

select dense_rank(), value, keyid from the_data order by value;

  1     Apple     2
  2     Blue       1
  3     Green    9

It just numbers, sequentially, the rows in the result set.  If I had two Blue values in the result set, it would have returned 2  Blue twice.  It will duplicate the rank for duplicated values.

rank() does much the same thing, except when it encounters duplicates, it will assign the second value the same rank as the first, but skip the rank that it would have had.  For example, rank 2 would have been duplicated but 3 would have been skipped.


I'm a decent C programmer, though I'm much better with C++.  Anyone care to at least get me started writing a function?  Of course, a plpgsql function is always welcome!


Featured Post

Important Lessons on Recovering from Petya

In their most recent webinar, Skyport Systems explores ways to isolate and protect critical databases to keep the core of your company safe from harm.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Many developers have database experience, but are new to PostgreSQL. It has some truly inspiring capabilities. I have several years' experience with Microsoft's SQL Server. When I began working with MySQL, I wanted a quick-reference to MySQL (htt…
By, Vadim Tkachenko. In this article we’ll look at ClickHouse on its one year anniversary.
Steps to create a PostgreSQL RDS instance in the Amazon cloud. We will cover some of the default settings and show how to connect to the instance once it is up and running.
As many of you are aware about Scanpst.exe utility which is owned by Microsoft itself to repair inaccessible or damaged PST files, but the question is do you really think Scanpst.exe is capable to repair all sorts of PST related corruption issues?
Suggested Courses
Course of the Month13 days, 18 hours left to enroll

807 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question