We help IT Professionals succeed at work.

How to order MySQL rows with duplicate columns

Jonathan Greenberg
Jonathan Greenberg used Ask the Experts™
on
I have a table with lots of columns, but I'll just refer to 3 of them for the sake of this question.

"pid" is the table's primary key. "oid" is a copy of "pid" when a row of data is first inserted. If a row is edited by my client, I have my web app set up to first duplicate the row so that the original is never overwritten. In this case, "pid" increments, but "oid" remains the same.

Here's an example of my table, returned with the following SELECT statement:

SELECT * FROM mytable ORDER BY pid

pid   oid   name
1      1      Somename
2      2      Othername
3      2      Othername
4      4      Newername

I need to write a SELECT statement that will return only the higher "oid" row when there is more than one with the same value. I've tried using "GROUP BY oid", which brings me frustratingly close to what I need. "GROUP BY oid" returns only one row whenever there's a duplicate "oid", but it returns the row with the lowest value "oid", not the highest value. This is what "GROUP BY oid" returns:

SELECT * FROM mytable GROUP BY oid ORDER BY pid

pid   oid   name
1      1      Somename
2      2      Othername
4      4      Newername

This is what I need to return:

pid   oid   name
1      1      Somename
3      2      Othername
4      4      Newername

What SELECT statement can I use to to return only the row with the highest value for "oid" when duplicates are found? Thanks!
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
leakim971Multitechnician
Top Expert 2014

Commented:
SELECT m.pid,m.oid,name FROM mytable m , (SELECT MAX(pid) pid,oid FROM mytable GROUP BY oid) m1 WHERE m.pid = m1.pid ORDER BY pid,oid,name

Open in new window

Author

Commented:
Thanks, leakim971.

The only problem with this is that in my real table, there are about 100 columns. Can this statement be written with an asterisk, or do I need to actually name each column in the table?
Multitechnician
Top Expert 2014
Commented:
yes of course
SELECT m.* FROM mytable m , (SELECT MAX(pid) pid,oid FROM mytable GROUP BY oid) m1 WHERE m.pid = m1.pid ORDER BY pid,oid,name

Open in new window

Author

Commented:
Thank you, that does the trick. Just one question, and then I'll let you get back to your own life. :)

I assume the 'm' is an alias for the outer query, and the 'm1' is an alias for the sub-query. Is that correct? If you would explain and/or point me to a page that describes this a bit, I'd really appreciate it.

Thanks for your help!
leakim971Multitechnician
Top Expert 2014

Commented:
you're a bit correct, the terms subquery is not the right one
google for derived table, for example :
http://www.mysqlperformanceblog.com/2006/08/31/derived-tables-and-views-performance/
Distinguished Expert 2017

Commented:
m is an alias to mytable
ml is the alias for the table that resulted from the select Max(pid) pid,oid from my table group by oid) in which max(pid) is aliased to pid

select column names,max(pid) as pid from mytable group by oid order by max(pid)

Author

Commented:
Thanks Arnold, that's very helpful, and you're kind to chime in.

Leakim, thanks for the great solution.