Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
?
Solved

SELECT DISTINCT for two columns be get all the data.

Posted on 2007-03-24
9
Medium Priority
?
6,133 Views
Last Modified: 2012-06-22
I have 5 columns, A,B,C,D, and E.

THere are possible duplicates only in column A and B. I want to create a new table that has only unique values based on A & B, but has all the columns of data. (i.e., On a given row, x, the values of A and B are not the same as any other row's  A and B column).

I tried, SELECT DISTINCT A,B FROM list; but it is not an option as it only gives me columns A and B.

How do I get all the columns where columns A & B are DISTINCT?
0
Comment
Question by:DrDamnit
  • 3
  • 3
  • 2
  • +1
9 Comments
 
LVL 10

Assisted Solution

by:ksaul
ksaul earned 1000 total points
ID: 18786534
You would have to use an aggregate function to decide which value you want for the other columns, for example:

SELECT A, B, MAX(C) AS C, MAX(D) AS D, MAX(E) AS E
FROM list
GROUP BY A, B
0
 
LVL 32

Author Comment

by:DrDamnit
ID: 18786552
Ok, let me re-phrase....I want to remove all rows where columns A and B are the same as any other row so that it will leave me with one row.
0
 
LVL 10

Expert Comment

by:ksaul
ID: 18786602
That is the only way I know how to do it.  If there are two rows where A and B are the same C, D, and E may not be the same - so you have to decide which values you want.  If C, D and E happen to be the same it won't matter.  The above query will return one row for every instance of A and B and the maximum value of C, D and E for each.  You can use the aggregate query to selecte the rows into a new table and then delete all rows from the original and then insert them from new table.

SELECT A, B, MAX(C) AS C, MAX(D) AS D, MAX(E) AS E
INTO list_temp
FROM list
GROUP BY A, B

DELETE FROM list

INSERT INTO list
SELECT *
FROM list_temp

Backup your data first, of course.
0
NEW Veeam Agent for Microsoft Windows

Backup and recover physical and cloud-based servers and workstations, as well as endpoint devices that belong to remote users. Avoid downtime and data loss quickly and easily for Windows-based physical or public cloud-based workloads!

 
LVL 9

Expert Comment

by:dbeneit
ID: 18786926
Ksaul,
using yours selects:

delete list from list L left join
(SELECT A, B, MAX(C) AS C, MAX(D) AS D, MAX(E) AS E
FROM list
GROUP BY A, B) as remaind on L.a=remaind.a and L.b=remaind.b and L.c=remaind.c and L.d=remaind.d and L.e=remaind.e where remaind.a is null

delete those rows that they aren't relationated with the rows that we found
0
 
LVL 9

Expert Comment

by:dbeneit
ID: 18786929
you can test the sql :
----------------vvvvvv---------------
/*delete list*/ select *  from list L left join
(SELECT A, B, MAX(C) AS C, MAX(D) AS D, MAX(E) AS E
FROM list
GROUP BY A, B) as remaind on L.a=remaind.a and L.b=remaind.b and L.c=remaind.c and L.d=remaind.d and L.e=remaind.e where remaind.a is null

0
 
LVL 32

Author Comment

by:DrDamnit
ID: 18786937
Ok, riddle me this.... Will these select statements give me the max value out of the entire column C, or just the max value of the set of "duplicates" as discarded by the distinct statement?
0
 
LVL 9

Accepted Solution

by:
dbeneit earned 1000 total points
ID: 18786948
the max value for each pair  A,B but ksaul said you before.

ksaul:>>If C, D and E happen to be the same it won't matter.  The above query will return one row for every instance of A and B and the maximum value of C, D and E for each.
0
 
LVL 32

Author Comment

by:DrDamnit
ID: 18786978
What then does

SELECT * FROM table GROUP BY A,B do then?
0
 
LVL 58

Expert Comment

by:harfang
ID: 18787206
Hello DrDamnit,

As ksaul explained, you need to decide which values you want from each group where A and B are the same. You can use Max(), First(), etc. for single values or Sum(), Avg(), etc. to consolidate numerical values within each group, on the columns C, D, and E.

However, if you want C, D, and E to come from the same record, you need to select the first record of each group using any sort order, but you will need an ID column for that.

SELECT * FROM table
WHERE ID = (
    Select Top 1 T.ID
    From table T
    Where T.A = table.A And T.B = table.B
    Order by ID
)

This will show you the record with the lowest ID within each group. To actually get rid of the duplicates, you can change that query into a make-table query.

About your last question, you cannot use the * in a GROUP BY query. If you think about it, this would have no meaning.

Cheers!
(°v°)
0

Featured Post

Free Tool: Site Down Detector

Helpful to verify reports of your own downtime, or to double check a downed website you are trying to access.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

One of the most important things in an application is the query performance. This article intends to give you good tips to improve the performance of your queries.
This shares a stored procedure to retrieve permissions for a given user on the current database or across all databases on a server.
In this video, Percona Solution Engineer Dimitri Vanoverbeke discusses why you want to use at least three nodes in a database cluster. To discuss how Percona Consulting can help with your design and architecture needs for your database and infras…
In this video, Percona Solution Engineer Rick Golba discuss how (and why) you implement high availability in a database environment. To discuss how Percona Consulting can help with your design and architecture needs for your database and infrastr…

578 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question