Solved

SELECT DISTINCT for two columns be get all the data.

Posted on 2007-03-24
9
6,111 Views
Last Modified: 2012-06-22
I have 5 columns, A,B,C,D, and E.

THere are possible duplicates only in column A and B. I want to create a new table that has only unique values based on A & B, but has all the columns of data. (i.e., On a given row, x, the values of A and B are not the same as any other row's  A and B column).

I tried, SELECT DISTINCT A,B FROM list; but it is not an option as it only gives me columns A and B.

How do I get all the columns where columns A & B are DISTINCT?
0
Comment
Question by:DrDamnit
  • 3
  • 3
  • 2
  • +1
9 Comments
 
LVL 10

Assisted Solution

by:ksaul
ksaul earned 250 total points
ID: 18786534
You would have to use an aggregate function to decide which value you want for the other columns, for example:

SELECT A, B, MAX(C) AS C, MAX(D) AS D, MAX(E) AS E
FROM list
GROUP BY A, B
0
 
LVL 32

Author Comment

by:DrDamnit
ID: 18786552
Ok, let me re-phrase....I want to remove all rows where columns A and B are the same as any other row so that it will leave me with one row.
0
 
LVL 10

Expert Comment

by:ksaul
ID: 18786602
That is the only way I know how to do it.  If there are two rows where A and B are the same C, D, and E may not be the same - so you have to decide which values you want.  If C, D and E happen to be the same it won't matter.  The above query will return one row for every instance of A and B and the maximum value of C, D and E for each.  You can use the aggregate query to selecte the rows into a new table and then delete all rows from the original and then insert them from new table.

SELECT A, B, MAX(C) AS C, MAX(D) AS D, MAX(E) AS E
INTO list_temp
FROM list
GROUP BY A, B

DELETE FROM list

INSERT INTO list
SELECT *
FROM list_temp

Backup your data first, of course.
0
 
LVL 9

Expert Comment

by:dbeneit
ID: 18786926
Ksaul,
using yours selects:

delete list from list L left join
(SELECT A, B, MAX(C) AS C, MAX(D) AS D, MAX(E) AS E
FROM list
GROUP BY A, B) as remaind on L.a=remaind.a and L.b=remaind.b and L.c=remaind.c and L.d=remaind.d and L.e=remaind.e where remaind.a is null

delete those rows that they aren't relationated with the rows that we found
0
Free Trending Threat Insights Every Day

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

 
LVL 9

Expert Comment

by:dbeneit
ID: 18786929
you can test the sql :
----------------vvvvvv---------------
/*delete list*/ select *  from list L left join
(SELECT A, B, MAX(C) AS C, MAX(D) AS D, MAX(E) AS E
FROM list
GROUP BY A, B) as remaind on L.a=remaind.a and L.b=remaind.b and L.c=remaind.c and L.d=remaind.d and L.e=remaind.e where remaind.a is null

0
 
LVL 32

Author Comment

by:DrDamnit
ID: 18786937
Ok, riddle me this.... Will these select statements give me the max value out of the entire column C, or just the max value of the set of "duplicates" as discarded by the distinct statement?
0
 
LVL 9

Accepted Solution

by:
dbeneit earned 250 total points
ID: 18786948
the max value for each pair  A,B but ksaul said you before.

ksaul:>>If C, D and E happen to be the same it won't matter.  The above query will return one row for every instance of A and B and the maximum value of C, D and E for each.
0
 
LVL 32

Author Comment

by:DrDamnit
ID: 18786978
What then does

SELECT * FROM table GROUP BY A,B do then?
0
 
LVL 58

Expert Comment

by:harfang
ID: 18787206
Hello DrDamnit,

As ksaul explained, you need to decide which values you want from each group where A and B are the same. You can use Max(), First(), etc. for single values or Sum(), Avg(), etc. to consolidate numerical values within each group, on the columns C, D, and E.

However, if you want C, D, and E to come from the same record, you need to select the first record of each group using any sort order, but you will need an ID column for that.

SELECT * FROM table
WHERE ID = (
    Select Top 1 T.ID
    From table T
    Where T.A = table.A And T.B = table.B
    Order by ID
)

This will show you the record with the lowest ID within each group. To actually get rid of the duplicates, you can change that query into a make-table query.

About your last question, you cannot use the * in a GROUP BY query. If you think about it, this would have no meaning.

Cheers!
(°v°)
0

Featured Post

Find Ransomware Secrets With All-Source Analysis

Ransomware has become a major concern for organizations; its prevalence has grown due to past successes achieved by threat actors. While each ransomware variant is different, we’ve seen some common tactics and trends used among the authors of the malware.

Join & Write a Comment

As a database administrator, you may need to audit your table(s) to determine whether the data types are optimal for your real-world data needs.  This Article is intended to be a resource for such a task. Preface The other day, I was involved …
I have been using r1soft Continuous Data Protection (http://www.r1soft.com/linux-cdp/) for many years now with the mySQL Addon and wanted to share a trick I have used several times. For those of us that don't have the luxury of using all transact…
Illustrator's Shape Builder tool will let you combine shapes visually and interactively. This video shows the Mac version, but the tool works the same way in Windows. To follow along with this video, you can draw your own shapes or download the file…
When you create an app prototype with Adobe XD, you can insert system screens -- sharing or Control Center, for example -- with just a few clicks. This video shows you how. You can take the full course on Experts Exchange at http://bit.ly/XDcourse.

758 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

19 Experts available now in Live!

Get 1:1 Help Now