Solved

remove duplicates in SQL for only specific columns and leaving one remaining row without blank address

Posted on 2007-03-22
Last Modified: 2010-03-20
I recently got a question answered here that helped a lot with removing duplicates from my SQL Server table. I now need further assistance with the same problem. Please see my previous question and the solution here:

http://www.experts-exchange.com/Programming/Languages/SQL_Syntax/Q_22423230.html

I'd now like to specify which duplicates to remove. So if I had the table below...

id (not primary key but unique) | name  | email         | password | address
1                               | tom   | tom@aol.com   | tomrules |
2                               | sue   | sue@aol.com   | sparkle  | 25 Oak st
3                               | harry | tom@aol.com   | tomrules | 354 Elm st
4                               | sally | sally@aol.com | frisky   | 98 Walnut St
5                               | susan | sue@aol.com   | sparkle  |

I'd like to remove all duplicates above where address is blank so that the resulting table would be....

id (not primary key but unique) | name  | email         | password | address
2                               | sue   | sue@aol.com   | sparkle  | 25 Oak st
3                               | harry | tom@aol.com   | tomrules | 354 Elm st
4                               | sally | sally@aol.com | frisky   | 98 Walnut St

Please give me a solution that fits the format of the previously accepted answer, as this worked best for me:

delete from members where id not in (select min(id) id from members group by (email+password))

Thank you very much

Question by:trevoray
LVL 5

Expert Comment

by:Steve Dubyo
ID: 18770099
Does this do what you want?

delete from members where id not in (select min(id) id from members where address is not null group by (email+password))
 

Author Comment

by:trevoray
ID: 18772428
OK, here's a problem. I should have mentioned this in the question. Most importantly, I need to get rid of duplicates. Sometimes there will be duplicates where the address field is filled out for both, and sometimes there will be duplicates where only one address field is filled out. I need the one with the address field filled out to stay. But if both duplicates have an address, or if neither has one, I still need one of the duplicates removed. The above code looks like it will only remove a duplicate if one of the address fields is blank. Can you help provide code that will remove duplicates (email/password) in this fashion?

Thanks
 

Author Comment

by:trevoray
ID: 18774281
what about this, does anyone think this would work?...

delete from members where id not in (select top 1 id from members group by (email+password) ORDER BY address)

that would put the NULL address up top and it would only select first row. I think this might work.

Author Comment

by:trevoray
ID: 18774351
No, that doesn't work because I can't order by a column that I'm not selecting in returned results. And I'm only allowed to select ID when I do "NOT IN"

Can anyone help?

tks
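
One possible way around the ORDER BY restriction, offered here as a sketch rather than something posted in this thread, is to drop the GROUP BY and use a correlated subquery instead, so that TOP 1 can order by address directly. It assumes SQL Server and the members table from the question; the alias m2 is made up for illustration, and treating an empty string the same as NULL is an assumption about what "blank" means.

```sql
-- Sketch only: assumes SQL Server and the members table from the question.
-- For each row, the subquery finds the one id to keep for that row's
-- (email, password) pair: rows with an address sort first, and ties are
-- broken by the lowest id. Every other row of the pair is deleted.
delete from members
where id <> (
    select top 1 m2.id
    from members m2
    where m2.email = members.email
      and m2.password = members.password
    order by
        -- 0 = has an address, 1 = blank; '' counted as blank is an assumption
        case when m2.address is null or m2.address = '' then 1 else 0 end,
        m2.id
);
-- Rows whose email or password is NULL match nothing in the subquery,
-- so the comparison is unknown and those rows are left untouched.
```

On the sample table this keeps ids 2, 3 and 4. Comparing email and password as separate columns, instead of grouping by (email+password), also avoids treating different pairs as equal when their concatenations happen to match.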
 
LVL 5

Accepted Solution

by:
Steve Dubyo earned 2000 total points
ID: 18775582
Yes, that is a bit more in depth! I got curious though and gave it a go, recreated the table and got this working...

It looks like a lot, but what it does is delete from the members table every row whose id does not match one of three rules:
1, Is a unique combination of email & pwd
2, Is the min(id) where email+pwd is not unique but the row has an address
3, Is the min(id) where email+pwd is not unique, the row has no address, and the combination isn't covered by rule 2

delete
from members
where id not in
      (
      select i
      from
            (
            select (email+password) e, min(id) i, count(id) c
            from members
            group by (email+password)
            ) s
      where c = 1
      union
      select  min(i) i
      from
            (
            select (email+password) e, id i
            from members a
            join
                  (
                  select count(m.id) c, (m.email+m.password) e
                  from members m
                  join
                        (
                        select count(id) c , (email+password) e
                        from members
                        group by (email+password)
                        ) t
                  on (m.email+m.password) = t.e
                  where c > 1  
                  group by (m.email+m.password)
                  ) d
            on (a.email+a.password) = e
            where address is null
            ) n
      where e not in
            (
                  select (email+password) e
                  from members a
                  join
                        (
                        select count(m.id) c, (m.email+m.password) e
                        from members m
                        join
                              (
                              select count(id) c , (email+password) e
                              from members
                              group by (email+password)
                              ) t
                        on (m.email+m.password) = t.e
                        where c > 1  
                        group by (m.email+m.password)
                        ) d
                  on (a.email+a.password) = e
                  where address is  not null
                  group by (email+password)
            )
      group by e
      union
      select  min(id) i
      from members a
      join
            (
            select count(m.id) c, (m.email+m.password) e
            from members m
            join
                  (
                  select count(id) c , (email+password) e
                  from members
                  group by (email+password)
                  ) t
            on (m.email+m.password) = t.e
            where c > 1  
            group by (m.email+m.password)
            ) d
      on (a.email+a.password) = e
      where address is  not null
      group by (email+password)
      )

I'm using SQL Server, by the way, in case the syntax differs. What are you using?
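
For comparison, on SQL Server 2005 or later the same three rules can probably be expressed more compactly with ROW_NUMBER(). This is a sketch under stated assumptions, not the accepted answer's method: the CTE name ranked and the column rn are made up for illustration, and an empty-string address is treated as blank alongside NULL, whereas the query above checks only for NULL.

```sql
-- Sketch only: requires SQL Server 2005 or later (CTEs and ROW_NUMBER).
-- Number the rows within each (email, password) pair so that rows with
-- an address come first, then the lowest id. Row 1 of each pair is kept.
with ranked as (
    select id,
           row_number() over (
               partition by email, password
               order by
                   -- 0 = has an address, 1 = blank ('' as blank is an assumption)
                   case when address is null or address = '' then 1 else 0 end,
                   id
           ) as rn
    from members
)
-- Deleting through the CTE removes the underlying members rows.
delete from ranked
where rn > 1;
```

A unique pair gets rn = 1 and is kept (rule 1); a duplicated pair keeps its lowest id with an address if one exists (rule 2), otherwise its lowest id overall (rule 3). If this is not the first statement in a batch, the preceding statement needs a terminating semicolon before with.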
 

Author Comment

by:trevoray
ID: 18775594
I'm using SQL Server. Thanks!
 
LVL 5

Expert Comment

by:Steve Dubyo
ID: 18775606
No problem !