Solved

remove duplicates in SQL for only specific columns and leaving one remaining row without blank address

Posted on 2007-03-22
Last Modified: 2010-03-20
I recently got a question answered on here that helped a lot with removing duplicates from my SQL Server table. I now need further assistance with the same problem. Please see my previous question and the solution here:

http://www.experts-exchange.com/Programming/Languages/SQL_Syntax/Q_22423230.html

I'd now like to specify which duplicates to remove. So if I had the table below...

id (not primary key but unique) | name  | email         | password | address
1                               | tom   | tom@aol.com   | tomrules |
2                               | sue   | sue@aol.com   | sparkle  | 25 Oak st
3                               | harry | tom@aol.com   | tomrules | 354 Elm st
4                               | sally | sally@aol.com | frisky   | 98 Walnut St
5                               | susan | sue@aol.com   | sparkle  |

I'd like to remove all duplicates above where address is blank so that the resulting table would be....

id (not primary key but unique) | name  | email         | password | address
2                               | sue   | sue@aol.com   | sparkle  | 25 Oak st
3                               | harry | tom@aol.com   | tomrules | 354 Elm st
4                               | sally | sally@aol.com | frisky   | 98 Walnut St

Please give me a solution that fits the format of the previously accepted answer, as this worked best for me:

delete from members where id not in (select min(id) id from members group by (email+password))

Thank you very much

Question by:trevoray
LVL 5

Expert Comment

by:Steve Dubyo
ID: 18770099
Does this do what you want?

delete from members where id not in (select min(id) id from members where address is not null group by (email+password))

Author Comment

by:trevoray
ID: 18772428
OK, here's a problem that I should have mentioned in the question. Most importantly, I need to get rid of duplicates. Sometimes both duplicates will have the address field filled out, and sometimes only one of them will. When only one has an address, I need that one to stay. But if both duplicates have an address, or neither does, I still need one of the duplicates removed. The above code looks like it will only remove a duplicate if one of the address fields is blank. Can you help provide code that removes duplicates (same email/password) in this fashion?

Thanks
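
The rule described above (keep the row that has an address, otherwise keep any one row of the group) can be sketched in the same "where id not in (...)" format. This is only a sketch under assumptions: the #members scratch table and its column types are made up for illustration, blank addresses are assumed to be stored as NULL (wrap address in nullif(address, '') if they are empty strings), and it groups by the two columns separately instead of by (email+password).

```sql
-- Hypothetical scratch copy of the sample data; column types are assumptions.
create table #members (id int not null, name varchar(50), email varchar(100),
                       password varchar(50), address varchar(100));

insert into #members (id, name, email, password, address) values (1, 'tom',   'tom@aol.com',   'tomrules', null);
insert into #members (id, name, email, password, address) values (2, 'sue',   'sue@aol.com',   'sparkle',  '25 Oak st');
insert into #members (id, name, email, password, address) values (3, 'harry', 'tom@aol.com',   'tomrules', '354 Elm st');
insert into #members (id, name, email, password, address) values (4, 'sally', 'sally@aol.com', 'frisky',   '98 Walnut St');
insert into #members (id, name, email, password, address) values (5, 'susan', 'sue@aol.com',   'sparkle',  null);

-- Per (email, password) group, keep the lowest id that has an address;
-- if no row in the group has one, fall back to the lowest id overall.
delete from #members
where id not in
      (
      select coalesce(min(case when address is not null then id end), min(id))
      from #members
      group by email, password
      );

-- For the sample rows this leaves ids 2, 3 and 4.
select id, name, email, password, address from #members order by id;

drop table #members;
```

Grouping by email, password rather than the concatenation avoids treating, say, 'ab' + 'c' and 'a' + 'bc' as the same key. Running the inner select on its own first is a cheap way to check which ids would survive before deleting anything.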

Author Comment

by:trevoray
ID: 18774281
What about this? Does anyone think this would work?

delete from members where id not in (select top 1 id from members group by (email+password) ORDER BY address)

That would put the NULL address up top, and it would only select the first row. I think this might work.

Author Comment

by:trevoray
ID: 18774351
No, that doesn't work, because I can't order by a column that I'm not selecting in the returned results, and I'm only allowed to select id when I use NOT IN.

Can anyone help?

tks
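
One way the TOP 1 idea might be made to run, offered as a sketch and not a tested answer for the real table: correlate the subquery to each row's email and password so that TOP 1 is evaluated once per duplicate group instead of once for the whole table, and drop the GROUP BY so ORDER BY is free to use address. SQL Server sorts NULLs first in ascending order, so a plain ORDER BY address would favor the blank row; the CASE expression below puts rows with an address first instead. The #members scratch table, its column types, and the aliases m and m2 are assumptions made for illustration, and blank addresses are assumed to be NULL.

```sql
-- Hypothetical scratch copy of the sample data; column types are assumptions.
create table #members (id int not null, name varchar(50), email varchar(100),
                       password varchar(50), address varchar(100));

insert into #members (id, name, email, password, address) values (1, 'tom',   'tom@aol.com',   'tomrules', null);
insert into #members (id, name, email, password, address) values (2, 'sue',   'sue@aol.com',   'sparkle',  '25 Oak st');
insert into #members (id, name, email, password, address) values (3, 'harry', 'tom@aol.com',   'tomrules', '354 Elm st');
insert into #members (id, name, email, password, address) values (4, 'sally', 'sally@aol.com', 'frisky',   '98 Walnut St');
insert into #members (id, name, email, password, address) values (5, 'susan', 'sue@aol.com',   'sparkle',  null);

-- For each row, find the one id its group should keep:
-- rows with an address sort first, ties are broken by the lowest id.
-- Any row whose id differs from that keeper is deleted.
delete m
from #members m
where m.id <>
      (
      select top 1 m2.id
      from #members m2
      where m2.email = m.email
        and m2.password = m.password
      order by case when m2.address is null then 1 else 0 end, m2.id
      );

-- For the sample rows this leaves ids 2, 3 and 4.
select id, name, email, password, address from #members order by id;

drop table #members;
```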
LVL 5

Accepted Solution

by:
Steve Dubyo earned 2000 total points
ID: 18775582
Yeah, that is a bit more in-depth! I got curious, though, and gave it a go: I recreated the table and got this working...

It looks like a lot, but what it does is delete from the members table every row whose id does not match one of three rules:
1, It belongs to a unique combination of email & pwd
2, It is the min(id) among the rows that have an address, where email+pwd is not unique
3, It is the min(id) where email+pwd is not unique and no row has an address (so the combination isn't covered by rule 2)

delete
from members
where id not in
      (
      -- rule 1: email+password combinations that occur only once
      select i
      from
            (
            select (email+password) e, min(id) i, count(id) c
            from members
            group by (email+password)
            ) s
      where c = 1
      union
      -- rule 3: duplicated combinations where no row has an address
      select  min(i) i
      from
            (
            select (email+password) e, id i
            from members a
            join
                  (
                  select count(m.id) c, (m.email+m.password) e
                  from members m
                  join
                        (
                        select count(id) c , (email+password) e
                        from members
                        group by (email+password)
                        ) t
                  on (m.email+m.password) = t.e
                  where c > 1  
                  group by (m.email+m.password)
                  ) d
            on (a.email+a.password) = e
            where address is null
            ) n
      where e not in
            (
                  select (email+password) e
                  from members a
                  join
                        (
                        select count(m.id) c, (m.email+m.password) e
                        from members m
                        join
                              (
                              select count(id) c , (email+password) e
                              from members
                              group by (email+password)
                              ) t
                        on (m.email+m.password) = t.e
                        where c > 1  
                        group by (m.email+m.password)
                        ) d
                  on (a.email+a.password) = e
                  where address is  not null
                  group by (email+password)
            )
      group by e
      union
      -- rule 2: duplicated combinations, lowest id among rows with an address
      select  min(id) i
      from members a
      join
            (
            select count(m.id) c, (m.email+m.password) e
            from members m
            join
                  (
                  select count(id) c , (email+password) e
                  from members
                  group by (email+password)
                  ) t
            on (m.email+m.password) = t.e
            where c > 1  
            group by (m.email+m.password)
            ) d
      on (a.email+a.password) = e
      where address is  not null
      group by (email+password)
      )

I'm using SQL Server, by the way, in case the syntax differs. What are you using?
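
For comparison, the three rules above can probably be collapsed into a single ranking step on SQL Server 2005 or later, where ROW_NUMBER() is available. This is a hedged sketch, not a replacement for the accepted query: it assumes blank addresses are NULL, the #members scratch table, its column types, and the ranked and rn names are made up for illustration, and it partitions by the two columns separately instead of by (email+password).

```sql
-- Hypothetical scratch copy of the sample data; column types are assumptions.
create table #members (id int not null, name varchar(50), email varchar(100),
                       password varchar(50), address varchar(100));

insert into #members (id, name, email, password, address) values (1, 'tom',   'tom@aol.com',   'tomrules', null);
insert into #members (id, name, email, password, address) values (2, 'sue',   'sue@aol.com',   'sparkle',  '25 Oak st');
insert into #members (id, name, email, password, address) values (3, 'harry', 'tom@aol.com',   'tomrules', '354 Elm st');
insert into #members (id, name, email, password, address) values (4, 'sally', 'sally@aol.com', 'frisky',   '98 Walnut St');
insert into #members (id, name, email, password, address) values (5, 'susan', 'sue@aol.com',   'sparkle',  null);

-- Number the rows inside each (email, password) group:
-- rows with an address come first, then the lowest id.
-- Row number 1 is the keeper; everything else in the group is deleted.
with ranked as
      (
      select id,
             row_number() over (partition by email, password
                                order by case when address is null then 1 else 0 end, id) as rn
      from #members
      )
delete from ranked
where rn > 1;

-- For the sample rows this leaves ids 2, 3 and 4.
select id, name, email, password, address from #members order by id;

drop table #members;
```

Unique combinations get rn = 1 automatically, so rule 1 needs no separate branch. Swapping the delete for a select from ranked shows which rows would go before anything is removed.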

Author Comment

by:trevoray
ID: 18775594
I'm using SQL Server. Thanks!
LVL 5

Expert Comment

by:Steve Dubyo
ID: 18775606
No problem!