Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
?
Solved

Finding , removing MSSQL Database with Duplicate Entries

Posted on 2011-10-16
11
Medium Priority
?
280 Views
Last Modified: 2012-08-13
I have a MSSQL DB with fields

autoid
firstname
lastname
emailaddress

We have been doing Data Entry with forms and found that some people have filled numerous forms and they have gone to various departments.

Problem is, emailaddress is the only unique field and autoid.

How do I query this DB to find out the entries with the same firstname and last name and delete all but 1 entry?
0
Comment
Question by:souldj
  • 6
  • 4
11 Comments
 
LVL 23

Expert Comment

by:Rajkumar Gs
ID: 36976057
What I understood is you want to remove dupicates based on first name & last name, without considering email address. If correct, you may try this query.

here is a temporary table and sample data, which is safe than trying this query directly to your actual table. Execute one by one and plz check
create table #table
(
autoid		int identity(1,1),
firstname	varchar(20),
lastname	varchar(20),
emailaddress	varchar(60)
)


insert into #table 
select 'Johnson', 'A', 'johnson.a@gmail.com' union all
select 'Johnson', 'A', 'johnson.a@email.com' union all
select 'Johnson', 'A', 'johnson.a@gmail.com' union all
select 'Albert', 'A', 'alert.a@gmail.com'

select * from #table

-- Main query (delete duplicates based on first name & last name - not checking email address)
with cte as(
select firstname,lastname, ROW_NUMBER() over(partition by firstname, lastname order by autoid) Serial from #table
) 
DELETE FROM CTE WHERE Serial > 1

-- now check data after delete duplicate based on first name  & lastname
select * from #table

drop table #table

Open in new window


Let me know your result
Raj
0
 
LVL 1

Author Comment

by:souldj
ID: 36976107
Did you make a spelling mistake with "CTE" ?
0
 
LVL 23

Expert Comment

by:Rajkumar Gs
ID: 36976113
'CTE' is the name that I have given. You can give any name instead of that.

Important thing #table should be your actual table name, if you plan to test with you real table.

What about this query ?

Raj
0
Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 36976356
>>Did you make a spelling mistake with "CTE" ? <<
You are just missing a ; prior to that line. So either this:
select * from #table;
-- Main query (delete duplicates based on first name & last name - not checking email address)
with cte as(

Or this:
select * from #table

-- Main query (delete duplicates based on first name & last name - not checking email address)
;with cte as(
0
 
LVL 1

Author Comment

by:souldj
ID: 36977829
I tried the SQL with this

select firstname,lastname, ROW_NUMBER() over(partition by firstname, lastname order by autoid) Serial from tbl_clients order by Serial desc, lastname


-- This displayed the data correctly with the items that were replicated and their serial number
-- But when I now use the Delete Command

Delete from (select firstname,lastname, ROW_NUMBER() over(partition by firstname, lastname order by autoid) Serial from tbl_clients order by Serial desc, lastname
) where serial > 1

/* I got an error  

Msg 102, Level 15, State 1, Line 5
Incorrect syntax near '('.
Msg 156, Level 15, State 1, Line 5
Incorrect syntax near the keyword 'order'.*/
0
 
LVL 23

Expert Comment

by:Rajkumar Gs
ID: 36977839
You cannot use alias in the same query for soring. Try this

select * from
(
select firstname,lastname, ROW_NUMBER() over(partition by firstname, lastname order by autoid) Serial from tbl_clients 
) a
order by Serial desc, lastname

Open in new window


Hope you got an idea with the demo table and data ?
Raj
0
 
LVL 23

Accepted Solution

by:
Rajkumar Gs earned 1000 total points
ID: 36977845
Oops! Ignore my previous comment

You cannot delete like that.
Delete from (select firstname,lastname, ROW_NUMBER() over(partition by firstname, lastname order by autoid) Serial from tbl_clients order by Serial desc, lastname
) where serial > 1

Open in new window



Use the query that I suggested - Replace #table with your actual table name.
with cte as(
select firstname,lastname, ROW_NUMBER() over(partition by firstname, lastname order by autoid) Serial from #table
) 
DELETE FROM CTE WHERE Serial > 1

Open in new window



Raj
0
 
LVL 23

Expert Comment

by:Rajkumar Gs
ID: 36977849
Fact is - deleting data from a sub-query is not possible. While you can achieve using CTE (Common Table Expression)

Sorry for one of my wrong previous comment - You can use alias for sorting in the same query, but cannot use it for grouping in same query.

My comment - http:#36976057 - that query could help you to achieve this.

Raj
0
 
LVL 1

Author Comment

by:souldj
ID: 36977863
Raj, you are a star!

Thanks Mission Accomplished!
0
 
LVL 1

Author Closing Comment

by:souldj
ID: 36977864
On Point and timely!
0
 
LVL 23

Expert Comment

by:Rajkumar Gs
ID: 36977936
Nice to see you got the solution :)
Raj
0

Featured Post

[Webinar On Demand] Database Backup and Recovery

Does your company store data on premises, off site, in the cloud, or a combination of these? If you answered “yes”, you need a data backup recovery plan that fits each and every platform. Watch now as as Percona teaches us how to build agile data backup recovery plan.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In SQL Server, when rows are selected from a table, does it retrieve data in the order in which it is inserted?  Many believe this is the case. Let us try to examine for ourselves with an example. To get started, use the following script, wh…
In this article I will describe the Backup & Restore method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
Screencast - Getting to Know the Pipeline
Kernel Data Recovery is a renowned Data Recovery solution provider which offers wide range of softwares for both enterprise and home users with its cost-effective solutions. Let's have a quick overview of the journey and data recovery tools range he…

577 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question