Solved

DELETE Duplicates in SQL Server 2008

Posted on 2013-01-02
9
354 Views
Last Modified: 2013-01-02
I have a database that has one table that has 30 fields.  The first field is an integer that is a Primary Key and can not contain duplicates.  The second column can contain duplicates only because it can contain ‘NONE’.  Same with the third field it can contain duplicates and again only it can contain ‘NONE’.  My database has about 130,000 records in the Main_Table.  
The problem is that there are duplicates in the second and third fields that need be removed. ‘NONE’ is not to be considered a duplicate.  
Duplicates in DatabaseI know that I will have to do at least two statements or queries. The first one, deleting the duplicates in the second field, and the second query, to delete the duplicates in the third field.

Any Ideas?
0
Comment
Question by:us1975mc
  • 3
  • 3
  • 2
  • +1
9 Comments
 
LVL 22

Expert Comment

by:plusone3055
Comment Utility
this thread will explain :)

http://support.microsoft.com/kb/139444
0
 

Author Comment

by:us1975mc
Comment Utility
This does not take into account that I have duplicates of 'NONE' that I don't want deleted.
0
 
LVL 22

Expert Comment

by:plusone3055
Comment Utility
use thread above and add on to your queries
AND  <> 'None'
0
 

Author Comment

by:us1975mc
Comment Utility
I have my first SQL written and finding that the second column has 2821 duplicates, far more than I wish to delete by hand.  And the third column has 9883 duplicates, again far more than I wish to do by hand.

SELECT     MIN(ID) AS PK, TXT_InvTag, COUNT(TXT_InvTag) AS Expr2
FROM         ITRACK_Main
GROUP BY TXT_InvTag
HAVING      (COUNT(TXT_InvTag) > 1) AND (TXT_InvTag <> N'NONE')

Open in new window


The only way that I seem to get the PK is to do a MIN beings that there are no duplicates in the ID (PK) field.

The next step states “Select the duplicate key values into a holding table.” does this mean that I need to make a table in the database that I am going to write these records to? And if so do I need all 30 fields or just the PK and duplicate record and count?

Am I on the right track?
0
Free Gift Card with Acronis Backup Purchase!

Backup any data in any location: local and remote systems, physical and virtual servers, private and public clouds, Macs and PCs, tablets and mobile devices, & more! For limited time only, buy any Acronis backup products and get a FREE Amazon/Best Buy gift card worth up to $200!

 
LVL 42

Expert Comment

by:EugeneZ
Comment Utility
<The first field is an integer that is a Primary Key and can not contain duplicates. >
in you post what is the dups what you need to delete?  based on smaller ID is priority? what about the rest columns?
0
 
LVL 75

Expert Comment

by:Anthony Perkins
Comment Utility
Something like this perhaps:
DELETE  i
FROM    ITRACK_Main i
        INNER JOIN (SELECT  TXT_InvTag,
                            MIN(ID) ID
                    FROM    ITRACK_Main
                    WHERE   TXT_InvTag <> N'NONE'
                   ) d ON i.ID = d.ID

Open in new window

0
 
LVL 42

Expert Comment

by:EugeneZ
Comment Utility
looks like it is not about record deletion - rather update ("delete" ) data in the 2 columns...

us1975mc: please clarify your question
0
 
LVL 42

Accepted Solution

by:
EugeneZ earned 500 total points
Comment Utility
try this code

---create test table 
  create table ITRACK_Main (id int, TXT_FANum varchar(50), TXT_InvTag varchar(50)) 
  -- insert data                 
                   insert into ITRACK_Main
                   select 12192, '139634','NONE'
                   union all select 44998, '139634','NONE'
                   union all select 445058, '139634','NONE'
                   union all select 445059, 'NONE','NONE'
                   union all select 445059, 'NONE','123'
                   --
  --check data    
    select * from ITRACK_Main

/*result
id	TXT_FANum	TXT_InvTag
12192	139634	NONE
44998	139634	NONE
445058	139634	NONE
445059	NONE	NONE
445059	NONE	123

*/
    go
--delete dups data

WITH cte_Result(TXT_FANum,TXT_InvTag, DupCnt)
AS
(
SELECT TXT_FANum,TXT_InvTag,
ROW_NUMBER() OVER(PARTITION BY TXT_FANum,TXT_InvTag ORDER BY TXT_FANum) AS DupCnt
FROM ITRACK_Main
)
delete
 FROM cte_Result
WHERE DupCnt > 1
 GO
 
--check post delete result
 select * from ITRACK_Main


/* result 

id	TXT_FANum	TXT_InvTag
12192	139634	NONE
445059	NONE	NONE
445059	NONE	123

*/
 

Open in new window

0
 

Author Closing Comment

by:us1975mc
Comment Utility
That did it.  It waa just getting the things in the right order.

Thanks!

US1975MC
0

Featured Post

Complete Microsoft Windows PC® & Mac Backup

Backup and recovery solutions to protect all your PCs & Mac– on-premises or in remote locations. Acronis backs up entire PC or Mac with patented reliable disk imaging technology and you will be able to restore workstations to a new, dissimilar hardware in minutes.

Join & Write a Comment

Introduction SQL Server Integration Services can read XML files, that’s known by every BI developer.  (If you didn’t, don’t worry, I’m aiming this article at newcomers as well.) But how far can you go?  When does the XML Source component become …
In this article I will describe the Copy Database Wizard method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.
Using examples as well as descriptions, and references to Books Online, show the documentation available for datatypes, explain the available data types and show how data can be passed into and out of variables.

744 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now