DELETE Duplicates in SQL Server 2008

I have a database that has one table that has 30 fields.  The first field is an integer that is a Primary Key and can not contain duplicates.  The second column can contain duplicates only because it can contain ‘NONE’.  Same with the third field it can contain duplicates and again only it can contain ‘NONE’.  My database has about 130,000 records in the Main_Table.  
The problem is that there are duplicates in the second and third fields that need be removed. ‘NONE’ is not to be considered a duplicate.  
Duplicates in DatabaseI know that I will have to do at least two statements or queries. The first one, deleting the duplicates in the second field, and the second query, to delete the duplicates in the third field.

Any Ideas?
us1975mcAsked:
Who is Participating?

[Webinar] Streamline your web hosting managementRegister Today

x
 
Eugene ZConnect With a Mentor Commented:
try this code

---create test table 
  create table ITRACK_Main (id int, TXT_FANum varchar(50), TXT_InvTag varchar(50)) 
  -- insert data                 
                   insert into ITRACK_Main
                   select 12192, '139634','NONE'
                   union all select 44998, '139634','NONE'
                   union all select 445058, '139634','NONE'
                   union all select 445059, 'NONE','NONE'
                   union all select 445059, 'NONE','123'
                   --
  --check data    
    select * from ITRACK_Main

/*result
id	TXT_FANum	TXT_InvTag
12192	139634	NONE
44998	139634	NONE
445058	139634	NONE
445059	NONE	NONE
445059	NONE	123

*/
    go
--delete dups data

WITH cte_Result(TXT_FANum,TXT_InvTag, DupCnt)
AS
(
SELECT TXT_FANum,TXT_InvTag,
ROW_NUMBER() OVER(PARTITION BY TXT_FANum,TXT_InvTag ORDER BY TXT_FANum) AS DupCnt
FROM ITRACK_Main
)
delete
 FROM cte_Result
WHERE DupCnt > 1
 GO
 
--check post delete result
 select * from ITRACK_Main


/* result 

id	TXT_FANum	TXT_InvTag
12192	139634	NONE
445059	NONE	NONE
445059	NONE	123

*/
 

Open in new window

0
 
plusone3055Commented:
this thread will explain :)

http://support.microsoft.com/kb/139444
0
 
us1975mcAuthor Commented:
This does not take into account that I have duplicates of 'NONE' that I don't want deleted.
0
Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

 
plusone3055Commented:
use thread above and add on to your queries
AND  <> 'None'
0
 
us1975mcAuthor Commented:
I have my first SQL written and finding that the second column has 2821 duplicates, far more than I wish to delete by hand.  And the third column has 9883 duplicates, again far more than I wish to do by hand.

SELECT     MIN(ID) AS PK, TXT_InvTag, COUNT(TXT_InvTag) AS Expr2
FROM         ITRACK_Main
GROUP BY TXT_InvTag
HAVING      (COUNT(TXT_InvTag) > 1) AND (TXT_InvTag <> N'NONE')

Open in new window


The only way that I seem to get the PK is to do a MIN beings that there are no duplicates in the ID (PK) field.

The next step states “Select the duplicate key values into a holding table.” does this mean that I need to make a table in the database that I am going to write these records to? And if so do I need all 30 fields or just the PK and duplicate record and count?

Am I on the right track?
0
 
Eugene ZCommented:
<The first field is an integer that is a Primary Key and can not contain duplicates. >
in you post what is the dups what you need to delete?  based on smaller ID is priority? what about the rest columns?
0
 
Anthony PerkinsCommented:
Something like this perhaps:
DELETE  i
FROM    ITRACK_Main i
        INNER JOIN (SELECT  TXT_InvTag,
                            MIN(ID) ID
                    FROM    ITRACK_Main
                    WHERE   TXT_InvTag <> N'NONE'
                   ) d ON i.ID = d.ID

Open in new window

0
 
Eugene ZCommented:
looks like it is not about record deletion - rather update ("delete" ) data in the 2 columns...

us1975mc: please clarify your question
0
 
us1975mcAuthor Commented:
That did it.  It waa just getting the things in the right order.

Thanks!

US1975MC
0
All Courses

From novice to tech pro — start learning today.