Link to home
Start Free TrialLog in
Avatar of us1975mc
us1975mcFlag for United States of America

asked on

DELETE Duplicates in SQL Server 2008

I have a database that has one table that has 30 fields.  The first field is an integer that is a Primary Key and can not contain duplicates.  The second column can contain duplicates only because it can contain ‘NONE’.  Same with the third field it can contain duplicates and again only it can contain ‘NONE’.  My database has about 130,000 records in the Main_Table.  
The problem is that there are duplicates in the second and third fields that need be removed. ‘NONE’ is not to be considered a duplicate.  
User generated imageI know that I will have to do at least two statements or queries. The first one, deleting the duplicates in the second field, and the second query, to delete the duplicates in the third field.

Any Ideas?
Avatar of plusone3055
plusone3055
Flag of United States of America image

this thread will explain :)

http://support.microsoft.com/kb/139444
Avatar of us1975mc

ASKER

This does not take into account that I have duplicates of 'NONE' that I don't want deleted.
use thread above and add on to your queries
AND  <> 'None'
I have my first SQL written and finding that the second column has 2821 duplicates, far more than I wish to delete by hand.  And the third column has 9883 duplicates, again far more than I wish to do by hand.

SELECT     MIN(ID) AS PK, TXT_InvTag, COUNT(TXT_InvTag) AS Expr2
FROM         ITRACK_Main
GROUP BY TXT_InvTag
HAVING      (COUNT(TXT_InvTag) > 1) AND (TXT_InvTag <> N'NONE')

Open in new window


The only way that I seem to get the PK is to do a MIN beings that there are no duplicates in the ID (PK) field.

The next step states “Select the duplicate key values into a holding table.” does this mean that I need to make a table in the database that I am going to write these records to? And if so do I need all 30 fields or just the PK and duplicate record and count?

Am I on the right track?
Avatar of EugeneZ
<The first field is an integer that is a Primary Key and can not contain duplicates. >
in you post what is the dups what you need to delete?  based on smaller ID is priority? what about the rest columns?
Something like this perhaps:
DELETE  i
FROM    ITRACK_Main i
        INNER JOIN (SELECT  TXT_InvTag,
                            MIN(ID) ID
                    FROM    ITRACK_Main
                    WHERE   TXT_InvTag <> N'NONE'
                   ) d ON i.ID = d.ID

Open in new window

looks like it is not about record deletion - rather update ("delete" ) data in the 2 columns...

us1975mc: please clarify your question
ASKER CERTIFIED SOLUTION
Avatar of EugeneZ
EugeneZ
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
That did it.  It waa just getting the things in the right order.

Thanks!

US1975MC