Solved

DELETE Duplicates in SQL Server 2008

Posted on 2013-01-02
Last Modified: 2013-01-02
I have a database with one table that has 30 fields. The first field is an integer primary key and cannot contain duplicates. The second field can contain duplicates only because it can contain 'NONE'. The same goes for the third field: it can contain duplicates, but only of 'NONE'. The table, Main_Table, has about 130,000 records.
The problem is that there are duplicates in the second and third fields that need to be removed. 'NONE' is not to be considered a duplicate.
[Image: Duplicates in Database]
I know that I will need at least two statements or queries: the first to delete the duplicates in the second field, and the second to delete the duplicates in the third field.

Any Ideas?
Question by:us1975mc
 
LVL 22

Expert Comment

by:plusone3055
ID: 38737686
this thread will explain :)

http://support.microsoft.com/kb/139444
 

Author Comment

by:us1975mc
ID: 38737715
This does not take into account that I have duplicates of 'NONE' that I don't want deleted.
 
LVL 22

Expert Comment

by:plusone3055
ID: 38737729
Use the article linked above and add a condition like this to your queries:
AND <column> <> 'NONE'

Author Comment

by:us1975mc
ID: 38737807
I have my first SQL written and am finding that the second column has 2821 duplicates, far more than I wish to delete by hand. The third column has 9883 duplicates, again far more than I wish to do by hand.

SELECT     MIN(ID) AS PK, TXT_InvTag, COUNT(TXT_InvTag) AS Expr2
FROM         ITRACK_Main
GROUP BY TXT_InvTag
HAVING      (COUNT(TXT_InvTag) > 1) AND (TXT_InvTag <> N'NONE')



The only way I seem to be able to get the PK is to use MIN, since there are no duplicates in the ID (PK) field.

The next step states "Select the duplicate key values into a holding table." Does this mean that I need to make a table in the database to write these records to? And if so, do I need all 30 fields, or just the PK, the duplicated value, and the count?

Am I on the right track?
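
For reference, a holding table for this step would only need the duplicated value and the ID of the row to keep, not all 30 fields. A minimal sketch, assuming the row with the lowest ID per tag is the one to keep (the thread has not settled that) and using a hypothetical temp table name, #DupTags:

```sql
-- Sketch only. #DupTags and KeepID are hypothetical names;
-- "keep the lowest ID" is an assumption, not a stated requirement.

-- Step 1: one row per duplicated tag, with the ID that survives.
SELECT   TXT_InvTag,
         MIN(ID) AS KeepID
INTO     #DupTags
FROM     ITRACK_Main
WHERE    TXT_InvTag <> N'NONE'
GROUP BY TXT_InvTag
HAVING   COUNT(*) > 1;

-- Step 2: preview the rows that would be removed.
SELECT m.ID, m.TXT_InvTag
FROM   ITRACK_Main AS m
       INNER JOIN #DupTags AS d ON m.TXT_InvTag = d.TXT_InvTag
WHERE  m.ID <> d.KeepID;

-- Step 3: delete them inside a transaction so the result can be checked.
BEGIN TRANSACTION;

DELETE m
FROM   ITRACK_Main AS m
       INNER JOIN #DupTags AS d ON m.TXT_InvTag = d.TXT_InvTag
WHERE  m.ID <> d.KeepID;

-- Inspect the table here, then COMMIT instead if the result looks right.
ROLLBACK TRANSACTION;

DROP TABLE #DupTags;
```

The count returned by the grouped SELECT is the number of duplicated values, not the number of rows that would be deleted; the preview in step 2 shows the latter.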
 
LVL 43

Expert Comment

by:Eugene Z
ID: 38738603
<The first field is an integer that is a Primary Key and can not contain duplicates. >
In your post, which rows are the duplicates that you need to delete? Does the row with the smaller ID take priority? What about the rest of the columns?
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 38738742
Something like this perhaps:
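-- For each duplicated tag, delete every row except the one with the lowest ID
DELETE  i
FROM    ITRACK_Main i
        INNER JOIN (SELECT  TXT_InvTag,
                            MIN(ID) ID
                    FROM    ITRACK_Main
                    WHERE   TXT_InvTag <> N'NONE'
                    GROUP BY TXT_InvTag
                   ) d ON i.TXT_InvTag = d.TXT_InvTag
                      AND i.ID > d.ID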

 
LVL 43

Expert Comment

by:Eugene Z
ID: 38738774
It looks like this is not about deleting records, but rather about updating ("deleting") the data in the two columns...

us1975mc: please clarify your question
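
If the goal really were to clear the duplicated values rather than remove whole rows, one possible shape is an UPDATE through a CTE. This is only a sketch under two assumptions that the thread does not confirm: that a cleared value should become 'NONE', and that the row with the lowest ID keeps its value.

```sql
-- Sketch only. Overwriting with 'NONE' and keeping the lowest ID
-- are assumptions, not requirements stated in the question.
WITH Ranked AS
(
    SELECT TXT_InvTag,
           ROW_NUMBER() OVER (PARTITION BY TXT_InvTag ORDER BY ID) AS rn
    FROM   ITRACK_Main
    WHERE  TXT_InvTag <> N'NONE'      -- 'NONE' is never treated as a duplicate
)
UPDATE Ranked
SET    TXT_InvTag = N'NONE'
WHERE  rn > 1;                        -- every repeat after the first per tag
```

The same statement with TXT_FANum in place of TXT_InvTag would handle the other column.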
 
LVL 43

Accepted Solution

by:
Eugene Z earned 500 total points
ID: 38738804
try this code

-- create test table
CREATE TABLE ITRACK_Main (id int, TXT_FANum varchar(50), TXT_InvTag varchar(50))

-- insert data
INSERT INTO ITRACK_Main
SELECT 12192, '139634', 'NONE'
UNION ALL SELECT 44998, '139634', 'NONE'
UNION ALL SELECT 445058, '139634', 'NONE'
UNION ALL SELECT 445059, 'NONE', 'NONE'
UNION ALL SELECT 445059, 'NONE', '123'

-- check data
SELECT * FROM ITRACK_Main

/*result
id	TXT_FANum	TXT_InvTag
12192	139634	NONE
44998	139634	NONE
445058	139634	NONE
445059	NONE	NONE
445059	NONE	123

*/
    go
-- delete duplicate rows

WITH cte_Result (TXT_FANum, TXT_InvTag, DupCnt)
AS
(
    SELECT TXT_FANum, TXT_InvTag,
           ROW_NUMBER() OVER (PARTITION BY TXT_FANum, TXT_InvTag ORDER BY TXT_FANum) AS DupCnt
    FROM   ITRACK_Main
)
DELETE
FROM   cte_Result
WHERE  DupCnt > 1
GO

-- check post-delete result
SELECT * FROM ITRACK_Main


/* result 

id	TXT_FANum	TXT_InvTag
12192	139634	NONE
445059	NONE	NONE
445059	NONE	123

*/
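
Note that the statement above partitions on both columns together, so it treats a repeated (TXT_FANum, TXT_InvTag) pair as a duplicate even when both values are 'NONE'. A variant closer to the original question would handle each column in its own pass and skip 'NONE'. The following is a minimal sketch of that idea, assuming the row with the lowest id in each group is the one to keep:

```sql
-- Sketch only. "Keep the lowest id" is an assumption.

-- Pass 1: duplicates in TXT_FANum, ignoring 'NONE'
WITH Ranked AS
(
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY TXT_FANum ORDER BY id) AS rn
    FROM   ITRACK_Main
    WHERE  TXT_FANum <> 'NONE'
)
DELETE FROM Ranked
WHERE  rn > 1;

-- Pass 2: duplicates in TXT_InvTag, ignoring 'NONE'
WITH Ranked AS
(
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY TXT_InvTag ORDER BY id) AS rn
    FROM   ITRACK_Main
    WHERE  TXT_InvTag <> 'NONE'
)
DELETE FROM Ranked
WHERE  rn > 1;
```

On the five test rows above, this leaves the same three rows as the accepted code. The two approaches differ once the table holds several rows where both columns are 'NONE': the pair-based version removes all but one of them, while this one leaves them alone.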
 

 

Author Closing Comment

by:us1975mc
ID: 38738937
That did it. It was just a matter of getting things in the right order.

Thanks!

US1975MC

Question has a verified solution.
