Go Premium for a chance to win a PS4. Enter to Win

x
?
Solved

DELETE Duplicates in SQL Server 2008

Posted on 2013-01-02
9
Medium Priority
?
366 Views
Last Modified: 2013-01-02
I have a database that has one table that has 30 fields.  The first field is an integer that is a Primary Key and can not contain duplicates.  The second column can contain duplicates only because it can contain ‘NONE’.  Same with the third field it can contain duplicates and again only it can contain ‘NONE’.  My database has about 130,000 records in the Main_Table.  
The problem is that there are duplicates in the second and third fields that need be removed. ‘NONE’ is not to be considered a duplicate.  
Duplicates in DatabaseI know that I will have to do at least two statements or queries. The first one, deleting the duplicates in the second field, and the second query, to delete the duplicates in the third field.

Any Ideas?
0
Comment
Question by:us1975mc
  • 3
  • 3
  • 2
  • +1
9 Comments
 
LVL 22

Expert Comment

by:plusone3055
ID: 38737686
this thread will explain :)

http://support.microsoft.com/kb/139444
0
 

Author Comment

by:us1975mc
ID: 38737715
This does not take into account that I have duplicates of 'NONE' that I don't want deleted.
0
 
LVL 22

Expert Comment

by:plusone3055
ID: 38737729
use thread above and add on to your queries
AND  <> 'None'
0
VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

 

Author Comment

by:us1975mc
ID: 38737807
I have my first SQL written and finding that the second column has 2821 duplicates, far more than I wish to delete by hand.  And the third column has 9883 duplicates, again far more than I wish to do by hand.

SELECT     MIN(ID) AS PK, TXT_InvTag, COUNT(TXT_InvTag) AS Expr2
FROM         ITRACK_Main
GROUP BY TXT_InvTag
HAVING      (COUNT(TXT_InvTag) > 1) AND (TXT_InvTag <> N'NONE')

Open in new window


The only way that I seem to get the PK is to do a MIN beings that there are no duplicates in the ID (PK) field.

The next step states “Select the duplicate key values into a holding table.” does this mean that I need to make a table in the database that I am going to write these records to? And if so do I need all 30 fields or just the PK and duplicate record and count?

Am I on the right track?
0
 
LVL 43

Expert Comment

by:Eugene Z
ID: 38738603
<The first field is an integer that is a Primary Key and can not contain duplicates. >
in you post what is the dups what you need to delete?  based on smaller ID is priority? what about the rest columns?
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 38738742
Something like this perhaps:
DELETE  i
FROM    ITRACK_Main i
        INNER JOIN (SELECT  TXT_InvTag,
                            MIN(ID) ID
                    FROM    ITRACK_Main
                    WHERE   TXT_InvTag <> N'NONE'
                   ) d ON i.ID = d.ID

Open in new window

0
 
LVL 43

Expert Comment

by:Eugene Z
ID: 38738774
looks like it is not about record deletion - rather update ("delete" ) data in the 2 columns...

us1975mc: please clarify your question
0
 
LVL 43

Accepted Solution

by:
Eugene Z earned 2000 total points
ID: 38738804
try this code

---create test table 
  create table ITRACK_Main (id int, TXT_FANum varchar(50), TXT_InvTag varchar(50)) 
  -- insert data                 
                   insert into ITRACK_Main
                   select 12192, '139634','NONE'
                   union all select 44998, '139634','NONE'
                   union all select 445058, '139634','NONE'
                   union all select 445059, 'NONE','NONE'
                   union all select 445059, 'NONE','123'
                   --
  --check data    
    select * from ITRACK_Main

/*result
id	TXT_FANum	TXT_InvTag
12192	139634	NONE
44998	139634	NONE
445058	139634	NONE
445059	NONE	NONE
445059	NONE	123

*/
    go
--delete dups data

WITH cte_Result(TXT_FANum,TXT_InvTag, DupCnt)
AS
(
SELECT TXT_FANum,TXT_InvTag,
ROW_NUMBER() OVER(PARTITION BY TXT_FANum,TXT_InvTag ORDER BY TXT_FANum) AS DupCnt
FROM ITRACK_Main
)
delete
 FROM cte_Result
WHERE DupCnt > 1
 GO
 
--check post delete result
 select * from ITRACK_Main


/* result 

id	TXT_FANum	TXT_InvTag
12192	139634	NONE
445059	NONE	NONE
445059	NONE	123

*/
 

Open in new window

0
 

Author Closing Comment

by:us1975mc
ID: 38738937
That did it.  It waa just getting the things in the right order.

Thanks!

US1975MC
0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

For both online and offline retail, the cross-channel business is the most recent pattern in the B2C trade space.
An alternative to the "For XML" way of pivoting and concatenating result sets into strings, and an easy introduction to "common table expressions" (CTEs). Being someone who is always looking for alternatives to "work your data", I came across this …
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
Via a live example, show how to extract information from SQL Server on Database, Connection and Server properties

783 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question