Solved

SQL Server, 2000 - How to delete duplicated rows if the primary key is multiple columns

Posted on 2007-11-27
10
261 Views
Last Modified: 2010-04-21
Experts,
I have got a table “Trade” that has a combined primary key “Deal_Number,Order_Number,ValueDate”
While setting the primary key, the table already had data, which was duplicated for some dates.
How may I execute a statement that deletes all duplicates?
Note: I need to delete extra duplicates only and not all data. i.e. if rows of 01/12/2000 are duplicated 3 times, I need to delete them 2 times and keep 1 instance of them.

Thanks in advance,
0
Comment
Question by:feesu
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 4
  • 3
  • 3
10 Comments
 
LVL 23

Expert Comment

by:Ashish Patel
ID: 20363786
For this the best way to do is Create another table with same structure along with primary key on the column you want and then Select DISTINCT Rows from the first table and insert them into the new table.

This is the only way you can do this.
0
 
LVL 23

Expert Comment

by:Ashish Patel
ID: 20363797
Or directly do a Select DISTINCT from the table into new table name and then drop the source and rename the new table to old table name. for example

Select Distinct Deail_Number, Order_Number, ValueDate, Col4, Col5,... Into NewTableName from DuplicatedValueTableName.

Drop the DuplicatedValueTableName

and then rename the NewTableName to OldTableName
0
 

Author Comment

by:feesu
ID: 20364105
But what if i have 10 columns, and the primary key (the ones i need to distinctly select) are only 3 ???
How do i write my select statement?

-----------------------------------------------------------------------------------------------------------------------------------------------------------
SELECT     IKey, Deal_Number, Deal_Time, Order_Number, Order_Time, Order_Type, CompanyID, Price, QTY, Investor_Number, [User], Status, 
                      Main_Forward_Market_Deal_Date, Main_Deal_Number, Investment_Company_Number, ValueDate
FROM         Broker_TRADE
-----------------------------------------------------------------------------------------------------------------------------------------------------------

Open in new window

0
NFR key for Veeam Backup for Microsoft Office 365

Veeam is happy to provide a free NFR license (for 1 year, up to 10 users). This license allows for the non‑production use of Veeam Backup for Microsoft Office 365 in your home lab without any feature limitations.

 
LVL 23

Expert Comment

by:Ashish Patel
ID: 20364171
SELECT   Min(IKey), Deal_Number, Min(Deal_Time), Order_Number, Order_Time, Order_Type, CompanyID, Price, QTY, Investor_Number, [User], Status,
                      Main_Forward_Market_Deal_Date, Main_Deal_Number, Investment_Company_Number, ValueDate
FROM         Broker_TRADE
group by Deal_Number, Order_Number, ValueDate

Please add the Min or Max function as i have used according to what you need. The query will go with group by your distinct columns. Dont put min or max to your group by cols.
0
 
LVL 25

Expert Comment

by:imitchie
ID: 20364512
how big is the table, how many rows?
0
 
LVL 25

Accepted Solution

by:
imitchie earned 500 total points
ID: 20364587
this T-SQL should do it. as with any advice, back up your database/table first!
set nocount on
 
declare @count int
declare @dealnumber int
declare @ordernumber varchar(100)
declare @valuedate datetime
select @count = 1
while @count > 0
begin
 set @count = 0
 select top 1
	@dealnumber = Deal_Number,
	@ordernumber = Order_Number,
	@valuedate = ValueDate,
	@count = count(*)
 from Broker_TRADE
 group by Deal_Number, Order_Number, ValueDate
 having count(*) > 1
 order by Deal_Number, Order_Number, ValueDate
 
 if @count > 0
 begin
  set rowcount @count
  delete from Broker_TRADE
  where Deal_Number = @dealnumber
    and Order_Number = @ordernumber
    and ValueDate =@valuedate
  set rowcount 0
 end
end
 
set nocount off

Open in new window

0
 

Author Comment

by:feesu
ID: 20364605
Hi imitchie,
The table has around 100,000 rows.
It has the columns: IKey, Deal_Number, Deal_Time, Order_Number, Order_Time, Order_Type, CompanyID, Price, QTY, Investor_Number, [User], Status, Main_Forward_Market_Deal_Date, Main_Deal_Number, Investment_Company_Number, ValueDate

Do I execute your code now?
0
 
LVL 25

Expert Comment

by:imitchie
ID: 20364684
yup, that'd do it
0
 

Author Comment

by:feesu
ID: 20364814
Hi imitchie!!
It worked perfectly!! Thank you very much. This was a relief :)
0
 

Author Closing Comment

by:feesu
ID: 31411367
PERFECT!!
0

Featured Post

Best Practices: Disaster Recovery Testing

Besides backup, any IT division should have a disaster recovery plan. You will find a few tips below relating to the development of such a plan and to what issues one should pay special attention in the course of backup planning.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
I have a large data set and a SSIS package. How can I load this file in multi threading?
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
Viewers will learn how to use the UPDATE and DELETE statements to change or remove existing data from their tables. Make a table: Update a specific column given a specific row using the UPDATE statement: Remove a set of values using the DELETE s…

752 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question