Solved

Delete/merge some records based on a count

Posted on 2008-06-16
3
233 Views
Last Modified: 2012-06-21
I am trying to delete/merge some records from my dataset as there are a nuber of duplicates but I'm struggling to figure out how to do this. I have the following code:

SELECT     fnam03, snam03, had103, hpc103, hpc203, dtob03, COUNT(*) AS No_of_records
FROM         dbo.ptp03
GROUP BY fnam03, snam03, had103, hpc103, hpc203, dtob03
HAVING      (COUNT(*) > 1) and snam03 is not null
order by fnam03, snam03

and this allows me to detect the duplicates in my dataset (about 849 from about 12000). However I also need to be able to perform some other calculations prior to the merging taking place:

If there are 2 records (as most of the duplicates are) then I need to check a field called RGNO03 to see if either record has this populated. If one of them does then I wish to keep this record and discard the other. If RGNO03 is NULL in both records then it should check RCNO03 and the record with the highest number in this field should be retained and the lower record discarded. However, if RGNO03 is popluated in both records then no de-duplication should take place.

BTW RGNO03 and RCNO03 are both in the PTP03 table (the same table as the rest of the query fields)

The same principals should be applied to records that have a count value of 3 or 4 (there aren't that many of them so I could do this in stages if needed).

Anyone got any suggestions on a possible way forward on this one?
0
Comment
Question by:Steven O'Neill
  • 2
3 Comments
 
LVL 51

Expert Comment

by:Mark Wills
ID: 21813916
OK, might be a bit convaluted...

but, Keep if RGN003 is populated. If not (in all instances) keep the highest RCN003.


so, probably easiest to do this in two parts... First being get rid of the null RGNo03 where there is a populated one, Second being keep the highest RCNo03 when all RGNo03 are null.

One question first...  is it an "oh" then zero three ?
Next question, is RCNo03 a unique identifier per record, and if not, is there one ?
0
 
LVL 2

Author Comment

by:Steven O'Neill
ID: 21820496
Hi Mark
Thanx for offering to help out here. To answer you're questions:

1. Yes it is rgno03 (with an oh) and rcno03 (again with an oh);
2. No rcno03 isn't unique but we do have a uniquie identifier called TempID. It's an integer.
0
 
LVL 51

Accepted Solution

by:
Mark Wills earned 500 total points
ID: 21823371
OK, first step is to create a backup prior to doing anything - found this is the first step toward success - like an insurance policy that you never want to cash in - and follows along the lines or "murphy's law"...

Then we will remove those NULL rgno03 where there is a populated rgno03, That will leave us with either populated rgn03's or, NULL rgno03's and if the latter, then we will end up keeping the tempid which represents the highest rcno03...

If that sounds OK, then look at the following... And as a first pass would suggest you change the "delete" to a " select *" to visualize what is happening...
select * into ptp03_backup_dupes from ptp03
 
delete from ptp03
where rgno03 is NULL 
and exists (select null from ptp03 p where p.fnam03=ptp03.fnam03 and p.snam03=ptp03.snam03 and p.had103=ptp03.had103 and p.hpc203=ptp03.hpc203 and p.dtob03=ptp03.dtob03 and p.rgno03 is not null)
 
delete from ptp03
where rgno03 is NULL 
and not exists (select null from ptp03 p where p.fnam03=ptp03.fnam03 and p.snam03=ptp03.snam03 and p.had103=ptp03.had103 and p.hpc203=ptp03.hpc203 and p.dtob03=ptp03.dtob03 and p.rgno03 is not null)
and rcno03 <>  (select top 1 rcno03 from ptp03 p where p.fnam03=ptp03.fnam03 and p.snam03=ptp03.snam03 and p.had103=ptp03.had103 and p.hpc203=ptp03.hpc203 and p.dtob03=ptp03.dtob03 order by rcno03 desc, tempid desc)

Open in new window

0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

In this article I will describe the Backup & Restore method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
In this article we will get to know that how can we recover deleted data if it happens accidently. We really can recover deleted rows if we know the time when data is deleted by using the transaction log.
This Micro Tutorial will give you a basic overview how to record your screen with Microsoft Expression Encoder. This program is still free and open for the public to download. This will be demonstrated using Microsoft Expression Encoder 4.
With Secure Portal Encryption, the recipient is sent a link to their email address directing them to the email laundry delivery page. From there, the recipient will be required to enter a user name and password to enter the page. Once the recipient …

785 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question