Go Premium for a chance to win a PS4. Enter to Win

x
?
Solved

Delete/merge some records based on a count

Posted on 2008-06-16
3
Medium Priority
?
243 Views
Last Modified: 2012-06-21
I am trying to delete/merge some records from my dataset as there are a nuber of duplicates but I'm struggling to figure out how to do this. I have the following code:

SELECT     fnam03, snam03, had103, hpc103, hpc203, dtob03, COUNT(*) AS No_of_records
FROM         dbo.ptp03
GROUP BY fnam03, snam03, had103, hpc103, hpc203, dtob03
HAVING      (COUNT(*) > 1) and snam03 is not null
order by fnam03, snam03

and this allows me to detect the duplicates in my dataset (about 849 from about 12000). However I also need to be able to perform some other calculations prior to the merging taking place:

If there are 2 records (as most of the duplicates are) then I need to check a field called RGNO03 to see if either record has this populated. If one of them does then I wish to keep this record and discard the other. If RGNO03 is NULL in both records then it should check RCNO03 and the record with the highest number in this field should be retained and the lower record discarded. However, if RGNO03 is popluated in both records then no de-duplication should take place.

BTW RGNO03 and RCNO03 are both in the PTP03 table (the same table as the rest of the query fields)

The same principals should be applied to records that have a count value of 3 or 4 (there aren't that many of them so I could do this in stages if needed).

Anyone got any suggestions on a possible way forward on this one?
0
Comment
Question by:Steven O'Neill
  • 2
3 Comments
 
LVL 51

Expert Comment

by:Mark Wills
ID: 21813916
OK, might be a bit convaluted...

but, Keep if RGN003 is populated. If not (in all instances) keep the highest RCN003.


so, probably easiest to do this in two parts... First being get rid of the null RGNo03 where there is a populated one, Second being keep the highest RCNo03 when all RGNo03 are null.

One question first...  is it an "oh" then zero three ?
Next question, is RCNo03 a unique identifier per record, and if not, is there one ?
0
 
LVL 2

Author Comment

by:Steven O'Neill
ID: 21820496
Hi Mark
Thanx for offering to help out here. To answer you're questions:

1. Yes it is rgno03 (with an oh) and rcno03 (again with an oh);
2. No rcno03 isn't unique but we do have a uniquie identifier called TempID. It's an integer.
0
 
LVL 51

Accepted Solution

by:
Mark Wills earned 2000 total points
ID: 21823371
OK, first step is to create a backup prior to doing anything - found this is the first step toward success - like an insurance policy that you never want to cash in - and follows along the lines or "murphy's law"...

Then we will remove those NULL rgno03 where there is a populated rgno03, That will leave us with either populated rgn03's or, NULL rgno03's and if the latter, then we will end up keeping the tempid which represents the highest rcno03...

If that sounds OK, then look at the following... And as a first pass would suggest you change the "delete" to a " select *" to visualize what is happening...
select * into ptp03_backup_dupes from ptp03
 
delete from ptp03
where rgno03 is NULL 
and exists (select null from ptp03 p where p.fnam03=ptp03.fnam03 and p.snam03=ptp03.snam03 and p.had103=ptp03.had103 and p.hpc203=ptp03.hpc203 and p.dtob03=ptp03.dtob03 and p.rgno03 is not null)
 
delete from ptp03
where rgno03 is NULL 
and not exists (select null from ptp03 p where p.fnam03=ptp03.fnam03 and p.snam03=ptp03.snam03 and p.had103=ptp03.had103 and p.hpc203=ptp03.hpc203 and p.dtob03=ptp03.dtob03 and p.rgno03 is not null)
and rcno03 <>  (select top 1 rcno03 from ptp03 p where p.fnam03=ptp03.fnam03 and p.snam03=ptp03.snam03 and p.had103=ptp03.had103 and p.hpc203=ptp03.hpc203 and p.dtob03=ptp03.dtob03 order by rcno03 desc, tempid desc)

Open in new window

0

Featured Post

NFR key for Veeam Backup for Microsoft Office 365

Veeam is happy to provide a free NFR license (for 1 year, up to 10 users). This license allows for the non‑production use of Veeam Backup for Microsoft Office 365 in your home lab without any feature limitations.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

When writing XML code a very difficult part is when we like to remove all the elements or attributes from the XML that have no data. I would like to share a set of recursive MSSQL stored procedures that I have made to remove those elements from …
In SQL Server, when rows are selected from a table, does it retrieve data in the order in which it is inserted?  Many believe this is the case. Let us try to examine for ourselves with an example. To get started, use the following script, wh…
Exchange organizations may use the Journaling Agent of the Transport Service to archive messages going through Exchange. However, if the Transport Service is integrated with some email content management application (such as an anti-spam), the admin…
Whether it be Exchange Server Crash Issues, Dirty Shutdown Errors or Failed to mount error, Stellar Phoenix Mailbox Exchange Recovery has always got your back. With the help of its easy to understand user interface and 3 simple steps recovery proced…

782 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question