Here's another approach to deduping that involves using a self-join relationship. It's more trouble to set up, but runs faster once it's configured, and can help prevent entry of future duplicates automatically without having to run a script. Naturally, you should always work on a backup copy of your database when doing something like this.
Define a field called RecordID which assigns a unique ID to every record in the database. Most well-designed databases will already have some sort of unique identifier defined, so you can use that field instead if it exists.
Define a self-join relationship (a relationship between the table and itself) called SelfJoinForDuplicates, based on the value which you want to check for duplicates. If you want to look for duplicate phone numbers for example, base the relationship on the phone number field, and so forth. Define the sort order of this relationship so that the record you want to keep appears first (for example, if you want to keep the oldest record and discard newer ones, sort by date)
Define a new calculated field in the table called "IsDuplicate" and assign this formula to it:
If( RecordID = SelfJoinForDuplicates::Rec
Now, place the IsDuplicate field on a layout, and do a Find for the value "Duplicate" and then delete the found set of records.
All duplicates will be found, but the original records won't be. A handy side-effect of this is that you can warn users of possible duplicate records simply by placing the IsDuplicate field on the data entry screen.
Main Topics
Browse All Topics





by: lesouefPosted on 2005-09-14 at 13:39:34ID: 14884656
I do a loop which compares every record with the previous one using the field which determines if it is a dup or not (we'll call it ID here).
create a global field called say "tmp" to act as a variable (unless you use fm8)
sort (them to have each dup grouped with its colleagues!)
go to 1st one
define tmp = ID
start loop
go to next record
if ID = tmp
delete record
go to previous record (delete puts you on next record, and we don't want this as this is done at the begining of the loop)
endif
define tmp = ID
end of loop