MS Access - Tryng to remove duplicated records using delete and subquery

I have a table called DataExtractTable and this has linked tables EmailTo, CC and BCC all linked by the DataExtractID of the main table.  I have managed to create 2400 duplicate records programmatically and would like to create a query that deletes them.

First off I wrote this query and it gets the required list of duplicated records:
SELECT Max(DataExtractTable.DataExtractTableID) AS DataExtractTableID
FROM DataExtractTable
GROUP BY DataExtractTable.Box, DataExtractTable.Owner, DataExtractTable.DateSent, DataExtractTable.Subject, DataExtractTable.EmailFrom, DataExtractTable.InternetMessageID, DataExtractTable.InReplyTo, DataExtractTable.DateCreated, DataExtractTable.CreatedBy, DataExtractTable.DateModified, DataExtractTable.ModifiedBy, DataExtractTable.RecordIsDeletedYN
HAVING (((Count(*))>1))

Open in new window


This produces the list of 2400 DataExtractIDs that I want to delete, so I thought I could then surround the Select Query with the delete command like this:

DELETE *
FROM DataExtractTable
WHERE DataExtractTableID IN (SELECT Max(DataExtractTable.DataExtractTableID) AS DataExtractTableID
FROM DataExtractTable
GROUP BY DataExtractTable.Box, DataExtractTable.Owner, DataExtractTable.DateSent, DataExtractTable.Subject, DataExtractTable.EmailFrom, DataExtractTable.InternetMessageID, DataExtractTable.InReplyTo, DataExtractTable.DateCreated, DataExtractTable.CreatedBy, DataExtractTable.DateModified, DataExtractTable.ModifiedBy, DataExtractTable.RecordIsDeletedYN
HAVING (((Count(*))>1)));

Open in new window


The trouble is this seems to send Access into non-responding Tizzy, which I assume means it is generating huge datasets and will eventually end up with an out of space error or something.  Is there a more efficient way of doing this that doesn't blow up Access!?

The linked tables have cascaded deletes so any linked records will be deleted as well.  The DataExtractTable table has currently 122,385 records in it.  The database is Access 2016 though the database is in Access 2002 - 2003 format.
SivAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

PatHartmanCommented:
The problem with using this query to delete is that it deletes every copy of the duplicate.  It doesn't leave an "original".  The easiest way to get rid of duplicates is to create a new, empty table.  Add the necessary unique index to prevent future duplication.  Write an append query that copies the old table and appends the rows to the new table.  You can sort the data to exert some control over which record gets saved.  At the end, Access will give you an error summary and tell you how many rows were not added due to duplicate keys.

An alternative is to turn the find duplicates query into a make table query.  In this case, you should include either the Min() or Max() value of the autonumber.  Then, the delete query can join to that table on the "unique" fields and delete if the autonumbers don't match.    Of course, if you don't have an autonumber or other unique primary key, you'll have to go with option 1.  To make a delete query work, you MUST have something unique to identify what you want to keep.
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
SivAuthor Commented:
Pat,
The issue with duplicates is a one off and won't be repeated so it's a one shot deal. I thought because I was selecting the Max (i.e. the higher of the two duplicates in terms of DataExtractTableID) that it would only delete those.  If I run the Select query on its own it only produces the higher numbered DataExtractTableID of the two.
0
PatHartmanCommented:
It will only take a couple of minutes to actually delete the duplicates.  Copy the table structure only.  Add the correct primary key.  Then create a query that sorts the rows ascending by the "key" fields and descending by the "duplicate terms" and convert it to an append query by changing the query type and picking the new table.  When you run the query, the "first" duplicate will be appended to the table and the others will be discarded.
0
SivAuthor Commented:
Thanks Pat, I iwll give that a go!
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft Access

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.