Avatar of valmatic
Flag for United States of America asked on

Find duplicate records in Access table

Hi.  I have a database that stores information about equipment, maintenance and costing for my business.  Users entered duplicate maintenance records for various equipment by mistake and now want to clean it up.  Maintenance table consists of assetID, MaintID, MaintDescr, ReqDate, DueDate, etc...   I need to find any records where both the assetID and MaintDescr fields are the same.  I tried the built in duplicate query function in access2010 and it returned a lot of false positives.  I tried manipulating the SQL command with my limited knowledge of SQL and had no luck.  Any ideas how I could find al of these duplicate records?  thanks
Microsoft AccessSQL

Avatar of undefined
Last Comment

8/22/2022 - Mon
Rey Obrero (Capricorn1)

what are the contents of the field "MaintDescr" ? are these entered using free text entry?

have you tried using the assetID and MaintID?

better if you can upload a copy of the db

The built-in query should work.  Here's the SQL text:

SELECT First(tblRecords.assetID) AS [assetID Field], First(tblRecords.MaintDescr) AS [MaintDescr Field], Count(tblRecords.assetID) AS NumberOfDups
FROM tblRecords
GROUP BY tblRecords.assetID, tblRecords.MaintDescr
HAVING (((Count(Table1.ID))>1) AND ((Count(Table1.Desc))>1));

The key is the word "AND" in between each criteria so that all have to be true for that record to show in the query.
Rey Obrero (Capricorn1)

on second thought, why is the field "MaintDescr" on that table..
it should be in the Maintenance table..

you should have three tables to accomplish this
tblAsset          tblAssetMaint       tblMaintenance
assetID  1-- M     assetID
                           maintID m------1 maintID

see this demo

Set the relationship between two tables
I started with Experts Exchange in 2004 and it's been a mainstay of my professional computing life since. It helped me launch a career as a programmer / Oracle data analyst
William Peck

My DB is set up that way.  AssetID is a join field

I figured out why I'm getting false hits.  It looks like my SQL Query is not looking at the entire MaintDescr text.  This is a free form text field.  IF SQL finds a duplicate in the first word then it calls the entire block a duplicate.

Example:  This is coming up as a duplicate record and it should not.
AssetID = 123     MaintDescr = Check oil level
AssetID = 123     MaintDescr = Check pressure valve

How would I change my SQL string to force a compare against the entire block of text vs. just the 1st word or first match?

In (SELECT [AssetID] FROM [Maintenance] As Tmp GROUP BY [AssetID],[MaintenanceDescription] HAVING Count(*)>1 AND [MaintenanceDescription] = [Maintenance].[MaintenanceDescription])
Rey Obrero (Capricorn1)

<My DB is set up that way.>

do you really have three tables for this relationship?

and don't use the field "[MaintenanceDescription]", try using the MaintID field.

I have a table called Asset that holds the equipment info
I have a table called maintenance that holds the maintenance info - relationship is 1-many
I also have various other tables that aren't worth mentioning in this post.

MaintID is a key field of the Maintenance table and duplicates are not allowed.
I need to find MaintDescr duplicates within a given AssetID. I'm only looking for duplicates within table.maintenance.

I think the issue is in the SQL in that it calls the record a duplicate if it finds a match within the text vs. a match of the entire block of text.   I'm hoping someone can help me tweak the SQL to compare the block of text - if it's possible.
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
Rey Obrero (Capricorn1)

upload a copy of your db.


Cap, I'm glad you had me do that..  I broke my db down so it was a little more managable and plugged in a few test entries.  Stock duplicate query works like it should and I only get hits as expected.  I then pulled some extra fields into my query;
qry test added: only find records where bogusrecord = false and where MaintDate = isnull...  

So in my test you can see that the query does not pull in the records where bogusrecord = true or where MaintDate is not null but those recorsd are still counted into the duplicate SQL string.  I'm guessing I need to add these extra tests into the SQL so it takes all tests into account?

Rey Obrero (Capricorn1)

can  you explain the rules that must be followed ..
Experts Exchange is like having an extremely knowledgeable team sitting and waiting for your call. Couldn't do my job half as well as I do without it!
James Murphy

If more than one record exists where both Asset ID and MaintDescr are duplicated.
and if BogusRecord = False
        and if isnull(MaintDate)
(BogusRecord is a Cancelled record)
(MaintDate where null is an open active record)
Rey Obrero (Capricorn1)

so, what is the problem with the records returned by the query?


MaintID 9 and 6 should not show in query.   Only duplicate records that should show are MaintID 7 and 1.  I only have 10 maintenance records and I'm getting 2 false positives (9 & 6).  Imagine this with 1000 maintenance records.  thanks
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
Rey Obrero (Capricorn1)

ok.. now state the reason why 9 and 6 should not show?

don't think that i am being dense here, i need to know the logic you want to implement.
Rey Obrero (Capricorn1)

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question

Hi Cap,  Sorry so long to respond.  That query is exactly what I was looking for.  I understand the logic behind it but why name the tables "T"?

To answer your question above,  I only want hits if duplicate records are found against open, active records.  My query was testing against closed and cancelled records so was finding a lot more duplicates than actual.  I guess you figured that out though.   thanks
Rey Obrero (Capricorn1)

<but why name the tables "T"?> that is just an alias of the table that is created in memory, you can name it anything you want.
This is the best money I have ever spent. I cannot not tell you how many times these folks have saved my bacon. I learn so much from the contributors.

thanks.  Worked great.