Link to home
Start Free TrialLog in
Avatar of valmatic
valmaticFlag for United States of America

asked on

Find duplicate records in Access table

Hi.  I have a database that stores information about equipment, maintenance and costing for my business.  Users entered duplicate maintenance records for various equipment by mistake and now want to clean it up.  Maintenance table consists of assetID, MaintID, MaintDescr, ReqDate, DueDate, etc...   I need to find any records where both the assetID and MaintDescr fields are the same.  I tried the built in duplicate query function in access2010 and it returned a lot of false positives.  I tried manipulating the SQL command with my limited knowledge of SQL and had no luck.  Any ideas how I could find al of these duplicate records?  thanks
Avatar of Rey Obrero (Capricorn1)
Rey Obrero (Capricorn1)
Flag of United States of America image

what are the contents of the field "MaintDescr" ? are these entered using free text entry?

have you tried using the assetID and MaintID?

better if you can upload a copy of the db
Avatar of trbaze

The built-in query should work.  Here's the SQL text:

SELECT First(tblRecords.assetID) AS [assetID Field], First(tblRecords.MaintDescr) AS [MaintDescr Field], Count(tblRecords.assetID) AS NumberOfDups
FROM tblRecords
GROUP BY tblRecords.assetID, tblRecords.MaintDescr
HAVING (((Count(Table1.ID))>1) AND ((Count(Table1.Desc))>1));

The key is the word "AND" in between each criteria so that all have to be true for that record to show in the query.
on second thought, why is the field "MaintDescr" on that table..
it should be in the Maintenance table..

you should have three tables to accomplish this
tblAsset          tblAssetMaint       tblMaintenance
assetID  1-- M     assetID
                           maintID m------1 maintID

see this demo

Set the relationship between two tables
Avatar of valmatic


My DB is set up that way.  AssetID is a join field

I figured out why I'm getting false hits.  It looks like my SQL Query is not looking at the entire MaintDescr text.  This is a free form text field.  IF SQL finds a duplicate in the first word then it calls the entire block a duplicate.

Example:  This is coming up as a duplicate record and it should not.
AssetID = 123     MaintDescr = Check oil level
AssetID = 123     MaintDescr = Check pressure valve

How would I change my SQL string to force a compare against the entire block of text vs. just the 1st word or first match?

In (SELECT [AssetID] FROM [Maintenance] As Tmp GROUP BY [AssetID],[MaintenanceDescription] HAVING Count(*)>1 AND [MaintenanceDescription] = [Maintenance].[MaintenanceDescription])
<My DB is set up that way.>

do you really have three tables for this relationship?

and don't use the field "[MaintenanceDescription]", try using the MaintID field.
I have a table called Asset that holds the equipment info
I have a table called maintenance that holds the maintenance info - relationship is 1-many
I also have various other tables that aren't worth mentioning in this post.

MaintID is a key field of the Maintenance table and duplicates are not allowed.
I need to find MaintDescr duplicates within a given AssetID. I'm only looking for duplicates within table.maintenance.

I think the issue is in the SQL in that it calls the record a duplicate if it finds a match within the text vs. a match of the entire block of text.   I'm hoping someone can help me tweak the SQL to compare the block of text - if it's possible.
upload a copy of your db.

Cap, I'm glad you had me do that..  I broke my db down so it was a little more managable and plugged in a few test entries.  Stock duplicate query works like it should and I only get hits as expected.  I then pulled some extra fields into my query;
qry test added: only find records where bogusrecord = false and where MaintDate = isnull...  

So in my test you can see that the query does not pull in the records where bogusrecord = true or where MaintDate is not null but those recorsd are still counted into the duplicate SQL string.  I'm guessing I need to add these extra tests into the SQL so it takes all tests into account?

can  you explain the rules that must be followed ..
If more than one record exists where both Asset ID and MaintDescr are duplicated.
and if BogusRecord = False
        and if isnull(MaintDate)
(BogusRecord is a Cancelled record)
(MaintDate where null is an open active record)
so, what is the problem with the records returned by the query?

MaintID 9 and 6 should not show in query.   Only duplicate records that should show are MaintID 7 and 1.  I only have 10 maintenance records and I'm getting 2 false positives (9 & 6).  Imagine this with 1000 maintenance records.  thanks
ok.. now state the reason why 9 and 6 should not show?

don't think that i am being dense here, i need to know the logic you want to implement.
Avatar of Rey Obrero (Capricorn1)
Rey Obrero (Capricorn1)
Flag of United States of America image

Link to home
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Hi Cap,  Sorry so long to respond.  That query is exactly what I was looking for.  I understand the logic behind it but why name the tables "T"?

To answer your question above,  I only want hits if duplicate records are found against open, active records.  My query was testing against closed and cancelled records so was finding a lot more duplicates than actual.  I guess you figured that out though.   thanks
<but why name the tables "T"?> that is just an alias of the table that is created in memory, you can name it anything you want.
thanks.  Worked great.