Find duplicate records in Access table

valmatic
valmatic used Ask the Experts™
on
Hi.  I have a database that stores information about equipment, maintenance and costing for my business.  Users entered duplicate maintenance records for various equipment by mistake and now want to clean it up.  Maintenance table consists of assetID, MaintID, MaintDescr, ReqDate, DueDate, etc...   I need to find any records where both the assetID and MaintDescr fields are the same.  I tried the built in duplicate query function in access2010 and it returned a lot of false positives.  I tried manipulating the SQL command with my limited knowledge of SQL and had no luck.  Any ideas how I could find al of these duplicate records?  thanks
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Top Expert 2016

Commented:
what are the contents of the field "MaintDescr" ? are these entered using free text entry?

have you tried using the assetID and MaintID?


better if you can upload a copy of the db

Commented:
The built-in query should work.  Here's the SQL text:

SELECT First(tblRecords.assetID) AS [assetID Field], First(tblRecords.MaintDescr) AS [MaintDescr Field], Count(tblRecords.assetID) AS NumberOfDups
FROM tblRecords
GROUP BY tblRecords.assetID, tblRecords.MaintDescr
HAVING (((Count(Table1.ID))>1) AND ((Count(Table1.Desc))>1));

The key is the word "AND" in between each criteria so that all have to be true for that record to show in the query.
Top Expert 2016

Commented:
on second thought, why is the field "MaintDescr" on that table..
it should be in the Maintenance table..

you should have three tables to accomplish this
tblAsset          tblAssetMaint       tblMaintenance
assetID  1-- M     assetID
                           maintID m------1 maintID
                                                     MaintDescr


see this demo

Set the relationship between two tables
http://office.microsoft.com/en-us/access-help/demo-set-the-relationship-between-two-tables-HA010254901.aspx?pid=CH100739911033
Success in ‘20 With a Profitable Pricing Strategy

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden using our free interactive tool and use it to determine the right price for your IT services. Start calculating Now!

Author

Commented:
My DB is set up that way.  AssetID is a join field

I figured out why I'm getting false hits.  It looks like my SQL Query is not looking at the entire MaintDescr text.  This is a free form text field.  IF SQL finds a duplicate in the first word then it calls the entire block a duplicate.

Example:  This is coming up as a duplicate record and it should not.
AssetID = 123     MaintDescr = Check oil level
AssetID = 123     MaintDescr = Check pressure valve

How would I change my SQL string to force a compare against the entire block of text vs. just the 1st word or first match?

SQL
In (SELECT [AssetID] FROM [Maintenance] As Tmp GROUP BY [AssetID],[MaintenanceDescription] HAVING Count(*)>1 AND [MaintenanceDescription] = [Maintenance].[MaintenanceDescription])
Top Expert 2016

Commented:
<My DB is set up that way.>

do you really have three tables for this relationship?

and don't use the field "[MaintenanceDescription]", try using the MaintID field.

Author

Commented:
I have a table called Asset that holds the equipment info
I have a table called maintenance that holds the maintenance info - relationship is 1-many
I also have various other tables that aren't worth mentioning in this post.

MaintID is a key field of the Maintenance table and duplicates are not allowed.
I need to find MaintDescr duplicates within a given AssetID. I'm only looking for duplicates within table.maintenance.

I think the issue is in the SQL in that it calls the record a duplicate if it finds a match within the text vs. a match of the entire block of text.   I'm hoping someone can help me tweak the SQL to compare the block of text - if it's possible.
thanks
Top Expert 2016

Commented:
upload a copy of your db.

Author

Commented:
PMLogCopy.accdb

Cap, I'm glad you had me do that..  I broke my db down so it was a little more managable and plugged in a few test entries.  Stock duplicate query works like it should and I only get hits as expected.  I then pulled some extra fields into my query;
qry test added: only find records where bogusrecord = false and where MaintDate = isnull...  

So in my test you can see that the query does not pull in the records where bogusrecord = true or where MaintDate is not null but those recorsd are still counted into the duplicate SQL string.  I'm guessing I need to add these extra tests into the SQL so it takes all tests into account?

thanks
Top Expert 2016

Commented:
can  you explain the rules that must be followed ..

Author

Commented:
If more than one record exists where both Asset ID and MaintDescr are duplicated.
and if BogusRecord = False
        and if isnull(MaintDate)
________
(BogusRecord is a Cancelled record)
(MaintDate where null is an open active record)
Top Expert 2016

Commented:
so, what is the problem with the records returned by the query?

explain..

Author

Commented:
MaintID 9 and 6 should not show in query.   Only duplicate records that should show are MaintID 7 and 1.  I only have 10 maintenance records and I'm getting 2 false positives (9 & 6).  Imagine this with 1000 maintenance records.  thanks
Top Expert 2016

Commented:
ok.. now state the reason why 9 and 6 should not show?

don't think that i am being dense here, i need to know the logic you want to implement.
Top Expert 2016
Commented:
test this query

SELECT T.[AssetID], T.[MaintenanceDescription], T.[MaintenanceID], T.[MaintenanceDate], T.[BogusRecord]
FROM
(
SELECT Maintenance.AssetID, Maintenance.MaintenanceID, Maintenance.MaintenanceDescription, Maintenance.MaintenanceDate, Maintenance.BogusRecord
FROM Maintenance
WHERE Maintenance.MaintenanceDate Is Null AND Maintenance.BogusRecord=False
) as T
WHERE T.[AssetID] In
      (SELECT [AssetID]
            FROM
            (
            SELECT Maintenance.AssetID, Maintenance.MaintenanceID, Maintenance.MaintenanceDescription, Maintenance.MaintenanceDate,
            Maintenance.BogusRecord            
            FROM Maintenance
            WHERE Maintenance.MaintenanceDate Is Null AND Maintenance.BogusRecord=False
            ) As Tmp GROUP BY [AssetID],[MaintenanceDescription] HAVING Count(*)>1  And [MaintenanceDescription] = [T].[MaintenanceDescription])
ORDER BY T.[AssetID], T.[MaintenanceDescription];

Author

Commented:
Hi Cap,  Sorry so long to respond.  That query is exactly what I was looking for.  I understand the logic behind it but why name the tables "T"?

To answer your question above,  I only want hits if duplicate records are found against open, active records.  My query was testing against closed and cancelled records so was finding a lot more duplicates than actual.  I guess you figured that out though.   thanks
Top Expert 2016

Commented:
<but why name the tables "T"?> that is just an alias of the table that is created in memory, you can name it anything you want.

Author

Commented:
thanks.  Worked great.

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial