Access and duplicate records

Hi, let me start with the table and sample data.

Fld1     Fld2          Fld3
False    Group1        123
True     Group1        123
False    Group2        456
True     Group2        456
False    Group3        789

Open in new window

There is no PK. I can't change the structure of this table in any way, because it's linked via ODBC to a proprietary software. I am aware that this table's structure is lousy. The number of records, etc. changes dynamically because the proprietary software dumps records into it.

I need to eliminate the "false" records when a duplicate exists in Fld2 AND Fld3, but if a field is "false" and has no duplicate, then it must not be eliminated.

I have been going round and round with SQL subqueries and I can't get it done. Maybe some way using Design View in Access? Either way.
Who is Participating?
IrogSintaConnect With a Mentor Commented:
I don't know how to do it via queries but if you don't mind some coding, this seems to work:
    Dim rst As DAO.Recordset
    Dim strTemp As String, strSQL As String
    Dim hasDups As Boolean
    Set rst = CurrentDb.OpenRecordset("TableName")
    Do While Not rst.EOF
        If rst!Fld1 <> "xDeletex" Then
            strTemp = rst!Fld1
            hasDups = DCount("*", "TableName", "Fld1 = 'True' And Fld2 = '" & rst!Fld2 & "' AND Fld3 = " & rst!Fld3) > 0
            If (strTemp = "false" And Not hasDups) Or strTemp = "true" Then
                'Mark all matching records for deletion
                strSQL = "Update TableName Set Fld1 = 'xDeletex' Where Fld2 = '" & rst!Fld2 & "' AND Fld3 = " & rst!Fld3
                CurrentDb.Execute strSQL, dbFailOnError
                rst!Fld1 = strTemp  'unmark current record
            End If
        End If
    Set rst = Nothing
    'Delete duplicates
    CurrentDb.Execute "Delete * From TableName Where Fld1='xDeletex'"

Open in new window

milhouse537Author Commented:
To clarify, I realized that in the sample data Fld3 seems to be correlated with Fld2. But in "real life" Fld2 and Fld3 are not matched to each other. They could be different.
Jim Dettman (Microsoft MVP/ EE MVE)President / OwnerCommented:
<<I need to eliminate the "false" records when a duplicate exists in Fld2 AND Fld3, but if a field is "false" and has no duplicate, then it must not be eliminated.>>

 Let me make sure I got that right; if there are duplicates, you want to delete ALL the records with false with fields 2 & 3 being the same?

 or do you want to leave one false record per field 2&3 combination if one or more exists?

Free Tool: Port Scanner

Check which ports are open to the outside world. Helps make sure that your firewall rules are working as intended.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

milhouse537Author Commented:
Essentially, if there is a "true" duplicate then I need to keep the "true" record and not the "false" record(s). If there is multiple "false" records, I need to get down to one "false".

The match has to be on 2&3 together.

Hope that made sense.
Is data limited to at most 1 true and 1 false?
milhouse537Author Commented:
It isn't, and that's been part of my problem.

I thought about creating a query of all the trues, then searching the falses for duplicates and somehow deleting all but one false ... ? Couldn't figure out how to do it.
Jim Dettman (Microsoft MVP/ EE MVE)President / OwnerCommented:
<<I thought about creating a query of all the trues, then searching the falses for duplicates and somehow deleting all but one false ... ? Couldn't figure out how to do it. >>

 Right, so:

a.  delete all false records where 2 & 3 are the same, and there is at least one true record.

b. reduce any remaining false records where 2 & 3 are the same to one false record.


milhouse537Author Commented:
Yes, I believe that's it!
I think you need an identity column for this to work in SQL. In your copy of the table, add a column ID, auto-number, as primary key. Then:

SELECT A.* FROM SampleTable A
  Select Top 1 B.ID
  From SampleTable B
  Where B.Fld2=A.Fld2 And B.Fld3=A.Fld3
  Order By B.Fld1, B.ID
If that works, turn it into a make-table query. It should also be possible to create a delete query using the same sort of sub-query, but I found that using TOP subqueries tends to make the entire query non-updatable, so that the DELETE no longer works, even with the DISTINCTROW modifier. Make-table queries work even from non-updatable queries.

Please check the code oriented comments.
I am trying to use queries. But no success.

What are you using, as supploed in topics under question body, Access or MSSQL or both?
milhouse537Author Commented:
Code is no problem! It's easier for me than the Design View. I'll hopefully be in the office a bit later today and I'm going to try the suggested code. Hopefully a primary key isn't needed - I can't add one; the database structure is mandated by this blasted proprietary software.
Jim Dettman (Microsoft MVP/ EE MVE)President / OwnerCommented:
OK.  Let's start with 'a'.  

DELETE Table1.*, Table1.Field1
FROM Table1
WHERE Table1.Field1=False AND Exists (Select * From table1 as b where table1.field2 = b.field2 and table1.field3 = b.field3 and b.field1 = True);

 Access will reformat that slightly if you switch to design view.  This uses a sub query to check if for the current 2 & 3 field combination, there are records existing with a True in Field 1

 As for B, there's no easy solution in SQL and the fact that there is no PK limits you to one:

1. Add records to a temp table using SELECT DISTINCT so you end up with only one record
2. Delete all the False records from the main table.
3. Insert records from the temp table to the main one.

Check this:
table a(f1 int, f2 int, adesc string)

run form, click button:
click event of button: create table tmpTable with seq field, restarts seq from 1 when f1, f2 and adesc change.

execute delete query.
milhouse537Author Commented:
The VBA coding worked great, thanks. I actually had to tweak it, since I neglected to mention that in "real life" Fld1 is actually Boolean, and Fld3 is actually a GUID (replication ID). Fld2 is the only plain string.

I ended up making StrTemp point to Fld2 and changing that to 'xDeletex' as needed. But basically it was the same.

This was great, I learned a way to search for duplicates in a recordset using DCount. Thanks again!
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.