Access and duplicate records

Posted on 2012-08-11
Medium Priority
Last Modified: 2012-08-13
Hi, let me start with the table and sample data.

Fld1     Fld2          Fld3
False    Group1        123
True     Group1        123
False    Group2        456
True     Group2        456
False    Group3        789

Open in new window

There is no PK. I can't change the structure of this table in any way, because it's linked via ODBC to a proprietary software. I am aware that this table's structure is lousy. The number of records, etc. changes dynamically because the proprietary software dumps records into it.

I need to eliminate the "false" records when a duplicate exists in Fld2 AND Fld3, but if a field is "false" and has no duplicate, then it must not be eliminated.

I have been going round and round with SQL subqueries and I can't get it done. Maybe some way using Design View in Access? Either way.
Question by:milhouse537
  • 6
  • 3
  • 3
  • +2

Author Comment

ID: 38283835
To clarify, I realized that in the sample data Fld3 seems to be correlated with Fld2. But in "real life" Fld2 and Fld3 are not matched to each other. They could be different.
LVL 58
ID: 38283845
<<I need to eliminate the "false" records when a duplicate exists in Fld2 AND Fld3, but if a field is "false" and has no duplicate, then it must not be eliminated.>>

 Let me make sure I got that right; if there are duplicates, you want to delete ALL the records with false with fields 2 & 3 being the same?

 or do you want to leave one false record per field 2&3 combination if one or more exists?


Author Comment

ID: 38283869
Essentially, if there is a "true" duplicate then I need to keep the "true" record and not the "false" record(s). If there is multiple "false" records, I need to get down to one "false".

The match has to be on 2&3 together.

Hope that made sense.
Get quick recovery of individual SharePoint items

Free tool – Veeam Explorer for Microsoft SharePoint, enables fast, easy restores of SharePoint sites, documents, libraries and lists — all with no agents to manage and no additional licenses to buy.

LVL 31

Expert Comment

ID: 38284058
Is data limited to at most 1 true and 1 false?

Author Comment

ID: 38284080
It isn't, and that's been part of my problem.

I thought about creating a query of all the trues, then searching the falses for duplicates and somehow deleting all but one false ... ? Couldn't figure out how to do it.
LVL 58
ID: 38284096
<<I thought about creating a query of all the trues, then searching the falses for duplicates and somehow deleting all but one false ... ? Couldn't figure out how to do it. >>

 Right, so:

a.  delete all false records where 2 & 3 are the same, and there is at least one true record.

b. reduce any remaining false records where 2 & 3 are the same to one false record.



Author Comment

ID: 38284132
Yes, I believe that's it!
LVL 29

Accepted Solution

IrogSinta earned 2000 total points
ID: 38284360
I don't know how to do it via queries but if you don't mind some coding, this seems to work:
    Dim rst As DAO.Recordset
    Dim strTemp As String, strSQL As String
    Dim hasDups As Boolean
    Set rst = CurrentDb.OpenRecordset("TableName")
    Do While Not rst.EOF
        If rst!Fld1 <> "xDeletex" Then
            strTemp = rst!Fld1
            hasDups = DCount("*", "TableName", "Fld1 = 'True' And Fld2 = '" & rst!Fld2 & "' AND Fld3 = " & rst!Fld3) > 0
            If (strTemp = "false" And Not hasDups) Or strTemp = "true" Then
                'Mark all matching records for deletion
                strSQL = "Update TableName Set Fld1 = 'xDeletex' Where Fld2 = '" & rst!Fld2 & "' AND Fld3 = " & rst!Fld3
                CurrentDb.Execute strSQL, dbFailOnError
                rst!Fld1 = strTemp  'unmark current record
            End If
        End If
    Set rst = Nothing
    'Delete duplicates
    CurrentDb.Execute "Delete * From TableName Where Fld1='xDeletex'"

Open in new window

LVL 58

Expert Comment

ID: 38284512
I think you need an identity column for this to work in SQL. In your copy of the table, add a column ID, auto-number, as primary key. Then:

SELECT A.* FROM SampleTable A
  Select Top 1 B.ID
  From SampleTable B
  Where B.Fld2=A.Fld2 And B.Fld3=A.Fld3
  Order By B.Fld1, B.ID
If that works, turn it into a make-table query. It should also be possible to create a delete query using the same sort of sub-query, but I found that using TOP subqueries tends to make the entire query non-updatable, so that the DELETE no longer works, even with the DISTINCTROW modifier. Make-table queries work even from non-updatable queries.

LVL 31

Expert Comment

ID: 38284675
Please check the code oriented comments.
I am trying to use queries. But no success.

What are you using, as supploed in topics under question body, Access or MSSQL or both?

Author Comment

ID: 38285450
Code is no problem! It's easier for me than the Design View. I'll hopefully be in the office a bit later today and I'm going to try the suggested code. Hopefully a primary key isn't needed - I can't add one; the database structure is mandated by this blasted proprietary software.
LVL 58
ID: 38285567
OK.  Let's start with 'a'.  

DELETE Table1.*, Table1.Field1
FROM Table1
WHERE Table1.Field1=False AND Exists (Select * From table1 as b where table1.field2 = b.field2 and table1.field3 = b.field3 and b.field1 = True);

 Access will reformat that slightly if you switch to design view.  This uses a sub query to check if for the current 2 & 3 field combination, there are records existing with a True in Field 1

 As for B, there's no easy solution in SQL and the fact that there is no PK limits you to one:

1. Add records to a temp table using SELECT DISTINCT so you end up with only one record
2. Delete all the False records from the main table.
3. Insert records from the temp table to the main one.

LVL 31

Expert Comment

ID: 38286252
Check this:
table a(f1 int, f2 int, adesc string)

run form, click button:
click event of button: create table tmpTable with seq field, restarts seq from 1 when f1, f2 and adesc change.

execute delete query.

Author Closing Comment

ID: 38287476
The VBA coding worked great, thanks. I actually had to tweak it, since I neglected to mention that in "real life" Fld1 is actually Boolean, and Fld3 is actually a GUID (replication ID). Fld2 is the only plain string.

I ended up making StrTemp point to Fld2 and changing that to 'xDeletex' as needed. But basically it was the same.

This was great, I learned a way to search for duplicates in a recordset using DCount. Thanks again!

Featured Post

Free Tool: Path Explorer

An intuitive utility to help find the CSS path to UI elements on a webpage. These paths are used frequently in a variety of front-end development and QA automation tasks.

One of a set of tools we're offering as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

One of the most important things in an application is the query performance. This article intends to give you good tips to improve the performance of your queries.
Recursive SQL is one of the most fascinating and powerful and yet dangerous feature offered in many modern databases today using a Common Table Expression (CTE) first introduced in the ANSI SQL 99 standard. The first implementations of CTE began ap…
In Microsoft Access, learn how to “cascade” or have the displayed data of one combo control depend upon what’s entered in another. Base the dependent combo on a query for its row source: Add a reference to the first combo on the form as criteria i…
Visualize your data even better in Access queries. Given a date and a value, this lesson shows how to compare that value with the previous value, calculate the difference, and display a circle if the value is the same, an up triangle if it increased…

850 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question