Link to home
Start Free TrialLog in
Avatar of PDSWSS
PDSWSS

asked on

How would I identify the duplicates in an excel spreadsheet as described below?

I have many excel spreadsheets that contain many rows where the same subject_ID is listed multiple times with the same date.
  I need to be able to identify the duplicated subject IDs (column B) with identical dates in column D1.
Please see attached example. I have highlighted the rows that I would need to identify.
Code that would result in the highlighting of the rows as described above would be optimal. Thanks
EE_sample.xlsx
Avatar of Carol Chisholm
Carol Chisholm
Flag of Switzerland image

Move columns B and E so they are next to each other (you can move back afterwards).
Select the two columns together
Use the Data - Remove Duplicates function
Sorry did not read! Please ignore.
Make a new column combining your two fields (=B2&E2)
Use Conditional formatting on that column.
Please find the attached workbook where I have created a helper column O (you may hide that column) and then used conditional formatting to highlight the duplicate rows.
EE_sample.xlsx
Try this code

Option Explicit

Sub HighlightDuplicates()
    On Error Resume Next
    Dim c As Range, rng As Range, myRange As Range
    Dim r As Long

    Set rng = Range("A1").CurrentRegion.Offset(1)
    'add helper column

    Range("F2").FormulaR1C1 = "=RC[-4]&RC[-1]"
    Range("F2").AutoFill Destination:=Range("F2" & ":F" & rng.Rows.Count)
    Set rng = Range("F2" & ":F" & rng.Rows.Count)

    For Each c In rng.Cells

        If Application.CountIf(rng, c) > 1 Then
            If myRange Is Nothing Then
                Set myRange = c
            Else
                Set myRange = Union(myRange, c)
            End If
        End If
        r = r + 1
    Next c

    myRange.Resize(, -1).Interior.Color = vbYellow
    rng.Clear
    MsgBox r & " duplicates found", vbInformation, "Duplicate search"
    On Error GoTo 0
End Sub

Open in new window

@Roy
Disadvantage of a code to highlight the duplicates over the conditional formatting is, the highlighted rows will not change automatically if the duplicates are removed and you will be required to run the code again to identify the new duplicates after the change has been made.

And with conditional formatting, if you remove/change the cell contents, the conditional formatting will identify the new duplicates automatically.

What do you say?
The OP asked for code to identify the duplicates so I that what I provided. I would generally recommend non VBA solutions  if available.

With the code it would be a simple step to delete the duplicates if required.

Option Explicit

Sub SelectDuplicates()
    On Error Resume Next
    Dim c As Range, rng As Range, myRange As Range
    Dim r As Long

    Set rng = Range("A1").CurrentRegion.Offset(1)
    'add helper column

    Range("F2").FormulaR1C1 = "=RC[-4]&RC[-1]"
    Range("F2").AutoFill Destination:=Range("F2" & ":F" & rng.Rows.Count)
    Set rng = Range("F2" & ":F" & rng.Rows.Count)

    For Each c In rng.Cells

        If Application.CountIf(rng, c) > 1 Then
            If myRange Is Nothing Then
                Set myRange = c
            Else
                Set myRange = Union(myRange, c)
            End If
        End If
        r = r + 1
    Next c

    myRange.Resize(, -1).Interior.Color = vbYellow
    rng.Clear
    Select Case MsgBox(r & " duplicates found. Would you like to remove them?", _
                       vbYesNo Or vbExclamation Or vbDefaultButton1, "Duplicate Search")

    Case vbYes
        myRange.EntireRow.Delete
    End Select
    On Error GoTo 0
End Sub

Open in new window

Well I still think that the conditional formatting is best way to identify the duplicates.
Avatar of Biopsych
Biopsych

Roy_Cox
Thanks for the code. Unfortunately when I run your code I get  the message 62 duplicates found, which
is not very helpful. Did it work as expected on your end?
sktneer,

Thanks for your solution.  I should have made it clearer in my post that I already knew how to identify duplicates using conditional formatting.  I thought running code would simplify the process as
I could avoid the step of manually concatenating two columns and then running conditional formatting.
However, the code provided by Roy_Cox did not produce the expected results so in this case the method provided by you and Carol Chisholm although not what I had actually asked for would be the best answers. Before awarding points I want to see if Roy_Cox will send me code that will produce the expected results. Thanks
ASKER CERTIFIED SOLUTION
Avatar of Subodh Tiwari (Neeraj)
Subodh Tiwari (Neeraj)
Flag of India image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of PDSWSS

ASKER

skinner

Thanks. Thats exactly what I am looking for. Good job.
How do you link the highlight duplicates button on the excel sheet to the code?
Avatar of PDSWSS

ASKER

sktneer

Sorry about the wrong name - spell check auto corrected to skinner.
Glad that it worked as per your expectations.
Never mind about the misspelled name. :)

Have that code on a Standard Module like Module1 and then on the sheet, go to Developer Tab --> Insert --> Insert a Form Control CommandButton (Not ActiveX CommandBotton) --> After you insert the button on the sheet, you will be prompted to assign a macro to the button and since the macro already exists on Module1, select the macro from the available macros and click on OK. Once macro is assigned to the button, right click the button and choose Edit Text and you can place the desired Text on the button.

Hope this helps.
Once you have the button on the sheet, you may assign any macro to this button again by right clicking the button and by choosing Assign Macro.
Avatar of PDSWSS

ASKER

sktneer

Thanks for sticking with this and providing an excellent solution.
You're welcome mate.
Avatar of PDSWSS

ASKER

sktneer  Chose best solution but but website does not seem to processing as usual. Thanks,
Avatar of PDSWSS

ASKER

sktneer          Did you get 500 points?
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
No not yet.
Are you having any trouble accepting the answer. If so, you may try it later.
Avatar of PDSWSS

ASKER

Roy_Cox
Thanks for the update.
I had awarded the points to sktneer for his code that worked. However, lucky for you the website did not respond to my attempt to award the points. Funny that there would be a bug in the EE website. In 8 years of using this site, first  time that has happened. When I get a chance I will check your code and if it works I will award points to both of you and hopefully the site will respond.  Thanks
Avatar of PDSWSS

ASKER

Thanks sktneer and Roy_Cox for your code.

I prefer sktneer's code as it highlights the entire duplicate rows making it easier for me to work with also provided a button on the spreadsheet to run the code.
Roy_Cox's code highlights only the dates. Either one would address this issue but  just as I said the former
is a bit more user friendly.  In this case, I think 300 points to sktneer and 200 points to Roy_Cox seems fair.
Thank again
You're welcome.  Glad to help.
Glad to help