Find Duplicate Values in Excel using VBA

I am trying to find duplicate values in an excel sheet from multiple columns (via VBA / Macros)

It works for the most part, however, there are a few ambiguities when I am doing a comparison between numeric values. Numbers need to be compared literally rather than the actual value.

Attached is an excel sheet.

1.

Rows 2 and 4 are correctly identified as duplicates.

2.

Rows 6 and 9 are WRONGLY identified as duplicates.

3.

Rows 7 and 8 are CORRECTLY identified as NOT duplicates. When we have a some value in Column D (which may or may not contain any values)

4.

Rows 10 and 11 are WRONGLY identified as duplicates inspite of having unique values (atleast on display)
Any help is appreciated to resolve the inconsistency.
Duplicate.xlsm
nainilAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Martin LissOlder than dirtCommented:
Sorry but what are you comparing when you say something is or isn't a duplicate? And what do you mean when you say "compared literally rather than the actual value"?
0
Steven CarnahanNetwork ManagerCommented:
If I understand what you are looking at then it is a matter of the cell formation in Column H. Since it is formated as general then it considers 6 and 9 to be the same regardless of the .000 at the end of 9. The same is true of 10 and 11 with the .300 vs. .3000.

Is this what you are meaning?
0
nainilAuthor Commented:
Yes formatting is probably causing things to break.

.300 is different from .3000 (alteast visually). The same should hold true while comparing.
0
Determine the Perfect Price for Your IT Services

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden with our free interactive tool and use it to determine the right price for your IT services. Download your free eBook now!

Steven CarnahanNetwork ManagerCommented:
Yes it is visually different however numerically speaking it is the same.  I think the only way to get the distincition between the two is to alter the format in column H (I think that is what your formula/code is looking at) to text.
0
nainilAuthor Commented:
The problem is that people will paste the data into the columns from external sources. How can we enforce that the data in the column is set to TEXT?
0
Martin LissOlder than dirtCommented:
You can compare the Text values which will reflect what you see and not the underlying value

Range("H6").Text
0
Steven CarnahanNetwork ManagerCommented:
@MartinLiss

Thank you, for some reason I was drawing a blank on that one. I think because I was thinking about the compare of row 6 and row 9. In $H$9 it seems to have dropped the .00 so the comparison will still return true.
0
Martin LissOlder than dirtCommented:
I'm sorry to have to ask but do you mean that I helped or do you still have a problem?
0
nainilAuthor Commented:
I still have issues. My aim is to filter out the duplicate values and notify users. If I have all alphabets, it is working like a charm. However, numbers are creating a problem here due to the column formatting.
0
Martin LissOlder than dirtCommented:
If you use the Text value as I tried to show in my last post it will ignore the formatting.

So if A1 shows 1.2345 and A2 shows 1.23 only because of the number format (they are both 1.2345 behind the scenes)

If Range("A1").Text = Range("A2").Text Then

will be evaluated as False.
0
nainilAuthor Commented:
is it possible to work that out in my code please? I am a bit lost here...
0
Martin LissOlder than dirtCommented:
Sub ValidateDuplicateCodes()

Dim RowCount As Integer
Dim RowCounter As Integer
Dim DupRow As Variant
Dim DuplicateErrorDetails As Variant
Dim r As Range
Dim i As Long
Dim DuplicateErrorCount As Variant

DuplicateErrorCount = 0
DuplicateErrorDetails = ""

    ActiveSheet.Range("K4").ClearContents
    ActiveSheet.Range("K4").ClearComments
'MsgBox ("in ValidateDuplicateLabCodes")

RowCount = ActiveSheet.Range("B:B").End(xlDown).Row - 1
'MsgBox (RowCount)
   
'    For RowCounter = 2 To RowCount + 1
   
'        Range("A" & RowCounter).Interior.ColorIndex = WHITE
'        Range("I" & RowCounter).Value = ""
       
        '''' =COUNTIF($AA$2:$AA$65536,AA2)
        'Range("I" & RowCounter).Formula = "=COUNTIF($H$2:$H$65536,H" & RowCounter & ")"
       
        Set r = Range("H1").End(xlDown).Offset(0, 0)
       
        For RowCounter = 2 To r.Row
            Range("A" & RowCounter).Interior.ColorIndex = WHITE
            Range("I" & RowCounter).Value = ""
            For i = RowCounter + 1 To r.Row
                If Range("H" & RowCounter).Text = Range("H" & i).Text Then
                    DupRow = DupRow + 1
                End If
            Next
            If DupRow >= 1 Then
                DuplicateErrorCount = DuplicateErrorCount + 1
                Range("A" & RowCounter).Interior.ColorIndex = 31
                DuplicateErrorDetails = DuplicateErrorDetails & "Duplicate values found in row " & RowCounter & vbLf
                DupRow = 0
            End If
        Next
       ' DupRow = Range("I" & RowCounter).Value
'        MsgBox ("Value at AB" & RowCounter & " : " & DupRow)
'        If DupRow > 1 Then
'            DuplicateErrorCount = DuplicateErrorCount + 1
'            Range("A" & RowCounter).Interior.ColorIndex = 31
'            DuplicateErrorDetails = DuplicateErrorDetails & "Duplicate values found in row " & RowCounter & vbLf
'        End If
       
'        DupRow = 0
'        Range("I" & RowCounter).Value = ""
   
'    Next RowCounter
   
    If (DuplicateErrorDetails = "") Then
            DuplicateErrorDetails = "No Error"
    End If

    ActiveSheet.Range("K4").HorizontalAlignment = xlCenter
    ActiveSheet.Range("K4").Value = DuplicateErrorCount
    ActiveSheet.Range("K4").AddComment(DuplicateErrorDetails).Shape.TextFrame.AutoSize = True

DuplicateErrorCount = 0
DuplicateErrorDetails = ""


End Sub
0
nainilAuthor Commented:
Thanks, let me try this out.
0
nainilAuthor Commented:
It works great. Except for one small issue. If you can check the attached file, the entry in row 6 & 9 is termed as Duplicate when it should not.

Can you please help?
Duplicate.xlsm
0
Steven CarnahanNetwork ManagerCommented:
That is because cell $b$9 is formated as number (2 decimal) instead of text like the rest of column B so the .00 is automatically truncated when combined with $d$9 resulting in a match in $h$9 with $h$6.
0
Martin LissOlder than dirtCommented:
@nainikl: Is the formatting of $b$9 that pony10us pointed out always going to be different than the rest of the column?
0
nainilAuthor Commented:
There is a possibility that the data can be formatted differently. our vendors will paste data from multiple sources (which we have no control over). So, i am really looking for a better handling of inconsistencies.

If we can restrict pasting of data in a certain format, even better.
0
Martin LissOlder than dirtCommented:
Add this to the code

 How about adding


        Columns("B:B").Select
        Selection.NumberFormat = "@"

just above the following line?

        Set r = Range("H1").End(xlDown).Offset(0, 0)
0
nainilAuthor Commented:
I just realized, the column B will NOT ALWAYS be a number. It can be text or alpha - numeric as well. How can we control that?
0
Martin LissOlder than dirtCommented:
See my last post.
0
nainilAuthor Commented:
Great, that helps. Once last help.

Currently, we are highlighting / marking just the first occurrence of the duplicate values. Is it possible to include each and every duplicate value with the color?

There will be alteast 2000 rows which will need to be parsed. So, in that case, it will be helpful to see ANY AND ALL rows which are duplicated highlighted.
0
nainilAuthor Commented:
Any help please?
0
Martin LissOlder than dirtCommented:
Sub ValidateDuplicateCodes()

Dim RowCount As Integer
Dim RowCounter As Integer
Dim DupRow As Variant
Dim DuplicateErrorDetails As Variant
Dim r As Range
Dim i As Long
Dim DuplicateErrorCount As Variant

DuplicateErrorCount = 0
DuplicateErrorDetails = ""

ActiveSheet.Range("K4").ClearContents
ActiveSheet.Range("K4").ClearComments

RowCount = ActiveSheet.Range("B:B").End(xlDown).Row - 1
Columns("B:B").Select
Selection.NumberFormat = "@"
Set r = Range("H1").End(xlDown).Offset(0, 0)
For RowCounter = 2 To r.Row
    Range("A" & RowCounter).Interior.ColorIndex = WHITE
Next
For RowCounter = 2 To r.Row
    Range("I" & RowCounter).Value = ""
    For i = RowCounter + 1 To r.Row
        If Range("H" & RowCounter).Text = Range("H" & i).Text Then
            DupRow = DupRow + 1
            Range("A" & i).Interior.ColorIndex = 31
        End If
    Next
    If DupRow >= 1 Then
        DuplicateErrorCount = DuplicateErrorCount + 1
'        Range("A" & RowCounter).Interior.ColorIndex = 31
        DuplicateErrorDetails = DuplicateErrorDetails & "Duplicate values found in row " & RowCounter & vbLf
        DupRow = 0
    End If
Next
   
If (DuplicateErrorDetails = "") Then
        DuplicateErrorDetails = "No Error"
End If

ActiveSheet.Range("K4").HorizontalAlignment = xlCenter
ActiveSheet.Range("K4").Value = DuplicateErrorCount
ActiveSheet.Range("K4").AddComment(DuplicateErrorDetails).Shape.TextFrame.AutoSize = True

DuplicateErrorCount = 0
DuplicateErrorDetails = ""


End Sub
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft Excel

From novice to tech pro — start learning today.