VBA to get rid of duplicates - Excel 2003

greg_c
greg_c used Ask the Experts™
on
Hi

I was wondering if anyone would anyone know of any VBA code that would get rid of duplicates.  For example, in Column D, I want to keep the first instance of the ID, and get delete the duplicate(s).

I have attached the a file with data.

I am using Excel 2003.  



Regards

Greg
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Most Valuable Expert 2012
Top Expert 2012

Commented:
There is no attachment.

However, you can do this without VBA.

Put this in column E (assuming header on row one, start with E2):

[E2]=IF(COUNTIF($D$2:$D2,$D2)>1,1,0)

and copy down.

You can then filter the range on column E for 1, and delete those rows.

Dave
Most Valuable Expert 2012
Top Expert 2012
Commented:
If you want to remove duplicates in a column, including blank cells, we can enhance the formula above with:
=IF(COUNTIF($D$2:$D2,IF($D2="","<>",$D2))>1,1,0)

For a single-column VBA remove duplicate routine (including duplicate blank cells), starting in row 2 (this can be modified) and checking all subsequent rows, you can use:

Approach 1 - iterate through the range, find duplicates using dictionary (as a dictionary can only hold unique values) and delete the row if the value already exists.

Option Explicit

Sub RemoveDuplicates()
Dim wkb As Workbook
Dim wks As Worksheet
Dim lastRow As Long
Dim myDict As Object
Dim i As Long

    Set wkb = ThisWorkbook
    Set wks = ActiveSheet
    
    Set myDict = CreateObject("Scripting.Dictionary")
    myDict.CompareMode = vbTextCompare 'not case sensitive. Change to vbBinaryCompare for case sensitivity
    
    lastRow = wks.Range("D" & Rows.Count).End(xlUp).Row

    For i = lastRow To 2 Step -1 'change lastRow to 1 Step -1 to include row 1 in the evaluation
        If myDict.Exists(wks.Range("D" & i).Value) Then
            wks.Range("D" & i).EntireRow.Delete
        Else
            myDict.Add wks.Range("D" & i).Value, Nothing
        End If
    Next i

    myDict.RemoveAll
    Set myDict = Nothing
End Sub

Open in new window


See attached removeDupes r1.xls
-------------------------------------------------------------------
Approach 2 - use the formula that was teed up in a separate column, then use autofilter to filter the rows that need to be deleted, then delete them.  This could be very fast, in comparison with the prior approach, as it doesn't require looping through the dataset.

Caveats - uses the data filter, so eliminates any that currently exist.  As this approach depends on identifying VISIBLE rows via the data filter that is created, it currently doesn't work with collapsed outlines via grouping or Excel 2007+ filtered tables (though code to expand groups and show all data on all table in the sheet could be added.)  Simple is better, but I can give you the code to handle these last two situations if you like this solution and might have outline/grouping or tables which are currently hiding rows that need to be un-hidden during the remove duplicates process.

Option Explicit

Sub RemoveDuplicatesUsingFormula()
Dim wkb As Workbook
Dim wks As Worksheet
Dim lastRow As Long
Dim rngChk As Range
Dim rng As Range
Dim strCol As String
Dim strTmp As String

    strCol = "D" 'column to test for duplicates
    strTmp = "Z" 'temporary column for formula
    
    Set wkb = ThisWorkbook
    Set wks = ActiveSheet
    
    lastRow = wks.Range(strCol & Rows.Count).End(xlUp).Row
    Set rng = wks.Range(strCol & "2:" & strCol & lastRow)
    
    wks.Columns(strTmp).Insert 'insert temporary column - comment this out if sheet is protected, or you can specify a temporary column having no data
    
    Set rngChk = wks.Range(strTmp & "2:" & strTmp & lastRow)
    
    '=IF(COUNTIF($D$2:$D2,IF($D2="","<>",$D2))>1,1,0) 'include blank cells in the duplicate elimination process
    rngChk.Formula = "=IF(COUNTIF($" & strCol & "$2:$" & strCol & "2,IF($" & strCol & "2="""",""<>"",$" & strCol & "2))>1,1,0)"
    
    If wks.AutoFilterMode Then
        wks.AutoFilterMode = False
    End If
    
    rngChk.AutoFilter Field:=1, Criteria1:="1"
    
    rngChk.SpecialCells(xlCellTypeVisible).EntireRow.Delete
    
    rngChk.EntireColumn.Delete 'comment this out if not using the insert column, approach - see above comment
End Sub

Open in new window


See attached removeDupes r2.xls

Cheers,

Dave
removeDupes-r1.xls
removeDupes-r2.xls

Author

Commented:
Perfect, thank you very much.

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial