Using Excel VBA to delete duplicate rows of data

Posted on 2005-04-28
Last Modified: 2008-01-09
I am trying to create a module that will allow me to delete duplicate rows of data.  The code that I have is below, but it seems that the for-next statement never finishes.  Any assistance anyone can give would be greatly appreciated.

Option Explicit
Sub DeleteDuplicates()
'   Variable declaration
    Dim lngRow As Long, lngColumn As Long
    Dim lngEnd As Long
'   Turn off screen updating
    Application.ScreenUpdating = False
'   Set error handling
    On Error GoTo canceled:

'   Set begin and end row to same value
    lngRow = 2
    lngEnd = lngRow

'   Determine the range of used cells
    Do While Cells((lngEnd + 1), 1) <> ""
        lngEnd = lngEnd + 1
'   Ask the user what column should be tested for duplicate values
    lngColumn = InputBox("Which column should be" & vbNewLine & _
        "checked for duplicates?", "Not so fast...")
'   Begin the process of removing duplicates
    For lngRow = 2 To lngEnd
        If Cells(lngRow, lngColumn) = Cells(lngRow - 1, lngColumn) Then
            lngRow = lngRow - 1
        End If
    Next lngRow

'   Turn on screen updating
    Application.ScreenUpdating = True

Exit Sub
Application.ScreenUpdating = True
MsgBox "Operation canceled by user.", vbInformation Or vbOKOnly, "Canceled"

End Sub




Question by:heath9916
    LVL 35

    Accepted Solution

    Hi Heath,

    I changed the method of checking for the duplicates, but you'll find this runs much faster and more efficient.  Please let me know if you have any questions!

    Option Explicit
    Sub DeleteDuplicates()
    'Variable declaration
     Dim StartRow As Long, CheckRG As Range, ColToCheck As Range
     Dim UniqCells() As String, UniqCount As Long, i As Long, DelRG As Range
    'Set row to start searching for duplicates
     StartRow = 2

    'Determine the range of used cells
     Set CheckRG = Intersect(Rows(StartRow & ":65536"), ActiveSheet.UsedRange)
     If CheckRG Is Nothing Then Exit Sub 'Nothing to check
    'Ask the user what column should be tested for duplicate values
      Set ColToCheck = Application.InputBox(prompt:="Which column should be" & vbNewLine _
       & "checked for duplicates?", Title:="Please select single column", Type:=8)
     Loop Until ColToCheck.Columns.Count = 1
    'Begin the process of removing duplicates by loading unique values into string array
    ' and setting the duplicate rows into a range to delete later on
     UniqCount = 0
     ReDim UniqCells(0)
     For i = 1 To CheckRG.Rows.Count
      If Not InSArray(UniqCells, Intersect(CheckRG.Rows(i).EntireRow, ColToCheck).Text) Then
       ReDim Preserve UniqCells(UniqCount)
       UniqCells(UniqCount) = Intersect(CheckRG.Rows(i).EntireRow, ColToCheck).Text
       UniqCount = UniqCount + 1
       If DelRG Is Nothing Then
        Set DelRG = CheckRG.Rows(i).EntireRow
        Set DelRG = Union(DelRG, CheckRG.Rows(i).EntireRow)
       End If
      End If
     Next i
    'Delete duplicate rows
     If Not DelRG Is Nothing Then 'If there are duplicate cells...
      Application.ScreenUpdating = False 'Turn off screen updating
      DelRG.Delete 'Delete duplicate range
      Application.ScreenUpdating = True 'Turn on screen updating
     End If

    End Sub

    Function InSArray(ByRef vArray() As String, ByVal vValue As String) As Boolean
     Dim i As Long
     For i = LBound(vArray) To UBound(vArray)
      If vArray(i) = vValue Then
       InSArray = True
       Exit Function
      End If
     Next i
     InSArray = False
    End Function

    You could also look at "The Duplicate Master", a free excel add-in created by EE's own brettdj:

    LVL 10

    Expert Comment

    You could make use of the AutoFilter and a formula:

    OutCol =  InputBox("Which column should be" & vbNewLine & "used for the formula?", "Not so fast...","B")
    ChkCol = InputBox("Which column should be" & vbNewLine & "checked for duplicates?", "Not so fast...","A")

    Do While Cells((lngEnd + 1), 1) <> ""
       lngEnd = lngEnd + 1

    Range(OutCol & "2").Formula = "=IF(" & ChkCol & "2=" & ChkCol & "1,""Y"",""N"")"
    Range(OutCol & "2").Select
    Selection.AutoFill Destination:=Range(OutCol & "2:" & OutCol & lngEnd ), Type:=xlFillDefault
    Selection.AutoFilter Field:=Range(OutCol & "2").Column, Criteria1:="Y"
    Rows("2:" & lngEnd).Select
    Selection.Delete Shift:=xlUp
    LVL 9

    Expert Comment

    LastRow is a function that will return the last row number of the last row on the worksheet that contains any data (including spaces).

    Using your code, with a slight change...
    lngend = lastrow(activesheet)

    '   Begin the process of removing duplicates
        For lngRow = lngend to 2 step -1
            If Cells(lngRow-1, lngColumn) = Cells(lngRow, lngColumn) Then
            End If
        Next lngRow

    Function LastRow(ws As Worksheet) As Single

        'uses worksheet object
        'returns last used row
        On Error Resume Next
        With ws
          LastRow = .Cells.Find(What:="*", _
            SearchDirection:=xlPrevious, _
        End With
    End Function

    Featured Post

    Free Trending Threat Insights Every Day

    Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

    Join & Write a Comment

    Introduction I needed to skip over some file processing within a For...Next loop in some old production code and wished that VB (classic) had a statement that would drop down to the end of the current iteration, bypassing the statements that were c…
    When trying to find the cause of a problem in VBA or VB6 it's often valuable to know what procedures were executed prior to the error. You can use the Call Stack for that but it is often inadequate because it may show procedures you aren't intereste…
    As developers, we are not limited to the functions provided by the VBA language. In addition, we can call the functions that are part of the Windows operating system. These functions are part of the Windows API (Application Programming Interface). U…
    This lesson covers basic error handling code in Microsoft Excel using VBA. This is the first lesson in a 3-part series that uses code to loop through an Excel spreadsheet in VBA and then fix errors, taking advantage of error handling code. This l…

    745 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    14 Experts available now in Live!

    Get 1:1 Help Now