need to remove duplicate words in a cell

have a column of cells like this

world, industry, generation, industry
leader, leader, Covered, supplies
transformation, margin

world, industry, generation
leader, Covered,
transformation, margin
Who is Participating?
Bill PrewCommented:
Here are some VBA procs you can use.  Attaching a sheet as an example.

If you want to remove duplicate words and place results in a new cell you can use the DeDupeText() function to do that.  See sample sheet, column B.

If you want to remove duplicate words in each cell in the whole A column, you can run DeDupe().

Hopefully this gives you the idea...

Option Explicit

Sub DeDupe()
    DeDupeRange Range("A1:A" & ActiveSheet.Cells(Rows.Count, 1).End(xlUp).Row)
End Sub

Sub DeDupeRange(rngTarget As Range)
    Dim rngCell As Range
    For Each rngCell In rngTarget
        rngCell.Value = DeDupeText(rngCell.Value)
End Sub

Function DeDupeText(strText As String) As String
    Dim arrWord() As String
    Dim strWord As Variant

    DeDupeText = ""

    With CreateObject("Scripting.Dictionary")
        .CompareMode = vbTextCompare

        arrWord = Split(strText, ",")
        For Each strWord In arrWord
            If Not .Exists(Trim(strWord)) Then
                .Add Trim(strWord), 1
                If DeDupeText = "" Then
                    DeDupeText = strWord
                    DeDupeText = DeDupeText & "," & strWord
                End If
            End If
    End With

End Function

Open in new window


Pawan KumarDatabase ExpertCommented:
SELECT that column -> go to DATA tab -> CLICK remove duplicates--> You are done.
finnstoneAuthor Commented:
that is not correct
Get your problem seen by more experts

Be seen. Boost your question’s priority for more expert views and faster solutions

finnstoneAuthor Commented:
dupes only should be removed from within the same cell
Pawan KumarDatabase ExpertCommented:
Is it a one time activity ?
Martin LissOlder than dirtCommented:
Ignore this if Bill's code works for you.

Otherwise add and run this macro. Note lines 8 and 9.

Sub RemoveCellDupes()
Dim lngLastRow As Long
Dim lngRow As Long
Dim colWords As Collection
Dim strWords() As String
Dim lngIndex As Long
Dim lngNextRow As Long
Const DATA_COL = "A" ' Change as needed
Const FIRST_ROW = 1 ' Change as needed

lngLastRow = Range(DATA_COL & "1048576").End(xlUp).Row
lngNextRow = FIRST_ROW

For lngRow = FIRST_ROW To lngLastRow
    Set colWords = New Collection
    strWords = Split(Cells(lngRow, DATA_COL), ",")
    For lngIndex = 0 To UBound(strWords) 
        On Error Resume Next
        colWords.Add Trim(strWords(lngIndex)), Trim(strWords(lngIndex))
    Cells(lngNextRow, DATA_COL).ClearContents
    For lngIndex = 1 To colWords.Count
        If lngIndex < colWords.Count Then
            Cells(lngNextRow, DATA_COL) = Cells(lngNextRow, DATA_COL) & colWords(lngIndex) & ", "
            Cells(lngNextRow, DATA_COL) = Cells(lngNextRow, DATA_COL) & colWords(lngIndex)
        End If
    lngNextRow = lngNextRow + 1
End Sub

Open in new window

Subodh Tiwari (Neeraj)Excel & VBA ExpertCommented:
Assuming each word in the cell is separated by a comma, you may give this a try...
Function getUniqueWords(ByVal vStr As String) As String
Dim str() As String
Dim dict
Dim i As Long

str = Split(Replace(Replace(vStr, Chr(10), ""), " ", ""), ",")
Set dict = CreateObject("Scripting.Dictionary")
For i = 0 To UBound(str)
    dict.Item(str(i)) = ""
Next i
If dict.Count > 0 Then getUniqueWords = Join(dict.keys, ", ")
End Function

Open in new window

Assuming your string is in A2 then use this user defined function on the sheet like below...

Open in new window

finnstoneAuthor Commented:
awesome thanks Bill
finnstoneAuthor Commented:
Bill, got a new one. This time, I just need to remove dupes within the commas
finnstoneAuthor Commented:
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.