Remove duplicates from a text file

Sabrin
Hello,
I have a text file with one name per line, and I would like to check
for duplicates and delete them. How can I do this?
GrahamSkan, Retired
Top Expert 2012
Commented:
Save each unique line in an array and write the array back to the file.

Sub DelDups()
    Dim strFileName As String
    Dim strLines() As String
    Dim strText As String
    Dim f As Integer
    Dim l As Long   ' Long, not Integer: an Integer overflows past 32,767 lines
    Dim i As Long
    Dim bFound As Boolean
   
    strFileName = "c:\myfolder\file1.txt"
   
    f = FreeFile
    Open strFileName For Input As #f
    Do Until EOF(f)
        Line Input #f, strText
        bFound = False
        For i = 0 To l - 1
            If strLines(i) = strText Then
                bFound = True
                Exit For
            End If
        Next i
        If Not bFound Then
            ReDim Preserve strLines(l)
            strLines(l) = strText
            l = l + 1
        End If
    Loop
    Close #f
   
    f = FreeFile
    Open strFileName For Output As #f
    For i = 0 To l - 1
        Print #f, strLines(i)
    Next i
    Close #f
End Sub

Author

Commented:
This makes the computer freeze if the list has 50,000 lines!
Commented:

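Sub DelDups()
    Dim lFile As Long
    Dim sBuf As String
    Dim asNames() As String
    Dim l As Long

    ' Read the whole file into one string, then split it into lines
    lFile = FreeFile
    Open "c:\myfolder\file1.txt" For Binary Access Read As #lFile
    sBuf = Space(LOF(lFile))
    Get #lFile, , sBuf
    Close #lFile
    asNames = Split(sBuf, vbCrLf)

    ' Remove the duplicates from asNames here (two options follow)
End Sub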

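Now that you have the names in an array, you have two options:
1) Sort the array and remove the duplicates, which are then next to each other.
2) Loop over the array in a nested For loop and, for each record, check whether it already exists in the array (simple, but slow on a long list because every line is compared against every other).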
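For what it's worth, option 1 could be sketched roughly like this. RemoveDupsSorted and QuickSortStrings are made-up helper names, not part of the code above, and the output file name is whatever you pass in. Note that sorting changes the original order of the names, and that blank lines are dropped in this version.

```vb
' Sketch of option 1: sort the array, then write only the first line of each run
' of equal lines. Both procedure names are hypothetical helpers.
Sub RemoveDupsSorted(asNames() As String, ByVal sOutFile As String)
    Dim lOut As Long
    Dim i As Long
    Dim sPrev As String

    QuickSortStrings asNames, LBound(asNames), UBound(asNames)

    lOut = FreeFile
    Open sOutFile For Output As #lOut
    sPrev = ""
    For i = LBound(asNames) To UBound(asNames)
        ' Skip blank lines, e.g. the empty item Split leaves after a trailing CRLF
        If Len(asNames(i)) > 0 Then
            ' After sorting, duplicates are adjacent, so compare with the previous line only
            If asNames(i) <> sPrev Then
                Print #lOut, asNames(i)
                sPrev = asNames(i)
            End If
        End If
    Next i
    Close #lOut
End Sub

' Plain recursive quicksort on a string array, in place
Private Sub QuickSortStrings(a() As String, ByVal lLo As Long, ByVal lHi As Long)
    Dim i As Long
    Dim j As Long
    Dim sPivot As String
    Dim sTmp As String

    If lLo >= lHi Then Exit Sub

    i = lLo
    j = lHi
    sPivot = a((lLo + lHi) \ 2)

    Do While i <= j
        Do While a(i) < sPivot
            i = i + 1
        Loop
        Do While a(j) > sPivot
            j = j - 1
        Loop
        If i <= j Then
            sTmp = a(i)
            a(i) = a(j)
            a(j) = sTmp
            i = i + 1
            j = j - 1
        End If
    Loop

    If lLo < j Then QuickSortStrings a, lLo, j
    If i < lHi Then QuickSortStrings a, i, lHi
End Sub
```

You would call it after the Split, for example RemoveDupsSorted asNames, "c:\myfolder\file1.txt". The comparisons follow the module's Option Compare setting, so with the default (binary) setting "Bob" and "bob" count as different names.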
 

High School Computer Science, Computer Applications, Digital Design, and Mathematics Teacher
Top Expert 2009
Commented:
See if this is any faster:

Private Sub Command1_Click()
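    Dim names As New Collection
    Dim fileName As String
    Dim curName As Variant
    fileName = "c:\somefile.txt"
    Open fileName For Input As #1
    On Error Resume Next ' NECESSARY for duplicate key error
    While Not EOF(1)
        Line Input #1, curName
        ' The name is its own key; a duplicate key raises an error and is skipped.
        ' Collection keys are not case-sensitive, so "Bob" and "bob" count as duplicates.
        names.Add curName, curName
    Wend
    On Error GoTo 0 ' report any later errors (e.g. file problems) normally
    Close #1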
    Open fileName For Output As #2
    For Each curName In names
        Print #2, curName
    Next
    Close #2
End Sub

Commented:
Mmm, Idle_Mind, nice!
I should have thought of it myself.
Mike Tomlinson, High School Computer Science, Computer Applications, Digital Design, and Mathematics Teacher
Top Expert 2009

Commented:
That's secret option #3...   =)

I believe you could also build an in-memory table out of the data and query it using an SQL statement to get the unique values.  ADO or something...not my forte though....  =\
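A rough sketch of that SQL idea, for anyone who wants to try it. It skips the in-memory table and lets ADO read the text file directly through the Jet text driver. This assumes the Jet 4.0 OLE DB provider is installed (32-bit only), that no name contains a comma or other delimiter, and that the names are short plain text. DelDupsAdo and the output file name file1_unique.txt are made up for the example, and it has not been tried on a 50,000-line file.

```vb
' Sketch only: query the text file with SELECT DISTINCT and write the result
' to a second file. Assumes Jet 4.0 is available and one plain name per line.
Sub DelDupsAdo()
    Dim cn As Object
    Dim rs As Object
    Dim f As Integer

    ' Late binding, so no project reference to ADO is needed
    Set cn = CreateObject("ADODB.Connection")
    cn.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
            "Data Source=c:\myfolder\;" & _
            "Extended Properties=""text;HDR=No;FMT=Delimited"""

    ' With HDR=No the driver names the first column F1
    Set rs = cn.Execute("SELECT DISTINCT F1 FROM [file1.txt]")

    f = FreeFile
    Open "c:\myfolder\file1_unique.txt" For Output As #f
    Do Until rs.EOF
        ' Blank lines come back as Null, so skip them
        If Not IsNull(rs.Fields(0).Value) Then
            Print #f, rs.Fields(0).Value
        End If
        rs.MoveNext
    Loop
    Close #f

    rs.Close
    cn.Close
End Sub
```

The result goes to a separate file so the source file is not overwritten while the driver still has it open. You can replace the original with it afterwards if the output looks right.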
