Solved

read file into array and remove duplicates

Posted on 2007-03-30
Last Modified: 2008-02-01
Hello. I have a file with about 3000 names in it that I need to read into an array, removing the duplicate names. They are files that need to be deleted from a couple of servers, so I need to write a tiny app that reads the list of files, removes the duplicates, and then deletes them. Here is the code I'm using to read the file into an array; it seems to work OK:

Private Sub Form_Load()
    Dim sArray() As String
    Dim lCtr As Long    'loop counter (was undeclared)

    FileToArray "C:\expiredcopyfiles.txt", sArray
    For lCtr = 0 To UBound(sArray)
        Debug.Print sArray(lCtr)
    Next
End Sub

Public Sub FileToArray(ByVal FileName As String, _
    ByRef TheArray As Variant)
'PURPOSE:    Puts all lines of file into a string array
'PARAMETERS: FileName = FullPath of File
'            TheArray = StringArray to which contents
'                       Of File will be added.
'Example
'  Dim sArray() as String
'  FileToArray "C:\MyTextFile.txt", sArray
'  For lCtr = 0 to Ubound(sArray)
'  Debug.Print sArray(lCtr)
'  Next

'NOTES:
'  --  Requires a reference to Microsoft Scripting Runtime
'      Library
'  --  You can write this method in a number of different ways
'      For instance, you can take advantage of VB 6's ability to
'      return an array.
' --   You can also read all the contents of the file and use the
'      Split function with vbCrlf as the delimiter, but I
'      wanted to illustrate use of the ReadLine
'      and AtEndOfStream methods.
'**********************************************************

  Dim oFSO As New FileSystemObject
  Dim oFSTR As Scripting.TextStream
  Dim ret As Long
  Dim lCtr As Long

  If Dir(FileName) = "" Then Exit Sub

'Check if string array was passed
'If you want to permit other type of arrays (e.g.,
'variant) remove or modify this line
'If VarType(TheArray) <> vbArray + vbString Then Exit Sub
 
  On Error GoTo ErrorHandler
     Set oFSTR = oFSO.OpenTextFile(FileName)
     
     Do While Not oFSTR.AtEndOfStream
            ReDim Preserve TheArray(lCtr) As String
            TheArray(lCtr) = oFSTR.ReadLine
            lCtr = lCtr + 1
            DoEvents 'optional but with large file
                     'program will appear to hang
                     'without it
    Loop
     oFSTR.Close
     
ErrorHandler:
     Set oFSTR = Nothing
End Sub
______________________________________________________________________________________

Now, the problem is that the array has duplicates in it that need to be removed. Then I could simply loop through the new array, add a .WAV extension to each name, check whether the file exists, and if so delete it. Can someone help with removing duplicates from this array?

thanks!
Question by:linuxrox
 
LVL 27

Expert Comment

by:VBRocks
ID: 18823660
Perhaps a simple approach would be to create a collection and add each name to it as both the "key" and the "item". Because a collection does not allow duplicate keys, adding a name a second time raises an error, which you can catch and then handle accordingly.

'Create a module level collection variable:
Private cNames as New Collection

'Create a sub to add each name to the collection, and then catch any error that occurs
Private Sub AddName(ByVal sName as String)
'This is part of the key:  catch the error when the duplicate item cannot be added to the collection
On Error GoTo ErrorHandler

     'Try to add the name to the collection
     cNames.Add sName, sName

'If no error occurs, the name will be added, then you will exit the sub
Exit Sub

ErrorHandler:
     'If an error occurred, the name was already in the collection, so delete it, or handle it according to your preference.

     'Clear the error
     err.Clear

     '***  Perform your special processing here with the duplicate name when the error occurs
    '     delete the record, or whatever...

End Sub


'Then, just call the sub from your code as you loop through each item of the array:
For lCtr = 0 To UBound(sArray)
     AddName sArray(lCtr)

Next
 
LVL 27

Expert Comment

by:VBRocks
ID: 18823749
An additional idea: if you want to handle the processing of the duplicate item in your calling code instead of in the sub, you can use a Boolean function instead of a sub, as follows:

Private Function AddName(ByVal sName as String) As Boolean

On Error GoTo ErrorHandler

     'Try to add the name to the collection
     cNames.Add sName, sName

     'Return 'True' because the name was added successfully
     AddName = True

Exit Function

ErrorHandler:
     'Clear the error
     err.Clear

     'Return 'False' because an error occurred trying to add the name to the collection
     AddName = False

End Function


'Then, just call the Function from your code as you loop through each item of the array:
For lCtr = 0 To UBound(sArray)
     If AddName(sArray(lCtr)) = False Then
          '***  Perform your special processing here with the duplicate name when the error occurs
          '     delete the record, or whatever...
     End If

Next
 

Author Comment

by:linuxrox
ID: 18823822
are you saying in my project go to "add / module" and insert the code there instead of the main form?
 
LVL 27

Expert Comment

by:VBRocks
ID: 18823862
Oh no, just add the code to your main form.
 

Author Comment

by:linuxrox
ID: 18823878
gotchya...
ok, have that working but what do i do with the collection?  
For lCtr = 0 To UBound(sArray)
AddName sArray(lCtr)
'would like to add each item in the collection to a listbox or loop through
'the collection to perform functions on each item
Next
 
LVL 85

Assisted Solution

by:Rory Archibald
Rory Archibald earned 100 total points
ID: 18823898
You may want to check out this Scripting Guys article - it does exactly what you want using a Dictionary object:
http://www.microsoft.com/technet/scriptcenter/resources/qanda/apr05/hey0413.mspx

HTH
Rory
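
For reference, a minimal VB6 sketch of the Dictionary approach, not taken from the linked article. It assumes the project already references the Microsoft Scripting Runtime (which FileToArray needs anyway); the function name UniqueNames is hypothetical. CompareMode is set to vbTextCompare so that names differing only in case count as duplicates, the same way string keys behave in a VB Collection.

```vb
'Hypothetical helper: returns a Dictionary holding each distinct name once.
'Requires a reference to Microsoft Scripting Runtime.
Private Function UniqueNames(ByRef sArray() As String) As Scripting.Dictionary
    Dim dict As Scripting.Dictionary
    Dim lCtr As Long

    Set dict = New Scripting.Dictionary
    'Case-insensitive keys; this must be set while the dictionary is still empty
    dict.CompareMode = vbTextCompare

    For lCtr = LBound(sArray) To UBound(sArray)
        'Exists avoids relying on a trapped error to detect a duplicate
        If Not dict.Exists(sArray(lCtr)) Then
            dict.Add sArray(lCtr), sArray(lCtr)
        End If
    Next

    Set UniqueNames = dict
End Function
```

After Set dict = UniqueNames(sArray), dict.Count gives the number of distinct names and dict.Keys returns them as an array you can loop over with For Each.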
 
LVL 27

Accepted Solution

by:
VBRocks earned 400 total points
ID: 18823909
Here is an example of what I mean.  Hopefully I didn't miss anything.  Let me know if you have any questions:

'Create a module level collection variable:
Private cNames as New Collection

Private Sub Form_Load()
Dim sArray() As String
FileToArray "C:\expiredcopyfiles.txt", sArray
'Then, just call the Function from your code as you loop through each item of the array:
For lCtr = 0 To UBound(sArray)
     If AddName(sArray(lCtr)) = False Then
          '***  Perform your special processing here with the duplicate name when the error occurs
          '     delete the record, or whatever...
     End If
Next

End Sub


Private Function AddName(ByVal sName as String) As Boolean

On Error GoTo ErrorHandler

     'Try to add the name to the collection
     cNames.Add sName, sName

     'Return 'True' because the name was added successfully
     AddName = True

Exit Function

ErrorHandler:
     'Clear the error
     err.Clear

     'Return 'False' because an error occurred trying to add the name to the collection
     AddName = False

End Function


Public Sub FileToArray(ByVal FileName As String, _
    ByRef TheArray As Variant)

     'The code for your sub belongs here...

End Sub

 
LVL 27

Assisted Solution

by:VBRocks
VBRocks earned 400 total points
ID: 18823931
Well, each item in the collection is unique: in other words, there are no duplicates among them. So you can just loop through each item in the collection and handle it any way you like, such as adding it to a listbox, or whatever:

Dim i as integer
For i = 1 to cNames.Count
     'perform your action with:  cNames.Item(i)

Next i
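
As a rough sketch of the delete step described in the question (add a .WAV extension, check that the file exists, then delete it), something like the following could work. The sub name DeleteExpiredFiles and the sFolder parameter are hypothetical; the thread never says where the .WAV files live, so the folder is passed in by the caller. It assumes the module-level cNames collection from above has already been filled.

```vb
'Hypothetical helper: deletes <sFolder>\<name>.WAV for every name in cNames.
Private Sub DeleteExpiredFiles(ByVal sFolder As String)
    Dim vName As Variant
    Dim sPath As String

    For Each vName In cNames
        sPath = sFolder & "\" & vName & ".WAV"
        'Dir returns "" when no file matches the path
        If Dir(sPath) <> "" Then
            Kill sPath
        End If
    Next
End Sub
```

It may be worth replacing Kill sPath with Debug.Print sPath for a first run, to confirm the paths are what you expect before anything is deleted.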
 

Author Comment

by:linuxrox
ID: 18823979
thanks guys!!! PERFECT!!
 

Author Comment

by:linuxrox
ID: 18824075
one thing:
in my file there are 2892 lines; each line has a duplicate..such as:
C200YC
C200YC
C200YD
C200YD
C200YE
C200YE
C200YF
C200YF
C200YG
C200YG
C200YH
C200YH
C200YI
C200YI
C200YJ
C200YJ
C200YK
C200YK

VBRocks: when I run the code I would expect cNames.Count to be 1446, because 2892 divided by two equals 1446; however, I actually get a count of 1429!?
any thoughts or ideas?
 

Author Comment

by:linuxrox
ID: 18824163
i could email you the file if you'd like zipped.  it's very small zipped.
my email is xf at westky.com
 

Author Comment

by:linuxrox
ID: 18824190
Actually, I guess 1429 is correct, because I opened the file in EditPad Pro and removed duplicates and there were 1429 lines. I just would have expected 1446 lines; perhaps my math and logic are wrong there.
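
One way to check the arithmetic: 2892 lines would give 1446 unique names only if every name appeared exactly twice. A count of 1429 means at least some names appear more than twice. Below is a hedged sketch that lists any name whose count is not 2. The sub name ReportOddCounts is hypothetical, and it again assumes a reference to the Microsoft Scripting Runtime.

```vb
'Hypothetical helper: prints every name that does not occur exactly twice.
Private Sub ReportOddCounts(ByRef sArray() As String)
    Dim dict As Scripting.Dictionary
    Dim vKey As Variant
    Dim lCtr As Long

    Set dict = New Scripting.Dictionary
    'Case-insensitive keys, to match how the Collection treats string keys
    dict.CompareMode = vbTextCompare

    For lCtr = LBound(sArray) To UBound(sArray)
        'Reading a missing key yields Empty, which counts as 0 in the addition
        dict(sArray(lCtr)) = dict(sArray(lCtr)) + 1
    Next

    For Each vKey In dict.Keys
        If dict(vKey) <> 2 Then Debug.Print vKey, dict(vKey)
    Next
End Sub
```

Calling ReportOddCounts sArray after FileToArray prints the unexpected names and their counts to the Immediate window.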
