Solved

read file into array and remove duplicates

Posted on 2007-03-30
Last Modified: 2008-02-01
Hello. I have a file with about 3000 names in it that I need to read into an array, removing the duplicate names. They are files that need to be deleted from a couple of servers, so I need to write a tiny app that reads the list of files, removes the duplicates, and then deletes them. Here is the code I'm using to read the file into an array; it seems to work OK:

Private Sub Form_Load()
    Dim sArray() As String
    Dim lCtr As Long 'declared so the code also compiles under Option Explicit

    FileToArray "C:\expiredcopyfiles.txt", sArray
    For lCtr = 0 To UBound(sArray)
        Debug.Print sArray(lCtr)
    Next

End Sub

Public Sub FileToArray(ByVal FileName As String, _
    ByRef TheArray As Variant)
'PURPOSE:    Puts all lines of file into a string array
'PARAMETERS: FileName = FullPath of File
'            TheArray = StringArray to which contents
'                       Of File will be added.
'Example
'  Dim sArray() as String
'  FileToArray "C:\MyTextFile.txt", sArray
'  For lCtr = 0 to Ubound(sArray)
'  Debug.Print sArray(lCtr)
'  Next

'NOTES:
'  --  Requires a reference to Microsoft Scripting Runtime
'      Library
'  --  You can write this method in a number of different ways
'      For instance, you can take advantage of VB 6's ability to
'      return an array.
' --   You can also read all the contents of the file and use the
'      Split function with vbCrlf as the delimiter, but I
'      wanted to illustrate use of the ReadLine
'      and AtEndOfStream methods.
'**********************************************************

  Dim oFSO As New FileSystemObject
  Dim oFSTR As Scripting.TextStream
  Dim ret As Long
  Dim lCtr As Long

  If Dir(FileName) = "" Then Exit Sub

'Check if string array was passed
'If you want to permit other type of arrays (e.g.,
'variant) remove or modify this line
'If VarType(TheArray) <> vbArray + vbString Then Exit Sub
 
  On Error GoTo ErrorHandler
     Set oFSTR = oFSO.OpenTextFile(FileName)
     
     Do While Not oFSTR.AtEndOfStream
            ReDim Preserve TheArray(lCtr) As String
            TheArray(lCtr) = oFSTR.ReadLine
            lCtr = lCtr + 1
            DoEvents 'optional but with large file
                     'program will appear to hang
                     'without it
    Loop
     oFSTR.Close
     
ErrorHandler:
     Set oFSTR = Nothing
End Sub
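
The NOTES in the comments above mention reading the whole file at once and using Split with vbCrLf as the delimiter. A minimal sketch of that alternative follows. The name FileToArraySplit is hypothetical, and the sketch assumes a plain ANSI text file with CR/LF line endings; a trailing line break in the file produces one empty element at the end of the array.

```vb
'Sketch only: FileToArraySplit is a hypothetical name, not part of the
'original code. Assumes an ANSI text file that uses CR/LF line endings.
Public Function FileToArraySplit(ByVal FileName As String) As String()
    Dim iFile As Integer
    Dim sText As String

    'Mirror the original guard: a missing file yields an empty array
    If Dir(FileName) = "" Then
        FileToArraySplit = Split("", vbCrLf)
        Exit Function
    End If

    'Read the whole file into one string
    iFile = FreeFile
    Open FileName For Binary Access Read As #iFile
    sText = Space$(LOF(iFile))
    Get #iFile, , sText
    Close #iFile

    'One array element per line
    FileToArraySplit = Split(sText, vbCrLf)
End Function
```

It would be called as sArray = FileToArraySplit("C:\expiredcopyfiles.txt"), with sArray declared as Dim sArray() As String. No reference to the Microsoft Scripting Runtime is needed for this version.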
______________________________________________________________________________________

Now, the problem is that the array has duplicates in it that need to be removed. Then I could simply loop through the new array, add a .WAV extension to each name, check whether the file exists, and if so delete it. Can someone help with removing duplicates from this array?

thanks!
Question by:linuxrox

Expert Comment

by:VBRocks
Perhaps a simple approach would be to create a collection and add each name to it as both the key and the item. Because a collection does not allow duplicate keys, adding a name a second time raises an error, which you can catch and then handle accordingly.

'Create a module-level collection variable:
Private cNames As New Collection

'Create a sub to add each name to the collection, and then catch any error that occurs
Private Sub AddName(ByVal sName As String)
'This is the key part: catch the error raised when a duplicate item cannot be added to the collection
On Error GoTo ErrorHandler

     'Try to add the name to the collection
     cNames.Add sName, sName

'If no error occurs, the name will be added, then you will exit the sub
Exit Sub

ErrorHandler:
     'If an error occurred, the name was already in the collection, so delete it, or handle it according to your preference.

     'Clear the error
     Err.Clear

     '***  Perform your special processing here with the duplicate name when the error occurs
     '     delete the record, or whatever...

End Sub


'Then, just call the sub from your code as you loop through each item of the array:
For lCtr = 0 To UBound(sArray)
     AddName sArray(lCtr)

Next

Expert Comment

by:VBRocks
An additional idea: if you want to handle the duplicate item in your calling code instead of in the sub, you can use a Boolean function instead of a sub, as follows:

Private Function AddName(ByVal sName As String) As Boolean

On Error GoTo ErrorHandler

     'Try to add the name to the collection
     cNames.Add sName, sName

     'Return 'True' because the name was added successfully
     AddName = True

Exit Function

ErrorHandler:
     'Clear the error
     Err.Clear

     'Return 'False' because an error occurred trying to add the name to the collection
     AddName = False

End Function


'Then, just call the Function from your code as you loop through each item of the array:
For lCtr = 0 To UBound(sArray)
     If AddName(sArray(lCtr)) = False Then
          '***  Perform your special processing here with the duplicate name when the error occurs
          '     delete the record, or whatever...
     End If

Next

Author Comment

by:linuxrox
Are you saying that in my project I should go to "Add / Module" and insert the code there instead of in the main form?

Expert Comment

by:VBRocks
Oh no, just add the code to your main form.

Author Comment

by:linuxrox
Gotcha...
OK, I have that working, but what do I do with the collection?
For lCtr = 0 To UBound(sArray)
AddName sArray(lCtr)
'would like to add each item in the collection to a listbox or loop through
'the collection to perform functions on each item
Next

Assisted Solution

by:Rory Archibald
Rory Archibald earned 100 total points
You may want to check out this Scripting Guys article - it does exactly what you want using a Dictionary object:
http://www.microsoft.com/technet/scriptcenter/resources/qanda/apr05/hey0413.mspx

HTH
Rory
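
For readers who cannot reach the linked article, the Dictionary idea can be sketched roughly as follows. This is not the article's code; UniqueNames is a hypothetical helper name, and the sketch assumes the project already references the Microsoft Scripting Runtime (which FileToArray needs anyway). Exists checks for a key without raising an error, so no error handler is required.

```vb
'Sketch only: UniqueNames is a hypothetical helper, not taken from the
'linked article. Requires a reference to Microsoft Scripting Runtime.
Private Function UniqueNames(ByRef sArray() As String) As Scripting.Dictionary
    Dim oDict As Scripting.Dictionary
    Dim lCtr As Long

    Set oDict = New Scripting.Dictionary
    'Text comparison treats "c200yc" and "C200YC" as the same name;
    'CompareMode must be set before any item is added
    oDict.CompareMode = vbTextCompare

    For lCtr = LBound(sArray) To UBound(sArray)
        'Only add names that are not already present
        If Not oDict.Exists(sArray(lCtr)) Then
            oDict.Add sArray(lCtr), True
        End If
    Next lCtr

    Set UniqueNames = oDict
End Function
```

The unique names are then available through the Keys method, for example with For Each vKey In UniqueNames(sArray).Keys, where vKey is declared As Variant.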

Accepted Solution

by:VBRocks
VBRocks earned 400 total points
Here is an example of what I mean.  Hopefully I didn't miss anything.  Let me know if you have any questions:

'Create a module-level collection variable:
Private cNames As New Collection

Private Sub Form_Load()
Dim sArray() As String
Dim lCtr As Long
FileToArray "C:\expiredcopyfiles.txt", sArray
'Then, just call the Function from your code as you loop through each item of the array:
For lCtr = 0 To UBound(sArray)
     If AddName(sArray(lCtr)) = False Then
          '***  Perform your special processing here with the duplicate name when the error occurs
          '     delete the record, or whatever...
     End If
Next

End Sub


Private Function AddName(ByVal sName As String) As Boolean

On Error GoTo ErrorHandler

     'Try to add the name to the collection
     cNames.Add sName, sName

     'Return 'True' because the name was added successfully
     AddName = True

Exit Function

ErrorHandler:
     'Clear the error
     Err.Clear

     'Return 'False' because an error occurred trying to add the name to the collection
     AddName = False

End Function


Public Sub FileToArray(ByVal FileName As String, _
    ByRef TheArray As Variant)

     'The code for your sub belongs here...

End Sub


Assisted Solution

by:VBRocks
VBRocks earned 400 total points
Well, each item in the collection is unique: in other words, the collection contains no duplicates. So you can just loop through the collection and handle each item any way you like, such as adding it to a list box, or whatever:

Dim i As Integer
For i = 1 to cNames.Count
     'perform your action with:  cNames.Item(i)

Next i
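
As a rough illustration of the action the question describes (append .WAV, check that the file exists, delete it), the loop might look like the sketch below. DeleteExpiredFiles and its sFolder argument are hypothetical; the thread does not say which folder or server share holds the files. Kill deletes permanently, so it may be worth replacing it with Debug.Print sPath for a first trial run.

```vb
'Sketch only: DeleteExpiredFiles and sFolder are hypothetical names.
'Relies on the module-level cNames collection declared earlier.
Private Sub DeleteExpiredFiles(ByVal sFolder As String)
    Dim i As Long
    Dim sPath As String

    'Make sure the folder ends with a backslash
    If Right$(sFolder, 1) <> "\" Then sFolder = sFolder & "\"

    For i = 1 To cNames.Count
        sPath = sFolder & cNames.Item(i) & ".WAV"
        'Dir returns an empty string when no file matches the path
        If Dir(sPath) <> "" Then
            Kill sPath 'permanently deletes the file
        End If
    Next i
End Sub
```

Names for which no matching file exists are simply skipped, so the same list can be run against each server in turn.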

Author Comment

by:linuxrox
thanks guys!!! PERFECT!!

Author Comment

by:linuxrox
One thing:
In my file there are 2892 lines, and each line has a duplicate, such as:
C200YC
C200YC
C200YD
C200YD
C200YE
C200YE
C200YF
C200YF
C200YG
C200YG
C200YH
C200YH
C200YI
C200YI
C200YJ
C200YJ
C200YK
C200YK

VBRocks: when I run the code I would expect cNames.Count to be 1446, because 2892 divided by two equals 1446. However, I actually get a count of 1429!?
Any thoughts or ideas?

Author Comment

by:linuxrox
I could email you the zipped file if you'd like. It's very small zipped.
My email is xf at westky.com

Author Comment

by:linuxrox
Actually, I guess 1429 is correct, because I opened the file in EditPad Pro and removed the duplicates, and there were 1429 lines left. I just would have expected 1446 lines; perhaps my math and logic are wrong there.
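
The estimate of 2892 / 2 = 1446 holds only if every name appears exactly twice. A count of 1429 is 17 lower, which would fit, for example, 17 names that each appear four times (1412 x 2 + 17 x 4 = 2892 lines, 1412 + 17 = 1429 unique names). That is only one possible breakdown, since the full file is not shown. Names that differ only in letter case would have a similar effect, because Collection keys are not case-sensitive. A small sketch for finding the names that do not appear exactly twice follows; ReportRepeats is a hypothetical helper and needs a reference to the Microsoft Scripting Runtime.

```vb
'Sketch only: ReportRepeats is a hypothetical helper name.
'Requires a reference to Microsoft Scripting Runtime.
Private Sub ReportRepeats(ByRef sArray() As String)
    Dim oCounts As Scripting.Dictionary
    Dim vKey As Variant
    Dim lCtr As Long

    Set oCounts = New Scripting.Dictionary
    oCounts.CompareMode = vbTextCompare 'ignore case, like Collection keys

    For lCtr = LBound(sArray) To UBound(sArray)
        'Reading a missing key adds it with an Empty value, which counts as 0
        oCounts(sArray(lCtr)) = oCounts(sArray(lCtr)) + 1
    Next lCtr

    'List every name that does not appear exactly twice
    For Each vKey In oCounts.Keys
        If oCounts(vKey) <> 2 Then Debug.Print vKey, oCounts(vKey)
    Next vKey

    Debug.Print "Lines:"; UBound(sArray) - LBound(sArray) + 1
    Debug.Print "Unique:"; oCounts.Count
End Sub
```

Calling ReportRepeats sArray after FileToArray prints the irregular names to the Immediate window, followed by the total and unique counts.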