Solved

read file into array and remove duplicates

Posted on 2007-03-30
Last Modified: 2008-02-01
Hello. I have a file with about 3000 names in it that I need to read into an array, removing the duplicate names. The names are files that need to be deleted from a couple of servers, so I need to write a tiny app that reads the list of files, removes the duplicates and then deletes them. Here is the code I'm using to read the file into an array; it seems to work OK:

Private Sub Form_Load()
Dim sArray() As String
Dim lCtr As Long 'declared so the code compiles under Option Explicit
FileToArray "C:\expiredcopyfiles.txt", sArray
For lCtr = 0 To UBound(sArray)
Debug.Print sArray(lCtr)
Next

End Sub

Public Sub FileToArray(ByVal FileName As String, _
    ByRef TheArray As Variant)
'PURPOSE:    Puts all lines of file into a string array
'PARAMETERS: FileName = FullPath of File
'            TheArray = StringArray to which contents
'                       Of File will be added.
'Example
'  Dim sArray() as String
'  FileToArray "C:\MyTextFile.txt", sArray
'  For lCtr = 0 to Ubound(sArray)
'  Debug.Print sArray(lCtr)
'  Next

'NOTES:
'  --  Requires a reference to Microsoft Scripting Runtime
'      Library
'  --  You can write this method in a number of different ways
'      For instance, you can take advantage of VB 6's ability to
'      return an array.
' --   You can also read all the contents of the file and use the
'      Split function with vbCrlf as the delimiter, but I
'      wanted to illustrate use of the ReadLine
'      and AtEndOfStream methods.
'**********************************************************

  Dim oFSO As New FileSystemObject
  Dim oFSTR As Scripting.TextStream
  Dim ret As Long
  Dim lCtr As Long

  If Dir(FileName) = "" Then Exit Sub

'Check if string array was passed
'If you want to permit other type of arrays (e.g.,
'variant) remove or modify this line
'If VarType(TheArray) <> vbArray + vbString Then Exit Sub
 
  On Error GoTo ErrorHandler
     Set oFSTR = oFSO.OpenTextFile(FileName)
     
     Do While Not oFSTR.AtEndOfStream
            ReDim Preserve TheArray(lCtr) As String
            TheArray(lCtr) = oFSTR.ReadLine
            lCtr = lCtr + 1
            DoEvents 'optional but with large file
                     'program will appear to hang
                     'without it
    Loop
     oFSTR.Close
     
ErrorHandler:
     Set oFSTR = Nothing
End Sub
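
The NOTES in the routine above mention reading the whole file and using Split with vbCrLf instead of ReadLine. A minimal sketch of that variant follows, assuming an ANSI text file with CRLF line endings; FileToArraySplit is a hypothetical name and is not part of the original code.

```vb
'Sketch only: FileToArraySplit is a hypothetical helper name.
'Assumes an ANSI text file whose lines end in CR+LF.
Public Function FileToArraySplit(ByVal FileName As String) As String()
    Dim iFile As Integer
    Dim sText As String

    'Return an empty array (UBound = -1) if the file is missing
    If Dir(FileName) = "" Then
        FileToArraySplit = Split("", vbCrLf)
        Exit Function
    End If

    'Read the entire file into one string
    iFile = FreeFile
    Open FileName For Binary Access Read As #iFile
    sText = Space$(LOF(iFile))
    Get #iFile, , sText
    Close #iFile

    'One array element per line
    FileToArraySplit = Split(sText, vbCrLf)
End Function
```

It would be called as sArray = FileToArraySplit("C:\expiredcopyfiles.txt"). If the file ends with a line break, the last element is an empty string, so a caller may want to skip blank entries.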
______________________________________________________________________________________

Now, the problem is that the array has duplicates in it that need to be removed. Then I could simply loop through the new array, add a .WAV extension to each name, check whether the file exists and, if so, delete it. Can someone help with removing duplicates from this array?

thanks!
Question by:linuxrox
12 Comments
 
LVL 27

Expert Comment

by:VBRocks
ID: 18823660
Perhaps a simple approach would be to create a Collection and add each name to it as both the key and the item. Because a Collection does not allow duplicate keys, adding a name a second time raises a run-time error (457), which you can catch and then process accordingly. Note that Collection keys are compared without regard to case.

'Create a module level collection variable:
Private cNames as New Collection

'Create a sub to add each name to the collection, and then catch any error that occurs
Private Sub AddName(ByVal sName As String)
'This is the key part: catch the error when the duplicate item cannot be added to the collection
On Error GoTo ErrorHandler

     'Try to add the name to the collection
     cNames.Add sName, sName

'If no error occurs, the name will be added, then you will exit the sub
Exit Sub

ErrorHandler:
     'If an error occurred, the name was already in the collection, so delete it, or handle it according to your preference.

     'Clear the error
     Err.Clear

     '***  Perform your special processing here with the duplicate name when the error occurs
     '     delete the record, or whatever...

End Sub


'Then, just call the sub from your code as you loop through each item of the array:
For lCtr = 0 To UBound(sArray)
     AddName sArray(lCtr)

Next
 
LVL 27

Expert Comment

by:VBRocks
ID: 18823749
An additional idea: if you want to handle the processing of the duplicate item in your calling code instead of in the sub, you can use a Boolean function instead of a sub, as follows:

Private Function AddName(ByVal sName As String) As Boolean

On Error GoTo ErrorHandler

     'Try to add the name to the collection
     cNames.Add sName, sName

     'Return 'True' because the name was added successfully
     AddName = True

Exit Function

ErrorHandler:
     'Clear the error
     Err.Clear

     'Return 'False' because an error occurred trying to add the name to the collection
     AddName = False

End Function


'Then, just call the Function from your code as you loop through each item of the array:
For lCtr = 0 To UBound(sArray)
     If AddName(sArray(lCtr)) = False Then
          '***  Perform your special processing here with the duplicate name when the error occurs
          '     delete the record, or whatever...
     End If

Next
 

Author Comment

by:linuxrox
ID: 18823822
are you saying in my project go to "add / module" and insert the code there instead of the main form?
 
LVL 27

Expert Comment

by:VBRocks
ID: 18823862
Oh no, just add the code to your main form.
 

Author Comment

by:linuxrox
ID: 18823878
gotchya...
ok, have that working but what do i do with the collection?  
For lCtr = 0 To UBound(sArray)
AddName sArray(lCtr)
'would like to add each item in the collection to a listbox or loop through
'the collection to perform functions on each item
Next
 
LVL 85

Assisted Solution

by:Rory Archibald
Rory Archibald earned 100 total points
ID: 18823898
You may want to check out this Scripting Guys article - it does exactly what you want using a Dictionary object:
http://www.microsoft.com/technet/scriptcenter/resources/qanda/apr05/hey0413.mspx

HTH
Rory
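
For reference, a minimal sketch of the Dictionary idea in VB6. It assumes a reference to Microsoft Scripting Runtime (which FileToArray already needs) and an array with at least one element; RemoveDuplicates is a hypothetical helper name and is not taken from the linked article.

```vb
'Sketch only: RemoveDuplicates is a hypothetical helper name.
'Requires a reference to Microsoft Scripting Runtime.
Public Function RemoveDuplicates(sArray() As String) As Variant
    Dim oDict As New Scripting.Dictionary
    Dim lCtr As Long

    'Treat "c200yc" and "C200YC" as the same name;
    'CompareMode must be set before any item is added
    oDict.CompareMode = vbTextCompare

    For lCtr = LBound(sArray) To UBound(sArray)
        'Only the first occurrence of each name is stored
        If Not oDict.Exists(sArray(lCtr)) Then
            oDict.Add sArray(lCtr), True
        End If
    Next

    'Keys returns a zero-based Variant array of the unique names
    RemoveDuplicates = oDict.Keys
End Function
```

Compared with a Collection, this avoids relying on a trapped error to detect duplicates, and the unique names come back as an array that can be looped over with LBound and UBound.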

 
LVL 27

Accepted Solution

by:
VBRocks earned 400 total points
ID: 18823909
Here is an example of what I mean.  Hopefully I didn't miss anything.  Let me know if you have any questions:

'Create a module level collection variable:
Private cNames as New Collection

Private Sub Form_Load()
Dim sArray() As String
Dim lCtr As Long 'declared so the code compiles under Option Explicit
FileToArray "C:\expiredcopyfiles.txt", sArray
'Then, just call the Function from your code as you loop through each item of the array:
For lCtr = 0 To UBound(sArray)
     If AddName(sArray(lCtr)) = False Then
          '***  Perform your special processing here with the duplicate name when the error occurs
          '     delete the record, or whatever...
     End If
Next

End Sub


Private Function AddName(ByVal sName As String) As Boolean

On Error GoTo ErrorHandler

     'Try to add the name to the collection
     cNames.Add sName, sName

     'Return 'True' because the name was added successfully
     AddName = True

Exit Function

ErrorHandler:
     'Clear the error
     Err.Clear

     'Return 'False' because an error occurred trying to add the name to the collection
     AddName = False

End Function


Public Sub FileToArray(ByVal FileName As String, _
    ByRef TheArray As Variant)

     'The code for your sub belongs here...

End Sub

 
LVL 27

Assisted Solution

by:VBRocks
VBRocks earned 400 total points
ID: 18823931
Well, every item in the collection is unique; in other words, it contains no duplicates. So you can just loop through each item in the collection and handle it any way you like, such as adding it to a listbox, or whatever:

Dim i as integer
For i = 1 to cNames.Count
     'perform your action with:  cNames.Item(i)

Next i
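
As a rough sketch of the delete step described in the question (add a .WAV extension, check that the file exists, delete it), something like the following could work. SERVER_PATH is a placeholder rather than a real location, DeleteExpiredFiles is a hypothetical name, and the Const line has to go in the declarations section at the top of the form.

```vb
'Sketch only: SERVER_PATH is a placeholder; substitute the real folder.
Private Const SERVER_PATH As String = "\\server\share\"

Private Sub DeleteExpiredFiles()
    Dim i As Long
    Dim sFile As String

    For i = 1 To cNames.Count
        'Build the full file name from the unique name
        sFile = SERVER_PATH & cNames.Item(i) & ".wav"

        'Dir returns "" when no file matches
        If Dir(sFile) <> "" Then
            Kill sFile
        End If
    Next i
End Sub
```

Kill raises a run-time error if a file cannot be deleted (for example, because it is in use), so an error handler around the loop is probably worth adding before running this against a server.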
 

Author Comment

by:linuxrox
ID: 18823979
thanks guys!!! PERFECT!!
 

Author Comment

by:linuxrox
ID: 18824075
one thing:
in my file there are 2892 lines; each line has a duplicate..such as:
C200YC
C200YC
C200YD
C200YD
C200YE
C200YE
C200YF
C200YF
C200YG
C200YG
C200YH
C200YH
C200YI
C200YI
C200YJ
C200YJ
C200YK
C200YK

VBRocks: when i run the code i would expect the cnames.count to be 1446 because 2892 divided by two equals 1446, however i actually get a count of 1429  !?!?
any thoughts or ideas?
 

Author Comment

by:linuxrox
ID: 18824163
i could email you the file if you'd like zipped.  it's very small zipped.
my email is xf at westky.com
 

Author Comment

by:linuxrox
ID: 18824190
actually i guess 1429 is correct because i opened the file in editpad pro and removed duplicates and there were 1429 lines.  i just would have expected 1446 lines, perhaps my math and logic is wrong there..
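
On the arithmetic: 1429 unique names out of 2892 lines is fewer than half, so not every name occurs exactly twice; at least some must occur more often (repeated blank lines would count as well). A rough sketch for listing the names that do not occur exactly twice is below, assuming a reference to Microsoft Scripting Runtime; ReportOddCounts is a hypothetical helper name.

```vb
'Sketch only: ReportOddCounts is a hypothetical helper name.
'Requires a reference to Microsoft Scripting Runtime.
Public Sub ReportOddCounts(sArray() As String)
    Dim oCounts As New Scripting.Dictionary
    Dim vKey As Variant
    Dim lCtr As Long

    'Match names without regard to case
    oCounts.CompareMode = vbTextCompare

    For lCtr = LBound(sArray) To UBound(sArray)
        'Reading a missing key yields Empty, which counts as 0 here
        oCounts(sArray(lCtr)) = oCounts(sArray(lCtr)) + 1
    Next

    'Print every name whose count is not exactly 2
    For Each vKey In oCounts.Keys
        If oCounts(vKey) <> 2 Then
            Debug.Print vKey, oCounts(vKey)
        End If
    Next
End Sub
```

Calling ReportOddCounts sArray after FileToArray prints each such name with its count to the Immediate window.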

Question has a verified solution.
