[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 293
  • Last Modified:

Get two strings in a filename and search it in a two dimensional array.

Hi guys,

I have a 2 Dim array from an excel file with 3 columns.  The first column is ID, next is lastname, the 3rd is firstname.

I already put all the data in a 2 dim array where Array(NumberOfRowsInExcel, 1-3). 1 for the ID, 2 for the lastname and 3 for the firstname col in excel.

Now I have several folders and directories with document files that has this filenaming format.
Firstname Lastname-########_.doc

The # represents any random number.
Other samples.
Bryan Adams-31467854_.doc
Jeff McBride-78945677_.doc

Now what i want to do is crawl all docs in a folder then parse the firstname and lastname from the filename then compare that to the 2 dim array. If there is a perfect match, rename the file appending the Array(x,1) value or the ID column in excel to that doc file.

Example: There is a match in the array.  Append it to the very start of the filename then add underscore something like this.

000001_Bryan Adams-31467854_.doc
000002_Jeff McBride-78945677_.doc

If there's no match on the array, move it to a folder named "unmatched".


Above example lets say Bryan Adams has been parsed from the filename.
Pseudocode:
Let x = counter.
Loop
   If Array(x, 2) = Adams and Array(x,3) = Bryan then
     Name "path\filename.doc" as "path\" & array(x,1)  & "_" & "filename.doc"
   else
     filecopy to unmatched folder
     kill file
...
End Loop

Basically that's what i want to do. Can somebody help me out patch things right?  I haven't done any programming in VB6 for a year. Thanks.

By the way, this is my code to to extract the data from excel and put it in an array just in case.
Set xlApp = New Excel.Application
Set xlWbk = xlApp.Workbooks.Open("C:\EMR\drv_patient_list.xls")
Set xlWks = xlWbk.Sheets(1)
arv = xlWks.Range("A1:H" & xlWks.UsedRange.Rows.Count).Value

lngRowCount = LastRowInSheet(xlWks)
ReDim arrData(1 To lngRowCount, 1 To 3)
For i = 2 To lngRowCount
    For j = 1 To 3
        arrData(i, j) = xlWks.Cells(i, j).Value
    Next j
Next i

xlWbk.Close False
xlApp.Quit

Open in new window



TIA
0
m3mdicl
Asked:
m3mdicl
  • 10
  • 7
1 Solution
 
shaydieCommented:
Something like this maybe?


Set objFolders = objFSO.getfolder(BasePath)
Set objFiles = objFolders.Files

 For Each FILE In objFiles
 strFileName = FILE.Name
 
 strFirstName = Left(strFileName, (InStr(1, strFileName, " ") - 1))
 strLastName = Mid(strFileName, (InStr(1, strFileName, " ") + 1), (InStr(1, strFileName, "-")) - (InStr(1, strFileName, " ") + 1))
 
 bIsMatch = False
 
 For ArrayItem = 0 To UBound(vArray)
 
 If strFirstName = vArray(ArrayItem, 1) And strLastName = vArray(ArrayItem, 2) Then
 
    objFSO.MoveFile FILE.Path, BasePath & vArray(ArrayItem, 0) & "_" & strFileName
    bIsMatch = True
    Exit For
 
 End If
 
 Next ArrayItem
 
 If bIsMatch = False Then objFSO.MoveFile FILE.Path, UnmatchedPath & strFileName
 
 Next
0
 
m3mdiclAuthor Commented:
Thanks for the reply.  

I was coding it using native syntax like Dir, Name, Filecopy and Kill because i though it's much faster than FSO.  But that should do it for the meantime.

I will test it and let you know.  Thanks mate.
0
 
aikimarkCommented:
1. You can save yourself some processing by skipping the first row in your range specification on line 4.

arv = xlWks.Range("A2:H" & xlWks.UsedRange.Rows.Count).Value

Open in new window


2. Once this statement has executed, all your data is in the arv variable.  I don't see a need to transfer it to arrData in the item-by-item assignment.

3. From a performance perspective, you would be much better off in your loop referring to
arv(i, j)
rather than
xlWks.Cells(i, j).Value

4. If you only need the first three columns of data from the worksheet, why are you getting the first eight columns?

5. You should declare all your variables.  Add an Option Explicit statement to the top of your General Declarations section.

============
Depending on the number of names you might have to match and the number of files that are matching candidates, I might suggest using a dictionary object in place of an array.  Also, I would use the combined (space delimited) firstname and lastname data to match against the parsed file name.

Can we assume that the numbers associated with the names are unique?

In the following example, the Dir() function is not supplied a path.  You might add a path to the initial call or use a ChDir statement to set the current directory.

Option Explicit

Public Sub FileRename()
    Set xlApp = New Excel.Application
    Set xlWbk = xlApp.Workbooks.Open("C:\EMR\drv_patient_list.xls")
    Set xlWKS = xlWbk.Sheets(1)
    
    Dim lngRowCount As Long
    Dim lngLoop As Long
    Dim dicNames As Object
    Dim strFilename As String
    Dim strNamePart() As String
    Set dicNames = CreateObject("Scripting.Dictionary")
    
    'get data from workbook
    arv = xlWKS.Range("A2:C" & xlWKS.UsedRange.Rows.Count).Value
    xlWbk.Close False
    xlApp.Quit
    
    'populate dictionary with name and number data
    lngRowCount = UBound(arv)
    For lngLoop = 1 To lngRowCount
        dicNames.Add arv(lngLoop, 3) & " " & arv(lngLoop, 2), arv(lngLoop, 1)
    Next lngLoop
    
    'iterate files and rename if match in dictionary
    strFilename = Dir("*.doc")
    Do Until Len(strFilename) = 0
        If LCase(strFilename) Like "#*_*-#_.doc" Then   'skip already renamed files
        Else
            strNamePart = Split(strFilename, "-")
            If dicNames.Exists(strNamePart(0)) Then
                Name strFilename As dicNames(strNamePart(0)) & "_" & strFilename
            End If
        End If
        strFilename = Dir
    Loop

    Set dicNames = Nothing

End Sub

Open in new window

0
Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

 
m3mdiclAuthor Commented:
Nice catch aikimark.   I will test it and let you know.  Thanks for the detailed help.
0
 
m3mdiclAuthor Commented:
@aikimark

I got a runtime error 457.  The key is already associated with an element of this collection.

This is the offending line

dicNames.Add arv(lngLoop, 3) & " " & arv(lngLoop, 2), arv(lngLoop, 1)

I also declared the variables like this.  Is this correct?

Option Explicit
Dim xlApp As Excel.Application
Dim xlWbk As Excel.Workbook
Dim xlWks As Excel.Worksheet
Dim arv As Variant

Open in new window

0
 
aikimarkCommented:
that means that the names in your worksheet are not unique.  What do you want to do with the duplicate names?
0
 
m3mdiclAuthor Commented:
Yes, firstnames and lastnames are not unique.  Only the ID is unique. so let's go back to array?
0
 
m3mdiclAuthor Commented:
I got 2000+ records on the excel sheet.  So the names will have duplicates.  That's why i want to compare both the firstname and lastnames.  It should match the parsed first and last names from the .doc filenames.

Is there any workaround for this if the dictionary wont allow dups?  I can only think of 2 dimensional arrays.  Any thoughts?
0
 
aikimarkCommented:
>>So the names will have duplicates.  That's why i want to compare both the firstname and lastnames.

Let me explain.  Your list has duplicate whole names (firstname and lastname combined).  It doesn't matter whether you iterate an array or search a dictionary/collection.  I can prevent the error you encountered with the dictionary object very easily.  However, you need to think about this scenario.

Your Excel list has 3 rows like this
ID	Last	First
000134	Roberts	Bob
001236	Roberts	Bob
000012	Roberts	Bob

Open in new window


When you encounter a file like
Bob Roberts-78987654_.doc
what number should be used for the new file name? (000134, 001236, 000012)

So, this is a processing and data decision for you.
0
 
m3mdiclAuthor Commented:
I see. Okay if dups have been found to exists, don't rename the file instead move it to another folder named "duplicates" some thing like that.

So to recap.

If found a perfect match = Rename file
If not found = move the file to a folder named "unmatched"
if dups found = move the file to a folder named "duplicate"

Thanks for the help.
0
 
aikimarkCommented:
The following code assumes that the files are in the current directory and that the unmatched and duplicate sub-directories exist.

Option Explicit

Public Sub FileRename()
    Set xlApp = New Excel.Application
    Set xlWbk = xlApp.Workbooks.Open("C:\EMR\drv_patient_list.xls")
    Set xlWKS = xlWbk.Sheets(1)
    
    Dim lngRowCount As Long
    Dim lngLoop As Long
    Dim dicNames As Object
    Dim strFilename As String
    Dim strNamePart() As String
    Set dicNames = CreateObject("Scripting.Dictionary")
    
    'get data from workbook
    arv = xlWKS.Range("A2:C" & xlWKS.UsedRange.Rows.Count).Value
    xlWbk.Close False
    xlApp.Quit
    
    'populate dictionary with name and number data
    lngRowCount = UBound(arv)
    For lngLoop = 1 To lngRowCount
        If dicNames.Exists(arv(lngLoop, 3) & " " & arv(lngLoop, 2)) Then
            dicNames.Add arv(lngLoop, 3) & " " & arv(lngLoop, 2), -1
        Else
            dicNames.Add arv(lngLoop, 3) & " " & arv(lngLoop, 2), arv(lngLoop, 1)
        End If
    Next lngLoop
    
    'iterate files and rename if match in dictionary
    'move file to unmatched sub-directory if no name match
    'move file to duplicate sub-directory if Excel list for this whole name was a duplicate
    'NOTE: assume that .\unmatched and .\duplicate sub-directories exist
    strFilename = Dir("*.doc")
    Do Until Len(strFilename) = 0
        If LCase(strFilename) Like "#*_*-#_.doc" Then   'skip already renamed files
        Else
            strNamePart = Split(strFilename, "-")
            If dicNames.Exists(strNamePart(0)) Then
                If Sgn(dicNames(strNamePart(0))) = -1 Then
                    'duplicate
                    FileCopy strFilename, ".\duplicate\" & strFilename
                    Kill strFilename
                Else
                    Name strFilename As dicNames(strNamePart(0)) & "_" & strFilename
                End If
            Else
                'unmatched
                FileCopy strFilename, ".\unmatched\" & strFilename
                Kill strFilename
            End If
        End If
        strFilename = Dir
    Loop

    Set dicNames = Nothing

End Sub

Open in new window

0
 
m3mdiclAuthor Commented:
Hi I still get an error on line 24.

dicNames.Add arv(lngLoop, 3) & " " & arv(lngLoop, 2), -1

same error.  The key is already associated with an element of this collection.
0
 
aikimarkCommented:
oops.  I was too quick with the copy/paste and didn't go back and change the line before posting here.

Line 24 should be

            dicNames(arv(lngLoop, 3) & " " & arv(lngLoop, 2))=-1

Open in new window


0
 
m3mdiclAuthor Commented:
Hi one quick question.  What data type should arv variable be?  It's not working I declared arv as variant type.
0
 
m3mdiclAuthor Commented:
Hi I modified the codes to work. I added this API to create folders automatically
Private Declare Function MakeSureDirectoryPathExists Lib "imagehlp.dll" (ByVal lpPath As String) As Long

I also put a DirListBox control to make it easier.
Dir1.Path = is the current path.  

The problem is even if I have a legit document matching it's not renaming it instead it moved to unmatch folder.
Here is the complete code.

Option Explicit
Private Declare Function MakeSureDirectoryPathExists Lib "imagehlp.dll" (ByVal lpPath As String) As Long
Dim xlApp As Excel.Application
Dim xlWbk As Excel.Workbook
Dim xlWks As Excel.Worksheet
Dim arv() As Variant

Public Sub FileRename()
    Set xlApp = New Excel.Application
    Set xlWbk = xlApp.Workbooks.Open("C:\EMR\drv_patient_list.xls")
    Set xlWks = xlWbk.Sheets(1)
    
    Dim lngRowCount As Long
    Dim lngLoop As Long
    Dim dicNames As Object
    Dim strFilename As String
    Dim strNamePart() As String
    Set dicNames = CreateObject("Scripting.Dictionary")
    
    'get data from workbook
    arv = xlWks.Range("A2:C" & xlWks.UsedRange.Rows.Count).Value
    xlWbk.Close False
    xlApp.Quit
    
    'populate dictionary with name and number data
    lngRowCount = UBound(arv)
    For lngLoop = 1 To lngRowCount
        If dicNames.Exists(arv(lngLoop, 3) & " " & arv(lngLoop, 2)) Then
            dicNames(arv(lngLoop, 3) & " " & arv(lngLoop, 2)) = -1
        Else
            dicNames.Add arv(lngLoop, 3) & " " & arv(lngLoop, 2), arv(lngLoop, 1)
        End If
    Next lngLoop
    
    'iterate files and rename if match in dictionary
    'move file to unmatched sub-directory if no name match
    'move file to duplicate sub-directory if Excel list for this whole name was a duplicate
    'NOTE: assume that .\unmatched and .\duplicate sub-directories exist

    strFilename = Dir(Dir1.Path & "\" & "*.doc")
    Do Until Len(strFilename) = 0
        If LCase(strFilename) Like "#*_*-#_.doc" Then   'skip already renamed files
        Else
            strNamePart = Split(strFilename, "-")
            If dicNames.Exists(strNamePart(0)) Then
                If Sgn(dicNames(strNamePart(0))) = -1 Then
                    'duplicate
                    MakeSureDirectoryPathExists Dir1.Path & "\duplicate\"
                    FileCopy strFilename, "\duplicate\" & strFilename
                    Kill strFilename
                Else
                    Name strFilename As dicNames(strNamePart(0)) & "_" & strFilename
                End If
            Else
                'unmatched
                MakeSureDirectoryPathExists Dir1.Path & "\unmatched\"
                FileCopy Dir1.Path & "\" & strFilename, Dir1.Path & "\unmatched\" & strFilename
                Kill Dir1.Path & "\" & strFilename
            End If
        End If
        strFilename = Dir
    Loop

    Set dicNames = Nothing

End Sub
Private Sub cmdConvert_Click()
    Call FileRename
End Sub

Private Sub Form_Load()
    Dir1.Path = "c:"
End Sub

Open in new window

0
 
m3mdiclAuthor Commented:
Hi.  I figured it out.  Seems that the names added has more space in between the first name and last name.

Debug.print (arv(lngLoop, 3) & " " & arv(lngLoop, 2))
Bryan                  Adams

I just added trim and it worked fine.

 'populate dictionary with name and number data
    lngRowCount = UBound(arv)
    For lngLoop = 1 To lngRowCount
        If dicNames.Exists(Trim(arv(lngLoop, 3)) & " " & Trim(arv(lngLoop, 2))) Then
            dicNames(Trim(arv(lngLoop, 3)) & " " & Trim(arv(lngLoop, 2))) = -1
        Else
            dicNames.Add Trim(arv(lngLoop, 3)) & " " & Trim(arv(lngLoop, 2)), Trim(arv(lngLoop, 1))
        End If
    Next lngLoop

Open in new window

0
 
aikimarkCommented:
You are pretty close.  I moved the two API calls up to the start of the routine.

I think the reason that the results for matched files is wrong is that your description of the file names differs from your description of the Excel data.  Please post a directory list (use the /b switch) and the worksheet.  Please understand that any extra characters in the file name or the Excel data will cause this to misbehave in the way you described (nothing would match).

Option Explicit

Private Declare Function MakeSureDirectoryPathExists Lib "imagehlp.dll" (ByVal lpPath As String) As Long

Dim xlApp As Excel.Application
Dim xlWbk As Excel.Workbook
Dim xlWks As Excel.Worksheet
Dim arv() As Variant

Public Sub FileRename()
    Set xlApp = New Excel.Application
    Set xlWbk = xlApp.Workbooks.Open("C:\EMR\drv_patient_list.xls")
    Set xlWks = xlWbk.Sheets(1)
    
    Dim lngRowCount As Long
    Dim lngLoop As Long
    Dim dicNames As Object
    Dim strFilename As String
    Dim strNamePart() As String

     MakeSureDirectoryPathExists Dir1.Path & "\duplicate\"
     MakeSureDirectoryPathExists Dir1.Path & "\unmatched\"

    Set dicNames = CreateObject("Scripting.Dictionary")
    
    'get data from workbook
    arv = xlWks.Range("A2:C" & xlWks.UsedRange.Rows.Count).Value
    xlWbk.Close False
    xlApp.Quit
    
    'populate dictionary with name and number data
    lngRowCount = UBound(arv)
    For lngLoop = 1 To lngRowCount
        If dicNames.Exists(arv(lngLoop, 3) & " " & arv(lngLoop, 2)) Then
            dicNames(arv(lngLoop, 3) & " " & arv(lngLoop, 2)) = -1
        Else
            dicNames.Add arv(lngLoop, 3) & " " & arv(lngLoop, 2), arv(lngLoop, 1)
        End If
    Next lngLoop
    
    'iterate files and rename if match in dictionary
    'move file to unmatched sub-directory if no name match
    'move file to duplicate sub-directory if Excel list for this whole name was a duplicate
    'NOTE: assume that .\unmatched and .\duplicate sub-directories exist

    strFilename = Dir(Dir1.Path & "\" & "*.doc")
    Do Until Len(strFilename) = 0
        If LCase(strFilename) Like "#*_*-#_.doc" Then   'skip already renamed files
        Else
            strNamePart = Split(strFilename, "-")
            If dicNames.Exists(strNamePart(0)) Then
                If Sgn(dicNames(strNamePart(0))) = -1 Then
                    'duplicate
                    FileCopy strFilename, "\duplicate\" & strFilename
                    Kill strFilename
                Else
                    Name strFilename As dicNames(strNamePart(0)) & "_" & strFilename
                End If
            Else
                'unmatched
                FileCopy Dir1.Path & "\" & strFilename, Dir1.Path & "\unmatched\" & strFilename
                Kill Dir1.Path & "\" & strFilename
            End If
        End If
        strFilename = Dir
    Loop

    Set dicNames = Nothing

End Sub
Private Sub cmdConvert_Click()
    Call FileRename
End Sub

Private Sub Form_Load()
    Dir1.Path = "c:"
End Sub

Open in new window

0
 
aikimarkCommented:
Ah.  That was it, then.  I'm glad you found it.  I posted my last comment prior to seeing your concluding comment.
0

Featured Post

Free Tool: Port Scanner

Check which ports are open to the outside world. Helps make sure that your firewall rules are working as intended.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

  • 10
  • 7
Tackle projects and never again get stuck behind a technical roadblock.
Join Now