Function-to-return-given-records-from-multiple-files-needed#2

Hi Experts,
This is in reference of my previous question.
https://www.experts-exchange.com/questions/29131073/Function-to-return-given-records-from-multiple-files-needed.html?anchor=a42767252¬ificationFollowed=220950287#a42767252
I have now the following code
Public Function ListRecords(sFolder As String, sFile As String, sCriteria As String, sNewFile As String)
    Const ForReading = 1
    Dim objRegEx As Object, objFSO As Object, objFile As Object
    Set objRegEx = CreateObject("VBScript.RegExp")
    Set objFSO = CreateObject("Scripting.FileSystemObject")

    Set newFile = objFSO.CreateTextFile(sNewFile, True)
    
    objRegEx.Pattern = sCriteria
    sDir = sFolder & "\"
    StrFile = Dir(sDir & "*" & sFile & "*")
    Do While Len(StrFile) > 0
        Set objFile = objFSO.OpenTextFile(sDir & StrFile, ForReading)
            
        Do Until objFile.AtEndOfStream
            strSearchString = objFile.ReadLine
            Set colMatches = objRegEx.Execute(strSearchString)
            If colMatches.Count > 0 Then
                For Each strMatch In colMatches
                    '' Write strSearchString
                    newFile.Write strSearchString
                Next
            End If
        Loop
        'Debug.Print objFile.DateCreated
        objFile.Close

        
        StrFile = Dir

    Loop
    
    objFile.Close
    newFile.Close
End Function

Open in new window


Would like to add the below
1- function should have another param for field name, one for field name and one for criteria.
2- should auto allocate the field position in order to find the matching criteria.
3- if possible to add date/time of file created of each file into the output file.

Thanks
LVL 6
bfuchsAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Fabrice LambertConsultingCommented:
Hi,

- Your funtion break the SRP and should be split into several function with one and only one responsibility.
- Don't use the Dir() function to browse the file system, as it store its parameters statically, wich can lead to nasty surprises.
The FSO library is much more reliable.
2- should auto allocate the field position in order to find the matching criteria.
What do you mean here ?
Regexs do not need any "allocation" to perform their job.
3- if possible to add date/time of file created of each file into the output file.
Another reason against the Dir() function, the FSO Library will provide that information.
Rikin ShahMicrosoft Dynamics CRM ConsultantCommented:
Hi bfuchs,

I am little occupied with work. But here is the pseudo code, if you can try yourself-

1. Use InStr to find the position "Patient ID" from the header
2. Cut the string till that position and store somewhere. You'll get the string before the "Patient ID"
3. Now search how many Commas comes from start position to End of String.
Dim commas = len(yourString) - len(replace(yourString, ",", ""))

Open in new window

4. Use for loop to check the position of the comma before the Patient ID in actual data
5. Search for your data and write to another file if match found.

Hope this helps.
Fabrice LambertConsultingCommented:
After looking at your previous post, can you describe your input file ?

It looks like a text file with commas separated fields.
If this is the case, there are better solutions than computing commas positions and other akward things.
Maximize Customer Retention with Superior Service

The IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy for valuable how-to assets including sample agreements, checklists, flowcharts, and more to help build customer satisfaction and retention.

bfuchsAuthor Commented:
Hi Experts,

- Your funtion break the SRP and should be split into several function with one and only one responsibility.
I would not like to create several functions for this as its for some temp usage, would not spend too much time on being proper, and not on efficiency either, rather spend on the accuracy...

If this is the case, there are better solutions than computing commas positions and other akward things.
Yes this is the case, see attached one example of file (however field names and position may vary).

@Rikin,
Let see if Fabrice comes up with a complete function and more accurate as he claims, will go with that, otherwise will follow your instructions above.

Thanks,
Ben
ENTOUT_613_PATMedProfileChanges_2018.csv
bfuchsAuthor Commented:
Hi Experts,
@Rikin,
I will try follow your guidance, just missing one point, how to get the date/time of file created to get written along with that record?
Thanks,
Ben
Rikin ShahMicrosoft Dynamics CRM ConsultantCommented:
Hi Ben,

Here you go!

Set objFSO = CreateObject("Scripting.FileSystemObject") 
Set objFile = objFSO.GetFile("<FileName>") 
 
MsgBox "Date created: " & objFile.DateCreated 

Open in new window

bfuchsAuthor Commented:
Hi,
Any idea what am I doing wrong?
Public Function ListRecords(sFolder As String, sFile As String, sCriteria As String, sNewFile As String)
    Const ForReading = 1
    Dim objRegEx As Object, objFSO As Object, objFile As Object
    Set objRegEx = CreateObject("VBScript.RegExp")
    Set objFSO = CreateObject("Scripting.FileSystemObject")

    Set newFile = objFSO.CreateTextFile(sNewFile, True)
    
    objRegEx.Pattern = sCriteria
    sDir = sFolder & "\"
    StrFile = Dir(sDir & "*" & sFile & "*")
    Do While Len(StrFile) > 0
        Set objFile = objFSO.OpenTextFile(sDir & StrFile, ForReading)
            
        Do Until objFile.AtEndOfStream
            StrSearchString = objFile.ReadLine
            Set colMatches = objRegEx.Execute(StrSearchString)
                    Set objFile = objFSO.GetFile(sDir & StrFile)
                    StrSearchString = StrSearchString & ", " & objFile.DateCreated
            If colMatches.Count > 0 Then
                For Each strMatch In colMatches
                    '' Write strSearchString
                    newFile.Write StrSearchString
                Next
            End If
        Loop
        'Debug.Print objFile.DateCreated
        objFile.Close

        
        StrFile = Dir

    Loop
    
    objFile.Close
    newFile.Close
End Function

Open in new window

See attached error.
on line below
        Do Until objFile.AtEndOfStream

Open in new window

Thanks,
Ben
Untitled.png
Rikin ShahMicrosoft Dynamics CRM ConsultantCommented:
Hi Ben,

I am not able to check about the error without proper environment here.. Let me know if you were able to go further with it?
Fabrice LambertConsultingCommented:
See attached error.
on line below
"Option Explicit" Were Are You ?

Your issue come from the following lines wich do not return the same object type:
Set objFile = objFSO.OpenTextFile(sDir & StrFile, ForReading)
Set objFile = objFSO.GetFile(sDir & StrFile)

Open in new window

The former return a Scripting.TextStream object, and the later return a Scripting.File object.
aikimarkCommented:
Replace this:
            Set colMatches = objRegEx.Execute(strSearchString)
            If colMatches.Count > 0 Then

Open in new window

with this:
            If objRegEx.Test(strSearchString) Then

Open in new window

There is no need to use matches, since you don't do anything with the matches, if found.

It looks like you're coding your own grep routine.
bfuchsAuthor Commented:
Hi Experts,

Okay I applied each of your suggestions and stayed with the following
Public Function ListRecords(sFolder As String, sFile As String, sCriteria As String, sNewFile As String)
    Dim newFile As Object, sDir As String, StrFile As String, StrSearchString As String, colMatches As Object
    Dim strMatch As Object, objFile2 As Object
    Const ForReading = 1
    Dim objRegEx As Object, objFSO As Object, objFile As Object
    Set objRegEx = CreateObject("VBScript.RegExp")
    Set objFSO = CreateObject("Scripting.FileSystemObject")

    Set newFile = objFSO.CreateTextFile(sNewFile, True)
    
    objRegEx.Pattern = sCriteria
    sDir = sFolder & "\"
    StrFile = Dir(sDir & "*" & sFile & "*")
    Do While Len(StrFile) > 0
        Set objFile = objFSO.OpenTextFile(sDir & StrFile, ForReading)
            
        Do Until objFile.AtEndOfStream
            StrSearchString = objFile.ReadLine
            'Set colMatches = objRegEx.Execute(StrSearchString)
                    Set objFile2 = objFSO.GetFile(sDir & StrFile)
                    StrSearchString = StrSearchString & ", " & objFile2.DateCreated
            'If colMatches.Count > 0 Then
            Debug.Print StrSearchString
            If objRegEx.Test(StrSearchString) Then
               ' For Each strMatch In colMatches
                    '' Write strSearchString
                    newFile.Write StrSearchString
               ' Next
            End If
        Loop
        'Debug.Print objFile.DateCreated
        objFile.Close

        
        StrFile = Dir

    Loop
    
    objFile.Close
    newFile.Close
End Function

Open in new window

The only problem is that it does not look for specific field, so if i search for "123456" it will be true regardless to where on which field this was found.
Any easy solution for this?

Thanks,
Ben
Fabrice LambertConsultingCommented:
Do you know that you can query your file instead of parsing it ?
You write a Schema.ini file describing the file's structure, and an SQL query. All that remain is retrieving the data.
This way, there is no need to worry about field's position, neither field's order:
Option Explicit

Public Sub ListRecords(ByVal folderPath As String, ByVal fileName As String, ByVal fieldName As String, ByVal criteria As String, ByVal newFilePath As String)
    Const adUseServer As Byte = 2
    Const adOpenStatic As Byte = 3
    Const adLockReadOnly As Byte = 1
    
    Dim dateCreated As Date
    dateCreated = GetDateCreated(folderPath, fileName)
    
    WriteSchema folderPath, fileName
    
        '// SQL query to retrieve data
    Dim sql As String
    sql = vbNullString
    sql = sql & "SELECT *" & vbCrLf
    sql = sql & "FROM [" & fileName & "]" & vbCrLf
    sql = sql & "WHERE [" & fieldName & "] = '" & criteria & "';"
    
    Dim cn As Object        '// ADODB.Connection
    cn.ConnectionString = GetConnectionString(folderPath)
    cn.Open
    
    Dim rs As Object        '// ADODB.Recordset
    Set rs = CreateObject("ADODB.Recordset")
    
    rs.CursorLocation = adUseServer
    rs.CursorType = adOpenStatic
    rs.LockType = adLockReadOnly
    rs.Open sql, cn
    
        '// write data retrieved
    WriteFileData rs, dateCreated, newFilePath
    
        '// cleanup
    rs.Close
    cn.Close
    DeleteSchema folderPath
End Sub

Private Function GetDateCreated(ByVal folderPath As String, ByVal fileName As String) As Date
    Dim fso As Object       '// Scripting.FileSystemObject
    Set fso = CreateObject("Scripting.FileSystemObject")
    
    Dim file As Object      '// Scripting.file
    Set file = fso.GetFile(folderPath & "\" & fileName)
    
    GetDateCreated = file.dateCreated
End Function

    '// Write a schema.ini file describing the file'sstructure
    '// and enforcing all data as Text to prevent automatic convertions
Private Sub WriteSchema(ByVal folderPath As String, ByVal fileName As String)
    Const ForAppending = 8
    Dim fso As Object       '// Scripting.FileSystemObject
    Set fso = CreateObject("Scripting.FileSystemObject")
    
    Dim file As Object      '// Scripting.file
    Set file = fso.GetFile(folderPath & "\Schema.ini")
    
    Dim oStream As Object   '// Scripting.TextStream
    Set oStream = file.OpenAsTextStream(ForAppending)
    
    oStream.WriteLine "[" & file.Name & "]"
    oStream.WriteLine "ColNameHeader=True"
    oStream.WriteLine "Format=Delimited(,)"
    oStream.WriteLine "col1=""Agency ID"" char"
    oStream.WriteLine "col2=""Patient ID"" char"
    oStream.WriteLine "col3=""Med Profile ID"" char"
    oStream.WriteLine "col4=""Status"" char"
    oStream.WriteLine "col5=""Medication ID"" char"
    oStream.WriteLine "col6=""Medication"" char"
    oStream.WriteLine "col7=""Ord Physician ID"" char"
    oStream.WriteLine "col8=""Dose ID"" char"
    oStream.WriteLine "col9=""Dose Other"" char"
    oStream.WriteLine "col10=""Form ID"" char"
    oStream.WriteLine "col11=""Form Other"" char"
    oStream.WriteLine "col12=""Route ID"" char"
    oStream.WriteLine "col13=""Route Other"" char"
    oStream.WriteLine "col14=""Amount"" char"
    oStream.WriteLine "col15=""Class"" char"
    oStream.WriteLine "col16=""Frequency ID"" char"
    oStream.WriteLine "col17=""Start Date"" char"
    oStream.WriteLine "col18=""Order Date"" char"
    oStream.WriteLine "col19=""Taught Date"" char"
    oStream.WriteLine "col20=""Discontinue Date"" char"
    oStream.WriteLine "col21=""Discontinue Order Date"" char"
    oStream.WriteLine "col22=""Preferred Pharmacy ID"" char"
    oStream.WriteLine "col23=""Comment"" char"
    oStream.WriteLine "col24=""User"" char"
    oStream.WriteLine "col25=""Certification Period ID"" char"
    oStream.WriteLine "col26=""Certification Start Date"" char"
    oStream.WriteLine "col27=""Certification End Date"" char"
    oStream.WriteLine "col28=""Other Frequency Text"" char"
    oStream.WriteLine "col29=""Payer ID"" char"
    oStream.WriteLine "col30=""Doc ID"" char"
    oStream.WriteLine "col31=""Modified Date"" char"
    oStream.WriteLine "col32=""Is Deleted"" char"
    oStream.Close
End Sub

Private Sub DeleteSchema(ByVal folderPath As String)
    Kill folderPath & "\Schema.ini"
End Sub

Private Function GetConnectionString(ByVal folderPath As String) As String
    Const Driver As String = "text"
    Const Provider As String = "Microsoft.ACE.OLEDB.12.0"
    Const UseHeaders As String = "HDR=YES"
    
    Dim ExtendedProperties As String
    ExtendedProperties = ExtendedProperties & Driver & ";"
    ExtendedProperties = ExtendedProperties & UseHeaders & ";"
    ExtendedProperties = ExtendedProperties & "FMT=Delimited;"
    
    GetConnectionString = vbNullString
    GetConnectionString = GetConnectionString & "Provider=" & Provider & ";"
    GetConnectionString = GetConnectionString & "Data Source=" & folderPath & "\;"
    GetConnectionString = GetConnectionString & "Extended Properties=""" & ExtendedProperties & """;"
End Function

'// Private Sub writeFileData(ByRef rs As ADODB.Recordset, ByVal dateCreated As Date, ByVal path As String)
Private Sub WriteFileData(ByRef rs As Object, ByVal dateCreated As Date, ByVal path As String)
    Const ForAppending = 8
    Dim fso As Object       '// Scripting.FileSystemObject
    Set fso = CreateObject("scripting.filesystemobject")
    
    If (fso.FileExists(path)) Then
        fso.DeleteFile path
    End If
    
    Dim oStream As Object       '// Scripting.TextStream
    Set oStream = fso.OpenTextFile(path, ForAppending, True)
    
    While Not rs.EOF
        Dim field As Object     '// ADOR.field
        For Each field In rs.Fields
            oStream.Write field.Value & ","
        Next
        oStream.Write Format(dateCreated, "dd/mm/yyyy")
        oStream.Write vbNewLine
    Wend
    oStream.Close
End Sub

Open in new window

Sample usage:
ListRecords "c:\temp", "myFile.csv", "Patiend ID", "123", "c:\temp\NewFile.csv"

Open in new window

Fabrice LambertConsultingCommented:
And if you want the function to support any combination of quotes or double quotes as criteria, and SQL injection proof, use a ADODB.Command and an ADODB.Parameter object:
Public Sub ListRecords(ByVal folderPath As String, ByVal fileName As String, ByVal fieldName As String, ByVal criteria As String, ByVal newFilePath As String)
    Const adUseServer As Byte = 2
    Const adOpenStatic As Byte = 3
    Const adLockReadOnly As Byte = 1
    Const adVarChar As Byte = 200
    Const adParamInput As Byte = 1
    Const adCmdText As Byte = 1
    
    Dim dateCreated As Date
    dateCreated = GetDateCreated(folderPath, fileName)
    
    WriteSchema folderPath, fileName
    
    Dim sql As String
    sql = vbNullString
    sql = sql & "SELECT *" & vbCrLf
    sql = sql & "FROM [" & fileName & "]" & vbCrLf
    sql = sql & "WHERE [" & fieldName & "] = ?;"
    
    Dim cn As Object        '// ADODB.Connection
    cn.ConnectionString = GetConnectionString(folderPath)
    cn.Open
    
    Dim cm As Object        '// ADODB.Command
    Set cm = CreateObject("ADODB.Command")
    Set cm.ActiveConnection = cn
    cm.CommandText = sql
    cm.CommandType = adCmdText
    
    Dim pm As Object        '// ADODB.Parameter
    Set pm = cm.CreateParameter(fieldName, adVarChar, adParamInput, 255, criteria)
    
    cm.Parameters.Append pm
    
    Dim rs As Object        '// ADODB.Recordset
    Set rs = cm.Execute
    
    WriteFileData rs, dateCreated, newFilePath
    
    rs.Close
    cn.Close
    DeleteSchema folderPath
End Sub

Open in new window

bfuchsAuthor Commented:
@Fabrice,

First thanks for putting together this function.

While testing I realized this is expecting a specific file name, while I need something that will loop thru all files of that given folder, see attached.

In addition as described above, this would be used for many kind of files, ans therefore I'm not able to write a schema, or rather prefer not to have to write the schema.ini every time I use this function for different data file format.

Thanks,
Ben
Untitled.png
Fabrice LambertConsultingCommented:
The Schema.ini file is optional, but I recommend it to have a better control on data types, since the ADO driver attempt to guess the data types otherwise (and it can guess wrong).
Just do not call the 2 fonctions responsible for managing the Schema.ini file (WriteSchema / DeleteSchema) and you'll be done.

As for looping trough files in a directory, an additional function will do the job.
And With SRP compliant function, it is very easy to add the functionality without messing what has already been done:
Public Sub ListFiles(ByVal FolderPath as String, ByVal newFilePath As String)
    Dim fso As Object    '// Scripting.FileSystemObject
    Set fso = CreateObject("Scripting.FileSystemObject")

    Dim folder As Object    '// Scripting.Folder
    Set folder = fso.GetFolder(path)

    Dim file As Object    '// Scripting.File
    For Each file in folder.Files
        ListRecords folder.Path, file.Name, "Patiend ID", "123", newFilePath
    Next
End Sub

Open in new window

Slight update needed in the previous code tho, since writing data will be called multiple times, you don't want the targeted file to be re-created everytime, so e need to remove the lines taking care of output file. This will need to be done at higher level (prior looping trough the folder):
Private Sub WriteFileData(ByRef rs As Object, ByVal dateCreated As Date, ByVal path As String)
    Const ForAppending = 8
    Dim fso As Object       '// Scripting.FileSystemObject
    Set fso = CreateObject("scripting.filesystemobject")
    
    Dim oStream As Object       '// Scripting.TextStream
    Set oStream = fso.OpenTextFile(path, ForAppending)
    
    While Not rs.EOF
        Dim field As Object     '// ADOR.field
        For Each field In rs.Fields
            oStream.Write field.Value & ","
        Next
        oStream.Write Format(dateCreated, "dd/mm/yyyy")
        oStream.Write vbNewLine
    Wend
    oStream.Close
End Sub

Open in new window

(I do not provide again the code taking care of output file, as it is written in my previous answer).

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
aikimarkCommented:
You can create/update the schema.ini file programmatically.
bfuchsAuthor Commented:
@Fabrice,
I'm not in today, will do the testing on Sunday & let yoy know.
Have a nice weekend!
Thanks,
Ben
bfuchsAuthor Commented:
Thanks to all participants!
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
System Programming

From novice to tech pro — start learning today.