Avatar of Manoj Kumbhar
Manoj Kumbhar
 asked on

Fastest approach to search text in multiple files

I need to search a string or text in around 2000 files. Input string is coming from an excel document.

Basically I need to search if UserName exists in 2000 excel files placed at shared location. there are around 3000 users for which I need perform this search. I have developed below code by searching on google but it is taking 3 mins for each user which is too long. Please suggest better way.
Public Shared Function Search(ByVal SearchText As String, ByVal FolderPath As String) As List(Of String)

        Dim searchresult As List(Of String) = New List(Of String)
        Dim list As ObjectModel.ReadOnlyCollection(Of String)

        list = My.Computer.FileSystem.FindInFiles(FolderPath, SearchText, True, FileIO.SearchOption.SearchTopLevelOnly)

        searchresult = list.Select(Function(a) a).ToList()
        'For Each name In list
        '    searchresult.Add(name)
        'Next

        Return searchresult
End Function

Open in new window

Visual Basic.NETGoogleMicrosoft Excel

Avatar of undefined
Last Comment
Tracy

8/22/2022 - Mon
aikimark

Since opening/closing a file is relatively expensive (to string searching), you would be better off doing all your name searches for each file.
Ark

Try
Public Shared Function Search(ByVal SearchUsers As List(Of String),
                              ByVal FolderPath As String,
                              Optional ByVal IgnoreCase As Boolean = True,
                              Optional ByVal pattern As String = "*.*") As Dictionary(Of String, List(Of String))
    Dim result As New Dictionary(Of String, List(Of String))
    For Each user In SearchUsers
        If IgnoreCase Then user = user.ToLower
        If Not result.ContainsKey(user) Then
            result.Add(user, New List(Of String))
        End If
    Next
    Dim files = IO.Directory.GetFiles(FolderPath, pattern)
    For Each file In files
        Dim content = IO.File.ReadAllText(file)
        If IgnoreCase Then content = content.ToLower
        For Each user In result.Keys
             If content.Contains(user) Then result(user).Add(file)
        Next
    Next
    Return result
End Function

Open in new window

To use:
Dim users = "User1,User2,User3,User4".Split(","c).ToList
Dim results = Search(users, yourPath, True, "*.xls")
For each user In results.Keys
    Debug.Print(String.Format("{0} found at: {1}", user, String.Join("; ", results(user).ToArray)))
Next

Open in new window

Manoj Kumbhar

ASKER
Thanks for providing solution.

I tried above approach but found few issues -
1. If file is open by other user code is throwing an exception
2. It doesn't have option to search exact text - e.g. if I search for Smith search result is showing both Smith and Smith1 but we want only Smith.

Please suggest
I started with Experts Exchange in 2004 and it's been a mainstay of my professional computing life since. It helped me launch a career as a programmer / Oracle data analyst
William Peck
aikimark

1. What does your word list look like?  (please post representative sample data in a workbook or post your actual workbook)
2. This question is tagged with VB.Net.  Is the code running outside of Excel or is this actually VBA?
ASKER CERTIFIED SOLUTION
Ark

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
GET A PERSONALIZED SOLUTION
Ask your own question & get feedback from real experts
Find out why thousands trust the EE community with their toughest problems.
Tracy

This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.