How can I search a list of phrases for specific terms?


I have written a small database program using access and VB to store all the movies I have along with their location, format, director, cast etc. I also have a text file full of names of movies and I want to be able to search through this list for a specific set of words or a word. Basically, I have set my program to automatically enter the file name of a movie when I select it, but the name often contains lots of other words and characters that aren't relevant, so i want the program to compare each word in the file name with the names of the movies in my text file and suggest possibilities.

For example, If the file name is Pulp_Fiction xvd hjkhj stero.avi, it should suggest Pulp Fiction

I hope I have explained well enough. Please ask if I haven't been clear.

Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Carl TawnSystems and Integration DeveloperCommented:
Not quite sure what you're trying to do. But you could try something like:

Option Explicit

Private Sub Command1_Click()

    Dim fso As New Scripting.FileSystemObject
    Dim fol As Folder
    Dim fil As File
    Set fol = fso.GetFolder("C:\My Documents\My Pictures")
    For Each fil In fol.Files
        If CheckFile(fil.Name) Then
            lstResults.AddItem fil.Name
        End If
    Next fil

End Sub

Private Function CheckFile(ByVal fileName As String) As Boolean

    CheckFile = True   '// Default
    Dim arrWords() As String
    Dim i As Integer
    If Len(txtSearch.Text) > 0 Then
        arrWords = Split(txtSearch.Text, " ")
        If UBound(arrWords) = 0 Then
            ReDim arrWords(1)
            arrWords(0) = txtSearch.Text
        End If
        For i = 0 To UBound(arrWords) - 1
            If arrWords(i) <> "" Then
                If InStr(fileName, arrWords(i)) < 1 Then
                    CheckFile = False
                    Exit For
                End If
            End If
        Next i
    End If
End Function
eadam-ukAuthor Commented:
sorry, I guess I didn't explain that well.

What i mean is say I have a file name held in a string "pulp fiction xvid ac3.avi", I want the program to suggest "pulp fiction" to the user. I have a list of movie titles in a text file that I read in, but I need to compare them.

At the moment my code looks like this:

Public Function guessName(source As String) As String
Dim cnt As Long
Dim tmp As String
Dim tmpArray() As String

cnt = 0

Dim oFSO As New FileSystemObject
Dim oFSTR As Scripting.TextStream

ReDim tmpArray(2000)

     Set oFSTR = oFSO.OpenTextFile(App.Path & "\MovieList2.txt")

     Do While Not oFSTR.AtEndOfStream

            tmpArray(cnt) = oFSTR.ReadLine

            cnt = cnt + 1

   Dim i As Integer
   Dim tmpStr As String
   For i = LBound(tmpArray) To UBound(tmpArray)
    tmpStr = compareMovies(source, tmpArray(i))
    If (Not tmpStr = "") Then
        guessName = tmpStr
    End If
   Next i

End Function

Private Function compareMovies(source As String, target As String) As String

If (source = "" Or target = "") Then
    Exit Function
End If

Dim temp As String
Dim pos As Long

If (source = target) Then
    compareMovies = source
ElseIf (InStr(source, target)) Then
    compareMovies = target
End If

End Function

This works if the exact name that is in the text file is in the file name, but often it isn't. So I need a way of guessing what the name could be based on the filename.

For example.

If the file name is X-men.avi but my list of films contains "The X men" it won't find it.

Does that make more sense?

Thanks again,
Leigh PurvisDatabase DeveloperCommented:
You could perhaps have a comparison that looks for each word in the source name in the file name.
And counts the number of words that it gets a hit from.
Return those to the main function - along with the hit count, and present the user with that entire list (sorted by hit count).  OK - so you'd get a load from "The" - hence the need for sorting ;-)

Something like

Function compareMovies(source As String, target As String, intHits As Integer) As String

Dim temp As String
Dim intPos As Long
Dim intInstr As Integer

    If (source = "" Or target = "") Then
        Exit Function
    End If
    intPos = 1
    intInstr = InStr(intPos, source & " ", " ")
    Do Until intInstr = 0
        If InStr(target, Mid(source, intPos, intInstr - intPos)) > 0 Then
            intHits = intHits + 1
        End If
        intPos = intInstr + 1
        intInstr = InStr(intPos, source & " ", " ")
    If intHits > 0 Then
        compareMovies = target
    End If
End Function
Determine the Perfect Price for Your IT Services

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden with our free interactive tool and use it to determine the right price for your IT services. Download your free eBook now!

Here is how I would approach this
(Programmers love to go overboard)

strFN = "Pulp_Fiction xvd hjkhj stero.avi"
strSeparaters = " @$%^&()!+=~`,.;"
' These could be anything that separates a word
' as these are the most valid characters that can be in
' a file name and not be a letter or number.
For Y = 1 to Len(strSeparaters)
   For X = 1 to Len(strFN)
      If Mid(strFN,X,1)=Mid(strSeparaters ,Y,1) then
         If Not intStart = 0 then   'I use this to flag that I should be looking
                                            'for the intEnd not the intStart.
            intStart = X+1   'Don't want this character in the word so +1.
            intEnd = X-1   'Don't want this character in the word so -1.
            ' Now I have marked the Start and End of a word
            For Each BadWord in List
                If Mid(strFN,intStart,intEnd) <> BadWord Then _ 'Note the underscore
                   strDisplayTitle = strDisplayTitle & Mid(strFN,intStart,intEnd) & " "
            intStart = 0
            intEnd = 0
         End If
      End If
Next X, Y

This should do the trick. Understand that I haven't tested this code and is from Head to Paper. If I made a mistake, I am sure you can figure it out. I think what you were looking for mainly was an algorithm.


Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Ok, I noticed I should have added the underscore to

strSeparaters = " @$%^&()!+=~`,.;_"

and failed to clarify that...

For Each Badword in List

was to presume that you know how to create a LIST and a variable BADWORDS set to that list.

There are some variations to this you could apply, but still you get the algorithm.

Ok,here's another approach. This example will load the file into a recordset in memory which can then be filtered for your search terms. For this example I created a test movie list and added a command button and textbox to a form.If more than one movie is found that matches the criteria it will return them as a list seperated by carriage return line feeds.

' add reference to Microsoft ActiveX Data Objects Library

' add to declarations area
Dim rsMovies As ADODB.Recordset

Private Sub Form_Load()
 'load the movies into recordset
Call LoadMovieList("C:\Movies2.txt")
End Sub

Private Sub Command1_Click()
'pass search term typed in textbox
MsgBox guessName(Text1.Text)
End Sub

Private Sub LoadMovieList(MoveListFile As String)
Dim ff As Integer, Ln As String

' create recordset
Set rsMovies = New ADODB.Recordset
With rsMovies.Fields
    .Append "MovieTitle", adVarChar, 60 ' add text field 60 chars
End With

' read file into recordset
ff = FreeFile
Open MoveListFile For Input As #ff
Do Until EOF(ff)
  Line Input #ff, Ln
    rsMovies.Fields("MovieTitle") = Trim$(Ln)
Close #ff

End Sub

Public Function guessName(source As String) As String
Dim arrTitle() As String, x As Long, strSearch As String

source = Replace(source, "-", " ")

' load words into an array
If InStr(source, " ") Then
    arrTitle = Split(source, " ")
    ' build filter string
    For x = 0 To UBound(arrTitle)
        If x = 0 Then
            strSearch = "MovieTitle Like '%" & arrTitle(x) & "%'"
            strSearch = strSearch & " AND " & "MovieTitle Like '%" & arrTitle(x) & "%'"
        End If
    strSearch = "MovieTitle Like '%" & source & "%'"
End If

' filter recordset for items that contain search terms
rsMovies.Filter = strSearch

' build return string
guessName = ""
With rsMovies
While Not .EOF
    If guessName = "" Then
        guessName = .Fields("MovieTitle").Value
        guessName = guessName & vbCrLf & .Fields("MovieTitle").Value
    End If
End With
' release filter
rsMovies.Filter = adFilterNone
End Function

Private Sub Form_Unload(Cancel As Integer)
' cleanup recordset
If Not rsMovies Is Nothing Then
    If rsMovies.State <> adStateClosed Then
    End If
    Set rsMovies = Nothing
End If
End Sub
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Visual Basic Classic

From novice to tech pro — start learning today.