How can I search a list of phrases for specific terms?

Posted on 2006-04-08
Last Modified: 2010-05-01

I have written a small database program using access and VB to store all the movies I have along with their location, format, director, cast etc. I also have a text file full of names of movies and I want to be able to search through this list for a specific set of words or a word. Basically, I have set my program to automatically enter the file name of a movie when I select it, but the name often contains lots of other words and characters that aren't relevant, so i want the program to compare each word in the file name with the names of the movies in my text file and suggest possibilities.

For example, If the file name is Pulp_Fiction xvd hjkhj stero.avi, it should suggest Pulp Fiction

I hope I have explained well enough. Please ask if I haven't been clear.

Question by:eadam-uk
    LVL 52

    Expert Comment

    by:Carl Tawn
    Not quite sure what you're trying to do. But you could try something like:

    Option Explicit

    Private Sub Command1_Click()

        Dim fso As New Scripting.FileSystemObject
        Dim fol As Folder
        Dim fil As File
        Set fol = fso.GetFolder("C:\My Documents\My Pictures")
        For Each fil In fol.Files
            If CheckFile(fil.Name) Then
                lstResults.AddItem fil.Name
            End If
        Next fil

    End Sub

    Private Function CheckFile(ByVal fileName As String) As Boolean

        CheckFile = True   '// Default
        Dim arrWords() As String
        Dim i As Integer
        If Len(txtSearch.Text) > 0 Then
            arrWords = Split(txtSearch.Text, " ")
            If UBound(arrWords) = 0 Then
                ReDim arrWords(1)
                arrWords(0) = txtSearch.Text
            End If
            For i = 0 To UBound(arrWords) - 1
                If arrWords(i) <> "" Then
                    If InStr(fileName, arrWords(i)) < 1 Then
                        CheckFile = False
                        Exit For
                    End If
                End If
            Next i
        End If
    End Function

    Author Comment

    sorry, I guess I didn't explain that well.

    What i mean is say I have a file name held in a string "pulp fiction xvid ac3.avi", I want the program to suggest "pulp fiction" to the user. I have a list of movie titles in a text file that I read in, but I need to compare them.

    At the moment my code looks like this:

    Public Function guessName(source As String) As String
    Dim cnt As Long
    Dim tmp As String
    Dim tmpArray() As String

    cnt = 0

    Dim oFSO As New FileSystemObject
    Dim oFSTR As Scripting.TextStream

    ReDim tmpArray(2000)

         Set oFSTR = oFSO.OpenTextFile(App.Path & "\MovieList2.txt")

         Do While Not oFSTR.AtEndOfStream

                tmpArray(cnt) = oFSTR.ReadLine

                cnt = cnt + 1

       Dim i As Integer
       Dim tmpStr As String
       For i = LBound(tmpArray) To UBound(tmpArray)
        tmpStr = compareMovies(source, tmpArray(i))
        If (Not tmpStr = "") Then
            guessName = tmpStr
        End If
       Next i

    End Function

    Private Function compareMovies(source As String, target As String) As String

    If (source = "" Or target = "") Then
        Exit Function
    End If

    Dim temp As String
    Dim pos As Long

    If (source = target) Then
        compareMovies = source
    ElseIf (InStr(source, target)) Then
        compareMovies = target
    End If

    End Function

    This works if the exact name that is in the text file is in the file name, but often it isn't. So I need a way of guessing what the name could be based on the filename.

    For example.

    If the file name is X-men.avi but my list of films contains "The X men" it won't find it.

    Does that make more sense?

    Thanks again,
    LVL 44

    Expert Comment

    by:Leigh Purvis
    You could perhaps have a comparison that looks for each word in the source name in the file name.
    And counts the number of words that it gets a hit from.
    Return those to the main function - along with the hit count, and present the user with that entire list (sorted by hit count).  OK - so you'd get a load from "The" - hence the need for sorting ;-)

    Something like

    Function compareMovies(source As String, target As String, intHits As Integer) As String

    Dim temp As String
    Dim intPos As Long
    Dim intInstr As Integer

        If (source = "" Or target = "") Then
            Exit Function
        End If
        intPos = 1
        intInstr = InStr(intPos, source & " ", " ")
        Do Until intInstr = 0
            If InStr(target, Mid(source, intPos, intInstr - intPos)) > 0 Then
                intHits = intHits + 1
            End If
            intPos = intInstr + 1
            intInstr = InStr(intPos, source & " ", " ")
        If intHits > 0 Then
            compareMovies = target
        End If
    End Function

    Accepted Solution

    Here is how I would approach this
    (Programmers love to go overboard)

    strFN = "Pulp_Fiction xvd hjkhj stero.avi"
    strSeparaters = " @$%^&()!+=~`,.;"
    ' These could be anything that separates a word
    ' as these are the most valid characters that can be in
    ' a file name and not be a letter or number.
    For Y = 1 to Len(strSeparaters)
       For X = 1 to Len(strFN)
          If Mid(strFN,X,1)=Mid(strSeparaters ,Y,1) then
             If Not intStart = 0 then   'I use this to flag that I should be looking
                                                'for the intEnd not the intStart.
                intStart = X+1   'Don't want this character in the word so +1.
                intEnd = X-1   'Don't want this character in the word so -1.
                ' Now I have marked the Start and End of a word
                For Each BadWord in List
                    If Mid(strFN,intStart,intEnd) <> BadWord Then _ 'Note the underscore
                       strDisplayTitle = strDisplayTitle & Mid(strFN,intStart,intEnd) & " "
                intStart = 0
                intEnd = 0
             End If
          End If
    Next X, Y

    This should do the trick. Understand that I haven't tested this code and is from Head to Paper. If I made a mistake, I am sure you can figure it out. I think what you were looking for mainly was an algorithm.



    Expert Comment

    Ok, I noticed I should have added the underscore to

    strSeparaters = " @$%^&()!+=~`,.;_"

    and failed to clarify that...

    For Each Badword in List

    was to presume that you know how to create a LIST and a variable BADWORDS set to that list.

    There are some variations to this you could apply, but still you get the algorithm.

    LVL 28

    Assisted Solution

    Ok,here's another approach. This example will load the file into a recordset in memory which can then be filtered for your search terms. For this example I created a test movie list and added a command button and textbox to a form.If more than one movie is found that matches the criteria it will return them as a list seperated by carriage return line feeds.

    ' add reference to Microsoft ActiveX Data Objects Library

    ' add to declarations area
    Dim rsMovies As ADODB.Recordset

    Private Sub Form_Load()
     'load the movies into recordset
    Call LoadMovieList("C:\Movies2.txt")
    End Sub

    Private Sub Command1_Click()
    'pass search term typed in textbox
    MsgBox guessName(Text1.Text)
    End Sub

    Private Sub LoadMovieList(MoveListFile As String)
    Dim ff As Integer, Ln As String

    ' create recordset
    Set rsMovies = New ADODB.Recordset
    With rsMovies.Fields
        .Append "MovieTitle", adVarChar, 60 ' add text field 60 chars
    End With

    ' read file into recordset
    ff = FreeFile
    Open MoveListFile For Input As #ff
    Do Until EOF(ff)
      Line Input #ff, Ln
        rsMovies.Fields("MovieTitle") = Trim$(Ln)
    Close #ff

    End Sub

    Public Function guessName(source As String) As String
    Dim arrTitle() As String, x As Long, strSearch As String

    source = Replace(source, "-", " ")

    ' load words into an array
    If InStr(source, " ") Then
        arrTitle = Split(source, " ")
        ' build filter string
        For x = 0 To UBound(arrTitle)
            If x = 0 Then
                strSearch = "MovieTitle Like '%" & arrTitle(x) & "%'"
                strSearch = strSearch & " AND " & "MovieTitle Like '%" & arrTitle(x) & "%'"
            End If
        strSearch = "MovieTitle Like '%" & source & "%'"
    End If

    ' filter recordset for items that contain search terms
    rsMovies.Filter = strSearch

    ' build return string
    guessName = ""
    With rsMovies
    While Not .EOF
        If guessName = "" Then
            guessName = .Fields("MovieTitle").Value
            guessName = guessName & vbCrLf & .Fields("MovieTitle").Value
        End If
    End With
    ' release filter
    rsMovies.Filter = adFilterNone
    End Function

    Private Sub Form_Unload(Cancel As Integer)
    ' cleanup recordset
    If Not rsMovies Is Nothing Then
        If rsMovies.State <> adStateClosed Then
        End If
        Set rsMovies = Nothing
    End If
    End Sub

    Write Comment

    Please enter a first name

    Please enter a last name

    We will never share this with anyone.

    Featured Post

    IT, Stop Being Called Into Every Meeting

    Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

    Introduction In a recent article ( for the Excel community, I showed an improved version of the Excel Concatenate() function.  While writing that article I realized that no o…
    This article describes some techniques which will make your VBA or Visual Basic Classic code easier to understand and maintain, whether by you, your replacement, or another Experts-Exchange expert.
    As developers, we are not limited to the functions provided by the VBA language. In addition, we can call the functions that are part of the Windows operating system. These functions are part of the Windows API (Application Programming Interface). U…
    Show developers how to use a criteria form to limit the data that appears on an Access report. It is a common requirement that users can specify the criteria for a report at runtime. The easiest way to accomplish this is using a criteria form that a…

    737 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    22 Experts available now in Live!

    Get 1:1 Help Now