LINQ Query to select records by searching with multiple words

My user enters a query screen and types single words with spaces in a text box.  For example 500 NASA  Select GetFiles button.

I use this code to prepare for my query:

If Me.txtTitle.Text.Contains(" ") Then

Dim kWord As String() = Me.txtTitle.Text.Split(" ")

Dim start As Integer = 1

For Each x In kWord

Dim count As Integer = kWord.Count

If start = count Then
  TitleKeys &= Chr(34) & x & Chr(34) & ")"
Else
   TitleKeys &= Chr(34) & x & Chr(34) & ") And r.Title.Contains("
    start += 1
End If
Next
       
Else
            'do nothing
End If

Result is a string that shows this:  ""500") And r.Title.Contains("NASA")"

I then do this in case the user just types in a single word:

 Dim titl As String = ""

If Me.txtTitle.Text <> "" Then
     Me.Button1.PerformClick()  'Where the above code is run

     If TitleKeys = "" Then
            titl = CStr(Me.txtTitle.Text)
     Else
             titl = TitleKeys
      End If
End If



I try to pass it to my LINQ Query.  Here is that code:

Dim recs = From r In d.tblRecordInfos _
                           Where r.Title.Contains(titl) _
                           And r.Rec_ID.Contains(rID) _
                           And r.Author.Contains(auth) _
                           And r.Author_org.Contains(Org) _
                           And r.Description.Contains(desc) _
                           And r.KeyWords.Contains(key) _
                           And r.E_File.Contains(ef) _
                           And r.File_Type = CStr(fty) _
                           And r.Category_ID.Contains(cID) _
                           And r.Sub_category_ID.Contains(scID) _
                           Order By r.Title Ascending _
                           Select r

           
Here is what I think the line should say  Where r.Title.Contains("500" And r.Title.Contains("NASA") _  but it's not.  I think it has to do with the "" at the start and end of the string.

This, of course, works perfect and is the end result that I want.  

Dim recs = From r In d.tblRecordInfos _
                           Where r.Title.Contains("500") And r.Title.Contains("NASA") _
                           And r.Rec_ID.Contains(rID) _
                           And r.Author.Contains(auth) _
                           And r.Author_org.Contains(Org) _
                           And r.Description.Contains(desc) _
                           And r.KeyWords.Contains(key) _
                           And r.E_File.Contains(ef) _
                           And r.File_Type = CStr(fty) _
                           And r.Category_ID.Contains(cID) _
                           And r.Sub_category_ID.Contains(scID) _
                           Order By r.Title Ascending _
                           Select r

Any help would be much appreciated by me and the staff!!
Thanks!!
Karen WilsonProgrammerAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

it_saigeDeveloperCommented:
I think this should assist you:
Module Module1
	ReadOnly sentences As List(Of String) = New List(Of String) From { _
	 "The brown fox jumped quickly.", _
	 "The yellow fox jumped quickly.", _
	 "The red fox jumped slowly.", _
	 "The blue fox jumped slowly." _
	}

	Sub Main()
		Dim terms = "fox jumped".Split(" ")
		Dim query = (From sentence In sentences
		 Let words = sentence.Split(New Char() {" ", ",", ".", ";", ":"}, StringSplitOptions.RemoveEmptyEntries)
		 Where words.Distinct().Intersect(terms).Count() = terms.Count()
		 Select sentence)

		Console.WriteLine("Should return all:")
		For Each sentence In query
			Console.WriteLine(sentence)
		Next

		Console.WriteLine()

		terms = "fox quickly".Split(" ")
		Console.WriteLine("Should return the first two:")
		For Each sentence In query
			Console.WriteLine(sentence)
		Next

		Console.WriteLine()

		terms = "jumped slowly".Split(" ")
		Console.WriteLine("Should return the last two:")
		For Each sentence In query
			Console.WriteLine(sentence)
		Next

		Console.ReadLine()
	End Sub
End Module

Open in new window

Which produces the following output -Capture.JPG
-saige-
0
Karen WilsonProgrammerAuthor Commented:
If I am ready this correctly, you remove the spaces of all the sentences?  I will be searching over 400,000 document titles, key words, descriptions, etc.  

I think there is an easier way.  I just need to know how to manipulate the Where statement in my query.  It's been a long day, so maybe I'm not getting your answer.
0
Karen WilsonProgrammerAuthor Commented:
I'm trying your code and I keep getting this error.  

Additional information: Local sequence cannot be used in LINQ to SQL implementations of query operators except the Contains operator.
0
Cloud Class® Course: Amazon Web Services - Basic

Are you thinking about creating an Amazon Web Services account for your business? Not sure where to start? In this course you’ll get an overview of the history of AWS and take a tour of their user interface.

it_saigeDeveloperCommented:
The premise behind this algorithm is that you segment the sentences into their own individual list of words (which is represented as Set A) and you segment your search items into their own individual list of words (which is represented as Set B).

Once you have your two sets, you find the set intersection.

You can even decide if case matters:
Module Module1
	ReadOnly sentences As List(Of String) = New List(Of String) From { _
	 "The brown fox jumped quickly.", _
	 "The yellow fox jumped quickly.", _
	 "The red fox jumped slowly.", _
	 "The blue fox jumped slowly." _
	}

	Sub Main()
		Dim terms = "fox jumped".Split(" ")
		Dim query = (From sentence In sentences
		 Let words = sentence.Split(New Char() {" ", ",", ".", ";", ":"}, StringSplitOptions.RemoveEmptyEntries)
		 Where words.Distinct().Intersect(terms, StringComparer.OrdinalIgnoreCase).Count() = terms.Count()
		 Select sentence)

		Console.WriteLine("Should return all:")
		For Each sentence In query
			Console.WriteLine(sentence)
		Next

		Console.WriteLine()

		terms = "fox quickly".Split(" ")
		Console.WriteLine("Should return the first two:")
		For Each sentence In query
			Console.WriteLine(sentence)
		Next

		Console.WriteLine()

		terms = "thE sLoWlY".Split(" ")
		Console.WriteLine("Should return the last two:")
		For Each sentence In query
			Console.WriteLine(sentence)
		Next

		Console.ReadLine()
	End Sub
End Module

Open in new window

Which produces the same output as above.

It really doesn't get much easier than that.

-saige-
0
it_saigeDeveloperCommented:
Which represents your SQL data, the sentences or the terms?

-saige-
0
Karen WilsonProgrammerAuthor Commented:
The sentences

If Me.txtTitle.Text.Contains(" ") Then

            Dim terms As String() = Me.txtTitle.Text.Split(" ")

            Dim query = (From sentence In d.tblRecordInfos _
                        Let words = sentence.Title.Split(New Char() {" ", ",", ".", ";", ":"}, StringSplitOptions.RemoveEmptyEntries)
                        Where words.Distinct().Intersect(terms, StringComparer.OrdinalIgnoreCase).Count() = terms.Count()
                        Select sentence)


            For Each sentence In query
                Console.WriteLine(sentence)
            Next
        End If
0
it_saigeDeveloperCommented:
As you made reference to it, I am assuming that you are using a Linq to SQL generated context.  If that is the case, then I believe the simple resolution is to cast your table to an enumerable.  Something like the following (note that I changed the split to use a regular expression, it essentially splits on anything that is not an alpha numeric character):
Imports System.Text.RegularExpressions
Imports System.Runtime.CompilerServices
Module Module1
	Sub Main()
		Dim context As New EE_Q28709245_DataClassesDataContext()
		Dim terms = "young knowledge".Split(" ")
		Dim query = (From book In context.EBooks.AsEnumerable()
		 Let title = Regex.Split(book.Title, "(i?)[^a-z0-9]+")
		 Let content = Regex.Split(book.Content, "(i?)[^a-z0-9]+")
		 Where (title.Distinct().Intersect(terms, StringComparer.OrdinalIgnoreCase).Count() = terms.Count()) OrElse
		 (content.Distinct().Intersect(terms, StringComparer.OrdinalIgnoreCase).Count() = terms.Count())
		 Select book)

		For Each story In query
			Console.WriteLine(story.Title)
			Console.WriteLine()
			Console.WriteLine(story.Content.Wrap(60))
		Next

		Console.ReadLine()
	End Sub
End Module

Module Extensions
	<Extension()> _
	Public Function Wrap(ByVal source As String, ByVal length As Integer) As String
		If String.IsNullOrEmpty(source) Then Return String.Empty
		If source.Length = 0 Then Return String.Empty

		Dim words As String() = source.Split(New String() {" "}, StringSplitOptions.RemoveEmptyEntries)
		Dim lines As List(Of String) = New List(Of String)
		Dim group As String = String.Empty
		For Each word In words
			If (group.Length > length) OrElse ((group.Length + word.Length) > length) Then
				lines.Add(group)
				group = String.Empty
				Continue For
			End If

			group = If(group.Length > 0, String.Format("{0} {1}", group, word), String.Format("{0}{1}", group, word))
		Next
		If group.Length > 0 Then lines.Add(group)

		Return String.Join(vbLf, lines.ToArray())
	End Function
End Module

Open in new window

From the following SQL dataset -Capture.JPGProduces the following output -Capture.JPG-saige-
0
Karen WilsonProgrammerAuthor Commented:
Thanks, it's getting closer!  It didn't stop as soon as it hit query.  It is also taking approximately 3 minutes to run it.  

The next error is:
Conversion from string "500 NASA" to type 'Boolean' is not valid.  Does that have to do with your alpha numeric?  I tried using your original code on that line but it threw an error as well.


Here is how I have the code:

Dim terms As String() = Me.txtTitle.Text.Split(" ")

Dim query = (From id In d.tblRecordInfos.AsEnumerable()
   Let title = Regex.Split(id.Title, "(i?)[^a-z0-9]+")
   Where (title.Distinct().Intersect(terms, StringComparer.OrdinalIgnoreCase).Count() = terms.Count())
    Select id)
0
it_saigeDeveloperCommented:
No the conversion error is something else.  The regular expression basically states (I just noticed an error *whoopsy-daisy*) - the regular expression broken down should mean this:
(?i) = an inline modifier that stipulates that from this point on regular expression is case insensitive for [a-zA-Z].
[^a-z0-9]+ = Match a singe character not present in the list a-z (case insensitive) and 0-9.  Do this as many times  as possible giving back as needed.Capture.JPGWe could also write the regular expression like this - "[^a-zA-Z0-9]+", which would mean the same thing as above, without the inline modifier, we just *explicitly* include all cases that we want to exclude.Capture.JPG
So, the change to your code would be:
Dim query = (From id In d.tblRecordInfos.AsEnumerable()
   Let title = Regex.Split(id.Title, "(?i)[^a-z0-9]+")
   Where (title.Distinct().Intersect(terms, StringComparer.OrdinalIgnoreCase).Count() = terms.Count())
    Select id)

Open in new window


Are you stepping through your code?  Can you take a snippet of the error and line that it occurs on?

-saige-
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Karen WilsonProgrammerAuthor Commented:
Woot to the hoot, it pulled the three records and it populated in my datagrid!!  MUCHO GRACIAS!!!  You have saved my day.  :O)    

For others that may look, here is my final code.  

Try

            If Me.txtTitle.Text.Contains(" ") Then

                Dim terms As String() = Me.txtTitle.Text.Split(" ")

                Dim query = (From id In d.tblRecordInfos.AsEnumerable()
                    Let title = Regex.Split(id.Title, "(?i)[^a-z0-9]+")
                    Where (title.Distinct().Intersect(terms, StringComparer.OrdinalIgnoreCase).Count() = terms.Count())
                    Select id).ToList


                If query.Count > 0 Then
                    Me.TblRecordInfoDataGridView.DataSource = query
                Else
                    MsgBox("No records were retrieved.")
                End If


            End If

        Catch ex As Exception
            MessageBox.Show(ex.Message & vbCrLf & "Reference Material" & vbCrLf & ex.StackTrace, My.Application.Info.Title, MessageBoxButtons.OK, MessageBoxIcon.Error)
        End Try
0
Karen WilsonProgrammerAuthor Commented:
I forgot this was added as well:
Imports System.Text.RegularExpressions
Imports System.Runtime.CompilerServices
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
.NET Programming

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.