Link to home
Start Free TrialLog in
Avatar of dageyra
dageyraFlag for United States of America

asked on

SQL stored procedure querying XML parameter

We have a working search feature in our application that takes the user search terms and creates an XML parameter to the SQL stored procedure, allowing the result set to be filtered based on the search parameters.  This was done after some research using CROSS APPLY.  We now want to expand on this because it does an 'OR' search for all the terms, and we want to offer the ability to do a 'AND' search.

Here is an example of the query in the SP.

CREATE procedure [dbo].[getresources]
	@ModuleId int,
	@xmlTerms XML,
	@allTerms int

as
select distinct ModuleId,
       ItemId,
       Description,
       Title
from resources with (nolock)
CROSS APPLY @xmlTerms.nodes('//SearchTerms/SearchTerm') xmlTerms(i)
    WHERE (Title LIKE '%' + xmlTerms.i.value('.', 'VARCHAR(50)') + '%'
			OR
			Description LIKE '%' + xmlTerms.i.value('.', 'VARCHAR(50)') + '%')
	AND ModuleId = @ModuleId

Open in new window


A sample XML would like this:
<SearchTerms>
  <SearchTerm>bob</SearchTerm>
  <SearchTerm>jane doe</SearchTerm>
</SearchTerms>

Open in new window


This works fine but we're finding too many results are returned for several keywords because the CROSS APPLY is an 'OR' search, so we are trying to add an 'AND' option.  It will be controlled by the allTerms parameter, which would be 0 for the standard 'OR' and 1 for the 'AND'.  

The problem is how to get the query to check the value of the column against the XML parameter value(s).  I've tried using a IN (SELECT ...) but kept running into syntax problems due to the conversion of XML to a dataset.  Most of the examples online deal with taking XML input and creating records in a database, not searching records based on the XML parameter.

Any thoughts on how this might be possible?
Avatar of PortletPaul
PortletPaul
Flag of Australia image

I think the answer to "can it be done?" = yes

Not quite sure if this is what you are after, bit could you try this?
CREATE PROCEDURE [dbo].[getresources]
      @ModuleId INT
    , @xmlTerms XML
    , @allTerms INT
AS
SELECT DISTINCT
      ModuleId
    , ItemId
    , Description
    , Title
FROM resources WITH (NOLOCK)
CROSS APPLY @xmlTerms.nodes('//SearchTerms/SearchTerm') xmlTerms(i)
WHERE ModuleId = @ModuleId
AND (
     (
        @allTerms = 0 AND (
                   Title LIKE '%' + xmlTerms.i.value('.', 'VARCHAR(50)') + '%'
                OR Description LIKE '%' + xmlTerms.i.value('.', 'VARCHAR(50)') + '%'
                          )
     )
    OR
     (
        @allTerms = 1 AND (
                    Title LIKE '%' + xmlTerms.i.value('.', 'VARCHAR(50)') + '%'
                AND Description LIKE '%' + xmlTerms.i.value('.', 'VARCHAR(50)') + '%'
                          )
     )
    )

Open in new window

Avatar of dageyra

ASKER

Thanks for the reply.  I like how you incorporated the allTerms into the query, I was just going to do a if...else clause and have two separate queries.  

I don't mind trying it out because this cross apply/join stuff is complicated for me, but it seems like you misunderstood the intent  I will try to explain better in case it wasn't clear.  

In our current query, if "bob" & "jane doe" are passed in via XML, the query returns records that match either search term (bob OR jane doe).  I think that's done via the CROSS APPLY, but honestly I can't say I understand it well enough to say.  We want to allow the option to match both bob AND jane doe.  So the fact that we are searching both title & description is irrelevant, we could simplify and just search title, the idea is that when the user supplies multiple keywords, they all must match the title.

I still tried out the query and I get this error:


The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.

Thoughts?
OK - I was wondering what you would gain by the AND, so now I follow what you are after. Can you provide some sample data perhaps?

I'll re-consider the approach, but in the meantime you should know what that cross apply is actually doing.

<SearchTerms>
  <SearchTerm>first</SearchTerm>
  <SearchTerm>second</SearchTerm>
  <SearchTerm>third</SearchTerm>
</SearchTerms>

for a single record in the table 'bob', and with 3 search terms in the xml, that one record will become 3 records

in sql the intermediate resultset will appear as:
bob,first
bob,second
bob,third

In effect every row in the table is multiplied by the number of search terms in the xml.
Avatar of dageyra

ASKER

Thanks for the reply.  That actually makes a lot of sense, thanks for explaining that.  It also explains why I have to add distinct.  I more or less understand joins but wrapping my head around cross apply has been difficult.  I am only interested in the ability to pass XML parameters and the processing, of which I found a single example that used cross apply to filter data based on the XML.  By far the examples online are about taking XML and inserting records, which I could not figure out how to adapt to my needs.

So here's an example.  The table resources (I'm leaving out ModuleID because it's always the same, tied to a more global system):

ItemID         Title            Description
1                   First            This is the first resource
2                   Second       This is the second resource
3                   Third           This is the third resource
4                   Fourth        The fourth record

So let's say the user enters the search terms "this", "resource", and "first".  The system builds the XML as such:

<SearchTerms>
  <SearchTerm>this</SearchTerm>
  <SearchTerm>resource</SearchTerm>
  <SearchTerm>first</SearchTerm>
</SearchTerms>

The query we already have will return the first 3 records because the search term "this" is found in all 3 descriptions.  I thought that was because of the cross apply but I'm probably mistaken.  What I know is that it's doing is an 'OR' search, meaning that the title/description has to match any of the keywords.  I know this because it will never return the 4th record, which does not contain any of they keywords.

So we want to allow the option to do an 'AND' search, which would only return the first record because it's the only one that matches all 3 terms.  I can work out how to implement it for both title & description, but I struggle with the handling of the XML parameter in a way to require all of the terms to be found in a record that is returned.
Try something like this:
CREATE procedure [dbo].[getresources]
	@ModuleId int,
	@xmlTerms XML,
	@allTerms int

as

CREATE TABLE #SearchTerms(SearchTerm varchar(50))

SET NOCOUNT ON

INSERT	#SearchTerms(SearchTerm)
SELECT	'%' + xmlTerms.i.value('.', 'VARCHAR(50)') + '%'
FROM	@xmlTerms.nodes('/SearchTerms/SearchTerm') xmlTerms(i)

SELECT DISTINCT
        r.ModuleId,
        r.ItemId,
        r.Description,
        r.Title
FROM    resources r WITH (NOLOCK)
        INNER JOIN #SearchTerms s ON r.Title LIKE s.SearchTerm
                                     OR r.[Description] = s.SearchTerm
WHERE   ModuleId = @ModuleId

DROP TABLE #SearchTerms

Open in new window


But a word of caution, unless you are prepared to switch to using Full-Text Search that query is going to be a dog.

<edited to get rid of the unnecessary global XPath search>
Actually that is still not right.  Need to rethink.
Take two (still a dog though):
CREATE procedure [dbo].[getresources]
	@ModuleId int,
	@xmlTerms XML,
	@allTerms int

as

CREATE TABLE #SearchTerms(SearchTerm varchar(50))

SET NOCOUNT ON

INSERT	#SearchTerms(SearchTerm)
SELECT	'%' + xmlTerms.i.value('.', 'VARCHAR(50)') + '%'
FROM	@xmlTerms.nodes('/SearchTerms/SearchTerm') xmlTerms(i)

SELECT DISTINCT
        r.ModuleId,
        r.ItemId,
        r.Description,
        r.Title
FROM    resources r WITH (NOLOCK)
        INNER JOIN #SearchTerms s ON r.Title + '|' + r.[Description] LIKE s.SearchTerm
WHERE   ModuleId = @ModuleId

DROP TABLE #SearchTerms

Open in new window

Avatar of dageyra

ASKER

Yeh I was wondering how it differed.  If it helps at all, I don't mind two separate queries (one for the OR, one for the AND).  As I mentioned, I can do a IF...ELSE on the parameter that determines which type to execute.
Avatar of dageyra

ASKER

I tried out the second version, but it still returns records that match any search terms.
Avatar of dageyra

ASKER

Now that you have it in a table, I've been doing some research.  Do you know anything about the CONTAINS function:

I found an example here that I think could be modified to do what I need with full-text searching.

http://stackoverflow.com/questions/2237443/contains-with-multiple-terms-over-multiple-columns

Does this make sense as a solution?
ASKER CERTIFIED SOLUTION
Avatar of Anthony Perkins
Anthony Perkins
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Do you know anything about the CONTAINS function:
That is exactly what I meant by doing Full-Text Search.  But there is no leading wildcard search with FTS.
This is what your FTS Stored Procedure would look like:
CREATE procedure [dbo].[getresources]
	@ModuleId int,
	@xmlTerms XML,
	@allTerms int

AS

DECLARE @SearchTerms varchar(1000)

SET NOCOUNT ON

SELECT	@SearchTerms = ISNULL(@SearchTerms + ' AND ', '') + '"' + xmlTerms.i.value('.', 'VARCHAR(50)')  + '*"'
FROM	@xmlTerms.nodes('/SearchTerms/SearchTerm') xmlTerms(i)

SELECT	ModuleId,
        ItemId,
        Description,
        Title
FROM    resources WITH (NOLOCK)
WHERE   ModuleId = @ModuleId
	AND CONTAINS(*, @SearchTerms)
GROUP BY
	ModuleId,
        ItemId,
        Description,
        Title


DROP TABLE #SearchTerms

Open in new window


However:

1.

You have to setup FTS before you can use it.  Read up on it and see if it is for you.  It may not be available with the Express Edition.

2.

As stated previously there is no leading wildcard search.
Avatar of dageyra

ASKER

The non-FTS you provided would update the SP fine but when testing, it gives this error:

Could not find stored procedure 'SearchTerm'.

I'm thinking it's related to

INSERT      #SearchTerms(SearchTerm)

Any idea why it's trying to exec a SP?
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of dageyra

ASKER

I apologize, I reported the error incorrectly (when I test the SP, if I have anything highlighted, it tries to execute only what's highlighted).  Here is the error:

Invalid object name '#SearchTerms'.

This goes away if I add back in the create table, which I think you replaced with the declare statement correct?
Avatar of dageyra

ASKER

Oh, I see that one is a var with @ and the other has #.  I added the create table back in and am getting a comparison error.  I will try to work it out and let you know if it works.
This goes away if I add back in the create table, which I think you replaced with the declare statement correct?
Right.  Just drop the line:
DROP TABLE #SearchTerms

In my last attempt.
Avatar of dageyra

ASKER

It worked!  Thanks so much.  So can you elaborate on the speed concerns?
think you are getting confused

the current solution place the xml into a temp table - you must create that table before you can insert.
ID: 39215158

a second thread of conversation is about full text search (FTS) which doesn't use the temp table - but it requires change to your dbms to implement it.

I suggest, for now, ignore FTS.
Avatar of dageyra

ASKER

And to be clear, you intend this to be used as two separate queries, correct?  By that I mean my original idea of using the all terms flag to execute the original query we have or this new one you've developed.
But again (in case you missed it) in order to use Full-Text Search you have to implement it first.
It worked!  Thanks so much.  So can you elaborate on the speed concerns?
If you are talking about using LIKE with leading and trailing wildcards, just add only 100K rows and check how long it takes and more importantly how much CPU and memory it uses.
And to be clear, you intend this to be used as two separate queries, correct?  
It depends which approach you want to use.  With FTS you could probably combine it into a single query.
Avatar of dageyra

ASKER

Great, I understand.  We have our own server and sql 2005 license, but we are a small non-profit and I don't think that kind of usage is going to come into play.  I appreciate the help and both options.  Once everything is working I can talk to the team about optimizing using FTS.
Avatar of dageyra

ASKER

Wanted to show appreciation for both your assistance.  Thanks a lot, guys.