oracle exception while using CONTAINS function

mikha
mikha used Ask the Experts™
on
I am using Contains function for text search in a table in a ORACLE database. see example below.

I'm passing text_param as a ORACLE parameter. I have a input text field in my web page, for users to input any text they want to search.
the query works, except when I enter characters like $ or * , in the text field and pass it to the parameter. I get the exception below.

how do i make sure I cover such use case. do I restrict user to input search characters but how do i know which characters to look for?

SELECT title, issue_date from details WHERE CONTAINS(text_data, text_param, 1) > 0

exception ------------------------------------------------------

Oracle.DataAccess.Client.OracleException : 'ORA-29902: error in executing ODCIIndexSTART() routine
ORA-20000: Oracle Text error:
DRG-50901: text query parser syntax error on line 1, column 2
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
The "$" is a stemming shortcut and "*" is for weight.

Where to look is always in the documentation.  All the operators are there:
https://docs.oracle.com/en/database/oracle/oracle-database/12.2/ccref/oracle-text-CONTAINS-query-operators.html#GUID-6410B783-FC9A-4C99-B3AF-9E0349AA43D1

It isn't just special characters.  There are function operators you need to be concerned with.

Use your search for PT.  My guess it will error as well since it is a Partial Term function that expects paramaters.

I wrote my own function, no I cannot provide it, to clean the passed in string to ensure I had complete control of what was actually searched.

Another problem you have is how the Text index was created.  By default non-word characters cause token separation.  I had to search for email addresses in the text but by default "username@foo.bar" was indexed as three individual tokens "username", "foo" and "bar".  If I searched for the string "username@foo.bar", it was never found.

If you want them kept together when indexed, you need PRINTJOINS when you created the index.

Author

Commented:
@slightwv - thanks for your comment. I am going to read the oracle documentation.
but thinking about this more, how do we query , if we actually want to search for "*" or say "$" characters then?
Most Valuable Expert 2012
Distinguished Expert 2018

Commented:
Those characters aren't indexed unless you set up printjoins so you cannot search them per say.

You can escape the special characters and sort of find the string but it isn't because it searches those characters.  It ignores them.

Set up a quick test and see.  As you can see, the special characters aren't part of the Text index but the query doesn't syntax out when you escape them BUT it also returns both rows.

SQL> drop table tab1 purge;

Table dropped.

SQL> create table tab1(col1 varchar2(50));

Table created.

SQL> insert into tab1 values('Hello*World$Test');

1 row created.

SQL> insert into tab1 values('Hello#World^Test');

1 row created.

SQL> commit;

Commit complete.

SQL>
SQL> create index tab1_idx on tab1(col1) indextype is ctxsys.context;

Index created.

SQL>
SQL> select token_text from dr$tab1_idx$i;
HELLO
TEST
WORLD

SQL>
SQL> select * from tab1 where contains(col1,'Hello\*World\$Test') > 0;
Hello*World$Test
Hello#World^Test

Open in new window


You can also just remove the special characters from the search and find it:
select * from tab1 where contains(col1,'Hello World Test') > 0;
Hello*World$Test
Hello#World^Test

Open in new window


You still have the issue of strings that are also the same as a Text function.  I double quote those.

SQL> drop table tab1 purge;

Table dropped.

SQL> create table tab1(col1 varchar2(50));

Table created.

SQL> insert into tab1 values('PT Barnum');

1 row created.

SQL> commit;

Commit complete.

SQL>
SQL> create index tab1_idx on tab1(col1) indextype is ctxsys.context;

Index created.

SQL>
SQL> select token_text from dr$tab1_idx$i;
BARNUM
PT

SQL>
SQL> select * from tab1 where contains(col1,'PT Barnum') > 0;
select * from tab1 where contains(col1,'PT Barnum') > 0
*
ERROR at line 1:
ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-50901: text query parser syntax error on line 1, column 4


SQL> select * from tab1 where contains(col1,'"PT" Barnum') > 0;
PT Barnum

Open in new window


So far we are doing exact string search.  For the above example, if you search 'PT Barnum', what if the indexed text is 'PT Bob Barnum', do you want to find that as well?
Ensure you’re charging the right price for your IT

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden using our free interactive tool and use it to determine the right price for your IT services. Start calculating Now!

Author

Commented:
Thanks again.

yes, I would like to be able to find, 'PT Bob Barnum' if i search for 'PT Barnum'.  

you mentioned that in sample code you posted,  it is doing exact string search.  is the reason , you are escaping all the special characters and also any text function is within double quotes?

I'm new to oracle.  so I'm not sure what PRINTJOINS  is. I will research this. but for my basic search,

I am planning to use regular expression, to escape any special characters .
and also parse the user input for any text function and escape them as well, by using double quotes or use {} curly braces around them.

does that sound like right direction to go?
Most Valuable Expert 2012
Distinguished Expert 2018
Commented:
>>it is doing exact string search.  is the reason

No.  Look at the example I posted.  It uses the string "PT Barnum" so that is what is searches for.

I used quotes to be able to use PT and not have Text think it is the Partial Term function,

Use simple test cases to try things.  I added a new row "PT Bob Barnum" and only searched for "PT Barnum"

Notice it doesn't find both rows.

drop table tab1 purge;
create table tab1(col1 varchar2(50));
insert into tab1 values('PT Barnum');
insert into tab1 values('PT Bob Barnum');
commit;
create index tab1_idx on tab1(col1) indextype is ctxsys.context;

select * from tab1 where contains(col1,'"PT" Barnum') > 0;

Open in new window


>>o I'm not sure what PRINTJOINS  is. I will research this. but for my basic search,

Here is the doc link:
https://docs.oracle.com/en/database/oracle/oracle-database/12.2/ccref/oracle-text-indexing-elements.html#GUID-253EE2AA-F9C6-49E1-9EA4-4A05F803FF24

Get used to spending a LOT of time in the Text Reference.

>>does that sound like right direction to go?

I used a PL/SQL function to clean my search inputs but a regex should work as well.  Whatever is easier for you to maintain when you come across different problems.

>>to escape any special characters .

Escaping might not be what you want.  Looking in my example above you had * and $ but it also returned "Hello#World^Test"  If that isn't right then escaping won't work for you.

>>yes, I would like to be able to find, 'PT Bob Barnum' if i search for 'PT Barnum'.  


Then you probably need to convert your search string to use the NEAR function where the tokens appear near each other.

This searches for PT near BARNUM in a 10 word proximity and preserving order (PT MUST be to the left of Barnum):
SQL> select * from tab1 where contains(col1,'near(("PT",Barnum),10,true)') > 0;

COL1
--------------------------------------------------
PT Barnum
PT Bob Barnum

Open in new window


If that is what you need, you'll need to do more to the input string than just escape and quote.

You probably aslo want to take care of the boolean  operators like AND and OR.  If those are part of your search string, they can cause unforeseen issues since Text will take them literally if you don't account for them.

You also need to be concerned with Index fragmentation as new records are added and indexed.  You'll probably want to set up an optimize_index job to run once in a while.  How often depends on data changes and overall index performance.

Are you beginning to see that a simple Text search isn't all that simple?

Author

Commented:
thank you so much for your insight.

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial