Oracle indexes and selects with special characters give ORA-29902

Hi,

I need help to build the correct index for a table.

Simplified scenario

- we have MySchema.MyTable
- MyTable has an index on the field PRODUCT_DESCR, which is a VARCHAR2(2000) field. The index is:

CREATE INDEX MYSCHEMA.IDX_PRODUCT_DESCR ON MYSCHEMA.MYTABLE
(PRODUCT_DESCR)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('SYNC(ON COMMIT)')
NOPARALLEL;

- the following select:

SELECT * FROM MyTable WHERE
CONTAINS( PRODUCT_DESCR, '%shoes*%', 1) > 0

leads to this error:

ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-50901: text query parser syntax error on line 1, column 6  

- the following select:

select * from MyTable
where contains( short_desc, '%sko,%', 1)>0

leads to this error:

ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-51030: wildcard query expansion resulted in too many terms
ORA-06512

Question
How should I modify IDX_PRODUCT_DESCR so that the selects above work correctly?
A stupid question maybe, but I am newbie in PL/SQL, and index seems difficult to me.

Thank you in advance.
kathysmithAsked:
Who is Participating?
 
slightwv (䄆 Netminder) Commented:
The DRG-51030: wildcard query expansion is caused my not setting the wildcard max terms.  I forget the exact parameter but it's in the docs.

I still also think the printjoins will not do what you want when it ends a word.
0
 
HainKurtSr. System AnalystCommented:
try

SELECT * FROM MyTable WHERE
CONTAINS( PRODUCT_DESCR, 'shoes', 1) > 0

0
 
HainKurtSr. System AnalystCommented:
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

 
slightwv (䄆 Netminder) Commented:
For the DRG-51030 there is a wildcard max terms parameter you can set.

I'm on mobile right now and can't get the link for the docs but the parameter is in the Text doc.
0
 
kathysmithAuthor Commented:
@HainKurt:
What I want to do is to get all products that contains "shoes,". That is the word "shoes" followed by a comma.

And I tried to achieve this with the following:

select * from MyTable
where contains( short_desc, '%sko,%', 1)>0

But you propose:

SELECT * FROM MyTable WHERE
CONTAINS( PRODUCT_DESCR, 'shoes', 1) > 0

And yes, your query works, but it is not what I need. Or?
0
 
kathysmithAuthor Commented:
Actually I need the following type of queries:

1) get all rows that starts with "shoes,"
2) get all rows that contains "shoes,"
3) get all rows that ends with "shoes,"

Insted of that comma, I could have * _ or other characters that Oracle don't like.
I guess I have to change the index and, maybe the SELECT. But how exactly?
0
 
slightwv (䄆 Netminder) Commented:
Oracle Text does not 'index' a comma.  It indexes 'words'.  You can tell Text to have certain characters keep 'words' together.  These are PRINTJOINS.

The Text docs talks about them.
0
 
HainKurtSr. System AnalystCommented:
did you try to escape liek this?

'shoes\,'
0
 
slightwv (䄆 Netminder) Commented:
Escaping doesn't matter. Text didn't index the comma.

You can see what was 'indexed' looking at TOKEN_TEXT in the DR$index_name$I view.
0
 
kathysmithAuthor Commented:
@HainKurt:
Yes, Ive tried to escape it. To be more exactly, what I've tried are:
'shoes\,'
'shoes\\,'
'shoes\*'
After all these, I've got:
DRG-51030: wildcard query expansion resulted in too many terms

So, as @slightwv says, escaping doesn't help.
But I've heard about printjoins, and I'm trying to learn more about this.
0
 
kathysmithAuthor Commented:
Guys, remember that I have to escape not only comma, but also other wildcards.
0
 
kathysmithAuthor Commented:
@slightwv: your answers lead me to the a very good article about index creation in Oracle:
http://download.oracle.com/docs/cd/E11882_01/text.112/e16594/ind.htm#i1006887
I'm trying to understand text indexes from this article, but it'll take at least 2-3 hours.
I think you're right when you say that I should look at TOKEN_TEXT in the DR$index_name$I view.
I guess I need PARAMETERS ('STOPLIST CTXSYS.EMPTY_STOPLIST') or alike.
0
 
kathysmithAuthor Commented:
@slightwv

Here is how what I have in Toad for Oracle:

DROP INDEX MYSCHEMA.IDX_PRODUCT_DESCR;

CREATE INDEX MYSCHEMA.IDX_PRODUCT_DESCR ON MYSCHEMA.MYTABLE
(PRODUCT_DESCR)
INDEXTYPE IS CTXSYS.CONTEXT
NOPARALLEL;

_________

DROP INDEX MYSCHEMA.DR$IDX_PRODUCT_DESCR$X;

CREATE INDEX MYSCHEMA.DR$IDX_PRODUCT_DESCR$X ON MYSCHEMA.DR$IDX_PRODUCT_DESCR$I
(TOKEN_TEXT, TOKEN_TYPE, TOKEN_FIRST, TOKEN_LAST, TOKEN_COUNT)
NOLOGGING
TABLESPACE MYTABLESPACE
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       UNLIMITED
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
COMPRESS 2;
0
 
kathysmithAuthor Commented:
@slightwv
I'm making progresses here :-)
The relevant article is:
http://download.oracle.com/docs/cd/E11882_01/text.112/e16593/cdatadic.htm#i1007615
The key issues are: BASIC_LEXER, printjoins, Case-Sensitive Indexing and Querying, indexing themes.

0
 
slightwv (䄆 Netminder) Commented:
Glad you are in the docs.  Its the best resource for Text.

You shouldn't need a null stoplist unless you want to index/search the words in default stop list.

I also don't think printjoins will work if the word 'ends' with that character.

I'm not near a real computer so cannot confirm that for characters like a comma.

If you have to search for words ending in a comma (and other characters) you might need regexp_instr.  Some of the Oracle regexp commands can also do case insensitive.
0
 
kathysmithAuthor Commented:
No luck here... This is what I've done:

1. Created an lexer and printjoins:

begin
ctx_ddl.create_preference('mylex', 'BASIC_LEXER');
ctx_ddl.set_attribute('mylex', 'printjoins', '*,:;');
end;

One can read about this here:
http://download.oracle.com/docs/cd/E11882_01/text.112/e16593/cdatadic.htm#i1007615

2. Then, I droped my index:

DROP INDEX MYSCHEMA.IDX_PRODUCT_DESCR  FORCE

3. And recreated it:

CREATE INDEX MYSCHEMA.IDX_PRODUCT_DESCR ON MYSCHEMA.MYTABLE
(PRODUCT_DESCR)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('SYNC(ON COMMIT) LEXER mylexer')
NOPARALLEL;

4. Executed again this:

SELECT * FROM MyTable WHERE
CONTAINS( PRODUCT_DESCR, '%shoes*%', 1) > 0

5. Result: I'm getting the same error:

ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-51030: wildcard query expansion resulted in too many terms
ORA-06512

What am I doing wrong? It seems so confusing to create an simple index...
0
 
kathysmithAuthor Commented:
@slightwv: thank you. Really appreciate it. I'll try more tomorrow, and let you know. This is a "battle" that I have to win in the end :-)
0
 
kathysmithAuthor Commented:
@slightwv: couldn't wait until tomorrow, and googled after DRG-51030 and wildcard_maxterms. Found this:
http://forums.oracle.com/forums/thread.jspa?threadID=313297
Very good explanation of what wildcard_maxterms means and sugestions on how to solve problems related to it.
Thanx again. You pointed me in the right direction.
0
 
slightwv (䄆 Netminder) Commented:
If you need to look for 'words' ending in a character, I'm still not convinced Text will work for you.
0
 
kathysmithAuthor Commented:
@slightwv: what kind of index will then be appropiate? Send me some key words so that I can search for them in the Oracle documentation. It is very much to read there, like a jungle, and I'm a beginner... Thanx :-)
0
 
slightwv (䄆 Netminder) Commented:
I understand the jungle comment!  I've been there!

The good news is I should finally be back on my development box tomorrow and can provide working examples.

Set up a simple table, insert the strings you need to find, create the index with the printjoins.

Look in the DR$...$I table and see if the 'words' you need are in there.

If not, you cannot search for them.

You might need the regexp_instr command I referenced above or a regular instr that forces case insensitive (this one is the last resorrt).
0
 
kathysmithAuthor Commented:
I have allready this table, and it allready contains about 55000 rows.
All the queries against this simple table are in a package with stored procedures. The stored procedures are executed from C#.
I'm using Toad for Oracle.

The table:

CREATE TABLE MYSCHEMA.MYTABLE
(
  ID             NUMBER,
  PRODUCT_DESCR  VARCHAR2(2000 BYTE)
)

The basic_lexer:

begin
ctx_ddl.create_preference('mylex', 'BASIC_LEXER');
ctx_ddl.set_attribute('mylex', 'printjoins', '*,:;');
end;

The index:

CREATE INDEX MYSCHEMA.IDX_PRODUCT_DESCR ON MYSCHEMA.MYTABLE
(PRODUCT_DESCR)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('SYNC(ON COMMIT) LEXER mylexer')
NOPARALLEL;

I can see the following in TOKEN_TEXT field of DR$IDX_PRODUCT_DESCR$I:
- all wildcards added as printjoins, that is *,:;
- all words appears in uppercase (for example: CLOTHES)
- there are different combination of words with characters from printjoins (for ex. ALMONDS, )

The following (1) works:
select * from DR$IDX_PRODUCT_DESCR$I
where token_text = 'ALMONDS,'

But the following (2) gives again DRG-51030:
SELECT * FROM MyTable
WHERE CONTAINS(short_desc, '%' || 'ALMONDS,' || '%', 1) > 0

I don't understand why (1) works and (2) fails. The sequence "ALMONDS," is in DR$IDX_PRODUCT_DESCR$I

I'll look up the documentation for regexp_instr.



0
 
slightwv (䄆 Netminder) Commented:
The DRG-51030 has nothing to do with the printjoins.  Again, it is the wildcard max terms parameter.  I do not see that in your setup.

Check the docs:

http://download.oracle.com/docs/cd/E11882_01/text.112/e16593/cqoper.htm#CCREF0344

The total number of wildcard expansions from all words in a query containing unescaped wildcard characters cannot exceed the maximum number of expansions specified by the BASIC_WORDLIST attribute WILDCARD_MAXTERMS. For more information, see "BASIC_WORDLIST".


Which gets you to:

http://download.oracle.com/docs/cd/E11882_01/text.112/e16593/cdatadic.htm#i1008935


Now that I'm back on my test database, you are correct, the printjoins are being indexed at the end of a word.  That is actually new to me!

I also owe HainKurt an apology.  You do need to escape 'special' characters.

Below is a complete test case.
begin
 ctx_ddl.drop_preference('TEST_LEX');
end;
/

begin
	ctx_ddl.create_preference('TEST_LEX', 'BASIC_LEXER');
	ctx_ddl.set_attribute('TEST_LEX', 'printjoins', '*,:;');
	ctx_ddl.set_attribute('TEST_LEX', 'index_text', 'YES'); 
end;
/

begin
 ctx_ddl.drop_preference('TEST_PREF');
end;
/

begin
	ctx_ddl.create_preference('TEST_PREF', 'BASIC_WORDLIST');
	ctx_ddl.set_attribute('TEST_PREF','SUBSTRING_INDEX','TRUE');
	ctx_ddl.set_attribute('TEST_PREF','WILDCARD_MAXTERMS','50000');
end;
/


drop table tab1 purge;
create table tab1 (col1 varchar2(20));

insert into tab1 values('shoes');
insert into tab1 values('shoes,');
insert into tab1 values('shoes*');
commit

create index tab1_idx on tab1(col1) indextype is ctxsys.context
	parameters('Wordlist TEST_PREF lexer TEST_LEX sync(on commit)');

select token_text from dr$tab1_idx$i;


select * from tab1 where contains(col1,'%shoe%\,') > 0;
select * from tab1 where contains(col1,'%shoe%\*') > 0;

Open in new window

0
 
kathysmithAuthor Commented:
@slightwv:

I can't thank you enough! First of all, I read the relevant part of the documentation. Then, I've run the code from you. It worked like a charm! Then, I implemented the same on my table. Works here too!
Well, now I have enough knowledge to add other wildcards/characters to the printjoins.
I've allso learned how to check the existing lexers and preferences in a schema:
SELECT * FROM ctx_preference_values

Many thanks :-)
0
 
slightwv (䄆 Netminder) Commented:
Very glad to help.  I remember how hard it was when I was starting off using Oracle Text.

The documentation isn't very user friendly.  I learned most of it from trial and error!
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.