Solved

Oracle indexes and selects with special characters give ORA-29902

Posted on 2011-03-18
25
3,092 Views
Last Modified: 2013-12-07
Hi,

I need help to build the correct index for a table.

Simplified scenario

- we have MySchema.MyTable
- MyTable has an index on the field PRODUCT_DESCR, which is a VARCHAR2(2000) field. The index is:

CREATE INDEX MYSCHEMA.IDX_PRODUCT_DESCR ON MYSCHEMA.MYTABLE
(PRODUCT_DESCR)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('SYNC(ON COMMIT)')
NOPARALLEL;

- the following select:

SELECT * FROM MyTable WHERE
CONTAINS( PRODUCT_DESCR, '%shoes*%', 1) > 0

leads to this error:

ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-50901: text query parser syntax error on line 1, column 6  

- the following select:

select * from MyTable
where contains( short_desc, '%sko,%', 1)>0

leads to this error:

ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-51030: wildcard query expansion resulted in too many terms
ORA-06512

Question
How should I modify IDX_PRODUCT_DESCR so that the selects above work correctly?
A stupid question maybe, but I am newbie in PL/SQL, and index seems difficult to me.

Thank you in advance.
0
Comment
Question by:kathysmith
  • 13
  • 9
  • 3
25 Comments
 
LVL 51

Expert Comment

by:HainKurt
ID: 35167000
try

SELECT * FROM MyTable WHERE
CONTAINS( PRODUCT_DESCR, 'shoes', 1) > 0

0
 
LVL 51

Expert Comment

by:HainKurt
ID: 35167034
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35167450
For the DRG-51030 there is a wildcard max terms parameter you can set.

I'm on mobile right now and can't get the link for the docs but the parameter is in the Text doc.
0
 

Author Comment

by:kathysmith
ID: 35169130
@HainKurt:
What I want to do is to get all products that contains "shoes,". That is the word "shoes" followed by a comma.

And I tried to achieve this with the following:

select * from MyTable
where contains( short_desc, '%sko,%', 1)>0

But you propose:

SELECT * FROM MyTable WHERE
CONTAINS( PRODUCT_DESCR, 'shoes', 1) > 0

And yes, your query works, but it is not what I need. Or?
0
 

Author Comment

by:kathysmith
ID: 35169151
Actually I need the following type of queries:

1) get all rows that starts with "shoes,"
2) get all rows that contains "shoes,"
3) get all rows that ends with "shoes,"

Insted of that comma, I could have * _ or other characters that Oracle don't like.
I guess I have to change the index and, maybe the SELECT. But how exactly?
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35169392
Oracle Text does not 'index' a comma.  It indexes 'words'.  You can tell Text to have certain characters keep 'words' together.  These are PRINTJOINS.

The Text docs talks about them.
0
 
LVL 51

Expert Comment

by:HainKurt
ID: 35169950
did you try to escape liek this?

'shoes\,'
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35169982
Escaping doesn't matter. Text didn't index the comma.

You can see what was 'indexed' looking at TOKEN_TEXT in the DR$index_name$I view.
0
 

Author Comment

by:kathysmith
ID: 35170937
@HainKurt:
Yes, Ive tried to escape it. To be more exactly, what I've tried are:
'shoes\,'
'shoes\\,'
'shoes\*'
After all these, I've got:
DRG-51030: wildcard query expansion resulted in too many terms

So, as @slightwv says, escaping doesn't help.
But I've heard about printjoins, and I'm trying to learn more about this.
0
 

Author Comment

by:kathysmith
ID: 35171103
Guys, remember that I have to escape not only comma, but also other wildcards.
0
 

Author Comment

by:kathysmith
ID: 35171199
@slightwv: your answers lead me to the a very good article about index creation in Oracle:
http://download.oracle.com/docs/cd/E11882_01/text.112/e16594/ind.htm#i1006887
I'm trying to understand text indexes from this article, but it'll take at least 2-3 hours.
I think you're right when you say that I should look at TOKEN_TEXT in the DR$index_name$I view.
I guess I need PARAMETERS ('STOPLIST CTXSYS.EMPTY_STOPLIST') or alike.
0
 

Author Comment

by:kathysmith
ID: 35171253
@slightwv

Here is how what I have in Toad for Oracle:

DROP INDEX MYSCHEMA.IDX_PRODUCT_DESCR;

CREATE INDEX MYSCHEMA.IDX_PRODUCT_DESCR ON MYSCHEMA.MYTABLE
(PRODUCT_DESCR)
INDEXTYPE IS CTXSYS.CONTEXT
NOPARALLEL;

_________

DROP INDEX MYSCHEMA.DR$IDX_PRODUCT_DESCR$X;

CREATE INDEX MYSCHEMA.DR$IDX_PRODUCT_DESCR$X ON MYSCHEMA.DR$IDX_PRODUCT_DESCR$I
(TOKEN_TEXT, TOKEN_TYPE, TOKEN_FIRST, TOKEN_LAST, TOKEN_COUNT)
NOLOGGING
TABLESPACE MYTABLESPACE
PCTFREE    10
INITRANS   2
MAXTRANS   255
STORAGE    (
            INITIAL          64K
            MINEXTENTS       1
            MAXEXTENTS       UNLIMITED
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOPARALLEL
COMPRESS 2;
0
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

 

Author Comment

by:kathysmith
ID: 35171286
@slightwv
I'm making progresses here :-)
The relevant article is:
http://download.oracle.com/docs/cd/E11882_01/text.112/e16593/cdatadic.htm#i1007615
The key issues are: BASIC_LEXER, printjoins, Case-Sensitive Indexing and Querying, indexing themes.

0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35172024
Glad you are in the docs.  Its the best resource for Text.

You shouldn't need a null stoplist unless you want to index/search the words in default stop list.

I also don't think printjoins will work if the word 'ends' with that character.

I'm not near a real computer so cannot confirm that for characters like a comma.

If you have to search for words ending in a comma (and other characters) you might need regexp_instr.  Some of the Oracle regexp commands can also do case insensitive.
0
 

Author Comment

by:kathysmith
ID: 35175842
No luck here... This is what I've done:

1. Created an lexer and printjoins:

begin
ctx_ddl.create_preference('mylex', 'BASIC_LEXER');
ctx_ddl.set_attribute('mylex', 'printjoins', '*,:;');
end;

One can read about this here:
http://download.oracle.com/docs/cd/E11882_01/text.112/e16593/cdatadic.htm#i1007615

2. Then, I droped my index:

DROP INDEX MYSCHEMA.IDX_PRODUCT_DESCR  FORCE

3. And recreated it:

CREATE INDEX MYSCHEMA.IDX_PRODUCT_DESCR ON MYSCHEMA.MYTABLE
(PRODUCT_DESCR)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('SYNC(ON COMMIT) LEXER mylexer')
NOPARALLEL;

4. Executed again this:

SELECT * FROM MyTable WHERE
CONTAINS( PRODUCT_DESCR, '%shoes*%', 1) > 0

5. Result: I'm getting the same error:

ORA-29902: error in executing ODCIIndexStart() routine
ORA-20000: Oracle Text error:
DRG-51030: wildcard query expansion resulted in too many terms
ORA-06512

What am I doing wrong? It seems so confusing to create an simple index...
0
 
LVL 76

Accepted Solution

by:
slightwv (䄆 Netminder) earned 500 total points
ID: 35175861
The DRG-51030: wildcard query expansion is caused my not setting the wildcard max terms.  I forget the exact parameter but it's in the docs.

I still also think the printjoins will not do what you want when it ends a word.
0
 

Author Comment

by:kathysmith
ID: 35176691
@slightwv: thank you. Really appreciate it. I'll try more tomorrow, and let you know. This is a "battle" that I have to win in the end :-)
0
 

Author Comment

by:kathysmith
ID: 35176797
@slightwv: couldn't wait until tomorrow, and googled after DRG-51030 and wildcard_maxterms. Found this:
http://forums.oracle.com/forums/thread.jspa?threadID=313297
Very good explanation of what wildcard_maxterms means and sugestions on how to solve problems related to it.
Thanx again. You pointed me in the right direction.
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35178021
If you need to look for 'words' ending in a character, I'm still not convinced Text will work for you.
0
 

Author Comment

by:kathysmith
ID: 35179037
@slightwv: what kind of index will then be appropiate? Send me some key words so that I can search for them in the Oracle documentation. It is very much to read there, like a jungle, and I'm a beginner... Thanx :-)
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35181311
I understand the jungle comment!  I've been there!

The good news is I should finally be back on my development box tomorrow and can provide working examples.

Set up a simple table, insert the strings you need to find, create the index with the printjoins.

Look in the DR$...$I table and see if the 'words' you need are in there.

If not, you cannot search for them.

You might need the regexp_instr command I referenced above or a regular instr that forces case insensitive (this one is the last resorrt).
0
 

Author Comment

by:kathysmith
ID: 35189699
I have allready this table, and it allready contains about 55000 rows.
All the queries against this simple table are in a package with stored procedures. The stored procedures are executed from C#.
I'm using Toad for Oracle.

The table:

CREATE TABLE MYSCHEMA.MYTABLE
(
  ID             NUMBER,
  PRODUCT_DESCR  VARCHAR2(2000 BYTE)
)

The basic_lexer:

begin
ctx_ddl.create_preference('mylex', 'BASIC_LEXER');
ctx_ddl.set_attribute('mylex', 'printjoins', '*,:;');
end;

The index:

CREATE INDEX MYSCHEMA.IDX_PRODUCT_DESCR ON MYSCHEMA.MYTABLE
(PRODUCT_DESCR)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('SYNC(ON COMMIT) LEXER mylexer')
NOPARALLEL;

I can see the following in TOKEN_TEXT field of DR$IDX_PRODUCT_DESCR$I:
- all wildcards added as printjoins, that is *,:;
- all words appears in uppercase (for example: CLOTHES)
- there are different combination of words with characters from printjoins (for ex. ALMONDS, )

The following (1) works:
select * from DR$IDX_PRODUCT_DESCR$I
where token_text = 'ALMONDS,'

But the following (2) gives again DRG-51030:
SELECT * FROM MyTable
WHERE CONTAINS(short_desc, '%' || 'ALMONDS,' || '%', 1) > 0

I don't understand why (1) works and (2) fails. The sequence "ALMONDS," is in DR$IDX_PRODUCT_DESCR$I

I'll look up the documentation for regexp_instr.



0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35190033
The DRG-51030 has nothing to do with the printjoins.  Again, it is the wildcard max terms parameter.  I do not see that in your setup.

Check the docs:

http://download.oracle.com/docs/cd/E11882_01/text.112/e16593/cqoper.htm#CCREF0344

The total number of wildcard expansions from all words in a query containing unescaped wildcard characters cannot exceed the maximum number of expansions specified by the BASIC_WORDLIST attribute WILDCARD_MAXTERMS. For more information, see "BASIC_WORDLIST".


Which gets you to:

http://download.oracle.com/docs/cd/E11882_01/text.112/e16593/cdatadic.htm#i1008935


Now that I'm back on my test database, you are correct, the printjoins are being indexed at the end of a word.  That is actually new to me!

I also owe HainKurt an apology.  You do need to escape 'special' characters.

Below is a complete test case.
begin
 ctx_ddl.drop_preference('TEST_LEX');
end;
/

begin
	ctx_ddl.create_preference('TEST_LEX', 'BASIC_LEXER');
	ctx_ddl.set_attribute('TEST_LEX', 'printjoins', '*,:;');
	ctx_ddl.set_attribute('TEST_LEX', 'index_text', 'YES'); 
end;
/

begin
 ctx_ddl.drop_preference('TEST_PREF');
end;
/

begin
	ctx_ddl.create_preference('TEST_PREF', 'BASIC_WORDLIST');
	ctx_ddl.set_attribute('TEST_PREF','SUBSTRING_INDEX','TRUE');
	ctx_ddl.set_attribute('TEST_PREF','WILDCARD_MAXTERMS','50000');
end;
/


drop table tab1 purge;
create table tab1 (col1 varchar2(20));

insert into tab1 values('shoes');
insert into tab1 values('shoes,');
insert into tab1 values('shoes*');
commit

create index tab1_idx on tab1(col1) indextype is ctxsys.context
	parameters('Wordlist TEST_PREF lexer TEST_LEX sync(on commit)');

select token_text from dr$tab1_idx$i;


select * from tab1 where contains(col1,'%shoe%\,') > 0;
select * from tab1 where contains(col1,'%shoe%\*') > 0;

Open in new window

0
 

Author Comment

by:kathysmith
ID: 35198173
@slightwv:

I can't thank you enough! First of all, I read the relevant part of the documentation. Then, I've run the code from you. It worked like a charm! Then, I implemented the same on my table. Works here too!
Well, now I have enough knowledge to add other wildcards/characters to the printjoins.
I've allso learned how to check the existing lexers and preferences in a schema:
SELECT * FROM ctx_preference_values

Many thanks :-)
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
ID: 35198182
Very glad to help.  I remember how hard it was when I was starting off using Oracle Text.

The documentation isn't very user friendly.  I learned most of it from trial and error!
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Join & Write a Comment

Article by: Swadhin
From the Oracle SQL Reference (http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/queries006.htm) we are told that a join is a query that combines rows from two or more tables, views, or materialized views. This article provides a glimps…
Note: this article covers simple compression. Oracle introduced in version 11g release 2 a new feature called Advanced Compression which is not covered here. General principle of Oracle compression Oracle compression is a way of reducing the d…
This video shows setup options and the basic steps and syntax for duplicating (cloning) a database from one instance to another. Examples are given for duplicating to the same machine and to different machines
This video shows information on the Oracle Data Dictionary, starting with the Oracle documentation, explaining the different types of Data Dictionary views available by group and permissions as well as giving examples on how to retrieve data from th…

760 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

22 Experts available now in Live!

Get 1:1 Help Now