using oracle indexes

hi ,

I have an  comment  and user_details table . The comment table contains the comments entered by the user and user_details contains the users first name and last name of the user who created the comment.

Comments
Comment_id  comment_desc comment_title   created_by
1                         sam                        sam_com1       102
2                         jam                         jam_com1        103
3                         sam_com3            sam_com2         102


user_details
user_id   first_name last_name
1                sam              daniel
2                jam               gopal


I have defined the function based index on comment_desc --> comments_desc_idx and comment_title --> comments_title_idx .  I also have an index on last_name user_details_ln_idx in  user_details table.

I have the below query to fetch the comments matching the test sam or first_name or last_name matching sam. The issue iam facing is , the comments table has almost 2345678 k rows. The user_details table also has rows upto 120049K rows

Select c.comment_id , c.comment_desc , c.comment_title , ut.first_name , ut.last_name
from comments c left outer join user_details ut
on (ut.user_id=c.created_by)
where
(c.comment_desc like '%sam%'
or c.comment_title like '%sam%'
or ut.first_name like '%sam%'
or ut.last_name like '%sam%'
or ut.first_name ||ut.last_name like '%sam%'
or ut.last_name||ut.first_name like '%sam%')

The query is taking an long time , as the indexes on the columns defined in comments table is not used. I cannot define anymore indexes on the user_details table as it is another user schema. We have an synonym for user_details table in out schema. Is there way to request oracle to use the indexes for the columns comment_Desc and comment_title.
sam_2012Asked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

sam_2012Author Commented:
Iam also fine with any other alternative approach to address my problem. I have heard of creating some oracle user defined functions for creating index , but not sure how i can implement the same .
0
sdstuberCommented:
you can't index into a mid-string substring.

Think about it like an index in a book.  Look up all words in the index that have a "d" in the middle.

You have to read every one start to finish, right?
Same with the database index.  With a mid-string, it has no reference so it has to scan everything.
0
sdstuberCommented:
you "might" be able to use a function based index if you're searching for strings in specific locations.

 substr(your_column,4,6)  would be a viable mid-string index.

but not wildcarding to any location.


you could try Oracle Text indexes, they aren't like normal indexes but they "might" be able to help you do mid-string searches with the CONTAINS operator.
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

PortletPaulfreelancerCommented:
>>"you could try Oracle Text indexes"

Oracle Text Application Developer's Guide

chapters 3, 4 and 7 in particular.

---
or ut.first_name like '%sam%'
or ut.last_name like '%sam%'
or ut.first_name ||ut.last_name like '%sam%'
or ut.last_name||ut.first_name like '%sam%')


just a small observation, if you have searched first name and last name, then there is no point is searching the concatenation of first and last names unless you are looking for terms formed by joining the 2 names e.g.

Anders Amundessen neither have %sam%
andersamundessen does have %sam%

AND
note the effect of case sensitivity in that small sample

By default, all text tokens are converted to uppercase and then indexed. This results in case-insensitive queries.
3.2.7 Case-Sensitive Indexing and Querying
0
slightwv (䄆 Netminder) Commented:
Specifically with Oracle Text you want a MULTI_COLUMN_DATASTORE.

This lets you create one index across ALL of your columns.

http://docs.oracle.com/cd/E11882_01/text.112/e24436/cdatadic.htm#i1006391

Add a column like idx_col char(1) to use as the indexed column.  Then add all the additional columns you want contained in the index.

In your case you will need one index on both tables:
comment_desc
comment_title

and:
first_name
last_name


Then the query would be:
Select c.comment_id , c.comment_desc , c.comment_title , ut.first_name , ut.last_name
 from comments c left outer join user_details ut
 on (ut.user_id=c.created_by)
 where contains(c.idx_col,'%sam%')>0 or contains(ut.idx_col,'%sam%')>0
0
slightwv (䄆 Netminder) Commented:
I had some time.

Here is the complete test case.

Note: I added 'my' to the table names just in case you run it as-is and accidentally drop your tables.

--set up what you currently posted
drop table mycomments purge;
create table mycomments(comment_id number, comment_desc varchar2(20), comment_title varchar2(20), created_by number);

insert into mycomments values(1,'sam','sam_com1',102);
insert into mycomments values(2,'jam','jam_com1',103);
insert into mycomments values(3,'sam_com3','sam_com2',102);
commit;


drop table myuser_details purge;
create table myuser_details(user_id number, first_name varchar2(20), last_name varchar2(20));

insert into myuser_details values(1,'sam','daniel');
insert into myuser_details values(2,'jam','gopal');
commit;

--The new parts for Oracle Text

--add the column for the multi column index
alter table mycomments add idx_col char(1);
alter table myuser_details add idx_col char(1);

exec ctx_ddl.drop_preference('MY_COMMENT_PREF');

begin
	ctx_ddl.create_preference('MY_COMMENT_PREF', 'MULTI_COLUMN_DATASTORE');
	ctx_ddl.set_attribute('MY_COMMENT_PREF','COLUMNS','comment_desc, comment_title');
end;
/

exec ctx_ddl.drop_preference('MY_DETAIL_PREF');

begin
	ctx_ddl.create_preference('MY_DETAIL_PREF', 'MULTI_COLUMN_DATASTORE');
	ctx_ddl.set_attribute('MY_DETAIL_PREF','COLUMNS','first_name, last_name');
end;
/


create index mycomments_text_idx on mycomments(idx_col) indextype is ctxsys.context
	parameters('Datastore MY_COMMENT_PREF sync(on commit)');

create index myuserdetails_text_idx on myuser_details(idx_col) indextype is ctxsys.context
	parameters('Datastore MY_DETAIL_PREF sync(on commit)');

Select c.comment_id , c.comment_desc , c.comment_title , ut.first_name , ut.last_name
from mycomments c left outer join myuser_details ut
on (ut.user_id=c.created_by)
where contains(c.idx_col,'%sam%')>0 or contains(ut.idx_col,'%sam%')>0 
/

Open in new window


Something to be aware of:
Text indexes are not like regular indexes.  They need more care and feeding than normal BTree type indexes.

Notice I added 'sync(on commit)' to the index clause.  This causes new rows to be indexed as they are committed.  This can cause a LOT of fragmentation within the Text index tables.

If you have a LOT of data added over time, you will want to run a CTX_DDL.OPTIMIZE_INDEX with either a FAST or FULL depending on your specific requirements to maintain the performance of the CONTAINS queries.
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Mark GeerlingsDatabase AdministratorCommented:
If you want good performance from Oracle queries:
1. Do not use "outer joins"
2. Do not use leading wildcard characters on your bind variables
3. Do not apply operators (like: upper, lower, to_char, to_date, nvl, etc.) to the database columns that you reference in "where" clauses (unless you have function-based indexes that exactly match this query syntax for those columns).
0
sam_2012Author Commented:
Hi Slightwv,

I need clarification on the two thing u have mentioned with regard to oracle text indexes

1. Notice I added 'sync(on commit)' to the index clause.  This causes new rows to be indexed as they are committed.  This can cause a LOT of fragmentation within the Text index tables.

I did not understand the term fragmentation term here. Can u explain it.

2. Should , I run automated job to update an oracle text index.

3. How efficient is it from an normal index.
0
slightwv (䄆 Netminder) Commented:
1:  Once a 'token' (word) has been indexed and the documents it is in has been added to the Text Index tables, new rows that contain that term generate new index entries.  The old ones are not updated.

For example:
given the table:
doc_id, string
and the row:
1,'Hello World'

It has two tokens so the generated index would look like this:
Hello exists in doc_id 1
World exists in doc_id 1

Now you add a new row:
2, 'Goodbye World'

The index now has:
Hello exists in doc_id 1
World exists in doc_id 1
Goodbye exists in doc_id 2
World exists in doc_id 2

As you can see, 'World' now has two entries.  Once you OPTIMIZE the index, it will combine the two World entries so it now looks like:
Hello exists in doc_id 1
World exists in doc_id 1 and 2
Goodbye exists in doc_id 2

Expand that by a few thousand rows and you can see how fragmented it can become.



2:  If by update you mean optimize then yes.  I have one that runs nightly during the off hours and uses the max time parameter so it doesn't run into business hours.  The optimize is smart enough to know where it left off and will pick back up during the next run.


3:  As far as efficiency, it all depends on your specific requirements.  If you want left sided wildcards and don't want a full table scan on a large table, it is pretty much your only choice.

Look at it this way:
I have ONE index on TWO columns as opposed to TWO normal indexes.


If all you have is smaller tables then it might not be worth the overhead.

You'll need to set up some test cases on your system and test different scenarios.


That said, Text has some additional benefits:
It has fuzzy and stemming search capability (The docs talk about what this is):
http://docs.oracle.com/cd/E11882_01/text.112/e24435/ind.htm#CIHCCCIF


You also have the ability to set up synonyms, broader and narrower terms:
http://docs.oracle.com/cd/E11882_01/text.112/e24435/cthes.htm#i1006324

So, you can get as 'fancy' as you want way above normal text matching.
0
sam_2012Author Commented:
Awesome explanation.
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Oracle Database

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.