

using oracle indexes

Posted on 2014-09-16
Medium Priority
Last Modified: 2014-09-22
hi,

I have a comments table and a user_details table. The comments table contains the comments entered by users, and user_details contains the first name and last name of the user who created each comment.

comment_id  comment_desc  comment_title  created_by
1           sam           sam_com1       102
2           jam           jam_com1       103
3           sam_com3      sam_com2       102

user_id  first_name  last_name
1        sam         daniel
2        jam         gopal

I have defined function-based indexes on comment_desc (comments_desc_idx) and comment_title (comments_title_idx). I also have an index on last_name (user_details_ln_idx) in the user_details table.

I have the query below to fetch the comments matching the text 'sam', or whose creator's first_name or last_name matches 'sam'. The issue I am facing is that the comments table has almost 2345678 k rows, and the user_details table has up to 120049 k rows.

Select c.comment_id, c.comment_desc, c.comment_title, ut.first_name, ut.last_name
from comments c left outer join user_details ut
  on (ut.user_id = c.created_by)
where (c.comment_desc like '%sam%'
   or c.comment_title like '%sam%'
   or ut.first_name like '%sam%'
   or ut.last_name like '%sam%'
   or ut.first_name || ut.last_name like '%sam%'
   or ut.last_name || ut.first_name like '%sam%')

The query is taking a long time, as the indexes on the columns defined in the comments table are not used. I cannot define any more indexes on the user_details table, as it is in another user's schema; we have a synonym for the user_details table in our schema. Is there a way to request that Oracle use the indexes on the columns comment_desc and comment_title?
Question by:sam_2012

Author Comment

ID: 40327068
I am also fine with any other alternative approach to address my problem. I have heard of creating Oracle user-defined functions for indexing, but I am not sure how to implement that.

Expert Comment

ID: 40327150
you can't index into a mid-string substring.

Think about it like an index in a book.  Look up all words in the index that have a "d" in the middle.

You have to read every one start to finish, right?
Same with the database index.  With a mid-string, it has no reference so it has to scan everything.
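To make this concrete, here is a minimal sketch (assuming a plain B-tree index on comment_desc) of which LIKE patterns can use an index:

```sql
-- A prefix pattern gives the index a starting point (index range scan possible):
select comment_id from comments where comment_desc like 'sam%';

-- A leading wildcard gives it none, so Oracle must read every row (full scan):
select comment_id from comments where comment_desc like '%sam%';
```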

Expert Comment

ID: 40327154
you "might" be able to use a function based index if you're searching for strings in specific locations.

 substr(your_column,4,6)  would be a viable mid-string index.

but not wildcarding to any location.
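A sketch of what such a function-based index might look like (the positions and index name are illustrative):

```sql
-- Index only characters 4 through 9 of comment_desc:
create index comments_desc_sub_idx on comments (substr(comment_desc, 4, 6));

-- This predicate matches the indexed expression exactly, so the index can be used:
select comment_id from comments where substr(comment_desc, 4, 6) = 'com3';
```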

you could try Oracle Text indexes, they aren't like normal indexes but they "might" be able to help you do mid-string searches with the CONTAINS operator.



Expert Comment

ID: 40327194
>>"you could try Oracle Text indexes"

Oracle Text Application Developer's Guide

chapters 3, 4 and 7 in particular.

or ut.first_name like '%sam%'
or ut.last_name like '%sam%'
or ut.first_name ||ut.last_name like '%sam%'
or ut.last_name||ut.first_name like '%sam%')

Just a small observation: if you have already searched first_name and last_name separately, there is no point in searching the concatenation of the two names unless you are looking for terms formed by joining them, e.g.

'Anders Amundessen': neither name on its own contains 'sam'
'andersamundessen' does contain 'sam'

Note the effect of case sensitivity in that small sample. From section 3.2.7, Case-Sensitive Indexing and Querying:

"By default, all text tokens are converted to uppercase and then indexed. This results in case-insensitive queries."
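Conversely, if case-sensitive matching were ever required, a BASIC_LEXER preference could be created with the MIXED_CASE attribute (a sketch; the preference name is illustrative):

```sql
begin
  ctx_ddl.create_preference('MY_CASE_LEXER', 'BASIC_LEXER');
  ctx_ddl.set_attribute('MY_CASE_LEXER', 'MIXED_CASE', 'YES');
end;
/
-- ...then reference it in the index: parameters('lexer MY_CASE_LEXER')
```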

Expert Comment

by: slightwv (Netminder)
ID: 40327820
Specifically with Oracle Text you want a MULTI_COLUMN_DATASTORE.

This lets you create one index across ALL of your columns.


Add a column like idx_col char(1) to use as the indexed column.  Then add all the additional columns you want contained in the index.

In your case you will need one Text index on each table; the full script is in my next post.
Then the query would be:
Select c.comment_id , c.comment_desc , c.comment_title , ut.first_name , ut.last_name
 from comments c left outer join user_details ut
 on (ut.user_id=c.created_by)
 where contains(c.idx_col,'%sam%')>0 or contains(ut.idx_col,'%sam%')>0

Accepted Solution

slightwv (Netminder) earned 2000 total points
ID: 40328756
I had some time.

Here is the complete test case.

Note: I added 'my' to the table names just in case you run it as-is and accidentally drop your tables.

--set up what you currently posted
drop table mycomments purge;
create table mycomments(comment_id number, comment_desc varchar2(20), comment_title varchar2(20), created_by number);

insert into mycomments values(1,'sam','sam_com1',102);
insert into mycomments values(2,'jam','jam_com1',103);
insert into mycomments values(3,'sam_com3','sam_com2',102);

drop table myuser_details purge;
create table myuser_details(user_id number, first_name varchar2(20), last_name varchar2(20));

insert into myuser_details values(1,'sam','daniel');
insert into myuser_details values(2,'jam','gopal');

--The new parts for Oracle Text

--add the column for the multi column index
alter table mycomments add idx_col char(1);
alter table myuser_details add idx_col char(1);

exec ctx_ddl.drop_preference('MY_COMMENT_PREF');

begin
	ctx_ddl.create_preference('MY_COMMENT_PREF', 'MULTI_COLUMN_DATASTORE');
	ctx_ddl.set_attribute('MY_COMMENT_PREF','COLUMNS','comment_desc, comment_title');
end;
/

exec ctx_ddl.drop_preference('MY_DETAIL_PREF');

begin
	ctx_ddl.create_preference('MY_DETAIL_PREF', 'MULTI_COLUMN_DATASTORE');
	ctx_ddl.set_attribute('MY_DETAIL_PREF','COLUMNS','first_name, last_name');
end;
/

create index mycomments_text_idx on mycomments(idx_col) indextype is ctxsys.context
	parameters('Datastore MY_COMMENT_PREF sync(on commit)');

create index myuserdetails_text_idx on myuser_details(idx_col) indextype is ctxsys.context
	parameters('Datastore MY_DETAIL_PREF sync(on commit)');

Select c.comment_id , c.comment_desc , c.comment_title , ut.first_name , ut.last_name
from mycomments c left outer join myuser_details ut
on (ut.user_id=c.created_by)
where contains(c.idx_col,'%sam%')>0 or contains(ut.idx_col,'%sam%')>0 


Something to be aware of:
Text indexes are not like regular indexes.  They need more care and feeding than normal BTree type indexes.

Notice I added 'sync(on commit)' to the index clause.  This causes new rows to be indexed as they are committed.  This can cause a LOT of fragmentation within the Text index tables.

If you have a LOT of data added over time, you will want to run a CTX_DDL.OPTIMIZE_INDEX with either a FAST or FULL depending on your specific requirements to maintain the performance of the CONTAINS queries.

Expert Comment

by:Mark Geerlings
ID: 40329145
If you want good performance from Oracle queries:
1. Do not use "outer joins"
2. Do not use leading wildcard characters on your bind variables
3. Do not apply operators (like: upper, lower, to_char, to_date, nvl, etc.) to the database columns that you reference in "where" clauses (unless you have function-based indexes that exactly match this query syntax for those columns).
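Point 3 can be sketched as follows (the upper() index is illustrative and would need to be created in the schema that owns the table):

```sql
-- upper() on the column defeats the plain index user_details_ln_idx:
select user_id from user_details where upper(last_name) = 'DANIEL';

-- A function-based index matching the exact expression restores index access:
create index user_details_ln_up_idx on user_details (upper(last_name));
```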

Author Comment

ID: 40335942
Hi slightwv,

I need clarification on a few things you mentioned with regard to Oracle Text indexes:

1. "Notice I added 'sync(on commit)' to the index clause. This causes new rows to be indexed as they are committed. This can cause a LOT of fragmentation within the Text index tables."

I did not understand the term "fragmentation" here. Can you explain it?

2. Should I run an automated job to update an Oracle Text index?

3. How efficient is it compared to a normal index?

Expert Comment

by: slightwv (Netminder)
ID: 40336400
1:  Once a 'token' (word) has been indexed and the documents it appears in have been added to the Text index tables, new rows that contain that term generate new index entries; the old entries are not updated.

For example:
given the table:
doc_id, string
and the row:
1,'Hello World'

It has two tokens so the generated index would look like this:
Hello exists in doc_id 1
World exists in doc_id 1

Now you add a new row:
2, 'Goodbye World'

The index now has:
Hello exists in doc_id 1
World exists in doc_id 1
Goodbye exists in doc_id 2
World exists in doc_id 2

As you can see, 'World' now has two entries.  Once you OPTIMIZE the index, it will combine the two World entries so it now looks like:
Hello exists in doc_id 1
World exists in doc_id 1 and 2
Goodbye exists in doc_id 2

Expand that by a few thousand rows and you can see how fragmented it can become.

2:  If by update you mean optimize then yes.  I have one that runs nightly during the off hours and uses the max time parameter so it doesn't run into business hours.  The optimize is smart enough to know where it left off and will pick back up during the next run.
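Such a nightly job might call OPTIMIZE_INDEX along these lines (the index name and time limit are illustrative):

```sql
begin
  -- FULL optimization; stop cleanly after at most 120 minutes:
  ctx_ddl.optimize_index('MYCOMMENTS_TEXT_IDX', ctx_ddl.optlevel_full, maxtime => 120);
end;
/
```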

3:  As far as efficiency, it all depends on your specific requirements.  If you want left sided wildcards and don't want a full table scan on a large table, it is pretty much your only choice.

Look at it this way:
I have ONE index on TWO columns as opposed to TWO normal indexes.

If all you have is smaller tables then it might not be worth the overhead.

You'll need to set up some test cases on your system and test different scenarios.

That said, Text has some additional benefits.

It has fuzzy and stemming search capability (the docs describe what these are).

You also have the ability to set up synonyms, broader and narrower terms.

So, you can get as 'fancy' as you want way above normal text matching.
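For instance, CONTAINS accepts fuzzy and stem operators against the idx_col column from the test case above (the query terms here are illustrative):

```sql
-- fuzzy(): match words spelled similarly to the term
select comment_id from mycomments where contains(idx_col, 'fuzzy(samm)') > 0;

-- $: match words sharing the linguistic stem of the term
select comment_id from mycomments where contains(idx_col, '$sam') > 0;
```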

Author Closing Comment

ID: 40337392
Awesome explanation.
