Solved

using oracle indexes

Posted on 2014-09-16
10
285 Views
Last Modified: 2014-09-22
hi ,

I have an  comment  and user_details table . The comment table contains the comments entered by the user and user_details contains the users first name and last name of the user who created the comment.

Comments
Comment_id  comment_desc comment_title   created_by
1                         sam                        sam_com1       102
2                         jam                         jam_com1        103
3                         sam_com3            sam_com2         102


user_details
user_id   first_name last_name
1                sam              daniel
2                jam               gopal


I have defined the function based index on comment_desc --> comments_desc_idx and comment_title --> comments_title_idx .  I also have an index on last_name user_details_ln_idx in  user_details table.

I have the below query to fetch the comments matching the test sam or first_name or last_name matching sam. The issue iam facing is , the comments table has almost 2345678 k rows. The user_details table also has rows upto 120049K rows

Select c.comment_id , c.comment_desc , c.comment_title , ut.first_name , ut.last_name
from comments c left outer join user_details ut
on (ut.user_id=c.created_by)
where
(c.comment_desc like '%sam%'
or c.comment_title like '%sam%'
or ut.first_name like '%sam%'
or ut.last_name like '%sam%'
or ut.first_name ||ut.last_name like '%sam%'
or ut.last_name||ut.first_name like '%sam%')

The query is taking an long time , as the indexes on the columns defined in comments table is not used. I cannot define anymore indexes on the user_details table as it is another user schema. We have an synonym for user_details table in out schema. Is there way to request oracle to use the indexes for the columns comment_Desc and comment_title.
0
Comment
Question by:sam_2012
  • 3
  • 3
  • 2
  • +2
10 Comments
 

Author Comment

by:sam_2012
Comment Utility
Iam also fine with any other alternative approach to address my problem. I have heard of creating some oracle user defined functions for creating index , but not sure how i can implement the same .
0
 
LVL 73

Expert Comment

by:sdstuber
Comment Utility
you can't index into a mid-string substring.

Think about it like an index in a book.  Look up all words in the index that have a "d" in the middle.

You have to read every one start to finish, right?
Same with the database index.  With a mid-string, it has no reference so it has to scan everything.
0
 
LVL 73

Expert Comment

by:sdstuber
Comment Utility
you "might" be able to use a function based index if you're searching for strings in specific locations.

 substr(your_column,4,6)  would be a viable mid-string index.

but not wildcarding to any location.


you could try Oracle Text indexes, they aren't like normal indexes but they "might" be able to help you do mid-string searches with the CONTAINS operator.
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
>>"you could try Oracle Text indexes"

Oracle Text Application Developer's Guide

chapters 3, 4 and 7 in particular.

---
or ut.first_name like '%sam%'
or ut.last_name like '%sam%'
or ut.first_name ||ut.last_name like '%sam%'
or ut.last_name||ut.first_name like '%sam%')


just a small observation, if you have searched first name and last name, then there is no point is searching the concatenation of first and last names unless you are looking for terms formed by joining the 2 names e.g.

Anders Amundessen neither have %sam%
andersamundessen does have %sam%

AND
note the effect of case sensitivity in that small sample

By default, all text tokens are converted to uppercase and then indexed. This results in case-insensitive queries.
3.2.7 Case-Sensitive Indexing and Querying
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
Comment Utility
Specifically with Oracle Text you want a MULTI_COLUMN_DATASTORE.

This lets you create one index across ALL of your columns.

http://docs.oracle.com/cd/E11882_01/text.112/e24436/cdatadic.htm#i1006391

Add a column like idx_col char(1) to use as the indexed column.  Then add all the additional columns you want contained in the index.

In your case you will need one index on both tables:
comment_desc
comment_title

and:
first_name
last_name


Then the query would be:
Select c.comment_id , c.comment_desc , c.comment_title , ut.first_name , ut.last_name
 from comments c left outer join user_details ut
 on (ut.user_id=c.created_by)
 where contains(c.idx_col,'%sam%')>0 or contains(ut.idx_col,'%sam%')>0
0
PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

 
LVL 76

Accepted Solution

by:
slightwv (䄆 Netminder) earned 500 total points
Comment Utility
I had some time.

Here is the complete test case.

Note: I added 'my' to the table names just in case you run it as-is and accidentally drop your tables.

--set up what you currently posted
drop table mycomments purge;
create table mycomments(comment_id number, comment_desc varchar2(20), comment_title varchar2(20), created_by number);

insert into mycomments values(1,'sam','sam_com1',102);
insert into mycomments values(2,'jam','jam_com1',103);
insert into mycomments values(3,'sam_com3','sam_com2',102);
commit;


drop table myuser_details purge;
create table myuser_details(user_id number, first_name varchar2(20), last_name varchar2(20));

insert into myuser_details values(1,'sam','daniel');
insert into myuser_details values(2,'jam','gopal');
commit;

--The new parts for Oracle Text

--add the column for the multi column index
alter table mycomments add idx_col char(1);
alter table myuser_details add idx_col char(1);

exec ctx_ddl.drop_preference('MY_COMMENT_PREF');

begin
	ctx_ddl.create_preference('MY_COMMENT_PREF', 'MULTI_COLUMN_DATASTORE');
	ctx_ddl.set_attribute('MY_COMMENT_PREF','COLUMNS','comment_desc, comment_title');
end;
/

exec ctx_ddl.drop_preference('MY_DETAIL_PREF');

begin
	ctx_ddl.create_preference('MY_DETAIL_PREF', 'MULTI_COLUMN_DATASTORE');
	ctx_ddl.set_attribute('MY_DETAIL_PREF','COLUMNS','first_name, last_name');
end;
/


create index mycomments_text_idx on mycomments(idx_col) indextype is ctxsys.context
	parameters('Datastore MY_COMMENT_PREF sync(on commit)');

create index myuserdetails_text_idx on myuser_details(idx_col) indextype is ctxsys.context
	parameters('Datastore MY_DETAIL_PREF sync(on commit)');

Select c.comment_id , c.comment_desc , c.comment_title , ut.first_name , ut.last_name
from mycomments c left outer join myuser_details ut
on (ut.user_id=c.created_by)
where contains(c.idx_col,'%sam%')>0 or contains(ut.idx_col,'%sam%')>0 
/

Open in new window


Something to be aware of:
Text indexes are not like regular indexes.  They need more care and feeding than normal BTree type indexes.

Notice I added 'sync(on commit)' to the index clause.  This causes new rows to be indexed as they are committed.  This can cause a LOT of fragmentation within the Text index tables.

If you have a LOT of data added over time, you will want to run a CTX_DDL.OPTIMIZE_INDEX with either a FAST or FULL depending on your specific requirements to maintain the performance of the CONTAINS queries.
0
 
LVL 34

Expert Comment

by:Mark Geerlings
Comment Utility
If you want good performance from Oracle queries:
1. Do not use "outer joins"
2. Do not use leading wildcard characters on your bind variables
3. Do not apply operators (like: upper, lower, to_char, to_date, nvl, etc.) to the database columns that you reference in "where" clauses (unless you have function-based indexes that exactly match this query syntax for those columns).
0
 

Author Comment

by:sam_2012
Comment Utility
Hi Slightwv,

I need clarification on the two thing u have mentioned with regard to oracle text indexes

1. Notice I added 'sync(on commit)' to the index clause.  This causes new rows to be indexed as they are committed.  This can cause a LOT of fragmentation within the Text index tables.

I did not understand the term fragmentation term here. Can u explain it.

2. Should , I run automated job to update an oracle text index.

3. How efficient is it from an normal index.
0
 
LVL 76

Expert Comment

by:slightwv (䄆 Netminder)
Comment Utility
1:  Once a 'token' (word) has been indexed and the documents it is in has been added to the Text Index tables, new rows that contain that term generate new index entries.  The old ones are not updated.

For example:
given the table:
doc_id, string
and the row:
1,'Hello World'

It has two tokens so the generated index would look like this:
Hello exists in doc_id 1
World exists in doc_id 1

Now you add a new row:
2, 'Goodbye World'

The index now has:
Hello exists in doc_id 1
World exists in doc_id 1
Goodbye exists in doc_id 2
World exists in doc_id 2

As you can see, 'World' now has two entries.  Once you OPTIMIZE the index, it will combine the two World entries so it now looks like:
Hello exists in doc_id 1
World exists in doc_id 1 and 2
Goodbye exists in doc_id 2

Expand that by a few thousand rows and you can see how fragmented it can become.



2:  If by update you mean optimize then yes.  I have one that runs nightly during the off hours and uses the max time parameter so it doesn't run into business hours.  The optimize is smart enough to know where it left off and will pick back up during the next run.


3:  As far as efficiency, it all depends on your specific requirements.  If you want left sided wildcards and don't want a full table scan on a large table, it is pretty much your only choice.

Look at it this way:
I have ONE index on TWO columns as opposed to TWO normal indexes.


If all you have is smaller tables then it might not be worth the overhead.

You'll need to set up some test cases on your system and test different scenarios.


That said, Text has some additional benefits:
It has fuzzy and stemming search capability (The docs talk about what this is):
http://docs.oracle.com/cd/E11882_01/text.112/e24435/ind.htm#CIHCCCIF


You also have the ability to set up synonyms, broader and narrower terms:
http://docs.oracle.com/cd/E11882_01/text.112/e24435/cthes.htm#i1006324

So, you can get as 'fancy' as you want way above normal text matching.
0
 

Author Closing Comment

by:sam_2012
Comment Utility
Awesome explanation.
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Join & Write a Comment

Article by: Swadhin
From the Oracle SQL Reference (http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/queries006.htm) we are told that a join is a query that combines rows from two or more tables, views, or materialized views. This article provides a glimps…
Working with Network Access Control Lists in Oracle 11g (part 2) Part 1: http://www.e-e.com/A_8429.html Previously, I introduced the basics of network ACL's including how to create, delete and modify entries to allow and deny access.  For many…
This video explains at a high level with the mandatory Oracle Memory processes are as well as touching on some of the more common optional ones.
Via a live example show how to connect to RMAN, make basic configuration settings changes and then take a backup of a demo database

728 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

9 Experts available now in Live!

Get 1:1 Help Now