Solved

compute statistics advice request

Posted on 2005-05-01
21
Medium Priority
1,527 Views
Last Modified: 2008-02-01
When running a Java-based application that calls a base class to insert records and select others, the select statements sometimes take an awfully long time.  So I added an "ANALYZE TABLE table COMPUTE STATISTICS" statement that is executed before each call to this base class, and it seems to speed things up quite a bit.  The trouble is, I'm fairly new to Oracle, and I want to make sure I understand all of the ramifications of such an approach.  Is my approach "safe"?  Is there a better approach?
Question by:rnicholus
21 Comments
 
LVL 23

Accepted Solution

by:
paquicuba earned 440 total points
ID: 13906450
<< "ANALYZE TABLE table COMPUTE STATISTICS" >> This statement gathers statistics about a table, such as the number of rows, number of blocks, minimum and maximum column values, cardinality (number of distinct values), etc. COMPUTE STATISTICS instructs Oracle to compute exact statistics about the analyzed object and store them in the data dictionary. When you analyze a table, both table and column statistics are collected.
You don't need to analyze a table every time you run a query, much less if you don't take advantage of the statistics stored in the data dictionary about an object. Your approach is going to make the query run slower...
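Once a table has been analyzed, you can inspect the stored statistics in the data dictionary. A minimal sketch (MY_TABLE is a placeholder name):

```sql
-- Gather statistics once, using the ANALYZE form from the question:
ANALYZE TABLE my_table COMPUTE STATISTICS;

-- Table-level statistics stored in the data dictionary:
SELECT num_rows, blocks, avg_row_len, last_analyzed
  FROM user_tables
 WHERE table_name = 'MY_TABLE';

-- Column-level statistics (distinct values, low/high values):
SELECT column_name, num_distinct, low_value, high_value
  FROM user_tab_columns
 WHERE table_name = 'MY_TABLE';
```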

I strongly recommend you to spend some time understanding the optimizer and the Oracle supplied utilities for SQL tuning:
http://www.cs.umb.edu/cs634/ora9idocs/server.920/a96533/optimops.htm

 ********** Oracle Corporation strongly recommends that you use the DBMS_STATS package rather than ANALYZE to collect optimizer statistics. ********


Good luck!
 

Author Comment

by:rnicholus
ID: 13906596
Thank you very much for this info and the link.  I am printing it out as we speak for reading.

One more question...

You said that my approach is going to make my query run slower, but I did see a performance increase.  Why exactly did this happen?  I want to completely understand.
 
LVL 9

Assisted Solution

by:konektor
konektor earned 80 total points
ID: 13907188
Computing statistics before each statement can do more harm than good. Compare how long your command ran before computing statistics with how long it ran afterwards, plus how much time was spent computing the statistics.
It's also not a good idea to compute statistics on tables that are modified heavily, because the computed statistics become obsolete very quickly and you have to recompute them very often. On such tables (lots of inserts and deletes), it's better not to compute statistics, but to use the "RULE" hint and build appropriate indexes.
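For reference, a hint like RULE is just placed in a comment immediately after the SELECT keyword. A sketch with made-up table and column names:

```sql
-- Force the rule-based optimizer for this statement only;
-- it will consider the available indexes without needing statistics.
SELECT /*+ RULE */ col1, col2
  FROM my_table
 WHERE indexed_col = :some_value;
```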
 

Author Comment

by:rnicholus
ID: 13907244
"rule" hint?
 
LVL 5

Expert Comment

by:helpneed
ID: 13907643
hi

You don't need to analyze the table every time you query it; that will be time-consuming when there are lots of rows, and it's not the right way to use ANALYZE.

You can schedule the analysis instead if you want, and for analyzing it's better to use the DBMS_STATS package.

What you can do, if you feel that access to the table is slow, is trace that SQL and find out whether you are getting the proper execution plan.

You can trace the process with dbms_system.set_sql_trace_in_session(sid, serial#, true);

You can get the sid and serial# from v$session, and then use the tkprof utility to format the trace file.

This is one way to trace the process.

Hints are also welcome in SQL.

regards
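The tracing steps above, sketched as a SQL*Plus session (the username, sid, and serial# values are placeholders):

```sql
-- Find the session to trace:
SELECT sid, serial# FROM v$session WHERE username = 'APPUSER';

-- Turn tracing on for that session (e.g. sid=12, serial#=345):
EXEC DBMS_SYSTEM.SET_SQL_TRACE_IN_SESSION(12, 345, TRUE);

-- ...let the application run its statements, then turn tracing off:
EXEC DBMS_SYSTEM.SET_SQL_TRACE_IN_SESSION(12, 345, FALSE);

-- Format the resulting trace file from user_dump_dest at the OS prompt:
--   tkprof ora_12345.trc report.txt
```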





 
LVL 23

Expert Comment

by:paquicuba
ID: 13907922
<< but i did see a performance increase.  Why exactly did this happen? >>
I doubt that you gained performance just by analyzing your table; in your case it would be like placing a comment within your SQL.
Once you have analyzed a table, you can use the statistics stored in the data dictionary to work out an execution plan before running your query.
The execution plan will give you enough detail to decide what changes need to be made to the SQL, what path to follow, what type of join needs to be forced using hints, what index should be created, etc.
SQL tuning and the optimizer is a broad topic that cannot be covered in a forum. That's why I suggested that link.

Cheers!
 
LVL 4

Assisted Solution

by:kripa_odba
kripa_odba earned 80 total points
ID: 13908162
Hi,

All of the answers mentioned here are correct in one way or another. So one thing you can remove from your process is analyzing the base tables each and every time you run the SQL.

It's not good practice to ANALYZE the table every time you run your SQL queries.

You can analyze when the structure of the data in the tables changes enough that the optimal query execution path is different from what it used to be.  The cost-based optimizer will use the statistics from the last time the schema was analyzed, so if the correct path is now different, you need to analyze again for the optimizer to find it.

If things start slowing down, then the first thing to try is an analyze.  If there is a substantive improvement, then it was the right decision.  And when the tables in your database encounter a lot of updates and inserts, you can go for an analyze.

Tracing every time is not always practical, and it also consumes a lot of resources, so try an analyze first.

If you are going to start tuning, then you must analyze first; otherwise, your work will all go away (or become irrelevant) when you analyze.  You don't need to do a full analyze each time; an estimated analyze is enough.

Once you analyze the tables and indexes, the data dictionary will be updated.  The Oracle optimizer will use this information to calculate the access cost for each SQL statement that refers to the particular table, and will choose the access path with the lowest cost.

  You can use two techniques for analyzing tables and indexes: the ANALYZE command and DBMS_STATS.

   e.g.

   ANALYZE TABLE tblname ESTIMATE STATISTICS FOR TABLE FOR ALL INDEXES SAMPLE 50 PERCENT;
   -- or
   ANALYZE TABLE tblname COMPUTE STATISTICS;

   EXEC DBMS_STATS.GATHER_SCHEMA_STATS(ownname => 'SCOTT', method_opt => 'FOR ALL INDEXED COLUMNS', granularity => 'ALL');

   Please do not collect statistics for the SYS and SYSTEM schemas.

   Both have advantages and disadvantages.

   From Oracle 8i onwards you can use the DBMS_STATS package.

  DBMS_STATS has advantages such as:

   - You can import/export/set statistics directly with DBMS_STATS.

   - It is easier to automate with DBMS_STATS (it is procedural; ANALYZE is just a command).

   - DBMS_STATS can analyze external tables; ANALYZE cannot.

   - DBMS_STATS gathers statistics only for cost-based optimization; it does not gather other statistics. For example, the table statistics gathered by DBMS_STATS include the number of rows, number of blocks currently containing data, and average row length, but not the number of chained rows, average free space, or number of unused data blocks.

   - DBMS_STATS (in 9i) can gather system statistics (new).

   - ANALYZE calculates global statistics for partitioned tables and indexes instead of gathering them directly. This can lead to inaccuracies for some statistics, such as the number of distinct values. DBMS_STATS won't do that.

   Most importantly, in the future, ANALYZE will not collect the statistics needed by the cost-based optimizer.



Hope this will help you
 
LVL 4

Expert Comment

by:kripa_odba
ID: 13908325
One more comment.....

  The RULE hint tells Oracle to use the old rule-based optimizer, which tends to favor indexes.  This may not always help you.  If your query is going to hit less than about 5% of the rows in the table, then index-based retrieval of the rows is good; but when your query is going to hit more than 5% of the rows in your table, it's usually better to go for a full table scan.

  So if you specify the RULE hint in your query and it hits more than, say, 5% of the rows, it is an overhead, since Oracle first needs to read the index and only then the actual rows in the table.

  So it is better not to specify a hint in your SQL query; the Oracle cost-based optimizer will choose the best execution plan for it.

  But unless you analyze your tables, the statistics for the cost-based optimizer will not be generated in the data dictionary tables.  If you want to know more about optimizer modes, you can refer to the Oracle documentation.


thanks & Regards
 
LVL 35

Assisted Solution

by:Mark Geerlings
Mark Geerlings earned 240 total points
ID: 13911127
The "rule" hint was what Oracle used in Oracle6 and Oracle7 to optimize queries.  Starting with Oracle8, Oracle introduced the (much more complex and sophisticated) "cost-based" optimizer.  If you have Oracle8.1 or Oracle9, you should be using the cost-based optimizer (which needs table statistics).  In Oracle10, the rule-based optimizer is officially desupported.

Should you be analyzing the table before every "select..." statement?  **DEFINITELY NOT**  You only need to analyze the tables when they have had significant changes, for example if more than 10% of the rows are deleted or updated, or if 10% or more new rows are added.  In most Oracle databases, it is enough to run an analyze just once a week or even once a month.
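A weekly refresh like that can be scheduled inside the database itself. A sketch using DBMS_JOB (the schema name and sample percentage are assumptions, not recommendations):

```sql
VARIABLE jobno NUMBER
BEGIN
  DBMS_JOB.SUBMIT(
    job       => :jobno,
    what      => 'DBMS_STATS.GATHER_SCHEMA_STATS(ownname => ''APPUSER'', estimate_percent => 20, cascade => TRUE);',
    next_date => SYSDATE,
    interval  => 'TRUNC(SYSDATE) + 7'   -- run again a week later
  );
  COMMIT;
END;
/
```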
 
LVL 6

Expert Comment

by:jwittenm
ID: 13930157
I'd recommend you pull the query that seems to be benefiting from the analyze out of your Java code.  Execute your analyze statement.  Run the query via SQL*Plus and check the explain plan.  Run your Java as usual to do the inserts and queries.  When it begins to slow down, run the query again from SQL*Plus and re-check the explain plan.  What is the difference?  Once you understand what is changing, you can postulate a reason and a solution, then test your theory.
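The SQL*Plus part of that workflow can be sketched as follows (table and column names are made up; DBMS_XPLAN requires 9i Release 2, otherwise query PLAN_TABLE directly):

```sql
-- Populate PLAN_TABLE with the plan Oracle would use for the statement:
EXPLAIN PLAN FOR
SELECT col1, col2 FROM my_table WHERE indexed_col = 'SOMEVALUE';

-- Display the plan that was just generated:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Comparing this output before and after the slowdown shows whether the access path (index scan vs. full table scan, join order, etc.) has changed.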
 

Author Comment

by:rnicholus
ID: 13933166
Actually, that won't be necessary at this juncture.  I realized that I am missing indexes for key columns that appear in the predicate section of most of my automated SQL statements.  I intend to create and remove some indexes.  I have also been reading up on the CBO.  Very interesting stuff.  I'm glad I was given this link.

One question (and probably one final question tomorrow)...

Is there a reason why I should not create indexes on all of the columns in a table?  What sort of overhead is involved in maintaining an index?
 
LVL 4

Expert Comment

by:kripa_odba
ID: 13933564
If you are doing a lot of DML operations on a table that has indexes on all of its columns, it will tremendously slow down the DML operations, because every DML operation has to update the indexes as well as the actual rows in the table.

If your table is read-only, i.e. you are using the table only for querying, you can go for creating indexes on most of the columns.  But you need to understand that the indexes will consume storage space in the database.

 
LVL 5

Assisted Solution

by:helpneed
helpneed earned 80 total points
ID: 13933932
hi

I think there is no need to put an index on all the columns; it's an overhead when doing an update.

And Oracle will only use an index if your selection is less than about 5% of the data, so if it's more than that, think twice about it.

So it is a trade-off between the speed of retrieving data from a table and the speed of updating the table. For example, if a table is primarily read-only, having more indexes can be useful; but if a table is heavily updated, having fewer indexes may be preferable.

You can find details about managing indexes at this link:
http://www-rohan.sdsu.edu/doc/oracle/server803/A54641_01/ch14.htm

regards

 
LVL 35

Assisted Solution

by:Mark Geerlings
Mark Geerlings earned 240 total points
ID: 13935843
I agree, you do not want an index on every column because:
1. of the space it will take
2. of the performance impact on inserts, updates or deletes
3. this can actually make query performance worse

You definitely need indexes on the key columns in your tables (those columns where every record has a unique value).  For non-key columns with fewer distinct values (and multiple rows that have the same value), indexes may or may not help. They definitely will not help if the "where" clauses of the queries do not provide values for these indexed columns.  The performance value of indexes depends on how many distinct values there are (with fewer values, indexes are less helpful), on how many rows are being retrieved by the queries, and on how those rows are distributed in the table (widely scattered, or adjacent to each other).  A general rule of thumb is that if a query will retrieve more than 10% of the rows in the table, an index usually does not help, but this varies based on the conditions I just mentioned.
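As a sketch of the two cases (all table, column, and index names here are hypothetical):

```sql
-- Key column: every row has a unique value, so a unique index is a clear win.
CREATE UNIQUE INDEX emp_pk_idx ON employees (employee_id);

-- Non-key column: check how selective it is before indexing it.
SELECT COUNT(DISTINCT department_id) AS distinct_vals,
       COUNT(*)                      AS total_rows
  FROM employees;

-- Index it only if typical queries on it return well under ~10% of the rows:
CREATE INDEX emp_dept_idx ON employees (department_id);
```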
 

Author Comment

by:rnicholus
ID: 13935982
Thanks for this info.  

One final question...

In the "Introduction to the Optimizer" online document at http://www.cs.umb.edu/cs634/ora9idocs/server.920/a96533/optimops.htm, the following statement appears: "Online transaction processing applications, which consist of short-running SQL statements with high selectivity, are often characterized by the use of index access paths."  However, earlier in that same paragraph, it says "...index paths should be used for statements that retrieve a small subset of table rows".  The high selectivity mentioned in the first statement seems to suggest that a large number of rows will be selected (since high selectivity = a selectivity rating close to 1.0 = a large number of rows selected).  So, the two statements I have quoted seem to contradict each other.

Can someone shed some light on this for me?
 
LVL 6

Assisted Solution

by:jwittenm
jwittenm earned 80 total points
ID: 13936535
'High Selectivity' means very selective, i.e., retrieves a small subset of the data.  They are talking about the same concept.
 

Author Comment

by:rnicholus
ID: 13936637
Oh.  Ok.  So, when they speak of high selectivity, they are speaking of a selectivity rating closer to 0, rather than a selectivity rating closer to 1.  Correct?
 
LVL 35

Assisted Solution

by:Mark Geerlings
Mark Geerlings earned 240 total points
ID: 13937029
My understanding of "High Selectivity" is the same as jwittenm's, that is: a small subset of data.  For example, in a customer file that has addresses for 100 customers, and two customers are from each of the 50 states in the US, a query based on a particular state would be "highly selective".  If the 100 customers were grouped though into only two or three states, then I would consider the value in the state column to be not highly selective, and not a good candidate for a standard b-tree index.  In a data warehouse though, this would be a good column for a bit-mapped index.
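That example can be checked with a quick query against a hypothetical CUSTOMERS table with a STATE column:

```sql
-- 100 customers spread evenly over 50 states: a query on one state
-- returns 2 rows, i.e. 2% of the table, which is highly selective.
SELECT COUNT(*) AS matching_rows,
       COUNT(*) / (SELECT COUNT(*) FROM customers) AS fraction_of_table
  FROM customers
 WHERE state = 'MI';
```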
 

Author Comment

by:rnicholus
ID: 13937081
I see.  I just want to confirm that the selectivity rating of your first example is close to 0, and the selectivity rating of your second example is close to 1...
 
LVL 35

Expert Comment

by:Mark Geerlings
ID: 13937182
I don't claim to know what a selectivity rating of 0 or 1 is closer to.  I think of selectivity more in terms of the percentage of rows selected, rather than in the absolute number of rows selected.  For example, in a million-row table, a query that retrieves only 100 rows, I would consider highly selective.  Whether that means it's closer to a selectivity rating of 0 or 1, I don't know.
 

Author Comment

by:rnicholus
ID: 13937229
Thank you all for your help.  I will split the points up as best I can.