myerpz
asked on
Sample size in ANALYZE command
Is there a recommended rule of thumb for choosing what
percentage of a table to analyse ?
I have some large tables (20 million rows) which I want
to analyse on a regular basis.
I don't want to analyse too small a number of rows, as
the stat's probably won't be accurate, and could make
things worse if the wrong execution path is chosen as
a result of inaccurate stat's.
But I don't want to analyse the whole table as this will
take too long.
What percentage will give me a good balance ? 10 ? 20 ?
Does Oracle recommend a suitable figure ?
Any accompanying explanations or URL's would be most
appreciated.
Thanks.
percentage of a table to analyse ?
I have some large tables (20 million rows) which I want
to analyse on a regular basis.
I don't want to analyse too small a number of rows, as
the stat's probably won't be accurate, and could make
things worse if the wrong execution path is chosen as
a result of inaccurate stat's.
But I don't want to analyse the whole table as this will
take too long.
What percentage will give me a good balance ? 10 ? 20 ?
Does Oracle recommend a suitable figure ?
Any accompanying explanations or URL's would be most
appreciated.
Thanks.
Rather than a percentage, I've seen the number 1024 recommended as the number of rows to count when you want analyze to do a quicker job. I used that figure in the past when I was working on a system with some multi-million row tables, with good results as far as I could tell.
ASKER
I think the default on the ANALYZE command is 1604.
I just wonder if there's any official word from Oracle
on it, or whether they leave it up to the analyst.
I just wonder if there's any official word from Oracle
on it, or whether they leave it up to the analyst.
************************** ********** ********** ********
ANALYZE .... ESTIMATE STATISTICS SAMPLE integer ;
SAMPLE INTEGER:
Specify the amount of data from the analyzed object Oracle should sample to estimate statistics. If you omit this parameter, Oracle samples 1064 rows.
The default sample value is adequate for tables up to a few thousand rows. If your tables are larger, specify a higher value for SAMPLE. If you specify more than half of the data, Oracle reads all the data and computes the statistics.
ROWS causes Oracle to sample integer rows of the table or cluster or integer entries from the index. The integer must be at least 1.
PERCENT causes Oracle to sample integer percent of the rows from the table or cluster or integer percent of the index entries. The integer can range from 1 to 99.
************************** ********** ********** ********
I think 20% of the rows is a good figure estimate the statistics.
ANALYZE .... ESTIMATE STATISTICS SAMPLE integer ;
SAMPLE INTEGER:
Specify the amount of data from the analyzed object Oracle should sample to estimate statistics. If you omit this parameter, Oracle samples 1064 rows.
The default sample value is adequate for tables up to a few thousand rows. If your tables are larger, specify a higher value for SAMPLE. If you specify more than half of the data, Oracle reads all the data and computes the statistics.
ROWS causes Oracle to sample integer rows of the table or cluster or integer entries from the index. The integer must be at least 1.
PERCENT causes Oracle to sample integer percent of the rows from the table or cluster or integer percent of the index entries. The integer can range from 1 to 99.
**************************
I think 20% of the rows is a good figure estimate the statistics.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
I have used this script for a number of years an it seems to do the trick. It is base on the size of the table not the number of rows.
REM ************************** ********** ********** *********
REM
REM File: C:\Dba\analytab.SQL
REM
REM ************************** ********** ********** ********
REM This script will produce and run the script
REM ANALYTB2.SQL to analyze
REM all tables and create the listing analytab.rpt
REM
REM ************************** ********** ********** ********
set echo off
set heading off
set pagesize 50000
set feedback off
spool c:\dba\analytb2.sql
SELECT 'ANALYZE TABLE ' || RTRIM( owner ) || '.' ||
segment_name || ' COMPUTE STATISTICS ;'
FROM sys.dba_segments
WHERE segment_type = 'TABLE'
AND owner NOT IN ('SYS','SYSTEM','VOSER')
AND bytes < 5000000
UNION
SELECT 'ANALYZE TABLE ' || RTRIM( owner ) || '.' ||
segment_name || ' ESTIMATE STATISTICS SAMPLE 30 PERCENT;'
FROM sys.dba_segments
WHERE segment_type = 'TABLE'
AND owner NOT IN ('SYS','SYSTEM','VOSER')
AND bytes between 5000001 AND 30000000
UNION
SELECT 'ANALYZE TABLE ' || RTRIM( owner ) || '.' ||
segment_name || ' ESTIMATE STATISTICS SAMPLE 10 PERCENT;'
FROM sys.dba_segments
WHERE segment_type = 'TABLE'
AND owner NOT IN ('SYS','SYSTEM','VOSER')
AND bytes between 30000001 AND 50000000
UNION
SELECT 'ANALYZE TABLE ' || RTRIM( owner ) || '.' ||
segment_name || ' ESTIMATE STATISTICS SAMPLE 5 PERCENT;'
FROM sys.dba_segments
WHERE segment_type = 'TABLE'
AND owner IN ('SYS','SYSTEM')
AND bytes > 50000000
ORDER BY 1
;
spool off
set echo on
set feedback on
@C:\dba\analytb2
REM **************************
REM
REM File: C:\Dba\analytab.SQL
REM
REM **************************
REM This script will produce and run the script
REM ANALYTB2.SQL to analyze
REM all tables and create the listing analytab.rpt
REM
REM **************************
set echo off
set heading off
set pagesize 50000
set feedback off
spool c:\dba\analytb2.sql
SELECT 'ANALYZE TABLE ' || RTRIM( owner ) || '.' ||
segment_name || ' COMPUTE STATISTICS ;'
FROM sys.dba_segments
WHERE segment_type = 'TABLE'
AND owner NOT IN ('SYS','SYSTEM','VOSER')
AND bytes < 5000000
UNION
SELECT 'ANALYZE TABLE ' || RTRIM( owner ) || '.' ||
segment_name || ' ESTIMATE STATISTICS SAMPLE 30 PERCENT;'
FROM sys.dba_segments
WHERE segment_type = 'TABLE'
AND owner NOT IN ('SYS','SYSTEM','VOSER')
AND bytes between 5000001 AND 30000000
UNION
SELECT 'ANALYZE TABLE ' || RTRIM( owner ) || '.' ||
segment_name || ' ESTIMATE STATISTICS SAMPLE 10 PERCENT;'
FROM sys.dba_segments
WHERE segment_type = 'TABLE'
AND owner NOT IN ('SYS','SYSTEM','VOSER')
AND bytes between 30000001 AND 50000000
UNION
SELECT 'ANALYZE TABLE ' || RTRIM( owner ) || '.' ||
segment_name || ' ESTIMATE STATISTICS SAMPLE 5 PERCENT;'
FROM sys.dba_segments
WHERE segment_type = 'TABLE'
AND owner IN ('SYS','SYSTEM')
AND bytes > 50000000
ORDER BY 1
;
spool off
set echo on
set feedback on
@C:\dba\analytb2
I'm not experienced with analyze schema.
But dbrower, are you sure, that you always used THIS script for a number of years?
What if a table is > 50000000 Bytes?
Ooops, is there perhaps a "NOT" missing ;-)?
Good luck.
But dbrower, are you sure, that you always used THIS script for a number of years?
What if a table is > 50000000 Bytes?
Ooops, is there perhaps a "NOT" missing ;-)?
Good luck.
Hi
Why not test different samples percentage for both time and statistics accuracy ?
Why not test different samples percentage for both time and statistics accuracy ?