DBMS_STATS – Gathering Database Statistics

There are various schools of thought on the correct way to gather statistics in a database. Do you collect them daily, weekly, or monthly? Do you gather them when the objects have had a certain level of change made to them? Do you gather straight in PROD, or test in a QA environment first?

In this article I would like to first show you what Oracle recommends and then give you some advice from real-world experience.

Oracle’s Recommendations for Gathering Stats

The points below have been summarised from multiple Metalink articles on this subject.

  • Use a 100% sample size wherever possible.
  • Ensure all tables and indexes have statistics.
  • Ensure columns with skewed data have histograms, but usually only add a histogram where it is needed. The SKEWONLY option does this; the 10g default of AUTO can also be fine.
  • Creating histograms on columns when the sample size is less than 100% can cause issues, because the sample may not include all of the distinct values which should be in the histogram. Also, if stats are out of date and you have histograms, a queried value may fall out of range and cause bad plans to be used.
  • In 9i, the default value of the DBMS_STATS method_opt parameter (which controls histogram collection) was FOR ALL COLUMNS SIZE 1. Effectively, this means no detail about the data within the columns, just the minimum and maximum values. In 10g and 11g the default is FOR ALL COLUMNS SIZE AUTO, which means DBMS_STATS decides which columns to add histograms to, based on where it believes they may help to produce a better plan.
  • Gather system statistics. These provide the optimizer with values for the CPU speed and I/O capabilities of the server, so that it can cost execution plans more accurately.
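Putting those recommendations together, a gather call might look like the sketch below. The schema and table names (APP_OWNER, ORDERS) and the 60-minute workload interval are assumptions for illustration, not values from the article:

```sql
BEGIN
  -- 100% sample, histograms only on skewed columns, cascade to indexes
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'APP_OWNER',
    tabname          => 'ORDERS',
    estimate_percent => 100,
    method_opt       => 'FOR ALL COLUMNS SIZE SKEWONLY',
    cascade          => TRUE);

  -- System statistics: capture CPU and I/O characteristics while a
  -- representative workload is running (interval is in minutes)
  DBMS_STATS.GATHER_SYSTEM_STATS(gathering_mode => 'INTERVAL',
                                 interval       => 60);
END;
/
```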

DBMS_STATS – The Real World Solution

Key points to think about when dealing with Oracle table and index statistics are:

  • Any change in stats should be treated like a release, because the resulting plan changes can be just as severe. It should be run on DEV, then QA, and finally PROD after being signed off. Would you put a new SQL statement live without having tested it?
  • Always use 100% sampling, and don't gather stats for the first time on PROD. Gather statistics on DEV using a full data set from PROD with DBMS_STATS, then export those stats and import them into the other environments as you need them. This way you can be 100% sure that the stats on PROD are the same ones which were signed off.
  • Use the CASCADE option to gather index stats at the same time as the table stats, but be aware that from 10g onwards statistics are gathered automatically when an index is created, so you might not need to re-gather, saving you time.
  • Don't use histograms unless you absolutely have to. They cause more headaches and require so much maintenance that it's far better to do without them if you can get away with it.
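The gather-on-DEV, export, then import workflow can be sketched as follows. The owner names (APP_OWNER, STATS_OWNER) and the statistics table name (MY_STATS) are illustrative assumptions:

```sql
-- On DEV, after refreshing with a full PROD data set:
BEGIN
  -- One-off: create a table to hold the exported statistics
  DBMS_STATS.CREATE_STAT_TABLE(ownname => 'STATS_OWNER',
                               stattab => 'MY_STATS');

  -- Gather with a 100% sample, then export into MY_STATS
  DBMS_STATS.GATHER_SCHEMA_STATS(ownname          => 'APP_OWNER',
                                 estimate_percent => 100,
                                 cascade          => TRUE);
  DBMS_STATS.EXPORT_SCHEMA_STATS(ownname => 'APP_OWNER',
                                 stattab => 'MY_STATS',
                                 statown => 'STATS_OWNER');
END;
/

-- Move STATS_OWNER.MY_STATS to QA/PROD (e.g. with Data Pump), then:
BEGIN
  DBMS_STATS.IMPORT_SCHEMA_STATS(ownname => 'APP_OWNER',
                                 stattab => 'MY_STATS',
                                 statown => 'STATS_OWNER');
END;
/
```

Because every environment imports the same exported set, the plans you signed off in QA are the plans you get in PROD.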

DBMS_STATS – The Detail

Another recommendation that I have regarding the use of DBMS_STATS is to configure the default values which are used when one of the gather procedures is executed without explicitly specifying the options. Here is a script that I used when I upgraded my database from 10g to 11g:

-- Show current values for default stats gather parameters

SPOOL dbms_stats_config.log

select dbms_stats.GET_PREFS('AUTOSTATS_TARGET') from dual;
select dbms_stats.GET_PREFS('CASCADE') from dual;
select dbms_stats.GET_PREFS('ESTIMATE_PERCENT') from dual;
select dbms_stats.GET_PREFS('METHOD_OPT') from dual;
select dbms_stats.GET_PREFS('NO_INVALIDATE') from dual;
select dbms_stats.GET_PREFS('GRANULARITY') from dual;
select dbms_stats.GET_PREFS('PUBLISH') from dual;
select dbms_stats.GET_PREFS('INCREMENTAL') from dual;
select dbms_stats.GET_PREFS('DEGREE') from dual;

-- Setting new parameters for default stats gather parameters
-- (the values below are examples; use the ones signed off for your environment)

exec dbms_stats.SET_GLOBAL_PREFS('AUTOSTATS_TARGET','AUTO');
exec dbms_stats.SET_GLOBAL_PREFS('CASCADE','TRUE');
exec dbms_stats.SET_GLOBAL_PREFS('ESTIMATE_PERCENT','100');
exec dbms_stats.SET_GLOBAL_PREFS('METHOD_OPT','FOR ALL COLUMNS SIZE 1');
exec dbms_stats.SET_GLOBAL_PREFS('NO_INVALIDATE','FALSE');
exec dbms_stats.SET_GLOBAL_PREFS('GRANULARITY','AUTO');
exec dbms_stats.SET_GLOBAL_PREFS('PUBLISH','TRUE');
exec dbms_stats.SET_GLOBAL_PREFS('INCREMENTAL','FALSE');
exec dbms_stats.SET_GLOBAL_PREFS('DEGREE','NULL');

SPOOL OFF

In the example above I am configuring the default options used whenever the DBMS_STATS gather procedures are called. This is helpful because if anyone runs DBMS_STATS without thinking about which options to specify, it will pick up the defaults you have configured rather than whatever Oracle shipped with.

When to Gather Statistics?

You should always gather new statistics when you upgrade from one Oracle release to another. For example, when upgrading your database from 10g to 11g you should definitely gather new stats. This is because the optimizer's behaviour changes between releases and, as such, different statistics are required to give the optimizer the full picture about the segments. Gathering new statistics in this instance should be done as part of your development and testing phase: refresh a full database from your production environment, gather stats there, use the same stats throughout your testing, and go live with the stats which you have tested with.

Another time to gather statistics is when you have created a new table, changed a large proportion of the rows in a table, created a new index, or altered a table in some way, such as adding a new column. In all of these scenarios I would recommend the same approach as above: gather the stats on DEV where you have completed the work, test that everything works as expected and performance is good, and then roll these statistics out with the other changes so you are using a known set of stats.

I have written another article about the best way to restore schema, table or index statistics.
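As a quick taster, since 10g the database keeps a history of previous statistics, so a bad gather can be backed out. A minimal sketch, again using the assumed APP_OWNER.ORDERS names:

```sql
-- Restore the table's statistics as they were 24 hours ago
BEGIN
  DBMS_STATS.RESTORE_TABLE_STATS(
    ownname         => 'APP_OWNER',
    tabname         => 'ORDERS',
    as_of_timestamp => SYSTIMESTAMP - INTERVAL '1' DAY);
END;
/

-- Check how far back statistics history is available
select dbms_stats.GET_STATS_HISTORY_AVAILABILITY from dual;
```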

You should definitely read more about gathering system statistics in your database, as these can make a massive difference to how your SQL queries run; I found this especially true when upgrading from 10g to 11g. This, together with gathering new table and index stats with DBMS_STATS, made things much better for us.

You should also check out Metalink ID 1226841.1 for more detailed information.
