10G Undo Advisor
A transaction uses an undo segment to record before images of data it intends to change. If the transaction fails before committing, Oracle uses the before images to rollback or undo the uncommitted data changes. Oracle also uses undo segments for statement-level read consistency. Read consistency ensures that all data returned by a query comes from the same point-in-time (query start time). Lastly, undo segments provide before images of data to help the instance roll back failed transactions during instance recovery.
In earlier releases of Oracle, DBAs were responsible for manually sizing, creating and tuning database objects that were responsible for storing before images of data. These objects, called rollback segments, are still available for use but certainly aren't as popular as their automatic undo segment counterparts. You didn't have to be an Oracle expert to know that rollback segments were "somewhat troublesome." Out of space conditions, contention, poor performance and the perennial favorite "snap shot too old" errors had been plaguing Oracle database administrators for over a decade. Oracle finally decided that the database could probably do a better job of managing before images of data than we could.
In Oracle9i and later releases, administrators have their choice of continuing to manage rollback segments on their own (manual undo management) or configuring the database to manage its own before image data (automatic undo management). Oracle refers to system managed before image segments as undo segments.
Administrators must create a tablespace to hold undo segments by using the UNDO keyword in the tablespace create statement:
CREATE UNDO TABLESPACE undots1
DATAFILE 'undotbs_1a.dbf'
SIZE 10M AUTOEXTEND ON;
The following initialization parameters are used to activate automatic undo
management:
- UNDO_MANAGEMENT - AUTO configures the database is to use automatic undo segments. MANUAL configures the database to use rollback segments.
- UNDO_TABLESPACE - Specifies the tablespaces that are to be used to hold the undo segments. The tablespace must be created using the UNDO keyword. If no tablespace is defined, Oracle will select the first available undo tablespace. If no undo tablespaces are present in the database, Oracle will use the system rollback segment during startup. This value can be set dynamically by using the ALTER SYSTEM statement.
- UNDO_RETENTION - specifies the amount of time that Oracle attempts to keep undo data available. This parameter becomes important when the Oracle9i flashback query option is used.
You cannot create database objects in undo tablespaces. It is reserved for system-managed undo data. The view DBA_UNDO_EXTENTS can be accessed to retrieve information relating to system managed undo data. For those of us familiar with V$ROLLSTAT, it is still available and the information reflects the behavior of the undo segments in the undo tablespace.
Tuning Automatic Undo
Segments
Most of us thought that automatic undo would be the solution to all of our before
image management problems. Certainly, before image administration has been simplified
by reducing the number of tuning knobs to one (UNDO_RETENTION). But sizing the
UNDO_TABLESPACE to hold the amount of before images generated during the time period specified
in the UNDO_RETENTION parameter can still be a challenge.
The goal we are trying to accomplish is to set the UNDO_RETENTION parameter and size the undo tablespace to ensure that we don't run out of space in the undo tablespace and our queries don't receive the dreaded ORA-01555 "Snapshot too old" error messages.
ORA-01555 errors occur when a transaction is unable to successfully read a before image of data to maintain read consistency. As stated previously, read consistency ensures that all data returned by a query comes from the same point-in-time (query start time). If a block of data changes after the query has started, Oracle will read the before, or undo, image of data to maintain a read consistent view of the data. To guarantee read consistency, Oracle returns the ORA-01555 error if it cannot retrieve the before image of the changed data block.
There are a couple of reasons why this failure occurs. One is delayed block cleanout. A good description of this problem can be found in Note 40689.1 on Oracle's Metalink website. Another reason this error happens is when there is insufficient space to hold all of the before images in the undo tablespace. When space becomes constrained, Oracle will remove before images of data from the undo tablespace that are no longer required to maintain transaction-level consistency. Remember, before images of uncommitted data must be kept to provide transaction-level consistency. If a transaction fails, Oracle uses the before images to make the database look like the failed transaction never changed anything. Before images of committed data blocks aren't as important to Oracle because they aren't required to maintain transaction-level consistency. The problem arises when queries and Oracle don't agree on what is, or isn't, important. And that disagreement occurs when a query is looking for a before image that has been flushed out of the undo area.
The trick is to balance the size of the undo tablespace with the amount of undo retention you need to satisfy your long-running queries. But this is easier said than done. Database applications, by their nature, are very dynamic. It is hard to predict how much undo will be generated for a particular time period. Sizing the undo tablespace and setting the UNDO_RETENTION time period is certainly simpler for small databases that don't have a lot of activity. But what if you have an application that is hundreds of GIGs in size with queries that run for hours and hours and hours .
10G Grid Control Undo Advisor
Let's take a look at the 10G Grid Control Undo Advisor to see if it can help. Like all
advisors, we access the Undo Advisor by clicking on the Undo Advisor link that
is displayed on 10G Grid Control's Advisor
Central Home Page. Clicking on this link notifies10G Grid Control to display the Undo
Management Home Page. The home page displays the following information:
- Current UNDO_RETENTION setting
- Links to administer the undo tablespace
- A recommendation area
- Undo image consumption and generation
- Links to activate the Undo Advisor
Notice that Oracle is making a recommendation to increase the size of the undo tablespace. Remember that 10G Grid Control will activate various advisors during normal operations. The Undo Advisor is one of them. After reviewing the information on the Undo Management Home Page, I clicked on the Undo Advisor link. 10G Grid Control displays the Undo Advisor Page. The Undo Advisor Page displays a graph that allows you to determine the size of the undo tablespace that will be required to successfully store the amount of before images generated for a specified undo retention time period.
If you roll the mouse over the graph, Oracle will display undo retention time/undo tablespace size settings. You can now perform numerous "what if" tests to determine what the best combination of undo retention/undo tablespace size values are for your environment. If you click on the graph and hit the OK navigation button, you will be executing the SQL that is generated to change the undo retention to the time period you specified with your mouse click. If you click on the Show SQL navigation button, 10G Grid Control will display the SQL statement that will be executed.
It seems that the Undo Advisor uses a worst-case scenario to determine the amount of undo generated for a particular time period. Use the advisor to determine a good starting point and set your UNDO_RETENTION parameter and undo tablespace size settings accordingly. Monitor the enviroment and fine tune it to meet your overall goals.
Tune the SQL First
If a statement suddenly returns the infamous ORA-015555 "Snapshot
too old" error message, don't immediately begin tuning the database's undo
management system to fix it. Remember many SQL statements run too long because
they aren't properly tuned and some shouldn't be run during daylight
hours due to the amount of work they generate. You'll need to tune the poorly
running queries and determine which SQL should be run during off-peak processing
times. Continue to tune the SQL and tweak the undo setting parameters until
you obtain a trouble-free environment.
I hope you enjoyed this
blog on the 10G Grid Control Automatic Undo Advisor. Thanks for reading!
