
Oracle_trace - the Best Built-in Diagnostic Tool?

by Jonathan Lewis

Editor's Note: Shortly after this article was published, it came to light that the Oracle 9.2 Performance Guide and Reference has now identified Oracle Trace as a deprecated product.

There is a lot of diagnostic code built into the database engine: some of it, such as sql_trace, is well documented, and some, such as x$trace, is undocumented. Every now and again, I like to spend a little time re-visiting areas of code like this to see how much they have evolved, and whether they have acquired official blessing and documentation. Recently, whilst doing some work with Oracle 9i, I discovered the amazing leap forward that oracle_trace has made in the last couple of releases. This article is a brief introduction to oracle_trace and what it could do for you.

How Do I … ?

Find out which object is the source of all the buffer busy waits that I can see in v$waitstat?

We've all seen the tuning manuals: "if you see . . . you may need to increase freelists on the problem table" -- but no clue about how to find the problem table.

Option 1 - run a continuous stream of queries against v$session_wait and check the values of p1, p2, and p3 when this event appears. Statistically, you will eventually get a good indicator of which object(s) are causing the problem. A bit of a pain to do, and relies a little on luck.
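For example, a polling query along these lines could be run repeatedly from a script (a sketch only; for buffer busy waits, p1 is the file number, p2 the block number, and p3 the reason code):

    select
            sid,
            p1      file_id,
            p2      block_id,
            p3      reason_code
    from
            v$session_wait
    where
            event = 'buffer busy waits'
    ;

Each execution is only a snapshot, which is why the approach depends on sampling frequently enough to catch the waits as they happen.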

Option 2 - switch on event 10046 at level 8, and catch a massive stream of wait states in trace files. A fairly brutal overhead, and again relies on a little bit of luck.

Option 3 - there is an event (10240) which is supposed to produce a trace file listing the addresses of blocks we wait for (hooray!), but I've not yet managed to get it to work. If you do know how to, please let me know, as this is clearly the optimum solution.

So would you like to get a list of just those blocks we wait for, who waited for them, why they waited, and how long they waited - at minimal cost? This is just one of the things that oracle_trace can do for you.

What is oracle_trace?

oracle_trace is a component of the database engine that collects events, apparently at relatively low cost.

Events include such things as waits, connects, disconnects, calls to parse, execute, fetch, and several others.

You can collect the information from the entire instance or target specific users, events, or processes; and can switch the tracing on and off at will.

But one of the really nice features of oracle_trace is that it can be set to buffer the collection and dump it to disc in big chunks, rather than leaking it out a line at a time. Not only that, but you can request that the collection file should be a fixed size, and should be recycled.

Naturally, once you have generated a collection file, you need to analyse it. You can do this one of two ways - run a program that converts the collection file into a series of flat text reports, or run a program that reads the collection file and dumps it into a set of Oracle tables, from which you can then generate your own reports.

Uses for oracle_trace

So how does oracle_trace help us to answer the original question?

Simple: one of the classes of events that can be traced is waits. We ensure that we have started our database in a ready to trace mode, and then tell Oracle, either through PL/SQL or the command line interface, to start tracing waits. But we restrict the choice of wait events to just wait 92 (this is buffer busy waits in my Oracle 9i system, but check event# and name from v$event_name for your system). We then sit back and wait for an hour or so at peak problem time. When we think our trace file is large enough, we stop tracing, format the trace file into a database, and run an SQL statement that might, for example, say something like,
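To check the event number for your own system before setting up the trace, a quick query like this will do (92 is the value I see under my 9i system; your version may differ):

    select
            event#, name
    from
            v$event_name
    where
            name = 'buffer busy waits'
    ;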

tell me which objects are subject to buffer busy waits, the wait time, how
often it happens and who got hit the most.

If we wanted to suffer an extra overhead, we could even start a trace to capture the waits and the SQL, and get a report of the SQL that suffered the waits.

Putting it All Together

The first component of our task is to set some database parameters so that the database is ready to trace, but not tracing. Fig.1 shows the list.

Required
oracle_trace_enable          = true
oracle_trace_collection_name = " "

Defaults
oracle_trace_collection_size = 5242880
oracle_trace_collection_path = ?/otrace/admin/cdf
oracle_trace_facility_path   = ?/otrace/admin/fdf
oracle_trace_facility_name   = oracled
Figure 1: Parameters relating to oracle_trace

The oracle_trace_collection_name must be set to an explicit blank (" "), otherwise it defaults to "oracle"; and if a collection name is available when trace is enabled, Oracle does instance-level tracing from the moment it starts up (ouch!).

The oracle_trace_collection_path is the directory where the files will go. The oracle_trace_facility_path is where the lists of events to be traced (facility definition files supplied by Oracle Corporation) will be located. The oracle_trace_facility_name identifies the list of events we are interested in. Finally, we can limit the size (in bytes) of the collection file using oracle_trace_collection_size.

Once the database is started, we can start a trace collection.

In this article I will stick with using the command line interface, although there is a PL/SQL interface as an alternative (and a graphic interface if you purchase the module for Oracle Enterprise Manager). The command we use will be something like:

     otrccol start 1 otrace.cfg

The otrccol command is the primary interface to oracle_trace. There are other commands, but their functionality has generally been added to otrccol. Obviously we use start to start tracing (and stop to stop tracing). The "1" is an arbitrary job id, and otrace.cfg is a configuration file. See fig. 2 for an example of a configuration file.

Used for collection

col_name           = jpl
cdf_file           = jpl
dat_file           = jpl
fdf_file           = waits.fdf
max_cdf            = -10485760
buffer_size        = 1048576
regid              = 1 192216243 7 92 5 d901

Used for formatting
username           = otrace
password           = otrace
service            = d901
full_format        = 1

Figure 2: Sample oracle_trace configuration file

This file tells Oracle to produce a collection file called jpl.dat, with a collection definition file called jpl.cdf, and a collection identifier of jpl. The facility definition is in the file waits.fdf (supplied by Oracle Corp. to identify wait events only). The trace file will be limited to 10MB, but will be recycled so that there is always 10MB of recent data in it, and Oracle will use a 1MB buffer to hold data before dumping it.

The regid line is one of the really powerful features of oracle_trace. The default value for this line reads '0 0' where I have the values '7 92'; in its default form, the line states that oracle_trace is tracing across the entire Oracle instance identified by the d901 at the end of the line. In the form shown, I have chosen to trace only facility number 7 (wait events), facility item 92 (buffer busy waits).

You can have multiple regid lines in a file if you want to. For my first set of experiments, I used two regid lines in my configuration file, specifying '7 129' and '7 130' - sequential and scattered reads respectively, as these types of waits are easy to generate.

I will mention the formatting section later in the article.

After letting the system run for a while, we then execute:

    otrccol stop 1 otrace.cfg
    otrccol format otrace.cfg

The first command stops the trace, the second command reads the collection file and dumps it into a set of Oracle tables.

However, before you can format the collection, you need to create an account to hold the tables used by the formatter. For the name and password, we use the values listed in fig. 2 above. The significance of the full_format=1 line in the configuration file is that it forces the formatter to dump the entire file to table; full_format=0 would restrict the task to new data only. Note also the service name -- this identifies the database that holds the account -- you will need the TNS listener running to use the format command, even if you format to the local database.

Fig. 3 shows a short script to create the account and give it suitable privileges.

   create user otrace identified by otrace
         default tablespace users
         -- if you are not using 9i
         -- temporary tablespace temp
         quota 100m on users;

   grant create session to otrace;
   grant create table to otrace;
   grant create sequence to otrace;
   grant create synonym to otrace;

Figure 3: Creating a user for the trace tables.

When you run the format option, the program will automatically (at least in newer versions of Oracle) create the necessary target tables under the supplied account. Some of these tables will have sensible names like:

    EPC_COLLECTION

Others will be meaningless things such as:

    V_192216243_F_5_E_9_9_0

The problem of incomprehensibility can be addressed by running script otrcsyn.sql in directory $ORACLE_HOME/otrace/demo.

This script produces synonyms for the tables, giving them meaningful names such as

    WAIT
    CONNECTION

(There are variations in names across different versions of Oracle.)

I found a small problem with the automatic table generation. If you use one of the very selective lists of facilities (such as waits.fdf), only the set of tables required for the resulting data appears when you format the data. If you then decide to use a more comprehensive list of facilities (such as oracle.fdf), the formatter will get confused and crash because some tables are present and some are missing. So it may help to run for just a few seconds with facility oracle.fdf, format the data, then truncate all the tables. This is a crude, but effective, method for setting up the account.

Some Results

So finally what have we done:

      • Created a config file
      • Started a collection
      • Done some work on the database
      • Stopped the collection
      • Formatted the collection

Now what?

Assume we have been using the configuration file from figure 2. We would find, when we logged into the otrace account, that we had some rows in the connection, disconnect, and wait tables. The rows in the wait table would tell us all about the buffer busy waits in the interval we traced.

For example, we could execute the SQL in fig. 4

    select
            p1                    file_id,
            p2                    block_id,
            p3                    reason_code,
            count(*)              ct,
            sum(time_waited)/100  secs
    from
            wait
    group by
            p1, p2, p3
    order by
            sum(time_waited) desc
    ;

Figure 4: Sample query to identify worst BBWs

If you want greater precision, you can list all the waits, with timestamp, and a column called (optimistically perhaps) timestamp_nano.

If you want to identify the users with the worst wait problem, change the query to sum by session_index (SID) and session_serial (serial#). The table (synonym) connection can be used to turn (session_index, session_serial) into the username, machine name, logon time, and so on.
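As a sketch of that variation (the column names user_name, session_index, and session_serial are the ones I would expect from the synonyms, but check the actual columns in your version before relying on this):

    select
            w.session_index,
            w.session_serial,
            c.user_name,
            count(*)                ct,
            sum(w.time_waited)/100  secs
    from
            wait        w,
            connection  c
    where
            c.session_index  = w.session_index
    and     c.session_serial = w.session_serial
    group by
            w.session_index, w.session_serial, c.user_name
    order by
            sum(w.time_waited) desc
    ;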

Of course, there's nothing (except performance costs) stopping you from joining this table to the dba_extents view to translate file and block IDs into object types and names.
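A sketch of such a join follows; bear in mind that dba_extents is itself an expensive view to query, so run this on a quiet system or against a copy of the data:

    select
            e.owner,
            e.segment_name,
            e.segment_type,
            count(*)                ct,
            sum(w.time_waited)/100  secs
    from
            wait        w,
            dba_extents e
    where
            e.file_id = w.p1
    and     w.p2 between e.block_id and e.block_id + e.blocks - 1
    group by
            e.owner, e.segment_name, e.segment_type
    order by
            sum(w.time_waited) desc
    ;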

And if you want to locate the specific SQL that suffered the waits you can always, at a rather higher cost of course, switch to the sql_waits.fdf file, which populates a few more of the trace tables, then join them on session_index, session_serial, timestamp, and timestamp_nano.

Finally, if you are worried that the cost of loading the data into tables and running the reports will have a negative impact on the system, you can always move the cdf and dat files to another machine and process them in another database. I even managed, with just a little fixing, to generate collection data on a version 9i instance, and then process it on a version 8i instance, just to prove the point. This would, of course, restrict your ability to associate blocks with objects.

The Future

The possibilities are endless - for example, one of the facility definition files is called oraclec.fdf, and this tracks buffer cache activity. After tracing this, you can actually report, pretty much to the microsecond, the blocks that were loaded into the cache, in the order they were loaded, and which blocks they pushed out in order to be loaded. (I suspect the overhead is likely to be a bit too painful for general use.)

Another option that is likely to be generally useful is the connect.fdf file. This captures session connects and disconnects, in much the same way that the command audit session does. However, the trace file collects half a dozen extra session statistics (such as redo entries) which are not collected in the aud$ table; and it isn't writing to the database as it does so.

And for that very special individual - you can even set oracle_trace to target just one user. You could then write SQL to read the resulting tables and generate an exact clone of a normal sql_trace file, with the added benefit of being able to track the session from the moment it logs in, as it migrates across multi-threaded servers, until it finally logs out.

Conclusion

This is just a brief introduction to oracle_trace that barely scratches the surface of how you can use it. There is still some work to do in determining the stability and side-effects of oracle_trace, not to mention the performance impact.

However, initial investigations suggest that it is much cheaper than other built-in diagnostics, and much more flexible and precise in what it can report.

At the very least, it allows you to get some accurate answers to some of the vexing questions that have plagued DBAs over the years.

Personally, I wouldn't be particularly surprised if oracle_trace ended up replacing just about every other diagnostic strategy some time within the next couple of years.

Caveat

There are a number of differences, usually in nothing other than names, between the Oracle 8i and the Oracle 9i implementation of oracle_trace. This article was written purely from the perspective of Oracle 9i.

References

Oracle 9i Performance Tuning Guide and Reference, Chapter 12.

--

Jonathan Lewis is a freelance consultant with more than 18 years' experience of Oracle. He specialises in physical database design and the strategic use of the Oracle database engine, is the author of Practical Oracle 8i - Building Efficient Databases, published by Addison-Wesley, and is one of the best-known speakers on the UK Oracle circuit. Further details of his published papers, presentations, seminars, and tutorials can be found at http://www.jlcomp.demon.co.uk, which also hosts The Co-operative Oracle Users' FAQ for the Oracle-related Usenet newsgroups.


Last modified 2005-04-18 07:49 PM