Tuning Up for OLTP and Data Warehousing - Part 1

by Scott Hayes and Philip Gunning

Part 1  |  Part 2  |  Part 3

Everything you need to know to improve performance from experts who’ve tackled every kind of performance problem.

At conferences and other speaking engagements, we often say we’ve never met a database whose performance we couldn’t improve. We’re willing to put that claim in print and share tips and tricks gleaned from our combined 22 years’ experience with DB2 Universal Database (UDB).

Before we begin, let’s set the stage by defining the difference between online transaction processing (OLTP) and data warehouse (DW) databases. (Note: This article focuses on DB2 UDB for Unix, Linux, and Windows platforms. For DB2 UDB for z/OS and OS/390, see “Arm Yourself Against Performance Erosion.”)

OLTP VS. DW Databases

OLTP databases run business applications (such as PeopleSoft, Siebel, and SAP) with multiple users who require very rapid response times and accurate data. Frequently, the database must serve thousands of concurrent users who all want to press the Enter key or click a Submit button and get an immediate (and accurate) response. While users wait, a series of processing events occur. Business programming logic and SQL statements get executed. Returning a response may involve CPU, sort, and I/O times (but not lock time, you hope). When all these events are complete and the business function is performed, this transaction or unit of work is either committed (made permanent) or rolled back (erased).

Some OLTP database applications include batch-processing components, and some may allow concurrent decision-support queries. OLTP databases often run hundreds, if not thousands, of transactions per minute.

And they’re not simply returning answers to queries. OLTP databases support applications that perform INSERT, UPDATE, and DELETE functions that require contention management and locking technology. And, they must perform many transactions for many users very rapidly and efficiently. Nobody likes to wait.

In contrast, DW databases help companies solve complex business problems. They don’t run transactions at high rates of speed. Users ask complex business questions relevant to the available data that are delivered to the database via complex SQL (or by query tools that transform the user’s business inquiry into complex SQL). Generally, users don’t expect instantaneous results, although answers delivered quickly are always appreciated. Response times in a DW environment are typically measured in minutes rather than seconds. However, response time requirements vary significantly based on business needs.

Unlike OLTP, DW databases are mostly read-only. They're not generally available 24x7 because data is usually loaded via batch processing during off hours. Parallelism (both CPU and I/O) plays a huge role in returning answers to these complex queries in a reasonable timeframe.

DB2 exploits the read-only nature of DW databases to make maximum use of parallelism. To understand parallelism, think of washing a car. A person can wash a car alone, but the job is finished much more quickly with the help of some friends, multiple buckets of soapy water, and multiple hoses. In DB2, the car wash friends are called Agents and IO_Servers. The Coordinator Agent tells the Subagents how to divide up the work and get it done quickly. IO_Servers are responsible for bringing boatloads of data into DB2's memory very quickly so the Agents can process it. Parallelism is the essence of DW databases.

With OLTP, the jobs should be very small and efficient. Parallelism (by which we mean intrapartition parallelism on a single SMP server) is neither necessary nor desirable. Parallelism in OLTP would be like having a group of five people on hand to watch one person climb a ladder to install a light bulb. Please note that we're not talking about parallelism in terms of a clustered environment. We'll cover parallelism in more detail later in this article.
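If you want to see where these knobs live, intrapartition parallelism is controlled at the instance and database level. As a minimal sketch (SAMPLE is a hypothetical database name, and the values are illustrative, not recommendations), a DW instance might enable it like this:

db2 update dbm cfg using INTRA_PARALLEL YES
db2 update db cfg for SAMPLE using DFT_DEGREE ANY

For an OLTP instance, you would leave it off:

db2 update dbm cfg using INTRA_PARALLEL NO

INTRA_PARALLEL is a database manager parameter, so the instance must be recycled (db2stop and db2start) before a change takes effect.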

Despite their differences, OLTP and DW databases do share some common elements. Certain configuration parameters in DB2 must be properly set to ensure that DB2 has sufficient resources to perform the processing required. (Back to the car wash analogy, if water isn’t available, the car isn’t going to get washed no matter how many people are willing to wash it.) These settings differ somewhat depending on your workload. And SQL has an enormous impact on both kinds of databases. Here are our recommendations for both kinds of workloads.

Configuration Parameters

Between 20 and 25 percent of the performance tuning improvement for both kinds of databases results from properly configuring database manager parameters, database parameters, and tablespace container settings.

MAXFILOP

The default value for the database configuration parameter MAXFILOP is far too small for most databases. When this value is too small, DB2 spends a lot of extra CPU processing time closing and opening files to be a good citizen to other processes running in the operating system. This slowdown affects both OLTP response times and DW query results. Issue the command db2 get snapshot for database on DBNAME and look at the value for Number of files closed. If this value is greater than zero, keep incrementally increasing MAXFILOP until DB2 stops closing files.
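For example, assuming a database named SAMPLE (substitute your own database name; the grep filter assumes a Unix-style shell), the check-and-adjust cycle looks like this:

db2 get snapshot for database on SAMPLE | grep -i "files closed"
db2 update db cfg for SAMPLE using MAXFILOP 512

The value 512 is only an illustrative next step, not a recommendation. Take a fresh snapshot after a representative workload and raise MAXFILOP again if files are still being closed.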

MAXAGENTS

You must have enough DB2 Agents available to process the workload based on the number of users accessing the database. Too few Agents, and DB2 starts stealing Agents from one user’s connection to process the work of another. This stealing is like employee raiding. There’s a lot of overhead that isn’t relevant to getting the work at hand done. Issue the command db2 get snapshot for database manager and look at the value for Agents stolen from another application. If this value is greater than zero, incrementally increase the value of MAXAGENTS until the thefts stop.
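The corresponding check and adjustment, with 400 as a purely illustrative value:

db2 get snapshot for database manager | grep -i "stolen"
db2 update dbm cfg using MAXAGENTS 400

Because MAXAGENTS is a database manager parameter, the new value generally takes effect the next time the instance is started.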

CATALOGCACHE_SZ

Sufficient memory allocated to the CATALOGCACHE_SZ database configuration parameter benefits both OLTP and DW databases. When preparing execution strategies for SQL statements, DB2 checks this cache to learn about the definition of the database, tablespaces, tables, indexes, and views. If all the required information is available in the cache, DB2 can avoid disk I/Os and shorten plan preparation times.

Having a high catalog cache hit ratio (95 percent or better) is key for OLTP database applications. Keep increasing CATALOGCACHE_SZ until you reach 95 percent. Issue the command db2 get snapshot for database on DBNAME and compute the hit ratio using the following formula:

100 - ((Catalog cache inserts X 100) / Catalog cache lookups)

You should also increase the CATALOGCACHE_SZ if the value of Catalog cache overflows is greater than zero. And, if the value of catalog cache heap full is greater than zero, both DBHEAP and CATALOGCACHE_SZ should be proportionally increased.
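To make the arithmetic concrete: a snapshot showing 10,000 Catalog cache lookups and 400 Catalog cache inserts gives 100 - ((400 X 100) / 10,000) = 96 percent, just above the target. Checking and raising the cache might look like this, again assuming a hypothetical database named SAMPLE (note that CATALOGCACHE_SZ is specified in 4KB pages, and 512 is illustrative):

db2 get snapshot for database on SAMPLE | grep -i "catalog cache"
db2 update db cfg for SAMPLE using CATALOGCACHE_SZ 512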

LOCKTIMEOUT

The LOCKTIMEOUT default value is -1, which means that user connections can wait indefinitely to get the locks they need. Problems occur when users leave for the night without releasing locks held by their DB2 work. In a DW database, set this value to 60 seconds.

In an OLTP database, set LOCKTIMEOUT to 10 seconds. If your transaction can't get the locks it needs to complete its work (and it may be holding locks of its own while it waits), it should time out quickly and get out of the way of other transactions before the entire house of cards falls down. Otherwise, so many transactions may end up waiting on locks they can't obtain that the whole application locks up or performs very slowly.
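Either way, the change is a one-line command. The database names here are hypothetical, and the units are seconds:

db2 update db cfg for DWDB using LOCKTIMEOUT 60
db2 update db cfg for OLTPDB using LOCKTIMEOUT 10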

TEMPSPACE TABLESPACE

Remember to define TEMPSPACE tablespaces so that they have at least three or four containers across different disks, and set the PREFETCHSIZE to a multiple of EXTENTSIZE, where the multiplier is equal to the number of containers. Doing so will enable parallel I/O for larger sorts, joins, and other database functions requiring substantial temporary space.
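As a sketch, a system temporary tablespace spread across four disks might be defined as follows (the tablespace name, container paths, and sizes are illustrative):

CREATE TEMPORARY TABLESPACE TEMPSPACE2
  MANAGED BY SYSTEM
  USING ('/disk1/db2tmp', '/disk2/db2tmp', '/disk3/db2tmp', '/disk4/db2tmp')
  EXTENTSIZE 32
  PREFETCHSIZE 128

With four containers, PREFETCHSIZE is set to 4 X EXTENTSIZE (32), so a single prefetch request can drive all four disks in parallel.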

Even though an OLTP application may be finely tuned to avoid large sorts, occasional decision-support queries do find their way into OLTP databases.

--

Scott Hayes is a well-known DB2 UDB performance expert. He is a regular speaker at International DB2 User Group conferences and IBM DB2/Data Management conferences, and is frequently sought as a guest speaker for regional DB2 user group meetings. Scott has authored several published articles on DB2 UDB performance, is an IBM DB2 GOLD Consultant (an elite group of IBM-recognized top DB2 advocates), has obtained advanced IBM DB2 certifications, and is widely regarded by the worldwide DB2 community as the top performance expert for DB2 on distributed (UNIX, Linux, and Windows) platforms. Most recently, he co-founded Database-Brothers Incorporated (DBI), a group of database performance and accountability specialists.

Philip Gunning started Gunning Technology Solutions, LLC in 2003. He is a Principal Consultant specializing in DB2 with Gunning Technology Solutions. You might even call him a DB2 Evangelist. Phil has over 17 years' experience in the Information Technology industry. Prior to starting his own company, Phil held senior positions with DGI and others, and has consulted with Fortune 500 companies and with government. Phil has presented at IDUG North America, IDUG Europe, and DB2 user groups all over North America. You can reach him at pgunning@breakthroughdb2.com.

