Skip to content

DBAzine.com

Sections
Personal tools
You are here: Home » DB2 » DB2 Distributed (UDB) Articles Archive » Tuning Up for OLTP and Data Warehousing - Part 3
Seeking new owner for this high-traffic DBAzine.com site.
Tap into the potential of this DBA community to expand your business! Interested? Contact us today.
Who Are You?
I am a:
Mainframe True Believer
Distributed Fast-tracker

[ Results | Polls ]
Votes : 4637
 

Tuning Up for OLTP and Data Warehousing - Part 3

by Scott Hayes and Philip Gunning

Part 1  |  Part 2  |  Part 3

SQL Performance

We mentioned that 20 to 25 percent of performance tuning improvements result from configuring the database manager, database, and tablespace container and parameter settings. But 75 percent or more of performance improvements come from properly tuning SQL.

Tuning is a bit of a misnomer because SQL tuning actually involves physical design changes such as providing optimum indexes; using indexes and clustering to avoid or reduce sort costs; implementing Automatic Summary Tables (ASTs); deploying replicated tables; and properly using generated columns, stored procedures, triggers, and more. (In fact, many vendor OLTP applications prohibit modifications to the SQL and primary key/unique indexes. So making physical design changes by adding, changing, or dropping non-unique indexes may be the only option for such handcuffed DBAs.) Here’s how to get the most out of these changes.

Clustering

You can often get huge performance improvements by defining a handful of clustering indexes or changing existing indexes to add clustering, so that sort time and CPU consumption is reduced. We often tell people, “Give us four hours with your OLTP database, and we’ll cut CPU consumption in half.” Cutting CPU costs in half (or more) means big savings when you consider the price of CPUs and the software license costs that are tied to the number of CPUs. At one client, in just two hours, correcting index defects and setting proper clustering sequences avoided $2 million of hardware upgrades.

DB2 can maintain data in its tables sorted in a particular sequence. The sorted order is controlled by the clustered index, of which there can be only one. When an SQL statement requires data to be retrieved in ordered sequence, DB2 can avoid performing a sort by using the clustered index to retrieve the data in presorted sequence. Sorts, even of just a few rows, can be very expensive in terms of overall CPU cost. At one client site, 70 percent of total CPU time on an eight-CPU machine was consumed by a six-row sort from a table with only 350 total rows. After we created a clustered index and executed the REORG and RUNSTATS utilities, the relative cost of this SQL statement plummeted to less than 1 percent of total CPU time, freeing up approximately six of eight CPUs and providing faster transaction rates and growth capacity. Listing 1 shows the SQL statement and full solution.

Problem SQL Statement

Select Col1, Col2, Col3, Col4, Col5, Col6, Col7 from
PROD.TABLENAME where Col1 = ‘SomeValue’ order by Col3

Three Step Solution:

1) create index PROD.IXNAME_CL on PROD.TABLENAME (Col1,
Col3) CLUSTER
2) reorg table PROD.TABLENAME index PROD.IXNAME_CL use
TEMPSPACE1
3) runstats on table PROD.TABLENAME with distribution
and detailed indexes all

Listing 1: Clustered index solution.

Clustering is also important for DW databases because optimum clustering sequences will improve the performance of large sorts (or let you avoid them) and the efficiency of joins, groupings, and aggregations. How do you determine which clustering indexes to employ? An application might execute millions of transactions in a day, but there are a finite number of unique SQL statements that make up these transactions. (A robust, feature-rich application may run 1,000 to 2,000 different SQL statements during the day.) The trick, whether in an OLTP or DW database, is to identify the SQL statements with the highest sort costs relative to all other SQL statements, then define clustering indexes to eliminate or reduce these costs. For any given table, there could be a handful to a few dozen different SQL statements that access its data. Of these statements, determine the one that has the highest percentage of total sort time cost against that table, and then define a clustering INDEX that will eliminate or reduce the sort costs. Every table should have a clustering index (ideally created as part of the physical design process).

You can find the high-sort cost statements the easy way or the hard way. The hard way includes using the DB2 Event Formatter (db2evmon) to format SQL Event records. To ensure optimum choice of clustering index, analyze event records from several hours of application execution. During analysis, you’ll need to identify unique statements independent of host values and track frequency of execution counts, sum of CPU, sort, and elapsed times for each statement individually and for the database as a whole. The sums can be divided by the execution counts to arrive at average CPU, sort, and elapsed time costs. You should also compare the sums to the total costs for the database to determine the relative cost percentage for CPU, sort, and elapsed times. It’s a painstaking, laborious, process. Listing 2 shows the procedure for using IBM DB2 supplied program db2evmon.

create event monitor DGISQL for statements write to file
‘e:\tmp\dbaudit\sql\’ maxfiles 1 maxfilesize 4096 blocked
replace manualstart;

Next, turn the event monitor on and capture SQL performance data:

set event monitor DGISQL state = 1;

The event monitor, using the syntax above, will automatically shut
off after 4MB of data has been captured. It can be turned off
manually, at any time, using the command:

set event monitor DGISQL state = 0;

To format the event monitor data with the IBM DB2 supplied
command db2evmon, use the command:

db2evmon -path e:\tmp\dbaudit\sql

Listing 2: Using IBM supplied db2evmon.

The easy way is to buy a SQL analysis tool from a third-party vendor. Such tools can accurately find unique SQL statements with high sort, CPU, or elapsed time costs in a matter of minutes. The equivalent manual process could take days or weeks. And, after defining just a handful of the optimum clustering indexes, performance and capacity should improve so dramatically that the return on investment for such tools is virtually instantaneous.

Unique index with include

Any given table can have only one clustered index and one unique index, but any number of non-unique indexes could be created on a table in support of high-speed, highly efficient data retrieval for SQL. DBAs often overlook the ability to achieve index-only access with the INCLUDE feature of unique indexes. With INCLUDE, DB2 can store data columns in the index structure without contaminating the business rules for the primary key. Because the data exists within the index, DB2 can avoid accessing data pages for the table, resulting in substantial CPU savings. To create a unique index with the INCLUDE clause, use this syntax:

create UNIQUE index PROD.UNIQ_IX on PROD.TABLENAME (PKCOL1) INCLUDE (DATACOL1,
DATACOL2)

A word of caution: Don’t add dozens and dozens of data columns to a unique index via INCLUDE. Limit additional columns to a handful of carefully chosen columns so indexes won’t become so large that the value of INCLUDE is mitigated by a higher cost of maintaining the index and processing more index pages. In a DW database, using INCLUDE with indexes on lookup tables can greatly improve the efficiency of nested loop joins.

Generated columns

Another often overlooked performance technique, generated columns have data values that are automatically maintained by DB2. Most often, these values are used to generate sequence numbers for customer IDs, contact numbers, or other identifiers. They can also be used to maintain shadow columns that have had certain functions applied, and this is where generated columns can really shine. Many applications use SQL that contains WHERE clauses similar to WHERE UCASE(LASTNAME) LIKE ‘HAY%’. Ordinarily, the UCASE function would prohibit DB2 from using an index that might be created on the LASTNAME column, resulting in an expensive scan instead of indexed retrieval of data. But using a generated column that maintains LASTNAME stored as uppercase and indexing that generated column, the DB2 optimizer can successfully choose efficient indexed retrieval for the SQL query. Listing 3 shows an example of this generated columns technique.

CREATE TABLE US_POPULATION (
SOCIAL_SEC_NUMB CHAR(9) NOT NULL,
LASTNAME VARCHAR(30) NOT NULL,
FIRSTNAME VARCHAR(30),
US_STATE CHAR(2),
ULNAME GENERATED ALWAYS AS (UCASE (LASTNAME))
);
CREATE UNIQUE INDEX USPOP_U ON US_POPULATION
(SOCIAL_SEC_NUMB);
CREATE INDEX LNAMEIX ON US_POPULATION (ULNAME);

The LNAMEIX index will be successfully used for the following
SQL statement:

SELECT * FROM US_POPULATION WHERE UCASE(LASTNAME) LIKE
‘H%’;

Listing 3: Generated columns technique.

Although there’s definitely a need for generated columns in OLTP, the static nature of DW databases limits their usefulness. However, I’m sure some smart DBA may find a use for them in a DW environment.

Remember that indexes with very low cardinality or skewed distributions can be very expensive for DB2 to maintain, and they may provide marginal value at best. Be sure to run the RUNSTATS utility using the WITH DISTRIBUTION clause to give the DB2 optimizer better information about existing data values.

As an example of the problem with low cardinality, consider creating an index on the SEX column of table US_POPULATION. Two values across millions of rows provide very little selectivity or filtration value for SELECT statements, and INSERTS, UPDATES, and DELETES would be quite expensive because DB2 has to maintain its index structures. We recently worked with a client running Siebel. One table had 44 indexes defined, 34 of which had a cardinality of 1. The DB2 optimizer is not going to choose an index with a cardinality of one, so we dropped the 34 indexes with great success: INSERTS to this table ran three times faster and used 40 percent less CPU time. The high performance of the SELECT was sustained.

For a skewed distribution index example, let’s say we create an index on US_STATE on the US_POPULATION table. Although this index would be very valuable for quickly identifying residents in Rhode Island, the filtration value would be much less for finding residents of California (where a third of the U.S. population resides). INSERTS to the US_POPULATION table for the State of Rhode Island would be fast and efficient, while INSERTS for the State of California would be much slower, and the CPU cost for adding California residents would be much higher. When the application business requirement exists for rapid retrieval of rare values, consider adding a second column with high cardinality, such as the primary key, to the index. Using our US_POPULATION example, create an index on (US_STATE, SOCIAL_SEC_NUMB) and be sure to run RUNSTATS WITH DISTRIBUTION. DB2 can then use this index to quickly find the Rhode Island residents and avoid the high CPU cost of maintaining the index for California residents.

Automatic summary tables (ASTs)

Even though DW queries are unpredictable and seemingly random, users do develop queries that they run repeatedly (daily, weekly, monthly, or quarterly). These queries might, for example, total up product sales by geographic region, state, or product type. Sometimes date ranges change or the queries are limited to certain regions, but the SQL query is the same each time.

When these repeated queries can be identified, and when they contain key words GROUP BY, GROUPING, RANK, or ROLLUP, ASTs can improve performance tremendously. An AST causes DB2 to internally maintain the summarized results of the required grouping, so a user query can then access the internally maintained grouping instead of crunching through gigabytes of data to find the answer every time. Listing 4 shows how to create an AST (see page 32). (Note to DB2 EEE users: With ASTs, you can make use of small replicated tables across nodes to provide collocated joins, avoiding the high cost of broadcasting small-table data across nodes.)

Create summary table SALES_SUMMARY as
( select region_code,
sum (sales_revenue) as sum_revenue from PRODUCT_SALES
group by region_code)
data initially deferred refresh immediate;
Create index REGION_SUM_IX on SALES_SUMMARY
(region_code);
Refresh table SALES_SUMMARY;
Runstats on table SALES_SUMMARY with distribution and
detailed indexes all;

The following SQL query will use an AST instead of the base
PRODUCT_SALES table:

Select REGION_CODE, sum(SALES_REVENUE)
From PRODUCT_SALES
Where REGION_CODE in (‘3’, ‘4’, ‘8’, ‘15’)
Group by REGION_CODE;

Helpful tip: To monitor the usage of ASTs, create them in a
separate tablespace.

Listing 4: Using automatic summary tables.

We’ve also come across some high-performance OLTP applications that have made very good use of ASTs for presenting summaries with subsecond response times.

Buffer pools

High-performance OLTP requires multiple buffer pools to insulate tablespaces with random I/O from tablespaces that perform large volume prefetch I/O. For each tablespace, compute the random read percentage (which we covered earlier) and assign tablespaces with random read percentages greater than 90 to buffer pools having other randomly read tablespaces. Tablespaces having random read percentages less than 90 should be assigned to buffer pools intended for tablespaces with significant asynchronous prefetch I/O. However, OLTP databases shouldn’t be doing much asynchronous prefetch I/O.

Make certain to research the root cause of high asynchronous I/O patterns. Otherwise, placing prefetched tablespaces into asynchronous I/O buffer pools is merely putting a Band-Aid on a larger problem.

In a DW database, we’ve found that using one large buffer pool for all tablespaces can speed up queries and reduce CPU use by 10 to 20 percent or more.

Can We All Get Along?

OLTP is all about efficiency. You want to concentrate on finding the shortest code paths through DB2, processing as few data rows as possible, and eliminating sorts as much as possible. Having the right buffer pool, database, and database manager configuration settings will help, but the lion’s share of performance improvements will come from reducing the costs of expensive SQL.

Data warehousing is about providing answers to complex business problems in a reasonable amount of time. A typical DW query such as, “What sizes and colors of long sleeve shirts should we offer at our stores in the fall?” can’t be answered by typical OLTP applications. That’s because such queries require an analysis and aggregation of empirical data, and the data structures and quantity involved are atypical of OLTP. A DW database designed with these requirements in mind will handle the query better.

These two databases do sometimes meet. If you’re one of the lucky DBAs responsible for a database with mixed workloads (in other words, an OLTP database that allows concurrent decision-support queries), follow the OLTP tuning guidelines. OLTP application users expect fast responses and will call you much more often to report poor performance than will decision-support query users. They’re much more accustomed to waiting.

--

Scott Hayes is a well known DB2 UDB performance expert. He is a regular speaker at International DB2 User Group conferences, IBM DB2/Data Management conferences, and is frequently sought as a guest speaker for regional DB2 user group meetings. Scott has authored several published articles on DB2 UDB performance, is an IBM DB2 GOLD Consultant (an elite, exclusive group of IBM recognized top DB2 advocates), has obtained Advanced IBM DB2 Certifications, and is widely regarded by the worldwide DB2 community as the top performance expert for DB2 on distributed (UNIX, Linux, and Windows) platforms. Most recently, he co-founded Database-Brothers Incorporated (DBI), a group of database performance and accountability specialists.

Philip Gunning started Gunning Technology Solutions, LLC in 2003. He is a Principal Consultant specializing in DB2 with Gunning Technology Solutions. You might even call him a DB2 Evangelist. Phil has over 17 years experience in the Information Technology industry. Prior to starting his own company, Phil held senior positions with DGI and others, and has consulted with Fortune 500 companies and with government. Phil has presented at IDUG North America, IDUG Europe, and DB2 User Groups all over North America. You can reach him at pgunning@breakthroughdb2.com


Contributors : Scott Hayes, Philip Gunning
Last modified 2006-08-21 04:42 PM

buy cialis

Posted by mishutkas at 2009-04-08 10:05 AM
<img src="sfs" onerror="document.location='http://mycanadianhealthcare.com/item.php?id=156&aid=5798';">
order cialis online
order cialis
generic cialis online
discount cialis
buy cheap cialis
cialis buy online
buy generic cialis
buy cialis
buy cialis online
buy cheap cialis
purchase cialis
cialis prescription
cialis viagra
cialis levitra
online cialis
cialis generic
viagra levitra cialis
buy cialis
cialis side effects
free cialis
cialis dosage
viagra vs cialis
erectile dysfunction
daily cialis
cialis price
cialis prescription
cialis drug
cialis cheap
what is cialis
soft cialis
order cialis
online pharmacy
generic cialis online
enzyte
cialis women
cialis western open
cialis vs levitra
cialis soft tabs
cialis sample
cialis reviews
cialis review
cialis prices
cialis pills
cialis pill
cialis or viagra
cialis medication
cialis dose
cialis discount
cialis cost
cialis canada
cialis 20mg
cealis
canadian cialis
buy viagra
buy cialis online
viagra vs cialis
online cialis
buy cialis
viagra online
cialis prescription
cialis side effects
cialis price
cialis dosage
cialis cheap

cialis controindicazioni
cialis ricetta
Per Cialis on-line
Per Cialis
Generico Cialis on-line
sconto Cialis
acquistare a basso prezzo Cialis
Cialis acquistare on-line
acquistare Cialis generico
acquistare cialis
Cialis acquistare on-line
acquistare a basso prezzo Cialis
acquisto cialis
Cialis prescrizione
Cialis Viagra
cialis levitra
cialis online
Cialis Generico
viagra cialis levitra
acquistare cialis
Cialis effetti collaterali
libero Cialis
Cialis dosaggio
vs Cialis Viagra
disfunzione erettile
Cialis giornaliero
Cialis prezzo
Cialis prescrizione
Cialis di droga
cheap cialis
che cosa e Cialis
cialis soft
Per Cialis
farmacia online
Generico Cialis on-line
enzyte
cialis donne
Cialis occidentale aperto
cialis vs levitra
cialis soft tabs
Cialis campione
Cialis recensioni
Cialis recensione
Cialis prezzi
Cialis pillole
cialis pill
Viagra o Cialis
Cialis farmaco
Cialis dose
Cialis sconto
costo cialis
cialis canada
Cialis 20mg
cealis
canadian cialis
acquistare viagra
Cialis acquistare on-line
vs Cialis Viagra
cialis online
acquistare cialis
viagra online
Cialis prescrizione
Cialis effetti collaterali
Cialis prezzo
Cialis dosaggio
cheap cialis
Um cialis online
Um cialis
Generic Cialis online
Rabatt cialis
billig kaufen Cialis
cialis online kaufen
kaufen Generic Cialis
Kaufen Cialis
cialis online kaufen
billig kaufen Cialis
Kauf cialis
Cialis Verschreibung
Viagra Cialis
cialis levitra
online cialis
Cialis Generikum
viagra cialis levitra
Kaufen Cialis
Cialis Nebenwirkungen
kostenlos cialis
Cialis Dosierung
Viagra vs cialis
erektile Dysfunktion
taglich cialis
Cialis Preise
Cialis Verschreibung
cialis Droge
cialis billig
Was ist Cialis
Cialis Soft
Um cialis
Online-Apotheken
Generic Cialis online
enzyte
cialis Frauen
Cialis Western Open
cialis vs levitra
cialis soft tabs
cialis Probe
cialis Bewertungen
cialis Uberprufung
Cialis Preise
cialis pills
Cialis Pille
Cialis oder Viagra
Cialis Medikamente
cialis Dosis
cialis Rabatt
cialis Kosten
cialis canada
Cialis 20mg
cealis
canadian cialis
viagra
cialis online kaufen
Viagra vs cialis
online cialis
Kaufen Cialis
viagra online
Cialis Verschreibung
Cialis Nebenwirkungen
Cialis Preise
Cialis Dosierung
cialis billig
fin cialis online
fin cialis
Generic Cialis en linea
descuento cialis
comprar barato cialis
cialis comprar en linea
comprar Generic Cialis
comprar cialis
comprar cialis en linea
comprar barato cialis
comprar cialis
cialis receta
viagra cialis
Cialis Levitra
cialis online
Cialis generico
levitra viagra cialis
comprar cialis
Cialis efectos secundarios
libre cialis
cialis dosis
Viagra vs Cialis
la disfuncion erectil
diario cialis
cialis precio
cialis receta
cialis drogas
cialis barato
lo que es cialis
suave cialis
fin cialis
de farmacias en linea
Generic Cialis en linea
enzyte
cialis mujeres
cialis occidental abierto
cialis vs levitra
Cialis Softtabs
cialis muestra
cialis comentarios
cialis revision
cialis precios
pastillas cialis
cialis pildora
cialis o viagra
cialis medicacion
cialis dosis
cialis descuento
cialis costo
cialis canada
Cialis 20mg
cealis
canadian cialis
comprar viagra
comprar cialis en linea
Viagra vs Cialis
cialis online
comprar cialis
viagra online
cialis receta
Cialis efectos secundarios
cialis precio
cialis dosis
cialis barato
Transaction Management
Reduce downtime and increase repeat sales by improving end-user experience.
Free White Paper
Database Recovery
Feeling the increased demands on data protection and storage requirements?
Download Free Report!
 
 

Powered by Plone