Skip to content

DBAzine.com

Sections
Personal tools
You are here: Home » Oracle » Oracle Articles Archive » Perils and Pitfalls in Partitioning - Part 1
Seeking new owner for this high-traffic DBAzine.com site.
Tap into the potential of this DBA community to expand your business! Interested? Contact us today.
Who Are You?
I am a:
Mainframe True Believer
Distributed Fast-tracker

[ Results | Polls ]
Votes : 3549
 

Perils and Pitfalls in Partitioning - Part 1

by Arup Nanda

Part 1  |  Part 2

Partitioning is a favorite topic for authors, presenters, and general DBA community, but most of the papers dwell on the basics and fundamental concepts behind partitioning. The invariable action of most DBAs, after learning the ropes, is to jump into their databases with partitioning in mind. This article describes some of the potential problems — little or non-documented features that may create unanticipated (and unwanted) situations to which you should be alert, and how to resolve them. Understanding these potential problems will go a long way in designing a proper partitioning scheme for your database. Caution: to get the most from this article, you should already have basic knowledge about partitioning; this article is not a primer on that subject.

Plan Table Revisited

Before we begin, let's touch upon a very familiar table for identifying query execution paths that's been available for a long time — the PLAN_TABLE. You've certainly been using this table already, to identify the optimizer plan of a statement. We will examine three specific columns in this table (four, in Oracle9i) that are important for the partitioning option. Here is a basic explanation of these columns.

PARTITION_START When the optimizer searches a range of partitions for data, this column indicates the PARTITION_ID of the starting partition in that range.
PARTITION_STOP When the optimizer searches a range of partitions, this column indicates the PARTITION_ID of the last partition of the range.
PARTITION_ID Each step in the optimizer plan is identified by a unique number called STEP_ID. This column displays the STEP_ID of the step in PLAN_TABLE that decided the beginning and ending PARTITION_IDs.
FILTER_PREDICATES       The exact condition used to evaluate and arrive at the start and stop PARTITION_IDs (9i only).

More information and explanation about these columns will be provided later in the document along with examples.

The New Tool DBMS_XPLAN

It might be useful to describe an exciting tool available in 9i, a new package called DBMS_XPLAN, which is useful for querying the PLAN_TABLE data. Instead of writing a complicated SQL statement to see the optimizer plan from the PLAN_TABLE, a call to the DBMS_XPLAN displays the optimizer plan in a formatted fashion, making it easier to use. To select the optimizer plan for the last "explain plan" statement, simply use the query,

select * from table(dbms_xplan.display(format=>'BASIC'))

Using the operator TABLE() (or, performing a CAST operation) makes the return values from the function behave just like rows in a table so they can be queried as if being selected from a relational table.

PLAN_TABLE_OUTPUT
----------------------------------------------


--------------------------------------------
| Id | Operation | Name |
--------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | SORT AGGREGATE | |
| 2 | NESTED LOOPS | |
| 3 | TABLE ACCESS FULL | PTEST3HA |
| 4 | TABLE ACCESS FULL | PTEST3HB |
--------------------------------------------

The display() function takes three arguments:

TABLE_NAME The name of the table in which the optimization plan is stored; defaults to PLAN_TABLE.
STATEMENT_ID     The statement ID from the plan table mentioned earlier. By default, it takes the last ID, or NULL.
FORMAT This controls the way the display is formatted (explained later in detail).

Let's examine the last parameter, FORMAT, which is used to control how the output is displayed. It accepts four values as follows:

BASIC It provides only the minimum amount of information, as in case of the example above, similar to a query from PLAN_TABLE directly.
TYPICAL    This is the default value. It provides a variety of the information useful for understanding how the optimizer works for this statement. For instance, in case of partitioned table operation, the columns PARTITION_START, PARTITION_STOP, PARTITION_ID, and FILTER_PREDICATES are displayed in addition to COST for that step, the number of rows expected to be retrieved, and number of bytes those rows may have. This provides the information to understand statements involving partitioned objects.
ALL This setting displays all the information displayed for the BASIC and TYPICAL values, and also displays parallel query operations and the related SQL statements, if those are involved.
SERIAL This setting gets results similar to those retrieved by the TYPICAL setting, but the queries are explained serially even if a parallel query will be used.

Needless to say, the BASIC setting does not provide much information pertaining to partitioned objects, so the TYPICAL setting is recommended. However, the BASIC setting also widens the display. Before running the query, you should make the line size 120 or more. Here is the an output from the same query cited above using format=>'TYPICAL' or with no parameters:
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------


-------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |
-------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 575 | 4 | | |
|* 1 | TABLE ACCESS FULL | PTEST1 | 5 | 575 | 4 | 2 | 2 |
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("PTEST1"."COL1"=1500)

Note: cpu costing is off

14 rows selected.

This example shows that the optimizer will search only partitions with PARTITION_IDs from 2 to 2; i.e., it will search only PARTITION_ID 2. The decision to search that partition was made at STEP_ID 1, as displayed under Predicate Information below the formatted output. The result also mentions that the optimizer decided to select the step based on the information provided to it from the query (or, the filter predicate in the query COL1=1500. This kind of information is extremely useful determining optimizer plans for partitioned objects.

Partition Pruning or Elimination

Given this background information, let's jump into our discussion on partitioning mysteries. The main advantage of partitioning comes when the optimizer chooses the data in a specific partition only, where the requested data will be found and not all the partitions. For instance, consider a table, SALES, partitioned on ORDER_DATE, with one partition per quarter. When the following query is issued,

SELECT … FROM SALES
WHERE ORDER_DATE = ‘1/1/2003’

the optimizer does not go through the entire table, but only the partition that houses the rows for the order date, which is 2003 Quarter 1. This way, full table scans are limited to a specific partition only, saving significant I/O. When the optimizer chooses to scan only some partitions and not all, this is known as “partition pruning” or “elimination.”

But that is a basic property of partitioning — nothing new there. The important question is, how you can ensure that the partition pruning or elimination has indeed occurred? You can do so by explaining the query first and querying the PLAN_TABLE. Consider a table created as follows:

create table ptest1
(
col1 number,
col2 varchar2(200),
col3 varchar2(200)
)
partition by range (col1)
(
partition p1 values less than (1001),
partition p2 values less than (2001),
and so on partition p9 values less than (9001), partition pm values less than (maxvalue) );

Now, we will insert several records into this table so that each partition will have at least one record; then, we'll analyze the table:

insert into ptest1
select rownum, object_type, object_name
from all_objects
where rownum < 10001;
commit;

Next, we'll examine the optimization plan for a query that will be issued on the table PTEST1 as follows:

EXPLAIN PLAN FOR
SELECT * FROM PTEST1
WHERE COL1 = 1500;

This populates the PLAN_TABLE with the optimization plan records. Now, we'll choose the plan using the “SELECT ...” query. (Note: To perform the actions shown in this article, you will be using this query a lot; you should save the query in a script named plan.sql. The column FILTER_PREDICATES will be found only in Oracle9i, so remove the column from this query when running against an Oracle8i database.

select id, lpad(' ',level*1-1)||operation||' '||options||' on 
'||object_name operation,
partition_start PB, partition_stop PE,
partition_id, filter_predicates
from plan_table
connect by parent_id = prior id
start with parent_id is null;

The result is as follows:

ID OPERATION                       PB PE  PI
--- ------------------------------ -- -- ---
FILTER_PREDICATES
--------------------
0 SELECT STATEMENT on

1 TABLE ACCESS FULL on PTEST1 2 2 1
"PTEST1"."COL1"=1500

This could have been done via DBMS_XPLAN.DISPLAY(), too, but to make it version independent, we'll use PLAN_TABLE. Look at the partition_start and partition_stop columns; values are both 2, indicating that the data will be selected from partition number 2 only. This is not expected, since the value 1500 will be available in partition 2 only. How does the optimizer know which partition to look for? It does so at the Step ID 1 in the optimization plan as indicated by the column PARTITION_ID in PLAN_TABLE.

Finally, we also know that the optimizer applied a filter to retrieve rows as in the column FILTER_PREDICATES. This explains how the optimizer came up with the plan and from which segments it will select. This type of analysis will be most helpful when you are testing the different partition pruning scenarios.

Let's introduce another complexity to the mix — subpartitioning. Consider a table created as follows:

create table ptest2
(
col1 number,
col2 varchar2(200),
col3 varchar2(200)
)
partition by range (col1)
subpartition by hash (col2)
subpartitions 4
(
partition p1 values less than (1001),
partition p2 values less than (2001),
and so on… partition p9 values less than (9001), partition pm values less than (maxvalue) );

We will insert rows in the same manner as the example used previously, and analyze the table. Then we will issue the query as follows:

EXPLAIN PLAN FOR
SELECT COL2 FROM PTEST2 WHERE COL1 = 9500
AND COL2 = 'PROCEDURE';

Here, the query is forced to select from a subpartition, as the filter is based on the partitioning as well as the subpartitioning key. The query on PLAN_TABLE shown earlier displays the following output:

  ID OPERATION                     PB PE  PI
---- ----------------------------- -- -- ---
FILTER_PREDICATES
--------------------------------------------
0 SELECT STATEMENT on

1 TABLE ACCESS FULL on PTEST2 38 38 1
"PTEST2"."COL1"=9500 AND "PTEST2"."COL2"='PROCEDURE'

Note the PARTITION_START column; it shows 38 — but we don't have that many partitions. Actually, the number 38 reflects the count of subpartitions, not partitions. In this example, the number of subpartitions in a partition is four, so the first nine partitions in the table contain the first 36 subpartitions. The thirty-seventh and the thirty-eighth subpartitions exist in the tenth partition. The tenth partition is the partition PM, making the highlight subpartition the second one under that. If you look at the query, the optimizer correctly selected the partition PM for elimination.

Make note of this feature to avoid confusion - the PARTITION_START and PARTITION_STOP columns also point to subpartitions, if they are involved, not just partitions.

Partition-wise Joins

When a partitioned table is joined to another partitioned table in such a way that partitioning keys determine the filtering, the optimizer can determine that it does not need to search the whole table, but just the partitions in which the data resides. For instance, consider the table SALES range, partitioned on the SALES_DATE column, the table REVENUE range, partitioned on the BOOKED_DATE column, and the partitioning schemes (the boundary values of the partitions are the same). These tables are "equi-partitioned." If the user queries using the following:

SELECT … FROM SALES S, REVENUE R
WHERE S.SALES_DATE = R.BOOKED_DATE
AND S.SALES_DATE = '31-JAN-2003';

then the optimizer knows that the rows returned by the filtering condition, SALES_DATE = '31-JAN-2003' will be found only in a single partition, the one for 2003 Quarter 1. Since the REVENUE table is equi-partitioned, the rows also will be found only in that table's partition for 2003 Quarter 1. So, for each row in SALES, only rows in a particular partition in REVENUE need to be searched, not the entire table.

Next, we will examine if such a process of selection is indeed happening. Consider two tables created as follows:

create table ptest3a
(
col1a number,
col2a varchar2(200),
col3a varchar2(200)
)
partition by range (col1a)
(
partition p1 values less than (1001),
partition p2 values less than (2001),
and so on… partition p9 values less than (9001), partition pm values less than (maxvalue) ); create table ptest3b ( col1b number, col2b varchar2(200), col3b varchar2(200) ) partition by range (col1b) ( partition p1 values less than (1001), partition p2 values less than (2001), and so on… partition p9 values less than (9001), partition pm values less than (maxvalue) );

Note the tables have been range partitioned in an identical manner. Next, we will insert data into both tables so that all partitions will have at least one row, as follows:

insert into ptest3a
select rownum, object_type, object_name
from all_objects
where rownum < 10001;

insert into ptest3b
select rownum, object_type, object_name
from all_objects
where rownum < 10001;

After analyzing both tables, a user queries the tables in this manner:

explain plan for
select count(*)
from ptest3a , ptest3b
where ptest3b.col1b = ptest3a.col1a
and ptest3a.col1a between 1500 and 1700;

and then queries from the PLAN_TABLE using the script plan.sql, she gets

  ID OPERATION                      PB PE  PI
---- ------------------------------ -- -- ---
FILTER_PREDICATES
--------------------
0 SELECT STATEMENT on
1 SORT AGGREGATE on
2 NESTED LOOPS on
3 TABLE ACCESS FULL on PTEST3A 2 2 3
"PTEST3A"."COL1A">=1500 AND "PTEST3A"."COL1A"<=1700
4 TABLE ACCESS FULL on PTEST3B 2 2 4
"PTEST3B"."COL1B"="PTEST3A"."COL1A" AND "PTEST3B"."COL1B">=1500 AND
"PTEST3B"."COL1B"<=1700

Note how only partitions with ID# 2 from each table were subjected to Full Table Scans, not the entire table; this enabled partition-wise joins. The optimizer determined that partition-wise joins are possible in step ID 3 and step 4, as shown in the column PARTITION_ID. And it knew which partitions to join from the filter predicates, easily explained in the output. Since the rows will be found in partition ID 2 only, only that partition of ptest3a is used. And, since ptest3a and ptest3b are equi-partitioned, the optimizer will search for rows only in partition ID 2 of ptest3b, too, not the entire table.

Now let's see how a different type of partitioning scheme, hash partitioning, behaves for partition-wise joins. Consider the following two tables:

create table ptest3a
(
col1a number,
col2a varchar2(200),
col3a varchar2(200)
)
partition by hash (col1a)
partitions 4;

create table ptest3b
(
col1b number,
col2b varchar2(200),
col3b varchar2(200)
)
partition by hash (col1b)
partitions 4;

So each table has 4 hash partitions. Insert the data in the same way as in the previous example and analyze. If we explain the same query as we did before, and select from the plan table, we get

  ID OPERATION                        PB PE PI
---- -------------------------------- -- -- --
FILTER_PREDICATES
--------------------
0 SELECT STATEMENT on
1 SORT AGGREGATE on
2 PARTITION HASH ALL on 1 4 2
3 NESTED LOOPS on
4 TABLE ACCESS FULL on PTEST3A 1 4 2
"PTEST3A"."COL1A">=1500 AND "PTEST3A"."COL1A"<=1700
5 TABLE ACCESS FULL on PTEST3B 1 4 2
"PTEST3B"."COL1B"="PTEST3A"."COL1A" AND "PTEST3B"."COL1B" >=1500 AND
"PTEST3B"."COL1B"<=1700

Note the partition start (1) and stop (4) values, which are for all the partitions. This query does not perform a partition-wise join; it simply scans the entire table, even though it could have eliminated certain partitions. The filter predicates indicate that the optimizer knew about the rows to look for. So why didn't it do a partition-wise join?

The problem is the way hash-partitioned tables handle joins. In this example, the filtering condition is a range, between 1500 and 1700, not a specific value. This means the optimizer will not be able to point to a single partition for selection of the rows, and therefore a full-table scan is necessary. Partition-wise joins will not occur in this case. Let's take a look at another variation of this query:

explain plan for
select count(*)
from ptest3a , ptest3b
where ptest3b.col1b = ptest3a.col1a
and ptest3a.col1a = 1500;

Note the filtering predicate has been changed from a “between” to an “equality” with a constant. Using the plan.sql script, we get the “explain plan” as

ID OPERATION                                  PB PE  PI
---- ---------------------------------------- -- -- ---
FILTER_PREDICATES
--------------------
0 SELECT STATEMENT on
1 SORT AGGREGATE on
2 NESTED LOOPS on
3 TABLE ACCESS FULL on PTEST3A 3 3 3
"PTEST3HA"."COL1A"=1 500
4 TABLE ACCESS FULL on PTEST3B 3 3 4
"PTEST3HB"."COL1B"="PTEST3A"."COL1A" AND "PTEST3B"."COL1B" =1500

The PARTITION_IDs for start and stop partitions are 3 each, as expected. This means the third partitions of both table have been joined to get the answer; in other words, we have just achieved a partition-wise join. How were we able to do this?

If the filter predicates are based on equality operator only, then the optimizer can assign a specific partition to the predicate by using the hash function. That is why the partition-wise join was possible in the second example, but not in the first example. If the predicate is a range, the optimizer cannot decide whether a particular partition may be a candidate. Be very careful in designing hash-partitioned tables when there is a chance of joining with range filtering.

Character Value in Range Partitioning

Almost all documents, articles, books, and other documentation talks about range partitioning using either dates (the most common) or numbers. However, the partitioning scheme could be extended to character strings too. Consider the example of the employee table where the last name column is the partitioning key, to separate employees into multiple partitions by last name. Consider a table where the first partition P1 should hold all last names starting with C and below, P2 should hold between D and F; finally the rest with partition PM. According to a MetaLink Note, here is the proper syntax for designing such a partitioning scheme.

CREATE TABLE EMP (…………)
PARTITION BY RANGE (LAST_NAME)
(

PARTITION P1 VALUES LESS THAN (‘D%’),
PARTITION P2 VALUES LESS THAN (‘G%’),
PARTITION PM VALUES LESS THAN (MAXVALUE)

);

Note the percentage character after the names. This ensures that the ranges are well delineated by the boundaries. Consider this example:

SELECT * FROM EMP;

LAST_NAME FIRST_NAME
---------- ----------
CHAPLIN CHARLIE
D HARLEY
DAVIDSON HARLEY
EINSTEIN ALBERT

SELECT * FROM EMP PARTITION (P1);

LAST_NAME FIRST_NAME
---------- ----------
CHAPLIN CHARLIE
D HARLEY

SELECT * FROM EMP PARTITION (P2);

LAST_NAME FIRST_NAME
---------- ----------
DAVIDSON HARLEY
EINSTEIN ALBERT

Note the placement of two rows with last names starting with D. The last name DAVIDSON is placed in P2 as expected, but the last name D is placed in partition P1. Shouldn't DAVIDSON and D be within the same partition, P2?

Actually, this is not unusual. The character set comparison, "D" is less than “D%,” satisfying the boundary of the partition P1, and that is where the last name “D” goes, even though you probably expected the last name “D” to go into the same partition as DAVIDSON. While designing such a partitioning scheme be mindful of the potential problem.

Consider the same table in a slightly different way:

CREATE TABLE EMP (…………)
PARTITION BY RANGE (LAST_NAME)
(
PARTITION P1 VALUES LESS THAN (‘D’),
PARTITION P2 VALUES LESS THAN (‘G’),
PARTITION PM VALUES LESS THAN (MAXVALUE)
);

Note, there is no percentage sign after the character values. Inserting the same data into it and selecting from different partitions, we get

SELECT * FROM EMP2 PARTITION (P2);

LAST_NAME FIRST_NAME
---------- ----------
DAVIDSON HARLEY
EINSTEIN ALBERT
D HARLEY

Note how the partition P2 now has both DAVIDSON and D. This modified approach will help you avoid potential problems in the future. If you design character-based range partitioning, you should consider dropping the percentage character in your boundary to eliminate confusion, even though it is as specified in the MetaLink Note. If you use Oracle9i, you can probably change most of your character-based partitioning schemes to LIST.

This article should provide some insights into some potentially problematic situations regarding partitioning. More potential problems and pitfalls will be discussed in the Part 2 of this article next month.

--

Arup Nanda has been an Oracle DBA for 10 years. He is the founder of Proligence, Inc., a New York-area company that provides specialized Oracle database services. He is also an editor of SELECT, the journal of the International Oracle User Group. He has written Oracle technical articles in several national and international journals like SELECT, Oracle Scene, SQLUPDATE and presented in several conferences like Oracle Technology Symposium, Atlantic Oracle Training Conference, IOUG Live! among others. Send feedback to Arup about this article at arup@proligence.com. Based on the feedback, an updated copy of this article can be found at www.proligence.com.


Contributors : Arup Nanda
Last modified 2006-01-05 10:38 AM
Transaction Management
Reduce downtime and increase repeat sales by improving end-user experience.
Free White Paper
Database Recovery
Feeling the increased demands on data protection and storage requirements?
Download Free Report!
 
 

Powered by Plone