Journey into Partitioning - Same Goal, Many Paths - Part 1

by Arup Nanda

Part 1  |  Part 2

Partitioning may be the best thing that happened to Oracle since sliced bread — and perhaps that trite analogy holds a grain of truth here. Sliced bread gave mankind the ability to tackle the monolithic loaf by breaking it into bite-sized pieces, not unlike what partitioning does for our Oracle tables. The slices hold patties, tomatoes, cheese, and all the relishes you may want to fit in — analogous to the data in a table. A sandwich made from the slices, however, is still a single logical entity — just like a table. And if a slice gets soggy, all you need to do is replace it with a fresh one — sounds a lot like an operation called partition exchange, doesn't it? It is no wonder, then, that partitioning was embraced so quickly by the DBA community. Implementation, however, was not so swift; there were doubts about the Cost Based Optimizer (CBO), and since partitioning required it, many users shunned the new technology. With the gradual improvement and widespread acceptance of the CBO, partitioning has come to adorn not just data warehouses, but also OLTP databases.

Previously in DBAzine, in "Perils and Pitfalls in Partitioning," Part 1 and Part 2, I explored some pitfalls and potential problems you should be aware of while designing and maintaining a partitioning scheme. After you have decided to move to partitioning, the next logical question is how to convert the existing tables. There are various methods to achieve this, from the plain and simple export/import to the sophisticated online redefinition in Oracle9i. Each method is unique in some way, making one or another a better fit for a specific situation. This article explains them all, with their relative merits and drawbacks, so that you can decide on their suitability at your site.

Apart from the documented and well-known techniques for converting existing tables, this article also explains an undocumented method developed by the author. And for cases in which the table must remain available for changes during the conversion process, the article describes a way to allow that as well.

Example Setup

For the sake of illustration, let's assume an existing table named MYTAB, defined as follows:

COL1 NUMBER(10)
COL2 VARCHAR2(200)
COL3 NUMBER(10)
COL4 DATE

The table has an index IN_MYTAB on COL2 and a check constraint CK_MYTAB on COL4. We need to convert this table to a range-partitioned one on column COL1. Each partition of the table should be in a different tablespace, and the index should be LOCAL, with its partitions in separately defined tablespaces.
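
For reference, a minimal script that builds this starting point might look like the following. The actual check condition on COL4 is not given in the article, so the one shown here is purely hypothetical:

CREATE TABLE MYTAB (
COL1 NUMBER(10),
COL2 VARCHAR2(200),
COL3 NUMBER(10),
COL4 DATE
);

CREATE INDEX IN_MYTAB ON MYTAB (COL2);

-- Hypothetical check condition; the real one is not specified
ALTER TABLE MYTAB ADD CONSTRAINT CK_MYTAB
CHECK (COL4 >= TO_DATE('01-JAN-1990', 'DD-MON-YYYY'));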

Deciding Factors

Before settling on any particular conversion approach, think about the exact criteria that will shape your decision. The criteria to consider are:

      • Availability: Does the table have to be available while being converted, or can it be shut off from users for the duration? Can the table be available for read-only operations while being converted? If there can be shut-off time for the table, how long is it? The degree of availability needed affects the choice of a proper method.
      • Quiet Time: Some databases are loaded evenly around the clock, seven days a week. Most, however, follow a pattern in which a specific window — e.g., Sunday nights on a non-month-end date — is a period of little activity. (This does not mean no activity, just a lessened load on the database.) Does your database have such a window of quiet time? If so, how long is it, and how often does it occur?
      • Space: Do you have extra space in the database to spare? If so, how much and where? In a typical Storage Area Network (SAN) system with virtualization in place, you may have a huge filesystem with plenty of room to spare, in which case location makes no difference. But for a system in which the filesystems are carefully laid out around the tables' access patterns, the location of the free space matters. You also have to consider whether space can be added to the database temporarily while a similar chunk is taken away elsewhere. Finally, is there free space on a filesystem where a database file cannot be created (e.g., /tmp) but which is still usable for other work?
      • Redo Generation: Some operations are designed to minimize redo. Heavy redo generation means a large number of archived log files, which may overwhelm your tape backup system. Of course, if your database runs in NOARCHIVELOG mode this is less of a problem, but your production system is most likely in ARCHIVELOG mode.
      • Rollback Segments: How big are the rollback segments in your database? This is the second most important factor in selecting a conversion method. Some techniques rely on reading the source tables, and due to the read-consistency requirement of the Oracle database engine, the database must provide the image of each block as it was when the query started. This past image is constructed from data stored in the rollback segments. If the query runs long, the rollback segments may eventually recycle the extents holding that past-image data; when the long-running query then tries to read it, the query fails with the error ORA-1555: Snapshot Too Old. Even if the rollback segments are huge, there is a good chance of hitting ORA-1555 if the run time is too long.
      • Special Datatypes: Some datatypes, such as LONG, may make certain conversion approaches inapplicable. If you have such columns in your table, you need to rethink your strategy.

Bear these important factors in mind while evaluating the following methods for your particular situation.

Method 1: Direct Creation

In this method, you simply create the partitioned table directly, selecting from the existing table. For example, you would issue:

CREATE TABLE MYTAB_NEW
PARTITION BY RANGE (COL1) (
PARTITION P1 VALUES LESS THAN (101) TABLESPACE TS_P1,
PARTITION P2 VALUES LESS THAN (201) TABLESPACE TS_P2,
PARTITION P3 VALUES LESS THAN (301) TABLESPACE TS_P3,
PARTITION PM VALUES LESS THAN (MAXVALUE) TABLESPACE TS_PM
)
NOLOGGING
AS
SELECT * FROM MYTAB;

This creates a new table named MYTAB_NEW as a partitioned table with all the data from the table MYTAB. The NOLOGGING clause ensures the data is loaded with minimal redo. If necessary, you can also use the PARALLEL clause to create this table in parallel.

After the table is created, drop the original table and rename this new table to the original table name.

DROP TABLE MYTAB;
RENAME MYTAB_NEW TO MYTAB;

Then you can create the indexes and constraints on this table to match what existed earlier.
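
A sketch of that step, assuming the same index and constraint names as before and index tablespaces named INDX_P1 through INDX_PM (these names are only illustrative):

CREATE INDEX IN_MYTAB ON MYTAB (COL2)
LOCAL (
PARTITION P1 TABLESPACE INDX_P1,
PARTITION P2 TABLESPACE INDX_P2,
PARTITION P3 TABLESPACE INDX_P3,
PARTITION PM TABLESPACE INDX_PM
)
NOLOGGING;

-- Hypothetical check condition, as in the setup
ALTER TABLE MYTAB ADD CONSTRAINT CK_MYTAB
CHECK (COL4 >= TO_DATE('01-JAN-1990', 'DD-MON-YYYY'));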

Advantages

      • The operation is quite simple.
      • Use of NOLOGGING ensures minimal redo generation.
      • Parallel Query can be used to speed up processing.

Disadvantages

      • The space requirement is significant: there must be enough room for both the source and target tables at the same time. This can be a serious problem if your database is strained for space, and since the probable reason you want to partition a table is that it is large, the problem can be quite challenging.
      • Space is required in a filesystem in which database files can be created, not just any filesystem. For instance, you may have a large /tmp filesystem, but it can't be used with this conversion method. In the case of Oracle Parallel Server or Real Application Clusters, the space must be available in the shared storage area, not on local disks.
      • Rollback segments are needed. While creating the table as a select from another table, Oracle must read a snapshot of the source table as of the beginning of the query, which places a considerable demand on rollback segment space. If you are dealing with a huge table, the rollback segment space may not be sufficient; worse, in an active system, the create table may fail with an ORA-1555 Snapshot Too Old error.
      • The constraints, indexes, and triggers must be created separately after the new table is created. Indexes, and constraints such as primary keys that implicitly create indexes, may take a long time to build, adding to the overall elapsed time. The new table must also be analyzed after it is created.

High space, time, and resource consumption make the Direct Creation option less attractive for medium-to-large databases. For smaller databases, however, the simplicity of the option makes it quite attractive.

Method 2: Direct Load Insert

In this method, the partitioned table is created empty. This is easily done by running the listing provided for Method 1, but adding the line WHERE 1 = 2 at the very end of the script, after the SELECT * FROM MYTAB line. With this addition the table is created but no data is loaded, since the condition 1 = 2 is never satisfied.
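
Concretely, the listing from Method 1 becomes:

CREATE TABLE MYTAB_NEW
PARTITION BY RANGE (COL1) (
PARTITION P1 VALUES LESS THAN (101) TABLESPACE TS_P1,
PARTITION P2 VALUES LESS THAN (201) TABLESPACE TS_P2,
PARTITION P3 VALUES LESS THAN (301) TABLESPACE TS_P3,
PARTITION PM VALUES LESS THAN (MAXVALUE) TABLESPACE TS_PM
)
NOLOGGING
AS
SELECT * FROM MYTAB
WHERE 1 = 2;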

After the empty table is created, the data is loaded from the source table using:

INSERT /*+ APPEND */ INTO MYTAB_NEW
SELECT * FROM MYTAB;

This use of the APPEND hint, called a direct load insert, loads the data above the high-water mark of the target table and therefore generates minimal redo. Redo generation is comparable to the first method, but considerably less time is required because the data loading pattern is different: in a direct load operation, the rows are not inserted one at a time as they are by a regular INSERT statement or by import. Rather, data blocks are formatted and placed directly above the high-water mark. Because the rows bypass conventional insert processing and the buffer cache entirely, the savings are significant. However, since the source table is still read, the problem of heavy rollback segment usage remains, and that could cause ORA-1555 errors.
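
If you want to bring parallel query servers into play here, a sketch along these lines should work (the degree of 4 is arbitrary); note that a direct load insert must be committed before the same session can query the table:

ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND PARALLEL(MYTAB_NEW, 4) */ INTO MYTAB_NEW
SELECT /*+ PARALLEL(MYTAB, 4) */ *
FROM MYTAB;

COMMIT;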

After the data is loaded, the original table is dropped and the new table is renamed to the original name. After that, all the indexes, constraints, and so on can be created on the new table.

Advantages

      • Loading the target is quite fast because the APPEND option is used (this loads data above the high-water mark).
      • It is a relatively simple operation.
      • Parallel query servers can be used, which will enhance the speed of the operation.

Disadvantages

      • Free space equal to at least the total size of the table must be available. In medium to large databases, this may mean a substantial amount of space, which may be difficult to obtain, even temporarily.
      • As with the previously discussed method, the space must be available in a filesystem where database files can be created.
      • Since read consistency on the source table is still necessary, there is a high rollback segment requirement, which may lead to ORA-1555 errors.
      • The total conversion time is still high once the index and constraint creation tasks are added; local index partitions cannot be built ahead of time to reduce it.

The relative simplicity of this approach makes it an attractive option, but the high rollback segment requirement makes it unsuitable for large databases. If your database is medium-sized, your storage array has some room to spare, and the needed downtime can be obtained, this method is strongly recommended.

Method 3: Export/Import

The previous methods are efficient, but they require space inside the database. If space is available only in a filesystem such as /tmp that cannot be added to the database, export the data from the source table MYTAB to a dump file in that filesystem using the export utility. (Using the option DIRECT=Y and setting the RECORDSIZE and BUFFER parameters to proper values will speed up the process.) Then drop the table MYTAB, create a partitioned table MYTAB in its place, and import the data back.
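
As a sketch, the command-line cycle might look like this (scott/tiger stands in for your credentials, and the parameter values are only illustrative):

exp scott/tiger TABLES=MYTAB FILE=/tmp/mytab.dmp DIRECT=Y RECORDSIZE=65535

# Drop MYTAB, recreate it as a partitioned table, then:
imp scott/tiger TABLES=MYTAB FILE=/tmp/mytab.dmp IGNORE=Y COMMIT=Y BUFFER=1048576

The IGNORE=Y parameter lets the import load data into the pre-created partitioned table instead of failing because the table already exists.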

A variation of this process is to export the parts of the table that correspond to the partitions, using a where clause in the export utility, compressing each dump file to make room for the next. During import, each dump file is uncompressed in turn, imported into the corresponding partition, and then deleted.
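
A sketch of this variation using the export QUERY parameter (available from Oracle8i; note that QUERY forces a conventional-path export, so it cannot be combined with DIRECT=Y, and the quoting shown may need adjustment for your shell):

# Export one slice of the table, then compress the dump
exp scott/tiger TABLES=MYTAB QUERY=\"WHERE COL1 \< 101\" FILE=mytab_p1.dmp
gzip mytab_p1.dmp

# Later, handle each dump file in turn
gunzip mytab_p1.dmp.gz
imp scott/tiger TABLES=MYTAB FILE=mytab_p1.dmp IGNORE=Y COMMIT=Y
rm mytab_p1.dmp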

Advantages

      • The biggest advantage of this method is that significantly less space is required; an export dump typically consumes less space than the source table. Moreover, the space is needed in a regular filesystem such as /tmp, not inside the database, which makes the process attractive.
      • If needed, the export dump file can be created on a tape, an optical drive, a NAS device, or even on a separate system, using a UNIX pipe (see the sketch after this list). This flexibility may open up several other avenues for space.
      • If a table contains LONG columns, this is one of only two possible options.
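
A sketch of the UNIX pipe technique mentioned above — here compressing the dump on the fly so the full-size file never lands on disk:

# Create a named pipe and start a compressor reading from it
mknod /tmp/exp_pipe p
gzip < /tmp/exp_pipe > /tmp/mytab.dmp.gz &

# Export writes into the pipe; gzip compresses the data as it arrives
exp scott/tiger TABLES=MYTAB FILE=/tmp/exp_pipe DIRECT=Y RECORDSIZE=65535

rm /tmp/exp_pipe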

Disadvantages

      • Import cannot be tweaked to minimize logging the way direct load can; therefore the redo generation, and consequently the archived log generation, is quite high.
      • The rollback segment requirement is also quite high. This can be lessened somewhat by importing partition by partition and by using the COMMIT=Y and BUFFER parameters on the import command line.
      • The table is unavailable for the duration of the entire operation.
      • Since the table is dropped and recreated, all other operations, such as creating indexes, have to be performed afterward, and this consumes time.
      • The import utility does not take advantage of parallel query servers.

Method 4: SQL*Loader

Instead of using a direct load insert, you could use SQL*Loader with the DIRECT option to load the data. To do this, create a flat file from the source table using SQL*Plus, then load the data into the new table. With the DIRECT option, the data is loaded above the high-water mark, as in a direct load insert, generating minimal redo. In addition, since the load is from a static file, there is no rollback requirement to hold a "before" image of the source data. This dramatically reduces rollback segment use and eliminates rollback segment overflow and ORA-1555 errors.

In Oracle9i, external tables can be used for such operations. Aside from the static-file requirement, the advantages and disadvantages of this option are identical to those of the direct load insert method. However, since SQL*Loader must read from a flat file, the extra step of creating that file is necessary, which makes the process slightly more complicated. Oracle does not provide a utility to dump table contents into a flat file; you must write your own (several scripts for this are available on the Internet, so this does not pose a real problem). The use of a static file as the source mandates that the table remain unchanged until the conversion process is complete, affecting availability. On the positive side, the loading can be done partition by partition, so the space requirement may be somewhat less than for the other methods.
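
A minimal sketch of both steps follows, assuming a comma is a safe delimiter for COL2 and using a hypothetical control file named mytab.ctl. First, a SQL*Plus script that spools the flat file:

SET PAGESIZE 0 LINESIZE 500 HEADING OFF FEEDBACK OFF TRIMSPOOL ON
SPOOL mytab.dat
SELECT COL1 || ',' || COL2 || ',' || COL3 || ',' ||
TO_CHAR(COL4, 'YYYY-MM-DD')
FROM MYTAB;
SPOOL OFF

Then mytab.ctl and the direct-path load itself:

LOAD DATA
INFILE 'mytab.dat'
INTO TABLE MYTAB_NEW
FIELDS TERMINATED BY ','
(COL1, COL2, COL3, COL4 DATE "YYYY-MM-DD")

sqlldr scott/tiger CONTROL=mytab.ctl DIRECT=TRUE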

Method 5: Individual Partition Creation

This is a more complex but more controllable operation. In this method, you create small standalone tables resembling the partitions of the target table. Using the previously cited example, issue the following statement:

CREATE TABLE MYTAB_P1
TABLESPACE TS_P1
NOLOGGING
AS
SELECT *
FROM MYTAB
WHERE COL1 < 101;

This creates a table with data identical to what would be in partition P1 of the target table. Next, create the index and constraints on that table:

CREATE INDEX IN_MYTAB_P1
ON MYTAB_P1 (COL2)
TABLESPACE INDX_P1
NOLOGGING;

If necessary, use parallelism to speed up the process. Next, analyze the table so that it carries proper partition-level statistics. Creating the indexes and gathering stats at this stage is not mandatory; it can be put off until later.
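
For the statistics, either the older ANALYZE command or the DBMS_STATS package (the preferred interface from Oracle8i onward) will do; a minimal sketch:

ANALYZE TABLE MYTAB_P1 COMPUTE STATISTICS;

-- or, preferably:
BEGIN
DBMS_STATS.GATHER_TABLE_STATS (
ownname => USER,
tabname => 'MYTAB_P1',
cascade => TRUE);  -- gather index statistics as well
END;
/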

Create similar tables corresponding to the other partitions, naming them MYTAB_P2, MYTAB_P3, and so on, and create their indexes and constraints accordingly. When all these tables are ready, create the empty target table using the table creation script. Then exchange the partitions of the target table with the corresponding standalone tables. For instance, you would issue the following:

ALTER TABLE MYTAB
EXCHANGE PARTITION P1
WITH TABLE MYTAB_P1
INCLUDING INDEX
WITHOUT VALIDATION;

This operation replaces the empty partition P1 of the table with the standalone table MYTAB_P1. The standalone index IN_MYTAB_P1 also replaces the corresponding partition of the local index IN_MYTAB. And the stats gathered on the standalone table are transferred to the partition of the target table. In reality, the partition and the table data are not exchanged; pointers inside the data dictionary are simply updated to point to the appropriate objects. Since no data is actually moved, the redo generation is negligible.

Any global indexes can now be created on the new table, and the constraints of the old table placed on it with the NOVALIDATE option. (If you have not yet created the local indexes, you can do so at this point.) You can build the indexes with the NOLOGGING option and in parallel to decrease the elapsed time.
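
A sketch of this finishing step, reusing the hypothetical check condition from the setup; the global index and its tablespace INDX_G are likewise only illustrative:

-- Re-create the check constraint without re-checking the existing rows
ALTER TABLE MYTAB ADD CONSTRAINT CK_MYTAB
CHECK (COL4 >= TO_DATE('01-JAN-1990', 'DD-MON-YYYY'))
ENABLE NOVALIDATE;

-- A global index, if one is needed, built with minimal redo and in parallel
CREATE INDEX IN_MYTAB_G ON MYTAB (COL3)
TABLESPACE INDX_G
NOLOGGING PARALLEL 4;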

Advantages

      • The creation of individual partitions is controlled separately, which sometimes offers significant advantages. For example, in a sales table, the rows related to past sales may be static, so they can be converted to standalone tables at any time; the more current, changing data can then be transferred during a quieter period, ensuring minimal downtime for operations. With the other methods cited, the entire table is converted as a whole, and finding a long enough quiet or down time may be impossible.
      • Since you can build the standalone tables and indexes in parallel, the process can actually be a little faster, provided sufficient free space and CPU power are available.
      • If space is a problem, the partitions can be done one by one, increasing the elapsed time but minimizing the need for space.

Disadvantages

      • This option requires space. If the entire table is converted at once, the requirement is double the size of the table, which can be substantial. Even if the partitions are done one after the other, the need is still high — extra space equal to the size of the biggest partition.
      • There is also the possibility of ORA-1555 errors in the case of large tables.

Conclusion

Converting tables from non-partitioned to partitioned structures is a process that consumes considerable time and effort, and DBAs at different sites have different constraints to deal with when making the transition. This two-part article presents several options, with the hope that one will suit the particular nature of the problem at your site. In Part 2 of this article, we will explore more options, including an undocumented one, and ways to convert tables with active transactions in both Oracle8i and 9i. Feel free to email the author at arup@proligence.com with feedback on this article and your experiences using these approaches.

--

Arup Nanda has been an Oracle DBA for the last 10 years, experiencing all aspects of database administration along the way, from modeling to performance tuning to disaster recovery planning. He is the founder of Proligence, Inc. (www.proligence.com), a New York-area company that provides specialized Oracle database services such as replication, disaster recovery, and Parallel Server, among others. Arup is an editor of SELECT, the journal of the International Oracle Users Group. He has written Oracle technical articles for several national and international journals, including SELECT, Oracle Scene, and SQLUPDATE, and has presented at conferences such as IOUG Live!. He is also the co-author of the upcoming book HIPAA Auditing and Security for Oracle from Rampant TechPress.

