Exploiting Oracle8 Object Features

by Dave Ensor

Paper presented by Dave Ensor at Oracle Open World, November 1998, San Francisco. The paper looks at the object features released with Oracle8 from a pragmatic standpoint and tries to determine how useful they are likely to be to the developers of business or enterprise applications which are intended to run against the Oracle8 server.

Introduction

This paper does not attempt to contribute to the debate as to whether Object Orientation (OO) and its associated development practices are good, bad or indifferent. Rather it looks at the object features released with Oracle8 from a more pragmatic standpoint and tries to determine how useful they are likely to be to the developers of business or enterprise applications which are intended to run against the Oracle8 server.

Having declined the opportunity to discuss OO itself, the author still believes that it is important to disclose his bias in researching and writing this paper. Whatever the merits of OO for the creation of applications, there is little evidence that the persistent storage of data objects is, of itself, a significant advantage for the average enterprise application. The rationale for this assertion is that one of the strengths of the relational database has proved to be its ability to allow different parts of an application to take entirely different views (pun definitely intended) of the data which is typically held decomposed into third normal form (3NF). One process may choose to join Orders to Lines and to join from there to Products to project a valued or priced Order, whereas another may choose to join Orders to Stock Allocations to derive a picking list to be sent to the warehouse.

One of the major restrictions of the traditional relational table is the absence of repeating groups, and a major feature of Oracle's object support is that we at last have the ability to formally declare the existence of a collection (or repeating group) within a row. This alone might be sufficient to excite your interest in looking at the facilities offered by the Objects Option to see whether you should be using it in upcoming projects.

Object support within Oracle had a long gestation period, and the support that has arrived in Oracle8 is stated by Oracle Corporation to be just the beginning. This is both worrying as it seems to imply some incompleteness, and encouraging as it may be taken to mean that a number of current restrictions will last only until the next release (or possibly the one after).

The investigative work reported in this paper was conducted against Oracle8 Server Release 8.0.4.0.0 running under MS-Windows NT Server Version 4 on a Toshiba 730CDT with 144Mb of RAM.

All views expressed within this paper are those of the author, and are not necessarily those of BMC Software Inc. Some of the material in this paper has been taken from the book Oracle8 Design Tips by Dave Ensor and Ian Stevenson, published by O'Reilly & Associates Inc. in September 1997.

Evaluating New Features

Each time that we are faced with new features within a software package, especially one that has served us well, there are a series of questions that we should address before committing ourselves to using these new features:

Does the feature look as though it might solve a problem that we have encountered in the past?
Does the feature look as though it may allow us to do something that we have not previously considered?
Does the feature work?

In the past we have seen many new features which were clearly aimed at solving a particular problem, but which when closely examined turned out to be something of a disappointment. In the following sections of this paper the author attempts to briefly discuss whether the feature that is the topic of the section does solve a problem experienced with previous versions of Oracle.

User-Defined Datatypes

User-defined datatypes sit on top of the built-in datatypes provided by the Oracle server. These new datatypes can be used (with certain restrictions) in place of the built-in datatypes within table definitions and PL/SQL declarations. In addition a user-defined type may have methods or executable code (PL/SQL) as part of its definition. These are declared as member procedures and functions that operate within the context of an instance of the type. A simple member function is shown in the example in the next section.

Three new classes of type are supported. OBJECT allows the creation of a new scalar datatype with its own internal structure. VARRAY and TABLE are collections meaning that an instance of the type may have multiple values.

Objects

Rather than presenting a long description it may be simpler to quote the following simple example:

create type BOX as object
   ( HEIGHT number
   , WIDTH  number
   , DEPTH  number
   , member function VOLUME 
                return number
   , pragma restrict_references (VOLUME, rnds, wnds, 
     rnps, wnps)
   );
/
create type body BOX is
     member function VOLUME 
                return number is
     begin
       return   HEIGHT
              * WIDTH
              * DEPTH;
     end;
end;
/
create table PRODUCTS
 ( PRODUCT_ID   number
                primary key
 , PRODUCT_NAME varchar2(30) 
                not null
 , PRODUCT_SIZE box
                not null
 );

All references to the attributes of an instance of an user-defined type have to be qualified by the column name so references to the height must take the form PRODUCT_SIZE.HEIGHT. Worse, in SQL references to type members must be fully qualified using a table alias and in SQL, rather than PL/SQL, references to member functions must use parentheses. This may be completely rational but most current tools do not support it. Further disappointments are that it is not possible to declare constraints on the members of a type, and it is also impossible to create "triggers" which operate on a type.

In order to create an instance of BOX we must use its constructor whose name is also BOX, e.g.

insert into PRODUCTS 
   values ( 42
          , 'HITCHHIKER''S GUIDE'
          ,  BOX(8,6,3));

In the present release the default constructor function cannot be overridden with user-supplied code so the constructor cannot be used to enforce data rules about maximum and minimum values (or indeed anything else). This inability to use object types to enforce domain discipline is a major disappointment.

VARRAYs

VARRAY or VARYING ARRAY (Oracle8 accepts both names but VARRAY has the advantage of being shorter and easier to type) is a repeating group that can be housed in a column. This ability is useful in a number of applications where sets of readings are taken which are never processed or queried in isolation but which are always handled as a group. One example might be a VARRAY of 24 half-hourly meter readings.

Each VARRAY type has a maximum number of entries and the values are stored in-line with the rest of the row data. The VARRAY may itself be NULL or any or all of its entries may be NULL. In Oracle8 it is rather easier to populate a VARRAY than it is to retrieve values from it. Like other types, a VARRAY has a constructor function and can therefore be populated by using the individual values as arguments to the constructor function e.g.

create or replace type BOX2 
   as VARRAY(3) of NUMBER;
/
create table PRODUCTS
 ( PRODUCT_ID   number
                primary key
 , PRODUCT_NAME varchar2(30)
                not null
 , DIMENSIONS   BOX2
                not null
 );
insert into PRODUCTS 
   values ( 42
          , 'HITCHHIKER''S GUIDE'
          ,  BOX2(8,6,3));

However, unlike the components of an object type, which can be retrieved by name, the retrieval of the values in a VARRAY requires either PL/SQL or C/C++. This restriction may be a major problem if the values need to be widely available to query tools, but the PL/SQL is quite trivial and can be provided within functions and procedures as required:

declare
  PSIZE BOX2;
begin
  select DIMENSIONS into PSIZE
    from PRODUCTS
   where PRODUCT_ID = 42;
  dbms_output.put_line
   (   'Product is '
    || to_char(PSIZE(1)) || '*'
    || to_char(PSIZE(2)) || '*'
    || to_char(PSIZE(3)));
end;
/

The VARRAY type BOX2 shown above uses NUMBER, which is one of Oracle's built-in types. VARRAYs can themselves include ADTs, though nested VARRAYs are not supported. VARRAYs offer a convenient way of holding repeating groups of values in-line within a row provided that the (maximum) size of the repeating group is known and that it is acceptable to have to use either a 3GL or PL/SQL to extract the values.

Tables

With Oracle8 object support we can create tables using traditional DDL and include user-defined types as table columns, or we can create object tables e.g.

create type DEPT_TYPE as object
  ( DEPTNO number(2)
  , DNAME  varchar2(14)
  , LOC    varchar2(13)
  );
/
create table DEPT_TAB
          of DEPT_TYPE;

Interestingly the object table DEPT_TAB can be handled in both queries and DML using exactly the same syntax as the standard DEPT table because its columns are based on built-in datatypes. However, as an object table, each row has a globally unique OID or Object Identifier and we can create and store pointers from one table to another. The pseudo-column rowid is also available for an object table.

REFs

The next example creates a new version of the EMP table containing a reference or pointer to the table DEPT_TAB:

create table EMP_TAB
  ( EMPNO    number(4)
  , ENAME    varchar2(10)
  , JOB      varchar2(9)
  , MGR      number(4)
  , HIREDATE date
  , SAL      number(7,2)
  , COMM     number(7,2)
  , DEPT     ref   DEPT_TYPE 
             scope DEPT_TAB
  );

Interestingly the SCOPE clause is optional, and without it different rows of our EMP_TAB could contain pointers to different object tables each of the row type DEPT_TYPE. These tables can be in different schemas, and at some point in the future even in different physical databases.

We can populate these new tables from the old EMP and DEPT tables with the statements:

insert into DEPT_TAB
select * from SCOTT.DEPT;

insert into EMP_TAB
   select EMPNO, ENAME, JOB, MGR
        , HIREDATE, SAL, COMM
        , REF(D)
     from SCOTT.EMP E
        , DEPT_TAB D
    where E.DEPTNO = D.DEPTNO;

and we can now code "automatic joins" between our two newly loaded tables using queries such as

select ENAME, DEPT.LOC
  from EMP_TAB
 where JOB = 'CLERK';

The initial impression is that this could help reduce coding errors, and that because it uses an internal reference or pointer it might also prove to be more efficient than a traditional join on a foreign key. The simplicity of the query is certainly attractive but testing has shown that REFs to take more space within each row (at least 36 bytes) than the average foreign key (often less than 10 bytes). Navigation of a REF uses almost exactly the same resource as navigation through a traditional indexed equi-join unless the subject of the REF has been reloaded in which case there is a severe performance penalty in using the REF.

REFs have another characteristic that takes some getting used to - if the row which is referenced is deleted, any REF to it becomes a dangling REF and points nowhere. Any attempt to navigate such a reference generates an ORA-21700 error. So any projects that decide to use REFs will find that they must write all of their code to handle the possibility of dangling REFs. There is no efficient way of preventing dangling references from occurring without enforcing dequeues (the storage of pointers in each direction). Now, of course, the target of the REF could contain a trigger-maintained list of the rows which are referencing it, but such a list could become long and might need to be held in a separate (child) table. It would appear, therefore, that for the moment most projects should continue to use foreign key relationships rather than REFs.

Derek Storkey, one of BMC Software's DB2 gurus, got to this point while proofreading the original draft of this paper and commented, "So Codd was right after all!"

Nested Tables

In addition to being able to hold "repeating groups" within VARRAYs we can also hold one table within another using nested tables, e.g.

create type EMP_T
  ( EMPNO    number(4)
  , ENAME    varchar2(10)
  , JOB      varchar2(9)
  , MGR      number(4)
  , HIREDATE date
  , SAL      number(7,2)
  , COMM     number(7,2)
  );
/
create type   EMP_TAB 
  as table of EMP_T;
/
create table  DEPT_NEST 
  ( DEPTNO    number(2)    
              primary key
  , DNAME     varchar2(14)
  , LOC       varchar2(13)
  , DEPT_EMPS emp_tab
  )
  nested table DEPT_EMPS 
      store as NESTED_EMP;

Tables can only be nested to one level but we can create indexes and triggers on the table NESTED_EMP. However what we cannot do is access a nested table independently of its parent. Any index on a nested table has its index key prefixed with the object ID of its parent row. The syntax for handling nested tables uses a series of extensions to SQL which are not supported by the majority of the tools on the market today, and which are initially complex to code. The inability to query directly against the child table, plus the lack of support within the current generation of tools for the SQL required to process nested tables, makes it unlikely that many projects will chose to exploit them in the near future.

Client-Side Cache

For projects already using OO (Object Oriented) techniques within C or (more likely) C++, Oracle8 provides the ability to fetch an entire object into the client space, manipulate it locally using calls, and then to send it back to the server for storage. This set of services, known collectively as the client-side cache, has the ability to greatly reduce the number of calls that must be sent from client to server and this alone can offer an enormous performance advantage. It also means that current OO practice can be followed within a project that is setting out to use the Oracle Server.

However the requirement to code the database interface using procedural calls rather than SQL almost certainly means that the client-side cache will be used only by a small percentage of projects. Oracle's Object Database Designer tool generates C++ classes to issue the (many) OCI calls required and may present a more tenable route to use of the client-side cache than calling the services directly.

Object Views

Oracle recognizes that many enterprise databases will continue to be structured using conventional relational table structures. The Object View mechanism provides the ability for views to be constructed which externalize conventionally structured tables as object tables, allowing master/detail relationships between such tables to be presented as VARRAYs or nested tables. Thus applications operating against conventional relational structures can use Object Views to take advantage of features such as the client-side cache.

This is an exciting capability, but may in some cases be the opposite of what is required in the short term. Current programming practice in Oracle-based applications is oriented around the use of server-side PL/SQL packages and SQL statements to operate on relational structures in 3NF (third normal form). Any migration to persistent object structures is likely to create an immediate demand for such structures to be externalized in 3NF.

LOBs

LOBs (Large Objects) are not strictly speaking part of the Oracle8 Object Support at all. The LOB functionality is included with Oracle8 whether or not the Objects option is present and is aimed at improving the server's ability to handle long opaque scalar values. It allows the value to be up to 4Gb long, and also stores the value in a separate segment from the rest of the row. From within PL/SQL and OCI a LOB value can be read and written in chunks. In many ways the calls available from PL/SQL and C/C++ are analogous to those provided by operating systems to handle serial files, and indeed one of the supported forms of LOB storage is an operating system file. This data type, BFILE, has the severe restriction that the data may not be written using LOB support, only read.

The package DBMS_LOB is created when Oracle8 is installed and contains a number of valuable functions, including the ability to test a character LOB to see whether it contains a particular string. In other words, using PL/SQL functions, it is now possible to operate on "long" values of up to 4Gb from within a conventional SQL statement.

In Oracle7 the only options for storing a scalar value over 2,000 bytes in length were to place it in a LONG or LONG RAW column, or to decompose the value into pieces 2,000 bytes each or less and to store these in a child table. Oracle has always placed a series of restrictions on LONG and LONG RAW columns, and the storage management needs of relatively short structured row data conflicted with the storage management needs of the relatively long opaque data. The term opaque is used rather than the term unstructured because the assumption is that the LONG or LOB has some internal structure even if this is unknown (or opaque) to the Data Server.

Using the child table technique counters all of the restrictions placed by Oracle on LONG values, and has the additional benefit that the LONG data is no longer stored with the rest of the row. The severe disadvantages are that insertion takes much longer, and that the application must provide the required assembly and disassembly operations.

The greatest single potential disadvantage of storing data in LOBs is that such data can only be accessed from PL/SQL and C/C++. The only manipulations that can be performed on LOBs directly from the SQL language are the assignment of an empty value to a LOB and testing a LOB value for nullity (which is different from emptiness). Of course the data within a LOB can be manipulated indirectly from SQL by calling PL/SQL functions from within SQL.

An application using LOBs may choose to provide assembly and disassembly operations, but if the value is less than 32Kb (32,768 bytes) it can be inserted or retrieved in a single operation in PL/SQL. Values of any legal length can be written and read from C/C++ assuming only that the execution platform has sufficient physical and virtual memory. One strange features of the PL/SQL interface is that in order to write a LOB value into a row, the row must first be inserted into the database with an empty value for the LOB. Then the resulting locator must then be queried back into the program. This requirement to query rows after insert means that, unusually for Oracle, the best performance is achieved when loading with the primary key constraint enabled.

create table LOB1
     ( HANDLE   number primary key
     , TEXT     clob 
     )
     lob (TEXT) store as 
       (storage
         (initial 500k 
          next    500k));
…
     insert into LOB1 
     (HANDLE, TEXT) values 
     (THIS.HANDLE, EMPTY_CLOB);
     select TEXT into CLOB_LOC
       from LOB1
      where HANDLE = THIS.HANDLE;
   TEXT_LTH := length(THIS.TEXT);
   dbms_lob.write( CLOB_LOC
                 , TEXT_LTH
                 , 1
                 , THIS.TEXT);
…

This requirement to load with an index in place seemed to represent a major inefficiency. A number of tests showed, however, that the overhead of having to maintain the primary key index during the load was not as significant as initially expected. The benefit of having the index started to become significant when the table that contained the LOB locator had more than a few hundred rows. In order to compare insertion rates a series of tests were performed loading text strings into detail tables with VARCHAR2 pieces, LONG columns and CLOB columns. While the strings were short (under 4Kb) LOB storage was rather slower but with 500 * 28Kb text strings (stored in 7 * 4Kb VARCHAR2 child records, a LONG column and a CLOB) the following results were obtained:

Structure	Load with PK	Load, no PK	Search 1	Search 2	Search 3	Space Used
VARCHAR2	120 secs	112 secs	41.8 secs	10.4 secs	0.01 secs	16.5 Mb
LONG	100 secs*	100 secs*	43.3 secs	18.4 secs	9.10 secs	16.5 Mb
CLOB	84 secs	92 secs	11.3 secs	8.2 secs	0.01 secs	14.4 Mb

All of the times quoted are elapsed on an otherwise idle machine. The times for insertion of the LONG were not stable over several iterations and the values quoted are therefore approximate.

The three searches were designed to pose different problems. In Search 1 a PL/SQL anonymous block counted the number of rows which contained a particular substring which existed in only one row thus requiring all of the text fields to be searched. Search 2 used the same PL/SQL code but the substring occurred in every row and was within a few characters of the start of the column. This greatly reduced the amount of time spent in LIKE and INSTR functions. Search 3 was a summary operation on a unindexed NUMBER column, and was included to show the potential penalty of holding long data values in-line with rows where some serial processing may not require access to the long value.

It was also found during tests that with a constant LOB length the time taken for piecewise insert operations rose sharply with the number of pieces. It was for this reason that the tests presented above were performed using a value 28Kb long.

Despite the requirement for additional code, LOBs appear to offer a number of functional and performance advantages over the techniques available in previous versions of Oracle. The Objects option is not required for LOB use.

Conclusion

The LOB mechanism in Oracle8 offers a valuable facility for managing attributes which are too large to be contained in a conventional column. Although it requires that at least some coding be performed in either PL/SQL or C/C++, the bulk of the data within the schema can remain in traditional datatypes and will be fully accessible through conventional techniques.

In general the Oracle8's user-defined types offer facilities only to projects which are prepared to adopt an entirely new programming paradigm whether that be within the extended PL/SQL language or using the many new calls added to Oracle's C/C++ libraries. At their simplest, the new type facility fails to meet the reasonable expectation of a declarative mechanism to enforce domain discipline, which is a major disappointment.

It appears unlikely that projects will find major benefit in Oracle8 Object Support without a substantial investment in either new tools or in-house development of middleware to exploit the new features.

---

Dave Ensor is a Product Developer with BMC Software where his mission is to produce software solutions that automate Oracle performance tuning. He has been tuning Oracle for 13 years, and in total he has more thab 30 years' active programming and design experience.

As an Oracle design and tuning specialist Dave built a global reputation both for finding cost-effective solutions to Oracle performance problems and for his ability to explain performance issues to technical audiences. He is co-author of the O'Reilly & Associates books Oracle Design and Oracle8 Design Tips.

Contributors : Dave Ensor
Last modified 2005-02-18 10:30 AM

DBAzine.com

Sections

Personal tools

Menu

Who Are You?