Skip to content

DBAzine.com

Sections
Personal tools
You are here: Home » DB2 » DB2 Mainframe Articles Archive » Large Objects and DB2
Seeking new owner for this high-traffic DBAzine.com site.
Tap into the potential of this DBA community to expand your business! Interested? Contact us today.
Who Are You?
I am a:
Mainframe True Believer
Distributed Fast-tracker

[ Results | Polls ]
Votes : 3554
 

Large Objects and DB2

by Craig. S. Mullins
This article was adapted from the DB2 Developer's Guide, 4th edition; for more information, go to http://www.craigsmullins.com/cm-book.htm or order it at http://www.amazon.com/exec/obidos/ASIN/0672318288/mullinassoci-20/002-3432133-0123225.

Traditionally, database management systems were designed to manage simple, structured data types. Most any database can be used to store numbers, alphabetic characters, and basic date and time constructs. But modern database management systems must be able to store and manipulate complex, unstructured data including multimedia data such as images, video, sound, and long documents.

As of Version 6, IBM augmented DB2 to enable the storage and management of complex data types using large objects. DBMSs that provide extensible data types to manage non-traditional data are sometimes referred to as object/relational database systems. IBM describes the term object/relational to encompass not just large objects, but also support for triggers, user-defined distinct types, and user-defined function. But these areas are beyond the scope of this article; here we will discuss DB2’s implementation of large objects. Do not be confused by the use of the term “object” in the phrase “object/relational.” An object/relational database management system has little to do with object-oriented technology or object-oriented programming and development.

OO technology is fundamentally based on the concept of, what else, but an object. Objects are defined based on object classes that determine the structure (variables) and behavior (methods) for the object. True objects, in traditional OO parlance, cannot be easily represented using a relational database. In the RDBMS, a logical entity is transformed into a physical representation of that entity solely in terms of its data characteristics. In DB2, you create a table that can store the data elements (in an underlying VSAM data file represented by a tablespace). The table contains rows that represent the current state of that entity. The table does not store all of the encapsulated logic necessary to act upon that data. By contrast, an object would define an entity in terms of both its state and its behavior. In other words, an object encapsulates both the data (state) and the valid procedures that can be performed upon the object's data (behavior). With stored procedures, triggers, and UDFs relational databases are “getting closer” to supporting OO techniques, but the implementation is significantly different.

Another term used in the industry when referring to extensible data type support is “universal.” IBM went so far as to rename and brand DB2 as DB2 Universal Database for OS/390 as of Version 6. Large object support is the primary factor governing the applicability of the term “universal” to DB2.

What is a Large Object?

A large object is a data type used by DB2 to manage unstructured data. DB2 provides three built-in data types for storing large objects:

      • BLOBs (Binary Large OBjects) — up to 2GB of binary data. Typical uses for BLOB data include photographs and pictures, audio and sound clips, and video clips.
      • CLOBs (Character Large OBjects) — up to 2GB of single byte character data. CLOBs are ideal for storing large documents in a DB2 database.
      • DBCLOBs (Double Byte Character Large OBjects) — up to 1GB of double byte character data (total of 2GB). DBCLOBs are useful for storing documents in languages that require double byte characters, such as Kanji.

To clarify, LOB data types actually can only store 1 byte less than 2 gigabytes of data.

BLOBs, CLOBs, and DBCLOBs are collectively referred to as LOBs. The three LOB data types are designed to efficiently store and access large amounts of unstructured data. DB2 understands that it is expensive to move and manipulate large objects. Therefore, LOBs are treated differently than the other standard built-in data types.

LOBs are not stored in the same structure as the rest of the data in the DB2 table. Instead, the table contains a descriptor that points to the actual LOB value. The LOB value itself is stored in separate LOB tablespace in an auxiliary table.

Application programs are written using LOB locators. A LOB locator represents the value of a LOB but does not actually contain the LOB data. This method is used because LOBs are typically very large and therefore expensive in terms of the resources required to move and store the LOB value. By using LOB locators, programs can avoid the expense associated with materializing LOBs.

When LOBs are created, the DBA can specify whether LOBs are to be logged or not. Once again, because LOBs are very large, logging them can be quite expensive and consume a large amount of log storage.

LOB Columns Versus VARCHAR and VARGRAPHIC Columns

It has been possible to store multimedia data in DB2 databases since Version 1 using VARCHAR and VARGAPHIC columns. But these data types provide limited functionality and usefulness when compared to LOBs.

The maximum size of VARCHAR or VARGRAPHIC column is 32 KB. This limitation may not pose a problem for smaller databases, but the requirements of modern (often web-enabled) applications usually require larger multimedia data. A 32 KB text document is not a very large document at all. And 32 KB is miniscule when it comes to storing multimedia data types such as audio, video, graphics, and images.

If you are sure that the text or graphic you wish to store will always consume less than 32 KB of storage, then you can use a VARCHAR or VARGRAPHIC data type instead of one of the LOB data types. However, LOB data types might still be preferable because of the efficient manner in which they are handled by DB2. Remember, VARCHAR and VARGRAPHIC data is stored with the rest of the data in the tablespace, as opposed to LOB data - which is stored in an auxiliary LOB tablespace.

The bottom line here is that one of the biggest considerations when using LOBs is their size. DBAs and developers who decide to use LOBs will need to understand that even small or simple graphics and text files can consume a large amount of space.

Creating Tables That Contain LOB Columns

There are four basic steps required to create and populate a table that uses LOB data types.
The first step is to define the appropriate columns in the DB2 table. Define one ROWID column and as many LOB columns as needed. Only one ROWID columns is required regardless of the number of LOB columns you specify. The ROWID and LOB columns are defined using the CREATE TABLE or ALTER TABLE statement. The definition of the LOB column must specify whether the column is a BLOB, CLOB, or DBCLOB. Furthermore, you must specify a size for the LOB. Failure to specify a size causes DB2 to use the following default:

      • For BLOBs - 1 MB (or 1,048,576 bytes)
      • For CLOBs - 1,048,576 single byte characters
      • For DBCLOBs - 524,288 double-byte characters

The LOB column in the DB2 table will contain only information about the LOB, not the actual data value. The table containing the LOB definition is referred to as the base table. The table containing the actual LOB data is referred to as the auxiliary table (more on this in a moment).

The ROWID column is used by DB2 to locate the LOB data. A ROWID is a unique 19-byte system generated value. If you are adding a LOB column and a ROWID column to an existing table, you must use two ALTER TABLE statements. Add the ROWID with the first ALTERTABLE statement and the LOB column with the second ALTER TABLE statement.

In the second step you will need to create a table and a tablespace to store the LOB data. The table is referred to as an auxiliary table; the tablespace is called a LOB tablespace. The base table can be in a partitioned tablespace but the LOB tablespace cannot be partitioned.

If the base table is not partitioned, you must create one LOB tablespace and one auxiliary table for each LOB column. If the tablespace containing the base table is partitioned, you must create one LOB tablespace and one auxiliary table for each partition, for each LOB. For example, if your base table has six partitions, you must create six LOB tablespaces and six auxiliary tables for each LOB column.

You will use the CREATE LOB TABLESPACE statement to create LOB tablespaces and the CREATE AUXILIARY TABLE statement to create auxiliary tables. Keep in mind that the LOB tablespace must be created in the same database as the base table.

The third step is to create a unique index on the auxiliary table. Each auxiliary table must have exactly one index. The CREATE INDEX statement is used to create the auxiliary table index. Do not specify any columns for the index key. When a CREATE INDEX is issued against an auxiliary table DB2 will implicitly define the index key on the ROWID column. This may seem odd at first but it you will get used to it.

The final step is to populate the LOB data into the table. Though we know that the actual LOB data is stored in an auxiliary table in a LOB tablespace and not in the base table, when you populate the LOB data you must reference the base table. If the total length of the LOB column and the base table row is less than 32 KB, you can use the LOAD utility to populate the data into DB2. If the LOB column is greater in size you must use INSERT or UPDATE statements. When using INSERT to populate the LOB data you must ensure that your application has access to adequate storage to hold the entire LOB value that is to be inserted.

A Sample Table Using LOB Columns

Consider the steps you would need to take to add an org chart to the DSN8710.DEPT sample table. The org chart is a BLOB of no more than 5 megabytes in size. The first step would be to alter the table to add two columns: a ROWID column and a BLOB column. For example:

ALTER TABLE DSN8710.DEPT
 ADD ROW_ID ROWID GENERATED ALWAYS;
COMMIT;
ALTER TABLE DSN8710.DEPT
 ADD DEPT_ORG_CHART BLOB(5M);
COMMIT;

The next step would be to create the LOB tablespace and auxiliary table for the LOB column. For example:

CREATE LOB TABLESPACE TDORGCHT
  IN DSN8D71A
  LOG NO;
COMMIT;
CREATE AUXILIARY TABLE DEPT_ORGCHART_TAB
  IN DSN8D71A.TDORGCHT
  STORES DSN8710.DEPT
  COLUMN DEPT_ORG_CHART;
COMMIT;

Following this you must create the index on the auxiliary table. Remember, you do not need to specify columns for the index key when an index is defined on an auxiliary table. The following SQL CREATE statement defines the auxiliary table index:

CREATE UNIQUE INDEX XDEPTORG
  ON DEPT_ORGCHART_TAB;
COMMIT;

Accessing LOB Data

LOB columns can be accessed using SQL just like other columns, in most cases. For example, you can code a SQL SELECT statement to retrieve the resume information stored in the EMP_RESUME column of the DSN8710.EMP table as follows:

SELECT EMPNO, EMP_RESUME
FROM   DSN8710.EMP;

When embedding SQL in application programs you need to take the size of LOBs into consideration. By using a LOB locator you can manipulate LOB data without actually moving the data into a host variable. A LOB locator is as a reference to the large object, and not the LOB data itself. Figure 1 illustrates this principle.

Figure 1: Using LOB locators.

A LOB locator is associated with a LOB data value or LOB expression, not with a row in a DB2 table or an actual physical storage location in a DB2 tablespace. So, once you SELECT the LOB value using a LOB locator the value of the locator should not change, but the actual value of the LOB might change.

DB2 provides two statements to work with LOB locators:

FREE LOCATOR removes the association between the LOB locator and its LOB value before a unit of work ends
HOLD LOCATOR maintains the association between a LOB locator and its LOB value after the unit of work ends. After issuing the HOLD LOCATOR statement, the LOB locator will keep its association with the LOB data value until the program ends or FREE LOCATOR is issued.

You can not use EXECUTE IMMEDIATE with the HOLD LOCATOR or FREE LOCATOR statements when issuing dynamic SQL.

By using LOB locators your application programs will require significantly less memory than would be required if entire LOB values were returned to the program. The LOB locator can be returned from queries, inserted into new tables, and used by the application code like any other host variable. LOB locators enable the application to defer actually accessing the large object itself until the application needs the contents of that object.

You will need to DECLARE host variables to hold the LOB data or LOB locators. The host variables must be declared of SQL type BLOB, CLOB, or DBCLOB. DB2 will generate an appropriate declaration for the host language. For example, review Tables 1 and 2 for COBOL host variable declarations for LOB variables and LOB locators. In SQL statements you must refer to the LOB host variable or locator variable specified in the SQL type declaration. In host language statements (such as COBOL) you must use the variable generated by DB2.

Declared in the Program Generated by DB2
01 BLOB-VAR USAGE IS
     SQL TYPE IS BLOB(1M).
01 BLOB-VAR.
   02 BLOB-VAR-LENGTH PIC 9(9) COMP.
   02 BLOB-VAR-DATA.
      49 FILLER PIC X(32767).
      Repeat above line 31 times.
      49 FILLER PIC X(32).
01 CLOB-VAR USAGE IS
   SQL TYPE IS CLOB(40000K).
01 CLOB-VAR.
   02 CLOB-VAR-LENGTH PIC 9(9) COMP.
   02 CLOB-VAR-DATA.
      49 FILLER PIC X(32767).
      Repeat above line 1249 times.
      49 FILLER PIC X(1250).
01 DBCLOB-VAR USAGE IS
   SQL TYPE IS CLOB(40000K).
 01 DBCLOB-VAR.
    02 DBCLOB-VAR-LENGTH PIC 9(9) COMP.
    02 DBCLOB-VAR-DATA.
       49 FILLER PIC G(32767)
          USAGE DISPLAY-1.
       Repeat above 2 lines 1249 times.
       49 FILLER PIC G(1250).

Table 1: LOB Variable Declarations.

The size limitation for COBOL variables is 32,767 bytes. This is a limit of the COBOL compiler. That is why DB2 generates multiple declarations of 32,767 bytes until it reaches 1M.

Declared in the Program Generated by DB2
01 BLOB-LOC USAGE IS 
   SQL TYPE IS BLOB-LOCATOR.
01 BLOB-LOC PIC S9(9) USAGE IS BINARY.
01 CLOB-LOC USAGE IS
   SQL TYPE IS CLOB-LOCATOR.
01 CLOB-LOC PIC S9(9) USAGE IS BINARY.
01 DBCLOB-LOC USAGE IS 
   SQL TYPE IS DBCLOB-LOCATOR.
01 DBBLOB-LOC PIC S9(9) USAGE IS BINARY.

Table 2: LOB Locator Variable Declarations.

The sizes of the LOBs you can declare and manipulate depend on the limits of the host language and the amount of storage available to your program. LOB host variables can be defined for the C, C++, COBOL, Assembler, PL/I, and Fortran programming languages.

LOB Materialization

When DB2 materializes a LOB it places the LOB value into contiguous storage in a data space. The amount of storage that is used in data spaces for LOB materialization depends on the size of the LOB data and the number of LOBs being materialized.

Because LOBs are usually quite large, LOB materialization should be avoided until it is absolutely required. DB2 will perform LOB materialization under the following circumstances:

      • When a LOB host variable is assigned to a LOB locator host variable in an application program
      • When a program calls a UDF that specifies a LOB as at least one of the arguments
      • When a LOB is moved into or out of a stored procedure
      • When a LOB is converted from one CCSID to another
        By reducing the number of times you take these actions in your programs you can minimize LOB materialization and enhance the performance of applications that access LOB data. You cannot completely eliminate LOB materialization. However, using LOB locators you can minimize its impact on your applications.

LOBs and Locking

A lock that is held on a LOB value is referred to as a LOB lock.

When a row is read or modified in a table containing LOB columns the application will obtain a normal transaction lock on the base table. The locks on the base table also control concurrency for the LOB tablespace. When locks are not acquired on the base table, because of ISOLATION(UR) for example, DB2 maintains data consistency by using locks on the LOB tablespace.

Regardless of the isolation level, for other reasons DB2 also obtains locks on the LOB tablespace and the LOB values stored in that LOB tablespace.

DB2 Extenders

LOB data types (BLOB, CLOB, and DBCLOB) provide an easy way to store large, unstructured data in DB2 databases. But LOBs are nondescript. The only thing you know about them is a general idea of the type of data:

      • A BLOB is binary data
      • A CLOB is character data
      • A DBCLOB is double-byte character data

But DB2 comes equipped with Extenders that can be used to provide additional meaning and functionality to LOBs. DB2 Extenders are available for image, audio, video, and text data. A DB2 Extender provides a distinct type for the LOB, and a set of user-defined functions for use with objects of its distinct type. Additionally, the DB2 Extenders automatically capture and maintain attribute information about the objects being stored. They also provide APIs for your applications to use.

Basically, the DB2 Extenders provide the functionality to make LOBs useful for your applications. With the DB2 Extenders, you could store LOBs, but doing anything very useful with them would be difficult and require a lot of work.

The DB2 Extenders use the MMDBSYS schema for all objects, including UDTs and UDFs. The following UDTs are created by the DB2 Extenders to support image, audio, and video data:

DB2AUDIO A variable-length string containing information needed to access an audio object, also called an ausio handle.
DB2IMAGE A variable-length string containing information needed to access an image object, also called an image handle.
DB2TEXTH A variable-length string containing information needed to access a text document, also called a text handle.
DB2TEXTFH A variable length string containing information required for indexing an external text file, also referred to as a file handle.
DB2VIDEO A variable-length string containing information needed to access a video object, also known as a video handle.

The DB2AUDIO, DB2IMAGE, and DB2VIDEO UDTs are based on a VARCHAR(250) data type.The DB2TEXTH UDT is based on a VARCHAR(60) data type with FOR BIT DATA.

The information in a text handle includes a document ID, the name of the server where the text is to be indexed, the name of the index, information about the text file, and information about the location of the file. File handles are stored in columns that Text Extender creates and associates with each group of external files. The audio, image, and video handles are stored in columns created by each specific extenders for handling that type of data - audio, image, or video.

When enabled, each of the DB2 Extenders - audio, image, text, and video - also creates user-defined functions for use on columns defined as the UDT.

The system administrator must enable the DB2 Extenders you wish to use at your site. When the extender is enabled it creates the UDTs, UDFs, administrative tables, and supporting APIs for the extender. The extenders require the use of WLM (Work Load Manager) application environments for the UDFs and stored procedures that are created. The extenders use stored procedures to process API requests. After the DB2 extenders are installed, you need to establish WLM environments for the extender UDFs and stored procedures.

Before you use the DB2 Extenders, you need to consider the security and authorization issues your will encounter. First, you must decide how to secure access to the actual content of the audio, image, text, and video data. Additionally, the DB2 Extenders create administrative support tables to store additional information about the extenders. Some administrative support tables identify user tables and columns that are enabled for an extender. Other administrative support tables contain attribute information about objects in enabled columns. One example is the QBIC tables created by the Image Extender (QBIC stands for Query By Image Content). You must decide who should have access to the metadata in the administrative support tables.

Secondly, you need to determine how to manage the privileges that are automatically granted when the DB2 Extender is enabled. For example, when a DB2 Extender is enabled, USE privilege is granted to PUBLIC for the UDT, its related CAST functions, and all the UDFs for the extender. This may, or may not, be acceptable in your shop. If you REVOKE the privileges and GRANT them to specific authids, be prepared for the potential headache of administering the list of authorized users of the extender’s functionality.

Your audio, image, and video data can be stored in files external to DB2. In that case you can also control access to the content in external files. This can be achieved using operating system security commands, which are usually performed by a separate security group. By limiting access to the external files you limit the ability to retrieve the objects for the extender’s data type.

The files must be in a file system that is compatible with OS/390 UNIX System Services (USS), for example, a hierarchical file system. OS/390 USS was previously known as MVS Open Edition.

Another consideration is MMDBSYS. All of the DB2 Extenders use the MMDBSYS SQLID. The UDT and all of the UDFs created by the DB2 Extender will be created in the MMDBSYS schema. You should consider creating the MMDBSYS userid to manage the administrative support tables. Use an appropriate external security package (such as ACF2 or RACF) to create an MMDBSYS userid.

Finally, you must consider who can issue administration commands to the db2ext command line processor for DB2 Extenders.

The administrative APIs and administration commands are documented in the DB2 Extenders manuals:

SC26-9947 DB2 Image, Audio, and Video Extenders: Administration and Programming

SC26-9948 DB2 Text Extender: Administration and Programming

Summary

Now you should have a basic understanding of how to create DB2 databases that manage complex, multimedia data types. Good luck exploring the world of multimedia and DB2!

--

Craig Mullins is an independent consultant and president of Mullins Consulting, Inc. Craig has extensive experience in the field of database management having worked as an application developer, a DBA, and an instructor with multiple database management systems including DB2, Sybase, and SQL Server. Craig is also the author of the DB2 Developer’s Guide, the industry-leading book on DB2 for z/OS, and Database Administration: Practices and Procedures, the industry’s only book on heterogeneous DBA procedures. You can contact Craig via his web site at http://www.craigsmullins.com.


Contributors : Craig. S. Mullins
Last modified 2006-01-16 04:41 AM

error in large object

Posted by julianti at 2006-04-25 01:26 AM
i have read your article and try it.
but i found error on it.
i have table like this
create table employee
(
name varchar(10),
photo BLOB (1M)
)

ALTER TABLE employee ADD ROW_ID ROWID GENERATED ALWAYS;
but it is error.
my database call BARU.
what is DSN8710 in your article?
the schema or username?
please help me, i need it.
thank you.
Transaction Management
Reduce downtime and increase repeat sales by improving end-user experience.
Free White Paper
Database Recovery
Feeling the increased demands on data protection and storage requirements?
Download Free Report!
 
 

Powered by Plone