DB2 User-defined Functions

by Craig S. Mullins

This article was adapted from the DB2 Developer's Guide, 4th edition; for more information, go to http://www.craigmullins.com/ddg4e.htm.

As of DB2 Version 6, it is possible to create additional functions to supplement the built-in function supplied with DB2. User-defined functions and types give users the ability effectively to customize DB2 to their shop requirements. The ability to customize is potentially very powerful. Yet, it also can be quite complex and requires detailed knowledge, additional application development skills, and administrative dexterity.

What is a User-Defined Function?

A user-defined function, or UDF for short, is procedural functionality added to DB2 by the user. The UDF, once coded and implemented extends the functionality of DB2 by enabling users to specify the UDF in SQL statements just like built-in SQL functions.
User-defined functions are ideal for organizations wishing to utilize DB2 and SQL to perform specialized, corporate routines performing business logic and data transformation.
Types of User-Defined Functions (UDFs)

There are two ways of creating a user-defined function: you can code your own function program from scratch or you can edit an existing function.

Two types of user-defined functions can be written from scratch: scalar functions and table functions. Recall from Chapter 3 that scalar functions are applied to a column or expression and operate on a single value. Table functions are a different type of function that, when invoked, return an entire table. A table function is specified in the WHERE clause of a SELECT statement taking the place of a table, view, synonym, or alias.

Scalar and table user-defined functions are referred to as external functions, because they are written and developed outside of (or external to) DB2. External UDFs must be written in a host programming language. DB2 user-defined functions can be written in Assembler, C, COBOL, or PL/I.

A third type of user-defined function can be created from another existing function. This is a sourced function. A sourced function is based on a function that already exists--it can be based on a built-in function or another user-defined function that has already been created. A sourced function can be based on an existing scalar or column function.

Why would you ever need to create a sourced UDF? Typically, sourced UDFs are used in conjunction with DISTINCT data types. A DISTINCT data type is basically a user-defined type, or UDT. DB2 enforces strong typing on user-defined data types. Strong typing prohibits non-defined operations between different types. For example, the following operation will not be allowed due to strong typing:

TOTAL = AUSTRALIAN_DOLLAR + EURO

Strong typing, in this case, helps us to avoid an error. Adding two different currencies together, without converting one currency to the other, will always result in nonsense data. Think about it, you can not add a handful of Australian coins with a handful of US coins and come up with anything meaningful (or, perhaps more importantly, spendable).

Now, consider another example where UDFs have been defined to convert currency amounts. If a specific conversion function is defined that accepts US_DOLLAR data types as input, you would not want to accept other currencies as input. Doing so would, most likely, cause the UDF to convert the currency amount incorrectly. For example, consider the UDF USDTOYEN() created as follows:

CREATE FUNCTION USDTOYEN(US_DOLLAR)
RETURNS JAPANESE_YEN . . .

This function accepts a US_DOLLAR amount and converts it to JAPANESE_YEN. Consider the problems that could occur if, instead of a US_DOLLAR input, an AUSTRALIAN_DOLLAR amount was allowed to be specified. Without strong typing, the function would use the conversion routines for US_DOLLAR and arrive at the wrong JAPANESE_YEN amount for the input argument, which was actually specified as an AUSTRALIAN_DOLLAR. With strong typing, the function will reject the request as an error.

To summarize, strong typing ensures that only functions, procedures, comparisons, and assignments that are defined for a data type can be used. So, sourced UDFs can be used to enable existing BIFs and UDFs to work for newly created user-defined DISTINCT types.
User-defined functions are similar in functionality to application subroutines. But user-defined functions are different because they can be used inside of SQL statements. In fact, the only way that user-defined functions can be executed is from within a SQL statement. This gives them great power. A user-defined function is not a substitute for an application subroutine, subprogram, or procedure. Instead, user-defined functions are used to extend the functionality of the SQL language.

The Schema

User-defined functions, user-defined distinct types, stored procedures, and triggers all are associated with a schema. By default, the schema name is the authid of the process that issues the CREATE FUNCTION, CREATE DISTINCT TYPE, CREATE PROCEDURE, or CREATE TRIGGER statement.

A schema therefore is simply a logical grouping of procedural database objects (user-defined functions, user-defined distinct types, stored procedures, and triggers).
You can specify a schema name when you create a user-defined function, type, or trigger. If the schema name is not the same as the SQL authorization ID, then the issuer of the statement must have either SYSADM or SYSCTRL authority, or the authid of the issuing process has the CREATEIN privilege on the schema.

For example, the following statement creates a user-defined function named NEWFUNC in the schema named MYSCHEMA:

CREATE FUNCTION MYSCHEMA.NEWFUNC ...

If the MYSCHEMA component was not included in the CREATE statement, then the schema would default to the authid of the person (or process) that executed the CREATE statement. In short, the schema is set to the owner of the function. If the CREATE statement was embedded in a program, the owner is the authid of the owner of the plan or package; if the statement is dynamically prepared, the owner is the authid in the CURRENT SQLID special register.

Creating User-Defined Functions

Before using DDL to create a user-defined function, the function program should be coded and prepared. This requires the developer to write the program, precompile, compile, and link-edit the program, BIND the DBRM for the program (if the function contains SQL statements), and then test the program to be sure it is free of bugs.

Then, before the user-defined function can be used, it must be registered to DB2 using the CREATE FUNCTION DDL statement. For example, assume that you have written a user-defined function program. Further assume that the program returns the number of days in the month for a given date. The following is a simplified version of the CREATE FUNCTION statement that could be used to register the UDF to DB2:

CREATE FUNCTION DAYSINMONTH(DATE)
  RETURNS INTEGER
  EXTERNAL NAME 'DAYMTH'
   LANGUAGE COBOL;

This statement creates a UDF named DAYSINMONTH, with one parameter of DATE data type, that returns a single value of INTEGER data type. The external name for the function program is DAYMTH and it is coded in COBOL. Note: Most of the parameters have been omitted from this simple CREATE FUNCTION example.

Once the user-defined function has been created, and the appropriate authority has been granted, the UDF can be used in an SQL statement as follows:

SELECT EMPNO, LASTNME, BIRTHDATE, DAYSINMONTH(BIRTHDATE)
FROM DSN8610.EMP
WHERE DAYSINMONTH(BIRTHDATE) < 31;

The results of this statement would be a list of employees whose birth date falls in a month having fewer than 31 days (that is, February, April, June, September, and November). This assumes that the program for the user-defined function, DAYSINMONTH, is correctly coded to examine the date specified as input and return the actual number of days in the month.

How Functions are Executed

User-defined functions run in WLM-managed stored procedure address spaces. To execute a user-defined function, simply reference the function in a SQL statement. The SQL statement can be issued dynamically or statically; as part of an application program or via ad hoc SQL; anywhere SQL can be run, the UDF can be coded.

When a function is invoked in an SQL statement, DB2 must choose the correct function to run to satisfy the request. DB2 will check for candidate functions to satisfy the function request. The manner in which DB2 chooses which function to run is based on the following criteria:

First of all, the schema must match. If the function being invoked is fully qualified, the schema must match for the function to be considered as a candidate for execution. If the function being invoked is not fully qualified, then DB2 will check the SQL path of the invoking process to find a function with a matching schema.
Of course, the name must match the function being invoked for the user-defined function to be considered a candidate for execution.

The number of parameters for the user-defined function must match the number of parameters specified by the invoked function. Additionally, the data type of each parameter must match, or be promotable to, the data types specified for each parameter in the function being invoked. Refer to Table 4.2 for a list of which data types are promotable to other data types. The data types in the first column can be promoted to the data types in the second column. When performing function resolution, the earlier the data type in the second column appears the more preferable it is to the other promotable data types.

To clarify this requirement, consider the following example:

SELECT XSCHEMA.FUNCX(COLA) 
FROM   TABLE;

The data type of COLA is SMALLINT. Furthermore, two user-defined functions have been created named FUNCX, both in the same schema, XSCHEMA. Both FUNCX UDFs require one parameter, but one is defined with an INTEGER data type, the other with a data type of REAL. The SMALLINT data type is promotable to both INTEGER and REAL, but because INTEGER appears first in the promotion list, the FUNCX with the INTEGER parameter will be used instead of the one with the REAL parameter.

The appropriate authority must exist. That is, the invoking authid must have the authority to execute the user-defined function
And, finally, the timestamp of the BIND for the user-defined function must be older than the timestamp of the BIND for the package or plan that invokes the function.
For unqualified UDFs, it is possible that two or more candidate functions will fit equally well. In this case, the user-defined function whose schema name is earliest in the SQL path will be chosen for execution.

For example, suppose functions XSCHEMA.FUNC1 and YSCHEMA2.FUNC1 both fit the function resolution criteria equally well. Both have the same function name, but different schema names. Both also fit the rest of the criteria regarding number of parameters, parameter data types, and requisite authority. If the SQL path is:

"ZSCHEMA"; "YSCHEMA"; "SYSPROC"; "SYSIBM"; "XSCHEMA";

Then DB2 will select function YSCHEMA.FUNC1 because YSCHEMA is before XSCHEMA in the SQL path.

The SQL path is specified to DB2 in one of two ways. The SQL path is determined by the CURRENT PATH special register for dynamically prepared SQL statements. For dynamic SQL, the SQL path can be set by issuing the SET CURRENT PATH statement. The PATH parameter of the BIND and REBIND commands is used to specify the SQL path for SQL containing UDFs in plans and packages.

DB2 supports function overloading. This means that multiple functions can have the same name, and DB2 will decide which one to run based on the parameters. Consider an example where an application developer writes a UDR that overloads that addition operator +. The UDF is created to concatenate text strings together using +. The + function is overloaded because , if it is acting on numbers, it adds them together; but if the function is acting on text, it concatenates the text strings together.

The process of following these steps to determine which function to execute is called function resolution.

When automatic rebind is invoked on a package or plan that contains UDFs, DB2 will not consider any UDF created after the original BIND or REBIND was issued. In other words, only those UDFs that existed at the time of the original BIND or REBIND are considered during function resolution for plans and packages bound as a result of automatic rebind.

DSN_FUNCTION_TABLE and EXPLAIN

You can use EXPLAIN to obtain information about DB2 function resolution. To use EXPLAIN to obtain function resolution information you must create a special table called DSN_FUNCTION_TABLE. When EXPLAIN is executed and UDFs are used, DB2 will store function resolution details for each UDF in the statement, package, or plan, in DSN_FUNCTION_TABLE.

Information will be populated in DSN_FUNCTION_TABLE when you execute an EXPLAIN on an SQL statement that contains one or more UDFs or when you run a program whose plan is bound with EXPLAIN(YES) and the program executes an SQL statement that contains one or more UDFs.

Remember though, you must create a table named DSN_FUNCTION_TABLE before you can use EXPLAIN to obtain function resolution details. A sample create statement for this table follows:

CREATE TABLE userid.DSN_FUNCTION_TABLE
  (QUERYNO        INTEGER      NOT NULL WITH DEFAULT,
   QBLOCKNO       INTEGER      NOT NULL WITH DEFAULT,
   APPLNAME       CHAR(8)      NOT NULL WITH DEFAULT,
   PROGNAME       CHAR(8)      NOT NULL WITH DEFAULT,
   COLLID         CHAR(18)     NOT NULL WITH DEFAULT,
   GROUP_MEMBER   CHAR(8)      NOT NULL WITH DEFAULT,
   EXPLAIN_TIME   TIMESTAMP    NOT NULL WITH DEFAULT,
   SCHEMA_NAME    CHAR(8)      NOT NULL WITH DEFAULT,
   FUNCTION_NAME  CHAR(18)     NOT NULL WITH DEFAULT,
   SPEC_FUNC_NAME CHAR(18)     NOT NULL WITH DEFAULT,
   FUNCTION_TYPE  CHAR(2)      NOT NULL WITH DEFAULT,
   VIEW_CREATOR   CHAR(8)      NOT NULL WITH DEFAULT,
   VIEW_NAME      CHAR(18)     NOT NULL WITH DEFAULT,
   PATH           VARCHAR(254) NOT NULL WITH DEFAULT,
   FUNCTION_TEXT  VARCHAR(254) NOT NULL WITH DEFAULT
  ) IN database.tablespace;

After executing EXPLAIN for a SQL statement that uses a UDF, each of these columns will contain information about the UDF chosen during function resolution. The actual definition of the information contained in the columns of DSN_FUNCTION_TABLE is shown in Table 1.

Column Name	Description
QUERYNO	Indicates an integer value assigned by the user issuing the EXPLAIN, or by DB2. Enables the user to differentiate between EXPLAIN statements.
QBLOCKNO	Indicates an integer value enabling the identification of subselects or a union in a given SQL statement. The first subselect is numbered 1; the second, 2; and so on.
APPLNAME	Contains the plan name for rows inserted as a result of running BIND PLAN specifying EXPLAIN(YES). Contains the package name for rows inserted as a result of running BIND PACKAGE with EXPLAIN(YES). Otherwise, contains blanks for rows inserted as a result of dynamic EXPLAIN statements.
PROGNAME	Contains the name of the program in which the SQL statement is embedded.
COLLID	Contains the collection ID for the package.
GROUP_MEMBER	Indicates the member name of the DB2 that executed EXPLAIN. The column is blank if the DB2 subsystem was not in a data sharing environment when EXPLAIN was executed.
EXPLAIN_TIME	Contains a TIMESTAMP value indicating when the EXPLAIN that created this row was executed.
SCHEMA_NAME	Contains the name of the schema for the invoked function.
FUNCTION_NAME	Contains the name of the UDF to be invoked.
SPEC_FUNC_NAME	Contains the specific name of the UDF to be invoked.
FUNCTION_TYPE	Contains a value indicating the type of function to be invoked: SU Scalar Function TU Table Function
VIEW_CREATOR	If the function is referenced in a CREATE VIEW statement, this column contains the creator name for the view. If not, the column is left blank.
VIEW_NAME	If the function is referenced in a CREATE VIEW statement, this column contains the name for the view. If not, the column is left blank.
PATH	Contains the value of the SQL path at the time DB2 performed function resolution for this statement.
FUNCTION_TEXT	Contains the first 100 bytes of the actual text used to invoke the UDF, including the function name and all parameters.

Table 1: DSN_FUNCTION_TABLE columns.

Table Functions

Table functions are different in nature than scalar fuctions. A table function is designed to return multiple columns and rows. Its output is a table. An example using a table function follows:

SELECT WINNER, WINNER_SCORE, LOSER, LOSER_SCORE
FROM   FOOTBALL_RESULTS(5)
WHERE  LOSER_SCORE = 0;

In this case, the table function FOOTBALL_RESULTS() is used to return the win/loss statistics for football games. The table function can be used in SQL statements just like a regular DB2 table. The function program is designed to fill the rows and columns of the "table." The input parameter is an INTEGER value corresponding to the week the game was played; if 0 is entered, all weeks are considered. The query above would return all results where the losing team was shut out (had 0 points) during the fifth week of the season.

The following or similar CREATE FUNCTION statement could be used to define the FOOTBALL_RESULTS() function:

CREATE FUNCTION FOOTBALL_RESULTS(INTEGER)
RETURNS TABLE (WEEK INTEGER,
                WINNER CHAR(20),
                WINNER_SCORE INTEGER,
                LOSER CHAR(20),
                LOSER_SCORE INTEGER)
EXTERNAL NAME FOOTBALL
   LANGUAGE C
PARAMETER STYLE DB2SQL
   NO SQL
   DETERMINISTIC
NO EXTERNAL ACTION
   FENCED
   SCRATCHPAD
   FINAL CALL
   DISALLOW PARALLEL
   CARDINALITY 300;

The key parameter is the RETURNS TABLE parameter which is used to define the columns of the table function. The function program must create these rows itself or from another data source such as a flat file.

The value supplied for the CARDINALITY parameter is only an estimate. It is provided to help DB2 optimize statements using the table function. It is possible to return more or fewer rows than is specified in CARDINALITY.

Sourced Functions

Sourced functions are created from already existing built-in (scalar and column) and user-defined (scalar) functions. The primary reason to create a sourced function is to enable functions for user-defined distinct data types. This is required because DB2 implements strong typing.

More information on sourced functions and strong typing is provided later in this chapter in the section on user-defined data types. For now, though, here is an example of creating a sourced UDF:

CREATE FUNCTION FINDWORD (DOCUMENT, VARCHAR(50))
  RETURNS INTEGER
  SPECIFIC FINDWORDDOC
SOURCE SPECIFIC FINDWORDCLOB;

In this example a new function, FINDWORD, is created from an exisitng function FINDWORDCLOB. The function finds the location of the supplied word (expressed as a VARCHAR(50) value) in the supplied DOCUMENT. The function returns an INTEGER indicating the location of the word in the DOCUMENT. DOCUMENT is a user-defined type based on a CLOB data type.

Some Guidelines on Using UDFs

External UDF Program Restrictions: When you develop programs for external user-defined functions, DB2 places certain restrictions on the type of services and functions that can be used. Keep the following restrictions in mind as you code your external UDF programs:

COMMIT and ROLLBACK statements cannot be issued in a user-defined function. The UDF is part of the unit of work of the issuing SQL statement.
RRSAF calls can not be used in user-defined functions. DB2 uses the RRSAF as its interface to user-defined functions. Therefore, any RRSAF calls made within the UDF code will be rejected.
If your user-defined function does not specify either the EXTERNAL ACTION or SCRATCHPAD parameter, the UDF may not execute under the same task each time it is invoked.
All open cursors in user-defined scalar functions must be closed before the function completes or DB2 will return an SQL error.
The host language that is used to write UDF programs can impose restrictions on UDF development, as well. Each programming language has its own restrictions and limits on the number of parameters that can be passed to a routine in that language. Be sure to read the programming guide for the language being used (before you begin coding) to determine the number of parameters allowed.
Finally, the limitation on the number of parameters for the programming language to be used can impact table UDFs because table functions often require a large number of parameters (that is, at least one output parameter for every column of the table).

Keep It Simple: Each user-defined function program should be coded to perform one and only one task. The UDF program should be as simple as possible while still performing the desired task. Do not create overly complex UDF programs that perform multiple tasks based upon the input. IT is far better to have multiple UDFs, each performing one simple task, than to have a single, very complex UDF that performs multiple tasks. The UDF program will be easier to code, debug, understand, and maintain when it needs to be modified.

Promote UDF Reusability: User-defined functions should be developed with reusability in mind. Once the UDF has been coded and registered to DB2, it can be shared by multiple applications. It is wise to code your UDFs such that they perform simple, useful tasks that can be used by many applications at your site.

Reusing UDFs in multiple applications is better than creating multiple UDFs having the same (or similar) functionality for each application. Of course, you should promote reusability while at the same time keeping the UDF code as simple as possible.

Handle UDF Abends: When an external UDF abends, the invoking statement in the calling program receives an error code, namely SQLCODE -430. The unit of work containing the invoking statement must be rolled back. The calling program should check for the -430 SQLCODE and issue a ROLLBACK when it is received.

Invoke UDFs Using Qualified Names: Use the qualified name of a function in the invoking SQL statement. By doing so, you simplify function resolution. DB2 will only search for functions in the specific schema you code. Therefore, DB2 is more likely to choose the function you intend, and the function resolution process will take less time to complete because fewer functions will qualify as candidates.

CAST Parameters to the Right Data Type: Use the CAST function to cast the parameters of the invoked UDF to the data types specified in the user-defined function definition. This assists the function resolution process to choose the correct function for execution.

For example, consider a sample UDF named TAXAMT. It requires one input parameter which is defined as DECIMAL(9,2). If you desire to pass a column defined as INTEGER to the UDF, use the CAST function as follows to cast the value of the integer column to a DECIMAL(9,2) value:

SELECT TAXAMT(CAST (INT_COL AS DECIMAL(9,2)))
FROM TABLE;

Define UDF Parameter Data Types Efficiently: Avoid defining UDF parameters using the following data types: CHAR, GRAPHIC, SMALLINT, and REAL. Instead use VARCHAR, VARGRAPHIC, INTEGER, and DOUBLE respectively. These data types should be avoided because you must use only those data types when invoking a UDF defined with parameters of those data types must use parameters of CHAR, GRAPHIC, SMALLINT, or REAL.

To clarify this guidelines, consider a UDF named FUNCX that is defined with a parameter of data type SMALLINT. To invoke this UDF, the parameter must be of data type SMALLINT. Using a data type of INTEGER will not suffice. For example, the following statement will not resolve to FUNCX because the constant 500 is of type INTEGER, not SMALLINT:

SELECT FUNCX(500)
FROM   TABLE;

The same line of thinking applies to CHAR, GRAPHIC, and REAL data types. Of course, you could use the CAST function as described above to resolve this problem. But it is better to avoid the problem altogether by specifying VARCHAR, VARGRAPHIC, INTEGER, and DOUBLE as parameter data types instead.

Use The Sample UDFs: IBM provides quite a few sample programs for user-defined functions. Examine these samples for examples of how to implement effective DB2 user-defined functions. There are sample function programs for

[lb] converting date and time formats

[lb] returning the name of the day or month for a specific date

[lb] formatting floating point data as a currency value

[lb] returning DB2 catalog information for DB2 objects

[lb] returning a table of weather data.

These functions can be used as samples to learn how to code function programs for your specific needs and requirements.

SQL Usage Options Within External UDFs: There are four options for external functions regarding their usage of SQL. These are:

NO SQL indicates that the function can not execute SQL statements. However, non-executable SQL statements, such as DECLARE CURSOR, are not restricted.

MODIFIES SQL DATA indicates that the function can execute any legal SQL statement that can be issued by a UDF.

READS SQL DATA indicates that the function can execute SQL statements that access data, but can not modify data (this is the default SQL usage option for UDFs).

CONTAINS SQL indicates that the function can execute SQL statements so long as data is neither read nor modified, and the SQL statement is legal for issuance by a UDF.

Table 2 indicates which SQL statements are valid for each type of SQL usage described above.

Table 2. Using SQL Within User-Defined Functions

SQL Statement	NO SQ	CONTAINS SQL	READS SQL	MODIFIES SQL
ALLOCATE CURSOR	N	N	Y	Y
ALTER	N	N	N	Y
ASSOCIATE LOCATORS	N	N	Y	Y
BEGIN DECLARE SECTION	Y	Y	Y	Y
CALL	N	Y	Y	Y
CLOSE	N	N	Y	Y
COMMENT ON	N	N	N	Y
COMMIT	N	N	N	N
CONNECT	N	N	N	N
CREATE	N	N	N	Y
DECLARE CURSOR	Y	Y	Y	Y
DECLARE GLOBAL TEMPORARY TABLE	N	Y	Y	Y
DECLARE STATEMENT	Y	Y	Y	Y
DECLARE TABLE	Y	Y	Y	Y
DELETE	N	N	N	Y
DESCRIBE	N	N	Y	Y
DESCRIBE CURSOR	N	N	Y	Y
DESCRIBE INPUT	N	N	Y	Y
DESCRIBE PROCEDURE	N	N	Y	Y
DROP	N	N	N	Y
END DECLARE SECTION	Y	Y	Y	Y
EXECUTE	N	Y	Y	Y
EXECUTE IMMEDIATE	N	Y	Y	Y
EXPLAIN	N	N	N	Y
FETCH	N	N	Y	Y
FREE LOCATOR	N	Y	Y	Y
GRANT	N	N	N	Y
HOLD LOCATOR	N	Y	Y	Y
INCLUDE	Y	Y	Y	Y
INSERT	N	N	N	Y
LABEL ON	N	N	N	Y
LOCK TABLE	N	Y	Y	Y
OPEN	N	N	Y	Y
PREPARE	N	Y	Y	Y
RELEASE	N	N	N	N
RENAME	N	N	N	Y
REVOKE	N	N	N	Y
ROLLBACK	N	N	N	N
SELECT	N	N	Y	Y
SELECT INTO	N	N	Y	Y
SET	N	Y	Y	Y
SET CONNECTION	N	N	N	N
SIGNAL SQLSTATE	N	Y	Y	Y
UPDATE	N	N	N	Y
VALUES	N	N	Y	Y
VALUES INTO	N	Y	Y	Y
WHENEVER	Y	Y	Y	Y

When to DISALLOW PARALLEL Operations: A table function can not operate in parallel, so the DISABLE PARALLEL parameter should be specified when issuing a CREATE FUNCTION statement for a table UDF.

Some functions that are NOT DETERMINISTIC can receive incorrect results if the function is executed by parallel tasks. Specify the DISALLOW PARALLEL option for these functions.
Likewise, some functions that rely upon a SCRATCHPAD to store data between UDF invocations might not function correctly in parallel. Specify the DISALLOW PARALLEL option for these functions, too.

DETERMINISTIC Versus NOT DETERMINISTIC: Be sure to specify accurately whether the UDF will always return the same result for identical input arguments. If the UDF always returns the same result for identical input arguments, then the UDF is DETERMINISTIC. If not, the UDF should be identified as NOT DETERMINISTIC. Any UDF that relies on external data sources that can change should be specified as NOT DETERMINISTIC. Other examples of functions that are not deterministic include any UDF that contains SQL SELECT, INSERT, UPDATE, or DELETE statements or a random number generator.

DB2 uses the [NOT] DETERMINISTIC parameter to optimize view processing for SQL SELECT, INSERT, UPDATE, or DELETE statements that refer to the UDF. If the UDF is NOT DETERMINISTIC, view merge will be disabled when the UDF is specified.

Choose UDF SECURITY Wisely: The SECURITY parameter indicates how the UDF will interact with an external security product, such as ACF2 or RACF. If SECURITY DB2 is specified, then the UDF does not require an external security environment. This is the default value for SECURITY. If the UDF accesses resources protected by an external security product, the access is performed using the authid that is associated with the WLM-established stored procedure address space.

If SECURITY USER is specified, then an external security environment should be established for the function. If the function accesses resources that the external security product protects, the access is performed using the primary authid of the process that invoked the UDF.

The third and final option for SECURITY is DEFINER. If this option is chosen and the UDF accesses resources protected by an external security product, then the access is performed using the primary authid of the owner of the UDF.

Handling Null Input Arguments: There are two option for handling null input arguments in user-defined functions: RETURNS NULL ON NULL INPUT and CALLED ON NULL INPUT. If nulls are to be allowed to be specified as input to a UDF, the UDF must be programmed to test for and handle null inputs.

If RETURNS NULL ON INPUT is specified when the UDF is created, the function is not called if any of the input arguments is null. The result of the function call is null.

If CALLED ON NULL INPUT is specified when the UDF is created, the function is called whether any input arguments are null or not. In this case, the UDF must test for null input arguments in the function program.

UDF Scratchpads: The [NO] SCRATCHPAD clause should be specified to indicate whether DB2 provides a scratchpad for the UDF to utilize. In general, external UDFs should be coded as reentrant and a scratchpad can help to store data between invocations of the UDF. A scratchpad provides a storage area for the UDF to use from one invocation to the next.

If a scratchpad is specified, a length should be provided. The length can be from 1 to 32767; the default is 100 if no length is specified.

The first time the UDF is invoked, DB2 allocates memory for the scratchpad and initializes it to contain all binary zeroes. The scope of a scratchpad is a single SQL statement. A separate scratchpad is allocated for each reference to the UDF in the SQL statement. So, if the UDF is specified once in the SELECT-list and once in the WHERE clause, two scratchpads would be allocated. Furthermore, if the UDF is run in parallel, one scratchpad is allocated for each parallel task.

Take care when SCRATCHPAD with ALLOW PARALLEL because results can be difficult to predict. Consider, for example, a UDF that uses the scratchpad to count the number of times it is invoked. The count would be thrown off if run in parallel because the count would be for the parallel task, not the UDF. For this reason, be sure to specify DISALLOW PARALLEL for UDFs that will not operate in parallel.

If the UDF acquires system resources, be sure to specify the FINAL CALL clause to make sure that DB2 calls the UDF one last time so the UDF can free the system resources it acquired.

Summary

User-defined functions can be used to customize DB2 to handle your specific business requirements. Using UDFs you can accomplish very complex and intricate tasks using nothing but SQL - SQL that calls your UDFs.

Craig Mullins is an independent consultant and president of Mullins Consulting, Inc. Craig has extensive experience in the field of database management having worked as an application developer, a DBA, and an instructor with multiple database management systems including DB2, Sybase, and SQL Server. Craig is also the author of the DB2 Developer’s Guide, the industry-leading book on DB2 for z/OS, and Database Administration: Practices and Procedures, the industry’s only book on heterogeneous DBA procedures. You can contact Craig via his web site at http://www.craigsmullins.com.

Contributors : Craig S. Mullins
Last modified 2006-01-16 04:31 AM

DBAzine.com

Sections

Personal tools

Menu

Who Are You?