Skip to content

DBAzine.com

Sections
Personal tools
You are here: Home » Oracle » Oracle Articles Archive » External Tables in Oracle9i
Seeking new owner for this high-traffic DBAzine.com site.
Tap into the potential of this DBA community to expand your business! Interested? Contact us today.
Who Are You?
I am a:
Mainframe True Believer
Distributed Fast-tracker

[ Results | Polls ]
Votes : 3549
 

External Tables in Oracle9i

by Dave Moore

Here's a step-by-step example of creating an external table and querying the data source from within Oracle along with a discussion of practical applications for external tables, performance and management issues.

ORACLE9i has many new features, and one of my favorites is the ability to create external tables. An external table is a table whose structure is defined inside the database even though its data resides externally as one or more files in the operating system (see Figure 1). External tables are very similar to regular tables in Oracle, except the data isn't stored in Oracle datafiles and isn't managed by the database.

Figure 1: External table structure in Oracle.

Example

This example begins with product information listed in a Microsoft Excel spreadsheet (see Figure 2). The data is saved in comma-separated values (CSV) format to D:\products\products.csv. The spreadsheet contains three columns: Product Number, Description, and Price. This file contains the data that we'll query from Oracle.

Figure 2: Product data in Excel.

After saving the file from Excel, the next task is to create a DIRECTORY object in Oracle that points to the physical operating system directory that contains the file. This DIRECTORY is required in order to create the external table.

SQL> CREATE DIRECTORY PRODUCT_DIR AS 'd:\products';
Directory created.

Now the external table is created by using the CREATE TABLE command:

create table products (
product_no number,
description varchar2(100),
price varchar2(20)
)
organization EXTERNAL (
type oracle_loader
default directory PRODUCT_DIR
access parameters
( records delimited by newline
badfile 'products.bad'
logfile 'products.log'
fields terminated by ','
)
location ('products.csv')
)
reject limit unlimited
/

The first part of the CREATE TABLE statement holds no surprises. Notice, however, that the next part of the statement specifies ORGANIZATION EXTERNAL, which indicates that this table is an external table. This part of the statement also specifies a type of oracle_loader-the only one currently supported by Oracle. Oracle_loader is actually an oracle TYPE object defined in the database to handle the processing. Also notice that the directory object is part of the CREATE TABLE statement; it tells Oracle where to find the files.

The next part of the statement specifies the access parameters, which should look familiar to anyone who's experienced with SQL*Loader:

      • records delimited by specifies the characters that will be used to separate rows.
      • badfile specifies the file that Oracle will use to store the rejected rows.
      • logfile specifies the file that Oracle will use to store log information. Documentation of any errors will be provided in this file.
      • fields terminated by specifies the field separator that will distinguish one column from another during the load.

Finally, the location and reject limit are specified:

      • location provides the name of the actual file to access. If Oracle needs to access multiple files, they can be specified as follows:
location ('file1.dat', 'file2.dat')
      • reject limit specifies the number of rows that can be rejected before the command returns an error. If this threshold is reached, the following error appears when trying to access the table:
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEFETCH callout
ORA-30653: reject limit reached
ORA-06512: at "SYS.ORACLE_LOADER", line 14
ORA-06512: at line 1

The DDL for creating the statement will run even if the file doesn't exist in the system, which can produce mixed results. On the one hand, you won't know whether the table was successfully created until a statement is executed against the table, which in a data-warehousing environment might be at 3:00 a.m. Conversely, the file doesn't have to exist at the time the table is created. In fact, the file can come and go as needed, which is quite customary in OLAP environments.

The external table is now created. However, if another user tries to access the table at this point, that user will receive an error:

SQL> select count(*) from dave.products;
select count(*) from dave.products
                          *
ERROR at line 1:
ORA-06564: object PRODUCT_DIR does not exist

To prevent this error, you must grant read and write access on the directory for any user who wants to select data from the table. Granting SELECT on the table itself will allow the object to be seen, but you must also grant access to the underlying directory object.

grant read, write on directory products_dir to alex; 

Listing 1: Querying the table.

SQL> select product_no, substr(description,1,40) "Desc", Price from products;

PRODUCT_NO Desc PRICE
---------- -------------------------------------- ---------------
12300 Robin Yount Autographed Baseball $29.99
12301 George Brett Autographed Baseball $19.99
12302 Dale Murphy Autographed Baseball $19.99
12303 Paul Molitor Autographed Baseball $19.99
12304 Nolan Ryan Autographed Baseball $19.99
12305 Craig Biggio Autographed Baseball $19.99
12306 Jeff Bagwell Autographed Baseball $19.99
12307 Barry Bonds Autographed Baseball $19.99
12308 Mark McGuire Autographed Baseball $19.99
12309 Sammy Sosa Autographed Baseball $19.99
12310 Jeff Kent Autographed Baseball $19.99
12311 Roger Clemens Autographed Baseball $19.99
12312 Goose Gossage Autographed Baseball $19.99
12313 Derek Jeter Autographed Baseball $19.99

14 rows selected.

Read/write access means that Oracle will be allowed to write to that directory when it needs to update the logfile or badfile. As an OS user, you don't have access to those files in the operating system unless your ID has proper privileges; as a result, security isn't compromised.

After creating the external table and granting privileges, the table can be queried like any other table (see Listing 1).

The external table can be used as a substitute for SQL*Loader and a regular table can be used to hold its data:

INSERT INTO PROD.PRODUCTS AS SELECT * from DAVE.PRODUCTS;

The data that was in Excel is loaded into Oracle, which allows it to be backed up and to perform better than an external table.

Limitations

External tables in Oracle9i have the following limitations:

      • They're read-only, so no data manipulation language (DML) operations (such as Insert, Update, or Delete) can be performed against them. Also, no indexes can be defined on the table. Oracle does plan to support writing to these tables in a future release.
      • They don't support files larger than 2GB. If you attempt to access a file larger than 2GB, Oracle fails with the following error:
KUP - 04039: unexpected error while trying to find file
<file name> in director <directory name>
      • Certain commands against the table, such as ANALYZE, will fail.
SQL> analyze table products compute statistics;
analyze table products compute statistics

*
ERROR at line 1:
ORA-30657: operation not supported on external organized
Table

This limitation is important because most DBAs have scripts that regularly refresh object statistics based on a schema. If you try to generate statistics on an external table, the command will fail.

      • The data in external tables isn't backed up as part of regular Oracle backup routines because it's outside the scope of the database.

Performance

One expects the Oracle kernel to incur more overhead when processing external tables. An Oracle TYPE and TYPE BODY named SYS.ORACLE_LOADER exist in the database and process all statements accessing external tables. This process increases the overhead to access the data, and when compared to a regular table is many times slower. Oracle must fetch and perform tasks that it normally doesn't perform (such as conversions, handling rejections, and logging) and is therefore significantly slower. I experimented with the performance of external tables by creating an internal table with the exact data as the external one:

SQL> create table products_internal as select * from
products;

Table created.

The table contained 5,292 rows, with the same data as in the spreadsheet. The internal table didn't have any indexes or primary keys defined. Based on the script shown in Listing 2, the internal table was consistently 8-10 times faster to access than the external one. Optimally, external tables should be used as a means to load data into internal tables and shouldn't be queried as an external data source.

Listing 2: Access to the internal table is significantly faster than to the external table.

set term off
col a new_value start
select dbms_utility.get_time() a from dual;
select count(*) from products_internal where product_no = 12313;
col b new_value stop
select dbms_utility.get_time() b from dual;
col c new_value answer
select (&stop - &start) c from dual;
col d new_value start_ext
select dbms_utility.get_time() d from dual;
select count(*) from products where product_no = 12313;
col e new_value stop_ext
select dbms_utility.get_time() e from dual;
col f new_value answer_ext
select (&stop_ext - &start_ext) f from dual;
col ans form 999
col ans_ext form 999
set term on
prompt
prompt
select 'Internal Table Execution Time in ms ', &answer ans
from dual;
select 'External Table Execution Time in ms ', &answer_ext ans_ext
from dual;

By taking the following actions, you can minimize the overhead used when processing an external table:

      • Use the PARALLEL clause when you create the table. This value indicates the number of access drivers that will be started to process the datafiles and will divide the files into portions that can be processed separately.
      • Use datatypes in Oracle that will match the physical data attributes, which will eliminate costly data conversion.
      • Use fixed values when possible, including:
          • Fixed-width character sets
          • Fixed-length fields
          • Fixed-length records

The RECORDS FIXED clause is listed under access parameters and requires the definition of fields. In the following example, the data line is 40 bytes long, plus one byte for the new line. The field names must be the same as the column names to which they correspond.

RECORDS FIXED 41
FIELDS
(
emp_first_name char(20)
emp_last_name char(20)
)
      • Use single-character delimiters, and use the same character sets as used in the database.
      • Minimize rejections since Oracle performs more I/O for each one.

Practical Applications

External tables have many different practical applications, which I'll place into two categories: business processing and database administration.

From the business-processing standpoint, external tables serve a vital need in a data-warehousing environment, in which Extract, Transform, and Load processes are common. External tables make it unnecessary for users to create temporary tables during these processes, thereby reducing required space and the risk of failed jobs. External tables can be used instead of temporary tables and utilities like SQL*Loader. They also provide an easy way for companies to load different information sources into Oracle-whether in Excel, ACT!, or Access, information can be loaded and processed.

From the database administration view, I'm most interested in features that help me do my job. I want to monitor those files that I look at frequently-alert.log and init.ora-without leaving a SQL> prompt. Then I can use SQL commands to query the file and specify WHERE clauses for more sophisticated processing. An example of creating an external table to point to the alert log is as follows:

create directory BDUMP AS 'd:\oracle9i\admin\PROD\bdump';
create table alert_log (text varchar2(200))
organization EXTERNAL (
type oracle_loader
default directory BDUMP
access parameters
( records delimited by newline
badfile 'dave.bad'
logfile 'dave.log'
fields terminated by ' '
)
location ('PRODALRT.LOG')
)
reject limit unlimited;

Listing 3: The DBA_EXTERNAL_LOCATIONS view.

SQL> desc dba_external_locations;
Name Null? Type
----------------------------------------- -------- ----------------
OWNER NOT NULL VARCHAR2(30)
TABLE_NAME NOT NULL VARCHAR2(30)
LOCATION VARCHAR2(4000)
DIRECTORY_OWNER CHAR(3)
DIRECTORY_NAME VARCHAR2(30)

Database Administration

It's important to know what views in Oracle contain the information pertaining to external tables. The view DBA_TABLES shows external tables and has a value of 0 for PCT_FREE, PCT_USED, INI_TRANS, and MAX_TRANS. All other storage columns in the view are null. Scripts that use this view to determine problems should be updated to access DBA_EXTERNAL_TABLES. This view contains all of the parameters that you specified when you created the external table.

Another useful view is DBA_EXTERNAL_ LOCATIONS, which provides a quick way to see which files are accessed from the database (see Listing 3).

---

Dave Moore is a product architect at BMC Software in Austin, TX. He's also a Java and PL/SQL developer, Oracle DBA and author of Oracle Utilities by Rampant Tech Press.


Contributors : Dave Moore
Last modified 2005-04-18 02:59 PM

Need Help

Posted by Ravgopal at 2006-10-17 05:11 PM
Hi
I need an help. I am trying to load external table from a DAT file. The file and table contains only one number field.

The data is just pure numbers like
2374
2375
2376
2377
2378

When I try to query the external table I am getting the following error

LOG file opened at 10/17/06 16:52:22

Field Definitions for table DBKEY_EXTERNAL
Record format DELIMITED BY NEWLINE
Data in file has same endianness as the platform
Rows with all null fields are accepted

Fields in Data Source:

DBKEY CHAR (255)
Terminated by ","
Trim whitespace same as SQL Loader
error processing column DBKEY in row 1 for datafile /landingpad/ndt2stage/MMADHOC_DBKEYS.DAT
ORA-01722: invalid number

The table definition is :

create table DBKEY_EXTERNAL
( DBKey number
)
organization external
(type oracle_loader default directory NDT2STAGE
access parameters
( records delimited by newline
fields terminated by ','
)
location ('MMADHOC_DBKEYS.DAT')
)
REJECT LIMIT UNLIMITED;

Why does the loader reads the number as Char(255)? The table is having number data type, when I query the table I am getting this error. I would appreciate any help.

Thanks
RG
Transaction Management
Reduce downtime and increase repeat sales by improving end-user experience.
Free White Paper
Database Recovery
Feeling the increased demands on data protection and storage requirements?
Download Free Report!
 
 

Powered by Plone