Oracle Dynamic Grid Computing

by Donald K. Burleson

On the OracleWorld Web page, we see the announcement that the next generation of Oracle will be unveiled by Larry Ellison on Tuesday, September 9th, from 2:00 to 3:00 p.m. It states:

It promises not only to change the way you run your data center, but to change the way you think about the data center itself. It adapts to your changing business needs so that you can spend more time thinking about how to run your business, knowing that your infrastructure will respond with the reliable, secure performance your applications need. It represents a significant rethinking of the traditional role of software infrastructure in areas such as system performance, clustering and storage. The software is the first infrastructure designed for Grid computing.

Oracle Corporation has been promoting Grid computing for the past year, and there are several great sources of information about Oracle’s vision of Grid computing:

Overview of Oracle Grid Technology — OTN
“Oracle and the Grid” whitepaper
Benny Souder on Oracle Grid — Oracle Magazine

Grid computing is the on-demand sharing of computing resources within a tightly-coupled network. For those of you who are old enough to remember data processing in the 1980s, the IBM mainframes are a primitive example of Grid computing. Mainframes had several independent CPUs, and the MVS/ESA operating system allocated work to the processors based on least-recently-used algorithms and customized task dispatching priorities. Of course, RAM and disk resources were available to all programs executing on the huge, monolithic server.

However, Grid computing is fundamentally different from mainframes. In the 21st century, Grid computing performs a “virtualization” of distributed computing resources and allows for the automated allocation of resources as system demand changes. Each server is independent, yet ready to participate in a variety of processing requests from many types of applications.

Grid computing also employs special software infrastructure (for Oracle, using Oracle*Streams) to monitor resource usage and allocate requests to the most appropriate resource. This enables a distributed enterprise to function as if it were a single supercomputer (refer to Figure 1).

Figure 1: Server virtualization technology.

Remember, Grid computing is all about reducing costs while ensuring acceptable performance and the ability to scale as demand grows.

The History of Grid Computing

The idea of Grid computing arose from the need to solve highly parallel computational problems that were beyond the processing capability of any single computer. These scientific computing problems can be divided into independent pieces, which makes them ideal for splitting out into subprograms because a subprogram does not need to communicate with the master program while it runs. The main program sends a subprogram to a remote computer in the network, where it executes independently of the master. Upon completion, the subprograms deliver their results back to the main program.
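
The farm-out pattern described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `analyze_chunk` and `run_master` are hypothetical, the "work" is a simple sum of squares, and local threads stand in for remote computers.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_chunk(chunk):
    """Stand-in for a subprogram: it sees only its own chunk of data."""
    return sum(value * value for value in chunk)


def run_master(data, chunk_size, workers=4):
    """Split the data, farm out each chunk, then combine the results."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # Threads stand in for remote computers (an assumption of this sketch);
    # no chunk needs to talk to another chunk or to the master while running.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(analyze_chunk, chunks))
    # Only the finished results come back to the master program.
    return sum(partial_results)


total = run_master(list(range(1, 11)), chunk_size=3)
print(total)
```

Because each chunk is processed without reference to any other, the number of workers can change without changing the result, which is the property that makes this class of problem suitable for a Grid.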

The SETI@home project, part of the Search for Extraterrestrial Intelligence (SETI), is the best-known example of distributed Grid computing. Participants register their PCs with SETI@home, and the software uses spare computing cycles from thousands of PCs across the globe to solve complex analytical problems.

Why Grid Computing?

Let’s take a quick look at some of the problems that led to the development of Grid technology. The trend of the 1990s was toward small, cheap, dedicated servers, and Oracle shops replaced a single mainframe with hundreds of UNIX and MS-Windows servers. However, IT shops soon realized that the high expense of maintaining multiple servers, plus the requirement to over-allocate hardware, was eroding these savings. The problems of using dedicated servers include:

1. Expense: In large enterprise data centers, hardware resources are deliberately over-allocated to accommodate processing-load peaks.

2. Wastefulness: Because each application resides on a single server, there is significant duplication of work, and a sub-optimal utilization of RAM and CPU resources.

3. High Maintenance: In many large shops, a “shuffle” occurs when an application outgrows its server. A new server is purchased, and the database is moved to the new server. Then the old server becomes available, and another database is migrated onto the old server. This shuffling of databases between servers is a huge headache for the administrators who are kept busy after hours moving databases to new server platforms.

In Figure 2, we see how each dedicated Oracle server must be over-allocated for RAM to meet system demand. As we know, an increase in volume requires additional PGA RAM for each connected Oracle user, and the savvy DBA will allocate RAM to the high-water mark of RAM usage.

Figure 2: Over-allocation of RAM resources.

We see the same type of over-allocation for CPU resources. To ensure acceptable response time at peak usage, the Oracle DBA will assign additional CPU resources to each server (refer to Figure 3).

Figure 3: Over-allocation of CPU resources.
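
The cost of sizing every dedicated server to its own high-water mark can be made concrete with a small calculation. The sketch below is illustrative only: the server names and the hourly RAM figures are invented, and a real Grid would not pool resources perfectly.

```python
# Hypothetical hourly RAM demand (GB) for three dedicated servers.
demand_gb = {
    "orders_db": [4, 6, 12, 5],
    "payroll_db": [10, 3, 2, 4],
    "web_db": [3, 9, 4, 11],
}


def dedicated_capacity(demand):
    """Each server is sized to its own high-water mark."""
    return sum(max(series) for series in demand.values())


def pooled_capacity(demand):
    """A shared pool is sized to the peak of the combined demand."""
    combined = [sum(hour) for hour in zip(*demand.values())]
    return max(combined)


dedicated = dedicated_capacity(demand_gb)
pooled = pooled_capacity(demand_gb)
print(dedicated, pooled, dedicated - pooled)
```

With these made-up figures the dedicated servers need 33 GB in total, while a shared pool needs only 20 GB, because the three peaks do not occur in the same hour. The same reasoning applies to CPU.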

The Gartner Group predicts that worldwide server blade shipments will increase to more than one million over the next three years and that server blade technology is one of the few areas within the server market that is experiencing significant growth.

All of the major hardware vendors are now offering server blade hardware, at far lower costs than traditional UNIX servers:

IBM: IBM's Grid offering uses their self-managing server initiative, called eLiza. eLiza claims to place all resource management software into a coordinated package, affording the boxes some measure of "self-healing" capability. eLiza will use Web services to request additional computing resources. The IBM blade server hardware uses their 64-bit pSeries chips.
Sun Microsystems: Sun also employs low-cost, low-power servers, using an UltraSPARC 650-MHz CPU with up to two gigabytes of RAM.
Compaq: Compaq is also entering the server blade market with a low-cost blade setup that costs only $17k for a rack of 10 blade servers.
Dell: Dell’s PowerEdge 1655MC product line allows a single rack to hold 84 servers, each equipped with up to two hard drives and dual Intel Pentium III processors. The individual blade servers will start at $1,499 each and include a single 1.26 GHz Pentium III and an 18 GB hard drive. A fully populated enclosure, which includes six servers, starts at $10,973.

Blades and Independent Processing

It is critical to note that blade servers are good for programs that do not require the symmetric multiprocessing (SMP) capabilities of large mid-range servers. For example, a blade server would not be appropriate for a RAC node that performs parallel query operations, because blade servers are normally single-CPU machines and Oracle Parallel Query (OPQ) may require up to 32 CPUs for fast full-table scans of large tables. For an OLTP Oracle9i RAC system, blade servers are perfect because individual queries do not require multiple CPUs.

Blade servers are ideal for Oracle9iAS Web cache servers or Oracle HTTP servers because a new server can easily be added into the Oracle9iAS server farm. In Oracle9iAS, we can use a rack of blade servers and pre-install the Oracle Web Cache and Oracle HTTP Server (OHS) software. At runtime, the Oracle9iAS administrator can add these server blades to the Oracle9iAS farm, using each blade as either a Web cache server or an HTTP server, depending upon the stress on the system.

Figure 4: Using blade servers with Oracle9iAS.

It is important to understand that Grid computing is not the on-demand allocation of computing resources to an existing server. Rather, Grid computing is the use of software tools to farm out processing tasks to independent servers or to add servers to an existing cluster or farm. This allows the enterprise to share computing, storage, data, programs, and other resources in a dynamic fashion.

How Grid Works with Oracle

Oracle Grid computing provides on-demand computing resources in two ways. These involve direct hardware allocation via blades and the allocation of remote servers to execute Oracle tasks:

Add processing power to existing systems: Server blades can be allocated to Oracle9i RAC clusters.
Server banks: Oracle servers can be established with no local data but with copies of all application stored procedures. This would allow for three types of distributed execution:

Remote execution: Entire programs, along with their data, are shipped to the remote server for execution. Using Oracle*Streams, the results are sent back to the initiating instance.
Remote data extracts: Distributed SQL can execute from a bank of servers that have instances but no local Oracle data.
Remote process execution: Oracle applications can call PL/SQL stored procedures on the least-loaded servers.
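
The idea of sending work to the least-loaded server in a bank can be sketched in a few lines of Python. This is not how Oracle implements it. The `ServerBank` class, the blade names, and the use of a single numeric "cost" per task are all assumptions made for illustration, and the sketch only accumulates assigned work without modeling task completion.

```python
import heapq


class ServerBank:
    """Tracks assigned work per server and picks the least-loaded one."""

    def __init__(self, server_names):
        # Heap entries are (current_load, server_name); ties break by name.
        self._heap = [(0, name) for name in sorted(server_names)]
        heapq.heapify(self._heap)

    def dispatch(self, cost):
        """Assign a task of the given cost; return the chosen server."""
        load, name = heapq.heappop(self._heap)
        heapq.heappush(self._heap, (load + cost, name))
        return name

    def loads(self):
        """Return the total cost assigned to each server so far."""
        return {name: load for load, name in self._heap}


bank = ServerBank(["blade1", "blade2", "blade3"])
assignments = [bank.dispatch(cost) for cost in [5, 2, 2, 1, 4]]
print(assignments)
print(bank.loads())
```

A real dispatcher would measure load from the servers themselves rather than from a running total, but the selection step, always choosing the server with the smallest current load, is the same.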

Oracle follows the Sun Microsystems definition of delivering computing resources as a “utility,” much the same way as additional electricity is added to a network during high-usage times.

Oracle whitepapers suggest that the next generation of Oracle Grid computing will use Oracle*Streams as the glue for the Grid communications. Oracle*Streams allows for the streaming of data between Oracle instances, and also provides built-in tools for replication, data warehouse ETL (data loading), message queues and messaging.

Hence, Oracle*Streams could be used as the vehicle for a Grid system whereby Oracle detects a resource shortage and invokes a procedure to relieve the stress on the database. With Oracle*Streams as the backbone, Oracle identifies three ways that computing resources can be assigned:

1. Oracle9i Real Application Clusters (RAC)
Using existing Oracle9i RAC capabilities, blade servers can be used to add instance nodes. This allows for automated scalability, and a rack of server blades can be allocated to the Oracle9i RAC system on an as-needed basis.

2. Oracle transportable tablespaces
Transportable tablespaces allow Oracle data files to be unplugged from a database, copied to another Oracle instance (on another server), and then plugged into that Oracle instance (refer to Figure 5). Transportable tablespaces also support simultaneous mounting of read-only tablespaces by two or more databases. Using transportable tablespaces with Oracle*Streams, the Grid could request a refresh of the tablespace data at any time.

Figure 5: Using transportable tablespaces to enable resource allocation.

3. Distributed SQL

Oracle SQL can be executed from any Oracle instance, accessing the data remotely using database links. This provides a framework where query servers may be allocated, all executing queries that access the same Oracle database.

However, it is important to remember that not all types of queries will benefit from remote execution. A simple Oracle query such as the one below would have all of the processing work (sorting) done on the remote server, saving no computing resources.

select          
   cust_name
from 
   customer 
order by 
   cust_name;

Farming out SQL queries to remote non-RAC instances makes more sense when the initiating instance has an opportunity to use its own computing resources to service the query. For example, consider the following distributed query, run from an instance named Omaha:

select
   cust_name,
   order_number
from
   customer@new_york c,
   orders@san_fran   o
where
   c.cust_id = o.cust_id   -- join key; the column name is assumed for illustration
and
   region = 'WEST'
order by
   cust_name,
   order_number;

This type of query is ideal for remote execution because the bulk of the data processing will be done on the Omaha instance (refer to Figure 6).

Figure 6: Remote SQL query execution.

In this example, the new_york and san_fran servers perform the disk I/O to retrieve the requested data, passing the result set back to the Omaha instance. The Omaha instance will use its computing resources to join the tables together and perform the sorting.
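
The division of labor in this example can be mimicked with a short Python sketch. It does not use Oracle at all: the in-memory row lists stand in for the remote tables, the function names are hypothetical, and `cust_id` is an assumed join key that does not appear in the original query. It also assumes that `region` is a column of the customer table and that the filter is applied at the remote site.

```python
# Hypothetical rows held at each remote site; cust_id is an assumed join key.
NEW_YORK_CUSTOMERS = [
    {"cust_id": 1, "cust_name": "Zed Corp", "region": "WEST"},
    {"cust_id": 2, "cust_name": "Acme", "region": "WEST"},
    {"cust_id": 3, "cust_name": "Bolt Inc", "region": "EAST"},
]
SAN_FRAN_ORDERS = [
    {"cust_id": 1, "order_number": 103},
    {"cust_id": 2, "order_number": 102},
    {"cust_id": 2, "order_number": 101},
    {"cust_id": 3, "order_number": 104},
]


def fetch_customers(region):
    """Remote site work: read rows and apply the filter before shipping."""
    return [row for row in NEW_YORK_CUSTOMERS if row["region"] == region]


def fetch_orders():
    """Remote site work: read and ship the order rows."""
    return list(SAN_FRAN_ORDERS)


def run_on_omaha(region):
    """Initiating instance work: join the two row sets, then sort."""
    names = {c["cust_id"]: c["cust_name"] for c in fetch_customers(region)}
    joined = [
        (names[o["cust_id"]], o["order_number"])
        for o in fetch_orders()
        if o["cust_id"] in names
    ]
    return sorted(joined)


print(run_on_omaha("WEST"))
```

The two fetch functions do nothing but read and ship rows, while the join and the sort consume resources only on the initiating side, which is the split the article describes.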

Despite the apparent growth of server blade technology, there are critics who claim that its benefits are exaggerated.

Limitations of Grid Technology

In an interesting BlueArc article titled “Blade Server Not Looking so Sharp,” Dr. Geoff Barrall (CTO of BlueArc Corporation) notes several issues with blade server computing:

Not low cost: Barrall claims that the cost of a rack of blade servers, despite their limited functionality, is often as high as the cost of a fully configured rack mount server with similar specifications.
Limited I/O: In a blade server, the I/O paths are shared, leading to limitations in the number of peripheral I/Os that can take place, such as disk I/O or server-to-server network communication.
Limited flexibility: Blade servers cannot be retired and replaced in the same way regular rack mount servers can, and there is a loss of flexibility in the way servers can be interconnected.

These criticisms are worth weighing against the major vendors’ server blade offerings described earlier.

Alternatives to Grid Computing

There are several competing technology vendors that offer similar sharing of computing resources without using Grid technology:

Platform Computing: The Toronto-based Platform Computing Inc. has been working since 1992 toward on-demand computing resources. Back when most shops were entrenched in client-server technologies, Platform Computing was already developing software to achieve “hardware virtualization.” Over the past decade, Platform Computing has evolved a sophisticated solution for reallocation of data resources.
Savantis: Savantis developed a “dbSwitch” technology to provide switched access to databases, virtualizing the database server layer of the data center. The result is a Database Area Network (DAN) that enables database server consolidation and provides high availability, resource optimization, and capacity management. DAN technology works by dynamically relocating Oracle instances to different-sized servers, thereby performing dynamic resource allocation. During relocation, Oracle Transparent Application Failover (TAF) ensures that no work is lost.

Conclusion

All Oracle professionals eagerly await Oracle’s announcement about the latest release of the Oracle database and its adoption of Grid computing as a central feature.

Time will tell if the market embraces the technology and if Oracle has hit the mark with Grid technology, or if Oracle Grid will fall by the wayside as computing resources become cheaper.

Donald K. Burleson is one of the world's top Oracle Database experts with more than 20 years of full-time DBA experience. He specializes in creating database architectures for very large online databases and he has worked with some of the world's most powerful and complex systems. A former Adjunct Professor, Don Burleson has written 15 books, published more than 100 articles in national magazines, serves as Editor-in-Chief of Oracle Internals and edits for Rampant TechPress. Don is a popular lecturer and teacher and is a frequent speaker at Oracle Openworld and other international database conferences. Don's Web sites include DBA-Oracle, Remote-DBA, Oracle-training, remote support and remote DBA.

Contributors : Donald K. Burleson
Last modified 2005-06-22 12:12 AM

DBAzine.com
