IMSPLEX Implementation Considerations
There are many reasons for the re-emergence of the mainframe as a viable processing platform. Probably the most important, however, was IBM's decision to enable IMS for the MVS Parallel Sysplex. The MVS Parallel Sysplex provides an easily scalable computing platform that is more cost-efficient than the historical mainframe environment. Parallel Sysplex also provides an environment in which 24x7 availability can be more readily achieved.
The connection of two or more IMS systems in a Parallel Sysplex is commonly referred to as an IMSPLEX. Enhancements introduced in previous releases of IMS provide two main capabilities of the IMSPLEX: data sharing and workload sharing.
IMS V5 was the first release to be enabled for Parallel Sysplex. IMS databases can now be shared at the block level across as many as 255 IMS subsystems. The data may be read and updated by the IMS systems participating in the data sharing. This capability of the IMSPLEX is referred to as N-way data sharing. The set of IMS systems involved in this type of IMSPLEX is referred to as a data-sharing group. The sharing of database records is accomplished using lock and cache structures residing in the coupling facility. The IMS Resource Lock Manager (IRLM) address space provides the necessary serialization.
Building on these enhancements, IMS V6 allows messages to be shared between multiple IMS systems. By moving the IMS message queues into the coupling facility, as many as 32 IMS systems can have access to the same message queues. This process allows messages to be placed on the message queues by one IMS system and then processed by any IMS system in the IMSPLEX. This capability of the IMSPLEX is commonly referred to as IMS shared message queues. The set of IMS systems involved in this type of IMSPLEX is referred to as a shared queues group.
The decision to migrate to an IMSPLEX involves many issues that must be thought through and planned carefully. This article discusses some of the more important aspects of building and maintaining an IMSPLEX. By learning to master the challenges of an IMSPLEX implementation, you will be in a better position to achieve your business objectives.
Planning for an IMSPLEX
As with most projects, the difference between success and failure for an IMSPLEX is proper planning. There are many different components, capabilities and features of an IMSPLEX as well as many different ways to design and build the IMSPLEX. Several choices must be made based on the needs of the environment.
Objectives of an IMSPLEX
Most companies are planning for future growth, trying to determine how to accommodate workload increases while keeping the associated costs down. Many of the objectives associated with this planning are the same objectives the IMSPLEX is intended to address, and they can be realized by successfully implementing an IMSPLEX.
The main objectives of the IMSPLEX are:
- Reduced processing costs
- CMOS boxes are cheaper than bipolar machines and can be added in smaller capacity increments.
- The Parallel Sysplex pricing structure allows for reduced software costs.
- Greater throughput
- Multiple IMS systems mean more work can be processed in parallel across the IMSPLEX.
- Increased system availability and reliability
- Failure of a single IMS or MVS image does not result in a complete loss of processing. The remaining IMS systems continue to process with minimal impact to the user community.
- Individual IMS systems can be removed from, or added to, the IMSPLEX while the other systems continue to process. This capability allows hardware and software maintenance to be performed without affecting application availability.
- Greater growth potential
- As workload increases beyond the current capacity, new IMS systems can be added easily to the IMSPLEX to accommodate growth.
- Capacity can be added in smaller increments because CMOS boxes come in smaller processing increments than do bipolar machines.
- IMS systems can be added to, or removed from, the IMSPLEX to accommodate fluctuations in system workload.
- During peak processing times, additional IMS systems can be brought up to handle the increased workload. During non-peak times, these IMS systems can be brought down, freeing up the MVS systems to perform batch processing.
Three basic methods exist for implementing an IMSPLEX:
- Sharing workload
- Combining workload
- Dividing workload
Combinations of these methods are also possible, depending on specific needs. The methods used will depend on the characteristics of the data and applications involved. An IMSPLEX may involve sharing a workload across multiple systems by cloning; combining two or more existing systems that perform similar functions; or dividing the workload by isolating the key components of an IMS DB/DC system.
Creating an IMSPLEX for workload sharing involves cloning an existing IMS system to create multiple, identical IMS systems, each providing identical functionality. The same STAGE-1 deck can be used for generating each system in the IMSPLEX. These systems are then connected via the coupling facility to form an IMSPLEX. To the end user, however, the IMSPLEX is viewed as a single entity. By using the shared queues feature, each IMS system has access to messages placed on the message queues, which reside in the coupling facility. Since each system is a duplicate of the original, any IMS in the IMSPLEX can retrieve and process the message, thus helping to distribute the workload across all of the systems evenly. In addition to using the shared message queues feature, a cloned IMSPLEX will also incorporate some level of N-way data sharing because programs that access the databases may run on any system in the IMSPLEX.
In many IMS shops there are multiple IMS systems that perform similar functions, but for different areas of the company. Each of these systems has a set of databases unique to a specific area and a set containing common company data. These common databases must be replicated, and extreme care must be taken to keep the copies synchronized. By combining these independent IMS systems into an IMSPLEX, the common databases may be shared, eliminating the duplication and the associated operational problems. Each system in the IMSPLEX can read and update information in the common databases.
As the name implies, an IMS DB/DC system consists of two main components:
- Application processing (the DB component)
- Terminal processing (the DC component)
It is possible to divide the workload associated with these functions so that one or more IMS systems handle terminal access and the associated message traffic, and one or more IMS systems handle the application processing.
The terminal processing system is commonly referred to as a front-end while the application processing system is called a back-end. The front-end and back-end concept has been around for quite a while and is generally used to manage systems that have capacity or availability issues. Once again, the connection is provided through the message queues, which reside in the coupling facility. Generally, the front-end systems will place the message (transaction) on the queues and then one of the back-end systems will retrieve the message, process it and place it back on the shared queues. The front-end system will then retrieve the output message and send it back to the originating destination.
Although each type of IMSPLEX configuration mentioned above has merit, the cloned IMSPLEX provides the greatest benefit. It allows true workload balancing and minimizes the impact that the failure of any single system has on processing. When a system fails, the remaining systems carry the workload because each system is a duplicate of the others. Additional IMS systems can be added to the IMSPLEX as workload increases. This type of IMSPLEX also leads to a more manageable environment due to the ability to perform the same task once for all of the systems in the IMSPLEX. It will lead to fewer operational and procedural changes than the other IMSPLEX configurations. For these reasons, the remainder of this article will focus on the cloned IMSPLEX.
IMS System Data Sets
When planning for an IMSPLEX implementation, the decision to share, or duplicate, IMS system data sets will need to be made based on the environment and the processes and procedures used to maintain the IMSPLEX. Since cloning involves making exact copies of an IMS, many of the data sets used to generate and operate IMS can be shared. Several data sets cannot be shared, however, because they are updated by IMS and contain information specific to that IMS.
Data sets that must be unique
Data sets that must be unique are generally updated by the IMS system and contain information specific to that IMS system. These data sets include:
- IMS logs
- Restart data set
- Modify status (MODSTAT) data set
- MSDB data sets
- CQS checkpoint data sets (IMS 7.1 and later allow one CQS address space to be shared among multiple IMS systems)
A unique copy of each of these data sets must exist for every IMS system in the IMSPLEX.
Data sets that must be shared
To provide data-sharing capabilities, each IMS in the data-sharing group must use the same RECON data sets. Also, any database data sets involved in data sharing or needed for backup and recovery of the shared queues structures must be shared.
Data sets that may be unique or shared
This group of data sets requires the most analysis because the decision to share them is based on many factors. These data sets include those that are associated with generating the system and defining the various resources. They include:
- System generation data sets (STAGE-1 input, PSB/DBD/MFS source, security input)
- ACBLIB data sets
- FMTLIB data sets
- MODBLKS data sets
- MATRIX data sets
- RESLIB data set
In a cloned IMSPLEX, these data sets should be shared to reduce data redundancy and eliminate duplication of effort. Resource definitions should be synchronized across the IMSPLEX. By sharing these data sets, synchronization is less difficult and the complexity associated with making changes is reduced.
Managing an IMSPLEX
Although an IMSPLEX may be viewed as a single entity to the end user, managing the IMSPLEX is more complicated than maintaining a single IMS system. There is no single point of control for operating an IMSPLEX. Instead of one IMS system executing on a single MVS image, there are multiple IMS systems executing on multiple MVS images. Procedures and processes must be reviewed, streamlined and modified to handle the duplication. In addition, an IMSPLEX introduces the challenge of keeping IMS systems synchronized.
This section discusses some of the key IMSPLEX management issues and provides some recommendations for implementing a manageable IMSPLEX.
Coupling Facility Structures
Implementation of an IMSPLEX requires structures that reside in the coupling facility. These structures are defined and maintained outside the control of IMS. There may be structures associated with shared OSAM databases, shared VSAM databases, shared DEDBs, IRLM address spaces, shared full-function message queues and shared EMH message queues. For any of these structures that exist, you must determine the size of the structure. Each active structure must be supported by storage in the coupling facility. Creating a structure that is too small will affect performance. Creating a structure that is too large is a waste of precious resources. Formulas for calculating the size of these structures may be found in various IBM manuals that discuss data-sharing and shared message queues.
IMS System Definitions
Maintaining IMS system definitions can be greatly simplified by using a single set of definitions for the entire IMSPLEX. The cloned IMSPLEX provides this capability. The STAGE-1 deck, DBD source, PSB source, MFS source and SMU input should be identical across all systems in the IMSPLEX. The nucleus, MATRIX, MODBLKS, ACBLIBs and FMTLIBs should be generated once and then shared across the IMSPLEX. Sharing these data sets means that the system generation process does not have to change when implementing an IMSPLEX. If unique master consoles are required, however, you must use a new IMS PROCLIB member, DFSDCxxx, to identify the primary and secondary master terminal node and LTERM names for each IMS system.
Change control problems are nothing new. They exist apart from an IMSPLEX. "Who changed what, when and where?" has always been the question when problems arise. With IMSPLEX, however, another important concern surfaces: "Are my systems synchronized?" It is imperative in an IMSPLEX to keep shared resources synchronized. Failure to do so will result in data integrity problems.
Tight controls must be in place to ensure the systems remain synchronized. Procedures must be developed and implemented to identify and correct out-of-sync conditions. Sharing the data sets that contain resource definitions and ensuring that each IMS is using the same active libraries is strongly recommended.
With 24x7 being the desired availability standard, system availability is a key issue. Online change provides the ability to make changes to resource definitions while the IMS system remains active. However, online change works one system at a time with no coordination between, or knowledge of, any other system in the IMSPLEX. Ideally, online change would be executed on each system in the IMSPLEX at the same time.
Even though the systems are clones, it is possible for online change to work on some systems and fail on others. When this happens, problems can occur because the systems are no longer synchronized. Transactions or programs could function differently depending on which IMS in the IMSPLEX executes them. Database characteristics could be inconsistent across the IMSPLEX. Attempting to back off the online change on the systems where it was successful could also result in failures due to work in progress.
The following procedures (from IBM) are recommended for performing online change in an IMSPLEX.
System setup recommendations:
- Share the active, inactive and staging libraries across all systems in the IMSPLEX.
- Define each IMS with its own MODSTAT data set, indicating the same active and inactive ACBLIB, FMTLIB, MODBLKS and MATRIX data sets.
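In terms of JCL, this setup might be sketched as the following control region DD statements. The data set names are illustrative; the DD names (MODSTAT, IMSACBA/IMSACBB, FORMATA/FORMATB, MODBLKSA/MODBLKSB, MATRIXA/MATRIXB) follow standard IMS conventions. Only MODSTAT points to a per-system data set; every other library is shared:

```jcl
//* Online change libraries for system IMSA (names illustrative)
//MODSTAT  DD DSN=IMSPLEX.IMSA.MODSTAT,DISP=SHR   UNIQUE PER SYSTEM
//IMSACBA  DD DSN=IMSPLEX.ACBLIBA,DISP=SHR        SHARED BY ALL SYSTEMS
//IMSACBB  DD DSN=IMSPLEX.ACBLIBB,DISP=SHR
//FORMATA  DD DSN=IMSPLEX.FMTLIBA,DISP=SHR
//FORMATB  DD DSN=IMSPLEX.FMTLIBB,DISP=SHR
//MODBLKSA DD DSN=IMSPLEX.MODBLKSA,DISP=SHR
//MODBLKSB DD DSN=IMSPLEX.MODBLKSB,DISP=SHR
//MATRIXA  DD DSN=IMSPLEX.MATRIXA,DISP=SHR
//MATRIXB  DD DSN=IMSPLEX.MATRIXB,DISP=SHR
```

Each cloned system's JCL would be identical except for the MODSTAT data set name.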
Online change procedure:
- Ensure that all systems in the IMSPLEX are using the same set of active libraries by issuing a /DISPLAY MODIFY on each system and comparing the output.
- Perform the necessary system generations using the staging libraries.
- Run the online change utility to copy the staging libraries to the inactive libraries.
- Issue the /MODIFY PREPARE on each IMS system in the IMSPLEX.
- Issue a /DISPLAY MODIFY on each IMS system in the IMSPLEX. Review the output for any work-in-progress situations and resolve them.
- Issue the /MODIFY COMMIT on each IMS system in the IMSPLEX.
- If the online change fails on any of the IMS systems during the PREPARE phase, issue a /MODIFY ABORT on each system in the IMSPLEX to halt the online change process. If the COMMIT phase works on some systems but not on others, issue the sequence of /MODIFY PREPARE and /MODIFY COMMIT on those systems where the changes were committed in order to revert to the state before the changes were implemented.
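As a sketch, the normal path through this procedure looks like the following command sequence, issued on each system in the IMSPLEX (operand details vary by release, so treat this as illustrative rather than exact syntax):

```
/DISPLAY MODIFY ALL     on each system; compare output to verify the
                        same active libraries everywhere
  ... system generations, then Online Change Copy utility
      (staging libraries -> inactive libraries) ...
/MODIFY PREPARE ALL     on each IMS in the IMSPLEX
/DISPLAY MODIFY         on each IMS; resolve any work in progress
/MODIFY COMMIT          on each IMS in the IMSPLEX
/MODIFY ABORT           on each IMS, only if PREPARE fails on any system
```

The key point is that every command is repeated on every system; nothing coordinates the PREPARE and COMMIT phases across the IMSPLEX automatically.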
To ensure the success of online change across an IMSPLEX, stop any resources that are going to be changed on all systems before running the online change. This assumes you know everything that is being changed, which is something online change normally tracks for you.
As stated earlier, each IMS system in the IMSPLEX must have its own unique copy of the modify status (MODSTAT) data set. This data set contains a record that indicates which ACBLIB, FMTLIB, MODBLKS and MATRIX data sets are currently active. If the active and inactive libraries are shared among the systems in the IMSPLEX and the MODSTAT data set on each system indicates the same set of active libraries, the systems should be synchronized.
One of the areas most affected by the changes associated with running an IMSPLEX is operations. Though the user community can view the IMSPLEX as a single entity, that is not true for those responsible for operating and maintaining the systems in the IMSPLEX. Some of the procedures currently in place may be usable, but it is more likely that existing procedures will have to be changed and new procedures developed for the IMSPLEX.
This section discusses a few of the operational areas greatly affected by the implementation of an IMSPLEX.
System software maintenance
Applying maintenance to IMS or MVS system software in an IMSPLEX can be done one system at a time. Each IMS or MVS can be brought down, have the necessary maintenance applied, and then be brought back into the IMSPLEX.
This method allows the applications to continue to process on the remaining systems in the IMSPLEX, thus avoiding application downtime. However, there are two items to be aware of:
- It is possible (but rare) that the maintenance being applied changes a component that affects all systems in the IMSPLEX. If this is true, this method will not work and the maintenance must be applied to all the systems at the same time.
- If application changes are made to the systems in the IMSPLEX (via online change) while one of the IMS systems is down, when that IMS system is brought back up, it will not be synchronized with the other systems in the IMSPLEX. For this reason, use caution to prevent online change from occurring while an IMS system is down.
Vendor software maintenance
If the same method outlined above is used for applying new releases of vendor software, it is assumed that the new release is downward compatible with the one being replaced. Since this may not always be true, checking with the vendor before installing the new release is recommended. Vendors should strive to ensure, now more than ever, that their software is downward compatible with the prior release.
Application changes were mentioned in the previous section that covered the use of online change. The concern when making changes to applications (programs, databases and transactions) is to make sure that the changes are synchronized across the IMSPLEX. Also, before performing any maintenance against databases, such as reorganizations, the database must be made unavailable to all the systems in the IMSPLEX.
With a few exceptions, IMS commands continue to work the same in an IMSPLEX as they do in an individual system. The emergence of the IMSPLEX encourages coordinating commands across the IMSPLEX. When checking on the status of a resource, each system in the IMSPLEX must be checked. If commands are issued that affect a resource, they should be entered across the IMSPLEX. Each system must then be checked to ensure the change was made. Using facilities provided by MVS, commands can be issued across all systems in the IMSPLEX.
When IMS commands are executed from the MVS console, the command recognition character (CRC) or the IMS subsystem name precedes the command. Enhancements provided with IMS V6 allow multiple IMS systems, executing on the same MVS image, to use the same CRC. To route IMS commands to all the active systems from a single MVS console within the Sysplex, specify the same CRC for each system in the IMSPLEX and use the MVS ROUTE *ALL command along with the CRC when entering IMS commands.
MVS will route the command to each MVS system in the Sysplex; any IMS systems that use that CRC will execute the command. The output for commands entered from multiple console support (MCS) or extended multiple console support (E-MCS) consoles will be routed back to the originating console. IMS commands can also be directed to a single IMS system by specifying the subsystem name instead of the CRC.
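For example, if every IMS in the IMSPLEX is defined with the CRC '#', an operator at one MVS console might enter the following to display activity on every system at once (the command text after the CRC is illustrative):

```
RO *ALL,#/DISPLAY ACTIVE
```

MVS routes the command to every image in the Sysplex, and each IMS recognizing the '#' prefix executes the /DISPLAY ACTIVE and returns its output to the originating console.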
As mentioned earlier, there are a few commands that may function differently when executed in an IMSPLEX. These exceptions are display commands and global commands.
- Display commands
When issuing /DISPLAY commands on an IMS within an IMSPLEX, there are two things to remember:
- The results of the command give the information for that specific system only. The results may differ on the other systems.
- When displaying destinations (transactions and LTERMS), the message queue counts are not necessarily accurate. Since the message queues reside in the coupling facility and are shared by all the systems in the IMSPLEX, the message enqueue and dequeue counts for a specific system reflect the messages processed by that system only. The counts do not reflect the messages processed by the IMSPLEX. Similarly, the number of messages currently queued may be inaccurate. A new keyword, QCNT, must be added to the /DISPLAY command to obtain the overall message counts. This keyword causes the message queues in the coupling facility to be queried to obtain accurate queue counts.
- Global commands
When data sharing is involved, the following commands may include the GLOBAL parameter:
• /START DATABASE
• /START AREA
• /STOP DATABASE
• /STOP AREA
• /DBDUMP DATABASE
• /DBRECOVERY DATABASE
• /DBRECOVERY AREA
This parameter causes the command to be passed to, and executed on, each system in the IMSPLEX. Output from the command is displayed on each individual system's master terminal. Therefore, to check the success or failure of the command, each system must be checked individually.
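As a sketch, the two exception cases look like this (the transaction and database names are illustrative):

```
/DISPLAY TRAN PAYTRAN QCNT    queries the CF queue structures for
                              IMSPLEX-wide message counts
/STA DATABASE CUSTDB GLOBAL   starts CUSTDB on every system in the
                              data-sharing group
```

Without QCNT, the displayed counts reflect only the local system; without GLOBAL, the start affects only the system where the command was entered.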
Review all automated operations procedures currently in use to ensure that they continue to function in an IMSPLEX environment. Automation that relates to a single system (for example, restarting an abended program) normally requires little or no modification. However, if the same automation has intelligence built in to keep track of the number of abends a given program encounters, this will be more difficult in an IMSPLEX because the program may execute on multiple systems. Any automation that affects resource availability (for example, bringing an application down at a certain time of the day for maintenance) must account for multiple systems.
The transition of the message queues into the coupling facility means these messages are no longer associated with a single IMS system. The message queues use list structures within the coupling facility and are shared, through the common queue server (CQS) address space, by every IMS system in the IMSPLEX. Local message queues are no longer associated with each IMS system.
A /CHE SNAPQ is often used to dump the messages to the IMS log for recovery and restart in an error situation. In a shared queues environment, however, this command cannot be used to produce a point of recovery because there are no local message queues. The /CQCHKPT command should be used instead. It initiates a checkpoint of the message queue structures residing in the coupling facility to the structure recovery data sets. Issuing this command with the SHAREDQ keyword will provide a point of recovery if an error requires that the shared message queue structures be rebuilt.
When shutting down an IMS system, a /CHE DUMPQ or PURGE will shut down the IMS system. The message queues are not dumped or purged, however, because they are not local to the IMS system. DUMPQ or PURGE processing does not affect the contents of the shared message queues.
As a result, cold-starting an IMS system that is part of an IMSPLEX does not delete queued messages. The only way to perform a cold start of the shared queues is to:
- Delete and redefine the CQS structure recovery data sets
- Use the MVS SETXCF FORCE command to delete the structures in the coupling facility
- Restart IMS and specify a cold start for both the CQS and IMS address spaces
However, performing these procedures will affect every IMS system running in the IMSPLEX and must, therefore, be done while all the CQS and IMS address spaces are down.
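The steps above can be sketched as the following sequence, assuming message queue structure names of QMSGIMS01 and QEMHIMS01 (both names, and the exact operand forms, are illustrative):

```
1. Shut down every IMS and CQS address space in the IMSPLEX.
2. Delete and redefine the CQS structure recovery data sets
   (for example, with IDCAMS DELETE/DEFINE).
3. From an MVS console, delete the structures in the coupling facility:
      SETXCF FORCE,STRUCTURE,STRNAME=QMSGIMS01
      SETXCF FORCE,STRUCTURE,STRNAME=QEMHIMS01
4. Restart, specifying a cold start for both the CQS and IMS
   address spaces.
```

Because the structures are shared, step 3 discards queued messages for the entire IMSPLEX, which is why every CQS and IMS address space must already be down.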
Many factors are involved in the decision to migrate to an IMSPLEX. You should understand both the objectives and the challenges associated with an IMSPLEX implementation. With proper planning, you can learn to master the challenges of an IMSPLEX implementation while taking advantage of the benefits of the IMSPLEX environment. These benefits include reduced processing costs, greater throughput, increased system availability and reliability, flexibility and greater growth potential. Procedures and processes should be reviewed, streamlined and modified to manage all the IMS systems in the IMSPLEX. You should also look for tools that will allow you to automate as many processes as possible.
Last modified 2005-08-09 01:00 PM