DBAzine.com

Database Recovery Control in Practice - Part 6: DBRC Data Set Control

by Peter Armstrong
From Database Recovery Control (DBRC) in Practice by Peter Armstrong (Third Edition); copyright 1990 BMC Software.

Part 1  |  Part 2  |  Part 3  | Part 4  |  Part 5  |  Part 6

In DBRC terms, a log data set is an entry in a PRILOG record. In a batch environment, the PRILOG record contains a single data set entry. In an online environment, the PRILOG records can contain multiple data set entries, and a data set entry is added to the PRILOG record each time an archive is successfully performed. In a CICS-DL/1 environment, a data set entry is added each time a journal extent switch occurs (when the journal extent is subsequently archived, the data set entry will be updated to reflect the new copied data set information). In all cases, the data set entry could contain multiple volumes.

The forward recovery utilities only accept logs that have been archived. You cannot use OLDS directly. DBRC always looks in the PRILOG records when executing GENJCL.CA or GENJCL.RECOV. If you create an RLDS, DBRC uses this automatically. CICS-DL/1 and batch create PRILOG records.
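To make this concrete, a hedged pair of GENJCL requests might look like the following (the DBD name, DD name, and CA group name are hypothetical):

```
 GENJCL.RECOV DBD(CUSTDB) DDN(CUSTDD1)
 GENJCL.CA   GRPNAME(CAGRP1)
```

In both cases DBRC resolves the log input from the PRILOG records, so only archived logs ever appear in the generated JCL.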

Log Tape Management

DBRC is not a log tape management system. It does not stop you from reusing a log that is already recorded in the RECONs. Some customers have written native VSAM programs to interrogate the RECONs and see which logs are in use, subtract these logs from the pool, and generate a list of available logs. Other customers have developed programs based on the logs that are deleted from the RECONs as part of the regular cleanup.

The only data set control mechanism that DBRC provides, apart from the archive process, is the ability to predefine image copy and change accumulation data sets. DBRC cycles around and reuses the image copy and change accumulation data sets. See Chapter 5, “DBRC Registration” for a description of this procedure.

Generation Data Groups (GDGs)

DBRC fully supports the use of GDGs. They are recorded in RECON with their fully expanded names. See Chapter 5, “DBRC Registration” for a description of how to use them for image copy and change accumulation data sets.

GDGs can also be used for logs. No special action is required in DBRC to use them. The major thing to remember about GDGs is that they are normally updated at the end of the job (although you can choose to make this end of jobstep in an SMS environment), whereas DBRC is updated at the end of each job step.

If you ever generate multi-step jobs with DBRC (multiple steps doing the same thing, e.g., archives, change accumulations), DBRC will generate the same GDG numbers in each step by default. See Appendix C, “DBRC GENJCL Enhancements” for a description of the techniques required to solve this problem.

Log Information in RECON

DBRC stores information about batch logs, online logs, and archive output data sets. Batch logs are stored in PRILOG/SECLOG records. Online logs are recorded in PRIOLD and SECOLD records, and archive output is stored in a combination of PRILOG, SECLOG, PRISLD, and SECSLD records depending on your archive JCL setup. See Appendix B, “The Real Rules of Archive” for details of how archive works and what records are written to RECON.

In a log control environment, DBRC will simply record log information, but the associated LOGALL records will be empty—these only get entries written to them when you update a registered DBDS/area.

DBRC cleans up most of the records in RECON automatically—see “GENMAX Cycle—Log Cleanup in RECON” on page 45 for details. It is your responsibility, however, to get rid of unwanted—in DBRC speak INACTIVE—log records/log entries. The command to do this is DELETE.LOG INACTIVE.

DELETE.LOG INACTIVE

This command deletes log records (PRILOG, SECLOG, PRISLD, SECSLD) that have no LOGALL information (i.e., are no longer needed for recovery) and are older than the log retention period (the LOGRET parameter on INIT.RECON). LOGALL information is cleared when ALLOCs are cleared (including RECOV, REORG records, etc.), which is done by cycling through the image copy GENMAX cycle. In a log control environment, this command will delete all log information from RECON that is closed and older than LOGRET.
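The command is issued through the DBRC utility, DSPURX00. A minimal sketch of such a job follows — the data set names are illustrative only:

```
//STEP1    EXEC PGM=DSPURX00
//STEPLIB  DD DSN=IMS.RESLIB,DISP=SHR
//RECON1   DD DSN=IMS.RECON1,DISP=SHR
//RECON2   DD DSN=IMS.RECON2,DISP=SHR
//RECON3   DD DSN=IMS.RECON3,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
 DELETE.LOG INACTIVE
/*
```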

The RECONs must be large enough to hold log information back to the oldest image copy in RECON. If your RECONs are growing and the DELETE.LOG INACTIVE process appears to be ineffective, then you are probably suffering from one of the following problems:

      • DELETE.LOG INACTIVE stops when it finds an open log, so do a LIST.LOG OPEN first. Close any logs that are not currently active (typically one from a run long ago, when you were trying out DBRC, that was not cleaned up properly) using NOTIFY.PRILOG, and then do DELETE.LOG INACTIVE. Details on how to close batch logs and online logs are given in the DBRC Guide and Reference. If you have problems closing log records in RECON, look carefully at the table in the manual showing the combination of parameters needed—I have frequently been caught by forgetting to code the SSID parameter, for instance.
      • There are one or more databases that have not been image copied for a long time. If DELETE.LOG INACTIVE does not delete anything and there are no old OPEN logs in RECON, then do a LIST.LOG ALL and look at the first PRILOG/LOGALL combination listed. The LOGALL record will contain a list of the DBDSs for which DBRC considers this log still holds valid recovery information—in other words, the DBDSs that have not been image copied frequently enough to cycle round GENMAX and trigger the ALLOC/LOGALL cleanup. Image copy these DBDSs and your problems will go away.
      • Too many generations of image copy defined in RECON (GENMAX).  In this case, image copy more frequently or reduce the number of generations.
      • You’ve forgotten to dump the message queues for ages—see below. Never let the RECONs grow larger than 80 cylinders. This triggers a third level of VSAM index and impacts your performance. Most RECONs are smaller than 10 cylinders.
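Putting the first bullet into practice, the cleanup session looks like this. The right-hand annotations are commentary, not part of the command syntax, and the NOTIFY.PRILOG parameters are deliberately elided—take the exact combination from the table in the DBRC Guide and Reference:

```
 LIST.LOG OPEN           <- find any old open log records first
 NOTIFY.PRILOG ...       <- close the stale record (remember SSID)
 DELETE.LOG INACTIVE     <- the cleanup can now proceed past it
```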

Continuous Operations

Prior to IMS V4, DELETE.LOG deleted only complete records, not entries within them. This has implications for continuous operations.

      • The first problem is that, prior to V4, you have to close down IMS before the PRILOG record fills up—the only way to get a new PRILOG record, and hence room to write archive output details, is to close IMS down and start it up again.
      • The second problem is when you run IMS for four weeks, say, and you image copy weekly with a GENMAX of three. Now you close down IMS and run DELETE.LOG INACTIVE. Because the DELETE.LOG command deletes complete log records and not entries within them, the DELETE.LOG INACTIVE command achieves nothing as the log records are still required for their entries, which are more recent than the oldest image copy.

As from IMS V4, DELETE.LOG INACTIVE will delete inactive entries from log records and shuffle the remaining entries to the top of the record, hence creating more space—it calls this PRILOG compression. This means that you can run IMS continuously for months on end. IMS V4 will also try to compress these log records each time you run archive.

Figure 1: Part of a COMPRESSED PRILOG record.

All inactive log data set entries in the current PRILOG (SECLOG, PRISLD, SECSLD) will be deleted automatically at archive time if the size of the PRILOG record exceeds 50% of the maximum RECON record size. DBRC considers a log entry to be INACTIVE if it is older than all of the following:

      • log retention period
      • oldest ALLOC on that log (in other words, you’ve image copied GENMAX times since that entry)
      • earliest restart checkpoint for the online IMS system. The most common cause of this one is that you have forgotten to issue a /CHE SNAPQ for ages and IMS has to keep all the SLDS back to the last dump of the message queues in case of restart.

One customer was worried about the performance implications of this—did this mean that DBRC had to scan the whole RECON to find the oldest ALLOC each time they ran archive? What happens is that DBRC looks in the LOGALL record associated with the PRILOG, which provides it with a list of the DBDSs updated on that log. It then goes and looks for the first ALLOC for each of those DBDSs to determine the earliest ALLOC timestamp. This, in my opinion, could still be significant if you have thousands of databases.

Some DELETE.LOG Tests

      • DELETE.LOG STARTIME deletes PRILOG and primary system log data set (PRISLD) records with STOP times of zero (even though some manuals say it will not). Some customers use this for cleaning up old records in RECON. However, there is a chance you will delete the open PRILOG and PRISLD records being used by the archive utility, or leave some orphan ALLOC records (i.e., ALLOC records pointing at a PRILOG that does not exist). A much safer method is to close the log using DFSULTR0 or NOTIFY.PRILOG. Image copy cleanup and DELETE.LOG INACTIVE will then get rid of the log record and any associated ALLOCs. One customer deleted some old OPEN logs and then discovered that they had 50 orphaned ALLOCs that they had to remove manually.
      • When using DELETE.LOG STARTIME, you code the start timestamp of the record, and it will delete the whole record.
      • When using DELETE.LOG TOTIME, you can code any 12-digit timestamp. It does not have to be the timestamp of a particular record. This command deletes any PRILOG, PRISLD, or LOGALL records whose starting timestamp is earlier than the TOTIME specified when the following conditions exist:
          • The STOPTIME is not zeroes.
          • The record is older than LOGRET (based on STARTIME).

List of Logs Deleted from RECON

IMS V1.3 used to produce a list of the logs deleted from RECON during the DELETE.LOG INACTIVE process. This list disappeared in IMS V2, which indicated only the number of deleted data sets. This caused problems for customers who post-process the list to determine which logs are reusable. The good news is that it is back again now—use the LISTDL parameter on INIT.RECON.

Size of PRISLD/PRILOG Record

If you are running IMS for any length of time, e.g., days or weeks or longer, then the PRILOG/PRISLD etc. records are going to get bigger and bigger as new entries are written to them each time you run archive. So you have to calculate the size of these records in order to select a sensible RECORDSIZE for the RECONs. The formula for calculating the size of a PRISLD/PRILOG record (this is for IMS V6—previous versions will be smaller, but it does no harm to over-allocate) is:

S = 112 + 120 D + 32 V

where S is the size of the PRISLD/PRILOG record in bytes, D is the number of SLDS/RLDS data sets created from archives for this execution of the online system, and V is the number of volumes for these SLDS/RLDS data sets.

For each unarchived OLDS, DBRC assumes that it will require 16 VOLSERs, which gives a size of 120 + 16 x 32 = 632 bytes. When you switch OLDS, DBRC wants enough room left in the record to archive this one and the one you have just opened. So work out the formula above based on the number of archives you expect to do between the oldest image copy and now, add 2 x 632 bytes and that gives you the maximum record size. For instance, with GENMAX = 3, image copy once per week, ARC = 1 archiving to a single cartridge each time and archiving 20 times per day gives:

112 + 120 x (21 x 20) + 32 x (21 x 20) + 632 x 2 = 65216 bytes
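The arithmetic above can be restated as a short sketch (this is simply the text's formula; the function name is mine):

```python
def prilog_record_size(archives, vols_per_archive=1, unarchived_olds=2):
    """Estimate the PRILOG/PRISLD record size in bytes (IMS V6 formula).

    S = 112 + 120*D + 32*V, where D is the number of SLDS/RLDS data sets
    and V the number of volumes for them, plus 632 bytes headroom for
    each unarchived OLDS (120 + 16 assumed VOLSERs x 32 bytes each).
    """
    d = archives                        # SLDS/RLDS data sets created
    v = archives * vols_per_archive     # volumes for those data sets
    olds_entry = 120 + 16 * 32          # 632 bytes per unarchived OLDS
    return 112 + 120 * d + 32 * v + unarchived_olds * olds_entry

# Worked example from the text: 21 days of logs back to the oldest image
# copy (GENMAX = 3, weekly copies), 20 archives per day, one cartridge
# per archive:
print(prilog_record_size(21 * 20))      # 65216
```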

REPRO will not copy a VSAM KSDS to a non-VSAM data set when the maximum record size is greater than 32,760 bytes, so you have to back up the RECON to a VSAM KSDS. When using this method of backup (rather than backing up to non-VSAM data sets), you must use an empty—IDCAMS DELETE/DEFINEd—KSDS; otherwise your output will be merged with the previous backup copy. For performance reasons, it is recommended that you use SPANNED records and a CISIZE of 8K for the data and 2K for the index in RECON.
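A conceptual IDCAMS step for this style of backup (the data set names, key values, record sizes, and space values are illustrative; copy the real ones from your own RECON DEFINE):

```
//BACKUP   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DELETE IMS.RECON.BACKUP CLUSTER
  DEFINE CLUSTER (NAME(IMS.RECON.BACKUP) -
         INDEXED SPANNED -
         RECORDSIZE(4086 65216) -
         CYL(10 5)) -
         DATA (CISZ(8192)) -
         INDEX (CISZ(2048))
  REPRO INDATASET(IMS.RECON1) OUTDATASET(IMS.RECON.BACKUP)
/*
```

The DELETE/DEFINE pair guarantees the output KSDS is empty, so REPRO cannot merge the new copy with a previous backup.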

DBRC and Data Facility Hierarchical Storage Manager (DFHSM)

DFHSM does not have a DBRC interface. DBRC is not aware that a data set is taken away by DFHSM. This in itself does not matter as DFHSM migrates it back if it is required. The problem is that DBRC insists that it is on the same volume serial number (VOLSER) as the one from which it migrated because DBRC checks the VOLSER against what is recorded in RECON. Recalling to the same VOLSER can be forced via the exit in DFHSM. However, this assumes that there is still space available on that volume.

Figure 2: DFHSM flow.

DBRC caters for this with a parameter on the INIT.RECON and CHANGE.RECON commands: CATDS/NOCATDS. If you want DBRC to use the catalog to find data sets, remove VOLSER and UNIT information from the skeletal JCL and specify CATDS. DBRC still records volume and unit information in RECON and will check them if VOLSER and UNIT exist in the JCL.
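Switching an existing RECON to catalog-based data set location is then a one-line change (remember to remove the VOLSER and UNIT information from the skeletal JCL as well):

```
 CHANGE.RECON CATDS
```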

This parameter is at the RECON level and applies to all logs, image copies, and change accumulations recorded in RECON. It has implications on JCL with disks and tapes as discussed below. It does not apply to DFSULTR0—always code all the information in the JCL for this utility.

Concatenated Tape and Disk

IMS allows you to log batch jobs to disk data sets (batch backout supports backout from disk). Many people use the combination of archiving their online logs to cartridges and using disk logs for batch processing. DBRC, however, generates JCL for concatenated data sets from one sample DD statement. So it is impossible to generate JCL with mixed devices using basic DBRC.

When you initialize RECON (or issue CHANGE.RECON) you place in the RECON header record default UNIT information for tape and DASD using the TAPEUNIT and DASDUNIT parameters.

DBRC identifies if a data set is on tape or DASD and stores that information in RECON. It does this for logs, image copy data sets, and change accumulation data sets. Every time you create one of these data sets, DBRC extracts a one-bit flag from MVS that indicates whether the data set is on tape or DASD. It then checks the RECON header record to see what you defined as the default tape and DASD units. So, if you have specified TAPEUNIT(3480), all data sets that are on tape will be stored in RECON with a UNIT of 3480.
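The defaults are set when you initialize (or change) the RECON; the SSID and unit names here are illustrative:

```
 INIT.RECON SSID(IMSA) TAPEUNIT(3480) DASDUNIT(SYSALLDA)
```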

The problems with this are:

      • DBRC cannot tell the difference between 3420, 3480, and 3490, or between 3380 and 3390.
      • You don’t always want to use UNIT=3480 for tapes anyway. It would be fine on image copy and change accumulation data sets, but it is far more efficient to use UNIT=AFF and save on tape units when reading concatenated log data sets.

Here are some solutions to the problem. The JCL you are trying to produce concatenates tape and disk log data sets under a single DD name.
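A conceptual picture of that target JCL (the DD name, data set names, unit, and VOLSERs are purely illustrative):

```
//LOGIN    DD DSN=IMS.RLDS1,DISP=(OLD,KEEP),
//            UNIT=3480,VOL=SER=(T00001)
//         DD DSN=IMS.BATCH.DISKLOG1,DISP=(OLD,KEEP)
```

The first data set is an archived log on cartridge; the second is a batch disk log located through the catalog, so it carries no UNIT or VOLSER information.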

There is a facility in DBRC that allows you to selectively generate any part of your skeletal JCL, depending on conditions you specify on the control card.

Figure 3: Using DELETE Logic with LOGUNIT.

The above example JCL only generates the UNIT information if the data set is not on DASD. Note, however, that you cannot tell the difference between different tapes or different disks. For disk, you could, of course, use a generic name, e.g., UNIT=SYSALLDA.
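In the spirit of the author's own warning that his JCL is conceptual, the skeletal member behind Figure 3 might look roughly like this. The %LOGDSN, %LOGUNIT, and %LOGVOLS symbols follow the IMS skeletal-JCL convention (check the members shipped with your system), and the test value in the %DELETE condition is illustrative:

```
//LOGIN    DD DSN=%LOGDSN,DISP=(OLD,KEEP),
%DELETE (%LOGUNIT EQ 'SYSALLDA')
//            UNIT=%LOGUNIT,VOL=SER=(%LOGVOLS),
%ENDDEL
//            DCB=BUFNO=10
```

When the log is on DASD, the line between %DELETE and %ENDDEL is dropped, and the remaining lines still form a valid continuation.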

The select and delete logic is very powerful, but is not easy to understand at first sight. I recommend that you study the examples in my breathtakingly interesting manual—GENJCL.USER Examples and Description—and then try some simple options first. Here is an example of generating concatenated tape and disk using %SELECT logic:

Figure 4: Sample JCL for concatenated tape and disk.

However, if you are using CATDS, you do not want to generate the UNIT parameter on the disk data sets, but you do want to code UNIT=AFF on any tape data sets to reduce the number of tape drives required. Code a %DELETE, %ENDDEL pair to select UNIT=AFF only if the data set is not on disk.

Figure 5: Sample JCL for concatenated tape and disk with CATDS.

You may be wondering why I am using %DELETE instead of %SELECT. The reason is that these statements sit in the middle of a %SELECT, %ENDSEL pair that is already being used to select the logs, and you are not allowed to nest one %SELECT pair inside another.
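A conceptual sketch of that nesting, with the %DELETE pair inside the log-selection %SELECT (again, treat all names and the test value as illustrative, and note that in the generated concatenation only the second and later DD statements would actually carry the AFF reference):

```
%SELECT RLDS(%DBNAME,(%LOGSEL))
//LOGIN    DD DSN=%LOGDSN,DISP=(OLD,KEEP),
%DELETE (%LOGUNIT EQ 'SYSALLDA')
//            UNIT=AFF=LOGIN,
%ENDDEL
//            DCB=BUFNO=10
%ENDSEL
```

With CATDS in effect, the disk data sets are located through the catalog, so the only device information generated is UNIT=AFF for tape, which keeps the drive count down.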

Note: When writing this conditional JCL, make sure that you test it all possible ways, because it is dead simple to end up with invalid JCL—e.g., you forgot a comma or part of a DD statement. DBRC does not check your JCL for syntax—it generates anything you throw at it.

My JCL tends to be conceptual; if there are syntax errors, please don’t call me to complain!

Batch Logging

Following are some suggestions regarding stacking batch logs, batch disk logging, and checkpoint and extended restart.

Stacking Batch Logs

Many customers now use disk logging in batch processing because it makes operations and JCL much easier. Problems start when you want to migrate the data sets from disk to tape. You cannot use DISP=MOD, because each log must be a separate data set so that you can recover to the end of it or back out from it.

The easiest way to migrate these data sets is to use DFHSM. Alternatively, you can use DFSUARC0 to migrate them; it will update RECON, but you have to generate the JCL yourself.

Batch Disk Logging Considerations

When using disks for batch logging, consider the following:

      • Block size. If you make the block size large and the job has to perform many buffer flushes, then you will probably write many short blocks and waste a lot of disk space. Conversely, if the job performs only a few buffer flushes, larger blocks are more efficient. The default chosen by IMS is 1K.
      • Out of space. Miscalculation of sizes can lead to B37 abends.
      • BKO=Y. This parameter, designed for the block-level sharing user, performs dynamic backout in the event of certain, but not all, batch failures. It works by writing fixed-length records to the log and padding the blocks where necessary. It is not recommended due to its high DASD usage. Also, it does not work in all circumstances.
      • IDRC. IBM does not recommend the use of hardware compression on IMS batch logs. This is because it makes batch backout run like a dog—it has to expand each record to find the backwards pointer to the previous one.

Checkpoint and Extended Restart

Using CHKPT/XRST is recommended in long-running batch jobs, and is essential in n-way sharing environments. Backout is to the last checkpoint by default in non-sharing environments, and always in sharing environments.


--

Peter Armstrong joined IBM in 1976 and was the UK Country IMS specialist. He helped design parts of DBRC and wrote the Recovery/Restart procedures for IMS disk logging. He joined BMC in 1986 and travels the world discussing computing issues with customers, analysts, etc. He has used all this technical and practical experience to write a book on how DBRC works in practice rather than a boring theoretical tome. He hopes you will enjoy it.



Contributors : Peter Armstrong
Last modified 2006-01-04 12:55 PM