The error recovery routine keeps track of the number of retries in
the IOBRCNT field of the IOBLOK. This count determines if a retry limit has been exceeded for a particular error. On initial entry from DMKICS for an error condition, the count is zero. Each time a retry is
attempted, the count is increased by one.
The ERP preserves the original error CSW and sense information by placing a pointer to the original IOERBLOK in the RDEVBLOK. Additional IOERBLOKs, which are received from DMKIOS on failing restart attempts,
are discarded. The original IOERBLOK is thus preserved for recording
purposes.
If, after a specified number of retries, DMKDAS fails to correct the
error, the operator may or may not be notified of the error. Control is
returned to DMKIOS. DMKIOS is notified of the permanent error by posting
the IOBLOK (IOBSTAT=IOBFATAL). The error is recorded via DMKIOS if DMKDAS and DMKIOE determine that the error warrants
recording.
If the error is corrected by a restart, the temporary or transient
error is not recorded. Control is returned to DMKIOS with the error flag
off.
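The retry accounting described above can be sketched as follows. This is a minimal illustration in Python, not CP code: the names Iob, RETRY_LIMIT, and attempt_recovery are hypothetical, the limit of 10 is an assumed value (the text says only "a specified number of retries"), and the restart itself is simulated by a caller-supplied function.

```python
from dataclasses import dataclass
from typing import Callable

# Assumed value for illustration; the source does not state the limit.
RETRY_LIMIT = 10


@dataclass
class Iob:
    """Hypothetical stand-in for the fields of the IOBLOK used here."""
    retry_count: int = 0   # plays the role of IOBRCNT; zero on initial entry
    fatal: bool = False    # plays the role of IOBFATAL in IOBSTAT


def attempt_recovery(iob: Iob, restart_ok: Callable[[int], bool]) -> str:
    """Retry until a restart succeeds or the retry limit is reached.

    restart_ok(n) simulates the outcome of the n-th restart attempt.
    """
    while iob.retry_count < RETRY_LIMIT:
        iob.retry_count += 1          # count is increased by one per retry
        if restart_ok(iob.retry_count):
            iob.fatal = False         # corrected: return with error flag off
            return "recovered"
    iob.fatal = True                  # limit exceeded: post permanent error
    return "permanent error"
```

For example, attempt_recovery(Iob(), lambda n: n == 3) returns "recovered" with a retry count of 3, while a restart that never succeeds ends with the fatal indicator set once the assumed limit is reached.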
Before returning control to DMKIOS on either a permanent error or a
successful recovery, the ERP frees all auxiliary storage gotten for
recovery CCWs, buffers, and IOERBLOKs, and updates the statistical
counters for 2314 and 2319 devices.
The DMKIOS interface with the ERP uses the IOBSTAT and IOBFLAG fields
of the IOBLOK to determine the action required when the ERP returns to DMKIOS. When retry is to be attempted, the ERP turns on the restart bit of
the IOBFLAG field. The ERP bit of the IOBFLAG field is also turned on to
indicate to DMKIOS that the ERP wants control back when the task has
finished. This enables the ERP to receive control even if the retry was
successful and allows the freeing of all storage gotten for CCWs and temporary buffers. The IOBRCAW is set to the recovery CCW string
address.
In handling an intervention-required condition, the ERP sends a
message to the operator and then waits for the device end to arrive.
This is accomplished by a return to DMKIOS with the ERP bit in the
IOBFLAG field set on and the IOBSTRT bit in the IOBFLAG field set off. When the device end interruption arrives, the original channel program
which was interrupted is then started.
The ERP flags of the IOERBLOK are also used to indicate when special
recovery is being attempted. An example is a READ HOME ADDRESS command issued when a no record found error occurs.
The other two indicators are self-explanatory and are explained in
Figure 22.
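+---------+----------+----------+------------------------------------------+
| Field   | Field    | Field    |                                          |
| IOBFLAG | IOBFLAG  | IOBSTAT  |                                          |
| IOBERP  | IOBRSTRT | IOBFATAL | Action To Be Performed by DMKIOS         |
+---------+----------+----------+------------------------------------------+
|    1    |    0     |    0     | Return control when solicited device end |
|         |          |          | arrives                                  |
|    1    |    1     |    0     | Restart using IOBRCAW                    |
|    0    |    0     |    1     | Permanent I/O error                      |
|    0    |    0     |    0     | Retry successful                         |
+---------+----------+----------+------------------------------------------+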
Figure 22. Summary of IOB Indicators
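The four indicator combinations in Figure 22 amount to a small lookup. The sketch below is illustrative only: the function name dmkios_action is hypothetical, the three indicators are passed as separate 0/1 values rather than as bits within IOBFLAG and IOBSTAT (the bit positions are not given in this section), and combinations that the figure does not list are returned as None rather than guessed.

```python
from typing import Optional

# Keys are (IOBERP, IOBRSTRT, IOBFATAL) exactly as listed in Figure 22.
_ACTIONS = {
    (1, 0, 0): "Return control when solicited device end arrives",
    (1, 1, 0): "Restart using IOBRCAW",
    (0, 0, 1): "Permanent I/O error",
    (0, 0, 0): "Retry successful",
}


def dmkios_action(ioberp: int, iobrstrt: int, iobfatal: int) -> Optional[str]:
    """Return the action for one indicator combination.

    Returns None for a combination that Figure 22 does not list.
    """
    key = (int(bool(ioberp)), int(bool(iobrstrt)), int(bool(iobfatal)))
    return _ACTIONS.get(key)
```

For example, dmkios_action(1, 1, 0) selects the restart case, and dmkios_action(1, 0, 0) corresponds to the intervention-required case, where the ERP waits for the device end.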
If the error is uncorrectable or intervention is required, the ERP calls DMKMSW to notify the operator. The specific message is identified in
the MSGPARM field of the IOERBLOK.
ALTERNATE TRACK RECOVERY, ERP (DMKTRK)
The software alternate track recovery support described in the following
paragraphs applies only to the 3340/3344 disk. For 3330 and 3350 disks
no software support is needed since the hardware performs alternate
track recovery. No support is needed for the 2305 drum since the CE is
able to rewire the device to use spare tracks in place of defective
tracks. For the 2314 and 2319 disks no true alternate track recovery is
provided by CP. But track condition checks from any device type are
reflected back to the virtual machine. Therefore, even though CP itself
cannot use a 2314 or 2319 cylinder that contains a defective track, it
is possible for a virtual machine to use such a cylinder if it provides
its own error recovery. To facilitate this, the VM/370 version of the
IBCDASDI program allows 2314 and 2319 minidisks to be formatted with an
alternate track cylinder as the last cylinder of each minidisk rather
than using the last cylinders of the real disk for this purpose.
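The 2314/2319 minidisk layout just described, with the alternate track cylinder placed as the last cylinder of the minidisk, can be sketched as simple extent arithmetic. This is an assumption-laden illustration: the names MinidiskLayout and minidisk_layout are hypothetical, cylinders are given as real cylinder numbers, and nothing here reflects how IBCDASDI actually records the layout.

```python
from typing import NamedTuple


class MinidiskLayout(NamedTuple):
    data_cyls: range     # real cylinders available for data
    alternate_cyl: int   # real cylinder holding the alternate tracks


def minidisk_layout(start_cyl: int, n_cyls: int) -> MinidiskLayout:
    """Split a minidisk extent into data cylinders and one alternate cylinder.

    start_cyl is the first real cylinder of the minidisk and n_cyls its
    size in cylinders; the last cylinder is reserved for alternate tracks.
    """
    if n_cyls < 2:
        raise ValueError("need at least one data cylinder and one alternate")
    last = start_cyl + n_cyls - 1
    return MinidiskLayout(data_cyls=range(start_cyl, last), alternate_cyl=last)
```

Under this layout a seek to the alternate cylinder stays inside the minidisk extent, which is the point of formatting the minidisk this way.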
The 3340 alternate track support applies to CP I/O, to Diagnose I/O (thereby giving alternate track support to CMS), and to SIO executed in
a virtual machine. For CP I/O and Diagnose I/O, the alternate track
recovery support essentially consists of directing (seeking) an
interrupted channel program to an alternate track and restarting it.
Later, in some cases, the interrupted channel program is directed back
to the original cylinder and restarted there. For SIO in a virtual machine, the operating system in the virtual machine provides its own
error recovery when CP reflects a track condition check to the virtual machine. On the 3340 disk, alternate tracks are assigned in the conventional
alternate track cylinders at the high end of the real disk, not in the
last cylinder of each minidisk. Therefore a virtual machine may need to
seek outside of its minidisk extent. This occurs when an operating
system in a virtual machine performs its own error recovery following a
track condition check. So for SIO issued from a virtual machine, CP's