Control is then returned to the instruction following the SVC 76
rather than reflecting the SVC. This eliminates the duplication of
error recording in VM/370 and the operating system in the virtual
machine. If DMKYER determines that the recording represented a
permanent I/O error, a message is sent to the primary system operator.

ERROR RECORDING AND RECOVERY

The error recording facility is made up of four modules. One module
(DMKIOE) is resident and the other three (DMKIOC, DMKIOF, and DMKIOG)
are pageable.

The error recording modules record temporary errors (statistical data
recording) for CP generated I/O except for DASDs with a buffered log.
The error recording routines record: unit checks, statistical data
counter overflow records, selected temporary DASD errors, machine
checks, channel checks, and hardware environmental counter sense data on
the error recording cylinders of the system resident device in a format
suitable for subsequent processing by the CPEREP command (DMSIFC). The
recorder asynchronously updates the statistical data counters for
supported devices. The recorder also initializes the error recording
cylinders at IPL if they are in an unrecognizable format.

When the recorder is entered from DMKIOS, it is entered at DMKIOERR.
This entry is used for unit checks and channel data checks. A test is
made of the failing CSW (located in the IOERBLOK) to see if the error
was a channel error. If it was, control is passed to the routine for
recording channel checks.
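The CSW test described above can be sketched as follows. This is a minimal illustration, not the DMKIOE logic: it assumes the standard System/370 CSW layout (channel status in byte 5) and assumes that "channel error" means any of channel data check, channel control check, or interface control check. The function name is hypothetical.

```python
# Channel status bits in byte 5 of a System/370 CSW.
CHANNEL_DATA_CHECK = 0x08
CHANNEL_CONTROL_CHECK = 0x04
INTERFACE_CONTROL_CHECK = 0x02

# Assumption: these three bits are what the text calls a "channel error".
CHANNEL_ERROR_MASK = (
    CHANNEL_DATA_CHECK | CHANNEL_CONTROL_CHECK | INTERFACE_CONTROL_CHECK
)


def is_channel_error(csw: bytes) -> bool:
    """Return True if the 8-byte CSW shows a channel error (hypothetical helper)."""
    if len(csw) != 8:
        raise ValueError("a CSW is 8 bytes long")
    return bool(csw[5] & CHANNEL_ERROR_MASK)
```

A CSW with only unit check set in the unit status byte (byte 4) would take the unit check path; one with any of the masked channel status bits set would be routed to channel check recording.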
The IOERBLOK sense data, IOBLOK flags, and VMBLOK privilege class are
examined to determine if the error should be recorded.

ERROR RECORD WRITING

After an error record is formatted, it is added to the error recording
cylinder using DMKRPAGT and DMKRPAPT. The error recording cylinders
have page-sized records (4096 bytes). Each page contains a header (8
bytes) which signifies: the cylinder and page number of the page (4
bytes), the next available space for recording within page (2 bytes), a
page-in-use indicator (1 byte), and a flag byte. Each record within the
page is recorded with a 4-byte prefix.
If an error record is too large to fit in the space remaining in a
page, a new page is retrieved, updated with the record, and placed back
on the error recording cylinder with the paging routines.
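The page layout described above can be modeled as a short sketch. The field widths (4 + 2 + 1 + 1 bytes) and the 4-byte record prefix come from the text; everything else is an assumption made for illustration: the "next available space" field is treated as a byte offset into the page, the 4-byte cylinder and page field is kept opaque, and the prefix contents (a 2-byte length plus 2 reserved bytes) are hypothetical.

```python
import struct

PAGE_SIZE = 4096  # page-sized record, per the text

# 8-byte header: cylinder/page (4), next available space (2),
# page-in-use indicator (1), flag byte (1). Big-endian, as on System/370.
HEADER = struct.Struct(">IHBB")

# Hypothetical 4-byte record prefix: record length (2) + reserved (2).
PREFIX = struct.Struct(">HH")


def new_page(cyl_page: int) -> bytearray:
    """Build an empty page whose next-available offset follows the header."""
    page = bytearray(PAGE_SIZE)
    HEADER.pack_into(page, 0, cyl_page, HEADER.size, 1, 0)
    return page


def append_record(page: bytearray, record: bytes) -> bool:
    """Add a prefixed record to the page; return False if it does not fit."""
    cyl_page, next_free, in_use, flags = HEADER.unpack_from(page, 0)
    needed = PREFIX.size + len(record)
    if next_free + needed > PAGE_SIZE:
        return False  # caller would retrieve a new page instead
    PREFIX.pack_into(page, next_free, len(record), 0)
    start = next_free + PREFIX.size
    page[start:start + len(record)] = record
    HEADER.pack_into(page, 0, cyl_page, next_free + needed, in_use, flags)
    return True
```

When `append_record` returns False, the caller corresponds to the case in the text where a new page is retrieved and the record is written there.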
From two to nine cylinders are used for error recording; errors are
recorded in the order in which they occur. The cylinders that are used
for error recording are specified by the installation or system programmer at system generation time. If the error recording cylinders
become 90 percent full, a message is issued to the operator using DMKQCNWT to warn him of the condition. If the cylinders become full,
another message is issued to inform the operator and recording is
stopped.

On the 3031, 3032, and 3033 processors, frame records are read from
the SRF device and written on the error recording cylinders during
initialization if no records exist after a CPEREP CLEARF operation.
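The 90 percent warning and the full condition described earlier in this section amount to a two-threshold check. The sketch below assumes fullness is measured in pages used out of pages available; the text does not say how CP measures it, and the function name and return values are hypothetical.

```python
def recording_status(pages_used: int, pages_total: int) -> str:
    """Classify the recording area as 'ok', 'warn' (90% full), or 'full'."""
    if pages_total <= 0:
        raise ValueError("pages_total must be positive")
    if pages_used >= pages_total:
        return "full"  # operator is informed and recording stops
    # Integer arithmetic avoids floating-point rounding at the boundary.
    if pages_used * 10 >= pages_total * 9:
        return "warn"  # operator is warned; recording continues
    return "ok"
```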
1-162 IBM VM/370 System Logic and Problem Determination--Volume 1
If a channel check error is to be recorded, the recorder is entered
at DMKIOERR or DMKIOECC. The channel check handler determines the
entry. A channel check error record is formatted.
A machine check enters at DMKIOEMC. Pointers are passed from the
machine check handler in registers 6 and 7 to locate a buffer where the
machine check record and length are saved. A machine check error record
is recorded with the saved machine check logout and additional
information. The machine check error record is written onto the error
recording cylinder by using the paging routines.
Hardware environmental counter records are formed using routine DMKIOEEV. This routine is scheduled by DMKIOS after control is returned
from the ERP. Sense data information is stored in the IOERBLOK by the
ERP. The record formed is called a nonstandard record.

DMKIOEFM is called by DMSIFC (CPEREP command) via a DIAGNOSE
instruction. DMKIOEFM is invoked to reset the specified error recording
cylinders (if CLEAR, CLEARF, or ZERO=Y was specified). The clear is
performed by resetting each page-header, space-available field. Pointers in storage are then updated to address the first available page
on each of the error recording cylinders. Control is then returned to
the calling routine. For details on the CPEREP command and EREP execution, refer to the Guide and OS/VS EREP publications.
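The clear operation described above can be sketched as a pass over the page headers. The 8-byte header layout is taken from the description of the error recording pages; treating the space-available field as a byte offset that is reset to just past the header is an assumption, as is leaving the in-use indicator and flag byte unchanged. The function name is hypothetical.

```python
import struct

PAGE_SIZE = 4096
# cylinder/page (4), space-available (2), in-use (1), flags (1)
HEADER = struct.Struct(">IHBB")


def clear_pages(pages: list) -> int:
    """Reset the space-available field of every page header.

    Returns the index of the first available page, which after a
    clear is simply the first page (hypothetical convention).
    """
    for page in pages:
        cyl_page, _old_offset, in_use, flags = HEADER.unpack_from(page, 0)
        # Only the space-available field changes; record data is not erased.
        HEADER.pack_into(page, 0, cyl_page, HEADER.size, in_use, flags)
    return 0
```

Because only the header field is rewritten, the clear is cheap: old record bytes remain in the page until new records overwrite them.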
CLEARF on a 3031, 3032, or 3033 processor clears the cylinders, then
causes the frame records to be read from the SRF device.

DMKIOEFL is called by DMKCPI to find the first available page that can
be used for error recording. The paging routines, DMKRPAPT and
DMKRPAGT, are used to read the error recording cylinders' pages
(4096-byte records). As each page record is read, it is examined to see
if this record is the last recorded. If so, a pointer in storage is
saved so recording can continue on that page record. Control is then
returned to the caller. If any error recording cylinder is in an
unrecognizable format, the error recording area is automatically
reformatted by CP.

DASD ERROR RECOVERY, ERP (DMKDAS)

Error recovery is attempted for CP-initiated I/O operations to its
supported devices and for user-initiated operations to CP-supported
devices that use a DIAGNOSE interface. The primary control blocks used
for error recovery are the RDEVBLOK, the IOBLOK and the IOERBLOK. In
addition, auxiliary storage is sometimes used for recovery channel
programs and sense buffers.
The initial error is first detected by the I/O interruption handler
which performs a SENSE operation if a unit check occurs. Unit check
errors are then passed to an appropriate ERP. If a channel check is
encountered, the channel check interruption handler determines whether