The CPU always enters the check-stop state if the check-stop-control
bit, bit 0 of control register 14, is one and if any of the following
conditions exists:
• PSW bit 13 is zero and an exigent machine-check condition is
  generated.
• During the execution of an interruption due to one exigent
  machine-check condition, another exigent machine-check condition is
  detected.
• During a machine-check interruption, the machine-check-interruption
  code cannot be stored successfully, or the new PSW cannot be fetched
  successfully.
• Invalid CBC is detected in the prefix register.
• A malfunction in the receiving CPU, which is detected after
  accepting the order, prevents the successful completion of a SIGNAL
  PROCESSOR order, and either the order was a reset or the receiving
  CPU cannot determine what the order was. The receiving CPU enters
  the check-stop state.
If the check-stop-control bit is zero when one of these conditions
occurs, the CPU may or may not enter the check-stop state, depending
on the model. Particular models may have many other conditions in
which an error causes check stop.
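The rule above can be sketched as a small decision function. This is only an illustrative model, not part of the architecture definition: the names Conditions, CheckStopOutcome, and check_stop_outcome are hypothetical, and model-specific conditions that the text mentions but does not enumerate are left out.

```python
from dataclasses import dataclass
from enum import Enum


class CheckStopOutcome(Enum):
    ENTERS = "CPU enters the check-stop state"
    MODEL_DEPENDENT = "CPU may or may not enter the check-stop state"
    NO_LISTED_CONDITION = "none of the listed conditions exists"


@dataclass
class Conditions:
    # One flag per listed condition; the field names are hypothetical.
    exigent_with_psw_bit_13_zero: bool = False
    exigent_during_exigent_interruption: bool = False
    mcic_store_or_new_psw_fetch_failed: bool = False
    invalid_cbc_in_prefix_register: bool = False
    sigp_reset_or_unknown_order_failed: bool = False


def check_stop_outcome(check_stop_control: bool, c: Conditions) -> CheckStopOutcome:
    """Apply the rule: control bit one -> always; zero -> model-dependent."""
    if not any(vars(c).values()):
        return CheckStopOutcome.NO_LISTED_CONDITION
    if check_stop_control:  # bit 0 of control register 14 is one
        return CheckStopOutcome.ENTERS
    return CheckStopOutcome.MODEL_DEPENDENT
```

For example, check_stop_outcome(True, Conditions(invalid_cbc_in_prefix_register=True)) yields CheckStopOutcome.ENTERS, while the same condition with the control bit zero yields CheckStopOutcome.MODEL_DEPENDENT.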
When the CPU is in the check-stop state,
instructions and interruptions are not
executed, the interval timer is not
updated, and channel operations may be
stopped. In systems with channel-set
switching, I/O operations are normally
not affected. The TOD clock is normally
not affected by the check-stop state.
The CPU timer may or may not run in the
check-stop state, depending on the error
and the model. The start key and stop
key are not effective in this state.
The CPU may be removed from the
check-stop state by CPU reset.
In a multiprocessing configuration, a CPU entering the check-stop
state generates a request for a malfunction-alert external
interruption to all CPUs in the configuration. Except for the
reception of a malfunction alert, other CPUs and channels not
connected to the malfunctioning CPU are normally unaffected by the
check-stop state in a CPU. However, depending on the nature of the
condition causing the check stop, other CPUs may also be delayed or
stopped, and I/O activity for channels connected to other CPUs may be
affected.
System Check Stop
In a multiprocessing configuration, some errors, malfunctions, and
damage conditions are of such severity that the condition causes all
CPUs in the configuration to enter the check-stop state. This
condition is called a system check stop. The state of the channels is
unpredictable.
Programming Note
The program should avoid setting the check-stop control, bit 0 of
control register 14, to zero, since the machine may continue to
operate rather than enter the check-stop state when extremely serious
conditions, such as an error in the prefix register, occur.
MACHINE-CHECK INTERRUPTION
A request for a machine-check interruption, which is made pending as
the result of a machine check, is called a machine-check-interruption
condition. There are two types of machine-check-interruption
conditions: exigent conditions and repressible conditions.
EXIGENT CONDITIONS
Exigent machine-check-interruption conditions are those in which
damage has or would have occurred such that execution of the current
instruction or interruption sequence cannot safely continue. Exigent
conditions include two subclasses: instruction-processing damage and
system damage. In addition to indicating specific exigent conditions,
system damage is used to report any malfunction or error which cannot
be isolated to a less severe report.
Exigent conditions for instruction sequences can be either nullifying
exigent conditions or terminating exigent conditions, according to
whether the instructions affected are nullified or terminated. Exigent
conditions for interruption sequences are terminating exigent
conditions. The terms "nullification" and "termination" have the same
meaning as that used in Chapter 6, "Interruptions," except that more
than one instruction may be involved. Thus, a nullifying exigent
condition indicates that the CPU has returned to the beginning of a
unit of operation prior to the error. A terminating exigent condition
means that the results of one or more instructions may have
unpredictable values.
REPRESSIBLE CONDITIONS
Repressible machine-check-interruption conditions are those in which
the results of the instruction-processing sequence have not been
affected. Repressible conditions can be delayed until the completion
of the current instruction, or even longer, without affecting the
integrity of CPU operation. Repressible conditions fall into three
groups: recovery, alert, and repressible damage. Each group includes
one or more subclasses.
A malfunction in the CPU, storage, channel, or operator facilities
which has been successfully corrected or circumvented internally
without logical damage is called a recovery condition. Depending on
the model and the type of malfunction, some or all recovery conditions
may be discarded and not reported. Recovery conditions that are
reported are grouped in one subclass, system recovery.
A machine-check-interruption condition not directly related to a
machine malfunction is called an alert condition. The alert conditions
are grouped in two subclasses: degradation and warning.
A malfunction resulting in an incorrect state of a portion of the
system not directly affecting sequential CPU operation is called a
repressible-damage condition. Repressible-damage conditions are
grouped in five subclasses, according to the function affected:
timing-facility damage, interval-timer damage, external damage,
service-processor damage, and vector-facility failure.
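The classification described in this section can be summarized as a lookup table. The sketch below is illustrative only: the names Classification, SUBCLASSES, and can_be_held_pending are hypothetical, and the table records only the types, groups, and subclasses named in the text, not how they are encoded in the machine-check-interruption code.

```python
from typing import NamedTuple, Optional


class Classification(NamedTuple):
    condition_type: str   # "exigent" or "repressible"
    group: Optional[str]  # only repressible conditions are divided into groups


# Subclass name -> classification, as listed in the text.
SUBCLASSES = {
    "instruction-processing damage": Classification("exigent", None),
    "system damage": Classification("exigent", None),
    "system recovery": Classification("repressible", "recovery"),
    "degradation": Classification("repressible", "alert"),
    "warning": Classification("repressible", "alert"),
    "timing-facility damage": Classification("repressible", "repressible damage"),
    "interval-timer damage": Classification("repressible", "repressible damage"),
    "external damage": Classification("repressible", "repressible damage"),
    "service-processor damage": Classification("repressible", "repressible damage"),
    "vector-facility failure": Classification("repressible", "repressible damage"),
}


def can_be_held_pending(subclass: str) -> bool:
    """Repressible conditions can be delayed; exigent conditions cannot."""
    return SUBCLASSES[subclass].condition_type == "repressible"
```

Under this model, can_be_held_pending("warning") is true and can_be_held_pending("system damage") is false, which mirrors the exigent/repressible distinction rather than any measure of severity.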
Programming Notes
1. Even though repressible conditions are usually reported only at
normal points of interruption, they may also be reported with exigent
machine-check conditions. Thus, if an exigent machine-check condition
causes an instruction to be abnormally terminated and a machine-check
interruption occurs to report the exigent condition, any pending
repressible conditions may also be reported. Whether the validity
bits are meaningful depends on which exigent condition is reported.
2. Classification of damage as either
exigent or repressible does not
imply the severity of the damage.
The distinction is whether action
must be taken as soon as the damage
is detected (exigent) or whether
the CPU can continue processing
(repressible). For a repressible
condition, the current instruction
can be completed before taking the
machine-check interruption if the CPU is enabled for machine checks;
if the CPU is disabled for machine
checks, the condition can safely be
kept pending until the CPU is again
enabled for machine checks.
For example, the CPU may be disabled for machine-check interruptions
because it is handling an earlier instruction-processing-damage
interruption. If, during that time, an I/O operation encounters a
storage error, that condition can be kept pending because it is not
expected to interfere with the current machine-check processing. If,
however, the CPU also makes a reference to the area of storage
containing the error before re-enabling machine-check interruptions,
another instruction-processing-damage condition is created, which is
treated as an exigent condition and causes the CPU to enter the
check-stop state, if the check-stop-control bit is set to one.
INTERRUPTION ACTION
A machine-check interruption causes the
following actions to be taken. The PSW reflecting the point of
interruption is stored as the machine-check old PSW at real location
48. The contents of other registers are stored in register-save areas
at real locations 216-231 and 352-511. After the contents of the
registers are stored in register-save areas, depending on the model,
the registers may be validated, with their contents then being
unpredictable. A failing-storage address may be stored at real
location 248, an external-damage code may be stored at real location
244, and a region code may be stored at real location 252. A
machine-check-interruption code (MCIC) of eight bytes is placed at
real location 232. The new PSW is fetched from real location 112.
Additionally, sometime before the storing of the MCIC, one or more
machine-check logouts may have occurred. The machine-generated
addresses to access the old and new PSW, the MCIC, extended
interruption information, and the fixed-logout area are all real
addresses. The machine-check extended-logout address is also a real
address. The fields accessed during the machine-check interruption
are summarized in the figure "Machine-Check-Interruption Locations."
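As a rough aid to reading the paragraph above, the real locations it names can be collected into a table of (start, length) pairs. This is a sketch under stated assumptions: the constant and function names are hypothetical, the 8-byte PSW length and big-endian byte order are general System/370 properties rather than statements from this paragraph, and the 4-byte lengths of the external-damage code, failing-storage address, and region code are inferred from the spacing of their locations, not given in the text.

```python
# (start, length) in bytes; all addresses are real addresses from the text.
MCK_OLD_PSW = (48, 8)               # machine-check old PSW
MCK_NEW_PSW = (112, 8)              # machine-check new PSW
REGISTER_SAVE_AREAS = [(216, 16),   # real locations 216-231
                       (352, 160)]  # real locations 352-511
MCIC = (232, 8)                     # eight bytes, per the text
EXTERNAL_DAMAGE_CODE = (244, 4)     # length is an assumption
FAILING_STORAGE_ADDRESS = (248, 4)  # length is an assumption
REGION_CODE = (252, 4)              # length is an assumption


def read_field(storage: bytes, field: tuple) -> bytes:
    """Return the bytes of one field from an image of low real storage."""
    start, length = field
    return storage[start:start + length]


def read_mcic(storage: bytes) -> int:
    """Return the MCIC as an unsigned 64-bit integer (big-endian)."""
    return int.from_bytes(read_field(storage, MCIC), "big")
```

A program examining a dump of the first 512 bytes of real storage could, for example, call read_mcic(image) to obtain the interruption code before inspecting the register-save areas.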