What: /sys/devices/system/machinecheck/machinecheckX/tolerant
Contact: Borislav Petkov <bp@suse.de>
Date: Dec, 2021
Description:
Unused and obsolete since the advent of recoverable machine
checks (see the last sentence below), which have been present
since 2010 (Nehalem).

Original description:

The entries appear for each CPU, but they are truly shared
between all CPUs.

Tolerance level. When a machine check exception occurs for an
uncorrected machine check, the kernel can take different
actions.

Since machine check exceptions can happen at any time, it is
sometimes risky for the kernel to kill a process because doing
so defies normal kernel locking rules. The tolerance level
configures how hard the kernel tries to recover, even at some
risk of deadlock. Higher tolerant values trade potentially
better uptime against the risk of a crash or even corruption
(for tolerant >= 3).

==  ===========================================================
 0  always panic on uncorrected errors, log corrected errors
 1  panic or SIGBUS on uncorrected errors, log corrected errors
 2  SIGBUS or log uncorrected errors, log corrected errors
 3  never panic or SIGBUS, log all errors (for testing only)
==  ===========================================================

Default: 1

Note that this only makes a difference if the CPU allows
recovery from a machine check exception. Current x86 CPUs
generally do not.
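
Illustrative sketch (not part of the ABI): the tolerance table above can be
expressed as a small lookup, together with a helper that reads the attribute
where it still exists. The names describe_tolerant and read_tolerant are
hypothetical and chosen only for this example. Because the attribute is
obsolete, the helper assumes the file may be absent and returns None in that
case instead of failing.

```python
from pathlib import Path
from typing import Optional

# Policy table transcribed from the original description above.
TOLERANT_LEVELS = {
    0: "always panic on uncorrected errors, log corrected errors",
    1: "panic or SIGBUS on uncorrected errors, log corrected errors",
    2: "SIGBUS or log uncorrected errors, log corrected errors",
    3: "never panic or SIGBUS, log all errors (for testing only)",
}

# Documented default tolerance level.
DEFAULT_TOLERANT = 1


def describe_tolerant(level: int) -> str:
    """Return the documented policy for a tolerance level listed in the table."""
    if level not in TOLERANT_LEVELS:
        raise ValueError(f"tolerant level {level} is not listed in the table")
    return TOLERANT_LEVELS[level]


def read_tolerant(cpu: int = 0,
                  root: str = "/sys/devices/system/machinecheck") -> Optional[int]:
    """Read machinecheck<cpu>/tolerant under root (hypothetical helper).

    Returns None when the file is missing, unreadable, or not an integer,
    which is the expected outcome on kernels that no longer expose it.
    """
    path = Path(root) / f"machinecheck{cpu}" / "tolerant"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    value = read_tolerant(0)
    if value is None:
        print("tolerant attribute not available on this system")
    elif value in TOLERANT_LEVELS:
        print(f"tolerant={value}: {describe_tolerant(value)}")
    else:
        print(f"tolerant={value}: not listed in the table")
```

Since the entries are shared between all CPUs, reading machinecheck0 is
assumed to be sufficient for this sketch.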