Last reviewed: 08/20/2018

                     HPE iLO NMI Watchdog Driver
                   for iLO based ProLiant Servers

 The HPE iLO NMI Watchdog driver is a kernel module that provides basic
 watchdog functionality and a handler for the iLO "Generate NMI to System"
 virtual button.

 All references to iLO in this document also apply to iLO2 and all
 subsequent generations.

 Watchdog functionality is enabled as with any other common watchdog driver:
 an application must be started that starts the watchdog timer and keeps
 updating it. A basic application, watchdog-test.c, exists in
 tools/testing/selftests/watchdog/. Compile the C file and run it. If the
 system gets into a bad state and hangs, the HPE ProLiant iLO timer register
 will not be updated in time and a hardware system reset, also known as an
 Automatic Server Recovery (ASR) event, will occur.
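
 The keepalive loop described above can be sketched as follows. This is a
 minimal illustration, not the selftest program itself: the function name
 wd_kick and its parameters are hypothetical, and it assumes the standard
 /dev/watchdog ioctl interface from <linux/watchdog.h>. Note that opening
 the device starts the timer.

```c
#include <fcntl.h>
#include <linux/watchdog.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open the watchdog device at dev_path, send `count`
 * keepalives spaced `interval_s` seconds apart, then attempt a magic close.
 * Returns 0 on success, -1 if the device cannot be opened, a keepalive
 * fails, or the magic close character cannot be written.
 */
int wd_kick(const char *dev_path, unsigned int interval_s, unsigned int count)
{
    int fd = open(dev_path, O_WRONLY);   /* opening starts the timer */
    if (fd < 0)
        return -1;

    int rc = 0;
    for (unsigned int i = 0; i < count; i++) {
        /* Each keepalive restarts the countdown. */
        if (ioctl(fd, WDIOC_KEEPALIVE, 0) < 0) {
            rc = -1;
            break;
        }
        sleep(interval_s);
    }

    /*
     * Magic close: writing 'V' before close() asks the driver to stop
     * the timer. It has no effect when nowayout is set.
     */
    if (write(fd, "V", 1) != 1)
        rc = -1;
    close(fd);
    return rc;
}
```

 A caller would typically pass "/dev/watchdog" and an interval comfortably
 shorter than the configured timeout, for example wd_kick("/dev/watchdog",
 10, 6) with the default 30 second timeout.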

 The hpwdt driver also has the following module parameters:

 soft_margin - allows the user to set the watchdog timer value.
               Default value is 30 seconds.
 timeout     - an alias of soft_margin.
 pretimeout  - allows the user to set the watchdog pretimeout value.
               This is the number of seconds before timeout when an
               NMI is delivered to the system. Setting the value to
               zero disables the pretimeout NMI.
               Default value is 9 seconds.
 nowayout    - basic watchdog parameter that does not allow the timer to
               be restarted or an impending ASR to be escaped.
               Default value is set when compiling the kernel. If it is set
               to "Y", then there is no way of disabling the watchdog once
               it has been started.
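
 As an illustration, the parameters above could be set persistently with a
 modprobe configuration fragment. The file name and the chosen values are
 assumptions for the example; only the parameter names come from this
 document.

```
# /etc/modprobe.d/hpwdt.conf (hypothetical file name)
# 60 second timeout, NMI delivered 9 seconds before it expires
options hpwdt soft_margin=60 pretimeout=9
```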

 NOTE: More information about watchdog drivers in general, including the ioctl
       interface to /dev/watchdog, can be found in
       Documentation/watchdog/watchdog-api.txt and Documentation/IPMI.txt.

 Due to limitations in the iLO hardware, the NMI pretimeout, if enabled,
 can only be set to 9 seconds. Attempts to set pretimeout to other
 non-zero values will be rounded, possibly to zero. Users should verify
 the pretimeout value after attempting to set pretimeout or timeout.
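
 One way to perform that verification is to read the value back through the
 generic watchdog ioctl interface right after setting it. The sketch below
 assumes an already open /dev/watchdog file descriptor; the helper names
 wd_set_pretimeout and wd_set_timeout are hypothetical.

```c
#include <linux/watchdog.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: request a pretimeout (in seconds) on an open
 * watchdog fd, then read back the value the driver actually accepted.
 * Returns the effective pretimeout, or -1 if either ioctl fails.
 */
int wd_set_pretimeout(int fd, int requested_s)
{
    int value = requested_s;

    if (ioctl(fd, WDIOC_SETPRETIMEOUT, &value) < 0)
        return -1;
    if (ioctl(fd, WDIOC_GETPRETIMEOUT, &value) < 0)
        return -1;
    return value;   /* may differ from requested_s after rounding */
}

/*
 * Hypothetical helper: same pattern for the timeout itself.
 * Returns the effective timeout in seconds, or -1 on error.
 */
int wd_set_timeout(int fd, int requested_s)
{
    int value = requested_s;

    if (ioctl(fd, WDIOC_SETTIMEOUT, &value) < 0)
        return -1;
    if (ioctl(fd, WDIOC_GETTIMEOUT, &value) < 0)
        return -1;
    return value;
}
```

 Because changing the timeout may also affect the pretimeout, a caller
 would normally re-check the pretimeout after calling wd_set_timeout as
 well.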

 Upon receipt of an NMI from the iLO, the hpwdt driver will initiate a
 panic. This allows a crash dump to be collected. It is the user's
 responsibility to configure the system properly for kdump.

 The default Linux kernel behavior upon panic is to print a kernel tombstone
 and loop forever. This is generally not what a watchdog user wants.

 For those wishing to learn more, please see:
	Documentation/kdump/kdump.txt
	Documentation/admin-guide/kernel-parameters.txt (panic=)
	Your Linux Distribution specific documentation.
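
 To check how a running system is configured to behave after a panic, the
 value set by panic= can be read back from the kernel.panic sysctl. The
 following is a small sketch under that assumption; the function name
 panic_reboots is hypothetical, and the path is passed in by the caller
 (normally /proc/sys/kernel/panic).

```c
#include <stdio.h>

/*
 * Hypothetical helper: read the panic timeout from `path`.
 * Returns 1 if the kernel is set to reboot after a panic (non-zero value),
 * 0 if it will wait forever (value 0), and -1 if the file cannot be
 * opened or does not contain an integer.
 */
int panic_reboots(const char *path)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    long seconds;
    int parsed = fscanf(f, "%ld", &seconds);
    fclose(f);

    if (parsed != 1)
        return -1;
    return seconds != 0;
}
```

 A result of 0 means the system keeps the default behavior described above
 and will not reboot on its own after the panic.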

 If the hpwdt driver does not receive the NMI associated with an expiring
 timer, the iLO will proceed to reset the system at timeout if the timer
 hasn't been updated.

 --

 The HPE iLO NMI Watchdog Driver and documentation were originally developed
 by Tom Mingarelli.