Commit | Line | Data |
---|---|---|
53b95375 MCC |
1 | =================================== |
2 | Documentation for /proc/sys/kernel/ | |
3 | =================================== | |
1da177e4 | 4 | |
021622df SK |
5 | .. See scripts/check-sysctl-docs to keep this up to date |
6 | ||
7 | ||
53b95375 MCC |
8 | Copyright (c) 1998, 1999, Rik van Riel <riel@nl.linux.org> |
9 | ||
10 | Copyright (c) 2009, Shen Feng <shen@cn.fujitsu.com> | |
11 | ||
2793e19d MCC |
12 | For general info and legal blurb, please look in |
13 | Documentation/admin-guide/sysctl/index.rst. | |
53b95375 MCC |
14 | |
15 | ------------------------------------------------------------------------------ | |
1da177e4 LT |
16 | |
17 | This file contains documentation for the sysctl files in | |
d151a23d | 18 | ``/proc/sys/kernel/``. |
1da177e4 LT |
19 | |
20 | The files in this directory can be used to tune and monitor | |
21 | miscellaneous and general things in the operation of the Linux | |
a3cb66a5 | 22 | kernel. Since some of the files *can* be used to screw up your |
1da177e4 LT |
23 | system, it is advisable to read both documentation and source |
24 | before actually making adjustments. | |
25 | ||
26 | Currently, these files might (depending on your configuration) | |
a3cb66a5 SK |
27 | show up in ``/proc/sys/kernel``: |
28 | ||
29 | .. contents:: :local: | |
30 | ||
31 | ||
32 | acct | |
33 | ==== | |
34 | ||
35 | :: | |
1da177e4 | 36 | |
a3cb66a5 | 37 | highwater lowwater frequency |
1da177e4 LT |
38 | |
39 | If BSD-style process accounting is enabled these values control | |
40 | its behaviour. If free space on the filesystem where the log lives | |
30fb8761 SK |
41 | goes below ``lowwater``\ % accounting suspends. If free space gets |
42 | above ``highwater``\ % accounting resumes. ``frequency`` determines | |
1da177e4 LT |
43 | how often the amount of free space is checked (value is in |
44 | seconds). Default: | |
1da177e4 | 45 | |
a3cb66a5 | 46 | :: |
807094c0 | 47 | |
a3cb66a5 | 48 | 4 2 30 |
807094c0 | 49 | |
a3cb66a5 SK |
50 | That is, suspend accounting if free space drops below 2%; resume it |
51 | if it increases to at least 4%; consider information about the amount of | |
52 | free space valid for 30 seconds. | |
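The suspend/resume rule above behaves like a hysteresis. The following Python sketch models it for illustration only; it is not the kernel's implementation, and the helper names ``parse_acct`` and ``accounting_active`` are hypothetical. It assumes "at least ``highwater``" for resuming, as in the worked default above.

```python
def parse_acct(text):
    """Split the three whitespace-separated fields of the acct sysctl."""
    highwater, lowwater, frequency = (int(field) for field in text.split())
    return highwater, lowwater, frequency


def accounting_active(currently_active, free_percent, highwater, lowwater):
    """Illustrative model of the suspend/resume rule (not kernel code)."""
    if currently_active and free_percent < lowwater:
        return False          # suspend: free space dropped below lowwater
    if not currently_active and free_percent >= highwater:
        return True           # resume: free space is at or above highwater
    return currently_active   # between the two marks: keep the current state


highwater, lowwater, frequency = parse_acct("4 2 30")
state = True
state = accounting_active(state, 1, highwater, lowwater)  # 1% free: suspended
state = accounting_active(state, 3, highwater, lowwater)  # 3% free: still suspended
state = accounting_active(state, 4, highwater, lowwater)  # 4% free: resumed
```

Having two thresholds keeps accounting from flapping on and off when free space hovers around a single value.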
807094c0 | 53 | |
807094c0 | 54 | |
a3cb66a5 SK |
55 | acpi_video_flags |
56 | ================ | |
57 | ||
2793e19d | 58 | See Documentation/power/video.rst. This allows the video resume mode to be set, |
2bd49cb5 SK |
59 | in a similar fashion to the ``acpi_sleep`` kernel parameter, by |
60 | combining the following values: | |
61 | ||
62 | = ======= | |
63 | 1 s3_bios | |
64 | 2 s3_mode | |
65 | 4 s3_beep | |
66 | = ======= | |
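As an illustration of how the values in the table combine, here is a small Python sketch. The dictionary and the helper names ``combine_flags`` and ``decode_flags`` are made up for this example; only the three numeric values come from the table.

```python
# Bit values taken from the table above.
ACPI_VIDEO_FLAGS = {"s3_bios": 1, "s3_mode": 2, "s3_beep": 4}


def combine_flags(names):
    """OR together the bits for the given flag names."""
    value = 0
    for name in names:
        value |= ACPI_VIDEO_FLAGS[name]
    return value


def decode_flags(value):
    """List the flag names whose bit is set in value."""
    return [name for name, bit in ACPI_VIDEO_FLAGS.items() if value & bit]


combined = combine_flags(["s3_bios", "s3_mode"])   # 1 | 2
names = decode_flags(5)                            # bits 1 and 4
```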
807094c0 | 67 | |
bfca3dd3 PV |
68 | arch |
69 | ==== | |
70 | ||
71 | The machine hardware name, the same output as ``uname -m`` | |
72 | (e.g. ``x86_64`` or ``aarch64``). | |
a3cb66a5 SK |
73 | |
74 | auto_msgmni | |
75 | =========== | |
807094c0 | 76 | |
0050ee05 MS |
77 | This variable has no effect and may be removed in future kernel |
78 | releases. Reading it always returns 0. | |
a3cb66a5 SK |
79 | Up to Linux 3.17, it enabled/disabled automatic recomputing of |
80 | `msgmni`_ | |
81 | upon memory add/remove or upon IPC namespace creation/removal. | |
0050ee05 | 82 | Echoing "1" into this file enabled msgmni automatic recomputing. |
a3cb66a5 | 83 | Echoing "0" turned it off. The default value was 1. |
807094c0 | 84 | |
d75757ab | 85 | |
a3cb66a5 SK |
86 | bootloader_type (x86 only) |
87 | ========================== | |
d75757ab PA |
88 | |
89 | This gives the bootloader type number as indicated by the bootloader, | |
90 | shifted left by 4, and OR'd with the low four bits of the bootloader | |
91 | version. The reason for this encoding is that this used to match the | |
a3cb66a5 | 92 | ``type_of_loader`` field in the kernel header; the encoding is kept for |
d75757ab PA |
93 | backwards compatibility. That is, if the full bootloader type number |
94 | is 0x15 and the full version number is 0x234, this file will contain | |
95 | the value 340 = 0x154. | |
96 | ||
a3cb66a5 | 97 | See the ``type_of_loader`` and ``ext_loader_type`` fields in |
2793e19d | 98 | Documentation/x86/boot.rst for additional information. |
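The encoding described above can be checked with a few lines of Python. This is only a worked version of the arithmetic in the text; the function name ``encode_bootloader_type`` is hypothetical.

```python
def encode_bootloader_type(full_type, full_version):
    """Type number shifted left by 4, OR'd with the low four version bits."""
    return (full_type << 4) | (full_version & 0xF)


value = encode_bootloader_type(0x15, 0x234)
print(value, hex(value))
```

With the example numbers from the text, ``0x15 << 4`` is ``0x150`` and the low four bits of ``0x234`` are ``0x4``, giving ``0x154``.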
d75757ab | 99 | |
d75757ab | 100 | |
a3cb66a5 SK |
101 | bootloader_version (x86 only) |
102 | ============================= | |
d75757ab PA |
103 | |
104 | The complete bootloader version number. In the example above, this | |
105 | file will contain the value 564 = 0x234. | |
106 | ||
a3cb66a5 | 107 | See the ``type_of_loader`` and ``ext_loader_ver`` fields in |
2793e19d | 108 | Documentation/x86/boot.rst for additional information. |
d75757ab | 109 | |
d75757ab | 110 | |
5d8e5aee SK |
111 | bpf_stats_enabled |
112 | ================= | |
113 | ||
114 | Controls whether the kernel should collect statistics on BPF programs | |
115 | (total time spent running, number of times run...). Enabling | |
116 | statistics causes a slight reduction in performance on each program | |
117 | run. The statistics can be seen using ``bpftool``. | |
118 | ||
119 | = =================================== | |
120 | 0 Don't collect statistics (default). | |
121 | 1 Collect statistics. | |
122 | = =================================== | |
123 | ||
124 | ||
6bc47621 SK |
125 | cad_pid |
126 | ======= | |
127 | ||
128 | This is the pid which will be signalled on reboot (notably, by | |
129 | Ctrl-Alt-Delete). Writing a value to this file which doesn't | |
130 | correspond to a running process will result in ``-ESRCH``. | |
131 | ||
132 | See also `ctrl-alt-del`_. | |
133 | ||
134 | ||
a3cb66a5 SK |
135 | cap_last_cap |
136 | ============ | |
73efc039 DB |
137 | |
138 | Highest valid capability of the running kernel. Exports | |
a3cb66a5 | 139 | ``CAP_LAST_CAP`` from the kernel. |
73efc039 | 140 | |
73efc039 | 141 | |
a3cb66a5 SK |
142 | core_pattern |
143 | ============ | |
1da177e4 | 144 | |
a3cb66a5 | 145 | ``core_pattern`` is used to specify a core dumpfile pattern name. |
53b95375 MCC |
146 | |
147 | * max length 127 characters; default value is "core" | |
a3cb66a5 SK |
148 | * ``core_pattern`` is used as a pattern template for the output |
149 | filename; certain string patterns (beginning with '%') are | |
150 | substituted with their actual values. | |
151 | * backward compatibility with ``core_uses_pid``: | |
53b95375 | 152 | |
a3cb66a5 SK |
153 | If ``core_pattern`` does not include "%p" (default does not) |
154 | and ``core_uses_pid`` is set, then .PID will be appended to | |
1da177e4 | 155 | the filename. |
53b95375 | 156 | |
a3cb66a5 SK |
157 | * corename format specifiers |
158 | ||
159 | ======== ========================================== | |
160 | %<NUL> '%' is dropped | |
161 | %% output one '%' | |
162 | %p pid | |
163 | %P global pid (init PID namespace) | |
164 | %i tid | |
165 | %I global tid (init PID namespace) | |
166 | %u uid (in initial user namespace) | |
167 | %g gid (in initial user namespace) | |
168 | %d dump mode, matches ``PR_SET_DUMPABLE`` and | |
169 | ``/proc/sys/fs/suid_dumpable`` | |
170 | %s signal number | |
171 | %t UNIX time of dump | |
172 | %h hostname | |
f38c85f1 LW |
173 | %e executable filename (may be shortened, could be changed by prctl etc) |
174 | %f executable filename | |
a3cb66a5 SK |
175 | %E executable path |
176 | %c maximum size of core file by resource limit RLIMIT_CORE | |
177 | %<OTHER> both are dropped | |
178 | ======== ========================================== | |
53b95375 MCC |
179 | |
180 | * If the first character of the pattern is a '|', the kernel will treat | |
cd081041 MU |
181 | the rest of the pattern as a command to run. The core dump will be |
182 | written to the standard input of that program instead of to a file. | |
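The substitution rules listed above can be modelled roughly as follows. This Python sketch is a simplified illustration, not the kernel's code: the function ``expand_core_pattern`` and its ``values`` mapping are hypothetical, the caller supplies the text for each specifier, and the ``|`` pipe case is not handled.

```python
def expand_core_pattern(pattern, values, core_uses_pid=False):
    """Expand a core_pattern-like template.

    values maps a specifier letter (for example "p" or "e") to its text.
    """
    out = []
    pid_used = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch != "%":
            out.append(ch)
            i += 1
            continue
        if i + 1 == len(pattern):
            break                       # %<NUL>: the trailing '%' is dropped
        spec = pattern[i + 1]
        if spec == "%":
            out.append("%")             # %%: output one '%'
        elif spec in values:
            out.append(str(values[spec]))
            pid_used = pid_used or spec == "p"
        # %<OTHER>: both characters are dropped
        i += 2
    name = "".join(out)
    if core_uses_pid and not pid_used:
        name += "." + str(values["p"])  # core_uses_pid backward compatibility
    return name


sample = {"e": "myapp", "p": 1234, "t": 1700000000}
full = expand_core_pattern("core.%e.%p.%t", sample)
compat = expand_core_pattern("core", sample, core_uses_pid=True)
```

The sample values are invented; a real dump would take them from the crashing process.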
1da177e4 | 183 | |
1da177e4 | 184 | |
a3cb66a5 SK |
185 | core_pipe_limit |
186 | =============== | |
a293980c | 187 | |
a3cb66a5 SK |
188 | This sysctl is only applicable when `core_pattern`_ is configured to |
189 | pipe core files to a user space helper (when the first character of | |
190 | ``core_pattern`` is a '|', see above). | |
191 | When collecting cores via a pipe to an application, it is occasionally | |
192 | useful for the collecting application to gather data about the | |
193 | crashing process from its ``/proc/pid`` directory. | |
194 | In order to do this safely, the kernel must wait for the collecting | |
195 | process to exit, so as not to remove the crashing process's proc files | |
196 | prematurely. | |
197 | This in turn creates the possibility that a misbehaving userspace | |
198 | collecting process can block the reaping of a crashed process simply | |
199 | by never exiting. | |
200 | This sysctl defends against that. | |
201 | It defines how many concurrent crashing processes may be piped to user | |
202 | space applications in parallel. | |
203 | If this value is exceeded, then those crashing processes above that | |
204 | value are noted via the kernel log and their cores are skipped. | |
205 | 0 is a special value, indicating that unlimited processes may be | |
206 | captured in parallel, but that no waiting will take place (i.e. the | |
207 | collecting process is not guaranteed access to ``/proc/<crashing | |
208 | pid>/``). | |
209 | This value defaults to 0. | |
210 | ||
211 | ||
212 | core_uses_pid | |
213 | ============= | |
1da177e4 LT |
214 | |
215 | The default coredump filename is "core". By setting | |
a3cb66a5 SK |
216 | ``core_uses_pid`` to 1, the coredump filename becomes core.PID. |
217 | If `core_pattern`_ does not include "%p" (default does not) | |
218 | and ``core_uses_pid`` is set, then .PID will be appended to | |
1da177e4 LT |
219 | the filename. |
220 | ||
1da177e4 | 221 | |
a3cb66a5 SK |
222 | ctrl-alt-del |
223 | ============ | |
1da177e4 LT |
224 | |
225 | When the value in this file is 0, ctrl-alt-del is trapped and | |
a3cb66a5 | 226 | sent to the ``init(1)`` program to handle a graceful restart. |
1da177e4 LT |
227 | When, however, the value is > 0, Linux's reaction to a Vulcan |
228 | Nerve Pinch (tm) will be an immediate reboot, without even | |
229 | syncing its dirty buffers. | |
230 | ||
53b95375 MCC |
231 | Note: |
232 | when a program (like dosemu) has the keyboard in 'raw' | |
233 | mode, the ctrl-alt-del is intercepted by the program before it | |
234 | ever reaches the kernel tty layer, and it's up to the program | |
235 | to decide what to do with it. | |
1da177e4 | 236 | |
1da177e4 | 237 | |
a3cb66a5 SK |
238 | dmesg_restrict |
239 | ============== | |
eaf06b24 | 240 | |
807094c0 | 241 | This toggle indicates whether unprivileged users are prevented |
a3cb66a5 SK |
242 | from using ``dmesg(8)`` to view messages from the kernel's log |
243 | buffer. | |
244 | When ``dmesg_restrict`` is set to 0 there are no restrictions. | |
ee74db08 | 245 | When ``dmesg_restrict`` is set to 1, users must have |
a3cb66a5 | 246 | ``CAP_SYSLOG`` to use ``dmesg(8)``. |
eaf06b24 | 247 | |
a3cb66a5 SK |
248 | The kernel config option ``CONFIG_SECURITY_DMESG_RESTRICT`` sets the |
249 | default value of ``dmesg_restrict``. | |
eaf06b24 | 250 | |
eaf06b24 | 251 | |
a3cb66a5 SK |
252 | domainname & hostname |
253 | ===================== | |
1da177e4 LT |
254 | |
255 | These files can be used to set the NIS/YP domainname and the | |
256 | hostname of your box in exactly the same way as the commands | |
53b95375 MCC |
257 | domainname and hostname, i.e.:: |
258 | ||
259 | # echo "darkstar" > /proc/sys/kernel/hostname | |
260 | # echo "mydomain" > /proc/sys/kernel/domainname | |
261 | ||
262 | has the same effect as:: | |
263 | ||
264 | # hostname "darkstar" | |
265 | # domainname "mydomain" | |
1da177e4 LT |
266 | |
267 | Note, however, that the classic darkstar.frop.org has the | |
268 | hostname "darkstar" and DNS (Internet Domain Name Server) | |
269 | domainname "frop.org", not to be confused with the NIS (Network | |
270 | Information Service) or YP (Yellow Pages) domainname. These two | |
271 | domain names are in general different. For a detailed discussion | |
a3cb66a5 | 272 | see the ``hostname(1)`` man page. |
1da177e4 | 273 | |
53b95375 | 274 | |
d75829c1 SK |
275 | firmware_config |
276 | =============== | |
277 | ||
2793e19d | 278 | See Documentation/driver-api/firmware/fallback-mechanisms.rst. |
d75829c1 SK |
279 | |
280 | The entries in this directory allow the firmware loader helper | |
281 | fallback to be controlled: | |
282 | ||
283 | * ``force_sysfs_fallback``, when set to 1, forces the use of the | |
284 | fallback; | |
285 | * ``ignore_sysfs_fallback``, when set to 1, ignores any fallback. | |
286 | ||
287 | ||
50cdae76 SK |
288 | ftrace_dump_on_oops |
289 | =================== | |
290 | ||
291 | Determines whether ``ftrace_dump()`` should be called on an oops (or | |
292 | kernel panic). This will output the contents of the ftrace buffers to | |
293 | the console. This is very useful for capturing traces that lead to | |
294 | crashes and outputting them to a serial console. | |
295 | ||
296 | = =================================================== | |
297 | 0 Disabled (default). | |
298 | 1 Dump buffers of all CPUs. | |
299 | 2 Dump the buffer of the CPU that triggered the oops. | |
300 | = =================================================== | |
301 | ||
302 | ||
303 | ftrace_enabled, stack_tracer_enabled | |
304 | ==================================== | |
305 | ||
2793e19d | 306 | See Documentation/trace/ftrace.rst. |
50cdae76 SK |
307 | |
308 | ||
a3cb66a5 SK |
309 | hardlockup_all_cpu_backtrace |
310 | ============================ | |
55537871 JK |
311 | |
312 | This value controls the hard lockup detector behavior when a hard | |
313 | lockup condition is detected as to whether or not to gather further | |
314 | debug information. If enabled, arch-specific all-CPU stack dumping | |
315 | will be initiated. | |
316 | ||
a3cb66a5 SK |
317 | = ============================================ |
318 | 0 Do nothing. This is the default behavior. | |
319 | 1 On detection capture more debug information. | |
320 | = ============================================ | |
53b95375 | 321 | |
1da177e4 | 322 | |
a3cb66a5 SK |
323 | hardlockup_panic |
324 | ================ | |
d22881dc SW |
325 | |
326 | This parameter can be used to control whether the kernel panics | |
327 | when a hard lockup is detected. | |
328 | ||
a3cb66a5 SK |
329 | = =========================== |
330 | 0 Don't panic on hard lockup. | |
331 | 1 Panic on hard lockup. | |
332 | = =========================== | |
d22881dc | 333 | |
2793e19d | 334 | See Documentation/admin-guide/lockup-watchdogs.rst for more information. |
a3cb66a5 | 335 | This can also be set using the nmi_watchdog kernel parameter. |
d22881dc | 336 | |
d22881dc | 337 | |
a3cb66a5 SK |
338 | hotplug |
339 | ======= | |
1da177e4 LT |
340 | |
341 | Path for the hotplug policy agent. | |
1e886090 RV |
342 | Default value is ``CONFIG_UEVENT_HELPER_PATH``, which in turn defaults |
343 | to the empty string. | |
344 | ||
345 | This file only exists when ``CONFIG_UEVENT_HELPER`` is enabled. Most | |
346 | modern systems rely exclusively on the netlink-based uevent source and | |
347 | don't need this. | |
1da177e4 | 348 | |
1da177e4 | 349 | |
e996919b RD |
350 | hung_task_all_cpu_backtrace |
351 | =========================== | |
0ec9dc9b GP |
352 | |
353 | If this option is set, the kernel will send an NMI to all CPUs to dump | |
354 | their backtraces when a hung task is detected. This file shows up if | |
355 | CONFIG_DETECT_HUNG_TASK and CONFIG_SMP are enabled. | |
356 | ||
357 | 0: Won't show all CPUs' backtraces when a hung task is detected. | |
358 | This is the default behavior. | |
359 | ||
360 | 1: Will non-maskably interrupt all CPUs and dump their backtraces when | |
361 | a hung task is detected. | |
362 | ||
363 | ||
a3cb66a5 SK |
364 | hung_task_panic |
365 | =============== | |
270750db AT |
366 | |
367 | Controls the kernel's behavior when a hung task is detected. | |
a3cb66a5 | 368 | This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled. |
270750db | 369 | |
a3cb66a5 SK |
370 | = ================================================= |
371 | 0 Continue operation. This is the default behavior. | |
372 | 1 Panic immediately. | |
373 | = ================================================= | |
270750db | 374 | |
270750db | 375 | |
a3cb66a5 SK |
376 | hung_task_check_count |
377 | ===================== | |
270750db AT |
378 | |
379 | The upper bound on the number of tasks that are checked. | |
a3cb66a5 | 380 | This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled. |
270750db | 381 | |
270750db | 382 | |
a3cb66a5 SK |
383 | hung_task_timeout_secs |
384 | ====================== | |
270750db | 385 | |
a2e51445 | 386 | When a task in D state has not been scheduled |
270750db | 387 | for more than this many seconds, report a warning. |
a3cb66a5 | 388 | This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled. |
270750db | 389 | |
a3cb66a5 | 390 | 0 means infinite timeout, no checking is done. |
53b95375 | 391 | |
a3cb66a5 | 392 | Possible values to set are in range {0:``LONG_MAX``/``HZ``}. |
270750db | 393 | |
270750db | 394 | |
a3cb66a5 SK |
395 | hung_task_check_interval_secs |
396 | ============================= | |
a2e51445 DV |
397 | |
398 | Hung task check interval. If hung task checking is enabled | |
a3cb66a5 SK |
399 | (see `hung_task_timeout_secs`_), the check is done every |
400 | ``hung_task_check_interval_secs`` seconds. | |
401 | This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled. | |
a2e51445 | 402 | |
a3cb66a5 SK |
403 | 0 (default) means use ``hung_task_timeout_secs`` as checking |
404 | interval. | |
a2e51445 | 405 | |
a3cb66a5 | 406 | Possible values to set are in range {0:``LONG_MAX``/``HZ``}. |
a2e51445 | 407 | |
a3cb66a5 SK |
408 | |
409 | hung_task_warnings | |
410 | ================== | |
270750db AT |
411 | |
412 | The maximum number of warnings to report. During a check interval | |
70e0ac5f AT |
413 | if a hung task is detected, this value is decreased by 1. |
414 | When this value reaches 0, no more warnings will be reported. | |
a3cb66a5 | 415 | This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled. |
270750db AT |
416 | |
417 | -1: report an infinite number of warnings. | |
418 | ||
270750db | 419 | |
a3cb66a5 SK |
420 | hyperv_record_panic_msg |
421 | ======================= | |
81b18bce SM |
422 | |
423 | Controls whether the panic kmsg data should be reported to Hyper-V. | |
424 | ||
a3cb66a5 SK |
425 | = ========================================================= |
426 | 0 Do not report panic kmsg data. | |
427 | 1 Report the panic kmsg data. This is the default behavior. | |
428 | = ========================================================= | |
81b18bce | 429 | |
81b18bce | 430 | |
997c798e SK |
431 | ignore-unaligned-usertrap |
432 | ========================= | |
433 | ||
434 | On architectures where unaligned accesses cause traps, and where this | |
435 | feature is supported (``CONFIG_SYSCTL_ARCH_UNALIGN_NO_WARN``; | |
436 | currently, ``arc`` and ``ia64``), controls whether all unaligned traps | |
437 | are logged. | |
438 | ||
439 | = ============================================================= | |
440 | 0 Log all unaligned accesses. | |
441 | 1 Only warn the first time a process traps. This is the default | |
442 | setting. | |
443 | = ============================================================= | |
444 | ||
445 | See also `unaligned-trap`_ and `unaligned-dump-stack`_. On ``ia64``, | |
446 | this allows system administrators to override the | |
447 | ``IA64_THREAD_UAC_NOPRINT`` ``prctl`` and avoid logs being flooded. | |
448 | ||
449 | ||
a3cb66a5 SK |
450 | kexec_load_disabled |
451 | =================== | |
81b18bce | 452 | |
a3cb66a5 SK |
453 | A toggle indicating if the ``kexec_load`` syscall has been disabled. |
454 | This value defaults to 0 (false: ``kexec_load`` enabled), but can be | |
455 | set to 1 (true: ``kexec_load`` disabled). | |
456 | Once true, kexec can no longer be used, and the toggle cannot be set | |
457 | back to false. | |
458 | This allows a kexec image to be loaded before disabling the syscall, | |
459 | allowing a system to set up (and later use) an image without it being | |
460 | altered. | |
461 | Generally used together with the `modules_disabled`_ sysctl. | |
7984754b | 462 | |
7984754b | 463 | |
a3cb66a5 SK |
464 | kptr_restrict |
465 | ============= | |
455cd5ab DR |
466 | |
467 | This toggle indicates whether restrictions are placed on | |
a3cb66a5 SK |
468 | exposing kernel addresses via ``/proc`` and other interfaces. |
469 | ||
470 | When ``kptr_restrict`` is set to 0 (the default) the address is hashed | |
471 | before printing. | |
472 | (This is equivalent to %p.) | |
473 | ||
474 | When ``kptr_restrict`` is set to 1, kernel pointers printed using the | |
475 | %pK format specifier will be replaced with 0s unless the user has | |
476 | ``CAP_SYSLOG`` and effective user and group ids are equal to the real | |
477 | ids. | |
478 | This is because %pK checks are done at read() time rather than open() | |
479 | time, so if permissions are elevated between the open() and the read() | |
480 | (e.g. via a setuid binary) then %pK will not leak kernel pointers to | |
481 | unprivileged users. | |
482 | Note, this is a temporary solution only. | |
483 | The correct long-term solution is to do the permission checks at | |
484 | open() time. | |
485 | Consider removing world read permissions from files that use %pK, and | |
486 | using `dmesg_restrict`_ to protect against uses of %pK in ``dmesg(8)`` | |
487 | if leaking kernel pointer values to unprivileged users is a concern. | |
488 | ||
489 | When ``kptr_restrict`` is set to 2, kernel pointers printed using | |
490 | %pK will be replaced with 0s regardless of privileges. | |
491 | ||
492 | ||
a3cb66a5 SK |
493 | modprobe |
494 | ======== | |
455cd5ab | 495 | |
52338dfb | 496 | The full path to the usermode helper for autoloading kernel modules, |
f4d3f25a RV |
497 | by default ``CONFIG_MODPROBE_PATH``, which in turn defaults to |
498 | "/sbin/modprobe". This binary is executed when the kernel requests a | |
499 | module. For example, if userspace passes an unknown filesystem type | |
500 | to mount(), then the kernel will automatically request the | |
501 | corresponding filesystem module by executing this usermode helper. | |
52338dfb EB |
502 | This usermode helper should insert the needed module into the kernel. |
503 | ||
504 | This sysctl only affects module autoloading. It has no effect on the | |
505 | ability to explicitly insert modules. | |
506 | ||
507 | This sysctl can be used to debug module loading requests:: | |
0317c537 SK |
508 | |
509 | echo '#! /bin/sh' > /tmp/modprobe | |
510 | echo 'echo "$@" >> /tmp/modprobe.log' >> /tmp/modprobe | |
511 | echo 'exec /sbin/modprobe "$@"' >> /tmp/modprobe | |
512 | chmod a+x /tmp/modprobe | |
513 | echo /tmp/modprobe > /proc/sys/kernel/modprobe | |
514 | ||
52338dfb EB |
515 | Alternatively, if this sysctl is set to the empty string, then module |
516 | autoloading is completely disabled. The kernel will not try to | |
517 | execute a usermode helper at all, nor will it call the | |
518 | kernel_module_request LSM hook. | |
807094c0 | 519 | |
52338dfb EB |
520 | If CONFIG_STATIC_USERMODEHELPER=y is set in the kernel configuration, |
521 | then the configured static usermode helper overrides this sysctl, | |
522 | except that the empty string is still accepted to completely disable | |
523 | module autoloading as described above. | |
807094c0 | 524 | |
a3cb66a5 SK |
525 | modules_disabled |
526 | ================ | |
3d43321b KC |
527 | |
528 | A toggle value indicating if modules are allowed to be loaded | |
529 | in an otherwise modular kernel. This toggle defaults to off | |
530 | (0), but can be set true (1). Once true, modules can be | |
531 | neither loaded nor unloaded, and the toggle cannot be set back | |
a3cb66a5 SK |
532 | to false. Generally used with the `kexec_load_disabled`_ toggle. |
533 | ||
3d43321b | 534 | |
a3cb66a5 | 535 | .. _msgmni: |
3d43321b | 536 | |
a3cb66a5 SK |
537 | msgmax, msgmnb, and msgmni |
538 | ========================== | |
539 | ||
fa5b5264 SK |
540 | ``msgmax`` is the maximum size of an IPC message, in bytes. 8192 by |
541 | default (``MSGMAX``). | |
542 | ||
543 | ``msgmnb`` is the maximum size of an IPC queue, in bytes. 16384 by | |
544 | default (``MSGMNB``). | |
545 | ||
546 | ``msgmni`` is the maximum number of IPC queues. 32000 by default | |
547 | (``MSGMNI``). | |
548 | ||
a3cb66a5 SK |
549 | |
550 | msg_next_id, sem_next_id, and shm_next_id (System V IPC) | |
551 | ======================================================== | |
03f59566 SK |
552 | |
553 | These three toggles allow the desired id to be specified for the next | |
554 | allocated IPC object: message, semaphore or shared memory respectively. | |
555 | ||
556 | By default they are equal to -1, which means generic allocation logic. | |
a3cb66a5 | 557 | Possible values to set are in range {0:``INT_MAX``}. |
03f59566 SK |
558 | |
559 | Notes: | |
53b95375 MCC |
560 | 1) The kernel doesn't guarantee that the new object will have the desired id. So |
561 | it's up to userspace to decide how to handle an object with a "wrong" id. | |
562 | 2) Toggle with non-default value will be set back to -1 by kernel after | |
563 | successful IPC object allocation. If an IPC object allocation syscall | |
564 | fails, it is undefined if the value remains unmodified or is reset to -1. | |
03f59566 | 565 | |
17444d9b SK |
566 | |
567 | ngroups_max | |
568 | =========== | |
569 | ||
570 | Maximum number of supplementary groups, i.e. the maximum size which | |
571 | ``setgroups`` will accept. Exports ``NGROUPS_MAX`` from the kernel. | |
572 | ||
573 | ||
574 | ||
a3cb66a5 SK |
575 | nmi_watchdog |
576 | ============ | |
807094c0 | 577 | |
195daf66 UO |
578 | This parameter can be used to control the NMI watchdog |
579 | (i.e. the hard lockup detector) on x86 systems. | |
807094c0 | 580 | |
a3cb66a5 SK |
581 | = ================================= |
582 | 0 Disable the hard lockup detector. | |
583 | 1 Enable the hard lockup detector. | |
584 | = ================================= | |
195daf66 UO |
585 | |
586 | The hard lockup detector monitors each CPU for its ability to respond to | |
587 | timer interrupts. The mechanism utilizes CPU performance counter registers | |
588 | that are programmed to generate Non-Maskable Interrupts (NMIs) periodically | |
589 | while a CPU is busy. Hence, the alternative name 'NMI watchdog'. | |
590 | ||
591 | The NMI watchdog is disabled by default if the kernel is running as a guest | |
53b95375 | 592 | in a KVM virtual machine. This default can be overridden by adding:: |
195daf66 UO |
593 | |
594 | nmi_watchdog=1 | |
595 | ||
2793e19d MCC |
596 | to the guest kernel command line (see |
597 | Documentation/admin-guide/kernel-parameters.rst). | |
807094c0 | 598 | |
807094c0 | 599 | |
118b1366 LD |
600 | nmi_wd_lpm_factor (PPC only) |
601 | ============================ | |
602 | ||
603 | Factor to apply to the NMI watchdog timeout (only when ``nmi_watchdog`` is | |
604 | set to 1). This factor represents the percentage added to | |
605 | ``watchdog_thresh`` when calculating the NMI watchdog timeout during an | |
606 | LPM. The soft lockup timeout is not impacted. | |
607 | ||
608 | A value of 0 means no change. The default value is 200 meaning the NMI | |
609 | watchdog is set to 30s (based on ``watchdog_thresh`` equal to 10). | |
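The default can be reproduced with a short calculation. The Python sketch below only restates the percentage rule from the text; the function name is hypothetical, and the use of integer arithmetic is an assumption of this example rather than a statement about the kernel code.

```python
def nmi_timeout_during_lpm(watchdog_thresh, lpm_factor):
    """Add lpm_factor percent of watchdog_thresh to watchdog_thresh."""
    return watchdog_thresh + watchdog_thresh * lpm_factor // 100


default_timeout = nmi_timeout_during_lpm(10, 200)   # 10 + 200% of 10
unchanged = nmi_timeout_during_lpm(10, 0)           # factor 0: no change
```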
610 | ||
611 | ||
a3cb66a5 SK |
612 | numa_balancing |
613 | ============== | |
10fc05d0 | 614 | |
c574bbe9 HY |
615 | Enables/disables and configures automatic page fault based NUMA memory |
616 | balancing. Memory is moved automatically to nodes that access it often. | |
617 | The value to set can be the result of ORing the following: | |
10fc05d0 | 618 | |
c574bbe9 HY |
619 | = ================================= |
620 | 0 NUMA_BALANCING_DISABLED | |
621 | 1 NUMA_BALANCING_NORMAL | |
622 | 2 NUMA_BALANCING_MEMORY_TIERING | |
623 | = ================================= | |
624 | ||
625 | Set the NUMA_BALANCING_NORMAL bit to optimize page placement among different | |
626 | NUMA nodes to reduce remote accessing. On NUMA machines, there is a | |
627 | performance penalty if remote memory is accessed by a CPU. When this | |
628 | feature is enabled the kernel samples which task thread is accessing | |
629 | memory by periodically unmapping pages and later trapping a page | |
630 | fault. At the time of the page fault, it is determined if the data | |
631 | being accessed should be migrated to a local memory node. | |
10fc05d0 MG |
632 | |
633 | The unmapping of pages and trapping faults incur additional overhead that | |
634 | ideally is offset by improved memory locality but there is no universal | |
635 | guarantee. If the target workload is already bound to NUMA nodes then this | |
3624ba7b | 636 | feature should be disabled. |
10fc05d0 | 637 | |
c574bbe9 HY |
638 | Set the NUMA_BALANCING_MEMORY_TIERING bit to optimize page placement among |
639 | different types of memory (represented as different NUMA nodes) to | |
640 | place the hot pages in the fast memory. This is implemented based on | |
641 | unmapping and page fault too. | |
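Because the value is a bitmask, both modes can be enabled at once. The following Python sketch decodes a value into the names from the table; the constants mirror the table, while the helper ``describe_numa_balancing`` is hypothetical.

```python
NUMA_BALANCING_DISABLED = 0
NUMA_BALANCING_NORMAL = 1
NUMA_BALANCING_MEMORY_TIERING = 2


def describe_numa_balancing(value):
    """Return the names of the modes selected by a numa_balancing value."""
    if value == NUMA_BALANCING_DISABLED:
        return ["NUMA_BALANCING_DISABLED"]
    modes = []
    if value & NUMA_BALANCING_NORMAL:
        modes.append("NUMA_BALANCING_NORMAL")
    if value & NUMA_BALANCING_MEMORY_TIERING:
        modes.append("NUMA_BALANCING_MEMORY_TIERING")
    return modes


both = describe_numa_balancing(
    NUMA_BALANCING_NORMAL | NUMA_BALANCING_MEMORY_TIERING)   # value 3
```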
10fc05d0 | 642 | |
c6833e10 HY |
643 | numa_balancing_promote_rate_limit_MBps |
644 | ====================================== | |
645 | ||
646 | Too high promotion/demotion throughput between different memory types | |
647 | may hurt application latency. This can be used to rate limit the | |
648 | promotion throughput. The per-node max promotion throughput in MB/s | |
649 | will be limited to be no more than the set value. | |
650 | ||
651 | A rule of thumb is to set this to less than 1/10 of the PMEM node | |
652 | write bandwidth. | |
653 | ||
e996919b RD |
654 | oops_all_cpu_backtrace |
655 | ====================== | |
60c958d8 GP |
656 | |
657 | If this option is set, the kernel will send an NMI to all CPUs to dump | |
658 | their backtraces when an oops event occurs. It should be used as a last | |
659 | resort in case a panic cannot be triggered (to protect VMs running, for | |
660 | example) or kdump can't be collected. This file shows up if CONFIG_SMP | |
661 | is enabled. | |
662 | ||
663 | 0: Won't show all CPUs' backtraces when an oops is detected. | |
664 | This is the default behavior. | |
665 | ||
666 | 1: Will non-maskably interrupt all CPUs and dump their backtraces when | |
667 | an oops event is detected. | |
668 | ||
669 | ||
a3cb66a5 SK |
670 | osrelease, ostype & version |
671 | =========================== | |
53b95375 MCC |
672 | |
673 | :: | |
1da177e4 | 674 | |
53b95375 MCC |
675 | # cat osrelease |
676 | 2.1.88 | |
677 | # cat ostype | |
678 | Linux | |
679 | # cat version | |
680 | #5 Wed Feb 25 21:49:24 MET 1998 | |
1da177e4 | 681 | |
a3cb66a5 SK |
682 | The files ``osrelease`` and ``ostype`` should be clear enough. |
683 | ``version`` | |
1da177e4 LT |
684 | needs a little more clarification however. The '#5' means that |
685 | this is the fifth kernel built from this source base and the | |
686 | date behind it indicates the time the kernel was built. | |
687 | The only way to tune these values is to rebuild the kernel :-) | |
688 | ||
1da177e4 | 689 | |
a3cb66a5 SK |
690 | overflowgid & overflowuid |
691 | ========================= | |
1da177e4 | 692 | |
807094c0 BP |
693 | If your architecture did not always support 32-bit UIDs (i.e. arm, |
694 | i386, m68k, sh, and sparc32), a fixed UID and GID will be returned to | |
695 | applications that use the old 16-bit UID/GID system calls, if the | |
696 | actual UID or GID would exceed 65535. | |
1da177e4 LT |
697 | |
698 | These sysctls allow you to change the value of the fixed UID and GID. | |
699 | The default is 65534. | |
700 | ||
1da177e4 | 701 | |
a3cb66a5 SK |
702 | panic |
703 | ===== | |
1da177e4 | 704 | |
404347e6 SK |
705 | The value in this file determines the behaviour of the kernel on a |
706 | panic: | |
707 | ||
708 | * if zero, the kernel will loop forever; | |
709 | * if negative, the kernel will reboot immediately; | |
710 | * if positive, the kernel will reboot after the corresponding number | |
711 | of seconds. | |
712 | ||
713 | When you use the software watchdog, the recommended setting is 60. | |
807094c0 | 714 | |
9f318e3f | 715 | |
a3cb66a5 SK |
716 | panic_on_io_nmi |
717 | =============== | |
9f318e3f HK |
718 | |
719 | Controls the kernel's behavior when a CPU receives an NMI caused by | |
720 | an IO error. | |
721 | ||
a3cb66a5 SK |
722 | = ================================================================== |
723 | 0 Try to continue operation (default). | |
724 | 1 Panic immediately. The IO error triggered an NMI. This indicates a | |
725 | serious system condition which could result in IO data corruption. | |
726 | Rather than continuing, panicking might be a better choice. Some | |
727 | servers issue this sort of NMI when the dump button is pushed, | |
728 | and you can use this option to take a crash dump. | |
729 | = ================================================================== | |
9f318e3f | 730 | |
807094c0 | 731 | |
a3cb66a5 SK |
732 | panic_on_oops |
733 | ============= | |
1da177e4 LT |
734 | |
735 | Controls the kernel's behaviour when an oops or BUG is encountered. | |
736 | ||
a3cb66a5 SK |
737 | = =================================================================== |
738 | 0 Try to continue operation. | |
739 | 1 Panic immediately. If the `panic` sysctl is also non-zero then the | |
740 | machine will be rebooted. | |
741 | = =================================================================== | |
1da177e4 | 742 | |
1da177e4 | 743 | |
a3cb66a5 SK |
744 | panic_on_stackoverflow |
745 | ====================== | |
55af7796 MH |
746 | |
747 | Controls the kernel's behavior when it detects an overflow of the | |
748 | kernel, IRQ or exception stacks (user stacks are not covered). | |
a3cb66a5 | 749 | This file shows up if ``CONFIG_DEBUG_STACKOVERFLOW`` is enabled. |
55af7796 | 750 | |
a3cb66a5 SK |
751 | = ========================== |
752 | 0 Try to continue operation. | |
753 | 1 Panic immediately. | |
754 | = ========================== | |
55af7796 | 755 | |
55af7796 | 756 | |
a3cb66a5 SK |
757 | panic_on_unrecovered_nmi |
758 | ======================== | |
9e3961a0 PB |
759 | |
760 | The default Linux behaviour on an NMI caused by a memory error or an | |
761 | unknown source is to continue operation. For many environments, such as | |
762 | scientific computing, it is preferable to take the box out of service and | |
763 | deal with the error rather than let an uncorrected parity/ECC error propagate. | |
764 | ||
a3cb66a5 | 765 | A small number of systems do generate NMIs for bizarre random reasons |
9e3961a0 PB |
766 | such as power management so the default is off. This sysctl works like |
767 | the other panic controls in this directory. | |
768 | ||
9e3961a0 | 769 | |
a3cb66a5 SK |
770 | panic_on_warn |
771 | ============= | |
9e3961a0 PB |
772 | |
773 | Calls panic() in the WARN() path when set to 1. This is useful to avoid | |
774 | a kernel rebuild when attempting to kdump at the location of a WARN(). | |
775 | ||
a3cb66a5 SK |
776 | = ================================================ |
777 | 0 Only WARN(), default behaviour. | |
778 | 1 Call panic() after printing out WARN() location. | |
779 | = ================================================ | |
9e3961a0 | 780 | |
9e3961a0 | 781 | |
a3cb66a5 SK |
782 | panic_print |
783 | =========== | |
81c9d43f FT |
784 | |
785 | Bitmask for printing system info when a panic happens. The user can choose | |
786 | any combination of the following bits: | |
787 | ||
a3cb66a5 | 788 | ===== ============================================ |
53b95375 MCC |
789 | bit 0 print all tasks info |
790 | bit 1 print system memory info | |
791 | bit 2 print timer info | |
a3cb66a5 | 792 | bit 3 print locks info if ``CONFIG_LOCKDEP`` is on |
53b95375 | 793 | bit 4 print ftrace buffer |
a1ff1de0 | 794 | bit 5 print all printk messages in buffer |
8d470a45 | 795 | bit 6 print all CPUs backtrace (if available in the arch) |
a3cb66a5 | 796 | ===== ============================================ |
53b95375 MCC |
797 | |
798 | So, for example, to print tasks and memory info on panic, the user can run:: | |
81c9d43f | 799 | |
81c9d43f FT |
800 | echo 3 > /proc/sys/kernel/panic_print |
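
Other combinations follow the same pattern: OR together ``1 << bit`` for each
bit wanted. The sketch below shows that arithmetic. ``panic_print_mask`` is a
hypothetical helper name used only for illustration, not an existing tool, and
the bit numbers are the ones from the table above.

```shell
#!/bin/sh
# Build a panic_print bitmask from a list of bit numbers (0-6).
# panic_print_mask is a hypothetical helper, not a kernel or procps tool.
panic_print_mask() {
    mask=0
    for bit in "$@"; do
        mask=$(( mask | (1 << bit) ))
    done
    echo "$mask"
}

# Tasks (bit 0) and memory (bit 1): prints 3, the value used above.
panic_print_mask 0 1

# Tasks, memory and all-CPU backtraces (bits 0, 1 and 6): prints 67.
panic_print_mask 0 1 6
```

The resulting number can then be written to ``/proc/sys/kernel/panic_print``
as root, in the same way as the ``echo 3`` example.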
801 | ||
81c9d43f | 802 | |
a3cb66a5 SK |
803 | panic_on_rcu_stall |
804 | ================== | |
088e9d25 DBO |
805 | |
806 | When set to 1, calls panic() after RCU stall detection messages. This | |
807 | is useful to define the root cause of RCU stalls using a vmcore. | |
808 | ||
a3cb66a5 SK |
809 | = ============================================================ |
810 | 0 Do not panic() when RCU stall takes place, default behavior. | |
811 | 1 panic() after printing RCU stall messages. | |
812 | = ============================================================ | |
088e9d25 | 813 | |
81c65365 JS |
814 | max_rcu_stall_to_panic |
815 | ====================== | |
816 | ||
817 | When ``panic_on_rcu_stall`` is set to 1, this value determines the | |
818 | number of times that RCU can stall before panic() is called. | |
819 | ||
820 | When ``panic_on_rcu_stall`` is set to 0, this value has no effect. | |
088e9d25 | 821 | |
a3cb66a5 SK |
822 | perf_cpu_time_max_percent |
823 | ========================= | |
14c63f17 DH |
824 | |
825 | Hints to the kernel how much CPU time it should be allowed to | |
826 | use to handle perf sampling events. If the perf subsystem | |
827 | is informed that its samples are exceeding this limit, it | |
828 | will drop its sampling frequency to attempt to reduce its CPU | |
829 | usage. | |
830 | ||
831 | Some perf sampling happens in NMIs. If these samples | |
832 | unexpectedly take too long to execute, the NMIs can become | |
833 | stacked up next to each other so much that nothing else is | |
834 | allowed to execute. | |
835 | ||
a3cb66a5 SK |
836 | ===== ======================================================== |
837 | 0 Disable the mechanism. Do not monitor or correct perf's | |
838 | sampling rate no matter how much CPU time it takes. | |
14c63f17 | 839 | |
a3cb66a5 SK |
840 | 1-100 Attempt to throttle perf's sample rate to this |
841 | percentage of CPU. Note: the kernel calculates an | |
842 | "expected" length of each sample event. 100 here means | |
843 | 100% of that expected length. Even if this is set to | |
844 | 100, you may still see sample throttling if this | |
845 | length is exceeded. Set to 0 if you truly do not care | |
846 | how much CPU is consumed. | |
847 | ===== ======================================================== | |
14c63f17 | 848 | |
14c63f17 | 849 | |
a3cb66a5 SK |
850 | perf_event_paranoid |
851 | =================== | |
3379e0c3 BH |
852 | |
853 | Controls use of the performance events system by unprivileged | |
025b16f8 AB |
854 | users (without CAP_PERFMON). The default value is 2. |
855 | ||
856 | For backward compatibility reasons access to system performance | |
857 | monitoring and observability remains open for CAP_SYS_ADMIN | |
858 | privileged processes but CAP_SYS_ADMIN usage for secure system | |
859 | performance monitoring and observability operations is discouraged | |
860 | with respect to CAP_PERFMON use cases. | |
3379e0c3 | 861 | |
53b95375 | 862 | === ================================================================== |
a3cb66a5 | 863 | -1 Allow use of (almost) all events by all users. |
53b95375 | 864 | |
a3cb66a5 SK |
865 | Ignore mlock limit after perf_event_mlock_kb without |
866 | ``CAP_IPC_LOCK``. | |
53b95375 | 867 | |
a3cb66a5 | 868 | >=0 Disallow ftrace function tracepoint by users without |
025b16f8 | 869 | ``CAP_PERFMON``. |
53b95375 | 870 | |
025b16f8 | 871 | Disallow raw tracepoint access by users without ``CAP_PERFMON``. |
3379e0c3 | 872 | |
025b16f8 | 873 | >=1 Disallow CPU event access by users without ``CAP_PERFMON``. |
53b95375 | 874 | |
025b16f8 | 875 | >=2 Disallow kernel profiling by users without ``CAP_PERFMON``. |
53b95375 MCC |
876 | === ================================================================== |
877 | ||
55af7796 | 878 | |
a3cb66a5 SK |
879 | perf_event_max_stack |
880 | ==================== | |
c5dfd78e | 881 | |
a3cb66a5 SK |
882 | Controls maximum number of stack frames to copy for (``attr.sample_type & |
883 | PERF_SAMPLE_CALLCHAIN``) configured events, for instance, when using | |
884 | '``perf record -g``' or '``perf trace --call-graph fp``'. | |
c5dfd78e ACM |
885 | |
886 | This can only be done when no events are in use that have callchains | |
a3cb66a5 | 887 | enabled, otherwise writing to this file will return ``-EBUSY``. |
c5dfd78e ACM |
888 | |
889 | The default value is 127. | |
890 | ||
c5dfd78e | 891 | |
a3cb66a5 SK |
892 | perf_event_mlock_kb |
893 | =================== | |
ac0bb6b7 | 894 | |
751d5b27 | 895 | Controls the size of the per-CPU ring buffer that is not counted against the mlock limit. |
ac0bb6b7 KK |
896 | |
897 | The default value is 512 KiB plus one page. | |
898 | ||
ac0bb6b7 | 899 | |
a3cb66a5 SK |
900 | perf_event_max_contexts_per_stack |
901 | ================================= | |
c85b0334 ACM |
902 | |
903 | Controls maximum number of stack frame context entries for | |
a3cb66a5 SK |
904 | (``attr.sample_type & PERF_SAMPLE_CALLCHAIN``) configured events, for |
905 | instance, when using '``perf record -g``' or '``perf trace --call-graph fp``'. | |
c85b0334 ACM |
906 | |
907 | This can only be done when no events are in use that have callchains | |
a3cb66a5 | 908 | enabled, otherwise writing to this file will return ``-EBUSY``. |
c85b0334 ACM |
909 | |
910 | The default value is 8. | |
911 | ||
c85b0334 | 912 | |
e2012600 RH |
913 | perf_user_access (arm64 only) |
914 | ============================= | |
915 | ||
916 | Controls user space access for reading perf event counters. When set to 1, | |
917 | user space can read performance monitor counter registers directly. | |
918 | ||
919 | The default value is 0 (access disabled). | |
920 | ||
921 | See Documentation/arm64/perf.rst for more information. | |
922 | ||
923 | ||
a3cb66a5 SK |
924 | pid_max |
925 | ======= | |
1da177e4 | 926 | |
beb7dd86 | 927 | PID allocation wrap value. When the kernel's next PID value |
1da177e4 | 928 | reaches this value, it wraps back to a minimum PID value. |
a3cb66a5 | 929 | PIDs of value ``pid_max`` or larger are not allocated. |
1da177e4 | 930 | |
1da177e4 | 931 | |
a3cb66a5 SK |
932 | ns_last_pid |
933 | =========== | |
b8f566b0 PE |
934 | |
935 | The last PID allocated in the current PID namespace (the one in which the | |
936 | task using this sysctl lives). When selecting a PID for the next task on | |
937 | fork, the kernel tries to allocate a number starting from this one. | |
938 | ||
b8f566b0 | 939 | |
a3cb66a5 SK |
940 | powersave-nap (PPC only) |
941 | ======================== | |
1da177e4 LT |
942 | |
943 | If set, Linux-PPC will use the 'nap' mode of powersaving, | |
944 | otherwise the 'doze' mode will be used. | |
945 | ||
a3cb66a5 | 946 | |
1da177e4 LT |
947 | ============================================================== |
948 | ||
a3cb66a5 SK |
949 | printk |
950 | ====== | |
1da177e4 | 951 | |
a3cb66a5 SK |
952 | The four values in printk denote: ``console_loglevel``, |
953 | ``default_message_loglevel``, ``minimum_console_loglevel`` and | |
954 | ``default_console_loglevel`` respectively. | |
1da177e4 LT |
955 | |
956 | These values influence printk() behavior when printing or | |
a3cb66a5 | 957 | logging error messages. See '``man 2 syslog``' for more info on |
1da177e4 LT |
958 | the different loglevels. |
959 | ||
a3cb66a5 SK |
960 | ======================== ===================================== |
961 | console_loglevel messages with a higher priority than | |
962 | this will be printed to the console | |
963 | default_message_loglevel messages without an explicit priority | |
964 | will be printed with this priority | |
965 | minimum_console_loglevel minimum (highest) value to which | |
966 | console_loglevel can be set | |
967 | default_console_loglevel default value for console_loglevel | |
968 | ======================== ===================================== | |
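
As an illustration of the field order, the sketch below labels the four
numbers read from the file. ``describe_printk`` is a hypothetical helper name,
and the sample input line is made up for the example; real values depend on
the running system.

```shell
#!/bin/sh
# Label the four whitespace-separated fields of /proc/sys/kernel/printk.
# describe_printk is a hypothetical helper; it reads one line on stdin.
describe_printk() {
    read -r console default_msg min_console default_console
    echo "console_loglevel=$console"
    echo "default_message_loglevel=$default_msg"
    echo "minimum_console_loglevel=$min_console"
    echo "default_console_loglevel=$default_console"
}

# Sample input only. On a live system:
#   describe_printk < /proc/sys/kernel/printk
printf '4\t4\t1\t7\n' | describe_printk
```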
1da177e4 | 969 | |
1da177e4 | 970 | |
a3cb66a5 SK |
971 | printk_delay |
972 | ============ | |
807094c0 | 973 | |
a3cb66a5 | 974 | Delay each printk message in ``printk_delay`` milliseconds |
807094c0 BP |
975 | |
976 | Values from 0 to 10000 are allowed. | |
977 | ||
807094c0 | 978 | |
a3cb66a5 SK |
979 | printk_ratelimit |
980 | ================ | |
1da177e4 | 981 | |
a3cb66a5 | 982 | Some warning messages are rate limited. ``printk_ratelimit`` specifies |
ca30ad85 ON |
983 | the minimum length of time between these messages (in seconds). |
984 | The default value is 5 seconds. | |
1da177e4 LT |
985 | |
986 | A value of 0 will disable rate limiting. | |
987 | ||
1da177e4 | 988 | |
a3cb66a5 SK |
989 | printk_ratelimit_burst |
990 | ====================== | |
1da177e4 | 991 | |
a3cb66a5 | 992 | While long term we enforce one message per `printk_ratelimit`_ |
1da177e4 | 993 | seconds, we do allow a burst of messages to pass through. |
a3cb66a5 | 994 | ``printk_ratelimit_burst`` specifies the number of messages we can |
1da177e4 LT |
995 | send before ratelimiting kicks in. |
996 | ||
ca30ad85 ON |
997 | The default value is 10 messages. |
998 | ||
1da177e4 | 999 | |
a3cb66a5 SK |
1000 | printk_devkmsg |
1001 | ============== | |
53b95375 | 1002 | |
a3cb66a5 | 1003 | Control the logging to ``/dev/kmsg`` from userspace: |
53b95375 | 1004 | |
a3cb66a5 SK |
1005 | ========= ============================================= |
1006 | ratelimit default, ratelimited | |
1007 | on unlimited logging to /dev/kmsg from userspace | |
1008 | off logging to /dev/kmsg disabled | |
1009 | ========= ============================================= | |
750afe7b | 1010 | |
a3cb66a5 | 1011 | The kernel command line parameter ``printk.devkmsg=`` overrides this and is |
750afe7b BP |
1012 | a one-time setting until next reboot: once set, it cannot be changed by |
1013 | this sysctl interface anymore. | |
1014 | ||
a3cb66a5 | 1015 | ============================================================== |
750afe7b | 1016 | |
a3cb66a5 SK |
1017 | |
1018 | pty | |
1019 | === | |
1020 | ||
01478b83 | 1021 | See Documentation/filesystems/devpts.rst. |
a3cb66a5 SK |
1022 | |
1023 | ||
0b227076 SK |
1024 | random |
1025 | ====== | |
1026 | ||
1027 | This is a directory, with the following entries: | |
1028 | ||
1029 | * ``boot_id``: a UUID generated the first time this is retrieved, and | |
1030 | unvarying after that; | |
1031 | ||
069c4ea6 JD |
1032 | * ``uuid``: a UUID generated every time this is retrieved (this can |
1033 | thus be used to generate UUIDs at will); | |
1034 | ||
0b227076 SK |
1035 | * ``entropy_avail``: the pool's entropy count, in bits; |
1036 | ||
1037 | * ``poolsize``: the entropy pool size, in bits; | |
1038 | ||
1039 | * ``urandom_min_reseed_secs``: obsolete (used to determine the minimum | |
489c7fc4 JD |
1040 | number of seconds between urandom pool reseeding). This file is |
1041 | writable for compatibility purposes, but writing to it has no effect | |
069c4ea6 | 1042 | on any RNG behavior; |
0b227076 SK |
1043 | |
1044 | * ``write_wakeup_threshold``: when the entropy count drops below this | |
1045 | (as a number of bits), processes waiting to write to ``/dev/random`` | |
489c7fc4 JD |
1046 | are woken up. This file is writable for compatibility purposes, but |
1047 | writing to it has no effect on any RNG behavior. | |
0b227076 | 1048 | |
0b227076 | 1049 | |
a3cb66a5 SK |
1050 | randomize_va_space |
1051 | ================== | |
1ec7fd50 JK |
1052 | |
1053 | This option can be used to select the type of process address | |
1054 | space randomization that is used in the system, for architectures | |
1055 | that support this feature. | |
1056 | ||
53b95375 MCC |
1057 | == =========================================================================== |
1058 | 0 Turn the process address space randomization off. This is the | |
b7f5ab6f HS |
1059 | default for architectures that do not support this feature anyways, |
1060 | and kernels that are booted with the "norandmaps" parameter. | |
1ec7fd50 | 1061 | |
53b95375 | 1062 | 1 Make the addresses of mmap base, stack and VDSO page randomized. |
1ec7fd50 | 1063 | This, among other things, implies that shared libraries will be |
b7f5ab6f HS |
1064 | loaded to random addresses. Also for PIE-linked binaries, the |
1065 | location of code start is randomized. This is the default if the | |
a3cb66a5 | 1066 | ``CONFIG_COMPAT_BRK`` option is enabled. |
1ec7fd50 | 1067 | |
53b95375 | 1068 | 2 Additionally enable heap randomization. This is the default if |
a3cb66a5 | 1069 | ``CONFIG_COMPAT_BRK`` is disabled. |
b7f5ab6f HS |
1070 | |
1071 | There are a few legacy applications out there (such as some ancient | |
1ec7fd50 | 1072 | versions of libc.so.5 from 1996) that assume that brk area starts |
b7f5ab6f HS |
1073 | just after the end of the code+bss. These applications break when |
1074 | start of the brk area is randomized. There are however no known | |
1ec7fd50 | 1075 | non-legacy applications that would be broken this way, so for most |
b7f5ab6f HS |
1076 | systems it is safe to choose full randomization. |
1077 | ||
1078 | Systems with ancient and/or broken binaries should be configured | |
a3cb66a5 | 1079 | with ``CONFIG_COMPAT_BRK`` enabled, which excludes the heap from process |
b7f5ab6f | 1080 | address space randomization. |
53b95375 | 1081 | == =========================================================================== |
1ec7fd50 | 1082 | |
1ec7fd50 | 1083 | |
a3cb66a5 SK |
1084 | real-root-dev |
1085 | ============= | |
1086 | ||
2793e19d | 1087 | See Documentation/admin-guide/initrd.rst. |
a3cb66a5 SK |
1088 | |
1089 | ||
1090 | reboot-cmd (SPARC only) | |
1091 | ======================= | |
1da177e4 LT |
1092 | |
1093 | ??? This seems to be a way to give an argument to the Sparc | |
1094 | ROM/Flash boot loader. Maybe to tell it what to do after | |
1095 | rebooting. ??? | |
1096 | ||
1da177e4 | 1097 | |
a3cb66a5 SK |
1098 | sched_energy_aware |
1099 | ================== | |
8d5d0cfb QP |
1100 | |
1101 | Enables/disables Energy Aware Scheduling (EAS). EAS starts | |
1102 | automatically on platforms where it can run (that is, | |
1103 | platforms with asymmetric CPU topologies and having an Energy | |
1104 | Model available). If your platform happens to meet the | |
1105 | requirements for EAS but you do not want to use it, change | |
1106 | this value to 0. | |
1107 | ||
fcb50170 MG |
1108 | task_delayacct |
1109 | ============== | |
1110 | ||
1111 | Enables/disables task delay accounting (see | |
0f60a29c | 1112 | Documentation/accounting/delay-accounting.rst). Enabling this feature incurs |
fcb50170 MG |
1113 | a small amount of overhead in the scheduler but is useful for debugging |
1114 | and performance tuning. It is required by some tools such as iotop. | |
8d5d0cfb | 1115 | |
a3cb66a5 SK |
1116 | sched_schedstats |
1117 | ================ | |
cb251765 MG |
1118 | |
1119 | Enables/disables scheduler statistics. Enabling this feature | |
1120 | incurs a small amount of overhead in the scheduler but is | |
1121 | useful for debugging and performance tuning. | |
1122 | ||
d151a23d SK |
1123 | sched_util_clamp_min |
1124 | ==================== | |
1f73d1ab QY |
1125 | |
1126 | Max allowed *minimum* utilization. | |
1127 | ||
1128 | Default value is 1024, which is the maximum possible value. | |
1129 | ||
1130 | It means that any requested uclamp.min value cannot be greater than | |
1131 | sched_util_clamp_min, i.e., it is restricted to the range | |
1132 | [0:sched_util_clamp_min]. | |
1133 | ||
d151a23d SK |
1134 | sched_util_clamp_max |
1135 | ==================== | |
1f73d1ab QY |
1136 | |
1137 | Max allowed *maximum* utilization. | |
1138 | ||
1139 | Default value is 1024, which is the maximum possible value. | |
1140 | ||
1141 | It means that any requested uclamp.max value cannot be greater than | |
1142 | sched_util_clamp_max, i.e., it is restricted to the range | |
1143 | [0:sched_util_clamp_max]. | |
1144 | ||
d151a23d SK |
1145 | sched_util_clamp_min_rt_default |
1146 | =============================== | |
1f73d1ab QY |
1147 | |
1148 | By default, Linux is tuned for performance, which means that RT tasks always run | |
1149 | at the highest frequency and most capable (highest capacity) CPU (in | |
1150 | heterogeneous systems). | |
1151 | ||
1152 | Uclamp achieves this by setting the requested uclamp.min of all RT tasks to | |
1153 | 1024 by default, which effectively boosts the tasks to run at the highest | |
1154 | frequency and biases them to run on the biggest CPU. | |
1155 | ||
1156 | This knob allows admins to change the default behavior when uclamp is being | |
1157 | used. In battery powered devices particularly, running at the maximum | |
1158 | capacity and frequency will increase energy consumption and shorten the battery | |
1159 | life. | |
1160 | ||
1161 | This knob is only effective for RT tasks whose requested uclamp.min value | |
1162 | the user has not modified via the sched_setattr() syscall. | |
1163 | ||
1164 | This knob will not escape the range constraint imposed by sched_util_clamp_min | |
1165 | defined above. | |
1166 | ||
1167 | For example if | |
1168 | ||
1169 | sched_util_clamp_min_rt_default = 800 | |
1170 | sched_util_clamp_min = 600 | |
1171 | ||
1172 | Then the boost will be clamped to 600 because 800 is outside of the permissible | |
1173 | range of [0:600]. This could happen, for instance, if a powersave mode | |
1174 | restricts all boosts temporarily by modifying sched_util_clamp_min. As soon as | |
1175 | this restriction is lifted, the requested sched_util_clamp_min_rt_default | |
1176 | will take effect. | |
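
The clamping described above amounts to taking the smaller of the two values.
The sketch below only illustrates that arithmetic and is not how the scheduler
implements it; ``effective_rt_boost`` is a hypothetical helper name.

```shell
#!/bin/sh
# effective_rt_boost RT_DEFAULT CLAMP_MIN
# Prints the boost that applies once the [0:CLAMP_MIN] range limit is
# taken into account. Hypothetical helper, for illustration only.
effective_rt_boost() {
    rt_default=$1
    clamp_min=$2
    if [ "$rt_default" -gt "$clamp_min" ]; then
        echo "$clamp_min"
    else
        echo "$rt_default"
    fi
}

effective_rt_boost 800 600    # prints 600, as in the example above
effective_rt_boost 800 1024   # prints 800 once the restriction is lifted
```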
cb251765 | 1177 | |
a3cb66a5 SK |
1178 | seccomp |
1179 | ======= | |
1180 | ||
2793e19d | 1181 | See Documentation/userspace-api/seccomp_filter.rst. |
a3cb66a5 SK |
1182 | |
1183 | ||
1184 | sg-big-buff | |
1185 | =========== | |
1da177e4 LT |
1186 | |
1187 | This file shows the size of the generic SCSI (sg) buffer. | |
1188 | You can't tune it just yet, but you could change it at | |
a3cb66a5 SK |
1189 | compile time by editing ``include/scsi/sg.h`` and changing |
1190 | the value of ``SG_BIG_BUFF``. | |
1da177e4 LT |
1191 | |
1192 | There shouldn't be any reason to change this value. If | |
1193 | you can come up with one, you probably know what you | |
1194 | are doing anyway :) | |
1195 | ||
1da177e4 | 1196 | |
a3cb66a5 SK |
1197 | shmall |
1198 | ====== | |
358e419f CALP |
1199 | |
1200 | This parameter sets the total number of shared memory pages that | |
a3cb66a5 SK |
1201 | can be used system wide. Hence, ``shmall`` should always be at least |
1202 | ``ceil(shmmax/PAGE_SIZE)``. | |
358e419f | 1203 | |
a3cb66a5 SK |
1204 | If you are not sure what the default ``PAGE_SIZE`` is on your Linux |
1205 | system, you can run the following command:: | |
358e419f | 1206 | |
53b95375 | 1207 | # getconf PAGE_SIZE |
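
The ``ceil(shmmax/PAGE_SIZE)`` lower bound can be worked out with integer
arithmetic, as in the sketch below. ``min_shmall`` is a hypothetical helper
name, and the input values are examples only.

```shell
#!/bin/sh
# min_shmall SHMMAX_BYTES PAGE_SIZE_BYTES
# Prints ceil(SHMMAX_BYTES / PAGE_SIZE_BYTES) using integer arithmetic.
# Hypothetical helper, for illustration only.
min_shmall() {
    shmmax=$1
    page_size=$2
    echo $(( (shmmax + page_size - 1) / page_size ))
}

min_shmall 1073741824 4096   # 1 GiB with 4 KiB pages: prints 262144
min_shmall 4097 4096         # one byte over a page: prints 2
```

Note that a very large ``shmmax`` may exceed the integer range of shell
arithmetic, so this is only suitable for modest values.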
358e419f | 1208 | |
358e419f | 1209 | |
a3cb66a5 SK |
1210 | shmmax |
1211 | ====== | |
1da177e4 LT |
1212 | |
1213 | This value can be used to query and set the run time limit | |
1214 | on the maximum shared memory segment size that can be created. | |
807094c0 | 1215 | Shared memory segments up to 1 GB are now supported in the |
a3cb66a5 | 1216 | kernel. This value defaults to ``SHMMAX``. |
1da177e4 | 1217 | |
1da177e4 | 1218 | |
a3cb66a5 SK |
1219 | shmmni |
1220 | ====== | |
1221 | ||
fa5b5264 SK |
1222 | This value determines the maximum number of shared memory segments. |
1223 | 4096 by default (``SHMMNI``). | |
1224 | ||
a3cb66a5 SK |
1225 | |
1226 | shm_rmid_forced | |
1227 | =============== | |
b34a6b1d VK |
1228 | |
1229 | Linux lets you set resource limits, including how much memory one | |
a3cb66a5 | 1230 | process can consume, via ``setrlimit(2)``. Unfortunately, shared memory |
b34a6b1d VK |
1231 | segments are allowed to exist without association with any process, and |
1232 | thus might not be counted against any resource limits. If enabled, | |
1233 | shared memory segments are automatically destroyed when their attach | |
1234 | count becomes zero after a detach or a process termination. It will | |
1235 | also destroy segments that were created, but never attached to, on exit | |
a3cb66a5 | 1236 | from the process. The only use left for ``IPC_RMID`` is to immediately |
b34a6b1d VK |
1237 | destroy an unattached segment. Of course, this breaks the way things are |
1238 | defined, so some applications might stop working. Note that this | |
1239 | feature will do you no good unless you also configure your resource | |
a3cb66a5 | 1240 | limits (in particular, ``RLIMIT_AS`` and ``RLIMIT_NPROC``). Most systems don't |
b34a6b1d VK |
1241 | need this. |
1242 | ||
1243 | Note that if you change this from 0 to 1, already created segments | |
1244 | without users and with a dead originative process will be destroyed. | |
1245 | ||
b34a6b1d | 1246 | |
a3cb66a5 SK |
1247 | sysctl_writes_strict |
1248 | ==================== | |
f4aacea2 KC |
1249 | |
1250 | Control how file position affects the behavior of updating sysctl values | |
a3cb66a5 | 1251 | via the ``/proc/sys`` interface: |
f4aacea2 | 1252 | |
53b95375 MCC |
1253 | == ====================================================================== |
1254 | -1 Legacy per-write sysctl value handling, with no printk warnings. | |
f4aacea2 KC |
1255 | Each write syscall must fully contain the sysctl value to be |
1256 | written, and multiple writes on the same sysctl file descriptor | |
1257 | will rewrite the sysctl value, regardless of file position. | |
53b95375 | 1258 | 0 Same behavior as above, but warn about processes that perform writes |
41662f5c | 1259 | to a sysctl file descriptor when the file position is not 0. |
53b95375 | 1260 | 1 (default) Respect file position when writing sysctl strings. Multiple |
41662f5c KC |
1261 | writes will append to the sysctl value buffer. Anything past the max |
1262 | length of the sysctl value buffer will be ignored. Writes to numeric | |
1263 | sysctl entries must always be at file position 0 and the value must | |
1264 | be fully contained in the buffer sent in the write syscall. | |
53b95375 | 1265 | == ====================================================================== |
f4aacea2 | 1266 | |
f4aacea2 | 1267 | |
a3cb66a5 SK |
1268 | softlockup_all_cpu_backtrace |
1269 | ============================ | |
ed235875 AT |
1270 | |
1271 | This value controls the soft lockup detector thread's behavior | |
1272 | when a soft lockup condition is detected as to whether or not | |
1273 | to gather further debug information. If enabled, each CPU will | |
1274 | be issued an NMI and instructed to capture a stack trace. | |
1275 | ||
1276 | This feature is only applicable for architectures which support | |
1277 | NMI. | |
1278 | ||
a3cb66a5 SK |
1279 | = ============================================ |
1280 | 0 Do nothing. This is the default behavior. | |
1281 | 1 On detection capture more debug information. | |
1282 | = ============================================ | |
ed235875 | 1283 | |
ed235875 | 1284 | |
0a07bef6 GP |
1285 | softlockup_panic |
1286 | ================ | |
1287 | ||
1288 | This parameter can be used to control whether the kernel panics | |
1289 | when a soft lockup is detected. | |
1290 | ||
1291 | = ============================================ | |
1292 | 0 Don't panic on soft lockup. | |
1293 | 1 Panic on soft lockup. | |
1294 | = ============================================ | |
1295 | ||
1296 | This can also be set using the softlockup_panic kernel parameter. | |
1297 | ||
1298 | ||
a3cb66a5 SK |
1299 | soft_watchdog |
1300 | ============= | |
195daf66 UO |
1301 | |
1302 | This parameter can be used to control the soft lockup detector. | |
1303 | ||
a3cb66a5 SK |
1304 | = ================================= |
1305 | 0 Disable the soft lockup detector. | |
1306 | 1 Enable the soft lockup detector. | |
1307 | = ================================= | |
195daf66 UO |
1308 | |
1309 | The soft lockup detector monitors CPUs for threads that are hogging the CPUs | |
256f7a67 WQ |
1310 | without rescheduling voluntarily, and thus prevent the 'migration/N' threads |
1311 | from running, causing the watchdog work to fail to execute. The mechanism depends | |
1312 | on the CPU's ability to respond to timer interrupts which are needed for the | |
1313 | watchdog work to be queued by the watchdog timer function, otherwise the NMI | |
1314 | watchdog — if enabled — can detect a hard lockup condition. | |
195daf66 | 1315 | |
195daf66 | 1316 | |
a3cb66a5 SK |
1317 | stack_erasing |
1318 | ============= | |
964c9dff AP |
1319 | |
1320 | This parameter can be used to control kernel stack erasing at the end | |
a3cb66a5 | 1321 | of syscalls for kernels built with ``CONFIG_GCC_PLUGIN_STACKLEAK``. |
964c9dff AP |
1322 | |
1323 | That erasing reduces the information which kernel stack leak bugs | |
1324 | can reveal and blocks some uninitialized stack variable attacks. | |
1325 | The tradeoff is the performance impact: on a single CPU system kernel | |
1326 | compilation sees a 1% slowdown, other systems and workloads may vary. | |
1327 | ||
a3cb66a5 SK |
1328 | = ==================================================================== |
1329 | 0 Kernel stack erasing is disabled, STACKLEAK_METRICS are not updated. | |
1330 | 1 Kernel stack erasing is enabled (default), it is performed before | |
1331 | returning to the userspace at the end of syscalls. | |
1332 | = ==================================================================== | |
1333 | ||
1334 | ||
1335 | stop-a (SPARC only) | |
1336 | =================== | |
964c9dff | 1337 | |
a1ad4f15 SK |
1338 | Controls Stop-A: |
1339 | ||
1340 | = ==================================== | |
1341 | 0 Stop-A has no effect. | |
1342 | 1 Stop-A breaks to the PROM (default). | |
1343 | = ==================================== | |
1344 | ||
1345 | Stop-A is always enabled on a panic, so that the user can return to | |
1346 | the boot PROM. | |
1347 | ||
a3cb66a5 SK |
1348 | |
1349 | sysrq | |
1350 | ===== | |
1351 | ||
2793e19d | 1352 | See Documentation/admin-guide/sysrq.rst. |
53b95375 | 1353 | |
964c9dff | 1354 | |
896dd323 | 1355 | tainted |
53b95375 | 1356 | ======= |
1da177e4 | 1357 | |
9c4560e5 KC |
1358 | Non-zero if the kernel has been tainted. Numeric values, which can be |
1359 | ORed together. The letters are seen in "Tainted" line of Oops reports. | |
1360 | ||
53b95375 MCC |
1361 | ====== ===== ============================================================== |
1362 | 1 `(P)` proprietary module was loaded | |
1363 | 2 `(F)` module was force loaded | |
547f574f | 1364 | 4 `(S)` kernel running on an out of specification system |
53b95375 MCC |
1365 | 8 `(R)` module was force unloaded |
1366 | 16 `(M)` processor reported a Machine Check Exception (MCE) | |
1367 | 32 `(B)` bad page referenced or some unexpected page flags | |
1368 | 64 `(U)` taint requested by userspace application | |
1369 | 128 `(D)` kernel died recently, i.e. there was an OOPS or BUG | |
1370 | 256 `(A)` an ACPI table was overridden by user | |
1371 | 512 `(W)` kernel issued warning | |
1372 | 1024 `(C)` staging driver was loaded | |
1373 | 2048 `(I)` workaround for bug in platform firmware applied | |
1374 | 4096 `(O)` externally-built ("out-of-tree") module was loaded | |
1375 | 8192 `(E)` unsigned module was loaded | |
1376 | 16384 `(L)` soft lockup occurred | |
1377 | 32768 `(K)` kernel has been live patched | |
1378 | 65536 `(X)` Auxiliary taint, defined for and used by distros | |
1379 | 131072 `(T)` The kernel was built with the struct randomization plugin | |
1380 | ====== ===== ============================================================== | |
896dd323 | 1381 | |
2793e19d | 1382 | See Documentation/admin-guide/tainted-kernels.rst for more information. |
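
Because the values are ORed together, a reading from this file can be split
back into its flags by testing each bit in turn. The sketch below does this
with the letters from the table above. ``decode_taint`` is a hypothetical
helper name, not an existing tool.

```shell
#!/bin/sh
# decode_taint VALUE
# Prints the table letter for every bit set in VALUE (bit 0 = P ... bit 17 = T).
# Hypothetical helper, for illustration only.
decode_taint() {
    value=$1
    letters="P F S R M B U D A W C I O E L K X T"
    bit=0
    out=""
    for letter in $letters; do
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            out="$out$letter"
        fi
        bit=$(( bit + 1 ))
    done
    echo "$out"
}

decode_taint 4097   # 1 + 4096: prints PO
decode_taint 512    # prints W
```

On a live system the argument would come from ``cat /proc/sys/kernel/tainted``.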
1da177e4 | 1383 | |
db38d5c1 RA |
1384 | Note: |
1385 | writes to this sysctl interface will fail with ``EINVAL`` if the kernel is | |
1386 | booted with the command line option ``panic_on_taint=<bitmask>,nousertaint`` | |
1387 | and any of the ORed-together values being written to ``tainted`` is set in | |
1388 | the bitmask passed to ``panic_on_taint``. | |
2793e19d MCC |
1389 | See Documentation/admin-guide/kernel-parameters.rst for more details on |
1390 | that particular kernel command line option and its optional | |
1391 | ``nousertaint`` switch. | |
760df93e | 1392 | |
a3cb66a5 SK |
1393 | threads-max |
1394 | =========== | |
0ec62afe HS |
1395 | |
1396 | This value controls the maximum number of threads that can be created | |
a3cb66a5 | 1397 | using ``fork()``. |
0ec62afe HS |
1398 | |
1399 | During initialization the kernel sets this value such that even if the | |
1400 | maximum number of threads is created, the thread structures occupy only | |
1401 | a part (1/8th) of the available RAM pages. | |
1402 | ||
a3cb66a5 | 1403 | The minimum value that can be written to ``threads-max`` is 1. |
53b95375 | 1404 | |
a3cb66a5 SK |
1405 | The maximum value that can be written to ``threads-max`` is given by the |
1406 | constant ``FUTEX_TID_MASK`` (0x3fffffff). | |
53b95375 | 1407 | |
a3cb66a5 SK |
1408 | If a value outside of this range is written to ``threads-max``, an |
1409 | ``EINVAL`` error occurs. | |
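
The accepted range can be sketched as a simple bounds check. This is only an illustration of the rule stated above, not the kernel's implementation; ``check_threads_max`` is a hypothetical helper, and returning a negative errno value is an assumption made to mirror the ``EINVAL`` wording.

```python
import errno

FUTEX_TID_MASK = 0x3FFFFFFF  # upper bound quoted in the text above
MIN_THREADS = 1              # lower bound quoted in the text above


def check_threads_max(value):
    """Return 0 if value lies in the documented range, else -EINVAL."""
    if MIN_THREADS <= value <= FUTEX_TID_MASK:
        return 0
    return -errno.EINVAL


if __name__ == "__main__":
    for candidate in (0, 1, 0x3FFFFFFF, 0x40000000):
        print(candidate, check_threads_max(candidate))
```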
0ec62afe | 1410 | |
0ec62afe | 1411 | |
50cdae76 SK |
1412 | traceoff_on_warning |
1413 | =================== | |
1414 | ||
2793e19d | 1415 | When set, disables tracing (see Documentation/trace/ftrace.rst) when a |
50cdae76 SK |
1416 | ``WARN()`` is hit. |
1417 | ||
1418 | ||
1419 | tracepoint_printk | |
1420 | ================= | |
1421 | ||
1422 | When tracepoints are sent to printk() (enabled by the ``tp_printk`` | |
1423 | boot parameter), this entry provides runtime control:: | |
1424 | ||
1425 | echo 0 > /proc/sys/kernel/tracepoint_printk | |
1426 | ||
1427 | will stop tracepoints from being sent to printk(), and:: | |
1428 | ||
1429 | echo 1 > /proc/sys/kernel/tracepoint_printk | |
1430 | ||
1431 | will send them to printk() again. | |
1432 | ||
1433 | This only works if the kernel was booted with ``tp_printk`` enabled. | |
1434 | ||
2793e19d MCC |
1435 | See Documentation/admin-guide/kernel-parameters.rst and |
1436 | Documentation/trace/boottime-trace.rst. | |
50cdae76 SK |
1437 | |
1438 | ||
997c798e SK |
1439 | .. _unaligned-dump-stack: |
1440 | ||
1441 | unaligned-dump-stack (ia64) | |
1442 | =========================== | |
1443 | ||
1444 | When logging unaligned accesses, controls whether the stack is | |
1445 | dumped. | |
1446 | ||
1447 | = =================================================== | |
1448 | 0 Do not dump the stack. This is the default setting. | |
1449 | 1 Dump the stack. | |
1450 | = =================================================== | |
1451 | ||
1452 | See also `ignore-unaligned-usertrap`_. | |
1453 | ||
1454 | ||
1455 | unaligned-trap | |
1456 | ============== | |
1457 | ||
1458 | On architectures where unaligned accesses cause traps, and where this | |
1459 | feature is supported (``CONFIG_SYSCTL_ARCH_UNALIGN_ALLOW``; currently, | |
1460 | ``arc`` and ``parisc``), controls whether unaligned traps are caught | |
1461 | and emulated (instead of failing). | |
1462 | ||
1463 | = ======================================================== | |
1464 | 0 Do not emulate unaligned accesses. | |
1465 | 1 Emulate unaligned accesses. This is the default setting. | |
1466 | = ======================================================== | |
1467 | ||
1468 | See also `ignore-unaligned-usertrap`_. | |
1469 | ||
1470 | ||
a3cb66a5 SK |
1471 | unknown_nmi_panic |
1472 | ================= | |
760df93e | 1473 | |
807094c0 BP |
1474 | The value in this file affects how NMIs are handled. When the |
1475 | value is non-zero, an unknown NMI is trapped and the kernel panics. At | |
1476 | that point, kernel debugging information is displayed on the console. | |
760df93e | 1477 | |
807094c0 BP |
1478 | For example, the NMI switch found on most IA32 servers raises an unknown |
1479 | NMI. If a system hangs, try pressing the NMI switch. | |
08825c90 | 1480 | |
08825c90 | 1481 | |
5d8e5aee SK |
1482 | unprivileged_bpf_disabled |
1483 | ========================= | |
1484 | ||
1485 | Writing 1 to this entry will disable unprivileged calls to ``bpf()``; | |
08389d88 DB |
1486 | once disabled, calling ``bpf()`` without ``CAP_SYS_ADMIN`` or ``CAP_BPF`` |
1487 | will return ``-EPERM``. Once set to 1, this can't be cleared from the | |
1488 | running kernel anymore. | |
5d8e5aee | 1489 | |
08389d88 DB |
1490 | Writing 2 to this entry will also disable unprivileged calls to ``bpf()``; |
1491 | however, an admin can still change this setting later on, if needed, by | |
1492 | writing 0 or 1 to this entry. | |
5d8e5aee | 1493 | |
08389d88 DB |
1494 | If ``BPF_UNPRIV_DEFAULT_OFF`` is enabled in the kernel config, then this |
1495 | entry will default to 2 instead of 0. | |
1496 | ||
1497 | = ============================================================= | |
1498 | 0 Unprivileged calls to ``bpf()`` are enabled | |
1499 | 1 Unprivileged calls to ``bpf()`` are disabled without recovery | |
1500 | 2 Unprivileged calls to ``bpf()`` are disabled | |
1501 | = ============================================================= | |
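
The difference between values 1 and 2 is easiest to see as a small state model. The sketch below is a toy model of the rules described above, not kernel code: the class name ``UnprivBpfSetting`` is hypothetical, and the exception types used for rejected writes are assumptions (the text only states that the value 1 cannot be cleared).

```python
class UnprivBpfSetting:
    """Toy model of the write rules for unprivileged_bpf_disabled."""

    def __init__(self, default_off=False):
        # With BPF_UNPRIV_DEFAULT_OFF the entry starts at 2 instead of 0.
        self.value = 2 if default_off else 0

    def write(self, new_value):
        if new_value not in (0, 1, 2):
            raise ValueError("only 0, 1 and 2 are described in the table")
        if self.value == 1 and new_value != 1:
            # Assumption: a rejected write surfaces as a permission error.
            raise PermissionError("value 1 cannot be cleared at runtime")
        self.value = new_value

    def unprivileged_allowed(self):
        # Only value 0 leaves unprivileged bpf() calls enabled.
        return self.value == 0


if __name__ == "__main__":
    setting = UnprivBpfSetting(default_off=True)
    setting.write(0)  # 2 is recoverable: an admin may re-enable
    setting.write(1)  # 1 is final for the running kernel
    try:
        setting.write(0)
    except PermissionError as exc:
        print("rejected:", exc)
```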
5d8e5aee | 1502 | |
a3cb66a5 SK |
1503 | watchdog |
1504 | ======== | |
195daf66 UO |
1505 | |
1506 | This parameter can be used to disable or enable the soft lockup detector | |
a3cb66a5 | 1507 | *and* the NMI watchdog (i.e. the hard lockup detector) at the same time. |
195daf66 | 1508 | |
a3cb66a5 SK |
1509 | = ============================== |
1510 | 0 Disable both lockup detectors. | |
1511 | 1 Enable both lockup detectors. | |
1512 | = ============================== | |
195daf66 UO |
1513 | |
1514 | The soft lockup detector and the NMI watchdog can also be disabled or | |
a3cb66a5 SK |
1515 | enabled individually, using the ``soft_watchdog`` and ``nmi_watchdog`` |
1516 | parameters. | |
1517 | If the ``watchdog`` parameter is read, for example by executing:: | |
195daf66 UO |
1518 | |
1519 | cat /proc/sys/kernel/watchdog | |
1520 | ||
a3cb66a5 SK |
1521 | the output of this command (0 or 1) shows the logical OR of |
1522 | ``soft_watchdog`` and ``nmi_watchdog``. | |
195daf66 | 1523 | |
195daf66 | 1524 | |
a3cb66a5 SK |
1525 | watchdog_cpumask |
1526 | ================ | |
fe4ba3c3 CM |
1527 | |
1528 | This value can be used to control which CPUs the watchdog may run on. | |
a3cb66a5 | 1529 | The default cpumask is all possible cores, but if ``NO_HZ_FULL`` is |
fe4ba3c3 | 1530 | enabled in the kernel config, and cores are specified with the |
a3cb66a5 | 1531 | ``nohz_full=`` boot argument, those cores are excluded by default. |
fe4ba3c3 CM |
1532 | Offline cores can be included in this mask, and if the core is later |
1533 | brought online, the watchdog will be started based on the mask value. | |
1534 | ||
a3cb66a5 | 1535 | Typically this value would only be touched in the ``nohz_full`` case |
fe4ba3c3 CM |
1536 | to re-enable cores that by default were not running the watchdog, |
1537 | if a kernel lockup was suspected on those cores. | |
1538 | ||
1539 | The argument value is the standard cpulist format for cpumasks, | |
1540 | so for example to enable the watchdog on cores 0, 2, 3, and 4 you | |
53b95375 | 1541 | might say:: |
fe4ba3c3 CM |
1542 | |
1543 | echo 0,2-4 > /proc/sys/kernel/watchdog_cpumask | |
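
To check which cores a cpulist string such as ``0,2-4`` selects, it can be expanded into individual core numbers. This is a minimal sketch that assumes only single numbers and ``N-M`` ranges separated by commas; ``parse_cpulist`` is a hypothetical helper, and any other syntax the kernel may accept is not handled.

```python
def parse_cpulist(text):
    """Expand a cpulist such as "0,2-4" into a sorted list of core numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate empty input or a trailing comma
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


if __name__ == "__main__":
    print(parse_cpulist("0,2-4"))
```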
1544 | ||
fe4ba3c3 | 1545 | |
a3cb66a5 SK |
1546 | watchdog_thresh |
1547 | =============== | |
08825c90 LZ |
1548 | |
1549 | This value can be used to control the frequency of hrtimer and NMI | |
1550 | events and the soft and hard lockup thresholds. The default threshold | |
1551 | is 10 seconds. | |
1552 | ||
a3cb66a5 | 1553 | The softlockup threshold is (``2 * watchdog_thresh``). Setting this |
08825c90 | 1554 | tunable to zero will disable lockup detection altogether. |
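
The relationship between the tunable and the soft lockup threshold can be written out as a one-line calculation. The helper name ``softlockup_threshold`` is hypothetical, and using ``None`` to represent "detection disabled" is an assumption made for this sketch.

```python
def softlockup_threshold(watchdog_thresh):
    """Return the soft lockup threshold in seconds, or None if disabled."""
    if watchdog_thresh < 0:
        raise ValueError("watchdog_thresh must not be negative")
    if watchdog_thresh == 0:
        return None  # zero disables lockup detection altogether
    return 2 * watchdog_thresh


if __name__ == "__main__":
    # With the default of 10 seconds the soft lockup threshold is 20 seconds.
    print(softlockup_threshold(10))
```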