===========================================
Fault injection capabilities infrastructure
===========================================

See also drivers/md/md-faulty.c and "every_nth" module option for scsi_debug.


Available fault injection capabilities
--------------------------------------

- failslab

  injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)

- fail_page_alloc

  injects page allocation failures. (alloc_pages(), get_free_pages(), ...)

- fail_usercopy

  injects failures in user memory access functions. (copy_from_user(), get_user(), ...)

- fail_futex

  injects futex deadlock and uaddr fault errors.

- fail_make_request

  injects disk IO errors on devices permitted by setting
  /sys/block/<device>/make-it-fail or
  /sys/block/<device>/<partition>/make-it-fail. (submit_bio_noacct())

- fail_mmc_request

  injects MMC data errors on devices permitted by setting
  debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request.

- fail_function

  injects an error return into specific functions, which are marked with
  the ALLOW_ERROR_INJECTION() macro, by setting debugfs entries
  under /sys/kernel/debug/fail_function. No boot option is supported.

- NVMe fault injection

  injects an NVMe status code and retry flag on devices permitted by setting
  debugfs entries under /sys/kernel/debug/nvme*/fault_inject. The default
  status code is NVME_SC_INVALID_OPCODE with no retry. The status code and
  retry flag can be set via debugfs.


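Which of these capabilities a running kernel offers depends on its
configuration. The following is a minimal sketch, not part of the kernel tree,
that reports which capability directories exist under debugfs. The function
name list_fault_capabilities and the DEBUGFS variable are hypothetical, and the
directory names for fail_usercopy and fail_make_request are assumed to follow
the /sys/kernel/debug/fail* pattern used elsewhere in this document.

```shell
#!/bin/bash
# Hypothetical helper: report which fault injection capabilities expose a
# directory under debugfs. DEBUGFS may be overridden for a dry run.
DEBUGFS=${DEBUGFS:-/sys/kernel/debug}

list_fault_capabilities()
{
    local name

    for name in failslab fail_page_alloc fail_usercopy fail_futex \
                fail_make_request fail_function
    do
        # A capability is treated as available if its directory exists.
        if [ -d "$DEBUGFS/$name" ]
        then
            echo "$name: available"
        else
            echo "$name: not available"
        fi
    done
}
```

Calling list_fault_capabilities as root, with debugfs mounted, prints one line
per capability.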

Configure fault-injection capabilities behavior
-----------------------------------------------

debugfs entries
^^^^^^^^^^^^^^^

The fault-inject-debugfs kernel module provides some debugfs entries for
runtime configuration of fault-injection capabilities.

- /sys/kernel/debug/fail*/probability:

  likelihood of failure injection, in percent.

  Format: <percent>

  Note that one-failure-per-hundred is a very high error rate
  for some testcases. Consider setting probability=100 and configuring
  /sys/kernel/debug/fail*/interval for such testcases.

- /sys/kernel/debug/fail*/interval:

  specifies the interval between failures, for calls to
  should_fail() that pass all the other tests.

  Note that if you enable this, by setting interval>1, you will
  probably want to set probability=100.

- /sys/kernel/debug/fail*/times:

  specifies how many times failures may happen at most.
  A value of -1 means "no limit".

- /sys/kernel/debug/fail*/space:

  specifies an initial resource "budget", decremented by "size"
  on each call to should_fail(,size). Failure injection is
  suppressed until "space" reaches zero.

- /sys/kernel/debug/fail*/verbose:

  Format: { 0 | 1 | 2 }

  specifies the verbosity of the messages when a failure is
  injected. '0' means no messages; '1' prints only a single
  log line per failure; '2' also prints a call trace, which is useful
  for debugging the problems revealed by fault injection.

- /sys/kernel/debug/fail*/task-filter:

  Format: { 'Y' | 'N' }

  A value of 'N' disables filtering by process (default).
  A value of 'Y' limits failures to only processes indicated by
  /proc/<pid>/make-it-fail==1.

- /sys/kernel/debug/fail*/require-start,
  /sys/kernel/debug/fail*/require-end,
  /sys/kernel/debug/fail*/reject-start,
  /sys/kernel/debug/fail*/reject-end:

  specifies the range of virtual addresses tested during
  stacktrace walking. Failure is injected only if some caller
  in the walked stacktrace lies within the required range, and
  none lies within the rejected range.
  Default required range is [0,ULONG_MAX) (whole of virtual address space).
  Default rejected range is [0,0).

- /sys/kernel/debug/fail*/stacktrace-depth:

  specifies the maximum stacktrace depth walked during search
  for a caller within [require-start,require-end) OR
  [reject-start,reject-end).

- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:

  Format: { 'Y' | 'N' }

  The default is 'N'. Setting it to 'Y' suppresses failure injection into
  highmem/user allocations.

- /sys/kernel/debug/failslab/ignore-gfp-wait:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:

  Format: { 'Y' | 'N' }

  The default is 'N'. Setting it to 'Y' injects failures
  only into non-sleep allocations (GFP_ATOMIC allocations).

- /sys/kernel/debug/fail_page_alloc/min-order:

  specifies the minimum page allocation order into which failures
  are injected.

- /sys/kernel/debug/fail_futex/ignore-private:

  Format: { 'Y' | 'N' }

  The default is 'N'. Setting it to 'Y' disables failure injection
  when dealing with private (address space) futexes.

- /sys/kernel/debug/fail_function/inject:

  Format: { 'function-name' | '!function-name' | '' }

  specifies the target function of error injection by name.
  If the function name is prefixed with '!', the given function is
  removed from the injection list. If nothing is specified (''),
  the injection list is cleared.

- /sys/kernel/debug/fail_function/injectable:

  (read only) shows error injectable functions and what type of
  error values can be specified. The error type will be one of
  the following:

  - NULL: retval must be 0.
  - ERRNO: retval must be -1 to -MAX_ERRNO (-4095).
  - ERR_NULL: retval must be 0 or -1 to -MAX_ERRNO (-4095).

- /sys/kernel/debug/fail_function/<function-name>/retval:

  specifies the "error" return value to inject into the given
  function. This entry is created when the user specifies a new
  injection entry.

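The notes on probability and interval above suggest a common setup: fail every
Nth eligible call rather than a percentage of calls. The following is a minimal
sketch of that setup, not part of the kernel tree. The function name
fail_every_nth and the DEBUGFS variable are hypothetical; only the entry names
(probability, interval, times, verbose) come from the list above.

```shell
#!/bin/bash
# Hypothetical helper: make every Nth eligible call fail for one capability.
# DEBUGFS may be overridden, e.g. for a dry run against a scratch directory.
DEBUGFS=${DEBUGFS:-/sys/kernel/debug}

fail_every_nth()
{
    local failtype=$1 nth=$2 dir

    dir=$DEBUGFS/$failtype
    if [ ! -d "$dir" ]
    then
        echo "$dir: no such fault injection capability" >&2
        return 1
    fi

    echo 100 > "$dir/probability"   # do not thin out failures by chance
    echo "$nth" > "$dir/interval"   # space failures $nth eligible calls apart
    echo -1 > "$dir/times"          # no limit on the number of failures
    echo 1 > "$dir/verbose"         # one log line per injected failure
}
```

For example, "fail_every_nth failslab 100" run as root configures failslab this
way. The other entries (space, task-filter, and so on) keep whatever values
they had before.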

Boot option
^^^^^^^^^^^

In order to inject faults while debugfs is not available (early boot time),
use the boot option::

    failslab=
    fail_page_alloc=
    fail_usercopy=
    fail_make_request=
    fail_futex=
    mmc_core.fail_request=<interval>,<probability>,<space>,<times>

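To make the value format concrete, here is a minimal sketch that assembles such
a boot parameter. It assumes that the <interval>,<probability>,<space>,<times>
format shown on the last line applies to every option in the list. The function
name fault_boot_param is hypothetical and not part of the kernel tree.

```shell
#!/bin/bash
# Hypothetical helper: build a boot parameter of the form
# <name>=<interval>,<probability>,<space>,<times>.
fault_boot_param()
{
    if [ $# -ne 5 ]
    then
        echo "Usage: fault_boot_param name interval probability space times" >&2
        return 1
    fi

    # Arguments are taken in the same order as they appear in the parameter.
    printf '%s=%s,%s,%s,%s\n' "$1" "$2" "$3" "$4" "$5"
}
```

For example, "fault_boot_param failslab 100 100 0 -1" prints
failslab=100,100,0,-1, which can then be appended to the kernel command line.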

proc entries
^^^^^^^^^^^^

- /proc/<pid>/fail-nth,
  /proc/self/task/<tid>/fail-nth:

  Writing an integer N to this file makes the N-th call in the task fail.
  Reading from this file returns an integer value. A value of '0' indicates
  that the fault set up by a previous write to this file has been injected.
  A positive integer N indicates that the fault has not yet been injected.
  Note that this file enables all types of faults (slab, futex, etc).
  This setting takes precedence over all other generic debugfs settings
  like probability, interval, times, etc. But per-capability settings
  (e.g. fail_futex/ignore-private) take precedence over it.

  This feature is intended for systematic testing of faults in a single
  system call. See an example below.

How to add new fault injection capability
-----------------------------------------

- #include <linux/fault-inject.h>

- define the fault attributes

    DECLARE_FAULT_ATTR(name);

  Please see the definition of struct fault_attr in fault-inject.h
  for details.

- provide a way to configure fault attributes

  - boot option

    If you need to enable the fault injection capability from boot time, you
    can provide a boot option to configure it. There is a helper function
    for it:

      setup_fault_attr(attr, str);

  - debugfs entries

    failslab, fail_page_alloc, fail_usercopy, and fail_make_request use this
    method. Helper function:

      fault_create_debugfs_attr(name, parent, attr);

  - module parameters

    If the scope of the fault injection capability is limited to a
    single kernel module, it is better to provide module parameters to
    configure the fault attributes.

- add a hook to insert failures

  Upon should_fail() returning true, client code should inject a failure:

    should_fail(attr, size);

Application Examples
--------------------

- Inject slab allocation failures into module init/exit code::

    #!/bin/bash

    FAILTYPE=failslab
    echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 10 > /sys/kernel/debug/$FAILTYPE/probability
    echo 100 > /sys/kernel/debug/$FAILTYPE/interval
    echo -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
    echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait

    # Run a command with make-it-fail set, so that task-filter selects it.
    faulty_system()
    {
        bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
    }

    if [ $# -eq 0 ]
    then
        echo "Usage: $0 modulename [ modulename ... ]"
        exit 1
    fi

    for m in $*
    do
        echo inserting $m...
        faulty_system modprobe $m

        echo removing $m...
        faulty_system modprobe -r $m
    done

------------------------------------------------------------------------------

- Inject page allocation failures only for a specific module::

    #!/bin/bash

    FAILTYPE=fail_page_alloc
    module=$1

    if [ -z "$module" ]
    then
        echo "Usage: $0 <modulename>"
        exit 1
    fi

    modprobe $module

    if [ ! -d /sys/module/$module/sections ]
    then
        echo Module $module is not loaded
        exit 1
    fi

    # Restrict injection to callers located between the module's .text
    # and .data section addresses.
    cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
    cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end

    echo N > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 10 > /sys/kernel/debug/$FAILTYPE/probability
    echo 100 > /sys/kernel/debug/$FAILTYPE/interval
    echo -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
    echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
    echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
    echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth

    # Stop injecting when the script is interrupted or exits.
    trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT

    echo "Injecting errors into the module $module... (interrupt to stop)"
    sleep 1000000

------------------------------------------------------------------------------

- Inject an open_ctree error during btrfs mount::

    #!/bin/bash

    rm -f testfile.img
    dd if=/dev/zero of=testfile.img bs=1M seek=1000 count=1
    DEVICE=$(losetup --show -f testfile.img)
    mkfs.btrfs -f $DEVICE
    mkdir -p tmpmnt

    FAILTYPE=fail_function
    FAILFUNC=open_ctree
    echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject
    # -12 is -ENOMEM
    echo -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval
    echo N > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 100 > /sys/kernel/debug/$FAILTYPE/probability
    echo 0 > /sys/kernel/debug/$FAILTYPE/interval
    echo -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 1 > /sys/kernel/debug/$FAILTYPE/verbose

    # The mount is expected to fail because of the injected error.
    mount -t btrfs $DEVICE tmpmnt
    if [ $? -ne 0 ]
    then
        echo "SUCCESS!"
    else
        echo "FAILED!"
        umount tmpmnt
    fi

    # Clear the injection list.
    echo > /sys/kernel/debug/$FAILTYPE/inject

    rmdir tmpmnt
    losetup -d $DEVICE
    rm testfile.img


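A retval such as the -12 used in the open_ctree example is only accepted if it
matches the error type that fail_function/injectable reports for the function.
The following is a minimal sketch for listing the injectable functions of a
given type. It assumes that each line of the injectable file holds the function
name first and the error type as the last whitespace-separated field; that
layout is an assumption and is not stated in this document. The function name
injectable_by_type and the DEBUGFS variable are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: list functions from fail_function/injectable whose
# error type equals $1 (NULL, ERRNO or ERR_NULL).
# Assumption: the function name is the first field of each line and the
# error type is the last field.
DEBUGFS=${DEBUGFS:-/sys/kernel/debug}

injectable_by_type()
{
    awk -v type="$1" '$NF == type { print $1 }' \
        "$DEBUGFS/fail_function/injectable"
}
```

For example, "injectable_by_type ERRNO | grep -x open_ctree" succeeds only if
open_ctree is listed with the ERRNO type.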

Tool to run command with failslab or fail_page_alloc
----------------------------------------------------

In order to make it easier to accomplish the tasks mentioned above, we can use
tools/testing/fault-injection/failcmd.sh. Please run the command
"./tools/testing/fault-injection/failcmd.sh --help" for more information and
see the following examples.

Examples:

Run the command "make -C tools/testing/selftests/ run_tests" while injecting
slab allocation failures::

    # ./tools/testing/fault-injection/failcmd.sh \
        -- make -C tools/testing/selftests/ run_tests

Same as above, but allow at most 100 failures instead of the default of at most
one::

    # ./tools/testing/fault-injection/failcmd.sh --times=100 \
        -- make -C tools/testing/selftests/ run_tests

Same as above, but inject page allocation failures instead of slab
allocation failures::

    # env FAILCMD_TYPE=fail_page_alloc \
        ./tools/testing/fault-injection/failcmd.sh --times=100 \
        -- make -C tools/testing/selftests/ run_tests

Systematic faults using fail-nth
---------------------------------

The following code systematically makes the 1st, 2nd, 3rd and so on
fault-capable call in the socketpair() system call fail::

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/socket.h>
    #include <sys/syscall.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <errno.h>

    int main(void)
    {
        int i, err, res, fail_nth, fds[2];
        char buf[128];
        ssize_t n;

        system("echo N > /sys/kernel/debug/failslab/ignore-gfp-wait");
        sprintf(buf, "/proc/self/task/%ld/fail-nth", syscall(SYS_gettid));
        fail_nth = open(buf, O_RDWR);
        if (fail_nth < 0) {
            perror(buf);
            return 1;
        }
        for (i = 1;; i++) {
            /* arm the i-th fault-capable call in this task */
            sprintf(buf, "%d", i);
            write(fail_nth, buf, strlen(buf));
            res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
            err = errno;
            /* reading back 0 means the fault was injected */
            n = pread(fail_nth, buf, sizeof(buf) - 1, 0);
            buf[n > 0 ? n : 0] = '\0';
            if (res == 0) {
                close(fds[0]);
                close(fds[1]);
            }
            printf("%d-th fault %c: res=%d/%d\n", i, atoi(buf) ? 'N' : 'Y',
                res, err);
            if (atoi(buf))
                break;
        }
        return 0;
    }

An example output::

    1-th fault Y: res=-1/23
    2-th fault Y: res=-1/23
    3-th fault Y: res=-1/12
    4-th fault Y: res=-1/12
    5-th fault Y: res=-1/23
    6-th fault Y: res=-1/23
    7-th fault Y: res=-1/23
    8-th fault Y: res=-1/12
    9-th fault Y: res=-1/12
    10-th fault Y: res=-1/12
    11-th fault Y: res=-1/12
    12-th fault Y: res=-1/12
    13-th fault Y: res=-1/12
    14-th fault Y: res=-1/12
    15-th fault Y: res=-1/12
    16-th fault N: res=0/12
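
In this output the second number is the errno value seen by the caller; on
Linux, 12 is ENOMEM and 23 is ENFILE. To see at a glance how the injected
faults surface, the lines can be tallied by errno. The following is a minimal
sketch that assumes the exact line format printed by the program above. The
function name summarize_faults is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: tally the output of the example program by errno.
# Reads lines such as "3-th fault Y: res=-1/12" on stdin and prints
# "<errno> <count>" for every injected fault (marker 'Y'), sorted by errno.
summarize_faults()
{
    awk '$3 == "Y:" {
        split($4, r, "/")    # r[1] is "res=<res>", r[2] is the errno
        count[r[2]]++
    }
    END {
        for (e in count)
            print e, count[e]
    }' | sort -n
}
```

Piping the program's output into summarize_faults prints one line per errno
value. The final line of the program's output, marked 'N', is not counted
because no fault was injected for it.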