.. SPDX-License-Identifier: GPL-2.0

====================
Kernel Testing Guide
====================


There are a number of different tools for testing the Linux kernel, so knowing
when to use each of them can be a challenge. This document provides a rough
overview of their differences, and how they fit together.


Writing and Running Tests
=========================

The bulk of kernel tests are written using either the kselftest or KUnit
frameworks. These both provide infrastructure to help make running tests and
groups of tests easier, as well as providing helpers to aid in writing new
tests.

If you're looking to verify the behaviour of the kernel — particularly specific
parts of the kernel — then you'll want to use KUnit or kselftest.


The Difference Between KUnit and kselftest
------------------------------------------

KUnit (Documentation/dev-tools/kunit/index.rst) is an entirely in-kernel system
for "white box" testing: because test code is part of the kernel, it can access
internal structures and functions which aren't exposed to userspace.

KUnit tests therefore are best written against small, self-contained parts
of the kernel, which can be tested in isolation. This aligns well with the
concept of 'unit' testing.

For example, a KUnit test might test an individual kernel function (or even a
single codepath through a function, such as an error handling case), rather
than a feature as a whole.
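
As an illustration, a minimal KUnit test for a hypothetical in-kernel helper
``misc_add()`` (invented here for the example; the KUnit API shown is real)
might look like this sketch::

  #include <kunit/test.h>

  /* Hypothetical helper under test; invented for this example. */
  static int misc_add(int a, int b)
  {
          return a + b;
  }

  static void misc_add_test_basic(struct kunit *test)
  {
          /* KUNIT_EXPECT_*() records a failure but lets the test continue. */
          KUNIT_EXPECT_EQ(test, 3, misc_add(1, 2));
          KUNIT_EXPECT_EQ(test, 0, misc_add(1, -1));
  }

  static struct kunit_case misc_add_test_cases[] = {
          KUNIT_CASE(misc_add_test_basic),
          {}
  };

  static struct kunit_suite misc_add_test_suite = {
          .name = "misc-add",
          .test_cases = misc_add_test_cases,
  };
  kunit_test_suite(misc_add_test_suite);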

This also makes KUnit tests very fast to build and run, allowing them to be
run frequently as part of the development process.
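
For example, the ``kunit.py`` wrapper in the kernel source tree builds a
kernel, boots it under User Mode Linux, and reports the test results::

  ./tools/testing/kunit/kunit.py run

See Documentation/dev-tools/kunit/start.rst for more details.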

There is a KUnit test style guide which may give further pointers in
Documentation/dev-tools/kunit/style.rst


kselftest (Documentation/dev-tools/kselftest.rst), on the other hand, is
largely implemented in userspace, and tests are normal userspace scripts or
programs.

This makes it easier to write more complicated tests, or tests which need to
manipulate the overall system state more (e.g., spawning processes, etc.).
However, it's not possible to call kernel functions directly from kselftest.
This means that only kernel functionality which is exposed to userspace somehow
(e.g. by a syscall, device, filesystem, etc.) can be tested with kselftest. To
work around this, some tests include a companion kernel module which exposes
more information or functionality. If a test runs mostly or entirely within the
kernel, however, KUnit may be the more appropriate tool.

kselftest is therefore suited well to tests of whole features, as these will
expose an interface to userspace, which can be tested, but not implementation
details. This aligns well with 'system' or 'end-to-end' testing.

For example, all new system calls should be accompanied by kselftest tests.
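
For instance, kselftest suites are built and run via the kernel build system;
the ``TARGETS`` variable limits the run to particular suites (``timers`` below
is just an example)::

  make -C tools/testing/selftests run_tests
  make -C tools/testing/selftests TARGETS=timers run_tests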

Code Coverage Tools
===================

The Linux kernel supports two different code coverage measurement tools. These
can be used to verify that a test is executing particular functions or lines
of code. This is useful for determining how much of the kernel is being tested,
and for finding corner-cases which are not covered by the appropriate test.
Documentation/dev-tools/gcov.rst is GCC's coverage testing tool, which can be
used with the kernel to get global or per-module coverage. Unlike KCOV, it
does not record per-task coverage. Coverage data can be read from debugfs,
and interpreted using the usual gcov tooling.
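
As a sketch, using gcov involves enabling the relevant configuration options,
running the tests, and then reading the data out of debugfs::

  # Kernel configuration:
  #   CONFIG_DEBUG_FS=y
  #   CONFIG_GCOV_KERNEL=y
  #   CONFIG_GCOV_PROFILE_ALL=y   # profile the entire kernel

  # After boot and running the tests, coverage data appears in debugfs:
  ls /sys/kernel/debug/gcov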

Documentation/dev-tools/kcov.rst is a feature which can be built in to the
kernel to allow capturing coverage on a per-task level. It's therefore useful
for fuzzing and other situations where information about code executed during,
for example, a single syscall is useful.
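
A condensed version of the usage example in Documentation/dev-tools/kcov.rst
shows the general shape of a program collecting coverage for a single syscall::

  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/ioctl.h>
  #include <sys/mman.h>
  #include <unistd.h>

  #define KCOV_INIT_TRACE _IOR('c', 1, unsigned long)
  #define KCOV_ENABLE     _IO('c', 100)
  #define KCOV_DISABLE    _IO('c', 101)
  #define COVER_SIZE      (64 << 10)
  #define KCOV_TRACE_PC   0

  int main(void)
  {
          unsigned long *cover, n, i;
          int fd;

          fd = open("/sys/kernel/debug/kcov", O_RDWR);
          if (fd == -1)
                  exit(1);
          /* Set up a buffer shared between the kernel and userspace. */
          if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE))
                  exit(1);
          cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long),
                       PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
          if (cover == MAP_FAILED)
                  exit(1);
          /* Collect coverage for this task only, as program counters. */
          if (ioctl(fd, KCOV_ENABLE, KCOV_TRACE_PC))
                  exit(1);
          __atomic_store_n(&cover[0], 0, __ATOMIC_RELAXED);

          read(-1, NULL, 0);      /* The code under test: a single syscall. */

          n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED);
          for (i = 0; i < n; i++)
                  printf("0x%lx\n", cover[i + 1]);
          ioctl(fd, KCOV_DISABLE, 0);
          return 0;
  }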


Dynamic Analysis Tools
======================

The kernel also supports a number of dynamic analysis tools, which attempt to
detect classes of issues when they occur in a running kernel. These typically
each look for a different class of bugs, such as invalid memory accesses,
concurrency issues such as data races, or other undefined behaviour like
integer overflows.

Some of these tools are listed below:

* kmemleak detects possible memory leaks. See
  Documentation/dev-tools/kmemleak.rst
* KASAN detects invalid memory accesses such as out-of-bounds and
  use-after-free errors. See Documentation/dev-tools/kasan.rst
* UBSAN detects behaviour that is undefined by the C standard, like integer
  overflows. See Documentation/dev-tools/ubsan.rst
* KCSAN detects data races. See Documentation/dev-tools/kcsan.rst
* KFENCE is a low-overhead detector of memory issues, which is much faster than
  KASAN and can be used in production. See Documentation/dev-tools/kfence.rst
* lockdep is a locking correctness validator. See
  Documentation/locking/lockdep-design.rst
* There are several other pieces of debug instrumentation in the kernel, many
  of which can be found in lib/Kconfig.debug
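
For instance, a kernel built for running tests might enable one or more of
these tools with a configuration fragment along the following lines (whether
each option is available, and which combinations are supported, depends on the
architecture and kernel version; see the documents above)::

  CONFIG_DEBUG_KMEMLEAK=y
  CONFIG_KASAN=y
  CONFIG_UBSAN=y
  CONFIG_KCSAN=y
  CONFIG_KFENCE=y
  CONFIG_PROVE_LOCKING=y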

These tools tend to test the kernel as a whole, and do not "pass" like
kselftest or KUnit tests. They can be combined with KUnit or kselftest by
running tests on a kernel with these tools enabled: you can then be sure
that none of these errors are occurring during the test.

Some of these tools integrate with KUnit or kselftest and will
automatically fail tests if an issue is detected.
Static Analysis Tools
=====================

In addition to testing a running kernel, one can also analyze kernel source code
directly (**at compile time**) using **static analysis** tools. The tools
commonly used in the kernel allow one to inspect the whole source tree or just
specific files within it. They make it easier to detect and fix problems during
the development process.

Sparse can help test the kernel by performing type-checking, lock checking, and
value range checking, in addition to reporting various errors and warnings while
examining the code. See the Documentation/dev-tools/sparse.rst documentation
page for details on how to use it.
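
For example, Sparse is invoked through the kernel build system: ``C=1`` checks
only the files which are about to be re-compiled, while ``C=2`` checks all
source files regardless::

  make C=1    # check files that are re-compiled
  make C=2    # check all source files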

Smatch extends Sparse and provides additional checks for programming logic
mistakes such as missing breaks in switch statements, unused return values on
error checking, forgetting to set an error code in the return of an error path,
etc. Smatch also has tests against more serious issues such as integer
overflows, null pointer dereferences, and memory leaks. See the project page at
http://smatch.sourceforge.net/.
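
As a sketch, Smatch hooks into the same ``CHECK`` mechanism as Sparse;
assuming it has been built in ``~/smatch`` (the path here is just an example),
a check over the tree might be run as::

  make CHECK="~/smatch/smatch -p=kernel" C=1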

Coccinelle is another static analyzer at our disposal. Coccinelle is often used
to aid refactoring and collateral evolution of source code, but it can also help
to avoid certain bugs that occur in common code patterns. The types of tests
available include API tests, tests for correct usage of kernel iterators, checks
for the soundness of free operations, analysis of locking behavior, and further
tests known to help keep consistent kernel usage. See the
Documentation/dev-tools/coccinelle.rst documentation page for details.
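
For example, the full set of semantic patches shipped under
``scripts/coccinelle/`` can be run against the tree with the ``coccicheck``
make target; ``MODE`` chooses how the results are presented::

  make coccicheck MODE=report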

Beware, though, that static analysis tools suffer from **false positives**.
Errors and warnings need to be evaluated carefully before attempting to fix them.

When to use Sparse and Smatch
-----------------------------

Sparse does type checking, such as verifying that annotated variables do not
cause endianness bugs, detecting places that use ``__user`` pointers improperly,
and analyzing the compatibility of symbol initializers.

Smatch does flow analysis and, if allowed to build the function database, it
also does cross function analysis. Smatch tries to answer questions like where
is this buffer allocated? How big is it? Can this index be controlled by the
user? Is this variable larger than that variable?

It's generally easier to write checks in Smatch than it is to write checks in
Sparse. Nevertheless, there are some overlaps between Sparse and Smatch checks.

Strong points of Smatch and Coccinelle
--------------------------------------

Coccinelle is probably the easiest for writing checks. It works before the
pre-processor so it's easier to check for bugs in macros using Coccinelle.
Coccinelle also creates patches for you, which no other tool does.

For example, with Coccinelle you can do a mass conversion from
``kmalloc(x * size, GFP_KERNEL)`` to ``kmalloc_array(x, size, GFP_KERNEL)``, and
that's really useful. If you just created a Smatch warning and tried to push the
work of converting onto the maintainers, they would be annoyed. You'd have to
argue, for each warning, about whether the code can really overflow or not.
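
A conversion like the one above is typically driven by a small semantic patch;
a sketch of one possible rule (simplified from the patterns used in the
kernel's Coccinelle scripts) is::

  @@
  expression x, size;
  @@
  - kmalloc(x * size, GFP_KERNEL)
  + kmalloc_array(x, size, GFP_KERNEL)

Running such a rule with ``make coccicheck MODE=patch`` produces a diff that
can be applied directly.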

Coccinelle does no analysis of variable values, which is the strong point of
Smatch. On the other hand, Coccinelle allows you to do simple things in a simple
way.