.. SPDX-License-Identifier: GPL-2.0

====================
Kernel Testing Guide
====================


There are a number of different tools for testing the Linux kernel, so knowing
when to use each of them can be a challenge. This document provides a rough
overview of their differences, and how they fit together.


Writing and Running Tests
=========================

The bulk of kernel tests are written using either the kselftest or KUnit
frameworks. These both provide infrastructure to help make running tests and
groups of tests easier, as well as providing helpers to aid in writing new
tests.

If you're looking to verify the behaviour of the kernel — particularly specific
parts of the kernel — then you'll want to use KUnit or kselftest.


The Difference Between KUnit and kselftest
------------------------------------------

KUnit (Documentation/dev-tools/kunit/index.rst) is an entirely in-kernel system
for "white box" testing: because test code is part of the kernel, it can access
internal structures and functions which aren't exposed to userspace.

KUnit tests therefore are best written against small, self-contained parts
of the kernel, which can be tested in isolation. This aligns well with the
concept of 'unit' testing.

For example, a KUnit test might test an individual kernel function (or even a
single codepath through a function, such as an error handling case), rather
than a feature as a whole.
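
As an illustration, a minimal KUnit test for a hypothetical in-kernel helper
``misc_add()`` (invented here for the example; the KUnit API shown is real)
might look like this sketch::

  #include <kunit/test.h>

  /* Hypothetical helper under test; invented for this example. */
  static int misc_add(int a, int b)
  {
          return a + b;
  }

  static void misc_add_test_basic(struct kunit *test)
  {
          /* KUNIT_EXPECT_*() records a failure but lets the test continue. */
          KUNIT_EXPECT_EQ(test, 3, misc_add(1, 2));
          KUNIT_EXPECT_EQ(test, 0, misc_add(1, -1));
  }

  static struct kunit_case misc_add_test_cases[] = {
          KUNIT_CASE(misc_add_test_basic),
          {}
  };

  static struct kunit_suite misc_add_test_suite = {
          .name = "misc-add",
          .test_cases = misc_add_test_cases,
  };
  kunit_test_suite(misc_add_test_suite);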

This also makes KUnit tests very fast to build and run, allowing them to be
run frequently as part of the development process.
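
For example, the ``kunit.py`` wrapper in the kernel source tree builds a
kernel, boots it under User Mode Linux, and reports the test results::

  ./tools/testing/kunit/kunit.py run

See Documentation/dev-tools/kunit/start.rst for more details.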

There is a KUnit test style guide which may give further pointers in
Documentation/dev-tools/kunit/style.rst


kselftest (Documentation/dev-tools/kselftest.rst), on the other hand, is
largely implemented in userspace, and tests are normal userspace scripts or
programs.

This makes it easier to write more complicated tests, or tests which need to
manipulate the overall system state more (e.g., spawning processes, etc.).
However, it's not possible to call kernel functions directly from kselftest.
This means that only kernel functionality which is exposed to userspace somehow
(e.g. by a syscall, device, filesystem, etc.) can be tested with kselftest. To
work around this, some tests include a companion kernel module which exposes
more information or functionality. If a test runs mostly or entirely within the
kernel, however, KUnit may be the more appropriate tool.

kselftest is therefore suited well to tests of whole features, as these will
expose an interface to userspace, which can be tested, but not implementation
details. This aligns well with 'system' or 'end-to-end' testing.

For example, all new system calls should be accompanied by kselftest tests.
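
For instance, kselftest suites are built and run via the kernel build system;
the ``TARGETS`` variable limits the run to particular suites (``timers`` below
is just an example)::

  make -C tools/testing/selftests run_tests
  make -C tools/testing/selftests TARGETS=timers run_tests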

Code Coverage Tools
===================

The Linux kernel supports two different code coverage measurement tools. These
can be used to verify that a test is executing particular functions or lines
of code. This is useful for determining how much of the kernel is being tested,
and for finding corner-cases which are not covered by the appropriate test.
Documentation/dev-tools/gcov.rst is GCC's coverage testing tool, which can be
used with the kernel to get global or per-module coverage. Unlike KCOV, it
does not record per-task coverage. Coverage data can be read from debugfs,
and interpreted using the usual gcov tooling.
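
As a sketch, using gcov involves enabling the relevant configuration options,
running the tests, and then reading the data out of debugfs::

  # Kernel configuration:
  #   CONFIG_DEBUG_FS=y
  #   CONFIG_GCOV_KERNEL=y
  #   CONFIG_GCOV_PROFILE_ALL=y   # profile the entire kernel

  # After boot and running the tests, coverage data appears in debugfs:
  ls /sys/kernel/debug/gcov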

Documentation/dev-tools/kcov.rst is a feature which can be built in to the
kernel to allow capturing coverage on a per-task level. It's therefore useful
for fuzzing and other situations where information about code executed during,
for example, a single syscall is useful.
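
A condensed version of the usage example in Documentation/dev-tools/kcov.rst
shows the general shape of a program collecting coverage for a single syscall::

  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/ioctl.h>
  #include <sys/mman.h>
  #include <unistd.h>

  #define KCOV_INIT_TRACE _IOR('c', 1, unsigned long)
  #define KCOV_ENABLE     _IO('c', 100)
  #define KCOV_DISABLE    _IO('c', 101)
  #define COVER_SIZE      (64 << 10)
  #define KCOV_TRACE_PC   0

  int main(void)
  {
          unsigned long *cover, n, i;
          int fd;

          fd = open("/sys/kernel/debug/kcov", O_RDWR);
          if (fd == -1)
                  exit(1);
          /* Set up a buffer shared between the kernel and userspace. */
          if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE))
                  exit(1);
          cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long),
                       PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
          if (cover == MAP_FAILED)
                  exit(1);
          /* Collect coverage for this task only, as program counters. */
          if (ioctl(fd, KCOV_ENABLE, KCOV_TRACE_PC))
                  exit(1);
          __atomic_store_n(&cover[0], 0, __ATOMIC_RELAXED);

          read(-1, NULL, 0);      /* The code under test: a single syscall. */

          n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED);
          for (i = 0; i < n; i++)
                  printf("0x%lx\n", cover[i + 1]);
          ioctl(fd, KCOV_DISABLE, 0);
          return 0;
  }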


Dynamic Analysis Tools
======================

The kernel also supports a number of dynamic analysis tools, which attempt to
detect classes of issues when they occur in a running kernel. These typically
each look for a different class of bugs, such as invalid memory accesses,
concurrency issues such as data races, or other undefined behaviour like
integer overflows.

Some of these tools are listed below:

* kmemleak detects possible memory leaks. See
  Documentation/dev-tools/kmemleak.rst
* KASAN detects invalid memory accesses such as out-of-bounds and
  use-after-free errors. See Documentation/dev-tools/kasan.rst
* UBSAN detects behaviour that is undefined by the C standard, like integer
  overflows. See Documentation/dev-tools/ubsan.rst
* KCSAN detects data races. See Documentation/dev-tools/kcsan.rst
* KFENCE is a low-overhead detector of memory issues, which is much faster than
  KASAN and can be used in production. See Documentation/dev-tools/kfence.rst
* lockdep is a locking correctness validator. See
  Documentation/locking/lockdep-design.rst
* There are several other pieces of debug instrumentation in the kernel, many
  of which can be found in lib/Kconfig.debug
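
For instance, a kernel built for running tests might enable one or more of
these tools with a configuration fragment along the following lines (whether
each option is available, and which combinations are supported, depends on the
architecture and kernel version; see the documents above)::

  CONFIG_DEBUG_KMEMLEAK=y
  CONFIG_KASAN=y
  CONFIG_UBSAN=y
  CONFIG_KCSAN=y
  CONFIG_KFENCE=y
  CONFIG_PROVE_LOCKING=y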

These tools tend to test the kernel as a whole, and do not "pass" like
kselftest or KUnit tests. They can be combined with KUnit or kselftest by
running tests on a kernel with these tools enabled: you can then be sure
that none of these errors are occurring during the test.

Some of these tools integrate with KUnit or kselftest and will
automatically fail tests if an issue is detected.
Static Analysis Tools
=====================

In addition to testing a running kernel, one can also analyze kernel source code
directly (**at compile time**) using **static analysis** tools. The tools
commonly used in the kernel allow one to inspect the whole source tree or just
specific files within it. They make it easier to detect and fix problems during
the development process.

Sparse can help test the kernel by performing type-checking, lock checking, and
value range checking, in addition to reporting various errors and warnings while
examining the code. See the Documentation/dev-tools/sparse.rst documentation
page for details on how to use it.
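
For example, Sparse is invoked through the kernel build system: ``C=1`` checks
only the files which are about to be re-compiled, while ``C=2`` checks all
source files regardless::

  make C=1    # check files that are re-compiled
  make C=2    # check all source files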

Smatch extends Sparse and provides additional checks for programming logic
mistakes such as missing breaks in switch statements, unused return values on
error checking, forgetting to set an error code in the return of an error path,
etc. Smatch also has tests against more serious issues such as integer
overflows, null pointer dereferences, and memory leaks. See the project page at
http://smatch.sourceforge.net/.
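
As a sketch, Smatch hooks into the same ``CHECK`` mechanism as Sparse;
assuming it has been built in ``~/smatch`` (the path here is just an example),
a check over the tree might be run as::

  make CHECK="~/smatch/smatch -p=kernel" C=1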

Coccinelle is another static analyzer at our disposal. Coccinelle is often used
to aid refactoring and collateral evolution of source code, but it can also help
to avoid certain bugs that occur in common code patterns. The types of tests
available include API tests, tests for correct usage of kernel iterators, checks
for the soundness of free operations, analysis of locking behavior, and further
tests known to help keep consistent kernel usage. See the
Documentation/dev-tools/coccinelle.rst documentation page for details.
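
For example, the full set of semantic patches shipped under
``scripts/coccinelle/`` can be run against the tree with the ``coccicheck``
make target; ``MODE`` chooses how the results are presented::

  make coccicheck MODE=report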

Beware, though, that static analysis tools suffer from **false positives**.
Errors and warnings need to be evaluated carefully before attempting to fix them.

When to use Sparse and Smatch
-----------------------------

Sparse does type checking, such as verifying that annotated variables do not
cause endianness bugs, detecting places that use ``__user`` pointers improperly,
and analyzing the compatibility of symbol initializers.

Smatch does flow analysis and, if allowed to build the function database, it
also does cross function analysis. Smatch tries to answer questions like where
is this buffer allocated? How big is it? Can this index be controlled by the
user? Is this variable larger than that variable?

It's generally easier to write checks in Smatch than it is to write checks in
Sparse. Nevertheless, there are some overlaps between Sparse and Smatch checks.

Strong points of Smatch and Coccinelle
--------------------------------------

Coccinelle is probably the easiest for writing checks. It works before the
pre-processor so it's easier to check for bugs in macros using Coccinelle.
Coccinelle also creates patches for you, which no other tool does.

For example, with Coccinelle you can do a mass conversion from
``kmalloc(x * size, GFP_KERNEL)`` to ``kmalloc_array(x, size, GFP_KERNEL)``, and
that's really useful. If you just created a Smatch warning and tried to push the
work of converting onto the maintainers, they would be annoyed. You'd have to
argue, for each warning, about whether the code can really overflow or not.
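
A conversion like the one above is typically driven by a small semantic patch;
a sketch of one possible rule (simplified from the patterns used in the
kernel's Coccinelle scripts) is::

  @@
  expression x, size;
  @@
  - kmalloc(x * size, GFP_KERNEL)
  + kmalloc_array(x, size, GFP_KERNEL)

Running such a rule with ``make coccicheck MODE=patch`` produces a diff that
can be applied directly.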

Coccinelle does no analysis of variable values, which is the strong point of
Smatch. On the other hand, Coccinelle allows you to do simple things in a simple
way.