.. SPDX-License-Identifier: GPL-2.0

===================================================
The Kernel Test Anything Protocol (KTAP), version 1
===================================================

TAP, or the Test Anything Protocol, is a format for specifying test results used
by a number of projects. Its website and specification are found at this `link
<https://testanything.org/>`_. The Linux Kernel largely uses TAP output for test
results. However, Kernel testing frameworks have special needs for test results
which don't align with the original TAP specification. Thus, a "Kernel TAP"
(KTAP) format is specified to extend and alter TAP to support these use-cases.
This specification describes the generally accepted format of KTAP as it is
currently used in the kernel.

KTAP test results describe a series of tests (which may be nested: i.e., tests
can have subtests), each of which can contain both diagnostic data -- e.g., log
lines -- and a final result. The test structure and results are
machine-readable, whereas the diagnostic data is unstructured and is there to
aid human debugging.

KTAP output is built from four different types of lines:

- Version lines
- Plan lines
- Test case result lines
- Diagnostic lines

In general, valid KTAP output should also form valid TAP output, but some
information, in particular nested test results, may be lost. Also note that
there is a stagnant draft specification for TAP14; KTAP diverges from this in
a couple of places (notably the "Subtest" header), which are described where
relevant later in this document.

Version lines
-------------

All KTAP-formatted results begin with a "version line" which specifies which
version of the (K)TAP standard the result is compliant with.

For example:

- "KTAP version 1"
- "TAP version 13"
- "TAP version 14"

Note that, in KTAP, subtests also begin with a version line, which denotes the
start of the nested test results. This differs from TAP14, which uses a
separate "Subtest" line.

While, going forward, "KTAP version 1" should be used by compliant tests, it
is expected that most parsers and other tooling will accept the other versions
listed here for compatibility with existing tests and frameworks.
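
As an illustration of how a parser might recognize the version lines listed
above, here is a minimal Python sketch. The pattern and the helper name
``parse_version_line`` are assumptions made for this example; they are not
taken from any existing kernel tool.

.. code-block:: python

    import re

    # Assumed pattern: optional indentation (for subtests), the protocol name,
    # the word "version", and a decimal version number.
    VERSION_RE = re.compile(r"^\s*(KTAP|TAP) version (\d+)$")


    def parse_version_line(line):
        """Return (protocol, version) for a version line, or None otherwise."""
        match = VERSION_RE.match(line)
        if match is None:
            return None
        return match.group(1), int(match.group(2))

A tool built this way could accept ``("KTAP", 1)`` as well as ``("TAP", 13)``
and ``("TAP", 14)`` for compatibility with existing tests.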

Plan lines
----------

A test plan provides the number of tests (or subtests) in the KTAP output.

Plan lines must follow the format of "1..N" where N is the number of tests or subtests.
Plan lines follow version lines to indicate the number of nested tests.

While there are cases where the number of tests is not known in advance -- in
which case the test plan may be omitted -- it is strongly recommended one is
present where possible.
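
The "1..N" format can be checked with a short regular expression. The
following Python sketch is one possible approach; the helper name
``parse_plan_line`` is hypothetical.

.. code-block:: python

    import re

    # A plan line is "1..N", possibly indented when it belongs to a subtest.
    PLAN_RE = re.compile(r"^\s*1\.\.(\d+)$")


    def parse_plan_line(line):
        """Return the planned number of tests, or None if not a plan line."""
        match = PLAN_RE.match(line)
        return int(match.group(1)) if match else None

A parser could compare the returned count with the number of result lines it
sees at the same nesting level to detect missing or extra tests.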

Test case result lines
----------------------

Test case result lines indicate the final status of a test.
They are required and must have the format:

.. code-block:: none

	<result> <number> [<description>][ # [<directive>] [<diagnostic data>]]

The result can be either "ok", which indicates the test case passed,
or "not ok", which indicates that the test case failed.

<number> represents the number of the test being performed. The first test must
have the number 1 and the number then must increase by 1 for each additional
subtest within the same test at the same nesting level.

The description identifies the test, generally by name, and can be any
string of characters other than # or a newline.  The description is
optional, but recommended.

The directive and any diagnostic data are optional. If either is present, it
must follow a hash sign, "#".

A directive is a keyword that indicates an outcome for a test other than
passed or failed. The directive is optional, and consists of a single
keyword preceding the diagnostic data. In the event that a parser encounters
a directive it doesn't support, it should fall back to the "ok" / "not ok"
result.

Currently accepted directives are:

- "SKIP", which indicates a test was skipped (note the result of the test case
  result line can be either "ok" or "not ok" if the SKIP directive is used)
- "TODO", which indicates that a test is not expected to pass at the moment,
  e.g. because the feature it is testing is known to be broken. While this
  directive is inherited from TAP, its use in the kernel is discouraged.
- "XFAIL", which indicates that a test is expected to fail. This is similar
  to "TODO", above, and is used by some kselftest tests.
- "TIMEOUT", which indicates a test has timed out (note the result of the test
  case result line should be "not ok" if the TIMEOUT directive is used)
- "ERROR", which indicates that the execution of a test has failed due to a
  specific error that is included in the diagnostic data (note the result of
  the test case result line should be "not ok" if the ERROR directive is used)

The diagnostic data is a plain-text field which contains any additional details
about why this result was produced. This is typically an error message for ERROR
or failed tests, or a description of missing dependencies for a SKIP result.

The diagnostic data field is optional, and results which have neither a
directive nor any diagnostic data do not need to include the "#" field
separator.

Example result lines include::

	ok 1 test_case_name

The test "test_case_name" passed.

::

	not ok 1 test_case_name

The test "test_case_name" failed.

::

	ok 1 test # SKIP necessary dependency unavailable

The test "test" was SKIPPED with the diagnostic message "necessary dependency
unavailable".

::

	not ok 1 test # TIMEOUT 30 seconds

The test "test" timed out, with diagnostic data "30 seconds".

::

	ok 5 check return code # rcode=0

The test "check return code" passed, with additional diagnostic data "rcode=0".
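
To show how the fields of a result line fit together, here is a minimal Python
sketch of a result-line parser. The regular expression, the ``Result`` tuple
and the function name are assumptions made for this example. An unrecognized
first word after the "#" is treated as diagnostic data, so the parser falls
back to the plain "ok" / "not ok" result.

.. code-block:: python

    import re
    from typing import NamedTuple, Optional

    # Directives listed in this document.
    DIRECTIVES = {"SKIP", "TODO", "XFAIL", "TIMEOUT", "ERROR"}

    # <result> <number> [<description>][ # [<directive>] [<diagnostic data>]]
    RESULT_RE = re.compile(r"^\s*(ok|not ok) (\d+)(?: ([^#]*))?(?:#(.*))?$")


    class Result(NamedTuple):
        passed: bool
        number: int
        description: str
        directive: Optional[str]
        diagnostic: str


    def parse_result_line(line):
        """Return a Result for a test case result line, or None otherwise."""
        match = RESULT_RE.match(line)
        if match is None:
            return None
        status, number, description, trailer = match.groups()
        directive = None
        diagnostic = (trailer or "").strip()
        # The directive, if any, is the first word after the "#".
        first, _, rest = diagnostic.partition(" ")
        if first in DIRECTIVES:
            directive, diagnostic = first, rest.strip()
        return Result(status == "ok", int(number),
                      (description or "").strip(), directive, diagnostic)

For instance, under these assumptions the line
``ok 1 test # SKIP necessary dependency unavailable`` yields a passing result
numbered 1 with the directive ``SKIP``.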


Diagnostic lines
----------------

If tests wish to output any further information, they should do so using
"diagnostic lines". Diagnostic lines are optional, freeform text, and are
often used to describe what is being tested and any intermediate results in
more detail than the final result and diagnostic data line provides.

Diagnostic lines are formatted as "# <diagnostic_description>", where the
description can be any string.  Diagnostic lines can be anywhere in the test
output. As a rule, diagnostic lines regarding a test are directly before the
test result line for that test.

Note that most tools will treat unknown lines (see below) as diagnostic lines,
even if they do not start with a "#": this is to capture any other useful
kernel output which may help debug the test. It is nevertheless recommended
that tests always prefix any diagnostic output they have with a "#" character.
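
A test harness written in Python might emit diagnostic lines with a small
helper such as the one sketched below. The function name and its ``level``
parameter (the nesting depth, at two spaces per level) are hypothetical.

.. code-block:: python

    def format_diagnostic(message, level=0):
        """Prefix every line of message with "# " at the given nesting level."""
        indent = "  " * level
        return "\n".join(f"{indent}# {line}" for line in message.splitlines())

Splitting on newlines first ensures that each line of a multi-line message
carries its own "#" prefix.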

Unknown lines
-------------

There may be lines within KTAP output that do not follow any of the four line
formats described above. Such lines are allowed; however, they do not
influence the status of the tests.

This is an important difference from TAP.  Kernel tests may print messages
to the system console or a log file.  Both of these destinations may contain
messages either from unrelated kernel or userspace activity, or kernel
messages from non-test code that is invoked by the test.  The kernel code
invoked by the test likely is not aware that a test is in progress and
thus cannot print the message as a diagnostic message.
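
The following Python sketch shows one way a tool could sort lines into the
four KTAP line types and fall back to "unknown" for everything else. The
patterns are simplified assumptions for illustration, and the console message
used in the usage note is made up.

.. code-block:: python

    import re

    # Simplified patterns, tried in order; indentation is stripped beforehand.
    PATTERNS = [
        ("version", re.compile(r"^(KTAP|TAP) version \d+$")),
        ("plan", re.compile(r"^1\.\.\d+$")),
        ("result", re.compile(r"^(ok|not ok) \d+( .*)?$")),
        ("diagnostic", re.compile(r"^#.*$")),
    ]


    def classify_line(line):
        """Return the KTAP line type, or "unknown" if no pattern matches."""
        stripped = line.strip()
        for kind, pattern in PATTERNS:
            if pattern.match(stripped):
                return kind
        return "unknown"

With this sketch, a stray console message such as
``usb 1-1: new device found`` is classified as "unknown" and can be kept for
debugging without affecting any test status.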

Nested tests
------------

In KTAP, tests can be nested. This is done by having a test include within its
output an entire set of KTAP-formatted results. This can be used to categorize
and group related tests, or to split out different results from the same test.

The "parent" test's result should consist of all of its subtests' results,
starting with another KTAP version line and test plan, and end with the overall
result. If one of the subtests fails, for example, the parent test should also
fail.

Additionally, all lines in a subtest should be indented. One level of
indentation is two spaces: "  ". The indentation should begin at the version
line and should end before the parent test's result line.

"Unknown lines" are not considered to be lines in a subtest and thus are
allowed to be either indented or not indented.

An example of a test with two nested subtests:

::

	KTAP version 1
	1..1
	  KTAP version 1
	  1..2
	  ok 1 test_1
	  not ok 2 test_2
	# example failed
	not ok 1 example

An example format with multiple levels of nested testing:

::

	KTAP version 1
	1..2
	  KTAP version 1
	  1..2
	    KTAP version 1
	    1..2
	    not ok 1 test_1
	    ok 2 test_2
	  not ok 1 test_3
	  ok 2 test_4 # SKIP
	not ok 1 example_test_1
	ok 2 example_test_2
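
Because one level of indentation is two spaces, the nesting level of a line
can be derived from its leading spaces. The Python sketch below applies this
to the multi-level example above; the helper name ``nesting_level`` is
hypothetical.

.. code-block:: python

    def nesting_level(line, indent_width=2):
        """Return the nesting level, assuming two spaces per level."""
        leading_spaces = len(line) - len(line.lstrip(" "))
        return leading_spaces // indent_width


    # The multi-level example from this section, one string per line.
    EXAMPLE = [
        "KTAP version 1",
        "1..2",
        "  KTAP version 1",
        "  1..2",
        "    KTAP version 1",
        "    1..2",
        "    not ok 1 test_1",
        "    ok 2 test_2",
        "  not ok 1 test_3",
        "  ok 2 test_4 # SKIP",
        "not ok 1 example_test_1",
        "ok 2 example_test_2",
    ]

    # Pair each result line with its nesting level.
    levels = [
        (nesting_level(line), line.strip())
        for line in EXAMPLE
        if line.strip().startswith(("ok ", "not ok "))
    ]

Here ``test_1`` and ``test_2`` end up at level 2, ``test_3`` and ``test_4`` at
level 1, and the two top-level tests at level 0.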


Major differences between TAP and KTAP
--------------------------------------

==================================================   =========  ===============
Feature                                              TAP        KTAP
==================================================   =========  ===============
YAML and JSON in diagnostic message                  ok         not recommended
TODO directive                                       ok         not recognized
allows an arbitrary number of tests to be nested     no         yes
"Unknown lines" are in category of "Anything else"   yes        no
"Unknown lines" are                                  incorrect  allowed
==================================================   =========  ===============

The TAP14 specification does permit nested tests, but instead of using another
nested version line, uses a line of the form
"Subtest: <name>" where <name> is the name of the parent test.

Example KTAP output
--------------------
::

	KTAP version 1
	1..1
	  KTAP version 1
	  1..3
	    KTAP version 1
	    1..1
	    # test_1: initializing test_1
	    ok 1 test_1
	  ok 1 example_test_1
	    KTAP version 1
	    1..2
	    ok 1 test_1 # SKIP test_1 skipped
	    ok 2 test_2
	  ok 2 example_test_2
	    KTAP version 1
	    1..3
	    ok 1 test_1
	    # test_2: FAIL
	    not ok 2 test_2
	    ok 3 test_3 # SKIP test_3 skipped
	  not ok 3 example_test_3
	not ok 1 main_test

This output defines the following hierarchy:

A single test called "main_test", which fails, and has three subtests:

- "example_test_1", which passes, and has one subtest:

   - "test_1", which passes, and outputs the diagnostic message "test_1: initializing test_1"

- "example_test_2", which passes, and has two subtests:

   - "test_1", which is skipped, with the explanation "test_1 skipped"
   - "test_2", which passes

- "example_test_3", which fails, and has three subtests:

   - "test_1", which passes
   - "test_2", which outputs the diagnostic line "test_2: FAIL", and fails.
   - "test_3", which is skipped with the explanation "test_3 skipped"

Note that the individual subtests with the same names do not conflict, as they
are found in different parent tests. This output also exhibits some sensible
rules for "bubbling up" test results: a test fails if any of its subtests fail.
Skipped tests do not affect the result of the parent test (though it often
makes sense for a test to be marked skipped if *all* of its subtests have been
skipped).
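
The "bubbling up" rules described above can be sketched in a few lines of
Python. The status strings "pass", "fail" and "skip" and the function name
``parent_status`` are assumptions made for this example, and marking a parent
as skipped when all of its subtests are skipped is the optional convention
mentioned above rather than a requirement.

.. code-block:: python

    def parent_status(subtest_statuses):
        """Combine subtest statuses ("pass", "fail", "skip") for the parent."""
        # A test fails if any of its subtests fail.
        if "fail" in subtest_statuses:
            return "fail"
        # Optional convention: skipped only if every subtest was skipped.
        if subtest_statuses and all(s == "skip" for s in subtest_statuses):
            return "skip"
        # Otherwise skipped subtests do not affect the parent.
        return "pass"

Applied to the example output, ``["skip", "pass"]`` gives "pass" for
"example_test_2" and ``["pass", "fail", "skip"]`` gives "fail" for
"example_test_3".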

See also:
---------

- The TAP specification:
  https://testanything.org/tap-version-13-specification.html
- The (stagnant) TAP version 14 specification:
  https://github.com/TestAnything/Specification/blob/tap-14-specification/specification.md
- The kselftest documentation:
  Documentation/dev-tools/kselftest.rst
- The KUnit documentation:
  Documentation/dev-tools/kunit/index.rst