.. SPDX-License-Identifier: GPL-2.0

================
Page Table Check
================

Introduction
============

Page table check hardens the kernel by ensuring that some types of memory
corruption are prevented.

Page table check performs extra verification at the time when new pages become
accessible from userspace, that is, when their page table entries (PTEs, PMDs,
etc.) are added to the page table.

In most cases of detected corruption, the kernel is crashed. There is a small
performance and memory overhead associated with the page table check. Therefore,
it is disabled by default, but can be optionally enabled on systems where the
extra hardening outweighs the performance costs. Also, because page table check
is synchronous, it can help with debugging double map memory corruption issues
by crashing the kernel at the time the wrong mapping occurs, instead of later,
which is often the case with memory corruption bugs.

It can also be used to check page table entries for illegal combinations of
entry flags and to dump warnings when such combinations are detected. Currently,
userfaultfd is the only user of this facility; it sanity checks the wr-protect
bit against any writable flags. An illegal flag combination does not directly
cause data corruption in this case, but it makes read-only data writable,
leading to corruption when the page content is later modified.

Double mapping detection logic
==============================

+-------------------+-------------------+-------------------+------------------+
| Current Mapping   | New mapping       | Permissions       | Rule             |
+===================+===================+===================+==================+
| Anonymous         | Anonymous         | Read              | Allow            |
+-------------------+-------------------+-------------------+------------------+
| Anonymous         | Anonymous         | Read / Write      | Prohibit         |
+-------------------+-------------------+-------------------+------------------+
| Anonymous         | Named             | Any               | Prohibit         |
+-------------------+-------------------+-------------------+------------------+
| Named             | Anonymous         | Any               | Prohibit         |
+-------------------+-------------------+-------------------+------------------+
| Named             | Named             | Any               | Allow            |
+-------------------+-------------------+-------------------+------------------+

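The table above can be read as a small decision function. The following is a
minimal user-space sketch in C, not the kernel's implementation. The names
``map_kind`` and ``double_map_allowed`` are hypothetical, and ``writable`` is
assumed to correspond to the Permissions column (true for Read / Write).

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only; not kernel identifiers. */
enum map_kind { MAP_ANON, MAP_NAMED };

/*
 * Return true if a page that is currently mapped as 'cur' may gain an
 * additional mapping of kind 'new_kind', following the table above.
 * 'writable' is assumed to mirror the Permissions column.
 */
static bool double_map_allowed(enum map_kind cur, enum map_kind new_kind,
			       bool writable)
{
	if (cur != new_kind)
		return false;	/* anonymous <-> named: prohibit, any permissions */
	if (cur == MAP_NAMED)
		return true;	/* named + named: allow, any permissions */
	return !writable;	/* anonymous + anonymous: allow only when read-only */
}
```

For example, ``double_map_allowed(MAP_ANON, MAP_ANON, true)`` evaluates to
false, matching the second row of the table.
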
Enabling Page Table Check
=========================

Build the kernel with:

- PAGE_TABLE_CHECK=y

  Note that it can only be enabled on platforms where
  ARCH_SUPPORTS_PAGE_TABLE_CHECK is available.

Then boot with the 'page_table_check=on' kernel parameter.

Optionally, build the kernel with PAGE_TABLE_CHECK_ENFORCED in order to have
page table check enabled without the extra kernel parameter.

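As a sketch, the options above correspond to a configuration fragment like the
following. It assumes the usual ``CONFIG_`` prefix that Kconfig symbols carry
in a ``.config`` file; the second option is only needed when the check should
be active without the boot parameter.

```text
CONFIG_PAGE_TABLE_CHECK=y
# Optional: enable the check at boot without page_table_check=on
CONFIG_PAGE_TABLE_CHECK_ENFORCED=y
```

Without PAGE_TABLE_CHECK_ENFORCED, append ``page_table_check=on`` to the kernel
command line instead.
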
Implementation notes
====================

We specifically decided not to use VMA information in order to avoid relying on
MM states (except for limited "struct page" info). The page table check is a
state machine, separate from Linux-MM, that verifies that user-accessible pages
are not falsely shared.

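One way to picture such a state machine is a pair of per-page counters that are
updated whenever a user mapping is added or removed, independent of any VMA
data. The sketch below is a simplified user-space model in C under that
assumption. The structure and function names are hypothetical, and the
functions return false where the real check would crash the kernel.

```c
#include <stdbool.h>

/* Hypothetical per-page state; names are illustrative only. */
struct page_check_state {
	int anon_map_count;	/* live anonymous user mappings */
	int named_map_count;	/* live named user mappings */
};

/* Record a new user mapping; return false where the check would fire. */
static bool page_check_map(struct page_check_state *s, bool anon, bool writable)
{
	if (anon) {
		if (s->named_map_count > 0)
			return false;	/* page is already mapped as named */
		if (s->anon_map_count > 0 && writable)
			return false;	/* extra anonymous mapping is writable */
		s->anon_map_count++;
	} else {
		if (s->anon_map_count > 0)
			return false;	/* page is already mapped as anonymous */
		s->named_map_count++;
	}
	return true;
}

/* Record removal of a user mapping; return false on counter underflow. */
static bool page_check_unmap(struct page_check_state *s, bool anon)
{
	int *count = anon ? &s->anon_map_count : &s->named_map_count;

	if (*count == 0)
		return false;	/* nothing of this kind was mapped */
	(*count)--;
	return true;
}
```

In this model the state is consulted and updated only at map and unmap time,
which is why a violation would be reported at the moment the wrong mapping is
created rather than later.
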
PAGE_TABLE_CHECK depends on EXCLUSIVE_SYSTEM_RAM. The reason is that without
EXCLUSIVE_SYSTEM_RAM, users are allowed to map arbitrary physical memory
regions into userspace via /dev/mem. At the same time, pages may change their
properties (e.g., from anonymous pages to named pages) while they are still
mapped in userspace, leading to "corruption" detected by the page table check.

Even with EXCLUSIVE_SYSTEM_RAM, I/O pages may still be allowed to be mapped via
/dev/mem. However, these pages are always considered named pages, so they
won't break the logic used in the page table check.