Commit | Line | Data |
---|---|---|
2757aafa JC |
1 | The Kernel Address Sanitizer (KASAN) |
2 | ==================================== | |
3 | ||
4 | Overview | |
5 | -------- | |
6 | ||
b3b0e6ac | 7 | KernelAddressSANitizer (KASAN) is a dynamic memory error detector designed to |
948e3253 AK |
8 | find out-of-bound and use-after-free bugs. KASAN has three modes: |
9 | 1. generic KASAN (similar to userspace ASan), | |
10 | 2. software tag-based KASAN (similar to userspace HWASan), | |
11 | 3. hardware tag-based KASAN (based on hardware memory tagging). | |
2757aafa | 12 | |
948e3253 AK |
13 | Software KASAN modes (1 and 2) use compile-time instrumentation to insert |
14 | validity checks before every memory access, and therefore require a compiler | |
15 | version that supports that. | |
2757aafa | 16 | |
b3b0e6ac | 17 | Generic KASAN is supported in both GCC and Clang. With GCC it requires version |
527f6750 | 18 | 8.3.0 or later. Any supported Clang version is compatible, but detection of |
ac4766be | 19 | out-of-bounds accesses for global variables is only supported since Clang 11. |
b3b0e6ac | 20 | |
527f6750 | 21 | Tag-based KASAN is only supported in Clang. |
b3b0e6ac | 22 | |
ea01ce67 | 23 | Currently generic KASAN is supported for the x86_64, arm64, xtensa, s390 and |
948e3253 | 24 | riscv architectures, and tag-based KASAN modes are supported only for arm64. |
2757aafa JC |
25 | |
26 | Usage | |
27 | ----- | |
28 | ||
29 | To enable KASAN, configure the kernel with:: | |
30 | ||
31 | CONFIG_KASAN = y | |
32 | ||
948e3253 AK |
33 | and choose between CONFIG_KASAN_GENERIC (to enable generic KASAN), |
34 | CONFIG_KASAN_SW_TAGS (to enable software tag-based KASAN), and | |
35 | CONFIG_KASAN_HW_TAGS (to enable hardware tag-based KASAN). | |
b3b0e6ac | 36 | |
948e3253 AK |
37 | For software modes, you also need to choose between CONFIG_KASAN_OUTLINE and |
38 | CONFIG_KASAN_INLINE. Outline and inline are compiler instrumentation types. | |
39 | The former produces a smaller binary while the latter is 1.1 - 2 times faster. | |
2757aafa | 40 | |
948e3253 AK |
41 | Both software KASAN modes work with both SLUB and SLAB memory allocators, while |
42 | hardware tag-based KASAN currently only supports SLUB. | |
2757aafa JC |
43 | For better bug detection and nicer reporting, enable CONFIG_STACKTRACE. |
44 | ||
0fe9a448 VB |
45 | To augment reports with last allocation and freeing stack of the physical page, |
46 | it is also recommended to enable CONFIG_PAGE_OWNER and to boot with page_owner=on. | |
47 | ||
2757aafa JC |
48 | To disable instrumentation for specific files or directories, add a line |
49 | similar to the following to the respective kernel Makefile: | |
50 | ||
51 | - For a single file (e.g. main.o):: | |
52 | ||
53 | KASAN_SANITIZE_main.o := n | |
54 | ||
55 | - For all files in one directory:: | |
56 | ||
57 | KASAN_SANITIZE := n | |
58 | ||
59 | Error reports | |
60 | ~~~~~~~~~~~~~ | |
61 | ||
b3b0e6ac | 62 | A typical out-of-bounds access generic KASAN report looks like this:: |
2757aafa JC |
63 | |
64 | ================================================================== | |
b3b0e6ac AK |
65 | BUG: KASAN: slab-out-of-bounds in kmalloc_oob_right+0xa8/0xbc [test_kasan] |
66 | Write of size 1 at addr ffff8801f44ec37b by task insmod/2760 | |
67 | ||
68 | CPU: 1 PID: 2760 Comm: insmod Not tainted 4.19.0-rc3+ #698 | |
69 | Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 | |
2757aafa | 70 | Call Trace: |
b3b0e6ac AK |
71 | dump_stack+0x94/0xd8 |
72 | print_address_description+0x73/0x280 | |
73 | kasan_report+0x144/0x187 | |
74 | __asan_report_store1_noabort+0x17/0x20 | |
75 | kmalloc_oob_right+0xa8/0xbc [test_kasan] | |
76 | kmalloc_tests_init+0x16/0x700 [test_kasan] | |
77 | do_one_initcall+0xa5/0x3ae | |
78 | do_init_module+0x1b6/0x547 | |
79 | load_module+0x75df/0x8070 | |
80 | __do_sys_init_module+0x1c6/0x200 | |
81 | __x64_sys_init_module+0x6e/0xb0 | |
82 | do_syscall_64+0x9f/0x2c0 | |
83 | entry_SYSCALL_64_after_hwframe+0x44/0xa9 | |
84 | RIP: 0033:0x7f96443109da | |
85 | RSP: 002b:00007ffcf0b51b08 EFLAGS: 00000202 ORIG_RAX: 00000000000000af | |
86 | RAX: ffffffffffffffda RBX: 000055dc3ee521a0 RCX: 00007f96443109da | |
87 | RDX: 00007f96445cff88 RSI: 0000000000057a50 RDI: 00007f9644992000 | |
88 | RBP: 000055dc3ee510b0 R08: 0000000000000003 R09: 0000000000000000 | |
89 | R10: 00007f964430cd0a R11: 0000000000000202 R12: 00007f96445cff88 | |
90 | R13: 000055dc3ee51090 R14: 0000000000000000 R15: 0000000000000000 | |
91 | ||
92 | Allocated by task 2760: | |
93 | save_stack+0x43/0xd0 | |
94 | kasan_kmalloc+0xa7/0xd0 | |
95 | kmem_cache_alloc_trace+0xe1/0x1b0 | |
96 | kmalloc_oob_right+0x56/0xbc [test_kasan] | |
97 | kmalloc_tests_init+0x16/0x700 [test_kasan] | |
98 | do_one_initcall+0xa5/0x3ae | |
99 | do_init_module+0x1b6/0x547 | |
100 | load_module+0x75df/0x8070 | |
101 | __do_sys_init_module+0x1c6/0x200 | |
102 | __x64_sys_init_module+0x6e/0xb0 | |
103 | do_syscall_64+0x9f/0x2c0 | |
104 | entry_SYSCALL_64_after_hwframe+0x44/0xa9 | |
105 | ||
106 | Freed by task 815: | |
107 | save_stack+0x43/0xd0 | |
108 | __kasan_slab_free+0x135/0x190 | |
109 | kasan_slab_free+0xe/0x10 | |
110 | kfree+0x93/0x1a0 | |
111 | umh_complete+0x6a/0xa0 | |
112 | call_usermodehelper_exec_async+0x4c3/0x640 | |
113 | ret_from_fork+0x35/0x40 | |
114 | ||
115 | The buggy address belongs to the object at ffff8801f44ec300 | |
116 | which belongs to the cache kmalloc-128 of size 128 | |
117 | The buggy address is located 123 bytes inside of | |
118 | 128-byte region [ffff8801f44ec300, ffff8801f44ec380) | |
119 | The buggy address belongs to the page: | |
120 | page:ffffea0007d13b00 count:1 mapcount:0 mapping:ffff8801f7001640 index:0x0 | |
121 | flags: 0x200000000000100(slab) | |
122 | raw: 0200000000000100 ffffea0007d11dc0 0000001a0000001a ffff8801f7001640 | |
123 | raw: 0000000000000000 0000000080150015 00000001ffffffff 0000000000000000 | |
124 | page dumped because: kasan: bad access detected | |
125 | ||
2757aafa | 126 | Memory state around the buggy address: |
b3b0e6ac AK |
127 | ffff8801f44ec200: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb |
128 | ffff8801f44ec280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc | |
129 | >ffff8801f44ec300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 03 | |
130 | ^ | |
131 | ffff8801f44ec380: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb | |
132 | ffff8801f44ec400: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc | |
2757aafa JC |
133 | ================================================================== |
134 | ||
b3b0e6ac AK |
135 | The header of the report provides a short summary of what kind of bug happened |
136 | and what kind of access caused it. It's followed by a stack trace of the bad | |
137 | access, a stack trace of where the accessed memory was allocated (in case bad | |
138 | access happens on a slab object), and a stack trace of where the object was | |
139 | freed (in case of a use-after-free bug report). Next comes a description of | |
140 | the accessed slab object and information about the accessed memory page. | |
2757aafa JC |
141 | |
142 | In the last section the report shows memory state around the accessed address. | |
143 | Reading this part requires some understanding of how KASAN works. | |
144 | ||
145 | The state of each 8 aligned bytes of memory is encoded in one shadow byte. | |
146 | Those 8 bytes can be accessible, partially accessible, freed or be a redzone. | |
147 | We use the following encoding for each shadow byte: 0 means that all 8 bytes | |
148 | of the corresponding memory region are accessible; number N (1 <= N <= 7) means | |
149 | that the first N bytes are accessible, and other (8 - N) bytes are not; | |
150 | any negative value indicates that the entire 8-byte word is inaccessible. | |
151 | We use different negative values to distinguish between different kinds of | |
152 | inaccessible memory like redzones or freed memory (see mm/kasan/kasan.h). | |
153 | ||
154 | In the report above, the arrow points to the shadow byte 03, which means that | |
155 | the accessed address is partially accessible. | |
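
The encoding described above can be illustrated with a small userspace sketch. It is not kernel code: the helper names ``shadow_index_in_row`` and ``shadow_accessible_bytes`` are hypothetical, and the 16-shadow-bytes-per-row layout is simply read off the example report.

```c
#include <assert.h>
#include <stdint.h>

#define GRANULE_SIZE 8	/* bytes of memory described by one shadow byte */

/*
 * Position of the shadow byte for `addr` within a dump row that starts at
 * `row_base` (hypothetical helper, for illustration only).
 */
static unsigned int shadow_index_in_row(uint64_t row_base, uint64_t addr)
{
	assert(addr >= row_base);
	return (unsigned int)((addr - row_base) / GRANULE_SIZE);
}

/*
 * Number of leading bytes of an 8-byte granule that are accessible,
 * decoded from its shadow byte as described in the text.
 */
static int shadow_accessible_bytes(int8_t shadow)
{
	assert(shadow <= 7);		/* 8..127 are not valid encodings */
	if (shadow == 0)
		return GRANULE_SIZE;	/* the whole granule is accessible */
	if (shadow < 0)
		return 0;		/* redzone, freed memory, ... */
	return shadow;			/* only the first N bytes are accessible */
}
```

Applied to the report above, the address ffff8801f44ec37b maps to index 15 of the row starting at ffff8801f44ec300, and the shadow byte 03 found there means only offsets 0 to 2 of that granule are valid, while the access hit offset 3.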
156 | ||
b3b0e6ac AK |
157 | For tag-based KASAN this last report section shows the memory tags around the |
158 | accessed address (see Implementation details section). | |
159 | ||
2757aafa JC |
160 | |
161 | Implementation details | |
162 | ---------------------- | |
163 | ||
b3b0e6ac AK |
164 | Generic KASAN |
165 | ~~~~~~~~~~~~~ | |
166 | ||
2757aafa JC |
167 | From a high level, our approach to memory error detection is similar to that |
168 | of kmemcheck: use shadow memory to record whether each byte of memory is safe | |
b3b0e6ac AK |
169 | to access, and use compile-time instrumentation to insert checks of shadow |
170 | memory on each memory access. | |
2757aafa | 171 | |
b3b0e6ac AK |
172 | Generic KASAN dedicates 1/8th of kernel memory to its shadow memory (e.g. 16TB |
173 | to cover 128TB on x86_64) and uses direct mapping with a scale and offset to | |
174 | translate a memory address to its corresponding shadow address. | |
2757aafa JC |
175 | |
176 | Here is the function which translates an address to its corresponding shadow | |
177 | address:: | |
178 | ||
179 | static inline void *kasan_mem_to_shadow(const void *addr) | |
180 | { | |
181 | return (void *)((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT) | |
182 | + KASAN_SHADOW_OFFSET; | |
183 | } | |
184 | ||
185 | where ``KASAN_SHADOW_SCALE_SHIFT = 3``. | |
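
As a rough userspace illustration of the scale-and-offset mapping, the sketch below works on plain 64-bit integers instead of kernel pointers. The offset value is an assumption (the one commonly configured for x86_64); the real ``KASAN_SHADOW_OFFSET`` depends on the architecture and configuration, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define KASAN_SHADOW_SCALE_SHIFT 3
/* Assumed example value; the real offset is arch- and config-specific. */
#define EXAMPLE_SHADOW_OFFSET 0xdffffc0000000000ULL

/* Shadow address for a memory address: scale down by 8, then add the offset. */
static uint64_t mem_to_shadow(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + EXAMPLE_SHADOW_OFFSET;
}

/* Amount of shadow memory needed to cover `mem_size` bytes of memory. */
static uint64_t shadow_size(uint64_t mem_size)
{
	/* Expect a whole number of 8-byte granules. */
	assert(mem_size % (1ULL << KASAN_SHADOW_SCALE_SHIFT) == 0);
	return mem_size >> KASAN_SHADOW_SCALE_SHIFT;
}
```

With these definitions, ``shadow_size(128ULL << 40)`` yields ``16ULL << 40``, matching the 16TB-for-128TB figure above, and all 8 addresses of one granule map to the same shadow byte.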
186 | ||
b3b0e6ac AK |
187 | Compile-time instrumentation is used to insert memory access checks. The compiler |
188 | inserts function calls (__asan_load*(addr), __asan_store*(addr)) before each | |
189 | memory access of size 1, 2, 4, 8 or 16. These functions check whether a memory | |
190 | access is valid by checking the corresponding shadow memory. | |
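
The kind of check such a callback performs can be sketched as follows. This is a simplified userspace model, not the kernel's implementation: the function name ``access_is_valid`` is hypothetical, the shadow byte is passed in directly rather than loaded from shadow memory, and the access is assumed not to cross an 8-byte granule boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE_SIZE 8	/* bytes of memory covered by one shadow byte */

/*
 * Decide whether an access of `size` bytes at `addr` is valid, given the
 * shadow byte of the granule that contains it.
 */
static bool access_is_valid(uint64_t addr, size_t size, int8_t shadow)
{
	uint64_t first = addr & (GRANULE_SIZE - 1);	/* offset inside granule */

	/* Simplification: the access must fit inside a single granule. */
	assert(size >= 1 && first + size <= GRANULE_SIZE);

	if (shadow == 0)
		return true;	/* all 8 bytes are accessible */
	if (shadow < 0)
		return false;	/* the whole granule is inaccessible */

	/* Partially accessible: the access must end within the first N bytes. */
	return first + size <= (uint64_t)shadow;
}
```

For the 1-byte write in the example report, ``access_is_valid(0xffff8801f44ec37b, 1, 3)`` is false, while the same write one byte earlier would be accepted.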
2757aafa JC |
191 | |
192 | GCC 5.0 has the ability to perform inline instrumentation. Instead of making | |
193 | function calls, GCC directly inserts the code to check the shadow memory. | |
194 | This option significantly enlarges the kernel, but it gives a x1.1-x2 performance | |
195 | boost over an outline-instrumented kernel. | |
b3b0e6ac | 196 | |
4784be28 WW |
197 | Generic KASAN also reports the last 2 call stacks of the creation of work that |
198 | potentially has access to an object. Call stacks for the following are shown: | |
199 | call_rcu() and workqueue queuing. | |
9793b626 | 200 | |
b3b0e6ac AK |
201 | Software tag-based KASAN |
202 | ~~~~~~~~~~~~~~~~~~~~~~~~ | |
203 | ||
948e3253 AK |
204 | Software tag-based KASAN requires software memory tagging support in the form |
205 | of HWASan-like compiler instrumentation (see HWASan documentation for details). | |
206 | ||
207 | Software tag-based KASAN is currently only implemented for the arm64 architecture. | |
208 | ||
209 | Software tag-based KASAN uses the Top Byte Ignore (TBI) feature of arm64 CPUs | |
210 | to store a pointer tag in the top byte of kernel pointers. Like generic KASAN | |
211 | it uses shadow memory to store memory tags associated with each 16-byte memory | |
b3b0e6ac AK |
212 | cell (therefore it dedicates 1/16th of the kernel memory for shadow memory). |
213 | ||
948e3253 AK |
214 | On each memory allocation software tag-based KASAN generates a random tag, tags |
215 | the allocated memory with this tag, and embeds this tag into the returned | |
216 | pointer. | |
217 | ||
b3b0e6ac AK |
218 | Software tag-based KASAN uses compile-time instrumentation to insert checks |
219 | before each memory access. These checks make sure that the tag of the memory that | |
220 | is being accessed is equal to the tag of the pointer that is used to access this | |
948e3253 | 221 | memory. In case of a tag mismatch, software tag-based KASAN prints a bug report. |
b3b0e6ac AK |
222 | |
223 | Software tag-based KASAN also has two instrumentation modes (outline, that | |
224 | emits callbacks to check memory accesses; and inline, that performs the shadow | |
225 | memory checks inline). With outline instrumentation mode, a bug report is | |
226 | simply printed from the function that performs the access check. With inline | |
227 | instrumentation a brk instruction is emitted by the compiler, and a dedicated | |
228 | brk handler is used to print bug reports. | |
229 | ||
948e3253 AK |
230 | Software tag-based KASAN uses 0xFF as a match-all pointer tag (accesses through |
231 | pointers with 0xFF pointer tag aren't checked). The value 0xFE is currently | |
232 | reserved to tag freed memory regions. | |
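
The tagging scheme can be modelled in userspace with a toy arena. Everything below is a sketch under stated assumptions: the arena, its zero-initialised shadow array and the helper names are hypothetical, "pointers" are plain arena offsets with a tag in bits 56-63, and tags are passed in explicitly instead of being generated randomly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TAG_SHIFT	56			/* tag lives in the top byte */
#define TAG_MASK	((uint64_t)0xFF << TAG_SHIFT)
#define TAG_MATCH_ALL	0xFF			/* accesses are not checked */
#define TAG_FREED	0xFE			/* reserved for freed memory */
#define GRANULE		16			/* bytes per memory tag */
#define ARENA_SIZE	256			/* toy arena, hypothetical */

static uint8_t shadow[ARENA_SIZE / GRANULE];	/* one tag per 16-byte cell */

static uint64_t set_tag(uint64_t ptr, uint8_t tag)
{
	return (ptr & ~TAG_MASK) | ((uint64_t)tag << TAG_SHIFT);
}

static uint8_t get_tag(uint64_t ptr)
{
	return (uint8_t)(ptr >> TAG_SHIFT);
}

/* Tag the cells covering [offset, offset + size) and return a tagged pointer. */
static uint64_t tag_region(uint64_t offset, uint64_t size, uint8_t tag)
{
	assert(offset % GRANULE == 0 && size > 0 && offset + size <= ARENA_SIZE);
	for (uint64_t i = offset / GRANULE; i < (offset + size + GRANULE - 1) / GRANULE; i++)
		shadow[i] = tag;
	return set_tag(offset, tag);
}

/* The check made before an access: pointer tag must equal memory tag. */
static bool access_ok(uint64_t ptr)
{
	uint64_t offset = ptr & ~TAG_MASK;

	if (get_tag(ptr) == TAG_MATCH_ALL)
		return true;	/* match-all pointers are never checked */
	return offset < ARENA_SIZE && shadow[offset / GRANULE] == get_tag(ptr);
}
```

In this model, a pointer returned by ``tag_region(32, 32, 0x2A)`` passes the check for offsets 32 to 63, fails one byte past the region, and fails everywhere once the region is retagged with ``TAG_FREED``.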
233 | ||
234 | Software tag-based KASAN currently only supports tagging of | |
235 | kmem_cache_alloc/kmalloc and page_alloc memory. | |
236 | ||
237 | Hardware tag-based KASAN | |
238 | ~~~~~~~~~~~~~~~~~~~~~~~~ | |
239 | ||
240 | Hardware tag-based KASAN is similar to the software mode in concept, but uses | |
241 | hardware memory tagging support instead of compiler instrumentation and | |
242 | shadow memory. | |
243 | ||
244 | Hardware tag-based KASAN is currently only implemented for the arm64 architecture | |
245 | and is based on both the arm64 Memory Tagging Extension (MTE), introduced in the | |
246 | ARMv8.5 Instruction Set Architecture, and Top Byte Ignore (TBI). | |
247 | ||
248 | Special arm64 instructions are used to assign memory tags for each allocation. | |
249 | The same tags are assigned to pointers to those allocations. On every memory | |
250 | access, the hardware makes sure that the tag of the memory that is being accessed | |
251 | is equal to the tag of the pointer that is used to access this memory. In case of | |
252 | a tag mismatch, a fault is generated and a report is printed. | |
253 | ||
254 | Hardware tag-based KASAN uses 0xFF as a match-all pointer tag (accesses through | |
255 | pointers with 0xFF pointer tag aren't checked). The value 0xFE is currently | |
256 | reserved to tag freed memory regions. | |
257 | ||
258 | Hardware tag-based KASAN currently only supports tagging of | |
259 | kmem_cache_alloc/kmalloc and page_alloc memory. | |
3c5c3cfb DA |
260 | |
261 | What memory accesses are sanitised by KASAN? | |
262 | -------------------------------------------- | |
263 | ||
264 | The kernel maps memory in a number of different parts of the address | |
265 | space. This poses something of a problem for KASAN, which requires | |
266 | that all addresses accessed by instrumented code have a valid shadow | |
267 | region. | |
268 | ||
269 | The range of kernel virtual addresses is large: there is not enough | |
270 | real memory to support a real shadow region for every address that | |
271 | could be accessed by the kernel. | |
272 | ||
273 | By default | |
274 | ~~~~~~~~~~ | |
275 | ||
276 | By default, architectures only map real memory over the shadow region | |
277 | for the linear mapping (and potentially other small areas). For all | |
278 | other areas - such as vmalloc and vmemmap space - a single read-only | |
279 | page is mapped over the shadow area. This read-only shadow page | |
280 | declares all memory accesses as permitted. | |
281 | ||
282 | This presents a problem for modules: they do not live in the linear | |
283 | mapping, but in a dedicated module space. By hooking in to the module | |
284 | allocator, KASAN can temporarily map real shadow memory to cover | |
285 | them. This allows detection of invalid accesses to module globals, for | |
286 | example. | |
287 | ||
288 | This also creates an incompatibility with ``VMAP_STACK``: if the stack | |
289 | lives in vmalloc space, it will be shadowed by the read-only page, and | |
290 | the kernel will fault when trying to set up the shadow data for stack | |
291 | variables. | |
292 | ||
293 | CONFIG_KASAN_VMALLOC | |
294 | ~~~~~~~~~~~~~~~~~~~~ | |
295 | ||
296 | With ``CONFIG_KASAN_VMALLOC``, KASAN can cover vmalloc space at the | |
297 | cost of greater memory usage. Currently this is only supported on x86. | |
298 | ||
299 | This works by hooking into vmalloc and vmap, and dynamically | |
300 | allocating real shadow memory to back the mappings. | |
301 | ||
302 | Most mappings in vmalloc space are small, requiring less than a full | |
303 | page of shadow space. Allocating a full shadow page per mapping would | |
304 | therefore be wasteful. Furthermore, to ensure that different mappings | |
305 | use different shadow pages, mappings would have to be aligned to | |
1f600626 | 306 | ``KASAN_GRANULE_SIZE * PAGE_SIZE``. |
3c5c3cfb DA |
307 | |
308 | Instead, we share backing space across multiple mappings. We allocate | |
309 | a backing page when a mapping in vmalloc space uses a particular page | |
310 | of the shadow region. This page can be shared by other vmalloc | |
311 | mappings later on. | |
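
The arithmetic behind this sharing can be sketched in userspace. The sketch assumes 4 KiB pages, an 8-byte granule (generic KASAN) and a page-aligned shadow offset, so that one shadow page covers a naturally aligned ``KASAN_GRANULE_SIZE * PAGE_SIZE`` window of addresses; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KASAN_GRANULE_SIZE	8ULL
#define PAGE_SIZE		4096ULL		/* assumed page size */
/* Address range covered by a single page of shadow memory (32 KiB here). */
#define SHADOW_PAGE_SPAN	(KASAN_GRANULE_SIZE * PAGE_SIZE)

/* Relative index of the shadow page that covers `addr`. */
static uint64_t shadow_page_index(uint64_t addr)
{
	return addr / SHADOW_PAGE_SPAN;
}

/* Number of shadow pages needed to back the mapping [start, start + size). */
static uint64_t shadow_pages_touched(uint64_t start, uint64_t size)
{
	assert(size > 0);
	return shadow_page_index(start + size - 1) - shadow_page_index(start) + 1;
}

/* True if two mappings have at least one shadow page in common. */
static bool mappings_share_shadow_page(uint64_t a_start, uint64_t a_size,
				       uint64_t b_start, uint64_t b_size)
{
	assert(a_size > 0 && b_size > 0);
	return shadow_page_index(a_start + a_size - 1) >= shadow_page_index(b_start) &&
	       shadow_page_index(b_start + b_size - 1) >= shadow_page_index(a_start);
}
```

Under these assumptions, two single-page mappings at offsets 0x0 and 0x2000 fall into the same 32 KiB window and would be backed by the same shadow page, whereas a mapping at 0x8000 needs a different one.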
312 | ||
313 | We hook in to the vmap infrastructure to lazily clean up unused shadow | |
314 | memory. | |
315 | ||
316 | To avoid the difficulties around swapping mappings around, we expect | |
317 | that the part of the shadow region that covers the vmalloc space will | |
318 | not be covered by the early shadow page, but will be left | |
319 | unmapped. This will require changes in arch-specific code. | |
320 | ||
321 | This allows ``VMAP_STACK`` support on x86, and can simplify support of | |
322 | architectures that do not have a fixed module region. | |
9ab5be97 PA |
323 | |
324 | CONFIG_KASAN_KUNIT_TEST & CONFIG_TEST_KASAN_MODULE | |
325 | -------------------------------------------------- | |
326 | ||
327 | ``CONFIG_KASAN_KUNIT_TEST`` utilizes the KUnit Test Framework for testing. | |
328 | This means each test focuses on a small unit of functionality and | |
329 | there are a few ways these tests can be run. | |
330 | ||
331 | Each test will print the KASAN report if an error is detected and then | |
332 | print the number of the test and the status of the test: | |
333 | ||
334 | pass:: | |
335 | ||
336 | ok 28 - kmalloc_double_kzfree | |
32519c03 | 337 | |
9ab5be97 PA |
338 | or, if kmalloc failed:: |
339 | ||
340 | # kmalloc_large_oob_right: ASSERTION FAILED at lib/test_kasan.c:163 | |
341 | Expected ptr is not null, but is | |
342 | not ok 4 - kmalloc_large_oob_right | |
32519c03 | 343 | |
9ab5be97 PA |
344 | or, if a KASAN report was expected, but not found:: |
345 | ||
346 | # kmalloc_double_kzfree: EXPECTATION FAILED at lib/test_kasan.c:629 | |
347 | Expected kasan_data->report_expected == kasan_data->report_found, but | |
348 | kasan_data->report_expected == 1 | |
349 | kasan_data->report_found == 0 | |
350 | not ok 28 - kmalloc_double_kzfree | |
351 | ||
352 | All test statuses are tracked as they run and an overall status will | |
353 | be printed at the end:: | |
354 | ||
355 | ok 1 - kasan | |
356 | ||
357 | or:: | |
358 | ||
359 | not ok 1 - kasan | |
360 | ||
361 | (1) Loadable Module | |
362 | ~~~~~~~~~~~~~~~~~~~~ | |
363 | ||
364 | With ``CONFIG_KUNIT`` enabled, ``CONFIG_KASAN_KUNIT_TEST`` can be built as | |
365 | a loadable module and run on any architecture that supports KASAN | |
366 | using something like insmod or modprobe. The module is called ``test_kasan``. | |
367 | ||
368 | (2) Built-In | |
369 | ~~~~~~~~~~~~~ | |
370 | ||
371 | With ``CONFIG_KUNIT`` built-in, ``CONFIG_KASAN_KUNIT_TEST`` can be built-in | |
1a37e18b | 372 | on any architecture that supports KASAN. These and any other KUnit |
9ab5be97 PA |
373 | tests enabled will run and print the results at boot as a late-init |
374 | call. | |
375 | ||
376 | (3) Using kunit_tool | |
377 | ~~~~~~~~~~~~~~~~~~~~~ | |
378 | ||
379 | With ``CONFIG_KUNIT`` and ``CONFIG_KASAN_KUNIT_TEST`` built-in, we can also | |
380 | use kunit_tool to see the results of these along with other KUnit | |
381 | tests in a more readable way. This will not print the KASAN reports | |
382 | of tests that passed. See `KUnit documentation <https://www.kernel.org/doc/html/latest/dev-tools/kunit/index.html>`_ for more up-to-date | |
383 | information on kunit_tool. | |
384 | ||
385 | .. _KUnit: https://www.kernel.org/doc/html/latest/dev-tools/kunit/index.html | |
386 | ||
387 | ``CONFIG_TEST_KASAN_MODULE`` is a set of KASAN tests that could not be | |
388 | converted to KUnit. These tests can be run only as a module with | |
389 | ``CONFIG_TEST_KASAN_MODULE`` built as a loadable module and | |
390 | ``CONFIG_KASAN`` built-in. The type of error expected and the | |
391 | function being run is printed before the expression expected to give | |
392 | an error. Then the error is printed, if found, and that test | |
1a37e18b | 393 | should be interpreted to pass only if the error was the one expected |
9ab5be97 | 394 | by the test. |