Commit | Line | Data |
---|---|---|
b88679d2 CD |
1 | .. SPDX-License-Identifier: GPL-2.0 |
2 | ||
3 | ================= | |
4 | Memory Management | |
5 | ================= | |
6 | ||
7 | Complete virtual memory map with 4-level page tables | |
8 | ==================================================== | |
9 | ||
10 | .. note:: | |
11 | ||
12 | - Negative addresses such as "-23 TB" are absolute addresses in bytes, counted down | |
13 | from the top of the 64-bit address space. It's easier to understand the layout | |
14 | when seen both in absolute addresses and in distance-from-top notation. | |
15 | ||
16 | For example 0xffffe90000000000 == -23 TB, it's 23 TB lower than the top of the | |
17 | 64-bit address space (ffffffffffffffff). | |
18 | ||
19 | Note that as we get closer to the top of the address space, the notation changes | |
20 | from TB to GB and then MB/KB. | |
21 | ||
22 | - "16M TB" might look weird at first sight, but it's an easier-to-visualize size | |
23 | notation than "16 EB", which few will recognize at first sight as 16 exabytes. | |
24 | It also shows nicely how incredibly large the 64-bit address space is. | |
25 | ||
26 | :: | |
27 | ||
28 | ======================================================================================================================== | |
29 | Start addr | Offset | End addr | Size | VM area description | |
30 | ======================================================================================================================== | |
31 | | | | | | |
32 | 0000000000000000 | 0 | 00007fffffffffff | 128 TB | user-space virtual memory, different per mm | |
33 | __________________|____________|__________________|_________|___________________________________________________________ | |
34 | | | | | | |
35 | 0000800000000000 | +128 TB | ffff7fffffffffff | ~16M TB | ... huge, almost 64 bits wide hole of non-canonical | |
36 | | | | | virtual memory addresses up to the -128 TB | |
37 | | | | | starting offset of kernel mappings. | |
38 | __________________|____________|__________________|_________|___________________________________________________________ | |
39 | | | |
40 | | Kernel-space virtual memory, shared between all processes: | |
41 | ____________________________________________________________|___________________________________________________________ | |
42 | | | | | | |
43 | ffff800000000000 | -128 TB | ffff87ffffffffff | 8 TB | ... guard hole, also reserved for hypervisor | |
44 | ffff880000000000 | -120 TB | ffff887fffffffff | 0.5 TB | LDT remap for PTI | |
45 | ffff888000000000 | -119.5 TB | ffffc87fffffffff | 64 TB | direct mapping of all physical memory (page_offset_base) | |
46 | ffffc88000000000 | -55.5 TB | ffffc8ffffffffff | 0.5 TB | ... unused hole | |
47 | ffffc90000000000 | -55 TB | ffffe8ffffffffff | 32 TB | vmalloc/ioremap space (vmalloc_base) | |
48 | ffffe90000000000 | -23 TB | ffffe9ffffffffff | 1 TB | ... unused hole | |
49 | ffffea0000000000 | -22 TB | ffffeaffffffffff | 1 TB | virtual memory map (vmemmap_base) | |
50 | ffffeb0000000000 | -21 TB | ffffebffffffffff | 1 TB | ... unused hole | |
51 | ffffec0000000000 | -20 TB | fffffbffffffffff | 16 TB | KASAN shadow memory | |
52 | __________________|____________|__________________|_________|____________________________________________________________ | |
53 | | | |
54 | | Identical layout to the 56-bit one from here on: | |
55 | ____________________________________________________________|____________________________________________________________ | |
56 | | | | | | |
57 | fffffc0000000000 | -4 TB | fffffdffffffffff | 2 TB | ... unused hole | |
58 | | | | | vaddr_end for KASLR | |
59 | fffffe0000000000 | -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping | |
60 | fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | ... unused hole | |
61 | ffffff0000000000 | -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks | |
62 | ffffff8000000000 | -512 GB | ffffffeeffffffff | 444 GB | ... unused hole | |
63 | ffffffef00000000 | -68 GB | fffffffeffffffff | 64 GB | EFI region mapping space | |
64 | ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | ... unused hole | |
65 | ffffffff80000000 | -2 GB | ffffffff9fffffff | 512 MB | kernel text mapping, mapped to physical address 0 | |
66 | ffffffff80000000 |-2048 MB | | | | |
67 | ffffffffa0000000 |-1536 MB | fffffffffeffffff | 1520 MB | module mapping space | |
68 | ffffffffff000000 | -16 MB | | | | |
69 | FIXADDR_START | ~-11 MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset | |
70 | ffffffffff600000 | -10 MB | ffffffffff600fff | 4 kB | legacy vsyscall ABI | |
71 | ffffffffffe00000 | -2 MB | ffffffffffffffff | 2 MB | ... unused hole | |
72 | __________________|____________|__________________|_________|___________________________________________________________ | |
73 | ||
74 | ||
75 | Complete virtual memory map with 5-level page tables | |
76 | ==================================================== | |
77 | ||
78 | .. note:: | |
79 | ||
80 | - With 56-bit addresses, user-space memory gets expanded by a factor of 512x, | |
1fb3b526 | 81 | from 0.125 PB to 64 PB. All kernel mappings shift down to the -64 PB starting |
b88679d2 CD |
82 | offset and many of the regions expand to support the much larger physical |
83 | memory supported. | |
84 | ||
85 | :: | |
86 | ||
87 | ======================================================================================================================== | |
88 | Start addr | Offset | End addr | Size | VM area description | |
89 | ======================================================================================================================== | |
90 | | | | | | |
91 | 0000000000000000 | 0 | 00ffffffffffffff | 64 PB | user-space virtual memory, different per mm | |
92 | __________________|____________|__________________|_________|___________________________________________________________ | |
93 | | | | | | |
1fb3b526 | 94 | 0100000000000000 | +64 PB | feffffffffffffff | ~16K PB | ... huge, still almost 64 bits wide hole of non-canonical |
b88679d2 CD |
95 | | | | | virtual memory addresses up to the -64 PB |
96 | | | | | starting offset of kernel mappings. | |
97 | __________________|____________|__________________|_________|___________________________________________________________ | |
98 | | | |
99 | | Kernel-space virtual memory, shared between all processes: | |
100 | ____________________________________________________________|___________________________________________________________ | |
101 | | | | | | |
102 | ff00000000000000 | -64 PB | ff0fffffffffffff | 4 PB | ... guard hole, also reserved for hypervisor | |
103 | ff10000000000000 | -60 PB | ff10ffffffffffff | 0.25 PB | LDT remap for PTI | |
104 | ff11000000000000 | -59.75 PB | ff90ffffffffffff | 32 PB | direct mapping of all physical memory (page_offset_base) | |
105 | ff91000000000000 | -27.75 PB | ff9fffffffffffff | 3.75 PB | ... unused hole | |
106 | ffa0000000000000 | -24 PB | ffd1ffffffffffff | 12.5 PB | vmalloc/ioremap space (vmalloc_base) | |
107 | ffd2000000000000 | -11.5 PB | ffd3ffffffffffff | 0.5 PB | ... unused hole | |
108 | ffd4000000000000 | -11 PB | ffd5ffffffffffff | 0.5 PB | virtual memory map (vmemmap_base) | |
109 | ffd6000000000000 | -10.5 PB | ffdeffffffffffff | 2.25 PB | ... unused hole | |
1fb3b526 | 110 | ffdf000000000000 | -8.25 PB | fffffbffffffffff | ~8 PB | KASAN shadow memory |
b88679d2 CD |
111 | __________________|____________|__________________|_________|____________________________________________________________ |
112 | | | |
113 | | Identical layout to the 47-bit one from here on: | |
114 | ____________________________________________________________|____________________________________________________________ | |
115 | | | | | | |
116 | fffffc0000000000 | -4 TB | fffffdffffffffff | 2 TB | ... unused hole | |
117 | | | | | vaddr_end for KASLR | |
118 | fffffe0000000000 | -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping | |
119 | fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | ... unused hole | |
120 | ffffff0000000000 | -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks | |
121 | ffffff8000000000 | -512 GB | ffffffeeffffffff | 444 GB | ... unused hole | |
122 | ffffffef00000000 | -68 GB | fffffffeffffffff | 64 GB | EFI region mapping space | |
123 | ffffffff00000000 | -4 GB | ffffffff7fffffff | 2 GB | ... unused hole | |
124 | ffffffff80000000 | -2 GB | ffffffff9fffffff | 512 MB | kernel text mapping, mapped to physical address 0 | |
125 | ffffffff80000000 |-2048 MB | | | | |
126 | ffffffffa0000000 |-1536 MB | fffffffffeffffff | 1520 MB | module mapping space | |
127 | ffffffffff000000 | -16 MB | | | | |
128 | FIXADDR_START | ~-11 MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset | |
129 | ffffffffff600000 | -10 MB | ffffffffff600fff | 4 kB | legacy vsyscall ABI | |
130 | ffffffffffe00000 | -2 MB | ffffffffffffffff | 2 MB | ... unused hole | |
131 | __________________|____________|__________________|_________|___________________________________________________________ | |
132 | ||
133 | The architecture defines a 64-bit virtual address. Implementations can support | |
134 | fewer bits. Currently supported are 48- and 57-bit virtual addresses. Bits 63 | |
135 | through to the most-significant implemented bit are sign extended. | |
136 | This causes a hole between user-space and kernel addresses if you interpret them | |
137 | as unsigned. | |
138 | ||
139 | The direct mapping covers all memory in the system up to the highest | |
140 | memory address (this means in some cases it can also include PCI memory | |
141 | holes). | |
142 | ||
143 | vmalloc space is lazily synchronized into the different PML4/PML5 pages of | |
144 | the processes using the page fault handler, with init_top_pgt as | |
145 | reference. | |
146 | ||
147 | We map EFI runtime services in the 'efi_pgd' PGD in a 64 GB virtual | |
148 | memory window (this size is arbitrary, it can be raised later if needed). | |
149 | The mappings are not part of any other kernel PGD and are only available | |
150 | during EFI runtime calls. | |
151 | ||
152 | Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all | |
153 | physical memory, vmalloc/ioremap space and virtual memory map are randomized. | |
154 | Their order is preserved but their base will be offset early at boot time. | |
155 | ||
156 | Be very careful vs. KASLR when changing anything here. The KASLR address | |
157 | range must not overlap with anything except the KASAN shadow area, which is | |
158 | correct as KASAN disables KASLR. | |
159 | ||
160 | For both 4- and 5-level layouts, the STACKLEAK_POISON value lies in the last 2 MB | |
161 | hole: ffffffffffff4111 |
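
The sign-extension rule and the distance-from-top notation used in the tables can be checked with a few lines of arithmetic. The following is a minimal sketch, not kernel code: the helper names `to_signed64`, `is_canonical` and `offset_tb` are hypothetical and exist only for this illustration. It assumes that an address is canonical when bits 63 down to the most-significant implemented bit are all equal, and that the "Offset" column is the address read as a signed 64-bit integer.

```python
# Illustrative sketch only; these helpers are hypothetical, not kernel APIs.

def to_signed64(addr: int) -> int:
    """Reinterpret an unsigned 64-bit address as a signed 64-bit integer."""
    return addr - (1 << 64) if addr & (1 << 63) else addr


def is_canonical(addr: int, va_bits: int) -> bool:
    """True if bits 63..(va_bits - 1) are all equal, i.e. sign extended."""
    upper = addr >> (va_bits - 1)              # top implemented bit and above
    all_ones = (1 << (64 - va_bits + 1)) - 1   # the same bits, all set
    return upper == 0 or upper == all_ones


def offset_tb(addr: int) -> float:
    """Signed distance from the top of the address space, in TB."""
    return to_signed64(addr) / (1 << 40)


if __name__ == "__main__":
    # Edges of the non-canonical hole with 48-bit virtual addresses.
    print(is_canonical(0x00007FFFFFFFFFFF, 48))   # True
    print(is_canonical(0x0000800000000000, 48))   # False
    print(is_canonical(0xFFFF800000000000, 48))   # True
    # The example from the note: 0xffffe90000000000 == -23 TB.
    print(offset_tb(0xFFFFE90000000000))          # -23.0
    # STACKLEAK_POISON falls inside the last 2 MB hole.
    print(0xFFFFFFFFFFFF4111 >= 0xFFFFFFFFFFE00000)  # True
```

Passing `va_bits=57` applies the same check to the 5-level layout. For example, `ff00000000000000` is canonical with 57-bit addresses but not with 48-bit ones.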