======================
Asymmetric 32-bit SoCs
======================

Author: Will Deacon <will@kernel.org>

This document describes the impact of asymmetric 32-bit SoCs on the
execution of 32-bit (``AArch32``) applications.

Date: 2021-05-17

Introduction
============

Some Armv9 SoCs suffer from a big.LITTLE misfeature where only a subset
of the CPUs are capable of executing 32-bit user applications. On such
a system, Linux by default treats the asymmetry as a "mismatch" and
disables support for both the ``PER_LINUX32`` personality and
``execve(2)`` of 32-bit ELF binaries, with the latter returning
``-ENOEXEC``. If the mismatch is detected during late onlining of a
64-bit-only CPU, then the onlining operation fails and the new CPU is
unavailable for scheduling.

Surprisingly, these SoCs have been produced with the intention of
running legacy 32-bit binaries. Unsurprisingly, that doesn't work very
well with the default behaviour of Linux.

It seems inevitable that future SoCs will drop 32-bit support
altogether, so if you're stuck in the unenviable position of needing to
run 32-bit code on one of these transitionary platforms then you would
be wise to consider alternatives such as recompilation, emulation or
retirement. If none of those options is practical, then read on.

Enabling kernel support
=======================

Since the kernel support is not completely transparent to userspace,
allowing 32-bit tasks to run on an asymmetric 32-bit system requires an
explicit "opt-in" and can be enabled by passing the
``allow_mismatched_32bit_el0`` parameter on the kernel command-line.

For the remainder of this document we will refer to an *asymmetric
system* to mean an asymmetric 32-bit SoC running Linux with this kernel
command-line option enabled.

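As a rough illustration, userspace could check whether the running kernel
was booted with the opt-in by looking for the parameter in
``/proc/cmdline``. The helper names below are hypothetical, and the sketch
assumes the parameter appears as a bare, whitespace-separated token with
no value attached.

```python
def has_mismatched_32bit_opt_in(cmdline: str) -> bool:
    """Return True if the opt-in parameter is present as a whole token.

    Assumption: the parameter is passed bare (no "=value" suffix), so an
    exact token match is sufficient for this sketch.
    """
    return "allow_mismatched_32bit_el0" in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line that the running kernel was booted with."""
    with open(path, encoding="ascii", errors="replace") as f:
        return f.read()
```

A caller would typically combine the two, for example
``has_mismatched_32bit_opt_in(read_cmdline())``. The presence of the
parameter only shows that the opt-in was requested; it says nothing about
whether the SoC is actually asymmetric.
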
Userspace impact
================

32-bit tasks running on an asymmetric system behave in mostly the same
way as on a homogeneous system, with a few key differences relating to
CPU affinity.

sysfs
-----

The subset of CPUs capable of running 32-bit tasks is described in
``/sys/devices/system/cpu/aarch32_el0`` and is documented further in
``Documentation/ABI/testing/sysfs-devices-system-cpu``.

**Note:** CPUs are advertised by this file as they are detected and so
late-onlining of 32-bit-capable CPUs can result in the file contents
being modified by the kernel at runtime. Once advertised, CPUs are never
removed from the file.

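A minimal sketch of reading this file from userspace follows. It assumes
the contents use the same CPU list format as the other files in
``/sys/devices/system/cpu`` (for example ``0-3,6``); the function names
are hypothetical. Because the kernel may extend the file at runtime, a
long-running program would need to re-read it rather than cache the
result.

```python
from typing import Optional, Set


def parse_cpulist(text: str) -> Set[int]:
    """Parse a CPU list such as "0-3,6" into a set of CPU numbers."""
    cpus: Set[int] = set()
    for chunk in text.strip().split(","):
        if not chunk:
            # An empty file (or stray comma) contributes no CPUs.
            continue
        if "-" in chunk:
            first, last = chunk.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(chunk))
    return cpus


def read_aarch32_cpus(
    path: str = "/sys/devices/system/cpu/aarch32_el0",
) -> Optional[Set[int]]:
    """Return the 32-bit-capable CPUs, or None if the file is absent."""
    try:
        with open(path) as f:
            return parse_cpulist(f.read())
    except FileNotFoundError:
        return None
```
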
``execve(2)``
-------------

On a homogeneous system, the CPU affinity of a task is preserved across
``execve(2)``. This is not always possible on an asymmetric system,
specifically when the new program being executed is 32-bit yet the
affinity mask contains 64-bit-only CPUs. In this situation, the kernel
determines the new affinity mask as follows:

  1. If the 32-bit-capable subset of the affinity mask is not empty,
     then the affinity is restricted to that subset and the old affinity
     mask is saved. This saved mask is inherited over ``fork(2)`` and
     preserved across ``execve(2)`` of 32-bit programs.

     **Note:** This step does not apply to ``SCHED_DEADLINE`` tasks.
     See `SCHED_DEADLINE`_.

  2. Otherwise, the cpuset hierarchy of the task is walked until an
     ancestor is found containing at least one 32-bit-capable CPU. The
     affinity of the task is then changed to match the 32-bit-capable
     subset of the cpuset determined by the walk.

  3. On failure (i.e. out of memory), the affinity is changed to the set
     of all 32-bit-capable CPUs of which the kernel is aware.

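The three steps above can be modelled with plain sets. This is an
illustrative userspace model only, not the kernel implementation: the
function name and arguments are hypothetical, the cpuset hierarchy is
assumed to be supplied as a sequence of allowed-CPU sets ordered from the
task's own cpuset up to the root, and the out-of-memory fallback of step
(3) is approximated by the case where the walk finds nothing.

```python
from typing import FrozenSet, Iterable, Optional, Tuple

CpuMask = FrozenSet[int]


def affinity_for_32bit_exec(
    current: Iterable[int],
    capable: Iterable[int],
    cpuset_walk: Iterable[Iterable[int]],
    is_deadline: bool = False,
) -> Tuple[CpuMask, Optional[CpuMask]]:
    """Return (new_affinity, saved_mask) for execve() of a 32-bit program."""
    current = frozenset(current)
    capable = frozenset(capable)

    # No 64-bit-only CPUs in the mask: affinity is preserved as usual.
    if current <= capable:
        return current, None

    # Step 1: restrict to the 32-bit-capable subset and save the old mask.
    # Skipped for SCHED_DEADLINE tasks.
    subset = current & capable
    if subset and not is_deadline:
        return subset, current

    # Step 2: walk the cpuset hierarchy until some level allows at least
    # one 32-bit-capable CPU, then use the capable subset of that level.
    for allowed in cpuset_walk:
        subset = frozenset(allowed) & capable
        if subset:
            return subset, None

    # Step 3: fall back to every known 32-bit-capable CPU. In the kernel
    # this is the failure (out of memory) path; the model reaches it only
    # when the walk above finds nothing.
    return capable, None
```

For example, with CPUs 2 and 3 being 32-bit-capable, a task with affinity
``{0, 1, 2, 3}`` ends up restricted to ``{2, 3}`` with the old mask saved,
while a task with affinity ``{0, 1}`` falls through to the cpuset walk.
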
A subsequent ``execve(2)`` of a 64-bit program by the 32-bit task will
invalidate the affinity mask saved in (1) and attempt to restore the CPU
affinity of the task using the saved mask if it was previously valid.
This restoration may fail due to intervening changes to the deadline
policy or cpuset hierarchy, in which case the ``execve(2)`` continues
with the affinity unchanged.

Calls to ``sched_setaffinity(2)`` for a 32-bit task will consider only
the 32-bit-capable CPUs of the requested affinity mask. On success, the
affinity for the task is updated and any saved mask from a prior
``execve(2)`` is invalidated.

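The filtering applied to ``sched_setaffinity(2)`` requests can be pictured
as a set intersection. The helper below is a hypothetical model rather
than kernel code. It returns ``None`` when no 32-bit-capable CPU remains
in the request, since the outcome of that case is not described here.

```python
from typing import FrozenSet, Iterable, Optional


def effective_32bit_affinity(
    requested: Iterable[int],
    capable: Iterable[int],
) -> Optional[FrozenSet[int]]:
    """Model the mask a 32-bit task ends up with after a successful call.

    Only the 32-bit-capable CPUs of the requested mask are considered.
    None means the filtered request is empty, which this model does not
    attempt to describe.
    """
    effective = frozenset(requested) & frozenset(capable)
    return effective or None
```

On a real asymmetric system, a 32-bit Python process could compare this
prediction against ``os.sched_getaffinity(0)`` after calling
``os.sched_setaffinity(0, requested)``.
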
``SCHED_DEADLINE``
------------------

Explicit admission of a 32-bit deadline task to the default root domain
(e.g. by calling ``sched_setattr(2)``) is rejected on an asymmetric
32-bit system unless admission control is disabled by writing -1 to
``/proc/sys/kernel/sched_rt_runtime_us``.

``execve(2)`` of a 32-bit program from a 64-bit deadline task will
return ``-ENOEXEC`` if the root domain for the task contains any
64-bit-only CPUs and admission control is enabled. Concurrent offlining
of 32-bit-capable CPUs may still necessitate the procedure described in
`execve(2)`_, in which case step (1) is skipped and a warning is
emitted on the console.

**Note:** It is recommended that a set of 32-bit-capable CPUs is placed
into a separate root domain if ``SCHED_DEADLINE`` is to be used with
32-bit tasks on an asymmetric system. Failure to do so is likely to
result in missed deadlines.

Cpusets
-------

The affinity of a 32-bit task on an asymmetric system may include CPUs
that are not explicitly allowed by the cpuset to which it is attached.
This can occur as a result of the following two situations:

  - A 64-bit task attached to a cpuset which allows only 64-bit CPUs
    executes a 32-bit program.

  - All of the 32-bit-capable CPUs allowed by a cpuset containing a
    32-bit task are offlined.

In both of these cases, the new affinity is calculated according to step
(2) of the process described in `execve(2)`_ and the cpuset hierarchy is
unchanged irrespective of the cgroup version.

CPU hotplug
-----------

On an asymmetric system, the first detected 32-bit-capable CPU is
prevented from being offlined by userspace and any such attempt will
return ``-EPERM``. Note that suspend is still permitted even if the
primary CPU (i.e. CPU 0) is 64-bit-only.

KVM
---

Although KVM will not advertise 32-bit EL0 support to any vCPUs on an
asymmetric system, a broken guest at EL1 could still attempt to execute
32-bit code at EL0. In this case, an exit from a vCPU thread in 32-bit
mode will return to host userspace with an ``exit_reason`` of
``KVM_EXIT_FAIL_ENTRY``, and the vCPU will remain non-runnable until
successfully re-initialised by a subsequent ``KVM_ARM_VCPU_INIT``
operation.