Commit | Line | Data |
---|---|---|
1897907c FY |
1 | .. SPDX-License-Identifier: GPL-2.0 |
2 | ||
3 | .. include:: <isonum.txt> | |
4 | ||
5 | =============================== | |
6 | Bus lock detection and handling | |
7 | =============================== | |
8 | ||
9 | :Copyright: |copy| 2021 Intel Corporation | |
10 | :Authors: - Fenghua Yu <fenghua.yu@intel.com> | |
11 | - Tony Luck <tony.luck@intel.com> | |
12 | ||
13 | Problem | |
14 | ======= | |
15 | ||
16 | A split lock is any atomic operation whose operand crosses two cache lines. | |
17 | Since the operand spans two cache lines and the operation must be atomic, | |
18 | the system locks the bus while the CPU accesses the two cache lines. | |
19 | ||
20 | A bus lock is acquired through either a split locked access to writeback (WB) | |
21 | memory or any locked access to non-WB memory. This is typically thousands of | |
22 | cycles slower than an atomic operation within a cache line. It also disrupts | |
23 | performance on other cores and can severely degrade the whole system. | |
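
The split lock definition above comes down to a simple address check. The following is a minimal sketch, assuming 64-byte cache lines; the function name and the constant are hypothetical and are not taken from the kernel sources.

```python
CACHE_LINE_SIZE = 64  # assumption: 64-byte cache lines, typical for x86


def crosses_cache_line(address: int, size: int,
                       line_size: int = CACHE_LINE_SIZE) -> bool:
    """Return True if the operand [address, address + size) spans two cache lines."""
    if size <= 0:
        raise ValueError("size must be positive")
    first_line = address // line_size
    last_line = (address + size - 1) // line_size
    return first_line != last_line


# A 4-byte operand at offset 62 covers bytes 62..65 and straddles lines 0 and 1.
print(crosses_cache_line(62, 4))  # True
# A 4-byte operand at offset 60 covers bytes 60..63 and stays within line 0.
print(crosses_cache_line(60, 4))  # False
```

A locked instruction whose operand makes this check return True on WB memory is a split lock in the sense described above.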
24 | ||
25 | Detection | |
26 | ========= | |
27 | ||
28 | Intel processors may support either or both of the following hardware | |
29 | mechanisms to detect split locks and bus locks. | |
30 | ||
31 | #AC exception for split lock detection | |
32 | -------------------------------------- | |
33 | ||
34 | Beginning with the Tremont Atom CPU, a split lock operation may raise an | |
35 | Alignment Check (#AC) exception when it is attempted. | |
36 | ||
37 | #DB exception for bus lock detection | |
38 | ------------------------------------ | |
39 | ||
40 | Some CPUs can notify the kernel with a #DB trap after a user instruction | |
41 | acquires a bus lock and is executed. This allows the kernel to terminate the | |
42 | application or to enforce throttling. | |
43 | ||
44 | Software handling | |
45 | ================= | |
46 | ||
47 | The kernel #AC and #DB handlers handle bus locks based on the kernel | |
48 | parameter "split_lock_detect". Here is a summary of the different options: | |
49 | ||
50 | +------------------+----------------------------+-----------------------+ | |
51 | |split_lock_detect=|#AC for split lock          |#DB for bus lock       | | |
52 | +------------------+----------------------------+-----------------------+ | |
53 | |off               |Do nothing                  |Do nothing             | | |
54 | +------------------+----------------------------+-----------------------+ | |
55 | |warn              |Kernel OOPs                 |Warn once per task and | | |
054ed634 TL |
56 | |(default)         |Warn once per task, add a   |continue to run.       | |
57 | |                  |delay, add synchronization  |                       | | |
58 | |                  |to prevent more than one    |                       | | |
59 | |                  |core from executing a       |                       | | |
60 | |                  |split lock in parallel.     |                       | | |
61 | |                  |sysctl split_lock_mitigate  |                       | | |
62 | |                  |can be used to avoid the    |                       | | |
63 | |                  |delay and synchronization.  |                       | | |
1897907c FY |
64 | |                  |When both features are      |                       | |
65 | |                  |supported, warn in #AC      |                       | | |
66 | +------------------+----------------------------+-----------------------+ | |
67 | |fatal             |Kernel OOPs                 |Send SIGBUS to user.   | | |
68 | |                  |Send SIGBUS to user         |                       | | |
69 | |                  |When both features are      |                       | | |
70 | |                  |supported, fatal in #AC     |                       | | |
71 | +------------------+----------------------------+-----------------------+ | |
d28397ea FY |
72 | |ratelimit:N       |Do nothing                  |Limit bus lock rate to | |
73 | |(0 < N <= 1000)   |                            |N bus locks per second | | |
74 | |                  |                            |system wide and warn on| | |
75 | |                  |                            |bus locks.             | | |
76 | +------------------+----------------------------+-----------------------+ | |
1897907c FY |
77 | |
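
The option values in the table above can be validated with a few lines of code. This is only an illustrative sketch: `SplitLockMode` and `parse_split_lock_detect` are hypothetical names, and the kernel's own parser is written in C and may differ in detail. The accepted values and the range 0 < N <= 1000 are taken from the table.

```python
from typing import NamedTuple, Optional


class SplitLockMode(NamedTuple):
    """Parsed form of a split_lock_detect= value (hypothetical helper type)."""
    mode: str
    ratelimit: Optional[int] = None


def parse_split_lock_detect(value: str) -> SplitLockMode:
    """Parse 'off', 'warn', 'fatal' or 'ratelimit:N' with 0 < N <= 1000."""
    if value in ("off", "warn", "fatal"):
        return SplitLockMode(value)
    prefix = "ratelimit:"
    if value.startswith(prefix):
        text = value[len(prefix):]
        # Accept plain ASCII digits only, so "+5", " 5" or "5_0" are rejected.
        if text.isascii() and text.isdigit():
            n = int(text)
            if 0 < n <= 1000:
                return SplitLockMode("ratelimit", n)
    raise ValueError(f"invalid split_lock_detect value: {value!r}")


print(parse_split_lock_detect("warn"))
print(parse_split_lock_detect("ratelimit:500"))
```

Values outside the documented range, such as `ratelimit:0` or `ratelimit:1001`, raise `ValueError` in this sketch.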
78 | Usages | |
79 | ====== | |
80 | ||
81 | Detecting and handling bus locks is useful in several areas: | |
82 | ||
83 | It is critical for real time system designers who build consolidated real | |
84 | time systems. These systems run hard real time code on some cores and run | |
85 | "untrusted" user processes on other cores. The hard real time code cannot | |
86 | afford to have any bus lock from the untrusted processes hurt real time | |
87 | performance. To date the designers have been unable to deploy these | |
88 | solutions as they have no way to prevent the "untrusted" user code from | |
89 | generating split locks and bus locks that block the hard real time code | |
90 | from accessing memory during bus locking. | |
91 | ||
92 | It's also useful for general computing to prevent guests or user | |
93 | applications from slowing down the overall system by executing instructions | |
94 | that acquire a bus lock. | |
95 | ||
96 | ||
97 | Guidance | |
98 | ======== | |
99 | off | |
100 | --- | |
101 | ||
102 | Disable checking for split lock and bus lock. This option can be useful if | |
103 | there are legacy applications that trigger these events at a low rate so | |
104 | that mitigation is not needed. | |
105 | ||
106 | warn | |
107 | ---- | |
108 | ||
109 | A warning is emitted when a bus lock is detected, which allows the offending | |
110 | application to be identified. This is the default behavior. | |
111 | ||
112 | fatal | |
113 | ----- | |
114 | ||
115 | In this case, the bus lock is not tolerated and the process is killed. | |
d28397ea FY |
116 | |
117 | ratelimit | |
118 | --------- | |
119 | ||
120 | A system wide bus lock rate limit N is specified where 0 < N <= 1000. This | |
121 | allows a bus lock rate up to N bus locks per second. When the bus lock rate | |
122 | is exceeded, any task which is caught via the bus lock #DB exception is | |
123 | throttled by enforced sleeps until the rate goes under the limit again. | |
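
The throttling described above can be modelled with a simple counter. The sketch below is a simulation under stated assumptions and not the kernel implementation: it uses a fixed one-second window, integer millisecond timestamps supplied by the caller, and an assumed sleep step of 20 ms. `BusLockLimiter` and `on_bus_lock` are hypothetical names.

```python
class BusLockLimiter:
    """System wide bus lock limiter with a fixed one-second window (sketch)."""

    WINDOW_MS = 1000

    def __init__(self, limit_per_second: int, sleep_step_ms: int = 20):
        if not 0 < limit_per_second <= 1000:
            raise ValueError("limit must satisfy 0 < N <= 1000")
        self.limit = limit_per_second
        self.sleep_step_ms = sleep_step_ms  # assumption: length of one enforced sleep
        self.window_start_ms = 0
        self.count = 0

    def on_bus_lock(self, now_ms: int) -> int:
        """Account for one bus lock at now_ms; return when the task may resume."""
        while True:
            if now_ms - self.window_start_ms >= self.WINDOW_MS:
                # A new window begins, so the budget is available again.
                self.window_start_ms = now_ms
                self.count = 0
            if self.count < self.limit:
                self.count += 1
                return now_ms
            # Over the limit: simulate one enforced sleep, then check again.
            now_ms += self.sleep_step_ms


limiter = BusLockLimiter(limit_per_second=2)
print(limiter.on_bus_lock(0))   # 0: within the limit
print(limiter.on_bus_lock(10))  # 10: within the limit
print(limiter.on_bus_lock(10))  # 1010: throttled until the next window
```

Passing the timestamp in as an argument keeps the sketch deterministic; a real implementation would read a clock and actually sleep.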
124 | ||
125 | This is an effective mitigation in cases where a minimal impact can be | |
126 | tolerated, but a potential Denial of Service attack has to be prevented. It | |
127 | allows the offending processes to be identified and analyzed to determine | |
128 | whether they are malicious or just badly written. | |
129 | ||
130 | Selecting a rate limit of 1000 allows the bus to be locked for up to about | |
131 | seven million cycles each second (assuming 7000 cycles for each bus | |
132 | lock). On a 2 GHz processor that would be about 0.35% system slowdown. |
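
The slowdown estimate above can be reproduced with a short calculation. The function name is hypothetical, and the 7000 cycles per bus lock is the same assumption the text makes.

```python
def bus_lock_slowdown(locks_per_second: int, cycles_per_lock: int,
                      cpu_hz: int) -> float:
    """Fraction of each second's cycles during which the bus is locked."""
    return locks_per_second * cycles_per_lock / cpu_hz


# 1000 bus locks/s * 7000 cycles = 7,000,000 cycles out of 2,000,000,000.
fraction = bus_lock_slowdown(1000, 7000, 2_000_000_000)
print(f"{fraction:.2%}")  # 0.35%
```

Halving the rate limit to 500 halves the estimate, since the result scales linearly with N.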