=============================
BPF Kernel Functions (kfuncs)
=============================

1. Introduction
===============

BPF Kernel Functions, more commonly known as kfuncs, are functions in the Linux
kernel that are exposed for use by BPF programs. Unlike normal BPF helpers,
kfuncs do not have a stable interface and can change from one kernel release to
another. Hence, BPF programs need to be updated in response to changes in the
kernel.

2. Defining a kfunc
===================


There are two ways to expose a kernel function to BPF programs: either make an
existing function in the kernel visible, or add a new wrapper for BPF. In both
cases, care must be taken that a BPF program can only call such a function in a
valid context. To enforce this, the visibility of a kfunc can be set per
program type.

If you are not creating a BPF wrapper for an existing kernel function, skip
ahead to :ref:`BPF_kfunc_nodef`.

2.1 Creating a wrapper kfunc
----------------------------

When defining a wrapper kfunc, the wrapper function should have extern linkage.
This prevents the compiler from optimizing away dead code, as this wrapper kfunc
is not invoked anywhere in the kernel itself. It is not necessary to provide a
prototype in a header for the wrapper kfunc.

An example is given below::

	/* Disables missing prototype warnings */
	__diag_push();
	__diag_ignore_all("-Wmissing-prototypes",
			  "Global kfuncs as their definitions will be in BTF");

	struct task_struct *bpf_find_get_task_by_vpid(pid_t nr)
	{
		return find_get_task_by_vpid(nr);
	}

	__diag_pop();

A wrapper kfunc is often needed when we need to annotate parameters of the
kfunc. Otherwise, one may directly make the kfunc visible to the BPF program by
registering it with the BPF subsystem. See :ref:`BPF_kfunc_nodef`.

2.2 Annotating kfunc parameters
-------------------------------

Similar to BPF helpers, the verifier sometimes needs additional context to make
the use of kernel functions safer and more useful. Hence, we can annotate a
parameter by suffixing the name of the kfunc argument with a __tag, where tag
may be one of the supported annotations.

2.2.1 __sz Annotation
---------------------


The __sz annotation is used to indicate a memory and size pair in the argument
list. An example is given below::

	void bpf_memzero(void *mem, int mem__sz)
	{
	...
	}

Here, the verifier will treat the first argument as a PTR_TO_MEM, and the second
argument as its size. By default, without the __sz annotation, the size of the
type of the pointer is used. Without the __sz annotation, a kfunc cannot accept
a void pointer.
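
As a rough userspace illustration of the memory and size pair (not kernel
code), the sketch below gives ``bpf_memzero`` a hypothetical body built on
``memset``. The body and its defensive check are assumptions made for
illustration only. In the kernel, the verifier would already have checked that
``mem`` points to at least ``mem__sz`` accessible bytes before the call is
allowed.

.. code-block:: c

	#include <stddef.h>
	#include <string.h>

	/*
	 * Hypothetical body, for illustration only. The __sz suffix on the
	 * second parameter is what marks (mem, mem__sz) as a pair.
	 */
	void bpf_memzero(void *mem, int mem__sz)
	{
		/* Defensive check: a NULL pointer or non-positive size zeroes nothing. */
		if (!mem || mem__sz <= 0)
			return;
		memset(mem, 0, (size_t)mem__sz);
	}

Only the first ``mem__sz`` bytes are touched; anything beyond that range is left
as it was.
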

.. _BPF_kfunc_nodef:

2.3 Using an existing kernel function
-------------------------------------

When an existing function in the kernel is fit for consumption by BPF programs,
it can be directly registered with the BPF subsystem. However, care must still
be taken to review the context in which it will be invoked by the BPF program
and whether it is safe to do so.

2.4 Annotating kfuncs
---------------------


In addition to a kfunc's arguments, the verifier may need more information about
the type of kfunc(s) being registered with the BPF subsystem. To provide it, we
define flags on a set of kfuncs as follows::

	BTF_SET8_START(bpf_task_set)
	BTF_ID_FLAGS(func, bpf_get_task_pid, KF_ACQUIRE | KF_RET_NULL)
	BTF_ID_FLAGS(func, bpf_put_pid, KF_RELEASE)
	BTF_SET8_END(bpf_task_set)

This set encodes the BTF ID of each kfunc listed above, and encodes the flags
along with it. Of course, it is also allowed to specify no flags.
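
To make the "ID plus flags" encoding concrete, here is a minimal userspace
model of such a set. It is not the kernel's implementation: the struct layout,
the numeric flag values, the IDs, and the ``kfunc_flags`` lookup helper are all
assumptions chosen for illustration. Only the flag and kfunc names come from
the example above.

.. code-block:: c

	#include <stddef.h>
	#include <stdint.h>

	/* Illustrative bit values; the real definitions live in the kernel. */
	#define KF_ACQUIRE	(1u << 0)
	#define KF_RELEASE	(1u << 1)
	#define KF_RET_NULL	(1u << 2)

	/* One entry: an ID paired with its flags (hypothetical layout). */
	struct kfunc_entry {
		uint32_t id;
		uint32_t flags;
	};

	/* Made-up IDs standing in for the BTF IDs of the two kfuncs. */
	enum { ID_BPF_GET_TASK_PID = 101, ID_BPF_PUT_PID = 102 };

	static const struct kfunc_entry bpf_task_set[] = {
		{ ID_BPF_GET_TASK_PID, KF_ACQUIRE | KF_RET_NULL },
		{ ID_BPF_PUT_PID,      KF_RELEASE },
	};

	/* Return a pointer to the flags of @id, or NULL if @id is not in the set. */
	const uint32_t *kfunc_flags(uint32_t id)
	{
		size_t i;

		for (i = 0; i < sizeof(bpf_task_set) / sizeof(bpf_task_set[0]); i++)
			if (bpf_task_set[i].id == id)
				return &bpf_task_set[i].flags;
		return NULL;
	}

Returning a pointer rather than the flags value lets the caller distinguish a
function that is absent from the set (NULL) from one that is present with no
flags (a pointer to zero).
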

2.4.1 KF_ACQUIRE flag
---------------------

The KF_ACQUIRE flag is used to indicate that the kfunc returns a pointer to a
refcounted object. The verifier will then ensure that the pointer to the object
is eventually released using a release kfunc, or transferred to a map using a
referenced kptr (by invoking bpf_kptr_xchg). If not, the verifier fails the
loading of the BPF program until no lingering references remain in all possible
explored states of the program.

2.4.2 KF_RET_NULL flag
----------------------

The KF_RET_NULL flag is used to indicate that the pointer returned by the kfunc
may be NULL. Hence, it forces the user to do a NULL check on the pointer
returned from the kfunc before making use of it (dereferencing or passing to
another helper). This flag is often paired with the KF_ACQUIRE flag, but the
two are orthogonal to each other.

2.4.3 KF_RELEASE flag
---------------------

The KF_RELEASE flag is used to indicate that the kfunc releases the pointer
passed in to it. There can be only one referenced pointer that can be passed in.
All copies of the pointer being released are invalidated as a result of invoking
a kfunc with this flag.
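
The acquire, NULL check, and release contract implied by these flags can be
mimicked in plain userspace C. The sketch below is an analogy only:
``struct obj``, ``obj_acquire``, ``obj_release`` and ``read_value`` are
hypothetical names, and the rules that the verifier enforces statically at load
time appear here as the discipline the caller follows at run time.

.. code-block:: c

	#include <stddef.h>

	/* Hypothetical refcounted object. */
	struct obj {
		int refcnt;
		int value;
	};

	/* Behaves like a KF_ACQUIRE | KF_RET_NULL kfunc: may return NULL. */
	struct obj *obj_acquire(struct obj *o)
	{
		if (!o || o->refcnt <= 0)
			return NULL;
		o->refcnt++;
		return o;
	}

	/* Behaves like a KF_RELEASE kfunc: drops the reference taken above. */
	void obj_release(struct obj *o)
	{
		o->refcnt--;
	}

	/* Caller pattern: check for NULL, use the object, then release it. */
	int read_value(struct obj *o, int fallback)
	{
		struct obj *ref = obj_acquire(o);
		int v;

		if (!ref)		/* needed because of KF_RET_NULL */
			return fallback;
		v = ref->value;
		obj_release(ref);	/* needed because of KF_ACQUIRE */
		return v;
	}

A BPF program that skipped either the NULL check or the release would be
rejected at load time, rather than misbehaving at run time as this userspace
version would.
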

2.4.4 KF_KPTR_GET flag
----------------------

The KF_KPTR_GET flag is used to indicate that the kfunc takes the first argument
as a pointer to a kptr, safely increments the refcount of the object it points
to, and returns a reference to the user. The rest of the arguments may be normal
arguments of a kfunc. The KF_KPTR_GET flag should be used in conjunction with
the KF_ACQUIRE and KF_RET_NULL flags.

2.4.5 KF_TRUSTED_ARGS flag
--------------------------

The KF_TRUSTED_ARGS flag is used for kfuncs taking pointer arguments. It
indicates that all pointer arguments will always have a guaranteed lifetime,
and pointers to kernel objects are always passed to helpers in their unmodified
form (as obtained from acquire kfuncs).

It can be used to enforce that a pointer to a refcounted object acquired from a
kfunc or BPF helper is passed as an argument to this kfunc without any
modifications (e.g. pointer arithmetic) such that it is trusted and points to
the original object.

Meanwhile, it is also allowed to pass pointers to normal memory to such kfuncs,
but those can have a non-zero offset.

This flag is often used for kfuncs that operate (change some property, perform
some operation) on an object that was obtained using an acquire kfunc. Such
kfuncs need an unchanged pointer to ensure the integrity of the operation being
performed on the expected object.

2.4.6 KF_SLEEPABLE flag
-----------------------

The KF_SLEEPABLE flag is used for kfuncs that may sleep. Such kfuncs can only
be called by sleepable BPF programs (BPF_F_SLEEPABLE).

2.4.7 KF_DESTRUCTIVE flag
-------------------------

The KF_DESTRUCTIVE flag is used to indicate functions whose invocation is
destructive to the system. For example, such a call can result in the system
rebooting or panicking. Because of this, additional restrictions apply to these
calls. At the moment they only require the CAP_SYS_BOOT capability, but more can
be added later.

2.5 Registering the kfuncs
--------------------------

Once the kfunc is prepared for use, the final step to making it visible is
registering it with the BPF subsystem. Registration is done per BPF program
type. An example is shown below::

	BTF_SET8_START(bpf_task_set)
	BTF_ID_FLAGS(func, bpf_get_task_pid, KF_ACQUIRE | KF_RET_NULL)
	BTF_ID_FLAGS(func, bpf_put_pid, KF_RELEASE)
	BTF_SET8_END(bpf_task_set)

	static const struct btf_kfunc_id_set bpf_task_kfunc_set = {
		.owner = THIS_MODULE,
		.set   = &bpf_task_set,
	};

	static int init_subsystem(void)
	{
		return register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_task_kfunc_set);
	}
	late_initcall(init_subsystem);