=====================
BPF Type Format (BTF)
=====================

1. Introduction
===============

BTF (BPF Type Format) is the metadata format which encodes the debug info
related to BPF programs and maps. The name BTF was used initially to describe
data types. BTF was later extended to include function info for defined
subroutines, and line info for source/line information.

The debug info is used for map pretty print, function signatures, etc. The
function signature enables better kernel symbols for BPF programs and
functions. The line info helps generate source-annotated translated bytecode,
jited code and verifier logs.

The BTF specification contains two parts,
  * BTF kernel API
  * BTF ELF file format

The kernel API is the contract between user space and kernel. The kernel
verifies the BTF info before using it. The ELF file format is a user space
contract between ELF file and libbpf loader.

The type and string sections are part of the BTF kernel API, describing the
debug info (mostly types related) referenced by the bpf program. These two
sections are discussed in detail in :ref:`BTF_Type_String`.

.. _BTF_Type_String:

2. BTF Type and String Encoding
===============================

The file ``include/uapi/linux/btf.h`` provides a high-level definition of how
types/strings are encoded.

The beginning of the data blob must be::

    struct btf_header {
        __u16   magic;
        __u8    version;
        __u8    flags;
        __u32   hdr_len;

        /* All offsets are in bytes relative to the end of this header */
        __u32   type_off;       /* offset of type section       */
        __u32   type_len;       /* length of type section       */
        __u32   str_off;        /* offset of string section     */
        __u32   str_len;        /* length of string section     */
    };

The magic is ``0xeB9F``, which has different encoding for big and little
endian systems, and can be used to test whether BTF is generated for a big- or
little-endian target. The ``btf_header`` is designed to be extensible with
``hdr_len`` equal to ``sizeof(struct btf_header)`` when a data blob is
generated.
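
The endianness test described above can be sketched as follows. This is a
minimal illustration, not kernel or libbpf code: ``btf_detect_byte_order()``
and the ``btf_byte_order`` enum are hypothetical names, and ``BTF_MAGIC`` is
defined locally from the value given in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BTF_MAGIC 0xeB9F /* value taken from the text above */

/* Hypothetical result type for this sketch. */
enum btf_byte_order {
    BTF_ORDER_INVALID = -1, /* magic not recognised */
    BTF_ORDER_NATIVE = 0,   /* same endianness as the reader */
    BTF_ORDER_SWAPPED = 1,  /* opposite endianness, fields need byte swapping */
};

/* Hypothetical helper: classify a blob by the first two bytes of its header. */
static enum btf_byte_order btf_detect_byte_order(const void *blob, size_t len)
{
    uint16_t magic;

    assert(blob != NULL);
    if (len < sizeof(magic))
        return BTF_ORDER_INVALID;

    /* magic is the first header field; memcpy avoids unaligned access. */
    memcpy(&magic, blob, sizeof(magic));
    if (magic == BTF_MAGIC)
        return BTF_ORDER_NATIVE;
    if (magic == 0x9FeB) /* BTF_MAGIC with its two bytes swapped */
        return BTF_ORDER_SWAPPED;
    return BTF_ORDER_INVALID;
}
```

A reader that gets ``BTF_ORDER_SWAPPED`` would have to byte-swap every
multi-byte field before interpreting the rest of the header.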

2.1 String Encoding
-------------------

The first string in the string section must be a null string. The rest of
the string table is a concatenation of other null-terminated strings.
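
A lookup into such a string section might look like the sketch below. The
helper name ``btf_str_at()`` is hypothetical, and the checks are only those
implied by the two sentences above: the section starts with an empty string
and every string, including the last one, is null-terminated.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the string stored at byte offset `off` of the
 * string section, or NULL if the section is malformed or `off` is out of
 * range.
 */
static const char *btf_str_at(const char *strs, size_t str_len, size_t off)
{
    assert(strs != NULL);

    /* The first string must be empty and the last byte must be a NUL. */
    if (str_len == 0 || strs[0] != '\0' || strs[str_len - 1] != '\0')
        return NULL;
    if (off >= str_len)
        return NULL;
    return strs + off;
}
```

Offset ``0`` therefore always yields the empty string, which is why a
``name_off`` of ``0`` can stand for an anonymous entity.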

2.2 Type Encoding
-----------------

The type id ``0`` is reserved for ``void`` type. The type section is parsed
sequentially and type id is assigned to each recognized type starting from id
``1``. Currently, the following types are supported::

    #define BTF_KIND_INT            1       /* Integer      */
    #define BTF_KIND_PTR            2       /* Pointer      */
    #define BTF_KIND_ARRAY          3       /* Array        */
    #define BTF_KIND_STRUCT         4       /* Struct       */
    #define BTF_KIND_UNION          5       /* Union        */
    #define BTF_KIND_ENUM           6       /* Enumeration up to 32-bit values */
    #define BTF_KIND_FWD            7       /* Forward      */
    #define BTF_KIND_TYPEDEF        8       /* Typedef      */
    #define BTF_KIND_VOLATILE       9       /* Volatile     */
    #define BTF_KIND_CONST          10      /* Const        */
    #define BTF_KIND_RESTRICT       11      /* Restrict     */
    #define BTF_KIND_FUNC           12      /* Function     */
    #define BTF_KIND_FUNC_PROTO     13      /* Function Proto       */
    #define BTF_KIND_VAR            14      /* Variable     */
    #define BTF_KIND_DATASEC        15      /* Section      */
    #define BTF_KIND_FLOAT          16      /* Floating point       */
    #define BTF_KIND_DECL_TAG       17      /* Decl Tag     */
    #define BTF_KIND_TYPE_TAG       18      /* Type Tag     */
    #define BTF_KIND_ENUM64         19      /* Enumeration up to 64-bit values */

Note that the type section encodes debug info, not just pure types.
``BTF_KIND_FUNC`` is not a type, and it represents a defined subprogram.

Each type contains the following common data::

    struct btf_type {
        __u32 name_off;
        /* "info" bits arrangement
         * bits  0-15: vlen (e.g. # of struct's members)
         * bits 16-23: unused
         * bits 24-28: kind (e.g. int, ptr, array...etc)
         * bits 29-30: unused
         * bit     31: kind_flag, currently used by
         *             struct, union, fwd, enum and enum64.
         */
        __u32 info;
        /* "size" is used by INT, ENUM, STRUCT, UNION and ENUM64.
         * "size" tells the size of the type it is describing.
         *
         * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
         * FUNC, FUNC_PROTO, DECL_TAG and TYPE_TAG.
         * "type" is a type_id referring to another type.
         */
        union {
                __u32 size;
                __u32 type;
        };
    };

For certain kinds, the common data are followed by kind-specific data. The
``name_off`` in ``struct btf_type`` specifies the offset in the string table.
The following sections detail encoding of each kind.
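
Because the type section is parsed sequentially, a reader has to know how many
kind-specific bytes follow each ``struct btf_type``. The sketch below decodes
the ``info`` word according to the bit layout in the comment above and derives
that trailing size from the record layouts given in the per-kind sections of
this document. All helper names (``info_vlen()``, ``info_kind()``,
``info_kind_flag()``, ``info_pack()``, ``trailing_bytes()``) and the
``KIND_*`` constants are local to this example and are not the kernel's own
definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Kind numbers copied from the BTF_KIND_* list (kinds with trailing data). */
enum {
    KIND_INT = 1, KIND_ARRAY = 3, KIND_STRUCT = 4, KIND_UNION = 5,
    KIND_ENUM = 6, KIND_FUNC_PROTO = 13, KIND_VAR = 14, KIND_DATASEC = 15,
    KIND_DECL_TAG = 17, KIND_ENUM64 = 19,
};

/* Field extraction following the documented "info" bit arrangement. */
static uint32_t info_vlen(uint32_t info)      { return info & 0xffff; }
static uint32_t info_kind(uint32_t info)      { return (info >> 24) & 0x1f; }
static uint32_t info_kind_flag(uint32_t info) { return info >> 31; }

/* Hypothetical inverse of the three accessors above. */
static uint32_t info_pack(uint32_t kind, uint32_t vlen, uint32_t kind_flag)
{
    assert(kind <= 0x1f && vlen <= 0xffff && kind_flag <= 1);
    return (kind_flag << 31) | (kind << 24) | vlen;
}

/* Bytes of kind-specific data that follow the common struct btf_type. */
static uint32_t trailing_bytes(uint32_t info)
{
    uint32_t vlen = info_vlen(info);

    switch (info_kind(info)) {
    case KIND_INT:          /* one u32 */
    case KIND_VAR:          /* struct btf_var */
    case KIND_DECL_TAG:     /* struct btf_decl_tag */
        return 4;
    case KIND_ARRAY:        /* struct btf_array */
        return 12;
    case KIND_STRUCT:       /* vlen * struct btf_member */
    case KIND_UNION:
    case KIND_DATASEC:      /* vlen * struct btf_var_secinfo */
    case KIND_ENUM64:       /* vlen * struct btf_enum64 */
        return vlen * 12;
    case KIND_ENUM:         /* vlen * struct btf_enum */
    case KIND_FUNC_PROTO:   /* vlen * struct btf_param */
        return vlen * 8;
    default:                /* kinds without additional type data */
        return 0;
    }
}
```

A sequential walk would advance by ``sizeof(struct btf_type)`` plus
``trailing_bytes(info)`` for each record. A real parser would also reject
kind values it does not recognise instead of treating them as having no
trailing data.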

2.2.1 BTF_KIND_INT
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: any valid offset
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_INT
  * ``info.vlen``: 0
  * ``size``: the size of the int type in bytes.

``btf_type`` is followed by a ``u32`` with the following bits arrangement::

    #define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
    #define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
    #define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)

The ``BTF_INT_ENCODING`` has the following attributes::

    #define BTF_INT_SIGNED  (1 << 0)
    #define BTF_INT_CHAR    (1 << 1)
    #define BTF_INT_BOOL    (1 << 2)

The ``BTF_INT_ENCODING()`` provides extra information: signedness, char, or
bool, for the int type. The char and bool encoding are mostly useful for
pretty print. At most one encoding can be specified for the int type.

The ``BTF_INT_BITS()`` specifies the number of actual bits held by this int
type. For example, a 4-bit bitfield has ``BTF_INT_BITS()`` equal to 4.
The ``btf_type.size * 8`` must be equal to or greater than ``BTF_INT_BITS()``
for the type. The maximum value of ``BTF_INT_BITS()`` is 128.

The ``BTF_INT_OFFSET()`` specifies the starting bit offset to calculate values
for this int. For example, a bitfield struct member has:

  * btf member bit offset 100 from the start of the structure,
  * btf member pointing to an int type,
  * the int type has ``BTF_INT_OFFSET() = 2`` and ``BTF_INT_BITS() = 4``

Then in the struct memory layout, this member will occupy ``4`` bits starting
from bit ``100 + 2 = 102``.

Alternatively, the bitfield struct member can be the following to access the
same bits as the above:

  * btf member bit offset 102,
  * btf member pointing to an int type,
  * the int type has ``BTF_INT_OFFSET() = 0`` and ``BTF_INT_BITS() = 4``

The original intention of ``BTF_INT_OFFSET()`` is to provide flexibility of
bitfield encoding. Currently, both llvm and pahole generate
``BTF_INT_OFFSET() = 0`` for all int types.
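
The two equivalent bitfield encodings above can be checked with a short
sketch. The macros are copied from this section; ``member_first_bit()`` is a
hypothetical helper written for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* Macros as listed in this section. */
#define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
#define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
#define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)
#define BTF_INT_SIGNED          (1 << 0)

/*
 * Hypothetical helper: first bit occupied by a struct member, given the
 * member's bit offset and the u32 that follows the member's int type.
 */
static uint32_t member_first_bit(uint32_t member_bit_off, uint32_t int_data)
{
    /* The text above limits BTF_INT_BITS() to 128. */
    assert(BTF_INT_BITS(int_data) <= 128);
    return member_bit_off + BTF_INT_OFFSET(int_data);
}
```

With a member bit offset of 100 and an int whose ``BTF_INT_OFFSET()`` is 2,
the helper yields 102, the same bit that a member at bit offset 102 with
``BTF_INT_OFFSET() = 0`` starts at.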

2.2.2 BTF_KIND_PTR
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_PTR
  * ``info.vlen``: 0
  * ``type``: the pointee type of the pointer

No additional type data follow ``btf_type``.

2.2.3 BTF_KIND_ARRAY
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_ARRAY
  * ``info.vlen``: 0
  * ``size/type``: 0, not used

``btf_type`` is followed by one ``struct btf_array``::

    struct btf_array {
        __u32   type;
        __u32   index_type;
        __u32   nelems;
    };

The ``struct btf_array`` encoding:
  * ``type``: the element type
  * ``index_type``: the index type
  * ``nelems``: the number of elements for this array (``0`` is also allowed).

The ``index_type`` can be any regular int type (``u8``, ``u16``, ``u32``,
``u64``, ``unsigned __int128``). The original design of including
``index_type`` follows DWARF, which has an ``index_type`` for its array type.
Currently in BTF, beyond type verification, the ``index_type`` is not used.

The ``struct btf_array`` allows chaining through element type to represent
multidimensional arrays. For example, for ``int a[5][6]``, the following type
information illustrates the chaining:

  * [1]: int
  * [2]: array, ``btf_array.type = [1]``, ``btf_array.nelems = 6``
  * [3]: array, ``btf_array.type = [2]``, ``btf_array.nelems = 5``

Currently, both pahole and llvm collapse a multidimensional array into a
one-dimensional array, e.g., for ``a[5][6]``, the ``btf_array.nelems`` is
equal to ``30``. This is because the original use case is map pretty print
where the whole array is dumped out, so a one-dimensional array is enough. As
more BTF usage is explored, pahole and llvm can be changed to generate a
proper chained representation for multidimensional arrays.

2.2.4 BTF_KIND_STRUCT
~~~~~~~~~~~~~~~~~~~~~
2.2.5 BTF_KIND_UNION
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0 or offset to a valid C identifier
  * ``info.kind_flag``: 0 or 1
  * ``info.kind``: BTF_KIND_STRUCT or BTF_KIND_UNION
  * ``info.vlen``: the number of struct/union members
  * ``size``: the size of the struct/union in bytes

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_member``.::

    struct btf_member {
        __u32   name_off;
        __u32   type;
        __u32   offset;
    };

``struct btf_member`` encoding:
  * ``name_off``: offset to a valid C identifier
  * ``type``: the member type
  * ``offset``: <see below>

If the type info ``kind_flag`` is not set, the offset contains only the bit
offset of the member. Note that the base type of the bitfield can only be int
or enum type. If the bitfield size is 32, the base type can be either int or
enum type. If the bitfield size is not 32, the base type must be int, and int
type ``BTF_INT_BITS()`` encodes the bitfield size.

If the ``kind_flag`` is set, the ``btf_member.offset`` contains both member
bitfield size and bit offset. The bitfield size and bit offset are calculated
as below.::

    #define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
    #define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)

In this case, if the base type is an int type, it must be a regular int type:

  * ``BTF_INT_OFFSET()`` must be 0.
  * ``BTF_INT_BITS()`` must be equal to ``{1,2,4,8,16} * 8``.
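
The two interpretations of ``btf_member.offset`` can be sketched as a single
decoding function. The macros are those listed above; ``struct member_pos``
and ``decode_member_offset()`` are hypothetical names used only for this
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Macros as listed in this section. */
#define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
#define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)

/* Hypothetical result type for this sketch. */
struct member_pos {
    uint32_t bit_offset;    /* bit offset from the start of the struct */
    uint32_t bitfield_size; /* only meaningful when kind_flag is set */
};

/* Decode btf_member.offset according to the enclosing type's kind_flag. */
static struct member_pos decode_member_offset(uint32_t offset, int kind_flag)
{
    struct member_pos pos;

    assert(kind_flag == 0 || kind_flag == 1);
    if (kind_flag) {
        pos.bit_offset = BTF_MEMBER_BIT_OFFSET(offset);
        pos.bitfield_size = BTF_MEMBER_BITFIELD_SIZE(offset);
    } else {
        /* Plain bit offset; a bitfield's size comes from BTF_INT_BITS(). */
        pos.bit_offset = offset;
        pos.bitfield_size = 0;
    }
    return pos;
}
```

For example, an ``offset`` of ``(4 << 24) | 102`` with ``kind_flag`` set
describes a 4-bit member starting at bit 102.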

The following kernel patch introduced ``kind_flag`` and explained why both
modes exist:

  https://github.com/torvalds/linux/commit/9d5f9f701b1891466fb3dbb1806ad97716f95cc3#diff-fa650a64fdd3968396883d2fe8215ff3

2.2.6 BTF_KIND_ENUM
~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0 or offset to a valid C identifier
  * ``info.kind_flag``: 0 for unsigned, 1 for signed
  * ``info.kind``: BTF_KIND_ENUM
  * ``info.vlen``: number of enum values
  * ``size``: 1/2/4/8

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum``.::

    struct btf_enum {
        __u32   name_off;
        __s32   val;
    };

The ``btf_enum`` encoding:
  * ``name_off``: offset to a valid C identifier
  * ``val``: any value

If the original enum value is signed and the size is less than 4,
that value will be sign extended into 4 bytes. If the size is 8,
the value will be truncated into 4 bytes.

2.2.7 BTF_KIND_FWD
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0 for struct, 1 for union
  * ``info.kind``: BTF_KIND_FWD
  * ``info.vlen``: 0
  * ``type``: 0

No additional type data follow ``btf_type``.

2.2.8 BTF_KIND_TYPEDEF
~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_TYPEDEF
  * ``info.vlen``: 0
  * ``type``: the type which can be referred to by the name at ``name_off``

No additional type data follow ``btf_type``.

2.2.9 BTF_KIND_VOLATILE
~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_VOLATILE
  * ``info.vlen``: 0
  * ``type``: the type with ``volatile`` qualifier

No additional type data follow ``btf_type``.

2.2.10 BTF_KIND_CONST
~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_CONST
  * ``info.vlen``: 0
  * ``type``: the type with ``const`` qualifier

No additional type data follow ``btf_type``.

2.2.11 BTF_KIND_RESTRICT
~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_RESTRICT
  * ``info.vlen``: 0
  * ``type``: the type with ``restrict`` qualifier

No additional type data follow ``btf_type``.

2.2.12 BTF_KIND_FUNC
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_FUNC
  * ``info.vlen``: linkage information (BTF_FUNC_STATIC, BTF_FUNC_GLOBAL
    or BTF_FUNC_EXTERN)
  * ``type``: a BTF_KIND_FUNC_PROTO type

No additional type data follow ``btf_type``.

A BTF_KIND_FUNC defines not a type, but a subprogram (function) whose
signature is defined by ``type``. The subprogram is thus an instance of that
type. The BTF_KIND_FUNC may in turn be referenced by a func_info in the
:ref:`BTF_Ext_Section` (ELF) or in the arguments to :ref:`BPF_Prog_Load`
(ABI).

Currently, only linkage values of BTF_FUNC_STATIC and BTF_FUNC_GLOBAL are
supported in the kernel.

2.2.13 BTF_KIND_FUNC_PROTO
~~~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_FUNC_PROTO
  * ``info.vlen``: # of parameters
  * ``type``: the return type

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_param``.::

    struct btf_param {
        __u32   name_off;
        __u32   type;
    };

If a BTF_KIND_FUNC_PROTO type is referred to by a BTF_KIND_FUNC type, then
``btf_param.name_off`` must point to a valid C identifier except for the
possible last argument representing the variable argument. The
``btf_param.type`` refers to the parameter type.

If the function has variable arguments, the last parameter is encoded with
``name_off = 0`` and ``type = 0``.

2.2.14 BTF_KIND_VAR
~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_VAR
  * ``info.vlen``: 0
  * ``type``: the type of the variable

``btf_type`` is followed by a single ``struct btf_var`` with the
following data::

    struct btf_var {
        __u32   linkage;
    };

``struct btf_var`` encoding:
  * ``linkage``: currently only static variable 0, or globally allocated
    variable in ELF sections 1

Not all types of global variables are supported by LLVM at this point.
The following is currently available:

  * static variables with or without section attributes
  * global variables with section attributes

The latter is for future extraction of map key/value type id's from a
map definition.

2.2.15 BTF_KIND_DATASEC
~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid name associated with a variable or
    one of .data/.bss/.rodata
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_DATASEC
  * ``info.vlen``: # of variables
  * ``size``: total section size in bytes (0 at compilation time, patched
    to actual size by BPF loaders such as libbpf)

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_var_secinfo``.::

    struct btf_var_secinfo {
        __u32   type;
        __u32   offset;
        __u32   size;
    };

``struct btf_var_secinfo`` encoding:
  * ``type``: the type of the BTF_KIND_VAR variable
  * ``offset``: the in-section offset of the variable
  * ``size``: the size of the variable in bytes

2.2.16 BTF_KIND_FLOAT
~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: any valid offset
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_FLOAT
  * ``info.vlen``: 0
  * ``size``: the size of the float type in bytes: 2, 4, 8, 12 or 16.

No additional type data follow ``btf_type``.

2.2.17 BTF_KIND_DECL_TAG
~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a non-empty string
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_DECL_TAG
  * ``info.vlen``: 0
  * ``type``: ``struct``, ``union``, ``func``, ``var`` or ``typedef``

``btf_type`` is followed by ``struct btf_decl_tag``.::

    struct btf_decl_tag {
        __u32   component_idx;
    };

The ``name_off`` encodes the btf_decl_tag attribute string.
The ``type`` should be ``struct``, ``union``, ``func``, ``var`` or ``typedef``.
For ``var`` or ``typedef`` type, ``btf_decl_tag.component_idx`` must be ``-1``.
For the other three types, if the btf_decl_tag attribute is
applied to the ``struct``, ``union`` or ``func`` itself,
``btf_decl_tag.component_idx`` must be ``-1``. Otherwise,
the attribute is applied to a ``struct``/``union`` member or
a ``func`` argument, and ``btf_decl_tag.component_idx`` should be a
valid index (starting from 0) pointing to a member or an argument.

2.2.18 BTF_KIND_TYPE_TAG
~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a non-empty string
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_TYPE_TAG
  * ``info.vlen``: 0
  * ``type``: the type with ``btf_type_tag`` attribute

Currently, ``BTF_KIND_TYPE_TAG`` is only emitted for pointer types.
It has the following btf type chain:
::

  ptr -> [type_tag]*
      -> [const | volatile | restrict | typedef]*
      -> base_type

Basically, a pointer type points to zero or more
type_tag, then zero or more const/volatile/restrict/typedef
and finally the base type. The base type is one of
int, ptr, array, struct, union, enum, func_proto and float types.

2.2.19 BTF_KIND_ENUM64
~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0 or offset to a valid C identifier
  * ``info.kind_flag``: 0 for unsigned, 1 for signed
  * ``info.kind``: BTF_KIND_ENUM64
  * ``info.vlen``: number of enum values
  * ``size``: 1/2/4/8

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum64``.::

    struct btf_enum64 {
        __u32   name_off;
        __u32   val_lo32;
        __u32   val_hi32;
    };

The ``btf_enum64`` encoding:
  * ``name_off``: offset to a valid C identifier
  * ``val_lo32``: lower 32-bit value for a 64-bit value
  * ``val_hi32``: high 32-bit value for a 64-bit value

If the original enum value is signed and the size is less than 8,
that value will be sign extended into 8 bytes.
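
Reassembling the 64-bit value from its two halves can be sketched as below.
The struct is a local mirror of ``struct btf_enum64`` using fixed-width C
types, and ``enum64_value()`` is a hypothetical helper name chosen for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct btf_enum64 from this section. */
struct btf_enum64 {
    uint32_t name_off;
    uint32_t val_lo32;
    uint32_t val_hi32;
};

/* Hypothetical helper: combine the two 32-bit halves into one value. */
static uint64_t enum64_value(const struct btf_enum64 *e)
{
    assert(e != NULL);
    return ((uint64_t)e->val_hi32 << 32) | e->val_lo32;
}
```

When ``info.kind_flag`` is 1 (signed), the result would be reinterpreted as a
signed 64-bit integer, so the pair ``val_lo32 = 0xfffffffe``,
``val_hi32 = 0xffffffff`` stands for ``-2``.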

3. BTF Kernel API
=================

The following bpf syscall commands involve BTF:
   * BPF_BTF_LOAD: load a blob of BTF data into kernel
   * BPF_MAP_CREATE: map creation with btf key and value type info.
   * BPF_PROG_LOAD: prog load with btf function and line info.
   * BPF_BTF_GET_FD_BY_ID: get a btf fd
   * BPF_OBJ_GET_INFO_BY_FD: btf, func_info, line_info
     and other btf related info are returned.

The workflow typically looks like:
::

  Application:
      BPF_BTF_LOAD
          |
          v
      BPF_MAP_CREATE and BPF_PROG_LOAD
          |
          V
      ......

  Introspection tool:
      ......
      BPF_{PROG,MAP}_GET_NEXT_ID (get prog/map id's)
          |
          V
      BPF_{PROG,MAP}_GET_FD_BY_ID (get a prog/map fd)
          |
          V
      BPF_OBJ_GET_INFO_BY_FD (get bpf_prog_info/bpf_map_info with btf_id)
          |                                     |
          V                                     |
      BPF_BTF_GET_FD_BY_ID (get btf_fd)         |
          |                                     |
          V                                     |
      BPF_OBJ_GET_INFO_BY_FD (get btf)          |
          |                                     |
          V                                     V
      pretty print types, dump func signatures and line info, etc.

3.1 BPF_BTF_LOAD
----------------

Load a blob of BTF data into the kernel. A blob of data, described in
:ref:`BTF_Type_String`, can be directly loaded into the kernel. A ``btf_fd``
is returned to user space.

3.2 BPF_MAP_CREATE
------------------

A map can be created with ``btf_fd`` and specified key/value type id.::

    __u32   btf_fd;         /* fd pointing to a BTF type data */
    __u32   btf_key_type_id;        /* BTF type_id of the key */
    __u32   btf_value_type_id;      /* BTF type_id of the value */

In libbpf, the map can be defined with extra annotation like below:
::

    struct {
        __uint(type, BPF_MAP_TYPE_ARRAY);
        __type(key, int);
        __type(value, struct ipv_counts);
        __uint(max_entries, 4);
    } btf_map SEC(".maps");

During ELF parsing, libbpf is able to extract key/value type_id's and assign
them to BPF_MAP_CREATE attributes automatically.
YS
625
626.. _BPF_Prog_Load:
627
6283.3 BPF_PROG_LOAD
3ff36bff 629-----------------
ffcf7ce9 630
9ab5305d
AN
631During prog_load, func_info and line_info can be passed to kernel with proper
632values for the following attributes:
ffcf7ce9
YS
633::
634
635 __u32 insn_cnt;
636 __aligned_u64 insns;
637 ......
638 __u32 prog_btf_fd; /* fd pointing to BTF type data */
639 __u32 func_info_rec_size; /* userspace bpf_func_info size */
640 __aligned_u64 func_info; /* func info */
641 __u32 func_info_cnt; /* number of bpf_func_info records */
642 __u32 line_info_rec_size; /* userspace bpf_line_info size */
643 __aligned_u64 line_info; /* line info */
644 __u32 line_info_cnt; /* number of bpf_line_info records */
645
646The func_info and line_info are an array of below, respectively.::
647
648 struct bpf_func_info {
649 __u32 insn_off; /* [0, insn_cnt - 1] */
650 __u32 type_id; /* pointing to a BTF_KIND_FUNC type */
651 };
652 struct bpf_line_info {
653 __u32 insn_off; /* [0, insn_cnt - 1] */
654 __u32 file_name_off; /* offset to string table for the filename */
655 __u32 line_off; /* offset to string table for the source line */
656 __u32 line_col; /* line number and column number */
657 };
658
9ab5305d
AN
659func_info_rec_size is the size of each func_info record, and
660line_info_rec_size is the size of each line_info record. Passing the record
661size to kernel make it possible to extend the record itself in the future.
ffcf7ce9
YS
662
663Below are requirements for func_info:
664 * func_info[0].insn_off must be 0.
665 * the func_info insn_off is in strictly increasing order and matches
666 bpf func boundaries.
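
The checkable part of these func_info requirements can be sketched as a small
validator. The struct is a local mirror of ``struct bpf_func_info`` with
fixed-width C types, and ``func_info_offsets_ok()`` is a hypothetical helper.
Whether each offset actually matches a bpf func boundary cannot be decided
without the program itself, so the sketch only checks ordering and range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct bpf_func_info from this section. */
struct bpf_func_info {
    uint32_t insn_off; /* [0, insn_cnt - 1] */
    uint32_t type_id;  /* pointing to a BTF_KIND_FUNC type */
};

/*
 * Hypothetical helper: return 1 if the insn_off values satisfy the ordering
 * and range requirements listed above, 0 otherwise. An empty array has
 * nothing to check and is accepted.
 */
static int func_info_offsets_ok(const struct bpf_func_info *fi, uint32_t cnt,
                                uint32_t insn_cnt)
{
    uint32_t i;

    assert(fi != NULL || cnt == 0);
    if (cnt == 0)
        return 1;
    if (fi[0].insn_off != 0)
        return 0;
    for (i = 0; i < cnt; i++) {
        if (fi[i].insn_off >= insn_cnt)
            return 0; /* outside [0, insn_cnt - 1] */
        if (i > 0 && fi[i].insn_off <= fi[i - 1].insn_off)
            return 0; /* not strictly increasing */
    }
    return 1;
}
```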

Below are requirements for line_info:
  * the first insn in each func must have a line_info record pointing to it.
  * the line_info insn_off is in strictly increasing order.

For line_info, the line number and column number are defined as below:
::

    #define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
    #define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)
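
The two macros imply that ``line_col`` keeps the column in its low 10 bits and
the line number in the remaining 22 bits. A minimal round-trip sketch, where
``line_col_pack()`` is a hypothetical helper and not part of the kernel
headers:

```c
#include <assert.h>
#include <stdint.h>

/* Macros as listed in this section. */
#define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
#define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)

/* Hypothetical inverse of the two macros above. */
static uint32_t line_col_pack(uint32_t line, uint32_t col)
{
    /* 22 bits are available for the line number, 10 for the column. */
    assert(line < (1u << 22) && col < (1u << 10));
    return (line << 10) | col;
}
```

Packing line 918, column 7 and decoding the result with the macros gives the
same pair back.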

3.4 BPF_{PROG,MAP}_GET_NEXT_ID
------------------------------

In the kernel, every loaded program, map or btf has a unique id. The id won't
change during the lifetime of a program, map, or btf.

The bpf syscall commands BPF_{PROG,MAP}_GET_NEXT_ID return the ids of bpf
programs or maps, respectively, to user space, one id per call, so an
inspection tool can iterate over all programs and maps.

3.5 BPF_{PROG,MAP}_GET_FD_BY_ID
-------------------------------

An introspection tool cannot use an id directly to get details about a
program or map. A file descriptor needs to be obtained first for
reference-counting purposes.

3.6 BPF_OBJ_GET_INFO_BY_FD
--------------------------

Once a program/map fd is acquired, an introspection tool can get the detailed
information from the kernel about this fd, some of which is BTF-related. For
example, ``bpf_map_info`` returns ``btf_id`` and key/value type ids.
``bpf_prog_info`` returns ``btf_id``, func_info, and line info for translated
bpf byte codes, and jited_line_info.

3.7 BPF_BTF_GET_FD_BY_ID
------------------------

With ``btf_id`` obtained in ``bpf_map_info`` and ``bpf_prog_info``, bpf
syscall command BPF_BTF_GET_FD_BY_ID can retrieve a btf fd. Then, with
command BPF_OBJ_GET_INFO_BY_FD, the btf blob, originally loaded into the
kernel with BPF_BTF_LOAD, can be retrieved.

With the btf blob, ``bpf_map_info``, and ``bpf_prog_info``, an introspection
tool has full btf knowledge and is able to pretty print map key/values, dump
func signatures and line info, along with byte/jit codes.

4. ELF File Format Interface
============================

4.1 .BTF section
----------------

The .BTF section contains type and string data. The format of this section is
the same as the one described in :ref:`BTF_Type_String`.

.. _BTF_Ext_Section:

4.2 .BTF.ext section
--------------------

The .BTF.ext section encodes func_info and line_info, which need loader
manipulation before loading into the kernel.

The specification for .BTF.ext section is defined at ``tools/lib/bpf/btf.h``
and ``tools/lib/bpf/btf.c``.

The current header of .BTF.ext section::

    struct btf_ext_header {
        __u16   magic;
        __u8    version;
        __u8    flags;
        __u32   hdr_len;

        /* All offsets are in bytes relative to the end of this header */
        __u32   func_info_off;
        __u32   func_info_len;
        __u32   line_info_off;
        __u32   line_info_len;
    };

It is very similar to the .BTF section. Instead of type/string sections, it
contains func_info and line_info sections. See :ref:`BPF_Prog_Load` for
details about the func_info and line_info record format.

The func_info is organized as below.::

     func_info_rec_size
     btf_ext_info_sec for section #1 /* func_info for section #1 */
     btf_ext_info_sec for section #2 /* func_info for section #2 */
     ...

``func_info_rec_size`` specifies the size of ``bpf_func_info`` structure when
.BTF.ext is generated. ``btf_ext_info_sec``, defined below, is a collection of
func_info for each specific ELF section.::

     struct btf_ext_info_sec {
        __u32   sec_name_off; /* offset to section name */
        __u32   num_info;
        /* Followed by num_info * record_size number of bytes */
        __u8    data[0];
     };

Here, num_info must be greater than 0.
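
Walking this layout can be sketched as follows. ``count_ext_records()`` is a
hypothetical helper written for this example, not libbpf's parser. It assumes
the buffer holds exactly one info area (the record size word followed by the
``btf_ext_info_sec`` entries) in the reader's native byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: count all records in a func_info (or line_info) area
 * laid out as described above. Returns -1 if the area is malformed.
 */
static int count_ext_records(const uint8_t *info, size_t len)
{
    uint32_t rec_size;
    uint32_t hdr[2]; /* sec_name_off, num_info */
    size_t pos = sizeof(rec_size);
    int total = 0;

    assert(info != NULL);
    if (len < sizeof(rec_size))
        return -1;
    memcpy(&rec_size, info, sizeof(rec_size));
    if (rec_size == 0)
        return -1;

    while (pos < len) {
        if (len - pos < sizeof(hdr))
            return -1;
        memcpy(hdr, info + pos, sizeof(hdr));
        pos += sizeof(hdr);

        if (hdr[1] == 0)
            return -1; /* num_info must be greater than 0 */
        if ((len - pos) / rec_size < hdr[1])
            return -1; /* records would run past the end */

        pos += (size_t)hdr[1] * rec_size;
        total += (int)hdr[1];
    }
    return total;
}
```

The division in the bounds check avoids overflowing ``num_info * rec_size``
on untrusted input.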

The line_info is organized as below.::

     line_info_rec_size
     btf_ext_info_sec for section #1 /* line_info for section #1 */
     btf_ext_info_sec for section #2 /* line_info for section #2 */
     ...

``line_info_rec_size`` specifies the size of ``bpf_line_info`` structure when
.BTF.ext is generated.

The interpretation of ``bpf_func_info->insn_off`` and
``bpf_line_info->insn_off`` is different between kernel API and ELF API. For
kernel API, the ``insn_off`` is the instruction offset in the unit of
``struct bpf_insn``. For ELF API, the ``insn_off`` is the byte offset from the
beginning of section (``btf_ext_info_sec->sec_name_off``).
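
The unit conversion between the two interpretations can be sketched as below.
``elf_to_kernel_insn_off()`` and ``BPF_INSN_SIZE`` are hypothetical names used
only here; the value 8 is the size in bytes of ``struct bpf_insn``. A loader
may additionally have to account for where the section's code ends up in the
final program, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BPF_INSN_SIZE 8 /* sizeof(struct bpf_insn) in bytes */

/*
 * Hypothetical helper: convert an ELF-side insn_off (byte offset within the
 * section) into a kernel-side insn_off (instruction index). Returns -1 if
 * the byte offset does not fall on an instruction boundary.
 */
static int elf_to_kernel_insn_off(uint32_t byte_off, uint32_t *insn_idx)
{
    assert(insn_idx != NULL);
    if (byte_off % BPF_INSN_SIZE != 0)
        return -1;
    *insn_idx = byte_off / BPF_INSN_SIZE;
    return 0;
}
```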

4.3 .BTF_ids section
--------------------

The .BTF_ids section encodes BTF ID values that are used within the kernel.

This section is created during the kernel compilation with the help of
macros defined in ``include/linux/btf_ids.h`` header file. Kernel code can
use them to create lists and sets (sorted lists) of BTF ID values.

The ``BTF_ID_LIST`` and ``BTF_ID`` macros define an unsorted list of BTF ID
values, with the following syntax::

  BTF_ID_LIST(list)
  BTF_ID(type1, name1)
  BTF_ID(type2, name2)

resulting in the following layout in the .BTF_ids section::

  __BTF_ID__type1__name1__1:
  .zero 4
  __BTF_ID__type2__name2__2:
  .zero 4

The ``u32 list[];`` variable is defined to access the list.

The ``BTF_ID_UNUSED`` macro defines 4 zero bytes. It's used when we
want to define an unused entry in BTF_ID_LIST, like::

      BTF_ID_LIST(bpf_skb_output_btf_ids)
      BTF_ID(struct, sk_buff)
      BTF_ID_UNUSED
      BTF_ID(struct, task_struct)

The ``BTF_SET_START/END`` macro pair defines a sorted list of BTF ID values
and their count, with the following syntax::

  BTF_SET_START(set)
  BTF_ID(type1, name1)
  BTF_ID(type2, name2)
  BTF_SET_END(set)

resulting in the following layout in the .BTF_ids section::

  __BTF_ID__set__set:
  .zero 4
  __BTF_ID__type1__name1__3:
  .zero 4
  __BTF_ID__type2__name2__4:
  .zero 4

The ``struct btf_id_set set;`` variable is defined to access the list.

The ``typeX`` name can be one of the following::

   struct, union, typedef, func

and is used as a filter when resolving the BTF ID value.

All the BTF ID lists and sets are compiled in the .BTF_ids section and
resolved during the linking phase of the kernel build by the
``resolve_btfids`` tool.

5. Using BTF
============

5.1 bpftool map pretty print
----------------------------

With BTF, the map key/value can be printed based on fields rather than simply
raw bytes. This is especially valuable for large structures or if your data
structure has bitfields. For example, for the following map,::

      enum A { A1, A2, A3, A4, A5 };
      typedef enum A ___A;
      struct tmp_t {
           char a1:4;
           int  a2:4;
           int  :4;
           __u32 a3:4;
           int b;
           ___A b1:4;
           enum A b2:4;
      };
      struct {
           __uint(type, BPF_MAP_TYPE_ARRAY);
           __type(key, int);
           __type(value, struct tmp_t);
           __uint(max_entries, 1);
      } tmpmap SEC(".maps");

bpftool is able to pretty print like below:
::

      [{
            "key": 0,
            "value": {
                "a1": 0x2,
                "a2": 0x4,
                "a3": 0x6,
                "b": 7,
                "b1": 0x8,
                "b2": 0xa
            }
        }
      ]

5.2 bpftool prog dump
---------------------

The following example shows how func_info and line_info improve a prog dump
with better kernel symbol names, function prototypes and line information::

    $ bpftool prog dump jited pinned /sys/fs/bpf/test_btf_haskv
    [...]
    int test_long_fname_2(struct dummy_tracepoint_args * arg):
    bpf_prog_44a040bf25481309_test_long_fname_2:
    ; static int test_long_fname_2(struct dummy_tracepoint_args *arg)
       0:   push   %rbp
       1:   mov    %rsp,%rbp
       4:   sub    $0x30,%rsp
       b:   sub    $0x28,%rbp
       f:   mov    %rbx,0x0(%rbp)
      13:   mov    %r13,0x8(%rbp)
      17:   mov    %r14,0x10(%rbp)
      1b:   mov    %r15,0x18(%rbp)
      1f:   xor    %eax,%eax
      21:   mov    %rax,0x20(%rbp)
      25:   xor    %esi,%esi
    ; int key = 0;
      27:   mov    %esi,-0x4(%rbp)
    ; if (!arg->sock)
      2a:   mov    0x8(%rdi),%rdi
    ; if (!arg->sock)
      2e:   cmp    $0x0,%rdi
      32:   je     0x0000000000000070
      34:   mov    %rbp,%rsi
    ; counts = bpf_map_lookup_elem(&btf_map, &key);
    [...]

5.3 Verifier Log
----------------

The following example shows how line_info can help debug a verification
failure::

       /* The code at tools/testing/selftests/bpf/test_xdp_noinline.c
        * is modified as below.
        */
       data = (void *)(long)xdp->data;
       data_end = (void *)(long)xdp->data_end;
       /*
       if (data + 4 > data_end)
               return XDP_DROP;
       */
       *(u32 *)data = dst->dst;

    $ bpftool prog load ./test_xdp_noinline.o /sys/fs/bpf/test_xdp_noinline type xdp
        ; data = (void *)(long)xdp->data;
        224: (79) r2 = *(u64 *)(r10 -112)
        225: (61) r2 = *(u32 *)(r2 +0)
        ; *(u32 *)data = dst->dst;
        226: (63) *(u32 *)(r2 +0) = r1
        invalid access to packet, off=0 size=4, R2(id=0,off=0,r=0)
        R2 offset is outside of the packet
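The failure above goes away once the commented-out bounds check is restored,
because the check proves that the 4-byte store stays inside the packet. The
following user-space sketch shows the same pattern on an ordinary buffer. It
is not a BPF program: ``example_store_dst()``, ``example_pkt`` and the
``EXAMPLE_*`` return codes are hypothetical stand-ins for the XDP program, the
packet data and ``XDP_DROP``/``XDP_PASS``.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_DROP 1	/* stand-in for XDP_DROP */
#define EXAMPLE_PASS 2	/* stand-in for XDP_PASS */

/* Buffer playing the role of the packet in this sketch. */
static uint8_t example_pkt[8];

/* Store dst at the start of the buffer, but only after checking that
 * all 4 bytes lie before data_end.
 */
static int example_store_dst(void *data, void *data_end, uint32_t dst)
{
	assert(data != NULL && data_end != NULL);

	/* The check that was commented out in the failing program. */
	if ((char *)data + 4 > (char *)data_end)
		return EXAMPLE_DROP;

	memcpy(data, &dst, sizeof(dst));
	return EXAMPLE_PASS;
}
```

With the check in place, a buffer shorter than 4 bytes is rejected before the
store is attempted.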

6. BTF Generation
=================

You need the latest pahole

  https://git.kernel.org/pub/scm/devel/pahole/pahole.git/

or llvm (8.0 or later). pahole acts as a dwarf2btf converter. It doesn't
support .BTF.ext and the BTF_KIND_FUNC type yet. For example::

      -bash-4.4$ cat t.c
      struct t {
        int a:2;
        int b:3;
        int c:2;
      } g;
      -bash-4.4$ gcc -c -O2 -g t.c
      -bash-4.4$ pahole -JV t.o
      File t.o:
      [1] STRUCT t kind_flag=1 size=4 vlen=3
              a type_id=2 bitfield_size=2 bits_offset=0
              b type_id=2 bitfield_size=3 bits_offset=2
              c type_id=2 bitfield_size=2 bits_offset=5
      [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
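The ``bitfield_size`` and ``bits_offset`` values printed above are enough to
pull a member out of the raw bytes of a value. The sketch below assumes a
little-endian target and the member offset encoding used when ``kind_flag`` is
set (bitfield size in the upper 8 bits, bit offset in the lower 24). All
``example_`` names and the sample byte are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Split a struct member offset word when kind_flag is set. */
static uint32_t example_member_bitfield_size(uint32_t offset_word)
{
	return offset_word >> 24;
}

static uint32_t example_member_bit_offset(uint32_t offset_word)
{
	return offset_word & 0xffffff;
}

/* Sample storage for struct t: bit patterns 01, 101 and 10 packed from
 * the least significant bit upwards give 0x55 in the first byte.
 */
static const uint8_t example_raw[4] = { 0x55, 0x00, 0x00, 0x00 };

/* Extract an unsigned bitfield of nr_bits starting at bits_offset from
 * little-endian storage. No sign extension is done.
 */
static uint64_t example_extract_bits_le(const uint8_t *data,
					uint32_t bits_offset, uint32_t nr_bits)
{
	uint32_t byte = bits_offset / 8;
	uint32_t shift = bits_offset % 8;
	uint32_t nbytes = (shift + nr_bits + 7) / 8;
	uint64_t word = 0;
	uint32_t i;

	assert(nr_bits >= 1 && nr_bits <= 32);

	/* Gather the bytes that cover the field, lowest byte first. */
	for (i = 0; i < nbytes; i++)
		word |= (uint64_t)data[byte + i] << (8 * i);

	return (word >> shift) & ((1ULL << nr_bits) - 1);
}
```

For the sample byte, the helper returns 1, 5 and 2 for ``a``, ``b`` and ``c``.
Since the members are signed, a caller that prints them as integers would
still need to sign-extend the result.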

llvm can generate .BTF and .BTF.ext directly with -g, for the bpf target
only. The assembly output (-S) shows the BTF encoding in assembly format::

    -bash-4.4$ cat t2.c
    typedef int __int32;
    struct t2 {
      int a2;
      int (*f2)(char q1, __int32 q2, ...);
      int (*f3)();
    } g2;
    int main() { return 0; }
    int test() { return 0; }
    -bash-4.4$ clang -c -g -O2 -target bpf t2.c
    -bash-4.4$ readelf -S t2.o
      ......
      [ 8] .BTF              PROGBITS         0000000000000000  00000247
           000000000000016e  0000000000000000           0     0     1
      [ 9] .BTF.ext          PROGBITS         0000000000000000  000003b5
           0000000000000060  0000000000000000           0     0     1
      [10] .rel.BTF.ext      REL              0000000000000000  000007e0
           0000000000000040  0000000000000010          16     9     8
      ......
    -bash-4.4$ clang -S -g -O2 -target bpf t2.c
    -bash-4.4$ cat t2.s
      ......
            .section        .BTF,"",@progbits
            .short  60319                   # 0xeb9f
            .byte   1
            .byte   0
            .long   24
            .long   0
            .long   220
            .long   220
            .long   122
            .long   0                       # BTF_KIND_FUNC_PROTO(id = 1)
            .long   218103808               # 0xd000000
            .long   2
            .long   83                      # BTF_KIND_INT(id = 2)
            .long   16777216                # 0x1000000
            .long   4
            .long   16777248                # 0x1000020
      ......
            .byte   0                       # string offset=0
            .ascii  ".text"                 # string offset=1
            .byte   0
            .ascii  "/home/yhs/tmp-pahole/t2.c" # string offset=7
            .byte   0
            .ascii  "int main() { return 0; }" # string offset=33
            .byte   0
            .ascii  "int test() { return 0; }" # string offset=58
            .byte   0
            .ascii  "int"                   # string offset=83
      ......
            .section        .BTF.ext,"",@progbits
            .short  60319                   # 0xeb9f
            .byte   1
            .byte   0
            .long   24
            .long   0
            .long   28
            .long   28
            .long   44
            .long   8                       # FuncInfo
            .long   1                       # FuncInfo section string offset=1
            .long   2
            .long   .Lfunc_begin0
            .long   3
            .long   .Lfunc_begin1
            .long   5
            .long   16                      # LineInfo
            .long   1                       # LineInfo section string offset=1
            .long   2
            .long   .Ltmp0
            .long   7
            .long   33
            .long   7182                    # Line 7 Col 14
            .long   .Ltmp3
            .long   7
            .long   58
            .long   8206                    # Line 8 Col 14

7. Testing
==========

The kernel BPF selftest `tools/testing/selftests/bpf/prog_tests/btf.c`_
provides an extensive set of BTF-related tests.

.. Links
.. _tools/testing/selftests/bpf/prog_tests/btf.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/tools/testing/selftests/bpf/prog_tests/btf.c