=====================
BPF Type Format (BTF)
=====================

1. Introduction
***************

BTF (BPF Type Format) is the metadata format which encodes the debug info
related to BPF programs and maps. The name BTF was used initially to describe
data types. BTF was later extended to include function info for defined
subroutines, and line info for source/line information.

The debug info is used for map pretty print, function signatures, etc. The
function signature enables better kernel symbols for bpf programs and
functions. The line info helps generate source-annotated translated byte
code, jited code and verifier log.

The BTF specification contains two parts,
  * BTF kernel API
  * BTF ELF file format

The kernel API is the contract between user space and kernel. The kernel
verifies the BTF info before using it. The ELF file format is a user space
contract between ELF file and libbpf loader.

The type and string sections are part of the BTF kernel API, describing the
debug info (mostly types related) referenced by the bpf program. These two
sections are discussed in detail in :ref:`BTF_Type_String`.

.. _BTF_Type_String:

2. BTF Type and String Encoding
*******************************

The file ``include/uapi/linux/btf.h`` provides a high-level definition of how
types/strings are encoded.

The beginning of the data blob must be::

    struct btf_header {
        __u16   magic;
        __u8    version;
        __u8    flags;
        __u32   hdr_len;

        /* All offsets are in bytes relative to the end of this header */
        __u32   type_off;       /* offset of type section       */
        __u32   type_len;       /* length of type section       */
        __u32   str_off;        /* offset of string section     */
        __u32   str_len;        /* length of string section     */
    };

The magic is ``0xeB9F``, which has different encoding for big and little
endian systems, and can be used to test whether BTF is generated for a big-
or little-endian target. The ``btf_header`` is designed to be extensible with
``hdr_len`` equal to ``sizeof(struct btf_header)`` when a data blob is
generated.
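
As a rough sketch of how a reader of the blob might use the magic and the
header offsets, consider the helpers below. The names ``btf_check_magic``,
``btf_section_start`` and ``struct btf_header_sketch`` are illustrative
assumptions and are not part of the kernel or libbpf API::

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define BTF_MAGIC 0xeB9F

    /* Local mirror of struct btf_header, for illustration only. */
    struct btf_header_sketch {
        uint16_t magic;
        uint8_t  version;
        uint8_t  flags;
        uint32_t hdr_len;
        uint32_t type_off;
        uint32_t type_len;
        uint32_t str_off;
        uint32_t str_len;
    };

    /*
     * Returns 1 if the blob matches the host byte order, -1 if it was
     * generated for the opposite byte order, and 0 if it is not BTF.
     */
    static int btf_check_magic(const void *blob, size_t len)
    {
        uint16_t magic;

        if (len < sizeof(magic))
            return 0;
        memcpy(&magic, blob, sizeof(magic));
        if (magic == BTF_MAGIC)
            return 1;
        if (magic == (uint16_t)((BTF_MAGIC >> 8) | (BTF_MAGIC << 8)))
            return -1;
        return 0;
    }

    /* Section offsets are relative to the end of the header. */
    static uint32_t btf_section_start(const struct btf_header_sketch *hdr,
                                      uint32_t section_off)
    {
        return hdr->hdr_len + section_off;
    }

For instance, with ``hdr_len = 24`` and ``str_off = 220``, the string section
would start at byte 244 of the blob.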

2.1 String Encoding
===================

The first string in the string section must be a null string. The rest of
the string table is a concatenation of other null-terminated strings.
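
A name offset such as ``name_off`` can then be resolved with a bounds-checked
lookup. The following is only a sketch; the helper name ``btf_str_at`` is
hypothetical::

    #include <stddef.h>
    #include <string.h>

    /*
     * Return the string at byte offset 'off' in a string section of
     * 'str_len' bytes, or NULL if the offset is out of range or the
     * string is not null-terminated inside the section.
     */
    static const char *btf_str_at(const char *strs, size_t str_len,
                                  size_t off)
    {
        if (off >= str_len)
            return NULL;
        if (memchr(strs + off, '\0', str_len - off) == NULL)
            return NULL;
        return strs + off;
    }

Offset ``0`` always yields the empty string, because the first string in the
section is the null string.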

2.2 Type Encoding
=================

The type id ``0`` is reserved for ``void`` type. The type section is parsed
sequentially and type id is assigned to each recognized type starting from id
``1``. Currently, the following types are supported::

    #define BTF_KIND_INT            1       /* Integer      */
    #define BTF_KIND_PTR            2       /* Pointer      */
    #define BTF_KIND_ARRAY          3       /* Array        */
    #define BTF_KIND_STRUCT         4       /* Struct       */
    #define BTF_KIND_UNION          5       /* Union        */
    #define BTF_KIND_ENUM           6       /* Enumeration  */
    #define BTF_KIND_FWD            7       /* Forward      */
    #define BTF_KIND_TYPEDEF        8       /* Typedef      */
    #define BTF_KIND_VOLATILE       9       /* Volatile     */
    #define BTF_KIND_CONST          10      /* Const        */
    #define BTF_KIND_RESTRICT       11      /* Restrict     */
    #define BTF_KIND_FUNC           12      /* Function     */
    #define BTF_KIND_FUNC_PROTO     13      /* Function Proto       */
    #define BTF_KIND_VAR            14      /* Variable     */
    #define BTF_KIND_DATASEC        15      /* Section      */

Note that the type section encodes debug info, not just pure types.
``BTF_KIND_FUNC`` is not a type, and it represents a defined subprogram.

Each type contains the following common data::

    struct btf_type {
        __u32 name_off;
        /* "info" bits arrangement
         * bits  0-15: vlen (e.g. # of struct's members)
         * bits 16-23: unused
         * bits 24-27: kind (e.g. int, ptr, array...etc)
         * bits 28-30: unused
         * bit     31: kind_flag, currently used by
         *             struct, union and fwd
         */
        __u32 info;
        /* "size" is used by INT, ENUM, STRUCT and UNION.
         * "size" tells the size of the type it is describing.
         *
         * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
         * FUNC and FUNC_PROTO.
         * "type" is a type_id referring to another type.
         */
        union {
                __u32 size;
                __u32 type;
        };
    };

For certain kinds, the common data are followed by kind-specific data. The
``name_off`` in ``struct btf_type`` specifies the offset in the string table.
The following sections detail encoding of each kind.
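
The bit arrangement of ``info`` described in the comment above can be decoded
as sketched below. The helper names are illustrative assumptions; only the bit
positions come from the layout shown::

    #include <stdint.h>

    /* bits 24-27: kind */
    static uint32_t btf_info_kind(uint32_t info)
    {
        return (info >> 24) & 0x0f;
    }

    /* bits 0-15: vlen */
    static uint32_t btf_info_vlen(uint32_t info)
    {
        return info & 0xffff;
    }

    /* bit 31: kind_flag */
    static uint32_t btf_info_kind_flag(uint32_t info)
    {
        return info >> 31;
    }

For example, ``info = 0x84000003`` decodes to kind ``4`` (BTF_KIND_STRUCT),
``vlen = 3`` and ``kind_flag = 1``.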

2.2.1 BTF_KIND_INT
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
 * ``name_off``: any valid offset
 * ``info.kind_flag``: 0
 * ``info.kind``: BTF_KIND_INT
 * ``info.vlen``: 0
 * ``size``: the size of the int type in bytes.

``btf_type`` is followed by a ``u32`` with the following bits arrangement::

    #define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
    #define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
    #define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)

The ``BTF_INT_ENCODING`` has the following attributes::

    #define BTF_INT_SIGNED  (1 << 0)
    #define BTF_INT_CHAR    (1 << 1)
    #define BTF_INT_BOOL    (1 << 2)

The ``BTF_INT_ENCODING()`` provides extra information: signedness, char, or
bool, for the int type. The char and bool encoding are mostly useful for
pretty print. At most one encoding can be specified for the int type.

The ``BTF_INT_BITS()`` specifies the number of actual bits held by this int
type. For example, a 4-bit bitfield encodes ``BTF_INT_BITS()`` equal to 4.
The ``btf_type.size * 8`` must be equal to or greater than ``BTF_INT_BITS()``
for the type. The maximum value of ``BTF_INT_BITS()`` is 128.

The ``BTF_INT_OFFSET()`` specifies the starting bit offset to calculate values
for this int. For example, a bitfield struct member has:

 * btf member bit offset 100 from the start of the structure,
 * btf member pointing to an int type,
 * the int type has ``BTF_INT_OFFSET() = 2`` and ``BTF_INT_BITS() = 4``

Then in the struct memory layout, this member will occupy ``4`` bits starting
from bits ``100 + 2 = 102``.

Alternatively, the bitfield struct member can be the following to access the
same bits as the above:

 * btf member bit offset 102,
 * btf member pointing to an int type,
 * the int type has ``BTF_INT_OFFSET() = 0`` and ``BTF_INT_BITS() = 4``

The original intention of ``BTF_INT_OFFSET()`` is to provide flexibility of
bitfield encoding. Currently, both llvm and pahole generate
``BTF_INT_OFFSET() = 0`` for all int types.
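
To make the packing concrete, here is a small sketch that builds and checks
the ``u32`` following ``btf_type``. The macros are the ones listed in this
section; ``btf_int_encode`` and ``btf_int_bits_ok`` are hypothetical helpers
that cover only the two size constraints stated above, not every check the
kernel performs::

    #include <stdint.h>

    #define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
    #define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
    #define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)

    #define BTF_INT_SIGNED  (1 << 0)
    #define BTF_INT_CHAR    (1 << 1)
    #define BTF_INT_BOOL    (1 << 2)

    /* Pack encoding, bit offset and bit count into one u32. */
    static uint32_t btf_int_encode(uint32_t encoding, uint32_t offset,
                                   uint32_t bits)
    {
        return (encoding << 24) | (offset << 16) | bits;
    }

    /* BTF_INT_BITS() must fit in btf_type.size * 8 and not exceed 128. */
    static int btf_int_bits_ok(uint32_t size_bytes, uint32_t int_data)
    {
        uint32_t bits = BTF_INT_BITS(int_data);

        return bits <= 128 && bits <= size_bytes * 8;
    }

A signed 32-bit ``int`` with no bit offset is therefore encoded as
``btf_int_encode(BTF_INT_SIGNED, 0, 32)``, which is ``0x1000020``.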

2.2.2 BTF_KIND_PTR
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_PTR
  * ``info.vlen``: 0
  * ``type``: the pointee type of the pointer

No additional type data follow ``btf_type``.

2.2.3 BTF_KIND_ARRAY
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_ARRAY
  * ``info.vlen``: 0
  * ``size/type``: 0, not used

``btf_type`` is followed by one ``struct btf_array``::

    struct btf_array {
        __u32   type;
        __u32   index_type;
        __u32   nelems;
    };

The ``struct btf_array`` encoding:
  * ``type``: the element type
  * ``index_type``: the index type
  * ``nelems``: the number of elements for this array (``0`` is also allowed).

The ``index_type`` can be any regular int type (``u8``, ``u16``, ``u32``,
``u64``, ``unsigned __int128``). The original design of including
``index_type`` follows DWARF, which has an ``index_type`` for its array type.
Currently in BTF, beyond type verification, the ``index_type`` is not used.

The ``struct btf_array`` allows chaining through element type to represent
multidimensional arrays. For example, for ``int a[5][6]``, the following type
information illustrates the chaining:

  * [1]: int
  * [2]: array, ``btf_array.type = [1]``, ``btf_array.nelems = 6``
  * [3]: array, ``btf_array.type = [2]``, ``btf_array.nelems = 5``

Currently, both pahole and llvm collapse multidimensional array into
one-dimensional array, e.g., for ``a[5][6]``, the ``btf_array.nelems`` is
equal to ``30``. This is because the original use case is map pretty print
where the whole array is dumped out so one-dimensional array is enough. As
more BTF usage is explored, pahole and llvm can be changed to generate proper
chained representation for multidimensional arrays.
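
The relationship between the chained and the collapsed form can be sketched
as follows. The ``sketch_type`` table is a simplified, hypothetical in-memory
model indexed by type id, not the on-disk BTF layout::

    #include <stdint.h>

    enum sketch_kind { SK_INT, SK_ARRAY };

    /* Simplified type entry; elem_type and nelems mirror btf_array. */
    struct sketch_type {
        enum sketch_kind kind;
        uint32_t elem_type;     /* btf_array.type, for SK_ARRAY only   */
        uint32_t nelems;        /* btf_array.nelems, for SK_ARRAY only */
    };

    /* Multiply nelems along the element-type chain starting at 'id'. */
    static uint64_t sketch_total_elems(const struct sketch_type *types,
                                       uint32_t id)
    {
        uint64_t total = 1;

        while (types[id].kind == SK_ARRAY) {
            total *= types[id].nelems;
            id = types[id].elem_type;
        }
        return total;
    }

Walking the chain ``[3] -> [2] -> [1]`` from the example above multiplies
``5 * 6`` and gives the same ``30`` elements that the collapsed
one-dimensional encoding stores directly in ``nelems``.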

2.2.4 BTF_KIND_STRUCT
~~~~~~~~~~~~~~~~~~~~~
2.2.5 BTF_KIND_UNION
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0 or offset to a valid C identifier
  * ``info.kind_flag``: 0 or 1
  * ``info.kind``: BTF_KIND_STRUCT or BTF_KIND_UNION
  * ``info.vlen``: the number of struct/union members
  * ``size``: the size of the struct/union in bytes

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_member``::

    struct btf_member {
        __u32   name_off;
        __u32   type;
        __u32   offset;
    };

``struct btf_member`` encoding:
  * ``name_off``: offset to a valid C identifier
  * ``type``: the member type
  * ``offset``: <see below>

If the type info ``kind_flag`` is not set, the offset contains only bit offset
of the member. Note that the base type of the bitfield can only be int or enum
type. If the bitfield size is 32, the base type can be either int or enum
type. If the bitfield size is not 32, the base type must be int, and int type
``BTF_INT_BITS()`` encodes the bitfield size.

If the ``kind_flag`` is set, the ``btf_member.offset`` contains both member
bitfield size and bit offset. The bitfield size and bit offset are calculated
as below::

    #define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
    #define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)

In this case, if the base type is an int type, it must be a regular int type:

  * ``BTF_INT_OFFSET()`` must be 0.
  * ``BTF_INT_BITS()`` must be equal to ``{1,2,4,8,16} * 8``.
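
The two interpretations of ``btf_member.offset`` can be sketched as below.
The macros are the ones shown above; ``member_bit_offset`` and
``member_bitfield_size`` are hypothetical helper names::

    #include <stdint.h>

    #define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
    #define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)

    /* Bit offset of the member from the start of the struct/union. */
    static uint32_t member_bit_offset(int kind_flag, uint32_t offset)
    {
        return kind_flag ? BTF_MEMBER_BIT_OFFSET(offset) : offset;
    }

    /*
     * Bitfield size carried in the member itself. Without kind_flag the
     * member carries no size (0 is returned here) and the size comes
     * from BTF_INT_BITS() of the member's int type instead.
     */
    static uint32_t member_bitfield_size(int kind_flag, uint32_t offset)
    {
        return kind_flag ? BTF_MEMBER_BITFIELD_SIZE(offset) : 0;
    }

With ``kind_flag`` set, an ``offset`` of ``0x03000002`` describes a 3-bit
bitfield starting at bit 2.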

The following kernel patch introduced ``kind_flag`` and explained why both
modes exist:

  https://github.com/torvalds/linux/commit/9d5f9f701b1891466fb3dbb1806ad97716f95cc3#diff-fa650a64fdd3968396883d2fe8215ff3

2.2.6 BTF_KIND_ENUM
~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0 or offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_ENUM
  * ``info.vlen``: number of enum values
  * ``size``: 4

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum``::

    struct btf_enum {
        __u32   name_off;
        __s32   val;
    };

The ``btf_enum`` encoding:
  * ``name_off``: offset to a valid C identifier
  * ``val``: any value

2.2.7 BTF_KIND_FWD
~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0 for struct, 1 for union
  * ``info.kind``: BTF_KIND_FWD
  * ``info.vlen``: 0
  * ``type``: 0

No additional type data follow ``btf_type``.

2.2.8 BTF_KIND_TYPEDEF
~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_TYPEDEF
  * ``info.vlen``: 0
  * ``type``: the type which can be referred by name at ``name_off``

No additional type data follow ``btf_type``.

2.2.9 BTF_KIND_VOLATILE
~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_VOLATILE
  * ``info.vlen``: 0
  * ``type``: the type with ``volatile`` qualifier

No additional type data follow ``btf_type``.

2.2.10 BTF_KIND_CONST
~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_CONST
  * ``info.vlen``: 0
  * ``type``: the type with ``const`` qualifier

No additional type data follow ``btf_type``.

2.2.11 BTF_KIND_RESTRICT
~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_RESTRICT
  * ``info.vlen``: 0
  * ``type``: the type with ``restrict`` qualifier

No additional type data follow ``btf_type``.

2.2.12 BTF_KIND_FUNC
~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_FUNC
  * ``info.vlen``: 0
  * ``type``: a BTF_KIND_FUNC_PROTO type

No additional type data follow ``btf_type``.

A BTF_KIND_FUNC defines not a type, but a subprogram (function) whose
signature is defined by ``type``. The subprogram is thus an instance of that
type. The BTF_KIND_FUNC may in turn be referenced by a func_info in the
:ref:`BTF_Ext_Section` (ELF) or in the arguments to :ref:`BPF_Prog_Load`
(ABI).

2.2.13 BTF_KIND_FUNC_PROTO
~~~~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: 0
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_FUNC_PROTO
  * ``info.vlen``: # of parameters
  * ``type``: the return type

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_param``::

    struct btf_param {
        __u32   name_off;
        __u32   type;
    };

If a BTF_KIND_FUNC_PROTO type is referred to by a BTF_KIND_FUNC type, then
``btf_param.name_off`` must point to a valid C identifier except for the
possible last argument representing the variable argument. The
``btf_param.type`` refers to the parameter type.

If the function has variable arguments, the last parameter is encoded with
``name_off = 0`` and ``type = 0``.

2.2.14 BTF_KIND_VAR
~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid C identifier
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_VAR
  * ``info.vlen``: 0
  * ``type``: the type of the variable

``btf_type`` is followed by a single ``struct btf_var`` with the
following data::

    struct btf_var {
        __u32   linkage;
    };

``struct btf_var`` encoding:
  * ``linkage``: currently either 0 (static variable) or 1 (globally
    allocated variable in ELF sections)

Not all types of global variables are supported by LLVM at this point.
The following is currently available:

  * static variables with or without section attributes
  * global variables with section attributes

The latter is for future extraction of map key/value type id's from a
map definition.

2.2.15 BTF_KIND_DATASEC
~~~~~~~~~~~~~~~~~~~~~~~

``struct btf_type`` encoding requirement:
  * ``name_off``: offset to a valid name associated with a variable or
    one of .data/.bss/.rodata
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_DATASEC
  * ``info.vlen``: # of variables
  * ``size``: total section size in bytes (0 at compilation time, patched
    to actual size by BPF loaders such as libbpf)

``btf_type`` is followed by ``info.vlen`` number of ``struct btf_var_secinfo``::

    struct btf_var_secinfo {
        __u32   type;
        __u32   offset;
        __u32   size;
    };

``struct btf_var_secinfo`` encoding:
  * ``type``: the type of the BTF_KIND_VAR variable
  * ``offset``: the in-section offset of the variable
  * ``size``: the size of the variable in bytes
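
Taken together, the kind-specific layouts in sections 2.2.1 through 2.2.15
determine how many bytes follow each ``struct btf_type``, which is what a
parser needs in order to walk the type section sequentially. The sketch below
summarizes them; ``btf_kind_extra_bytes`` is a hypothetical helper, and the
byte counts are simply the sizes of the structures shown above::

    #include <stdint.h>

    #define BTF_KIND_INT            1
    #define BTF_KIND_PTR            2
    #define BTF_KIND_ARRAY          3
    #define BTF_KIND_STRUCT         4
    #define BTF_KIND_UNION          5
    #define BTF_KIND_ENUM           6
    #define BTF_KIND_FWD            7
    #define BTF_KIND_TYPEDEF        8
    #define BTF_KIND_VOLATILE       9
    #define BTF_KIND_CONST          10
    #define BTF_KIND_RESTRICT       11
    #define BTF_KIND_FUNC           12
    #define BTF_KIND_FUNC_PROTO     13
    #define BTF_KIND_VAR            14
    #define BTF_KIND_DATASEC        15

    /*
     * Bytes of kind-specific data following struct btf_type,
     * or -1 for a kind not described in this document.
     */
    static int64_t btf_kind_extra_bytes(uint32_t kind, uint32_t vlen)
    {
        switch (kind) {
        case BTF_KIND_INT:
            return 4;                   /* one u32 */
        case BTF_KIND_ARRAY:
            return 12;                  /* struct btf_array */
        case BTF_KIND_STRUCT:
        case BTF_KIND_UNION:
            return (int64_t)vlen * 12;  /* struct btf_member */
        case BTF_KIND_ENUM:
            return (int64_t)vlen * 8;   /* struct btf_enum */
        case BTF_KIND_FUNC_PROTO:
            return (int64_t)vlen * 8;   /* struct btf_param */
        case BTF_KIND_VAR:
            return 4;                   /* struct btf_var */
        case BTF_KIND_DATASEC:
            return (int64_t)vlen * 12;  /* struct btf_var_secinfo */
        case BTF_KIND_PTR:
        case BTF_KIND_FWD:
        case BTF_KIND_TYPEDEF:
        case BTF_KIND_VOLATILE:
        case BTF_KIND_CONST:
        case BTF_KIND_RESTRICT:
        case BTF_KIND_FUNC:
            return 0;                   /* no additional data */
        default:
            return -1;
        }
    }

The next type record would then start ``sizeof(struct btf_type)`` (12 bytes)
plus this value after the start of the current one.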

3. BTF Kernel API
*****************

The following bpf syscall commands involve BTF:
   * BPF_BTF_LOAD: load a blob of BTF data into kernel
   * BPF_MAP_CREATE: map creation with btf key and value type info.
   * BPF_PROG_LOAD: prog load with btf function and line info.
   * BPF_BTF_GET_FD_BY_ID: get a btf fd
   * BPF_OBJ_GET_INFO_BY_FD: btf, func_info, line_info
     and other btf related info are returned.

The workflow typically looks like:
::

  Application:
      BPF_BTF_LOAD
          |
          v
      BPF_MAP_CREATE and BPF_PROG_LOAD
          |
          V
      ......

  Introspection tool:
      ......
      BPF_{PROG,MAP}_GET_NEXT_ID  (get prog/map id's)
          |
          V
      BPF_{PROG,MAP}_GET_FD_BY_ID (get a prog/map fd)
          |
          V
      BPF_OBJ_GET_INFO_BY_FD      (get bpf_prog_info/bpf_map_info with btf_id)
          |                                     |
          V                                     |
      BPF_BTF_GET_FD_BY_ID        (get btf_fd)  |
          |                                     |
          V                                     |
      BPF_OBJ_GET_INFO_BY_FD      (get btf)     |
          |                                     |
          V                                     V
      pretty print types, dump func signatures and line info, etc.


3.1 BPF_BTF_LOAD
================

Load a blob of BTF data into kernel. A blob of data, described in
:ref:`BTF_Type_String`, can be directly loaded into the kernel. A ``btf_fd``
is returned to userspace.

3.2 BPF_MAP_CREATE
==================

A map can be created with ``btf_fd`` and specified key/value type id::

    __u32   btf_fd;         /* fd pointing to a BTF type data */
    __u32   btf_key_type_id;        /* BTF type_id of the key */
    __u32   btf_value_type_id;      /* BTF type_id of the value */

In libbpf, the map can be defined with extra annotation like below:
::

    struct bpf_map_def SEC("maps") btf_map = {
        .type = BPF_MAP_TYPE_ARRAY,
        .key_size = sizeof(int),
        .value_size = sizeof(struct ipv_counts),
        .max_entries = 4,
    };
    BPF_ANNOTATE_KV_PAIR(btf_map, int, struct ipv_counts);

Here, the parameters for macro BPF_ANNOTATE_KV_PAIR are map name, key and
value types for the map. During ELF parsing, libbpf is able to extract
key/value type_id's and assign them to BPF_MAP_CREATE attributes
automatically.

.. _BPF_Prog_Load:

3.3 BPF_PROG_LOAD
=================

During prog_load, func_info and line_info can be passed to kernel with proper
values for the following attributes:
::

    __u32           insn_cnt;
    __aligned_u64   insns;
    ......
    __u32           prog_btf_fd;    /* fd pointing to BTF type data */
    __u32           func_info_rec_size;     /* userspace bpf_func_info size */
    __aligned_u64   func_info;      /* func info */
    __u32           func_info_cnt;  /* number of bpf_func_info records */
    __u32           line_info_rec_size;     /* userspace bpf_line_info size */
    __aligned_u64   line_info;      /* line info */
    __u32           line_info_cnt;  /* number of bpf_line_info records */

The func_info and line_info are each an array of the following records,
respectively::

    struct bpf_func_info {
        __u32   insn_off; /* [0, insn_cnt - 1] */
        __u32   type_id;  /* pointing to a BTF_KIND_FUNC type */
    };
    struct bpf_line_info {
        __u32   insn_off; /* [0, insn_cnt - 1] */
        __u32   file_name_off; /* offset to string table for the filename */
        __u32   line_off; /* offset to string table for the source line */
        __u32   line_col; /* line number and column number */
    };

func_info_rec_size is the size of each func_info record, and
line_info_rec_size is the size of each line_info record. Passing the record
size to kernel makes it possible to extend the record itself in the future.

Below are requirements for func_info:
  * func_info[0].insn_off must be 0.
  * the func_info insn_off is in strictly increasing order and matches
    bpf func boundaries.

Below are requirements for line_info:
  * the first insn in each func must have a line_info record pointing to it.
  * the line_info insn_off is in strictly increasing order.
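
A loader could pre-check the ordering requirements on func_info before
calling BPF_PROG_LOAD. The sketch below is only illustrative:
``func_info_order_ok`` and ``struct bpf_func_info_sketch`` are hypothetical
names, and the check does not verify that offsets match actual bpf func
boundaries, which requires inspecting the instructions::

    #include <stddef.h>
    #include <stdint.h>

    /* Local mirror of struct bpf_func_info, for illustration only. */
    struct bpf_func_info_sketch {
        uint32_t insn_off;
        uint32_t type_id;
    };

    /*
     * Returns 1 if the records start at insn_off 0, stay within
     * [0, insn_cnt - 1] and are strictly increasing; 0 otherwise.
     * An empty array has nothing to check and is accepted.
     */
    static int func_info_order_ok(const struct bpf_func_info_sketch *fi,
                                  size_t cnt, uint32_t insn_cnt)
    {
        size_t i;

        if (cnt == 0)
            return 1;
        if (fi[0].insn_off != 0)
            return 0;
        for (i = 0; i < cnt; i++) {
            if (fi[i].insn_off >= insn_cnt)
                return 0;
            if (i > 0 && fi[i].insn_off <= fi[i - 1].insn_off)
                return 0;
        }
        return 1;
    }

The same strictly-increasing loop applies to line_info ``insn_off`` values,
without the requirement that the first record be at offset 0 of the program.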

For line_info, the line number and column number are defined as below:
::

    #define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
    #define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)
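
As a worked example of this packing, the sketch below decodes and re-encodes
a ``line_col`` value. The two macros are the ones defined above;
``bpf_line_col_encode`` is a hypothetical helper::

    #include <stdint.h>

    #define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
    #define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)

    /* Upper 22 bits hold the line number, lower 10 bits the column. */
    static uint32_t bpf_line_col_encode(uint32_t line, uint32_t col)
    {
        return (line << 10) | (col & 0x3ff);
    }

For instance, ``line_col = 7182`` decodes to line 7, column 14, because
``7 * 1024 + 14 = 7182``.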

3.4 BPF_{PROG,MAP}_GET_NEXT_ID
==============================

In kernel, every loaded program, map or btf has a unique id. The id won't
change during the lifetime of a program, map, or btf.

The bpf syscall command BPF_{PROG,MAP}_GET_NEXT_ID returns the id's to user
space, one per command invocation, for bpf programs or maps, respectively, so
an inspection tool can iterate over all programs and maps.

3.5 BPF_{PROG,MAP}_GET_FD_BY_ID
===============================

An introspection tool cannot use an id directly to get details about a
program or map. A file descriptor needs to be obtained first for
reference-counting purposes.

3.6 BPF_OBJ_GET_INFO_BY_FD
==========================

Once a program/map fd is acquired, an introspection tool can get the detailed
information from kernel about this fd, some of which is BTF-related. For
example, ``bpf_map_info`` returns ``btf_id`` and key/value type ids.
``bpf_prog_info`` returns ``btf_id``, func_info, and line info for translated
bpf byte codes, and jited_line_info.

3.7 BPF_BTF_GET_FD_BY_ID
========================

With ``btf_id`` obtained in ``bpf_map_info`` and ``bpf_prog_info``, bpf
syscall command BPF_BTF_GET_FD_BY_ID can retrieve a btf fd. Then, with
command BPF_OBJ_GET_INFO_BY_FD, the btf blob, originally loaded into the
kernel with BPF_BTF_LOAD, can be retrieved.

With the btf blob, ``bpf_map_info``, and ``bpf_prog_info``, an introspection
tool has full btf knowledge and is able to pretty print map key/values, dump
func signatures and line info, along with byte/jit codes.
YS
616
6174. ELF File Format Interface
618****************************
619
6204.1 .BTF section
621================
622
9ab5305d
AN
623The .BTF section contains type and string data. The format of this section is
624same as the one describe in :ref:`BTF_Type_String`.
ffcf7ce9
YS
625
626.. _BTF_Ext_Section:
627
6284.2 .BTF.ext section
629====================
630
9ab5305d
AN
631The .BTF.ext section encodes func_info and line_info which needs loader
632manipulation before loading into the kernel.
ffcf7ce9 633
9ab5305d
AN
634The specification for .BTF.ext section is defined at ``tools/lib/bpf/btf.h``
635and ``tools/lib/bpf/btf.c``.
ffcf7ce9
YS
636
637The current header of .BTF.ext section::
638
639 struct btf_ext_header {
640 __u16 magic;
641 __u8 version;
642 __u8 flags;
643 __u32 hdr_len;
644
645 /* All offsets are in bytes relative to the end of this header */
646 __u32 func_info_off;
647 __u32 func_info_len;
648 __u32 line_info_off;
649 __u32 line_info_len;
650 };
651
9ab5305d
AN
652It is very similar to .BTF section. Instead of type/string section, it
653contains func_info and line_info section. See :ref:`BPF_Prog_Load` for details
654about func_info and line_info record format.

The func_info is organized as below::

     func_info_rec_size
     btf_ext_info_sec for section #1 /* func_info for section #1 */
     btf_ext_info_sec for section #2 /* func_info for section #2 */
     ...

``func_info_rec_size`` specifies the size of ``bpf_func_info`` structure when
.BTF.ext is generated. ``btf_ext_info_sec``, defined below, is a collection of
func_info for each specific ELF section::

     struct btf_ext_info_sec {
        __u32   sec_name_off; /* offset to section name */
        __u32   num_info;
        /* Followed by num_info * record_size number of bytes */
        __u8    data[0];
     };

Here, num_info must be greater than 0.

The line_info is organized as below::

     line_info_rec_size
     btf_ext_info_sec for section #1 /* line_info for section #1 */
     btf_ext_info_sec for section #2 /* line_info for section #2 */
     ...

``line_info_rec_size`` specifies the size of ``bpf_line_info`` structure when
.BTF.ext is generated.

The interpretation of ``bpf_func_info->insn_off`` and
``bpf_line_info->insn_off`` is different between kernel API and ELF API. For
kernel API, the ``insn_off`` is the instruction offset in the unit of ``struct
bpf_insn``. For ELF API, the ``insn_off`` is the byte offset from the
beginning of section (``btf_ext_info_sec->sec_name_off``).
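
The unit difference can be illustrated with a small conversion sketch. It
assumes the 8-byte size of ``struct bpf_insn`` and uses the hypothetical
helper name ``elf_to_insn_units``; a real loader additionally has to account
for where each section's code ends up in the final program::

    #include <stdint.h>

    #define BPF_INSN_SIZE 8     /* sizeof(struct bpf_insn) */

    /*
     * Convert an ELF-style byte offset within a section into an
     * offset counted in struct bpf_insn units. Returns -1 if the
     * byte offset does not fall on an instruction boundary.
     */
    static int64_t elf_to_insn_units(uint32_t byte_off)
    {
        if (byte_off % BPF_INSN_SIZE)
            return -1;
        return byte_off / BPF_INSN_SIZE;
    }

A byte offset of 16 in the ELF section thus corresponds to instruction
offset 2 within that section.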

5. Using BTF
************

5.1 bpftool map pretty print
============================

With BTF, the map key/value can be printed based on fields rather than simply
raw bytes. This is especially valuable for large structures or if your data
structure has bitfields. For example, for the following map::

      enum A { A1, A2, A3, A4, A5 };
      typedef enum A ___A;
      struct tmp_t {
           char a1:4;
           int  a2:4;
           int  :4;
           __u32 a3:4;
           int b;
           ___A b1:4;
           enum A b2:4;
      };
      struct bpf_map_def SEC("maps") tmpmap = {
           .type = BPF_MAP_TYPE_ARRAY,
           .key_size = sizeof(__u32),
           .value_size = sizeof(struct tmp_t),
           .max_entries = 1,
      };
      BPF_ANNOTATE_KV_PAIR(tmpmap, int, struct tmp_t);

bpftool is able to pretty print like below:
::

      [{
            "key": 0,
            "value": {
                "a1": 0x2,
                "a2": 0x4,
                "a3": 0x6,
                "b": 7,
                "b1": 0x8,
                "b2": 0xa
            }
        }
      ]

5.2 bpftool prog dump
=====================

The following is an example showing how func_info and line_info can help prog
dump with better kernel symbol names, function prototypes and line
information::

    $ bpftool prog dump jited pinned /sys/fs/bpf/test_btf_haskv
    [...]
    int test_long_fname_2(struct dummy_tracepoint_args * arg):
    bpf_prog_44a040bf25481309_test_long_fname_2:
    ; static int test_long_fname_2(struct dummy_tracepoint_args *arg)
       0:   push   %rbp
       1:   mov    %rsp,%rbp
       4:   sub    $0x30,%rsp
       b:   sub    $0x28,%rbp
       f:   mov    %rbx,0x0(%rbp)
      13:   mov    %r13,0x8(%rbp)
      17:   mov    %r14,0x10(%rbp)
      1b:   mov    %r15,0x18(%rbp)
      1f:   xor    %eax,%eax
      21:   mov    %rax,0x20(%rbp)
      25:   xor    %esi,%esi
    ; int key = 0;
      27:   mov    %esi,-0x4(%rbp)
    ; if (!arg->sock)
      2a:   mov    0x8(%rdi),%rdi
    ; if (!arg->sock)
      2e:   cmp    $0x0,%rdi
      32:   je     0x0000000000000070
      34:   mov    %rbp,%rsi
    ; counts = bpf_map_lookup_elem(&btf_map, &key);
    [...]

5.3 Verifier Log
================

The following is an example of how line_info can help debugging verification
failure::

       /* The code at tools/testing/selftests/bpf/test_xdp_noinline.c
        * is modified as below.
        */
       data = (void *)(long)xdp->data;
       data_end = (void *)(long)xdp->data_end;
       /*
       if (data + 4 > data_end)
               return XDP_DROP;
       */
       *(u32 *)data = dst->dst;

    $ bpftool prog load ./test_xdp_noinline.o /sys/fs/bpf/test_xdp_noinline type xdp
        ; data = (void *)(long)xdp->data;
        224: (79) r2 = *(u64 *)(r10 -112)
        225: (61) r2 = *(u32 *)(r2 +0)
        ; *(u32 *)data = dst->dst;
        226: (63) *(u32 *)(r2 +0) = r1
        invalid access to packet, off=0 size=4, R2(id=0,off=0,r=0)
        R2 offset is outside of the packet

6. BTF Generation
*****************

You need the latest pahole

  https://git.kernel.org/pub/scm/devel/pahole/pahole.git/

or llvm (8.0 or later). pahole acts as a dwarf2btf converter. It doesn't
support .BTF.ext and the BTF_KIND_FUNC type yet. For example::

      -bash-4.4$ cat t.c
      struct t {
        int a:2;
        int b:3;
        int c:2;
      } g;
      -bash-4.4$ gcc -c -O2 -g t.c
      -bash-4.4$ pahole -JV t.o
      File t.o:
      [1] STRUCT t kind_flag=1 size=4 vlen=3
              a type_id=2 bitfield_size=2 bits_offset=0
              b type_id=2 bitfield_size=3 bits_offset=2
              c type_id=2 bitfield_size=2 bits_offset=5
      [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED

llvm is able to generate .BTF and .BTF.ext directly with -g for the bpf target
only. The assembly code (-S) is able to show the BTF encoding in assembly
format::

    -bash-4.4$ cat t2.c
    typedef int __int32;
    struct t2 {
      int a2;
      int (*f2)(char q1, __int32 q2, ...);
      int (*f3)();
    } g2;
    int main() { return 0; }
    int test() { return 0; }
    -bash-4.4$ clang -c -g -O2 -target bpf t2.c
    -bash-4.4$ readelf -S t2.o
      ......
      [ 8] .BTF              PROGBITS         0000000000000000  00000247
           000000000000016e  0000000000000000           0     0     1
      [ 9] .BTF.ext          PROGBITS         0000000000000000  000003b5
           0000000000000060  0000000000000000           0     0     1
      [10] .rel.BTF.ext      REL              0000000000000000  000007e0
           0000000000000040  0000000000000010          16     9     8
      ......
    -bash-4.4$ clang -S -g -O2 -target bpf t2.c
    -bash-4.4$ cat t2.s
      ......
            .section        .BTF,"",@progbits
            .short  60319                   # 0xeb9f
            .byte   1
            .byte   0
            .long   24
            .long   0
            .long   220
            .long   220
            .long   122
            .long   0                       # BTF_KIND_FUNC_PROTO(id = 1)
            .long   218103808               # 0xd000000
            .long   2
            .long   83                      # BTF_KIND_INT(id = 2)
            .long   16777216                # 0x1000000
            .long   4
            .long   16777248                # 0x1000020
      ......
            .byte   0                       # string offset=0
            .ascii  ".text"                 # string offset=1
            .byte   0
            .ascii  "/home/yhs/tmp-pahole/t2.c" # string offset=7
            .byte   0
            .ascii  "int main() { return 0; }" # string offset=33
            .byte   0
            .ascii  "int test() { return 0; }" # string offset=58
            .byte   0
            .ascii  "int"                   # string offset=83
      ......
            .section        .BTF.ext,"",@progbits
            .short  60319                   # 0xeb9f
            .byte   1
            .byte   0
            .long   24
            .long   0
            .long   28
            .long   28
            .long   44
            .long   8                       # FuncInfo
            .long   1                       # FuncInfo section string offset=1
            .long   2
            .long   .Lfunc_begin0
            .long   3
            .long   .Lfunc_begin1
            .long   5
            .long   16                      # LineInfo
            .long   1                       # LineInfo section string offset=1
            .long   2
            .long   .Ltmp0
            .long   7
            .long   33
            .long   7182                    # Line 7 Col 14
            .long   .Ltmp3
            .long   7
            .long   58
            .long   8206                    # Line 8 Col 14

7. Testing
**********

Kernel bpf selftest ``test_btf.c`` provides an extensive set of BTF-related
tests.