Commit | Line | Data |
---|---|---|
a4aeb9d6 SF |
1 | =============== |
2 | XDP RX Metadata | |
3 | =============== | |
4 | ||
5 | This document describes how an eXpress Data Path (XDP) program can access | |
6 | hardware metadata related to a packet using a set of helper functions, | |
7 | and how it can pass that metadata on to other consumers. | |
8 | ||
9 | General Design | |
10 | ============== | |
11 | ||
12 | XDP has access to a set of kfuncs to manipulate the metadata in an XDP frame. | |
13 | Every device driver that wishes to expose additional packet metadata can | |
14 | implement these kfuncs. The set of kfuncs is declared in ``include/net/xdp.h`` | |
15 | via ``XDP_METADATA_KFUNC_xxx``. | |
16 | ||
17 | Currently, the following kfuncs are supported. In the future, as more | |
18 | metadata is supported, this set will grow: | |
19 | ||
20 | .. kernel-doc:: net/core/xdp.c | |
21 | :identifiers: bpf_xdp_metadata_rx_timestamp bpf_xdp_metadata_rx_hash | |
22 | ||
23 | An XDP program can use these kfuncs to read the metadata into stack | |
24 | variables for its own consumption. Or, to pass the metadata on to other | |
25 | consumers, an XDP program can store it into the metadata area carried | |
915efd8a JDB |
26 | ahead of the packet. Not all packets will necessarily have the requested |
27 | metadata available, in which case the driver returns ``-ENODATA``. | |
a4aeb9d6 SF |
28 | |
29 | Not all kfuncs have to be implemented by the device driver; when not | |
915efd8a JDB |
30 | implemented, the default ones that return ``-EOPNOTSUPP`` will be used |
31 | to indicate that the device driver has not implemented this kfunc. | |
32 | ||
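The two error codes described above can be told apart by the caller. The following is a minimal userspace sketch, not kernel or driver code: ``enum meta_status`` and ``classify_meta_ret`` are hypothetical names introduced for illustration, and the sketch assumes that a return value of 0 means the kfunc filled in the requested value.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical classification of a metadata kfunc return value. */
enum meta_status {
	META_OK,          /* assumed: 0 means the value was written */
	META_NO_DATA,     /* kfunc implemented, but this packet has no value */
	META_UNSUPPORTED, /* driver does not implement this kfunc */
	META_ERROR,       /* any other return value */
};

/* Map the integer returned by a metadata kfunc to a status. */
static enum meta_status classify_meta_ret(int ret)
{
	if (ret == 0)
		return META_OK;
	if (ret == -ENODATA)
		return META_NO_DATA;
	if (ret == -EOPNOTSUPP)
		return META_UNSUPPORTED;
	return META_ERROR;
}
```

A program might treat ``META_NO_DATA`` as a per-packet condition and ``META_UNSUPPORTED`` as a per-device condition that will not change from one packet to the next.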
a4aeb9d6 SF |
33 | |
34 | Within an XDP frame, the metadata layout (accessed via ``xdp_buff``) is | |
35 | as follows:: | |
36 | ||
37 | +----------+-----------------+------+ | |
38 | | headroom | custom metadata | data | | |
39 | +----------+-----------------+------+ | |
40 | ^ ^ | |
41 | | | | |
42 | xdp_buff->data_meta xdp_buff->data | |
43 | ||
44 | An XDP program can store individual metadata items into this ``data_meta`` | |
45 | area in whichever format it chooses. Later consumers of the metadata | |
46 | will have to agree on the format by some out-of-band contract (like for | |
47 | the AF_XDP use case, see below). | |
48 | ||
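The layout above can be modelled in plain userspace C to show how the metadata area grows toward the headroom. This is only a sketch under stated assumptions: ``struct frame_model``, ``struct custom_meta``, ``model_grow_meta`` and ``model_store_meta`` are hypothetical names, the struct layout is an arbitrary example of an agreed format, and the code does not use any kernel or BPF API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical format agreed on by the producer and the consumer. */
struct custom_meta {
	uint64_t rx_timestamp;
	uint32_t rx_hash;
	uint32_t flags;
};

/* Userspace stand-in for the pointers shown in the diagram. */
struct frame_model {
	uint8_t *hard_start; /* start of the headroom */
	uint8_t *data_meta;  /* start of the custom metadata */
	uint8_t *data;       /* start of the packet data */
};

/*
 * Grow the metadata area by len bytes by moving data_meta toward the
 * headroom. Returns 0 on success, -1 if the headroom is too small.
 */
static int model_grow_meta(struct frame_model *f, size_t len)
{
	assert(f->hard_start <= f->data_meta && f->data_meta <= f->data);
	if ((size_t)(f->data_meta - f->hard_start) < len)
		return -1;
	f->data_meta -= len;
	return 0;
}

/* Reserve room for one custom_meta and copy it in front of the data. */
static int model_store_meta(struct frame_model *f, const struct custom_meta *m)
{
	if (model_grow_meta(f, sizeof(*m)))
		return -1;
	memcpy(f->data_meta, m, sizeof(*m));
	return 0;
}
```

When ``data_meta`` starts out equal to ``data``, a successful ``model_store_meta`` leaves exactly ``sizeof(struct custom_meta)`` bytes between the two pointers, which is the size a later consumer has to know in advance.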
49 | AF_XDP | |
50 | ====== | |
51 | ||
52 | :doc:`af_xdp` use-case implies that there is a contract between the BPF | |
53 | program that redirects XDP frames into the ``AF_XDP`` socket (``XSK``) and | |
54 | the final consumer. Thus the BPF program manually allocates a fixed number of | |
55 | bytes out of metadata via ``bpf_xdp_adjust_meta`` and calls a subset | |
56 | of kfuncs to populate it. The userspace ``XSK`` consumer computes | |
57 | ``xsk_umem__get_data() - METADATA_SIZE`` to locate that metadata. | |
58 | Note, ``xsk_umem__get_data`` is defined in ``libxdp`` and | |
59 | ``METADATA_SIZE`` is an application-specific constant (the ``AF_XDP`` receive | |
60 | descriptor does *not* explicitly carry the size of the metadata). | |
61 | ||
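The consumer-side computation described above amounts to stepping back a fixed number of bytes from the packet data pointer. The following is a minimal sketch, assuming the BPF program and the consumer agreed on a hypothetical ``struct xsk_meta``; ``read_xsk_meta`` is also a hypothetical helper, and ``pkt`` stands for whatever pointer the application obtained for the packet data (for example from ``xsk_umem__get_data()``).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical metadata format shared by the BPF program and userspace. */
struct xsk_meta {
	uint64_t rx_timestamp;
	uint32_t rx_hash;
	uint32_t pad;
};

/* Application-specific constant; the receive descriptor does not carry it. */
#define METADATA_SIZE sizeof(struct xsk_meta)

/*
 * Copy the metadata that sits immediately in front of the packet data.
 * The caller must know that METADATA_SIZE bytes were really written there.
 */
static void read_xsk_meta(const void *pkt, struct xsk_meta *out)
{
	assert(pkt != NULL && out != NULL);
	memcpy(out, (const uint8_t *)pkt - METADATA_SIZE, METADATA_SIZE);
}
```

Copying with ``memcpy`` instead of casting the pointer avoids assuming that the metadata is suitably aligned for direct struct access.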
62 | Here is the ``AF_XDP`` consumer layout (note missing ``data_meta`` pointer):: | |
63 | ||
64 | +----------+-----------------+------+ | |
65 | | headroom | custom metadata | data | | |
66 | +----------+-----------------+------+ | |
67 | ^ | |
68 | | | |
69 | rx_desc->address | |
70 | ||
71 | XDP_PASS | |
72 | ======== | |
73 | ||
74 | This is the path where the packets processed by the XDP program are passed | |
75 | into the kernel. The kernel creates the ``skb`` out of the ``xdp_buff`` | |
76 | contents. Currently, every driver has custom kernel code to parse | |
77 | the descriptors and populate ``skb`` metadata when doing this ``xdp_buff->skb`` | |
78 | conversion, and the XDP metadata is not used by the kernel when building | |
79 | ``skbs``. However, TC-BPF programs can access the XDP metadata area using | |
80 | the ``data_meta`` pointer. | |
81 | ||
82 | In the future, we'd like to support a case where an XDP program | |
83 | can override some of the metadata used for building ``skbs``. | |
84 | ||
85 | bpf_redirect_map | |
86 | ================ | |
87 | ||
88 | ``bpf_redirect_map`` can redirect the frame to a different device. | |
89 | Some devices (like virtual ethernet links) support running a second XDP | |
90 | program after the redirect. However, the final consumer doesn't have | |
91 | access to the original hardware descriptor and can't access any of | |
92 | the original metadata. The same applies to XDP programs installed | |
93 | into devmaps and cpumaps. | |
94 | ||
95 | This means that for redirected packets only custom metadata is | |
96 | currently supported, which has to be prepared by the initial XDP program | |
97 | before redirect. If the frame is eventually passed to the kernel, the | |
98 | ``skb`` created from such a frame won't have any hardware metadata | |
99 | populated. If such a packet is later redirected into an ``XSK``, | |
100 | that consumer will also only have access to the custom metadata. | |
101 | ||
102 | bpf_tail_call | |
103 | ============= | |
104 | ||
105 | Adding programs that access metadata kfuncs to the ``BPF_MAP_TYPE_PROG_ARRAY`` | |
106 | is currently not supported. | |
107 | ||
a9c2a608 SF |
108 | Supported Devices |
109 | ================= | |
110 | ||
111 | It is possible to query which kfuncs a particular netdev implements via | |
112 | netlink. See the ``xdp-rx-metadata-features`` attribute set in | |
113 | ``Documentation/netlink/specs/netdev.yaml``. | |
114 | ||
a4aeb9d6 SF |
115 | Example |
116 | ======= | |
117 | ||
118 | See ``tools/testing/selftests/bpf/progs/xdp_metadata.c`` and | |
119 | ``tools/testing/selftests/bpf/prog_tests/xdp_metadata.c`` for an example of | |
120 | a BPF program that handles XDP metadata. |