.. SPDX-License-Identifier: GPL-2.0-only
.. Copyright (C) 2022 Red Hat, Inc.

===================
BPF_MAP_TYPE_XSKMAP
===================

.. note::
    - ``BPF_MAP_TYPE_XSKMAP`` was introduced in kernel version 4.18

The ``BPF_MAP_TYPE_XSKMAP`` is used as a backend map for XDP BPF helper
call ``bpf_redirect_map()`` and ``XDP_REDIRECT`` action, like 'devmap' and 'cpumap'.
This map type redirects raw XDP frames to `AF_XDP`_ sockets (XSKs), a new type of
address family in the kernel that allows redirection of frames from a driver to
user space without having to traverse the full network stack. An AF_XDP socket
binds to a single netdev queue. A mapping of XSKs to queues is shown below:

.. code-block:: none

    +---------------------------------------------------+
    |     xsk A      |     xsk B       |      xsk C     |<---+ User space
    =========================================================|==========
    |    Queue 0     |     Queue 1     |     Queue 2    |    |  Kernel
    +---------------------------------------------------+    |
    |                  Netdev eth0                      |    |
    +---------------------------------------------------+    |
    |                            +=============+        |    |
    |                            | key |  xsk  |        |    |
    |  +---------+               +=============+        |    |
    |  |         |               |  0  | xsk A |        |    |
    |  |         |               +-------------+        |    |
    |  |         |               |  1  | xsk B |        |    |
    |  |   BPF   |-- redirect -->+-------------+-------------+
    |  |   prog  |               |  2  | xsk C |        |
    |  |         |               +-------------+        |
    |  |         |                                      |
    |  |         |                                      |
    |  +---------+                                      |
    |                                                   |
    +---------------------------------------------------+

.. note::
    An AF_XDP socket that is bound to a certain <netdev/queue_id> will *only*
    accept XDP frames from that <netdev/queue_id>. If an XDP program tries to redirect
    from a <netdev/queue_id> other than what the socket is bound to, the frame will
    not be received on the socket.
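
The binding rule in the note above can be pictured with a small model in plain C.
This is only an illustrative sketch, not kernel code: ``struct model_xsk``,
``struct model_frame`` and ``model_xsk_accepts()`` are hypothetical names
introduced here for illustration.

.. code-block:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical stand-in for the binding of an AF_XDP socket. */
    struct model_xsk {
            int ifindex;        /* netdev the socket is bound to */
            uint32_t queue_id;  /* queue the socket is bound to */
    };

    /* Hypothetical stand-in for the origin of an XDP frame. */
    struct model_frame {
            int ifindex;        /* netdev the frame arrived on */
            uint32_t rx_queue;  /* queue the frame arrived on */
    };

    /* Accept a frame only if it arrived on the bound <netdev/queue_id>. */
    static bool model_xsk_accepts(const struct model_xsk *xs,
                                  const struct model_frame *f)
    {
            assert(xs && f);
            return xs->ifindex == f->ifindex && xs->queue_id == f->rx_queue;
    }

In this model, a socket bound to queue 1 rejects a frame that arrived on queue 0
of the same netdev, which is why each queue that should deliver traffic to user
space needs its own entry in the XSKMAP.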

Typically an XSKMAP is created per netdev. This map contains an array of XSK File
Descriptors (FDs). The number of array elements is typically set or adjusted using
the ``max_entries`` map parameter. For AF_XDP, ``max_entries`` is equal to the number
of queues supported by the netdev.

.. note::
    Both the map key and map value size must be 4 bytes.

Usage
=====

Kernel BPF
----------
bpf_redirect_map()
^^^^^^^^^^^^^^^^^^
.. code-block:: c

    long bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)

Redirect the packet to the endpoint referenced by ``map`` at index ``key``.
For ``BPF_MAP_TYPE_XSKMAP`` this map contains references to XSK FDs
for sockets attached to a netdev's queues.

.. note::
    If the map is empty at an index, the packet is dropped. This means that it is
    necessary to have an XDP program loaded with at least one XSK in the
    XSKMAP to be able to get any traffic to user space through the socket.
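
The drop behaviour in the note above follows from how ``bpf_redirect_map()``
treats ``flags``: according to the helper documentation, the lower two bits of
``flags`` are used as the return code when the map lookup fails, so ``flags`` of
0 yields ``XDP_ABORTED`` and the packet is dropped. The plain C model below
sketches that decision. It is not kernel code: ``struct redir_model`` and
``model_redirect_map()`` are hypothetical names, and the ``MODEL_XDP_*`` constants
only mirror the numeric values of the XDP actions in the BPF UAPI header.

.. code-block:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* Mirrors the numeric values of enum xdp_action (assumed for the model). */
    enum {
            MODEL_XDP_ABORTED = 0,
            MODEL_XDP_DROP = 1,
            MODEL_XDP_PASS = 2,
            MODEL_XDP_TX = 3,
            MODEL_XDP_REDIRECT = 4,
    };

    #define REDIR_MODEL_SLOTS 4

    /* Hypothetical model of an XSKMAP: has_xsk[i] is true if slot i is set. */
    struct redir_model {
            bool has_xsk[REDIR_MODEL_SLOTS];
    };

    static int model_redirect_map(const struct redir_model *m, uint32_t key,
                                  uint64_t flags)
    {
            assert(m);
            /* Only the two low bits (the fallback action) are valid here. */
            if (flags & ~UINT64_C(3))
                    return MODEL_XDP_ABORTED;
            /* Empty or out-of-range slot: return the fallback action. */
            if (key >= REDIR_MODEL_SLOTS || !m->has_xsk[key])
                    return (int)(flags & 3);
            return MODEL_XDP_REDIRECT;
    }

In this model, passing ``MODEL_XDP_PASS`` as ``flags`` sends frames for queues
without an XSK on to the network stack instead of dropping them, which has the
same effect as the explicit lookup-then-redirect pattern in the kernel example
later in this document.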

bpf_map_lookup_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

    void *bpf_map_lookup_elem(struct bpf_map *map, const void *key)

XSK entry references of type ``struct xdp_sock *`` can be retrieved using the
``bpf_map_lookup_elem()`` helper.

User space
----------
.. note::
    XSK entries can only be updated/deleted from user space and not from
    a BPF program. Trying to call these functions from a kernel BPF program will
    result in the program failing to load and a verifier warning.

bpf_map_update_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

    int bpf_map_update_elem(int fd, const void *key, const void *value, __u64 flags)

XSK entries can be added or updated using the ``bpf_map_update_elem()``
helper. The ``key`` parameter is the queue_id of the queue the XSK is
attaching to, and the ``value`` parameter is the FD of that socket.

Under the hood, the XSKMAP update function uses the XSK FD value to retrieve the
associated ``struct xdp_sock`` instance.

The flags argument can be one of the following:

- BPF_ANY: Create a new element or update an existing element.
- BPF_NOEXIST: Create a new element only if it did not exist.
- BPF_EXIST: Update an existing element.
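
The effect of the three flags listed above can be sketched with a small plain C
model of an array-style map. This is a simplified illustration, not the kernel
implementation: ``struct model_xskmap`` and ``model_update()`` are hypothetical
names, the ``MODEL_BPF_*`` constants mirror the numeric flag values in the BPF
UAPI header, and the error codes follow common BPF map conventions and should
be read as assumptions.

.. code-block:: c

    #include <assert.h>
    #include <errno.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* Mirror the numeric values of BPF_ANY, BPF_NOEXIST and BPF_EXIST. */
    #define MODEL_BPF_ANY      0
    #define MODEL_BPF_NOEXIST  1
    #define MODEL_BPF_EXIST    2

    #define MODEL_MAX_ENTRIES  4

    /* Hypothetical model: one slot per queue_id, holding an XSK FD. */
    struct model_xskmap {
            bool used[MODEL_MAX_ENTRIES];
            int xsk_fd[MODEL_MAX_ENTRIES];
    };

    static int model_update(struct model_xskmap *m, uint32_t key, int xsk_fd,
                            uint64_t flags)
    {
            assert(m);
            if (flags > MODEL_BPF_EXIST)
                    return -EINVAL;         /* unknown flag */
            if (key >= MODEL_MAX_ENTRIES)
                    return -E2BIG;          /* key beyond max_entries */
            if (m->used[key] && flags == MODEL_BPF_NOEXIST)
                    return -EEXIST;         /* slot already populated */
            if (!m->used[key] && flags == MODEL_BPF_EXIST)
                    return -ENOENT;         /* nothing to update */
            m->used[key] = true;
            m->xsk_fd[key] = xsk_fd;
            return 0;
    }

In this model, ``MODEL_BPF_NOEXIST`` is the choice when an existing socket must
not be silently replaced, ``MODEL_BPF_EXIST`` when only a replacement is
intended, and ``MODEL_BPF_ANY`` when either outcome is acceptable.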

bpf_map_lookup_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

    int bpf_map_lookup_elem(int fd, const void *key, void *value)

Returns ``struct xdp_sock *`` or negative error in case of failure.

bpf_map_delete_elem()
^^^^^^^^^^^^^^^^^^^^^
.. code-block:: c

    int bpf_map_delete_elem(int fd, const void *key)

XSK entries can be deleted using the ``bpf_map_delete_elem()``
helper. This helper will return 0 on success, or negative error in case of
failure.

.. note::
    When `libxdp`_ deletes an XSK it also removes the associated socket
    entry from the XSKMAP.

Examples
========
Kernel
------

The following code snippet shows how to declare a ``BPF_MAP_TYPE_XSKMAP`` called
``xsks_map`` and how to redirect packets to an XSK.

.. code-block:: c

    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    struct {
            __uint(type, BPF_MAP_TYPE_XSKMAP);
            __type(key, __u32);
            __type(value, __u32);
            __uint(max_entries, 64);
    } xsks_map SEC(".maps");


    SEC("xdp")
    int xsk_redir_prog(struct xdp_md *ctx)
    {
            __u32 index = ctx->rx_queue_index;

            /* Redirect only if an XSK is present for this RX queue. */
            if (bpf_map_lookup_elem(&xsks_map, &index))
                    return bpf_redirect_map(&xsks_map, index, 0);
            return XDP_PASS;
    }

User space
----------

The following code snippet shows how to update an XSKMAP with an XSK entry.

.. code-block:: c

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <bpf/bpf.h>
    #include <bpf/libbpf.h>

    int update_xsks_map(struct bpf_map *xsks_map, int queue_id, int xsk_fd)
    {
            int ret;

            /* key is the queue_id, value is the FD of the XSK */
            ret = bpf_map_update_elem(bpf_map__fd(xsks_map), &queue_id, &xsk_fd, BPF_ANY);
            if (ret < 0)
                    fprintf(stderr, "Failed to update xsks_map: %s\n", strerror(errno));

            return ret;
    }

For an example of how to create AF_XDP sockets, please see the AF_XDP-example and
AF_XDP-forwarding programs in the `bpf-examples`_ directory in the `libxdp`_ repository.
For a detailed explanation of the AF_XDP interface, please see:

- `libxdp-readme`_.
- `AF_XDP`_ kernel documentation.

.. note::
    The most comprehensive resource for using XSKMAPs and AF_XDP is `libxdp`_.

.. _libxdp: https://github.com/xdp-project/xdp-tools/tree/master/lib/libxdp
.. _AF_XDP: https://www.kernel.org/doc/html/latest/networking/af_xdp.html
.. _bpf-examples: https://github.com/xdp-project/bpf-examples
.. _libxdp-readme: https://github.com/xdp-project/xdp-tools/tree/master/lib/libxdp#using-af_xdp-sockets
189 | .. _libxdp: https://github.com/xdp-project/xdp-tools/tree/master/lib/libxdp | |
190 | .. _AF_XDP: https://www.kernel.org/doc/html/latest/networking/af_xdp.html | |
191 | .. _bpf-examples: https://github.com/xdp-project/bpf-examples | |
192 | .. _libxdp-readme: https://github.com/xdp-project/xdp-tools/tree/master/lib/libxdp#using-af_xdp-sockets |