Commit | Line | Data |
---|---|---|
83177c0d DH |
1 | .. SPDX-License-Identifier: GPL-2.0-only |
2 | .. Copyright (C) 2022 Red Hat, Inc. | |
3 | ||
4 | ===================== | |
5 | BPF_MAP_TYPE_LPM_TRIE | |
6 | ===================== | |
7 | ||
8 | .. note:: | |
9 | - ``BPF_MAP_TYPE_LPM_TRIE`` was introduced in kernel version 4.11 | |
10 | ||
11 | ``BPF_MAP_TYPE_LPM_TRIE`` provides a longest prefix match algorithm that | |
12 | can be used to match IP addresses to a stored set of prefixes. | |
13 | Internally, data is stored in an unbalanced trie of nodes that uses | |
14 | ``prefixlen,data`` pairs as its keys. The ``data`` is interpreted in | |
15 | network byte order, i.e. big endian, so ``data[0]`` stores the most | |
16 | significant byte. | |
17 | ||
18 | LPM tries may be created with a maximum prefix length that is a multiple | |
19 | of 8, in the range from 8 to 2048. The key used for lookup and update | |
896880ff | 20 | operations is a ``struct bpf_lpm_trie_key_u8``, extended by |
83177c0d DH |
21 | ``max_prefixlen/8`` bytes. |
22 | ||
23 | - For IPv4 addresses the data length is 4 bytes | |
24 | - For IPv6 addresses the data length is 16 bytes | |
25 | ||
26 | The value type stored in the LPM trie can be any user defined type. | |
27 | ||
28 | .. note:: | |
29 | When creating a map of type ``BPF_MAP_TYPE_LPM_TRIE`` you must set the | |
30 | ``BPF_F_NO_PREALLOC`` flag. | |
31 | ||
32 | Usage | |
33 | ===== | |
34 | ||
35 | Kernel BPF | |
36 | ---------- | |
37 | ||
539886a3 DH |
38 | bpf_map_lookup_elem() |
39 | ~~~~~~~~~~~~~~~~~~~~~ | |
40 | ||
41 | .. code-block:: c | |
42 | ||
83177c0d DH |
43 | void *bpf_map_lookup_elem(struct bpf_map *map, const void *key) |
44 | ||
45 | The longest prefix entry for a given data value can be found using the | |
46 | ``bpf_map_lookup_elem()`` helper. This helper returns a pointer to the | |
47 | value associated with the longest matching ``key``, or ``NULL`` if no | |
48 | entry was found. | |
49 | ||
50 | The ``key`` should have ``prefixlen`` set to ``max_prefixlen`` when | |
51 | performing longest prefix lookups. For example, when searching for the | |
52 | longest prefix match for an IPv4 address, ``prefixlen`` should be set to | |
53 | ``32``. | |
54 | ||
539886a3 DH |
55 | bpf_map_update_elem() |
56 | ~~~~~~~~~~~~~~~~~~~~~ | |
57 | ||
58 | .. code-block:: c | |
59 | ||
83177c0d DH |
60 | long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags) |
61 | ||
62 | Prefix entries can be added or updated using the ``bpf_map_update_elem()`` | |
63 | helper. This helper replaces existing elements atomically. | |
64 | ||
65 | ``bpf_map_update_elem()`` returns ``0`` on success, or negative error in | |
66 | case of failure. | |
67 | ||
68 | .. note:: | |
69 | The flags parameter must be one of BPF_ANY, BPF_NOEXIST or BPF_EXIST, | |
70 | but the value is ignored, giving BPF_ANY semantics. | |
71 | ||
539886a3 DH |
72 | bpf_map_delete_elem() |
73 | ~~~~~~~~~~~~~~~~~~~~~ | |
74 | ||
75 | .. code-block:: c | |
76 | ||
83177c0d DH |
77 | long bpf_map_delete_elem(struct bpf_map *map, const void *key) |
78 | ||
79 | Prefix entries can be deleted using the ``bpf_map_delete_elem()`` | |
80 | helper. This helper will return 0 on success, or negative error in case | |
81 | of failure. | |
82 | ||
83 | Userspace | |
84 | --------- | |
85 | ||
86 | Access from userspace uses libbpf APIs with the same names as above, with | |
87 | the map identified by ``fd``. | |
88 | ||
539886a3 DH |
89 | bpf_map_get_next_key() |
90 | ~~~~~~~~~~~~~~~~~~~~~~ | |
91 | ||
92 | .. code-block:: c | |
93 | ||
83177c0d DH |
94 | int bpf_map_get_next_key (int fd, const void *cur_key, void *next_key) |
95 | ||
96 | A userspace program can iterate through the entries in an LPM trie using | |
97 | libbpf's ``bpf_map_get_next_key()`` function. The first key can be | |
98 | fetched by calling ``bpf_map_get_next_key()`` with ``cur_key`` set to | |
99 | ``NULL``. Subsequent calls will fetch the next key that follows the | |
100 | current key. ``bpf_map_get_next_key()`` returns ``0`` on success, | |
101 | ``-ENOENT`` if ``cur_key`` is the last key in the trie, or negative | |
102 | error in case of failure. | |
103 | ||
104 | ``bpf_map_get_next_key()`` will iterate through the LPM trie elements | |
105 | from leftmost leaf first. This means that iteration will return more | |
106 | specific keys before less specific ones. | |
107 | ||
108 | Examples | |
109 | ======== | |
110 | ||
111 | Please see ``tools/testing/selftests/bpf/test_lpm_map.c`` for examples | |
112 | of LPM trie usage from userspace. The code snippets below demonstrate | |
113 | API usage. | |
114 | ||
115 | Kernel BPF | |
116 | ---------- | |
117 | ||
118 | The following BPF code snippet shows how to declare a new LPM trie for IPv4 | |
119 | address prefixes: | |
120 | ||
121 | .. code-block:: c | |
122 | ||
123 | #include <linux/bpf.h> | |
124 | #include <bpf/bpf_helpers.h> | |
125 | ||
126 | struct ipv4_lpm_key { | |
127 | __u32 prefixlen; | |
128 | __u32 data; | |
129 | }; | |
130 | ||
131 | struct { | |
132 | __uint(type, BPF_MAP_TYPE_LPM_TRIE); | |
133 | __type(key, struct ipv4_lpm_key); | |
134 | __type(value, __u32); | |
135 | __uint(map_flags, BPF_F_NO_PREALLOC); | |
136 | __uint(max_entries, 255); | |
137 | } ipv4_lpm_map SEC(".maps"); | |
138 | ||
139 | The following BPF code snippet shows how to lookup by IPv4 address: | |
140 | ||
141 | .. code-block:: c | |
142 | ||
143 | void *lookup(__u32 ipaddr) | |
144 | { | |
145 | struct ipv4_lpm_key key = { | |
146 | .prefixlen = 32, | |
147 | .data = ipaddr | |
148 | }; | |
149 | ||
150 | return bpf_map_lookup_elem(&ipv4_lpm_map, &key); | |
151 | } | |
152 | ||
153 | Userspace | |
154 | --------- | |
155 | ||
156 | The following snippet shows how to insert an IPv4 prefix entry into an | |
157 | LPM trie: | |
158 | ||
159 | .. code-block:: c | |
160 | ||
161 | int add_prefix_entry(int lpm_fd, __u32 addr, __u32 prefixlen, struct value *value) | |
162 | { | |
163 | struct ipv4_lpm_key ipv4_key = { | |
164 | .prefixlen = prefixlen, | |
165 | .data = addr | |
166 | }; | |
167 | return bpf_map_update_elem(lpm_fd, &ipv4_key, value, BPF_ANY); | |
168 | } | |
169 | ||
170 | The following snippet shows a userspace program walking through the entries | |
171 | of an LPM trie: | |
172 | ||
173 | ||
174 | .. code-block:: c | |
175 | ||
176 | #include <bpf/libbpf.h> | |
177 | #include <bpf/bpf.h> | |
178 | ||
179 | void iterate_lpm_trie(int map_fd) | |
180 | { | |
181 | struct ipv4_lpm_key *cur_key = NULL; | |
182 | struct ipv4_lpm_key next_key; | |
183 | struct value value; | |
184 | int err; | |
185 | ||
186 | for (;;) { | |
187 | err = bpf_map_get_next_key(map_fd, cur_key, &next_key); | |
188 | if (err) | |
189 | break; | |
190 | ||
191 | bpf_map_lookup_elem(map_fd, &next_key, &value); | |
192 | ||
193 | /* Use key and value here */ | |
194 | ||
195 | cur_key = &next_key; | |
196 | } | |
197 | } |