Commit | Line | Data |
---|---|---|
83177c0d DH |
1 | .. SPDX-License-Identifier: GPL-2.0-only |
2 | .. Copyright (C) 2022 Red Hat, Inc. | |
3 | ||
4 | ===================== | |
5 | BPF_MAP_TYPE_LPM_TRIE | |
6 | ===================== | |
7 | ||
8 | .. note:: | |
9 | - ``BPF_MAP_TYPE_LPM_TRIE`` was introduced in kernel version 4.11 | |
10 | ||
11 | ``BPF_MAP_TYPE_LPM_TRIE`` provides a longest prefix match algorithm that | |
12 | can be used to match IP addresses to a stored set of prefixes. | |
13 | Internally, data is stored in an unbalanced trie of nodes that uses | |
14 | ``prefixlen,data`` pairs as its keys. The ``data`` is interpreted in | |
15 | network byte order, i.e. big endian, so ``data[0]`` stores the most | |
16 | significant byte. | |
17 | ||
18 | LPM tries may be created with a maximum prefix length that is a multiple | |
19 | of 8, in the range from 8 to 2048. The key used for lookup and update | |
20 | operations is a ``struct bpf_lpm_trie_key``, extended by | |
21 | ``max_prefixlen/8`` bytes. | |
22 | ||
23 | - For IPv4 addresses the data length is 4 bytes | |
24 | - For IPv6 addresses the data length is 16 bytes | |
25 | ||
26 | The value type stored in the LPM trie can be any user defined type. | |
27 | ||
28 | .. note:: | |
29 | When creating a map of type ``BPF_MAP_TYPE_LPM_TRIE`` you must set the | |
30 | ``BPF_F_NO_PREALLOC`` flag. | |
31 | ||
32 | Usage | |
33 | ===== | |
34 | ||
35 | Kernel BPF | |
36 | ---------- | |
37 | ||
38 | .. c:function:: | |
39 | void *bpf_map_lookup_elem(struct bpf_map *map, const void *key) | |
40 | ||
41 | The longest prefix entry for a given data value can be found using the | |
42 | ``bpf_map_lookup_elem()`` helper. This helper returns a pointer to the | |
43 | value associated with the longest matching ``key``, or ``NULL`` if no | |
44 | entry was found. | |
45 | ||
46 | The ``key`` should have ``prefixlen`` set to ``max_prefixlen`` when | |
47 | performing longest prefix lookups. For example, when searching for the | |
48 | longest prefix match for an IPv4 address, ``prefixlen`` should be set to | |
49 | ``32``. | |
50 | ||
51 | .. c:function:: | |
52 | long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags) | |
53 | ||
54 | Prefix entries can be added or updated using the ``bpf_map_update_elem()`` | |
55 | helper. This helper replaces existing elements atomically. | |
56 | ||
57 | ``bpf_map_update_elem()`` returns ``0`` on success, or negative error in | |
58 | case of failure. | |
59 | ||
60 | .. note:: | |
61 | The flags parameter must be one of BPF_ANY, BPF_NOEXIST or BPF_EXIST, | |
62 | but the value is ignored, giving BPF_ANY semantics. | |
63 | ||
64 | .. c:function:: | |
65 | long bpf_map_delete_elem(struct bpf_map *map, const void *key) | |
66 | ||
67 | Prefix entries can be deleted using the ``bpf_map_delete_elem()`` | |
68 | helper. This helper will return 0 on success, or negative error in case | |
69 | of failure. | |
70 | ||
71 | Userspace | |
72 | --------- | |
73 | ||
74 | Access from userspace uses libbpf APIs with the same names as above, with | |
75 | the map identified by ``fd``. | |
76 | ||
77 | .. c:function:: | |
78 | int bpf_map_get_next_key (int fd, const void *cur_key, void *next_key) | |
79 | ||
80 | A userspace program can iterate through the entries in an LPM trie using | |
81 | libbpf's ``bpf_map_get_next_key()`` function. The first key can be | |
82 | fetched by calling ``bpf_map_get_next_key()`` with ``cur_key`` set to | |
83 | ``NULL``. Subsequent calls will fetch the next key that follows the | |
84 | current key. ``bpf_map_get_next_key()`` returns ``0`` on success, | |
85 | ``-ENOENT`` if ``cur_key`` is the last key in the trie, or negative | |
86 | error in case of failure. | |
87 | ||
88 | ``bpf_map_get_next_key()`` will iterate through the LPM trie elements | |
89 | from leftmost leaf first. This means that iteration will return more | |
90 | specific keys before less specific ones. | |
91 | ||
92 | Examples | |
93 | ======== | |
94 | ||
95 | Please see ``tools/testing/selftests/bpf/test_lpm_map.c`` for examples | |
96 | of LPM trie usage from userspace. The code snippets below demonstrate | |
97 | API usage. | |
98 | ||
99 | Kernel BPF | |
100 | ---------- | |
101 | ||
102 | The following BPF code snippet shows how to declare a new LPM trie for IPv4 | |
103 | address prefixes: | |
104 | ||
105 | .. code-block:: c | |
106 | ||
107 | #include <linux/bpf.h> | |
108 | #include <bpf/bpf_helpers.h> | |
109 | ||
110 | struct ipv4_lpm_key { | |
111 | __u32 prefixlen; | |
112 | __u32 data; | |
113 | }; | |
114 | ||
115 | struct { | |
116 | __uint(type, BPF_MAP_TYPE_LPM_TRIE); | |
117 | __type(key, struct ipv4_lpm_key); | |
118 | __type(value, __u32); | |
119 | __uint(map_flags, BPF_F_NO_PREALLOC); | |
120 | __uint(max_entries, 255); | |
121 | } ipv4_lpm_map SEC(".maps"); | |
122 | ||
123 | The following BPF code snippet shows how to lookup by IPv4 address: | |
124 | ||
125 | .. code-block:: c | |
126 | ||
127 | void *lookup(__u32 ipaddr) | |
128 | { | |
129 | struct ipv4_lpm_key key = { | |
130 | .prefixlen = 32, | |
131 | .data = ipaddr | |
132 | }; | |
133 | ||
134 | return bpf_map_lookup_elem(&ipv4_lpm_map, &key); | |
135 | } | |
136 | ||
137 | Userspace | |
138 | --------- | |
139 | ||
140 | The following snippet shows how to insert an IPv4 prefix entry into an | |
141 | LPM trie: | |
142 | ||
143 | .. code-block:: c | |
144 | ||
145 | int add_prefix_entry(int lpm_fd, __u32 addr, __u32 prefixlen, struct value *value) | |
146 | { | |
147 | struct ipv4_lpm_key ipv4_key = { | |
148 | .prefixlen = prefixlen, | |
149 | .data = addr | |
150 | }; | |
151 | return bpf_map_update_elem(lpm_fd, &ipv4_key, value, BPF_ANY); | |
152 | } | |
153 | ||
154 | The following snippet shows a userspace program walking through the entries | |
155 | of an LPM trie: | |
156 | ||
157 | ||
158 | .. code-block:: c | |
159 | ||
160 | #include <bpf/libbpf.h> | |
161 | #include <bpf/bpf.h> | |
162 | ||
163 | void iterate_lpm_trie(int map_fd) | |
164 | { | |
165 | struct ipv4_lpm_key *cur_key = NULL; | |
166 | struct ipv4_lpm_key next_key; | |
167 | struct value value; | |
168 | int err; | |
169 | ||
170 | for (;;) { | |
171 | err = bpf_map_get_next_key(map_fd, cur_key, &next_key); | |
172 | if (err) | |
173 | break; | |
174 | ||
175 | bpf_map_lookup_elem(map_fd, &next_key, &value); | |
176 | ||
177 | /* Use key and value here */ | |
178 | ||
179 | cur_key = &next_key; | |
180 | } | |
181 | } |