Commit | Line | Data |
---|---|---|
979855d3 DH |
1 | .. SPDX-License-Identifier: GPL-2.0-only |
2 | .. Copyright (C) 2022 Red Hat, Inc. | |
3 | ||
4 | =============================================== | |
5 | BPF_MAP_TYPE_HASH, with PERCPU and LRU Variants | |
6 | =============================================== | |
7 | ||
8 | .. note:: | |
9 | - ``BPF_MAP_TYPE_HASH`` was introduced in kernel version 3.19 | |
10 | - ``BPF_MAP_TYPE_PERCPU_HASH`` was introduced in version 4.6 | |
11 | - Both ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH`` | |
12 | were introduced in version 4.10 | |
13 | ||
14 | ``BPF_MAP_TYPE_HASH`` and ``BPF_MAP_TYPE_PERCPU_HASH`` provide general | |
15 | purpose hash map storage. Both the key and the value can be structs, | |
16 | allowing for composite keys and values. | |
17 | ||
18 | The kernel is responsible for allocating and freeing key/value pairs, up | |
19 | to the max_entries limit that you specify. Hash maps use pre-allocation | |
20 | of hash table elements by default. The ``BPF_F_NO_PREALLOC`` flag can be | |
21 | used to disable pre-allocation when it is too memory expensive. | |
22 | ||
23 | ``BPF_MAP_TYPE_PERCPU_HASH`` provides a separate value slot per | |
24 | CPU. The per-cpu values are stored internally in an array. | |
25 | ||
26 | The ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH`` | |
27 | variants add LRU semantics to their respective hash tables. An LRU hash | |
28 | will automatically evict the least recently used entries when the hash | |
29 | table reaches capacity. An LRU hash maintains an internal LRU list that | |
30 | is used to select elements for eviction. This internal LRU list is | |
31 | shared across CPUs but it is possible to request a per CPU LRU list with | |
32 | the ``BPF_F_NO_COMMON_LRU`` flag when calling ``bpf_map_create``. | |
33 | ||
34 | Usage | |
35 | ===== | |
36 | ||
539886a3 DH |
37 | Kernel BPF |
38 | ---------- | |
39 | ||
40 | bpf_map_update_elem() | |
41 | ~~~~~~~~~~~~~~~~~~~~~ | |
42 | ||
43 | .. code-block:: c | |
44 | ||
979855d3 DH |
45 | long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags) |
46 | ||
47 | Hash entries can be added or updated using the ``bpf_map_update_elem()`` | |
48 | helper. This helper replaces existing elements atomically. The ``flags`` | |
49 | parameter can be used to control the update behaviour: | |
50 | ||
51 | - ``BPF_ANY`` will create a new element or update an existing element | |
52 | - ``BPF_NOEXIST`` will create a new element only if one did not already | |
53 | exist | |
54 | - ``BPF_EXIST`` will update an existing element | |
55 | ||
56 | ``bpf_map_update_elem()`` returns 0 on success, or negative error in | |
57 | case of failure. | |
58 | ||
539886a3 DH |
59 | bpf_map_lookup_elem() |
60 | ~~~~~~~~~~~~~~~~~~~~~ | |
61 | ||
62 | .. code-block:: c | |
63 | ||
979855d3 DH |
64 | void *bpf_map_lookup_elem(struct bpf_map *map, const void *key) |
65 | ||
66 | Hash entries can be retrieved using the ``bpf_map_lookup_elem()`` | |
67 | helper. This helper returns a pointer to the value associated with | |
68 | ``key``, or ``NULL`` if no entry was found. | |
69 | ||
539886a3 DH |
70 | bpf_map_delete_elem() |
71 | ~~~~~~~~~~~~~~~~~~~~~ | |
72 | ||
73 | .. code-block:: c | |
74 | ||
979855d3 DH |
75 | long bpf_map_delete_elem(struct bpf_map *map, const void *key) |
76 | ||
77 | Hash entries can be deleted using the ``bpf_map_delete_elem()`` | |
78 | helper. This helper will return 0 on success, or negative error in case | |
79 | of failure. | |
80 | ||
81 | Per CPU Hashes | |
82 | -------------- | |
83 | ||
84 | For ``BPF_MAP_TYPE_PERCPU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH`` | |
85 | the ``bpf_map_update_elem()`` and ``bpf_map_lookup_elem()`` helpers | |
86 | automatically access the hash slot for the current CPU. | |
87 | ||
539886a3 DH |
88 | bpf_map_lookup_percpu_elem() |
89 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
90 | ||
91 | .. code-block:: c | |
92 | ||
979855d3 DH |
93 | void *bpf_map_lookup_percpu_elem(struct bpf_map *map, const void *key, u32 cpu) |
94 | ||
95 | The ``bpf_map_lookup_percpu_elem()`` helper can be used to lookup the | |
96 | value in the hash slot for a specific CPU. Returns value associated with | |
97 | ``key`` on ``cpu`` , or ``NULL`` if no entry was found or ``cpu`` is | |
98 | invalid. | |
99 | ||
100 | Concurrency | |
101 | ----------- | |
102 | ||
103 | Values stored in ``BPF_MAP_TYPE_HASH`` can be accessed concurrently by | |
104 | programs running on different CPUs. Since Kernel version 5.1, the BPF | |
105 | infrastructure provides ``struct bpf_spin_lock`` to synchronise access. | |
106 | See ``tools/testing/selftests/bpf/progs/test_spin_lock.c``. | |
107 | ||
108 | Userspace | |
109 | --------- | |
110 | ||
539886a3 DH |
111 | bpf_map_get_next_key() |
112 | ~~~~~~~~~~~~~~~~~~~~~~ | |
113 | ||
114 | .. code-block:: c | |
115 | ||
979855d3 DH |
116 | int bpf_map_get_next_key(int fd, const void *cur_key, void *next_key) |
117 | ||
118 | In userspace, it is possible to iterate through the keys of a hash using | |
119 | libbpf's ``bpf_map_get_next_key()`` function. The first key can be fetched by | |
120 | calling ``bpf_map_get_next_key()`` with ``cur_key`` set to | |
121 | ``NULL``. Subsequent calls will fetch the next key that follows the | |
122 | current key. ``bpf_map_get_next_key()`` returns 0 on success, -ENOENT if | |
123 | cur_key is the last key in the hash, or negative error in case of | |
124 | failure. | |
125 | ||
126 | Note that if ``cur_key`` gets deleted then ``bpf_map_get_next_key()`` | |
127 | will instead return the *first* key in the hash table which is | |
128 | undesirable. It is recommended to use batched lookup if there is going | |
129 | to be key deletion intermixed with ``bpf_map_get_next_key()``. | |
130 | ||
131 | Examples | |
132 | ======== | |
133 | ||
134 | Please see the ``tools/testing/selftests/bpf`` directory for functional | |
135 | examples. The code snippets below demonstrates API usage. | |
136 | ||
137 | This example shows how to declare an LRU Hash with a struct key and a | |
138 | struct value. | |
139 | ||
140 | .. code-block:: c | |
141 | ||
142 | #include <linux/bpf.h> | |
143 | #include <bpf/bpf_helpers.h> | |
144 | ||
145 | struct key { | |
146 | __u32 srcip; | |
147 | }; | |
148 | ||
149 | struct value { | |
150 | __u64 packets; | |
151 | __u64 bytes; | |
152 | }; | |
153 | ||
154 | struct { | |
155 | __uint(type, BPF_MAP_TYPE_LRU_HASH); | |
156 | __uint(max_entries, 32); | |
157 | __type(key, struct key); | |
158 | __type(value, struct value); | |
159 | } packet_stats SEC(".maps"); | |
160 | ||
161 | This example shows how to create or update hash values using atomic | |
162 | instructions: | |
163 | ||
164 | .. code-block:: c | |
165 | ||
166 | static void update_stats(__u32 srcip, int bytes) | |
167 | { | |
168 | struct key key = { | |
169 | .srcip = srcip, | |
170 | }; | |
171 | struct value *value = bpf_map_lookup_elem(&packet_stats, &key); | |
172 | ||
173 | if (value) { | |
174 | __sync_fetch_and_add(&value->packets, 1); | |
175 | __sync_fetch_and_add(&value->bytes, bytes); | |
176 | } else { | |
177 | struct value newval = { 1, bytes }; | |
178 | ||
179 | bpf_map_update_elem(&packet_stats, &key, &newval, BPF_NOEXIST); | |
180 | } | |
181 | } | |
182 | ||
183 | Userspace walking the map elements from the map declared above: | |
184 | ||
185 | .. code-block:: c | |
186 | ||
187 | #include <bpf/libbpf.h> | |
188 | #include <bpf/bpf.h> | |
189 | ||
190 | static void walk_hash_elements(int map_fd) | |
191 | { | |
192 | struct key *cur_key = NULL; | |
193 | struct key next_key; | |
194 | struct value value; | |
195 | int err; | |
196 | ||
197 | for (;;) { | |
198 | err = bpf_map_get_next_key(map_fd, cur_key, &next_key); | |
199 | if (err) | |
200 | break; | |
201 | ||
202 | bpf_map_lookup_elem(map_fd, &next_key, &value); | |
203 | ||
204 | // Use key and value here | |
205 | ||
206 | cur_key = &next_key; | |
207 | } | |
208 | } |