Commit | Line | Data |
---|---|---|
979855d3 DH |
1 | .. SPDX-License-Identifier: GPL-2.0-only |
2 | .. Copyright (C) 2022 Red Hat, Inc. | |
3 | ||
4 | =============================================== | |
5 | BPF_MAP_TYPE_HASH, with PERCPU and LRU Variants | |
6 | =============================================== | |
7 | ||
8 | .. note:: | |
9 | - ``BPF_MAP_TYPE_HASH`` was introduced in kernel version 3.19 | |
10 | - ``BPF_MAP_TYPE_PERCPU_HASH`` was introduced in version 4.6 | |
11 | - Both ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH`` | |
12 | were introduced in version 4.10 | |
13 | ||
14 | ``BPF_MAP_TYPE_HASH`` and ``BPF_MAP_TYPE_PERCPU_HASH`` provide general | |
15 | purpose hash map storage. Both the key and the value can be structs, | |
16 | allowing for composite keys and values. | |
17 | ||
18 | The kernel is responsible for allocating and freeing key/value pairs, up | |
19 | to the max_entries limit that you specify. Hash maps use pre-allocation | |
20 | of hash table elements by default. The ``BPF_F_NO_PREALLOC`` flag can be | |
21 | used to disable pre-allocation when it is too memory expensive. | |
22 | ||
23 | ``BPF_MAP_TYPE_PERCPU_HASH`` provides a separate value slot per | |
24 | CPU. The per-cpu values are stored internally in an array. | |
25 | ||
26 | The ``BPF_MAP_TYPE_LRU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH`` | |
27 | variants add LRU semantics to their respective hash tables. An LRU hash | |
28 | will automatically evict the least recently used entries when the hash | |
29 | table reaches capacity. An LRU hash maintains an internal LRU list that | |
30 | is used to select elements for eviction. This internal LRU list is | |
31 | shared across CPUs but it is possible to request a per CPU LRU list with | |
32 | the ``BPF_F_NO_COMMON_LRU`` flag when calling ``bpf_map_create``. | |
33 | ||
34 | Usage | |
35 | ===== | |
36 | ||
37 | .. c:function:: | |
38 | long bpf_map_update_elem(struct bpf_map *map, const void *key, const void *value, u64 flags) | |
39 | ||
40 | Hash entries can be added or updated using the ``bpf_map_update_elem()`` | |
41 | helper. This helper replaces existing elements atomically. The ``flags`` | |
42 | parameter can be used to control the update behaviour: | |
43 | ||
44 | - ``BPF_ANY`` will create a new element or update an existing element | |
45 | - ``BPF_NOEXIST`` will create a new element only if one did not already | |
46 | exist | |
47 | - ``BPF_EXIST`` will update an existing element | |
48 | ||
49 | ``bpf_map_update_elem()`` returns 0 on success, or negative error in | |
50 | case of failure. | |
51 | ||
52 | .. c:function:: | |
53 | void *bpf_map_lookup_elem(struct bpf_map *map, const void *key) | |
54 | ||
55 | Hash entries can be retrieved using the ``bpf_map_lookup_elem()`` | |
56 | helper. This helper returns a pointer to the value associated with | |
57 | ``key``, or ``NULL`` if no entry was found. | |
58 | ||
59 | .. c:function:: | |
60 | long bpf_map_delete_elem(struct bpf_map *map, const void *key) | |
61 | ||
62 | Hash entries can be deleted using the ``bpf_map_delete_elem()`` | |
63 | helper. This helper will return 0 on success, or negative error in case | |
64 | of failure. | |
65 | ||
66 | Per CPU Hashes | |
67 | -------------- | |
68 | ||
69 | For ``BPF_MAP_TYPE_PERCPU_HASH`` and ``BPF_MAP_TYPE_LRU_PERCPU_HASH`` | |
70 | the ``bpf_map_update_elem()`` and ``bpf_map_lookup_elem()`` helpers | |
71 | automatically access the hash slot for the current CPU. | |
72 | ||
73 | .. c:function:: | |
74 | void *bpf_map_lookup_percpu_elem(struct bpf_map *map, const void *key, u32 cpu) | |
75 | ||
76 | The ``bpf_map_lookup_percpu_elem()`` helper can be used to lookup the | |
77 | value in the hash slot for a specific CPU. Returns value associated with | |
78 | ``key`` on ``cpu`` , or ``NULL`` if no entry was found or ``cpu`` is | |
79 | invalid. | |
80 | ||
81 | Concurrency | |
82 | ----------- | |
83 | ||
84 | Values stored in ``BPF_MAP_TYPE_HASH`` can be accessed concurrently by | |
85 | programs running on different CPUs. Since Kernel version 5.1, the BPF | |
86 | infrastructure provides ``struct bpf_spin_lock`` to synchronise access. | |
87 | See ``tools/testing/selftests/bpf/progs/test_spin_lock.c``. | |
88 | ||
89 | Userspace | |
90 | --------- | |
91 | ||
92 | .. c:function:: | |
93 | int bpf_map_get_next_key(int fd, const void *cur_key, void *next_key) | |
94 | ||
95 | In userspace, it is possible to iterate through the keys of a hash using | |
96 | libbpf's ``bpf_map_get_next_key()`` function. The first key can be fetched by | |
97 | calling ``bpf_map_get_next_key()`` with ``cur_key`` set to | |
98 | ``NULL``. Subsequent calls will fetch the next key that follows the | |
99 | current key. ``bpf_map_get_next_key()`` returns 0 on success, -ENOENT if | |
100 | cur_key is the last key in the hash, or negative error in case of | |
101 | failure. | |
102 | ||
103 | Note that if ``cur_key`` gets deleted then ``bpf_map_get_next_key()`` | |
104 | will instead return the *first* key in the hash table which is | |
105 | undesirable. It is recommended to use batched lookup if there is going | |
106 | to be key deletion intermixed with ``bpf_map_get_next_key()``. | |
107 | ||
108 | Examples | |
109 | ======== | |
110 | ||
111 | Please see the ``tools/testing/selftests/bpf`` directory for functional | |
112 | examples. The code snippets below demonstrates API usage. | |
113 | ||
114 | This example shows how to declare an LRU Hash with a struct key and a | |
115 | struct value. | |
116 | ||
117 | .. code-block:: c | |
118 | ||
119 | #include <linux/bpf.h> | |
120 | #include <bpf/bpf_helpers.h> | |
121 | ||
122 | struct key { | |
123 | __u32 srcip; | |
124 | }; | |
125 | ||
126 | struct value { | |
127 | __u64 packets; | |
128 | __u64 bytes; | |
129 | }; | |
130 | ||
131 | struct { | |
132 | __uint(type, BPF_MAP_TYPE_LRU_HASH); | |
133 | __uint(max_entries, 32); | |
134 | __type(key, struct key); | |
135 | __type(value, struct value); | |
136 | } packet_stats SEC(".maps"); | |
137 | ||
138 | This example shows how to create or update hash values using atomic | |
139 | instructions: | |
140 | ||
141 | .. code-block:: c | |
142 | ||
143 | static void update_stats(__u32 srcip, int bytes) | |
144 | { | |
145 | struct key key = { | |
146 | .srcip = srcip, | |
147 | }; | |
148 | struct value *value = bpf_map_lookup_elem(&packet_stats, &key); | |
149 | ||
150 | if (value) { | |
151 | __sync_fetch_and_add(&value->packets, 1); | |
152 | __sync_fetch_and_add(&value->bytes, bytes); | |
153 | } else { | |
154 | struct value newval = { 1, bytes }; | |
155 | ||
156 | bpf_map_update_elem(&packet_stats, &key, &newval, BPF_NOEXIST); | |
157 | } | |
158 | } | |
159 | ||
160 | Userspace walking the map elements from the map declared above: | |
161 | ||
162 | .. code-block:: c | |
163 | ||
164 | #include <bpf/libbpf.h> | |
165 | #include <bpf/bpf.h> | |
166 | ||
167 | static void walk_hash_elements(int map_fd) | |
168 | { | |
169 | struct key *cur_key = NULL; | |
170 | struct key next_key; | |
171 | struct value value; | |
172 | int err; | |
173 | ||
174 | for (;;) { | |
175 | err = bpf_map_get_next_key(map_fd, cur_key, &next_key); | |
176 | if (err) | |
177 | break; | |
178 | ||
179 | bpf_map_lookup_elem(map_fd, &next_key, &value); | |
180 | ||
181 | // Use key and value here | |
182 | ||
183 | cur_key = &next_key; | |
184 | } | |
185 | } |