.. contents::
.. sectnum::

======================================
BPF Instruction Set Architecture (ISA)
======================================

This document specifies the BPF instruction set architecture (ISA).

Documentation conventions
=========================

For brevity and consistency, this document refers to families
of types using a shorthand syntax and refers to several expository,
mnemonic functions when describing the semantics of instructions.
The range of valid values for those types and the semantics of those
functions are defined in the following subsections.

Types
-----
This document refers to integer types with the notation `SN` to specify
a type's signedness (`S`) and bit width (`N`), respectively.

.. table:: Meaning of signedness notation.

  ==== =========
  S    Meaning
  ==== =========
  u    unsigned
  s    signed
  ==== =========

.. table:: Meaning of bit-width notation.

  ===== =========
  N     Bit width
  ===== =========
  8     8 bits
  16    16 bits
  32    32 bits
  64    64 bits
  128   128 bits
  ===== =========

For example, `u32` is a type whose valid values are all the 32-bit unsigned
numbers and `s16` is a type whose valid values are all the 16-bit signed
numbers.

Functions
---------
* htobe16: Takes an unsigned 16-bit number in host-endian format and
  returns the equivalent number as an unsigned 16-bit number in big-endian
  format.
* htobe32: Takes an unsigned 32-bit number in host-endian format and
  returns the equivalent number as an unsigned 32-bit number in big-endian
  format.
* htobe64: Takes an unsigned 64-bit number in host-endian format and
  returns the equivalent number as an unsigned 64-bit number in big-endian
  format.
* htole16: Takes an unsigned 16-bit number in host-endian format and
  returns the equivalent number as an unsigned 16-bit number in little-endian
  format.
* htole32: Takes an unsigned 32-bit number in host-endian format and
  returns the equivalent number as an unsigned 32-bit number in little-endian
  format.
* htole64: Takes an unsigned 64-bit number in host-endian format and
  returns the equivalent number as an unsigned 64-bit number in little-endian
  format.
* bswap16: Takes an unsigned 16-bit number in either big- or little-endian
  format and returns the equivalent number with the same bit width but
  opposite endianness.
* bswap32: Takes an unsigned 32-bit number in either big- or little-endian
  format and returns the equivalent number with the same bit width but
  opposite endianness.
* bswap64: Takes an unsigned 64-bit number in either big- or little-endian
  format and returns the equivalent number with the same bit width but
  opposite endianness.
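
As a rough illustration, the functions listed above can be modelled on plain
integers in Python. The names ``htobe``, ``htole`` and ``bswap`` and the
``width`` parameter are assumptions made for this sketch (the document names one
function per width), and ``sys.byteorder`` stands in for the host byte order.

.. code-block:: python

  import sys

  def _convert(value: int, width: int, target: str) -> int:
      """Lay the value out in host byte order, then read it back in `target` order."""
      raw = value.to_bytes(width // 8, sys.byteorder)
      return int.from_bytes(raw, target)

  def htobe(value: int, width: int) -> int:
      # Hypothetical stand-in for htobe16/htobe32/htobe64.
      return _convert(value, width, "big")

  def htole(value: int, width: int) -> int:
      # Hypothetical stand-in for htole16/htole32/htole64.
      return _convert(value, width, "little")

  def bswap(value: int, width: int) -> int:
      # Hypothetical stand-in for bswap16/bswap32/bswap64: always reverses the bytes.
      return int.from_bytes(value.to_bytes(width // 8, "big"), "little")

On any host, ``bswap(0x1234, 16)`` gives ``0x3412``. Exactly one of
``htobe(0x1234, 16)`` and ``htole(0x1234, 16)`` gives that swapped value; the
other returns its input unchanged.
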

Definitions
-----------

.. glossary::

  Sign Extend
    To `sign extend an` ``X`` `-bit number, A, to a` ``Y`` `-bit number, B,` means to

    #. Copy all ``X`` bits from `A` to the lower ``X`` bits of `B`.
    #. Set the value of the remaining ``Y`` - ``X`` bits of `B` to the value of
       the most-significant bit of `A`.

.. admonition:: Example

  Sign extend an 8-bit number ``A`` to a 16-bit number ``B`` on a big-endian platform:
  ::

    A:          10000110
    B: 11111111 10000110
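
The two steps of the definition can be sketched in Python as follows. The
function name ``sign_extend`` and its argument order are assumptions of this
example; values are handled as unsigned bit patterns.

.. code-block:: python

  def sign_extend(a: int, x: int, y: int) -> int:
      """Sign extend the X-bit pattern `a` to a Y-bit pattern."""
      assert 0 < x <= y and 0 <= a < (1 << x)
      msb = (a >> (x - 1)) & 1                 # most-significant bit of A
      # Step 1: the lower X bits of B are a copy of A.
      # Step 2: the remaining Y - X bits all take the value of A's top bit.
      upper = (((1 << (y - x)) - 1) << x) if msb else 0
      return a | upper

Applied to the example above, ``sign_extend(0b10000110, 8, 16)`` returns
``0b1111111110000110``.
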

Conformance groups
------------------

An implementation does not need to support all instructions specified in this
document (e.g., deprecated instructions).  Instead, a number of conformance
groups are specified.  An implementation must support the base32 conformance
group and may support additional conformance groups, where supporting a
conformance group means it must support all instructions in that conformance
group.

The use of named conformance groups enables interoperability between a runtime
that executes instructions, and tools such as compilers that generate
instructions for the runtime.  Thus, capability discovery in terms of
conformance groups might be done manually by users or automatically by tools.

Each conformance group has a short ASCII label (e.g., "base32") that
corresponds to a set of instructions that are mandatory.  That is, each
instruction has one or more conformance groups of which it is a member.

This document defines the following conformance groups:

* base32: includes all instructions defined in this
  specification unless otherwise noted.
* base64: includes base32, plus instructions explicitly noted
  as being in the base64 conformance group.
* atomic32: includes 32-bit atomic operation instructions (see `Atomic operations`_).
* atomic64: includes atomic32, plus 64-bit atomic operation instructions.
* divmul32: includes 32-bit division, multiplication, and modulo instructions.
* divmul64: includes divmul32, plus 64-bit division, multiplication,
  and modulo instructions.
* packet: deprecated packet access instructions.

Instruction encoding
====================

BPF has two instruction encodings:

* the basic instruction encoding, which uses 64 bits to encode an instruction
* the wide instruction encoding, which appends a second 64 bits
  after the basic instruction for a total of 128 bits.

Basic instruction encoding
--------------------------

A basic instruction is encoded as follows::

  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |    opcode     |     regs      |            offset             |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                              imm                              |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

**opcode**
  operation to perform, encoded as follows::

    +-+-+-+-+-+-+-+-+
    |specific |class|
    +-+-+-+-+-+-+-+-+

  **specific**
    The format of these bits varies by instruction class

  **class**
    The instruction class (see `Instruction classes`_)

**regs**
  The source and destination register numbers, encoded as follows
  on a little-endian host::

    +-+-+-+-+-+-+-+-+
    |src_reg|dst_reg|
    +-+-+-+-+-+-+-+-+

  and as follows on a big-endian host::

    +-+-+-+-+-+-+-+-+
    |dst_reg|src_reg|
    +-+-+-+-+-+-+-+-+

  **src_reg**
    the source register number (0-10), except where otherwise specified
    (`64-bit immediate instructions`_ reuse this field for other purposes)

  **dst_reg**
    destination register number (0-10), unless otherwise specified
    (future instructions might reuse this field for other purposes)

**offset**
  signed integer offset used with pointer arithmetic, except where
  otherwise specified (some arithmetic instructions reuse this field
  for other purposes)

**imm**
  signed integer immediate value

Note that the contents of multi-byte fields ('offset' and 'imm') are
stored using big-endian byte ordering on big-endian hosts and
little-endian byte ordering on little-endian hosts.

For example::

  opcode                  offset imm          assembly
         src_reg dst_reg
  07     0       1        00 00  44 33 22 11  r1 += 0x11223344 // little
         dst_reg src_reg
  07     1       0        00 00  11 22 33 44  r1 += 0x11223344 // big
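
The layout above can be reproduced with Python's ``struct`` module. This is a
minimal sketch only; the function name ``encode_basic`` and its ``byteorder``
parameter are assumptions of the example, not part of this specification.

.. code-block:: python

  import struct

  def encode_basic(opcode, dst_reg, src_reg, offset, imm, byteorder="little"):
      """Pack one 64-bit basic instruction as it would appear on the given host."""
      if byteorder == "little":
          regs = (src_reg << 4) | dst_reg   # src_reg occupies the upper four bits
          fmt = "<BBhi"                     # u8 opcode, u8 regs, s16 offset, s32 imm
      else:
          regs = (dst_reg << 4) | src_reg   # dst_reg occupies the upper four bits
          fmt = ">BBhi"
      return struct.pack(fmt, opcode, regs, offset, imm)

  little = encode_basic(0x07, dst_reg=1, src_reg=0, offset=0, imm=0x11223344)
  big = encode_basic(0x07, 1, 0, 0, 0x11223344, byteorder="big")

Here ``little.hex()`` is ``0701000044332211`` and ``big.hex()`` is
``0710000011223344``, matching the two rows of the example.
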

Note that most instructions do not use all of the fields.
Unused fields shall be cleared to zero.

Wide instruction encoding
-------------------------

Some instructions are defined to use the wide instruction encoding,
which uses two 32-bit immediate values.  The 64 bits following
the basic instruction format contain a pseudo instruction
with 'opcode', 'dst_reg', 'src_reg', and 'offset' all set to zero.

This is depicted in the following figure::

  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |    opcode     |     regs      |            offset             |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                              imm                              |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                           reserved                            |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
  |                           next_imm                            |
  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

**opcode**
  operation to perform, encoded as explained above

**regs**
  The source and destination register numbers (unless otherwise
  specified), encoded as explained above

**offset**
  signed integer offset used with pointer arithmetic, unless
  otherwise specified

**imm**
  signed integer immediate value

**reserved**
  unused, set to zero

**next_imm**
  second signed integer immediate value
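
A wide instruction can be sketched the same way, as a basic instruction
followed by 32 reserved zero bits and 'next_imm'. The helper name
``encode_wide_le`` is hypothetical and the sketch assumes a little-endian host.
The opcode ``0x18`` used below is ``{IMM, DW, LD}`` as derived from the opcode
tables later in this document.

.. code-block:: python

  import struct

  def encode_wide_le(opcode, dst_reg, src_reg, imm, next_imm):
      """Pack a 128-bit wide instruction for a little-endian host."""
      regs = (src_reg << 4) | dst_reg
      first = struct.pack("<BBhi", opcode, regs, 0, imm)   # 'offset' left at zero
      second = struct.pack("<Ii", 0, next_imm)             # reserved, then next_imm
      return first + second

  # Load the 64-bit constant 0x1122334455667788 into r1 (src_reg subtype 0).
  insn = encode_wide_le(0x18, dst_reg=1, src_reg=0,
                        imm=0x55667788, next_imm=0x11223344)

The resulting 16 bytes are ``18 01 00 00 88 77 66 55 00 00 00 00 44 33 22 11``.
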

Instruction classes
-------------------

The three least significant bits of the 'opcode' field store the instruction class:

=====  =====  ===============================  ===================================
class  value  description                      reference
=====  =====  ===============================  ===================================
LD     0x0    non-standard load operations     `Load and store instructions`_
LDX    0x1    load into register operations    `Load and store instructions`_
ST     0x2    store from immediate operations  `Load and store instructions`_
STX    0x3    store from register operations   `Load and store instructions`_
ALU    0x4    32-bit arithmetic operations     `Arithmetic and jump instructions`_
JMP    0x5    64-bit jump operations           `Arithmetic and jump instructions`_
JMP32  0x6    32-bit jump operations           `Arithmetic and jump instructions`_
ALU64  0x7    64-bit arithmetic operations     `Arithmetic and jump instructions`_
=====  =====  ===============================  ===================================

Arithmetic and jump instructions
================================

For arithmetic and jump instructions (``ALU``, ``ALU64``, ``JMP`` and
``JMP32``), the 8-bit 'opcode' field is divided into three parts::

  +-+-+-+-+-+-+-+-+
  |  code |s|class|
  +-+-+-+-+-+-+-+-+

**code**
  the operation code, whose meaning varies by instruction class

**s (source)**
  the source operand location, which unless otherwise specified is one of:

  ======  =====  ==============================================
  source  value  description
  ======  =====  ==============================================
  K       0      use 32-bit 'imm' value as source operand
  X       1      use 'src_reg' register value as source operand
  ======  =====  ==============================================

**instruction class**
  the instruction class (see `Instruction classes`_)
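
Splitting an arithmetic or jump opcode into these three parts amounts to two
shifts and three masks. The function and table names in the following Python
sketch are assumptions made for illustration.

.. code-block:: python

  CLASS_NAMES = ["LD", "LDX", "ST", "STX", "ALU", "JMP", "JMP32", "ALU64"]

  def split_alu_jmp_opcode(opcode: int):
      """Return (code, source, class) for an 8-bit arithmetic or jump opcode."""
      code = (opcode >> 4) & 0xF               # upper four bits
      source = "X" if (opcode >> 3) & 0x1 else "K"
      cls = CLASS_NAMES[opcode & 0x7]          # three least significant bits
      return code, source, cls

For example, ``split_alu_jmp_opcode(0x07)`` returns ``(0, "K", "ALU64")`` and
``split_alu_jmp_opcode(0x0c)`` returns ``(0, "X", "ALU")``.
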

Arithmetic instructions
-----------------------

``ALU`` uses 32-bit wide operands while ``ALU64`` uses 64-bit wide operands for
otherwise identical operations. ``ALU64`` instructions belong to the
base64 conformance group unless noted otherwise.
The 'code' field encodes the operation as below, where 'src' and 'dst' refer
to the values of the source and destination registers, respectively.

=====  =====  =======  ==========================================================
name   code   offset   description
=====  =====  =======  ==========================================================
ADD    0x0    0        dst += src
SUB    0x1    0        dst -= src
MUL    0x2    0        dst \*= src
DIV    0x3    0        dst = (src != 0) ? (dst / src) : 0
SDIV   0x3    1        dst = (src != 0) ? (dst s/ src) : 0
OR     0x4    0        dst \|= src
AND    0x5    0        dst &= src
LSH    0x6    0        dst <<= (src & mask)
RSH    0x7    0        dst >>= (src & mask)
NEG    0x8    0        dst = -dst
MOD    0x9    0        dst = (src != 0) ? (dst % src) : dst
SMOD   0x9    1        dst = (src != 0) ? (dst s% src) : dst
XOR    0xa    0        dst ^= src
MOV    0xb    0        dst = src
MOVSX  0xb    8/16/32  dst = (s8,s16,s32)src
ARSH   0xc    0        :term:`sign extending<Sign Extend>` dst >>= (src & mask)
END    0xd    0        byte swap operations (see `Byte swap instructions`_ below)
=====  =====  =======  ==========================================================

Underflow and overflow are allowed during arithmetic operations, meaning
the 64-bit or 32-bit value will wrap. If BPF program execution would
result in division by zero, the destination register is instead set to zero.
If execution would result in modulo by zero, for ``ALU64`` the value of
the destination register is unchanged whereas for ``ALU`` the upper
32 bits of the destination register are zeroed.

``{ADD, X, ALU}``, where 'code' = ``ADD``, 'source' = ``X``, and 'class' = ``ALU``, means::

  dst = (u32) ((u32) dst + (u32) src)

where '(u32)' indicates that the upper 32 bits are zeroed.

``{ADD, X, ALU64}`` means::

  dst = dst + src

``{XOR, K, ALU}`` means::

  dst = (u32) dst ^ (u32) imm

``{XOR, K, ALU64}`` means::

  dst = dst ^ imm
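
The wrapping and divide-by-zero rules can be modelled in a few lines of
Python. This is a sketch under stated assumptions: registers are plain
non-negative integers, the function names are hypothetical, and only the
unsigned ``ADD``, ``DIV`` and ``MOD`` operations are covered.

.. code-block:: python

  U32 = 0xFFFFFFFF
  U64 = 0xFFFFFFFFFFFFFFFF

  def alu_add(dst: int, src: int, is64: bool) -> int:
      mask = U64 if is64 else U32
      return ((dst & mask) + (src & mask)) & mask      # overflow wraps

  def alu_div(dst: int, src: int, is64: bool) -> int:
      mask = U64 if is64 else U32
      dst, src = dst & mask, src & mask
      return dst // src if src != 0 else 0             # division by zero gives 0

  def alu_mod(dst: int, src: int, is64: bool) -> int:
      mask = U64 if is64 else U32
      dst, src = dst & mask, src & mask
      # Modulo by zero keeps dst; for ALU the masking has already zeroed the
      # upper 32 bits.
      return dst % src if src != 0 else dst

With these definitions, ``alu_add(0xFFFFFFFF, 1, is64=False)`` wraps to ``0``,
while the same operands with ``is64=True`` give ``0x100000000``.
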

Note that most arithmetic instructions have 'offset' set to 0. Only three instructions
(``SDIV``, ``SMOD``, ``MOVSX``) have a non-zero 'offset'.

Division, multiplication, and modulo operations for ``ALU`` are part
of the "divmul32" conformance group, and division, multiplication, and
modulo operations for ``ALU64`` are part of the "divmul64" conformance
group.
The division and modulo operations support both unsigned and signed flavors.

For unsigned operations (``DIV`` and ``MOD``), for ``ALU``,
'imm' is interpreted as a 32-bit unsigned value. For ``ALU64``,
'imm' is first :term:`sign extended<Sign Extend>` from 32 to 64 bits, and then
interpreted as a 64-bit unsigned value.

For signed operations (``SDIV`` and ``SMOD``), for ``ALU``,
'imm' is interpreted as a 32-bit signed value. For ``ALU64``, 'imm'
is first :term:`sign extended<Sign Extend>` from 32 to 64 bits, and then
interpreted as a 64-bit signed value.

Note that there are varying definitions of the signed modulo operation
when the dividend or divisor is negative, where implementations often
vary by language such that Python, Ruby, etc.  differ from C, Go, Java,
etc. This specification requires that signed modulo use truncated division
(where -13 % 3 == -1) as implemented in C, Go, etc.:

   a % n = a - n * trunc(a / n)
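
Python's own ``%`` operator uses floored division, so it makes a convenient
contrast. The following sketch implements the truncated form given above with
integer arithmetic only. The name ``smod`` is an assumption of this example,
and the zero-divisor branch mirrors the ``SMOD`` row of the table in
`Arithmetic instructions`_.

.. code-block:: python

  def smod(a: int, n: int) -> int:
      """Signed modulo with truncated division: a - n * trunc(a / n)."""
      if n == 0:
          return a                      # modulo by zero leaves the dividend as is
      q = abs(a) // abs(n)              # magnitude of the truncated quotient
      if (a < 0) != (n < 0):
          q = -q                        # quotient is negative when signs differ
      return a - n * q

Here ``smod(-13, 3)`` is ``-1`` as required, whereas Python's ``-13 % 3``
evaluates to ``2``.
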

The ``MOVSX`` instruction does a move operation with sign extension.
``{MOVSX, X, ALU}`` :term:`sign extends<Sign Extend>` 8-bit and 16-bit operands into
32-bit operands, and zeroes the remaining upper 32 bits.
``{MOVSX, X, ALU64}`` :term:`sign extends<Sign Extend>` 8-bit, 16-bit, and 32-bit
operands into 64-bit operands.  Unlike other arithmetic instructions,
``MOVSX`` is only defined for register source operands (``X``).

The ``NEG`` instruction is only defined when the source bit is clear
(``K``).

Shift operations use a mask of 0x3F (63) for 64-bit operations and 0x1F (31)
for 32-bit operations.

Byte swap instructions
----------------------

The byte swap instructions use instruction classes of ``ALU`` and ``ALU64``
and a 4-bit 'code' field of ``END``.

The byte swap instructions operate on the destination register
only and do not use a separate source register or immediate value.

For ``ALU``, the 1-bit source operand field in the opcode is used to
select what byte order the operation converts from or to. For
``ALU64``, the 1-bit source operand field in the opcode is reserved
and must be set to 0.

=====  ========  =====  =================================================
class  source    value  description
=====  ========  =====  =================================================
ALU    TO_LE     0      convert between host byte order and little endian
ALU    TO_BE     1      convert between host byte order and big endian
ALU64  Reserved  0      do byte swap unconditionally
=====  ========  =====  =================================================

The 'imm' field encodes the width of the swap operations.  The following widths
are supported: 16, 32 and 64.  Width 64 operations belong to the base64
conformance group and other swap operations belong to the base32
conformance group.

Examples:

``{END, TO_LE, ALU}`` with 'imm' = 16/32/64 means::

  dst = htole16(dst)
  dst = htole32(dst)
  dst = htole64(dst)

``{END, TO_BE, ALU}`` with 'imm' = 16/32/64 means::

  dst = htobe16(dst)
  dst = htobe32(dst)
  dst = htobe64(dst)

``{END, TO_LE, ALU64}`` with 'imm' = 16/32/64 means::

  dst = bswap16(dst)
  dst = bswap32(dst)
  dst = bswap64(dst)
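
The three cases can be folded into one illustrative Python function. The name
``end_insn`` and the explicit ``host`` parameter are assumptions of this
sketch, as is truncating 'dst' to the low 'imm' bits before converting; this
document only describes the conversion functions on values of that width.

.. code-block:: python

  def end_insn(dst: int, imm: int, cls: str, source: str, host: str = "little") -> int:
      """Model an END instruction; `imm` is the width in bits (16, 32 or 64)."""
      value = dst & ((1 << imm) - 1)            # assumed: operate on the low imm bits
      raw = value.to_bytes(imm // 8, host)      # bytes as laid out on the host
      if cls == "ALU64":
          # Unconditional swap: read the bytes back in the opposite order.
          target = "big" if host == "little" else "little"
      else:
          target = "big" if source == "TO_BE" else "little"
      return int.from_bytes(raw, target)

On a little-endian host, ``end_insn(0x1234, 16, "ALU", "TO_BE")`` gives
``0x3412`` and ``end_insn(0x1234, 16, "ALU", "TO_LE")`` leaves the value at
``0x1234``.
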

Jump instructions
-----------------

``JMP32`` uses 32-bit wide operands and indicates the base32
conformance group, while ``JMP`` uses 64-bit wide operands for
otherwise identical operations, and indicates the base64 conformance
group unless otherwise specified.
The 'code' field encodes the operation as below:

========  =====  =======  ===============================  ===================================================
code      value  src_reg  description                      notes
========  =====  =======  ===============================  ===================================================
JA        0x0    0x0      PC += offset                     {JA, K, JMP} only
JA        0x0    0x0      PC += imm                        {JA, K, JMP32} only
JEQ       0x1    any      PC += offset if dst == src
JGT       0x2    any      PC += offset if dst > src        unsigned
JGE       0x3    any      PC += offset if dst >= src       unsigned
JSET      0x4    any      PC += offset if dst & src
JNE       0x5    any      PC += offset if dst != src
JSGT      0x6    any      PC += offset if dst > src        signed
JSGE      0x7    any      PC += offset if dst >= src       signed
CALL      0x8    0x0      call helper function by address  {CALL, K, JMP} only, see `Helper functions`_
CALL      0x8    0x1      call PC += imm                   {CALL, K, JMP} only, see `Program-local functions`_
CALL      0x8    0x2      call helper function by BTF ID   {CALL, K, JMP} only, see `Helper functions`_
EXIT      0x9    0x0      return                           {EXIT, K, JMP} only
JLT       0xa    any      PC += offset if dst < src        unsigned
JLE       0xb    any      PC += offset if dst <= src       unsigned
JSLT      0xc    any      PC += offset if dst < src        signed
JSLE      0xd    any      PC += offset if dst <= src       signed
========  =====  =======  ===============================  ===================================================

The BPF program needs to store the return value into register R0 before doing an
``EXIT``.

Example:

``{JSGE, X, JMP32}`` means::

  if (s32)dst s>= (s32)src goto +offset

where 's>=' indicates a signed '>=' comparison.
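
A small Python sketch of that comparison follows. The helper names are
hypothetical, and registers are modelled as non-negative 64-bit integers whose
low 32 bits are reinterpreted as a signed value.

.. code-block:: python

  def to_s32(value: int) -> int:
      """Reinterpret the low 32 bits of `value` as a signed 32-bit integer."""
      value &= 0xFFFFFFFF
      return value - (1 << 32) if value & 0x80000000 else value

  def jsge_jmp32_taken(dst: int, src: int) -> bool:
      """True when {JSGE, X, JMP32} would take the branch."""
      return to_s32(dst) >= to_s32(src)

With 'dst' = ``0xFFFFFFFF`` (that is, -1 as ``s32``) and 'src' = 0, the branch is
not taken, although an unsigned comparison of the same bit patterns would
succeed.
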

``{JA, K, JMP32}`` means::

  gotol +imm

where 'imm' means the branch offset comes from the 'imm' field.

Note that there are two flavors of ``JA`` instructions. The
``JMP`` class permits a 16-bit jump offset specified by the 'offset'
field, whereas the ``JMP32`` class permits a 32-bit jump offset
specified by the 'imm' field. A conditional jump whose offset does not
fit in 16 bits may be converted to a conditional jump whose offset does
fit in 16 bits plus a 32-bit unconditional jump.

All ``CALL`` and ``JA`` instructions belong to the
base32 conformance group.

Helper functions
~~~~~~~~~~~~~~~~

Helper functions are a concept whereby BPF programs can call into a
set of functions exposed by the underlying platform.

Historically, each helper function was identified by an address
encoded in the 'imm' field.  The available helper functions may differ
for each program type, but address values are unique across all program types.

Platforms that support the BPF Type Format (BTF) support identifying
a helper function by a BTF ID encoded in the 'imm' field, where the BTF ID
identifies the helper name and type.

Program-local functions
~~~~~~~~~~~~~~~~~~~~~~~
Program-local functions are functions exposed by the same BPF program as the
caller, and are referenced by offset from the call instruction, similar to
``JA``.  The offset is encoded in the 'imm' field of the call instruction.
An ``EXIT`` within the program-local function will return to the caller.

Load and store instructions
===========================

For load and store instructions (``LD``, ``LDX``, ``ST``, and ``STX``), the
8-bit 'opcode' field is divided as follows::

  +-+-+-+-+-+-+-+-+
  |mode |sz |class|
  +-+-+-+-+-+-+-+-+

**mode**
  The mode modifier is one of:

  =============  =====  ====================================  =============
  mode modifier  value  description                           reference
  =============  =====  ====================================  =============
  IMM            0      64-bit immediate instructions         `64-bit immediate instructions`_
  ABS            1      legacy BPF packet access (absolute)   `Legacy BPF Packet access instructions`_
  IND            2      legacy BPF packet access (indirect)   `Legacy BPF Packet access instructions`_
  MEM            3      regular load and store operations     `Regular load and store operations`_
  MEMSX          4      sign-extension load operations        `Sign-extension load operations`_
  ATOMIC         6      atomic operations                     `Atomic operations`_
  =============  =====  ====================================  =============

**sz (size)**
  The size modifier is one of:

  ====  =====  =====================
  size  value  description
  ====  =====  =====================
  W     0      word        (4 bytes)
  H     1      half word   (2 bytes)
  B     2      byte
  DW    3      double word (8 bytes)
  ====  =====  =====================

  Instructions using ``DW`` belong to the base64 conformance group.

**class**
  The instruction class (see `Instruction classes`_)
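
As with arithmetic opcodes, a load or store opcode can be split with shifts
and masks. The names in this Python sketch are assumptions made for
illustration, and it covers only the four load and store classes.

.. code-block:: python

  MODES = {0: "IMM", 1: "ABS", 2: "IND", 3: "MEM", 4: "MEMSX", 6: "ATOMIC"}
  SIZES = ["W", "H", "B", "DW"]
  LDST_CLASSES = ["LD", "LDX", "ST", "STX"]

  def split_ldst_opcode(opcode: int):
      """Return (mode, size, class) for an 8-bit load or store opcode."""
      mode = MODES[(opcode >> 5) & 0x7]        # upper three bits
      size = SIZES[(opcode >> 3) & 0x3]        # next two bits
      cls = LDST_CLASSES[opcode & 0x7]         # three least significant bits
      return mode, size, cls

For example, ``split_ldst_opcode(0x18)`` returns ``("IMM", "DW", "LD")`` and
``split_ldst_opcode(0x61)`` returns ``("MEM", "W", "LDX")``.
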

Regular load and store operations
---------------------------------

The ``MEM`` mode modifier is used to encode regular load and store
instructions that transfer data between a register and memory.

``{MEM, <size>, STX}`` means::

  *(size *) (dst + offset) = src

``{MEM, <size>, ST}`` means::

  *(size *) (dst + offset) = imm

``{MEM, <size>, LDX}`` means::

  dst = *(unsigned size *) (src + offset)

Where '<size>' is one of: ``B``, ``H``, ``W``, or ``DW``, and
'unsigned size' is one of: u8, u16, u32, or u64.

Sign-extension load operations
------------------------------

The ``MEMSX`` mode modifier is used to encode :term:`sign-extension<Sign Extend>` load
instructions that transfer data between a register and memory.

``{MEMSX, <size>, LDX}`` means::

  dst = *(signed size *) (src + offset)

Where '<size>' is one of: ``B``, ``H``, or ``W``, and
'signed size' is one of: s8, s16, or s32.

Atomic operations
-----------------

Atomic operations are operations that operate on memory and cannot be
interrupted or corrupted by other access to the same memory region
by other BPF programs or means outside of this specification.

All atomic operations supported by BPF are encoded as store operations
that use the ``ATOMIC`` mode modifier as follows:

* ``{ATOMIC, W, STX}`` for 32-bit operations, which are
  part of the "atomic32" conformance group.
* ``{ATOMIC, DW, STX}`` for 64-bit operations, which are
  part of the "atomic64" conformance group.
* 8-bit and 16-bit wide atomic operations are not supported.

The 'imm' field is used to encode the actual atomic operation.
Simple atomic operations use a subset of the values defined to encode
arithmetic operations in the 'imm' field to encode the atomic operation:

========  =====  ===========
imm       value  description
========  =====  ===========
ADD       0x00   atomic add
OR        0x40   atomic or
AND       0x50   atomic and
XOR       0xa0   atomic xor
========  =====  ===========


``{ATOMIC, W, STX}`` with 'imm' = ADD means::

  *(u32 *)(dst + offset) += src

``{ATOMIC, DW, STX}`` with 'imm' = ADD means::

  *(u64 *)(dst + offset) += src

In addition to the simple atomic operations, there is also a modifier and
two complex atomic operations:

===========  ================  ===========================
imm          value             description
===========  ================  ===========================
FETCH        0x01              modifier: return old value
XCHG         0xe0 | FETCH      atomic exchange
CMPXCHG      0xf0 | FETCH      atomic compare and exchange
===========  ================  ===========================

The ``FETCH`` modifier is optional for simple atomic operations, and
always set for the complex atomic operations.  If the ``FETCH`` flag
is set, then the operation also overwrites ``src`` with the value that
was in memory before it was modified.

The ``XCHG`` operation atomically exchanges ``src`` with the value
addressed by ``dst + offset``.

The ``CMPXCHG`` operation atomically compares the value addressed by
``dst + offset`` with ``R0``. If they match, the value addressed by
``dst + offset`` is replaced with ``src``.  In either case, the
value that was at ``dst + offset`` before the operation is zero-extended
and loaded back to ``R0``.
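
The data flow of these operations (not their atomicity) can be sketched in
Python. Everything below is illustrative: memory is a dictionary keyed by
address, registers are a dictionary keyed by name, and the function name
``atomic_op`` and its parameters are assumptions of this example. Comparing
only the low ``bits`` of ``R0`` for 32-bit ``CMPXCHG`` is likewise an
assumption.

.. code-block:: python

  FETCH = 0x01
  XCHG = 0xE0 | FETCH
  CMPXCHG = 0xF0 | FETCH
  SIMPLE_OPS = {
      0x00: lambda old, val: old + val,    # ADD
      0x40: lambda old, val: old | val,    # OR
      0x50: lambda old, val: old & val,    # AND
      0xA0: lambda old, val: old ^ val,    # XOR
  }

  def atomic_op(mem, addr, imm, regs, src, bits=64):
      """Apply the operation selected by `imm` to mem[addr] (addr = dst + offset)."""
      mask = (1 << bits) - 1
      old = mem[addr] & mask
      val = regs[src] & mask
      if imm == CMPXCHG:
          if old == (regs["R0"] & mask):
              mem[addr] = val              # replaced only when the values match
          regs["R0"] = old                 # old value, zero-extended, goes to R0
      elif imm == XCHG:
          mem[addr] = val
          regs[src] = old                  # FETCH is always set for XCHG
      else:
          mem[addr] = SIMPLE_OPS[imm & ~FETCH](old, val) & mask
          if imm & FETCH:
              regs[src] = old              # optional FETCH returns the old value

For example, with ``mem = {0: 5}`` and ``regs = {"R0": 0, "R1": 3}``, calling
``atomic_op(mem, 0, 0x00 | FETCH, regs, "R1")`` leaves 8 in memory and 5 in
``R1``.
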

64-bit immediate instructions
-----------------------------

Instructions with the ``IMM`` 'mode' modifier use the wide instruction
encoding defined in `Instruction encoding`_, and use the 'src_reg' field of the
basic instruction to hold an opcode subtype.

The following table defines a set of ``{IMM, DW, LD}`` instructions
with opcode subtypes in the 'src_reg' field, using new terms such as "map"
defined further below:

=======  =========================================  ===========  ==============
src_reg  pseudocode                                 imm type     dst type
=======  =========================================  ===========  ==============
0x0      dst = (next_imm << 32) | imm               integer      integer
0x1      dst = map_by_fd(imm)                       map fd       map
0x2      dst = map_val(map_by_fd(imm)) + next_imm   map fd       data pointer
0x3      dst = var_addr(imm)                        variable id  data pointer
0x4      dst = code_addr(imm)                       integer      code pointer
0x5      dst = map_by_idx(imm)                      map index    map
0x6      dst = map_val(map_by_idx(imm)) + next_imm  map index    data pointer
=======  =========================================  ===========  ==============

where

* map_by_fd(imm) means to convert a 32-bit file descriptor into an address of a map (see `Maps`_)
* map_by_idx(imm) means to convert a 32-bit index into an address of a map
* map_val(map) gets the address of the first value in a given map
* var_addr(imm) gets the address of a platform variable (see `Platform Variables`_) with a given id
* code_addr(imm) gets the address of the instruction at a specified relative offset in number of (64-bit) instructions
* the 'imm type' can be used by disassemblers for display
* the 'dst type' can be used for verification and JIT compilation purposes

Maps
~~~~

Maps are shared memory regions accessible by BPF programs on some platforms.
A map can have various semantics as defined in a separate document, and may or
may not have a single contiguous memory region, but the 'map_val(map)' is
currently only defined for maps that do have a single contiguous memory region.

Each map can have a file descriptor (fd) if supported by the platform, where
'map_by_fd(imm)' means to get the map with the specified file descriptor. Each
BPF program can also be defined to use a set of maps associated with the
program at load time, and 'map_by_idx(imm)' means to get the map with the given
index in the set associated with the BPF program containing the instruction.

Platform Variables
~~~~~~~~~~~~~~~~~~

Platform variables are memory regions, identified by integer ids, exposed by
the runtime and accessible by BPF programs on some platforms.  The
'var_addr(imm)' operation means to get the address of the memory region
identified by the given id.

Legacy BPF Packet access instructions
-------------------------------------

BPF previously introduced special instructions for access to packet data that were
carried over from classic BPF. These instructions used an instruction
class of ``LD``, a size modifier of ``W``, ``H``, or ``B``, and a
mode modifier of ``ABS`` or ``IND``.  The 'dst_reg' and 'offset' fields were
set to zero, and 'src_reg' was set to zero for ``ABS``.  However, these
instructions are deprecated and should no longer be used.  All legacy packet
access instructions belong to the "packet" conformance group.