1 | .. SPDX-License-Identifier: GPL-2.0 |
2 | .. Copyright 2021-2023 Collabora Ltd. | |
3 | ||
4 | ======================== | |
5 | Exchanging pixel buffers | |
6 | ======================== | |
7 | ||
8 | As originally designed, the Linux graphics subsystem had extremely limited | |
9 | support for sharing pixel-buffer allocations between processes, devices, and | |
10 | subsystems. Modern systems require extensive integration between all three | |
11 | classes; this document details how applications and kernel subsystems should | |
12 | approach this sharing for two-dimensional image data. | |
13 | ||
14 | It is written with reference to the DRM subsystem for GPU and display devices, | |
15 | V4L2 for media devices, and also to Vulkan, EGL and Wayland, for userspace | |
16 | support; however, any other subsystems should also follow this design and advice. | |
17 | ||
18 | ||
19 | Glossary of terms | |
20 | ================= | |
21 | ||
22 | .. glossary:: | |
23 | ||
24 | image: | |
25 | Conceptually a two-dimensional array of pixels. The pixels may be stored | |
26 | in one or more memory buffers. Has width and height in pixels, pixel | |
27 | format and modifier (implicit or explicit). | |
28 | ||
29 | row: | |
30 | A span along a single y-axis value, e.g. from co-ordinates (0,100) to | |
31 | (200,100). | |
32 | ||
33 | scanline: | |
34 | Synonym for row. | |
35 | ||
36 | column: | |
37 | A span along a single x-axis value, e.g. from co-ordinates (100,0) to | |
38 | (100,100). | |
39 | ||
40 | memory buffer: | |
41 | A piece of memory for storing (parts of) pixel data. Has stride and size | |
42 | in bytes and at least one handle in some API. May contain one or more | |
43 | planes. | |
44 | ||
45 | plane: | |
46 | A two-dimensional array of some or all of an image's color and alpha | |
47 | channel values. | |
48 | ||
49 | pixel: | |
50 | A picture element. Has a single color value which is defined by one or | |
51 | more color channel values, e.g. R, G and B, or Y, Cb and Cr. May also | |
52 | have an alpha value as an additional channel. | |
53 | ||
54 | pixel data: | |
55 | Bytes or bits that represent some or all of the color/alpha channel values | |
56 | of a pixel or an image. The data for one pixel may be spread over several | |
57 | planes or memory buffers depending on format and modifier. | |
58 | ||
59 | color value: | |
60 | A tuple of numbers, representing a color. Each element in the tuple is a | |
61 | color channel value. | |
62 | ||
63 | color channel: | |
64 | One of the dimensions in a color model. For example, the RGB model has | |
65 | channels R, G, and B. Alpha channel is sometimes counted as a color | |
66 | channel as well. | |
67 | ||
68 | pixel format: | |
69 | A description of how pixel data represents the pixel's color and alpha | |
70 | values. | |
71 | ||
72 | modifier: | |
73 | A description of how pixel data is laid out in memory buffers. | |
74 | ||
75 | alpha: | |
76 | A value that denotes the color coverage in a pixel. Sometimes used for | |
77 | translucency instead. | |
78 | ||
79 | stride: | |
80 | A value that denotes the relationship between pixel-location co-ordinates | |
81 | and byte-offset values. Typically used as the byte offset between two | |
82 | pixels at the start of vertically-consecutive tiling blocks. For linear | |
83 | layouts, the byte offset between two vertically-adjacent pixels. For | |
84 | non-linear formats the stride must be computed in a consistent way, which | |
85 | usually is done as if the layout were linear. | |
86 | ||
87 | pitch: | |
88 | Synonym for stride. | |
89 | ||
90 | ||
91 | Formats and modifiers | |
92 | ===================== | |
93 | ||
94 | Each buffer must have an underlying format. This format describes the color | |
95 | values provided for each pixel. Although each subsystem has its own format | |
96 | descriptions (e.g. V4L2 and fbdev), the ``DRM_FORMAT_*`` tokens should be reused | |
97 | wherever possible, as they are the standard descriptions used for interchange. | |
98 | These tokens are described in the ``drm_fourcc.h`` file, which is a part of | |
99 | DRM's uAPI. | |
100 | ||
101 | Each ``DRM_FORMAT_*`` token describes the translation between a pixel | |
102 | co-ordinate in an image, and the color values for that pixel contained within | |
103 | its memory buffers. The number and type of color channels are described: | |
104 | whether they are RGB or YUV, integer or floating-point, the size of each channel | |
105 | and their locations within the pixel memory, and the relationship between color | |
106 | planes. | |
107 | ||
108 | For example, ``DRM_FORMAT_ARGB8888`` describes a format in which each pixel has | |
109 | a single 32-bit value in memory. Alpha, red, green, and blue color channels are | |
110 | available at 8-bit precision per channel, ordered respectively from most to | |
111 | least significant bits in little-endian storage. ``DRM_FORMAT_*`` is not | |
112 | affected by either CPU or device endianness; the byte pattern in memory is | |
113 | always as described in the format definition, which is usually little-endian. | |
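
As a minimal sketch of the ``DRM_FORMAT_ARGB8888`` description above, written in Python purely for illustration: ``fourcc_code`` mirrors the macro of the same name in ``drm_fourcc.h``, while ``pack_argb8888`` is a hypothetical helper that is not part of any API.

```python
import struct

def fourcc_code(a: str, b: str, c: str, d: str) -> int:
    """Pack four ASCII characters into a 32-bit code, first character lowest."""
    return ord(a) | (ord(b) << 8) | (ord(c) << 16) | (ord(d) << 24)

# drm_fourcc.h defines DRM_FORMAT_ARGB8888 as fourcc_code('A', 'R', '2', '4').
DRM_FORMAT_ARGB8888 = fourcc_code("A", "R", "2", "4")

def pack_argb8888(a: int, r: int, g: int, b: int) -> bytes:
    """Hypothetical helper: encode one pixel as it is stored in memory."""
    value = (a << 24) | (r << 16) | (g << 8) | b  # alpha in the most significant bits
    return struct.pack("<I", value)               # little-endian on every host

pixel = pack_argb8888(0xFF, 0x11, 0x22, 0x33)
```

Read at increasing addresses, the four bytes of ``pixel`` hold blue, green, red, then alpha, regardless of the endianness of the CPU running the code.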
114 | ||
115 | As a more complex example, ``DRM_FORMAT_NV12`` describes a format in which luma | |
116 | and chroma YUV samples are stored in separate planes, where the chroma plane is | |
117 | stored at half the resolution in both dimensions (i.e. one U/V chroma | |
118 | sample is stored for each 2x2 pixel grouping). | |
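
The relationship between the two NV12 planes can be sketched as follows. ``nv12_plane_samples`` is a hypothetical helper, and rounding odd dimensions up is an assumption of this sketch rather than something stated above.

```python
def nv12_plane_samples(width: int, height: int) -> dict:
    """Return the sample grid (columns, rows) of each NV12 plane."""
    # One U/V sample pair covers a 2x2 block of pixels.
    # ASSUMPTION: odd dimensions are rounded up to the next whole block.
    chroma_w = (width + 1) // 2
    chroma_h = (height + 1) // 2
    return {"luma": (width, height), "chroma": (chroma_w, chroma_h)}

planes = nv12_plane_samples(1920, 1080)
```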
119 | ||
120 | Format modifiers describe a translation mechanism between these per-pixel memory | |
121 | samples, and the actual memory storage for the buffer. The most straightforward | |
122 | modifier is ``DRM_FORMAT_MOD_LINEAR``, describing a scheme in which each plane | |
123 | is laid out row-sequentially, from the top-left to the bottom-right corner. | |
124 | This is considered the baseline interchange format, and most convenient for CPU | |
125 | access. | |
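
For a linear plane, locating a pixel reduces to simple arithmetic. The sketch below assumes a format with a whole number of bytes per pixel and uses the per-plane offset and stride described later in this document; ``linear_byte_offset`` is a hypothetical name.

```python
def linear_byte_offset(x: int, y: int, stride: int,
                       bytes_per_pixel: int, plane_offset: int = 0) -> int:
    """Byte position of pixel (x, y) within a DRM_FORMAT_MOD_LINEAR plane."""
    # Rows follow each other at 'stride' bytes; pixels within a row are adjacent.
    return plane_offset + y * stride + x * bytes_per_pixel
```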
126 | ||
127 | Modern hardware employs much more sophisticated access mechanisms, typically | |
128 | making use of tiled access and possibly also compression. For example, the | |
129 | ``DRM_FORMAT_MOD_VIVANTE_TILED`` modifier describes memory storage where pixels | |
130 | are stored in 4x4 blocks arranged in row-major ordering, i.e. the first tile in | |
131 | a plane stores pixels (0,0) to (3,3) inclusive, and the second tile in a plane | |
132 | stores pixels (4,0) to (7,3) inclusive. | |
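
The tile ordering described above can be sketched as below. The function names are hypothetical, and the ordering of pixels inside each tile is assumed to be row-major for the purpose of the example; consult ``drm_fourcc.h`` for the authoritative definition.

```python
TILE_W = 4
TILE_H = 4

def tile_index(x: int, y: int, width: int) -> int:
    """Index of the 4x4 tile holding pixel (x, y); tiles are row-major."""
    tiles_per_row = (width + TILE_W - 1) // TILE_W
    return (y // TILE_H) * tiles_per_row + (x // TILE_W)

def tiled_pixel_index(x: int, y: int, width: int) -> int:
    """Pixel position in storage order.

    ASSUMPTION: pixels inside a tile are also stored row-major.
    """
    within_tile = (y % TILE_H) * TILE_W + (x % TILE_W)
    return tile_index(x, y, width) * (TILE_W * TILE_H) + within_tile
```

Multiplying the result of ``tiled_pixel_index`` by the number of bytes per pixel would give a byte offset within the plane under the same assumption.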
133 | ||
134 | Some modifiers may modify the number of planes required for an image; for | |
135 | example, the ``I915_FORMAT_MOD_Y_TILED_CCS`` modifier adds a second plane to RGB | |
136 | formats in which it stores data about the status of every tile, notably | |
137 | including whether the tile is fully populated with pixel data, or can be | |
138 | expanded from a single solid color. | |
139 | ||
140 | These extended layouts are highly vendor-specific, and even specific to | |
141 | particular generations or configurations of devices per-vendor. For this reason, | |
142 | support of modifiers must be explicitly enumerated and negotiated by all users | |
143 | in order to ensure a compatible and optimal pipeline, as discussed below. | |
144 | ||
145 | ||
146 | Dimensions and size | |
147 | =================== | |
148 | ||
149 | Each pixel buffer must be accompanied by logical pixel dimensions. This refers | |
150 | to the number of unique samples which can be extracted from, or stored to, the | |
151 | underlying memory storage. For example, even though a 1920x1080 | |
152 | ``DRM_FORMAT_NV12`` buffer has a luma plane containing 1920x1080 samples for the Y | |
153 | component, and a chroma plane containing 960x540 samples for the U and V components, the overall buffer is | |
154 | still described as having dimensions of 1920x1080. | |
155 | ||
156 | The in-memory storage of a buffer is not guaranteed to begin immediately at the | |
157 | base address of the underlying memory, nor is it guaranteed that the memory | |
158 | storage is tightly clipped to either dimension. | |
159 | ||
160 | Each plane must therefore be described with an ``offset`` in bytes, which will be | |
161 | added to the base address of the memory storage before performing any per-pixel | |
162 | calculations. This may be used to combine multiple planes into a single memory | |
163 | buffer; for example, ``DRM_FORMAT_NV12`` may be stored in a single memory buffer | |
164 | where the luma plane's storage begins immediately at the start of the buffer | |
165 | with an offset of 0, and the chroma plane's storage follows within the same buffer | |
166 | beginning from the byte offset for that plane. | |
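
A minimal sketch of the single-buffer NV12 arrangement just described, assuming 8-bit samples, even dimensions, a linear layout, and no alignment padding (real allocators usually add some); ``nv12_single_buffer_layout`` is a hypothetical helper.

```python
def nv12_single_buffer_layout(width: int, height: int) -> dict:
    """Per-plane offset and stride for tightly packed linear NV12."""
    luma_stride = width      # one byte per Y sample
    chroma_stride = width    # width / 2 interleaved U/V pairs, two bytes each
    luma_offset = 0
    chroma_offset = luma_offset + luma_stride * height
    size = chroma_offset + chroma_stride * (height // 2)
    return {
        "planes": [
            {"offset": luma_offset, "stride": luma_stride},
            {"offset": chroma_offset, "stride": chroma_stride},
        ],
        "size": size,
    }

layout = nv12_single_buffer_layout(1920, 1080)
```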
167 | ||
168 | Each plane must also have a ``stride`` in bytes, expressing the offset in memory | |
169 | between two contiguous rows. For example, a ``DRM_FORMAT_MOD_LINEAR`` buffer | |
170 | with dimensions of 1000x1000 may have been allocated as if it were 1024x1000, in | |
171 | order to allow for aligned access patterns. In this case, the buffer will still | |
172 | be described with a width of 1000; however, the stride will be ``1024 * bpp`` | |
173 | (with ``bpp`` here meaning bytes per pixel), indicating that there are 24 pixels at | |
174 | the positive extreme of the x axis whose values are not significant. | |
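
The arithmetic of this example can be sketched as follows. The 64-pixel alignment and the four bytes per pixel are assumptions chosen so that the numbers match the text; the helper names are hypothetical.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

def padded_stride(width: int, bytes_per_pixel: int, align_pixels: int) -> int:
    """Stride in bytes once the row has been padded to align_pixels."""
    return align_up(width, align_pixels) * bytes_per_pixel

stride = padded_stride(1000, bytes_per_pixel=4, align_pixels=64)
padding_pixels = stride // 4 - 1000
```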
175 | ||
176 | Buffers may also be padded further in the y dimension, simply by allocating a | |
177 | larger area than would ordinarily be required. For example, many media decoders | |
178 | are not able to natively output buffers of height 1080, but instead require an | |
179 | effective height of 1088 pixels. In this case, the buffer continues to be | |
180 | described as having a height of 1080, with the memory allocation for each buffer | |
181 | being increased to account for the extra padding. | |
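
To illustrate the distinction between described and allocated height, the following sketch models a single linear plane. ``LinearPlane`` is a hypothetical type, not one defined by any of the APIs discussed here.

```python
from dataclasses import dataclass

@dataclass
class LinearPlane:
    stride: int            # bytes between vertically adjacent rows
    described_height: int  # rows that carry meaningful pixels
    allocated_height: int  # rows actually backed by memory

    def size(self) -> int:
        """Bytes needed to back the plane, including padding rows."""
        return self.stride * self.allocated_height

    def padding_bytes(self) -> int:
        """Bytes allocated beyond the described height."""
        return self.stride * (self.allocated_height - self.described_height)

luma = LinearPlane(stride=1920, described_height=1080, allocated_height=1088)
```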
182 | ||
183 | ||
184 | Enumeration | |
185 | =========== | |
186 | ||
187 | Every user of pixel buffers must be able to enumerate a set of supported formats | |
188 | and modifiers, described together. Within KMS, this is achieved with the | |
189 | ``IN_FORMATS`` property on each DRM plane, listing the supported DRM formats, and | |
190 | the modifiers supported for each format. In userspace, this is supported through | |
191 | the `EGL_EXT_image_dma_buf_import_modifiers`_ extension entrypoints for EGL, the | |
192 | `VK_EXT_image_drm_format_modifier`_ extension for Vulkan, and the | |
193 | `zwp_linux_dmabuf_v1`_ extension for Wayland. | |
194 | ||
195 | Each of these interfaces allows users to query a set of supported | |
196 | format+modifier combinations. | |
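
As a hedged sketch of how such an enumeration might be consumed, the following decodes a KMS ``IN_FORMATS`` blob into a format-to-modifiers mapping. It follows the layout of ``struct drm_format_modifier_blob`` and ``struct drm_format_modifier`` in the DRM uAPI headers as the author of this example understands them, and it assumes a little-endian host; ``parse_in_formats`` is a hypothetical name, and obtaining the blob from the kernel is not shown.

```python
import struct

# version, flags, count_formats, formats_offset, count_modifiers, modifiers_offset
HEADER = struct.Struct("<6I")
# formats bitmask, offset of first format covered by the mask, pad, modifier
MODIFIER = struct.Struct("<QIIQ")

def parse_in_formats(blob: bytes) -> dict:
    """Return {format: [modifiers]} decoded from an IN_FORMATS blob."""
    _version, _flags, n_fmt, fmt_off, n_mod, mod_off = HEADER.unpack_from(blob, 0)
    formats = struct.unpack_from(f"<{n_fmt}I", blob, fmt_off)
    supported = {fmt: [] for fmt in formats}
    for i in range(n_mod):
        mask, first, _pad, modifier = MODIFIER.unpack_from(
            blob, mod_off + i * MODIFIER.size)
        # Bit n of the mask refers to the format at index first + n.
        for bit in range(64):
            if mask & (1 << bit) and first + bit < n_fmt:
                supported[formats[first + bit]].append(modifier)
    return supported
```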
197 | ||
198 | ||
199 | Negotiation | |
200 | =========== | |
201 | ||
202 | It is the responsibility of userspace to negotiate an acceptable format+modifier | |
203 | combination for its usage. This is performed through a simple intersection of | |
204 | lists. For example, if a user wants to use Vulkan to render an image to be | |
205 | displayed on a KMS plane, it must: | |
206 | ||
207 | - query KMS for the ``IN_FORMATS`` property for the given plane | |
208 | - query Vulkan for the supported formats for its physical device, making sure | |
209 | to pass the ``VkImageUsageFlagBits`` and ``VkImageCreateFlagBits`` | |
210 | corresponding to the intended rendering use | |
211 | - intersect these formats to determine the most appropriate one | |
212 | - for this format, intersect the lists of supported modifiers for both KMS and | |
213 | Vulkan, to obtain a final list of acceptable modifiers for that format | |
214 | ||
215 | This intersection must be performed for all usages. For example, if the user | |
216 | also wishes to encode the image to a video stream, it must query the media API | |
217 | it intends to use for encoding for the set of modifiers it supports, and | |
218 | additionally intersect against this list. | |
219 | ||
220 | If the intersection of all lists is an empty list, it is not possible to share | |
221 | buffers in this way, and an alternate strategy must be considered (e.g. using | |
222 | CPU access routines to copy data between the different uses, with the | |
223 | corresponding performance cost). | |
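
The intersection described in this section can be sketched as below. Each user is represented by a mapping from format to the set of modifiers it supports; how those mappings are obtained from KMS, Vulkan, or a media API is outside the sketch, and ``negotiate`` is a hypothetical name.

```python
def negotiate(*users: dict) -> dict:
    """Return {format: modifiers} accepted by every user.

    Each user is a dict mapping a format token to a set of modifiers.
    An empty result means no buffer can be shared directly.
    """
    common_formats = set(users[0])
    for user in users[1:]:
        common_formats &= set(user)

    result = {}
    for fmt in common_formats:
        modifiers = set(users[0][fmt])
        for user in users[1:]:
            modifiers &= set(user[fmt])
        if modifiers:  # drop formats with no shared modifier
            result[fmt] = modifiers
    return result
```

Choosing the most appropriate format from the result remains a policy decision for the caller; the sets of modifiers carry no ordering.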
224 | ||
225 | The resulting modifier list is unsorted; the order is not significant. | |
226 | ||
227 | ||
228 | Allocation | |
229 | ========== | |
230 | ||
231 | Once userspace has determined an appropriate format, and corresponding list of | |
232 | acceptable modifiers, it must allocate the buffer. As there is no universal | |
233 | buffer-allocation interface available at either kernel or userspace level, the | |
234 | client makes an arbitrary choice of allocation interface such as Vulkan, GBM, or | |
235 | a media API. | |
236 | ||
237 | Each allocation request must take, at a minimum: the pixel format, a list of | |
238 | acceptable modifiers, and the buffer's width and height. Each API may extend | |
239 | this set of properties in different ways, such as allowing allocation in more | |
240 | than two dimensions, intended usage patterns, etc. | |
241 | ||
242 | The component which allocates the buffer will make an arbitrary choice of what | |
243 | it considers the 'best' modifier within the acceptable list for the requested | |
244 | allocation, any padding required, and further properties of the underlying | |
245 | memory buffers such as whether they are stored in system or device-specific | |
246 | memory, whether or not they are physically contiguous, and their cache mode. | |
247 | These properties of the memory buffer are not visible to userspace; however, the | |
248 | ``dma-heaps`` API is an effort to address this. | |
249 | ||
250 | After allocation, the client must query the allocator to determine the actual | |
251 | modifier selected for the buffer, as well as the per-plane offset and stride. | |
252 | Allocators are not permitted to vary the format in use, to select a modifier not | |
253 | provided within the acceptable list, nor to vary the pixel dimensions other than | |
254 | the padding expressed through offset, stride, and size. | |
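
A client may wish to verify these rules after querying the allocator. The following is a minimal sketch of such a check; ``allocation_is_valid`` and its parameters are hypothetical and do not correspond to any particular allocator API.

```python
def allocation_is_valid(requested: dict, acceptable_modifiers: list,
                        result: dict) -> bool:
    """Check an allocator's answer against the request.

    'requested' and 'result' are dicts with 'format', 'width' and 'height';
    'result' additionally carries the 'modifier' the allocator selected.
    """
    if result["format"] != requested["format"]:
        return False  # the format must not be changed
    if (result["width"], result["height"]) != (requested["width"], requested["height"]):
        return False  # padding is expressed via offset, stride and size only
    # The selected modifier must come from the acceptable list.
    return result["modifier"] in acceptable_modifiers
```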
255 | ||
256 | Communicating additional constraints, such as alignment of stride or offset, | |
257 | placement within a particular memory area, etc., is out of scope of dma-buf, | |
258 | and is not solved by format and modifier tokens. | |
259 | ||
260 | ||
261 | Import | |
262 | ====== | |
263 | ||
264 | To use a buffer within a different context, device, or subsystem, the user | |
265 | passes these parameters (format, modifier, width, height, and per-plane offset | |
266 | and stride) to an importing API. | |
267 | ||
268 | Each memory buffer is referred to by a buffer handle, which may be unique or | |
269 | duplicated within an image. For example, a ``DRM_FORMAT_NV12`` buffer may have | |
270 | the luma and chroma buffers combined into a single memory buffer by use of the | |
271 | per-plane offset parameters, or they may be completely separate allocations in | |
272 | memory. For this reason, each import and allocation API must provide a separate | |
273 | handle for each plane. | |
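
The parameters passed at import time can be pictured as a small data structure with one handle per plane. The types below are hypothetical and only illustrate how one memory buffer may back several planes.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Plane:
    handle: int  # buffer handle; the same handle may appear in several planes
    offset: int  # bytes from the start of the memory buffer
    stride: int  # bytes between rows

@dataclass
class ImageImport:
    fourcc: int
    modifier: int
    width: int
    height: int
    planes: List[Plane]

    def memory_buffers(self) -> int:
        """Count distinct handles (a simplification: duplicated handles
        that refer to the same memory would be counted separately)."""
        return len({plane.handle for plane in self.planes})

# NV12 with both planes in one buffer, then in two separate buffers.
combined = ImageImport(0x3231564E, 0, 1920, 1080,
                       [Plane(5, 0, 1920), Plane(5, 2073600, 1920)])
separate = ImageImport(0x3231564E, 0, 1920, 1080,
                       [Plane(5, 0, 1920), Plane(6, 0, 1920)])
```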
274 | ||
275 | Each kernel subsystem has its own types and interfaces for buffer management. | |
276 | DRM uses GEM buffer objects (BOs), V4L2 has its own references, etc. These types | |
277 | are not portable between contexts, processes, devices, or subsystems. | |
278 | ||
279 | To address this, ``dma-buf`` handles are used as the universal interchange for | |
280 | buffers. Subsystem-specific operations are used to export native buffer handles | |
281 | to a ``dma-buf`` file descriptor, and to import those file descriptors into a | |
282 | native buffer handle. dma-buf file descriptors can be transferred between | |
283 | contexts, processes, devices, and subsystems. | |
284 | ||
285 | For example, a Wayland media player may use V4L2 to decode a video frame into a | |
286 | ``DRM_FORMAT_NV12`` buffer. This will result in two memory planes (luma and | |
287 | chroma) being dequeued by the user from V4L2. These planes are then exported to | |
288 | one dma-buf file descriptor per plane; these descriptors are then sent along | |
289 | with the metadata (format, modifier, width, height, per-plane offset and stride) | |
290 | to the Wayland server. The Wayland server will then import these file | |
291 | descriptors as an EGLImage for use through EGL/OpenGL (ES), a VkImage for use | |
292 | through Vulkan, or a KMS framebuffer object; each of these import operations | |
293 | will take the same metadata and convert the dma-buf file descriptors into their | |
294 | native buffer handles. | |
295 | ||
296 | Having a non-empty intersection of supported modifiers does not guarantee that | |
297 | import will succeed into all consumers; they may have constraints beyond those | |
298 | implied by modifiers which must be satisfied. | |
299 | ||
300 | ||
301 | Implicit modifiers | |
302 | ================== | |
303 | ||
304 | The concept of modifiers post-dates all of the subsystems mentioned above. As | |
305 | such, it has been retrofitted into all of these APIs, and in order to ensure | |
306 | backwards compatibility, support is needed for drivers and userspace which do | |
307 | not (yet) support modifiers. | |
308 | ||
309 | As an example, GBM is used to allocate buffers to be shared between EGL for | |
310 | rendering and KMS for display. It has two entrypoints for allocating buffers: | |
311 | ``gbm_bo_create`` which only takes the format, width, height, and a usage token, | |
312 | and ``gbm_bo_create_with_modifiers`` which extends this with a list of modifiers. | |
313 | ||
314 | In the latter case, the allocation is as discussed above, being provided with a | |
315 | list of acceptable modifiers that the implementation can choose from (or fail if | |
316 | it is not possible to allocate within those constraints). In the former case | |
317 | where modifiers are not provided, the GBM implementation must make its own | |
318 | choice as to what is likely to be the 'best' layout. Such a choice is entirely | |
319 | implementation-specific: some will internally use tiled layouts which are not | |
320 | CPU-accessible if the implementation decides that is a good idea through | |
321 | whatever heuristic. It is the implementation's responsibility to ensure that | |
322 | this choice is appropriate. | |
323 | ||
324 | To support this case where the layout is not known because there is no awareness | |
325 | of modifiers, a special ``DRM_FORMAT_MOD_INVALID`` token has been defined. This | |
326 | pseudo-modifier declares that the layout is not known, and that the driver | |
327 | should use its own logic to determine what the underlying layout may be. | |
328 | ||
329 | .. note:: | |
330 | ||
331 | ``DRM_FORMAT_MOD_INVALID`` is a non-zero value. The modifier value zero is | |
332 | ``DRM_FORMAT_MOD_LINEAR``, which is an explicit guarantee that the image | |
333 | has the linear layout. Care and attention should be taken to ensure that | |
334 | zero as a default value is not mixed up with either no modifier or the linear | |
335 | modifier. Also note that in some APIs the invalid modifier value is specified | |
336 | with an out-of-band flag, like in ``DRM_IOCTL_MODE_ADDFB2``. | |
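
One way to avoid the zero-default pitfall described in the note is to represent "no modifier supplied" with a distinct value and never with zero. The sketch below uses the modifier values defined in ``drm_fourcc.h``; ``effective_modifier`` is a hypothetical helper.

```python
from typing import Optional

DRM_FORMAT_MOD_LINEAR = 0
DRM_FORMAT_MOD_INVALID = 0x00FFFFFFFFFFFFFF

def effective_modifier(modifier: Optional[int]) -> int:
    """Map 'no modifier supplied' (None) to INVALID, never to zero."""
    if modifier is None:
        return DRM_FORMAT_MOD_INVALID
    return modifier
```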
337 | ||
338 | There are four cases where this token may be used: | |
339 | - during enumeration, an interface may return ``DRM_FORMAT_MOD_INVALID``, either | |
340 | as the sole member of a modifier list to declare that explicit modifiers are | |
341 | not supported, or as part of a larger list to declare that implicit modifiers | |
342 | may be used | |
343 | - during allocation, a user may supply ``DRM_FORMAT_MOD_INVALID``, either as the | |
344 | sole member of a modifier list (equivalent to not supplying a modifier list | |
345 | at all) to declare that explicit modifiers are not supported and must not be | |
346 | used, or as part of a larger list to declare that an allocation using implicit | |
347 | modifiers is acceptable | |
348 | - in a post-allocation query, an implementation may return | |
349 | ``DRM_FORMAT_MOD_INVALID`` as the modifier of the allocated buffer to declare | |
350 | that the underlying layout is implementation-defined and that an explicit | |
351 | modifier description is not available; per the above rules, this may only be | |
352 | returned when the user has included ``DRM_FORMAT_MOD_INVALID`` as part of the | |
353 | list of acceptable modifiers, or not provided a list | |
354 | - when importing a buffer, the user may supply ``DRM_FORMAT_MOD_INVALID`` as the | |
355 | buffer modifier (or not supply a modifier) to indicate that the modifier is | |
356 | unknown for whatever reason; this is only acceptable when the buffer has | |
357 | not been allocated with an explicit modifier | |
358 | ||
359 | It follows from this that for any single buffer, the complete chain of operations | |
360 | formed by the producer and all the consumers must be either fully implicit or fully | |
361 | explicit. For example, if a user wishes to allocate a buffer for use between | |
362 | GPU, display, and media, but the media API does not support modifiers, then the | |
363 | user **must not** allocate the buffer with explicit modifiers and attempt to | |
364 | import the buffer into the media API with no modifier, but either perform the | |
365 | allocation using implicit modifiers, or allocate the buffer for media use | |
366 | separately and copy between the two buffers. | |
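
The rule above can be expressed as a simple consistency check. In this sketch, ``allocated_modifier`` stands for the modifier reported by the allocator and ``import_modifiers`` for the modifier passed to each consumer; both names, and the function itself, are hypothetical.

```python
DRM_FORMAT_MOD_INVALID = 0x00FFFFFFFFFFFFFF

def chain_is_consistent(allocated_modifier: int, import_modifiers: list) -> bool:
    """True if the producer and all consumers are all implicit or all explicit."""
    if allocated_modifier == DRM_FORMAT_MOD_INVALID:
        # Implicit allocation: every import must be implicit as well.
        return all(m == DRM_FORMAT_MOD_INVALID for m in import_modifiers)
    # Explicit allocation: every import must pass the same explicit modifier.
    return all(m == allocated_modifier for m in import_modifiers)
```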
367 | ||
368 | As one exception to the above, allocations may be 'upgraded' from implicit | |
369 | to explicit modifiers. For example, if the buffer is allocated with | |
370 | ``gbm_bo_create`` (taking no modifiers), the user may then query the modifier with | |
371 | ``gbm_bo_get_modifier`` and then use this modifier as an explicit modifier token | |
372 | if a valid modifier is returned. | |
373 | ||
374 | When allocating buffers for exchange between different users and modifiers are | |
375 | not available, implementations are strongly encouraged to use | |
376 | ``DRM_FORMAT_MOD_LINEAR`` for their allocation, as this is the universal baseline | |
377 | for exchange. However, it is not guaranteed that this will result in the correct | |
378 | interpretation of buffer content, as implicit modifier operation may still be | |
379 | subject to driver-specific heuristics. | |
380 | ||
381 | Any new users - userspace programs and protocols, kernel subsystems, etc - | |
382 | wishing to exchange buffers must offer interoperability through dma-buf file | |
383 | descriptors for memory planes, DRM format tokens to describe the format, DRM | |
384 | format modifiers to describe the layout in memory, at least width and height for | |
385 | dimensions, and at least offset and stride for each memory plane. | |
386 | ||
387 | .. _zwp_linux_dmabuf_v1: https://gitlab.freedesktop.org/wayland/wayland-protocols/-/blob/main/unstable/linux-dmabuf/linux-dmabuf-unstable-v1.xml | |
388 | .. _VK_EXT_image_drm_format_modifier: https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VK_EXT_image_drm_format_modifier.html | |
389 | .. _EGL_EXT_image_dma_buf_import_modifiers: https://registry.khronos.org/EGL/extensions/EXT/EGL_EXT_image_dma_buf_import_modifiers.txt |