=====================
DRM Memory Management
=====================

Modern Linux systems require large amounts of graphics memory to store
frame buffers, textures, vertices and other graphics-related data. Given
the very dynamic nature of much of that data, managing graphics memory
efficiently is crucial for the graphics stack and plays a central role
in the DRM infrastructure.

The DRM core includes two memory managers, namely Translation Table Maps
(TTM) and Graphics Execution Manager (GEM). TTM was the first DRM memory
manager to be developed and tried to be a one-size-fits-all solution. It
provides a single userspace API to accommodate the needs of all
hardware, supporting both Unified Memory Architecture (UMA) devices and
devices with dedicated video RAM (i.e. most discrete video cards). This
resulted in a large, complex piece of code that turned out to be hard to
use for driver development.

GEM started as an Intel-sponsored project in reaction to TTM's
complexity. Its design philosophy is completely different: instead of
providing a solution to every graphics memory-related problem, GEM
identified common code between drivers and created a support library to
share it. GEM has simpler initialization and execution requirements than
TTM, but has no video RAM management capabilities and is thus limited to
UMA devices.

The Translation Table Manager (TTM)
===================================

TTM design background and information belongs here.

TTM initialization
------------------

**Warning**

This section is outdated.

Drivers wishing to support TTM must fill out a drm_bo_driver
structure. The structure contains several fields with function pointers
for initializing the TTM, allocating and freeing memory, waiting for
command completion and fence synchronization, and memory migration. See
the radeon_ttm.c file for an example of usage.

The ttm_global_reference structure is made up of several fields:

::

    struct ttm_global_reference {
            enum ttm_global_types global_type;
            size_t size;
            void *object;
            int (*init) (struct ttm_global_reference *);
            void (*release) (struct ttm_global_reference *);
    };

There should be one global reference structure for your memory manager
as a whole, and there will be others for each object created by the
memory manager at runtime. Your global TTM should have a type of
TTM_GLOBAL_TTM_MEM. The size field for the global object should be
sizeof(struct ttm_mem_global), and the init and release hooks should
point at your driver-specific init and release routines, which probably
eventually call ttm_mem_global_init and ttm_mem_global_release,
respectively.

Once your global TTM accounting structure is set up and initialized by
calling ttm_global_item_ref() on it, you need to create a buffer
object TTM to provide a pool for buffer object allocation by clients and
the kernel itself. The type of this object should be
TTM_GLOBAL_TTM_BO, and its size should be sizeof(struct
ttm_bo_global). Again, driver-specific init and release functions may
be provided, likely eventually calling ttm_bo_global_init() and
ttm_bo_global_release(), respectively. Also, like the previous
object, ttm_global_item_ref() is used to create an initial reference
count for the TTM, which will call your initialization function.

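As an illustration, here is a minimal sketch of the memory-accounting
half of that setup, modelled on the pattern used in radeon_ttm.c; the
foo_* names and the foo_device structure are hypothetical:

::

    static int foo_ttm_mem_global_init(struct ttm_global_reference *ref)
    {
            return ttm_mem_global_init(ref->object);
    }

    static void foo_ttm_mem_global_release(struct ttm_global_reference *ref)
    {
            ttm_mem_global_release(ref->object);
    }

    static int foo_ttm_global_init(struct foo_device *fdev)
    {
            struct ttm_global_reference *global_ref;

            /* Global memory accounting object, one per driver instance. */
            global_ref = &fdev->mem_global_ref;
            global_ref->global_type = TTM_GLOBAL_TTM_MEM;
            global_ref->size = sizeof(struct ttm_mem_global);
            global_ref->init = &foo_ttm_mem_global_init;
            global_ref->release = &foo_ttm_mem_global_release;

            /* Takes the initial reference and calls the init hook. */
            return ttm_global_item_ref(global_ref);
    }

The buffer object TTM (TTM_GLOBAL_TTM_BO) is set up the same way, with
hooks that eventually call ttm_bo_global_init() and
ttm_bo_global_release().
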
The Graphics Execution Manager (GEM)
====================================

The GEM design approach has resulted in a memory manager that doesn't
provide full coverage of all (or even all common) use cases in its
userspace or kernel API. GEM exposes a set of standard memory-related
operations to userspace and a set of helper functions to drivers, and
lets drivers implement hardware-specific operations with their own
private API.

The GEM userspace API is described in the `GEM - the Graphics Execution
Manager <http://lwn.net/Articles/283798/>`__ article on LWN. While
slightly outdated, the document provides a good overview of the GEM API
principles. Buffer allocation and read and write operations, described
as part of the common GEM API, are currently implemented using
driver-specific ioctls.

GEM is data-agnostic. It manages abstract buffer objects without knowing
what individual buffers contain. APIs that require knowledge of buffer
contents or purpose, such as buffer allocation or synchronization
primitives, are thus outside the scope of GEM and must be implemented
using driver-specific ioctls.

On a fundamental level, GEM involves several operations:

- Memory allocation and freeing
- Command execution
- Aperture management at command execution time

Buffer object allocation is relatively straightforward and largely
provided by Linux's shmem layer, which provides memory to back each
object.

Device-specific operations, such as command execution, pinning, buffer
read & write, mapping, and domain ownership transfers are left to
driver-specific ioctls.

GEM Initialization
------------------

Drivers that use GEM must set the DRIVER_GEM bit in the
:c:type:`struct drm_driver <drm_driver>` driver_features
field. The DRM core will then automatically initialize the GEM core
before calling the load operation. Behind the scenes, this will create a
DRM Memory Manager object which provides an address space pool for
object allocation.

In a KMS configuration, drivers need to allocate and initialize a
command ring buffer following core GEM initialization if required by the
hardware. UMA devices usually have what is called a "stolen" memory
region, which provides space for the initial framebuffer and large,
contiguous memory regions required by the device. This space is
typically not managed by GEM, and must be initialized separately into
its own DRM MM object.

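As an illustration, a driver could manage such a stolen region with its
own drm_mm allocator; this is a minimal sketch, with stolen_base and
stolen_size assumed to come from device-specific probing:

::

    struct drm_mm stolen_mm;

    /* Give the stolen region its own address space manager, separate
     * from the GEM-managed pool. */
    drm_mm_init(&stolen_mm, stolen_base, stolen_size);

    /* ... carve out the initial framebuffer and other contiguous
     * buffers with the drm_mm node insertion helpers ... */

    /* Once every node has been removed, tear the manager down. */
    drm_mm_takedown(&stolen_mm);
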
GEM Objects Creation
--------------------

GEM splits creation of GEM objects and allocation of the memory that
backs them into two distinct operations.

GEM objects are represented by an instance of
:c:type:`struct drm_gem_object <drm_gem_object>`. Drivers usually need to
extend GEM objects with private information and thus create a
driver-specific GEM object structure type that embeds an instance of
:c:type:`struct drm_gem_object <drm_gem_object>`.

To create a GEM object, a driver allocates memory for an instance of its
specific GEM object type and initializes the embedded
:c:type:`struct drm_gem_object <drm_gem_object>` with a call
to :c:func:`drm_gem_object_init()`. The function takes a pointer
to the DRM device, a pointer to the GEM object and the buffer object
size in bytes.

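Putting that together, here is a minimal sketch of a driver-specific
GEM object type and its constructor; the foo_* names are hypothetical
and the size is assumed to be page-aligned by the caller:

::

    struct foo_gem_object {
            struct drm_gem_object base;
            /* driver-private data, e.g. a page array or a GPU address */
    };

    static struct foo_gem_object *foo_gem_create(struct drm_device *dev,
                                                 size_t size)
    {
            struct foo_gem_object *obj;
            int ret;

            obj = kzalloc(sizeof(*obj), GFP_KERNEL);
            if (!obj)
                    return ERR_PTR(-ENOMEM);

            /* Creates the shmfs backing file and initializes the
             * embedded core object. */
            ret = drm_gem_object_init(dev, &obj->base, size);
            if (ret) {
                    kfree(obj);
                    return ERR_PTR(ret);
            }

            return obj;
    }
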
GEM uses shmem to allocate anonymous pageable memory.
:c:func:`drm_gem_object_init()` will create a shmfs file of the
requested size and store it into the
:c:type:`struct drm_gem_object <drm_gem_object>` filp field. The memory
is used as either main storage for the object when the graphics hardware
uses system memory directly or as a backing store otherwise.

Drivers are responsible for the actual physical page allocation, by
calling :c:func:`shmem_read_mapping_page_gfp()` for each page.
Note that they can decide to allocate pages when initializing the GEM
object, or to delay allocation until the memory is needed (for instance
when a page fault occurs as a result of a userspace memory access or
when the driver needs to start a DMA transfer involving the memory).

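A minimal sketch of the eager variant, assuming a driver-managed pages
array with one slot per page (error unwinding omitted for brevity):

::

    struct address_space *mapping = file_inode(obj->base.filp)->i_mapping;
    int i, npages = obj->base.size >> PAGE_SHIFT;

    for (i = 0; i < npages; i++) {
            struct page *page;

            /* Faults the shmem page in (allocating it if necessary)
             * and returns it with a reference held. */
            page = shmem_read_mapping_page_gfp(mapping, i, GFP_KERNEL);
            if (IS_ERR(page))
                    return PTR_ERR(page);

            obj->pages[i] = page;
    }

The drm_gem_get_pages() helper in the GEM core implements essentially
this loop.
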
Anonymous pageable memory allocation is not always desired, for instance
when the hardware requires physically contiguous system memory as is
often the case in embedded devices. Drivers can create GEM objects with
no shmfs backing (called private GEM objects) by initializing them with
a call to :c:func:`drm_gem_private_object_init()` instead of
:c:func:`drm_gem_object_init()`. Storage for private GEM objects
must be managed by drivers.

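For example, a driver could back a private GEM object with a contiguous
DMA buffer; this is a sketch only, and the dma_alloc_coherent() backing
and the vaddr/dma_addr fields are assumptions made for illustration:

::

    obj = kzalloc(sizeof(*obj), GFP_KERNEL);
    if (!obj)
            return ERR_PTR(-ENOMEM);

    /* No shmfs backing: the driver provides the storage itself. */
    drm_gem_private_object_init(dev, &obj->base, size);

    obj->vaddr = dma_alloc_coherent(dev->dev, size, &obj->dma_addr,
                                    GFP_KERNEL);
    if (!obj->vaddr) {
            drm_gem_object_release(&obj->base);
            kfree(obj);
            return ERR_PTR(-ENOMEM);
    }
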
GEM Objects Lifetime
--------------------

All GEM objects are reference-counted by the GEM core. References can be
acquired and released by calling
:c:func:`drm_gem_object_reference()` and
:c:func:`drm_gem_object_unreference()` respectively. The caller
must hold the :c:type:`struct drm_device <drm_device>`
struct_mutex lock when calling
:c:func:`drm_gem_object_unreference()`. As a convenience, GEM
provides a :c:func:`drm_gem_object_unreference_unlocked()`
function that can be called without holding the lock.

When the last reference to a GEM object is released the GEM core calls
the :c:type:`struct drm_driver <drm_driver>` gem_free_object
operation. That operation is mandatory for GEM-enabled drivers and must
free the GEM object and all associated resources.

::

    void (*gem_free_object) (struct drm_gem_object *obj);

Drivers are responsible for freeing all GEM object resources. This
includes the resources created by the GEM core, which need to be
released with :c:func:`drm_gem_object_release()`.

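A minimal sketch of such a handler for the hypothetical foo_gem_object
shown earlier:

::

    static void foo_gem_free_object(struct drm_gem_object *gem_obj)
    {
            struct foo_gem_object *obj =
                    container_of(gem_obj, struct foo_gem_object, base);

            /* Release driver-private resources (pages, mappings, ...)
             * before tearing down the core object. */

            drm_gem_object_release(gem_obj);
            kfree(obj);
    }
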
GEM Objects Naming
------------------

Communication between userspace and the kernel refers to GEM objects
using local handles, global names or, more recently, file descriptors.
All of those are 32-bit integer values; the usual Linux kernel limits
apply to the file descriptors.

GEM handles are local to a DRM file. Applications get a handle to a GEM
object through a driver-specific ioctl, and can use that handle to refer
to the GEM object in other standard or driver-specific ioctls. Closing a
DRM file handle frees all its GEM handles and dereferences the
associated GEM objects.

To create a handle for a GEM object, drivers call
:c:func:`drm_gem_handle_create()`. The function takes a pointer
to the DRM file and the GEM object and returns a locally unique handle.
When the handle is no longer needed drivers delete it with a call to
:c:func:`drm_gem_handle_delete()`. Finally the GEM object
associated with a handle can be retrieved by a call to
:c:func:`drm_gem_object_lookup()`.

Handles don't take ownership of GEM objects; they only take a reference
to the object that will be dropped when the handle is destroyed. To
avoid leaking GEM objects, drivers must make sure they drop the
reference(s) they own (such as the initial reference taken at object
creation time) as appropriate, without any special consideration for the
handle. For example, in the particular case of combined GEM object and
handle creation in the implementation of the dumb_create operation,
drivers must drop the initial reference to the GEM object before
returning the handle.

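A minimal sketch of such a combined creation path, reusing the
hypothetical foo_gem_create() constructor from above:

::

    static int foo_gem_create_with_handle(struct drm_file *file_priv,
                                          struct drm_device *dev,
                                          size_t size, u32 *handle)
    {
            struct foo_gem_object *obj;
            int ret;

            obj = foo_gem_create(dev, size);
            if (IS_ERR(obj))
                    return PTR_ERR(obj);

            ret = drm_gem_handle_create(file_priv, &obj->base, handle);

            /* Drop the initial reference; on success the handle now
             * holds the only reference to the object. */
            drm_gem_object_unreference_unlocked(&obj->base);

            return ret;
    }
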
GEM names are similar in purpose to handles but are not local to DRM
files. They can be passed between processes to reference a GEM object
globally. Names can't be used directly to refer to objects in the DRM
API; applications must convert handles to names and names to handles
using the DRM_IOCTL_GEM_FLINK and DRM_IOCTL_GEM_OPEN ioctls
respectively. The conversion is handled by the DRM core without any
driver-specific support.

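From userspace, the conversion looks like the following sketch (error
handling omitted); drm_gem_flink and drm_gem_open are the uapi
structures behind the two ioctls:

::

    /* Exporting process: turn a local handle into a global name. */
    struct drm_gem_flink flink = { .handle = handle };
    ioctl(fd, DRM_IOCTL_GEM_FLINK, &flink);
    /* flink.name can now be communicated to another process ... */

    /* Importing process: turn the global name into a local handle. */
    struct drm_gem_open open_arg = { .name = name };
    ioctl(fd, DRM_IOCTL_GEM_OPEN, &open_arg);
    /* open_arg.handle now refers to the same object, and
     * open_arg.size holds the object size. */
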
GEM also supports buffer sharing with dma-buf file descriptors through
PRIME. GEM-based drivers must use the provided helper functions to
implement the exporting and importing correctly; see the PRIME Buffer
Sharing section below. Since sharing file descriptors is inherently more
secure than the easily guessable and global GEM names it is the
preferred buffer sharing mechanism. Sharing buffers through GEM names is
only supported for legacy userspace. Furthermore, PRIME also allows
cross-device buffer sharing since it is based on dma-bufs.

GEM Objects Mapping
-------------------

Because mapping operations are fairly heavyweight, GEM favours
read/write-like access to buffers, implemented through driver-specific
ioctls, over mapping buffers to userspace. However, when random access
to the buffer is needed (to perform software rendering for instance),
direct access to the object can be more efficient.

The mmap system call can't be used directly to map GEM objects, as they
don't have their own file handle. Two alternative methods currently
co-exist to map GEM objects to userspace. The first method uses a
driver-specific ioctl to perform the mapping operation, calling
:c:func:`do_mmap()` under the hood. This approach is considered
dubious, is discouraged for new GEM-enabled drivers, and will thus not
be described here.

The second method uses the mmap system call on the DRM file handle. ::

    void *mmap(void *addr, size_t length, int prot, int flags, int fd,
               off_t offset);

DRM identifies the GEM object to be mapped by a fake offset passed
through the mmap offset argument. Prior to being mapped, a GEM object
must thus be associated with a fake offset. To do so, drivers must call
:c:func:`drm_gem_create_mmap_offset()` on the object.

Once allocated, the fake offset value must be passed to the application
in a driver-specific way and can then be used as the mmap offset
argument.

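In userspace this looks like the following sketch, where fake_offset is
the value previously returned to the application by the driver:

::

    void *map = mmap(NULL, object_size, PROT_READ | PROT_WRITE,
                     MAP_SHARED, drm_fd, fake_offset);
    if (map == MAP_FAILED)
            /* handle the error */;
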
The GEM core provides a helper method :c:func:`drm_gem_mmap()` to
handle object mapping. The method can be set directly as the mmap file
operation handler. It will look up the GEM object based on the offset
value and set the VMA operations to the :c:type:`struct drm_driver
<drm_driver>` gem_vm_ops field. Note that
:c:func:`drm_gem_mmap()` doesn't map memory to userspace, but
relies on the driver-provided fault handler to map pages individually.

To use :c:func:`drm_gem_mmap()`, drivers must fill the
:c:type:`struct drm_driver <drm_driver>` gem_vm_ops field
with a pointer to VM operations.

::

    struct vm_operations_struct *gem_vm_ops;

    struct vm_operations_struct {
            void (*open)(struct vm_area_struct *area);
            void (*close)(struct vm_area_struct *area);
            int (*fault)(struct vm_area_struct *vma, struct vm_fault *vmf);
    };

The open and close operations must update the GEM object reference
count. Drivers can use the :c:func:`drm_gem_vm_open()` and
:c:func:`drm_gem_vm_close()` helper functions directly as open
and close handlers.

The fault operation handler is responsible for mapping individual pages
to userspace when a page fault occurs. Depending on the memory
allocation scheme, drivers can allocate pages at fault time, or can
decide to allocate memory for the GEM object at the time the object is
created.

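A minimal sketch of such a set of VM operations, assuming the struct
vm_fault layout matching the prototype above and pages allocated at
object creation time into the hypothetical pages array used earlier:

::

    static int foo_gem_fault(struct vm_area_struct *vma,
                             struct vm_fault *vmf)
    {
            /* drm_gem_mmap() stored the GEM object here. */
            struct drm_gem_object *gem_obj = vma->vm_private_data;
            struct foo_gem_object *obj =
                    container_of(gem_obj, struct foo_gem_object, base);
            unsigned long address = (unsigned long)vmf->virtual_address;
            pgoff_t page_offset = (address - vma->vm_start) >> PAGE_SHIFT;
            int ret;

            /* Map the single faulting page into the userspace VMA. */
            ret = vm_insert_page(vma, address, obj->pages[page_offset]);
            switch (ret) {
            case 0:
            case -EBUSY: /* page already inserted by a concurrent fault */
                    return VM_FAULT_NOPAGE;
            case -ENOMEM:
                    return VM_FAULT_OOM;
            default:
                    return VM_FAULT_SIGBUS;
            }
    }

    static const struct vm_operations_struct foo_gem_vm_ops = {
            .open = drm_gem_vm_open,
            .close = drm_gem_vm_close,
            .fault = foo_gem_fault,
    };
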
Drivers that want to map the GEM object upfront instead of handling page
faults can implement their own mmap file operation handler.

Memory Coherency
----------------

When mapped to the device or used in a command buffer, backing pages for
an object are flushed to memory and marked write combined so as to be
coherent with the GPU. Likewise, if the CPU accesses an object after the
GPU has finished rendering to the object, then the object must be made
coherent with the CPU's view of memory, usually involving GPU cache
flushing of various kinds. This core CPU<->GPU coherency management is
provided by a device-specific ioctl, which evaluates an object's current
domain and performs any necessary flushing or synchronization to put the
object into the desired coherency domain (note that the object may be
busy, i.e. an active render target; in that case, setting the domain
blocks the client and waits for rendering to complete before performing
any necessary flushing operations).

Command Execution
-----------------

Perhaps the most important GEM function for GPU devices is providing a
command execution interface to clients. Client programs construct
command buffers containing references to previously allocated memory
objects, and then submit them to GEM. At that point, GEM takes care to
bind all the objects into the GTT, execute the buffer, and provide
necessary synchronization between clients accessing the same buffers.
This often involves evicting some objects from the GTT and re-binding
others (a fairly expensive operation), and providing relocation support
which hides fixed GTT offsets from clients. Clients must take care not
to submit command buffers that reference more objects than can fit in
the GTT; otherwise, GEM will reject them and no rendering will occur.
Similarly, if several objects in the buffer require fence registers to
be allocated for correct rendering (e.g. 2D blits on pre-965 chips),
care must be taken not to require more fence registers than are
available to the client. Such resource management should be abstracted
from the client in libdrm.

GEM Function Reference
----------------------

.. kernel-doc:: drivers/gpu/drm/drm_gem.c
   :export:

.. kernel-doc:: include/drm/drm_gem.h
   :internal:

GEM CMA Helper Functions Reference
----------------------------------

.. kernel-doc:: drivers/gpu/drm/drm_gem_cma_helper.c
   :doc: cma helpers

.. kernel-doc:: drivers/gpu/drm/drm_gem_cma_helper.c
   :export:

.. kernel-doc:: include/drm/drm_gem_cma_helper.h
   :internal:

VMA Offset Manager
==================

.. kernel-doc:: drivers/gpu/drm/drm_vma_manager.c
   :doc: vma offset manager

.. kernel-doc:: drivers/gpu/drm/drm_vma_manager.c
   :export:

.. kernel-doc:: include/drm/drm_vma_manager.h
   :internal:

PRIME Buffer Sharing
====================

PRIME is the cross-device buffer sharing framework in DRM, originally
created for the OPTIMUS range of multi-GPU platforms. To userspace,
PRIME buffers are dma-buf based file descriptors.

Overview and Driver Interface
-----------------------------

Similar to GEM global names, PRIME file descriptors are also used to
share buffer objects across processes. They offer additional security:
as file descriptors must be explicitly sent over UNIX domain sockets to
be shared between applications, they can't be guessed like the globally
unique GEM names.

Drivers that support the PRIME API must set the DRIVER_PRIME bit in the
:c:type:`struct drm_driver <drm_driver>`
driver_features field, and implement the prime_handle_to_fd and
prime_fd_to_handle operations.

::

    int (*prime_handle_to_fd)(struct drm_device *dev,
                              struct drm_file *file_priv, uint32_t handle,
                              uint32_t flags, int *prime_fd);
    int (*prime_fd_to_handle)(struct drm_device *dev,
                              struct drm_file *file_priv, int prime_fd,
                              uint32_t *handle);

Those two operations convert a handle to a PRIME file descriptor and
vice versa. Drivers must use the kernel dma-buf buffer sharing framework
to manage the PRIME file descriptors. Similar to the mode setting API,
PRIME is agnostic to the underlying buffer object manager, as long as
handles are 32-bit unsigned integers.

While non-GEM drivers must implement the operations themselves, GEM
drivers must use the :c:func:`drm_gem_prime_handle_to_fd()` and
:c:func:`drm_gem_prime_fd_to_handle()` helper functions. Those
helpers rely on the driver gem_prime_export and gem_prime_import
operations to create a dma-buf instance from a GEM object (dma-buf
exporter role) and to create a GEM object from a dma-buf instance
(dma-buf importer role).

::

    struct dma_buf *(*gem_prime_export)(struct drm_device *dev,
                                        struct drm_gem_object *obj,
                                        int flags);
    struct drm_gem_object *(*gem_prime_import)(struct drm_device *dev,
                                               struct dma_buf *dma_buf);

These two operations are mandatory for GEM drivers that support PRIME.

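For GEM drivers with no special export or import requirements, wiring
this up can be as simple as the following sketch, which uses the default
drm_gem_prime_export() and drm_gem_prime_import() implementations from
the DRM core; the foo_driver name is hypothetical:

::

    static struct drm_driver foo_driver = {
            .driver_features     = DRIVER_GEM | DRIVER_PRIME,
            /* handle <-> fd conversion through the GEM PRIME helpers */
            .prime_handle_to_fd  = drm_gem_prime_handle_to_fd,
            .prime_fd_to_handle  = drm_gem_prime_fd_to_handle,
            /* default exporter/importer implementations */
            .gem_prime_export    = drm_gem_prime_export,
            .gem_prime_import    = drm_gem_prime_import,
            /* ... other operations ... */
    };
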
PRIME Helper Functions
----------------------

.. kernel-doc:: drivers/gpu/drm/drm_prime.c
   :doc: PRIME Helpers

PRIME Function References
-------------------------

.. kernel-doc:: drivers/gpu/drm/drm_prime.c
   :export:

DRM MM Range Allocator
======================

Overview
--------

.. kernel-doc:: drivers/gpu/drm/drm_mm.c
   :doc: Overview

LRU Scan/Eviction Support
-------------------------

.. kernel-doc:: drivers/gpu/drm/drm_mm.c
   :doc: lru scan roaster

DRM MM Range Allocator Function References
------------------------------------------

.. kernel-doc:: drivers/gpu/drm/drm_mm.c
   :export:

.. kernel-doc:: include/drm/drm_mm.h
   :internal: