Commit | Line | Data |
---|---|---|
dd0aa2cd AA |
1 | .. SPDX-License-Identifier: GPL-2.0 |
2 | ||
3 | ====== | |
4 | futex2 | |
5 | ====== | |
6 | ||
7 | :Author: André Almeida <andrealmeid@collabora.com> | |
8 | ||
9 | futex, or fast userspace mutex, is a set of syscalls that allows userspace to | |
10 | create performant synchronization mechanisms, such as mutexes, semaphores and | |
11 | condition variables. C standard libraries, like glibc, use it as a means to | |
12 | implement higher-level interfaces like pthreads. | |
13 | ||
14 | futex2 is a follow-up version of the initial futex syscall, designed to overcome | |
15 | limitations of the original interface. | |
16 | ||
17 | User API | |
18 | ======== | |
19 | ||
20 | ``futex_waitv()`` | |
21 | ----------------- | |
22 | ||
23 | Wait on an array of futexes, wake on any:: | |
24 | ||
25 | futex_waitv(struct futex_waitv *waiters, unsigned int nr_futexes, | |
26 | unsigned int flags, struct timespec *timeout, clockid_t clockid) | |
27 | ||
28 | struct futex_waitv { | |
29 | __u64 val; | |
30 | __u64 uaddr; | |
31 | __u32 flags; | |
32 | __u32 __reserved; | |
33 | }; | |
34 | ||
35 | Userspace sets up an array of struct futex_waitv (up to a max of 128 entries), | |
36 | using ``uaddr`` for the address to wait on, ``val`` for the expected value | |
37 | and ``flags`` to specify the type (e.g. private) and size of futex. | |
38 | ``__reserved`` needs to be 0, but it can be used for future extension. A | |
39 | pointer to the first item of the array is passed as ``waiters``. An invalid | |
40 | address for ``waiters`` or for any ``uaddr`` returns ``-EFAULT``. | |
41 | ||
42 | If userspace has 32-bit pointers, it should do an explicit cast to make sure | |
43 | the upper bits are zeroed. ``uintptr_t`` does the trick and it works for | |
44 | both 32-bit and 64-bit pointers. | |
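
The cast described above could look roughly like the following sketch. ``struct waitv_entry`` is a local mirror of ``struct futex_waitv`` as laid out in this document (an assumption so the snippet builds without recent kernel headers), and ``fill_waitv_entry()`` is a hypothetical helper name, not part of the kernel API.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Local mirror of struct futex_waitv as documented above. This is an
 * assumption made for illustration; real code would use the definition
 * from <linux/futex.h> when available.
 */
struct waitv_entry {
	uint64_t val;
	uint64_t uaddr;
	uint32_t flags;
	uint32_t reserved;	/* stands in for __reserved; must be 0 */
};

static_assert(sizeof(struct waitv_entry) == 24, "unexpected struct layout");

/* Hypothetical helper: fill one entry for a 32-bit futex word. */
static inline void fill_waitv_entry(struct waitv_entry *e, uint32_t *futex_word,
				    uint32_t expected, uint32_t flags)
{
	e->val = expected;
	/* Going through uintptr_t zero-extends a 32-bit pointer to 64 bits. */
	e->uaddr = (uint64_t)(uintptr_t)futex_word;
	e->flags = flags;
	e->reserved = 0;
}
```

Casting directly from a pointer to a 64-bit integer may trigger compiler warnings on 32-bit targets, which is one reason to go through ``uintptr_t`` first.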
45 | ||
46 | ``nr_futexes`` specifies the size of the array. Values outside the [1, 128] | |
47 | interval make the syscall return ``-EINVAL``. | |
48 | ||
49 | The ``flags`` argument of the syscall needs to be 0, but it can be used for | |
50 | future extension. | |
51 | ||
52 | For each entry in the ``waiters`` array, the current value at ``uaddr`` is | |
53 | compared to ``val``. If it's different, the syscall undoes all the work done so | |
54 | far and returns ``-EAGAIN``. If all tests and verifications succeed, the syscall | |
55 | waits until one of the following happens: | |
56 | ||
57 | - The timeout expires, returning ``-ETIMEDOUT``. | |
58 | - A signal is sent to the sleeping task, returning ``-ERESTARTSYS``. | |
59 | - A futex in the list is woken, returning the index of one of the woken futexes. | |
60 | ||
61 | An example of how to use the interface can be found at ``tools/testing/selftests/futex/functional/futex_waitv.c``. | |
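
As a complement to the selftest, the outcomes listed above can be sketched from the caller's side. This is a minimal sketch under several assumptions: ``waitv_sketch()`` and ``classify_waitv()`` are hypothetical names, ``struct waitv_entry`` is a local mirror of ``struct futex_waitv``, the fallback syscall number 449 is assumed for headers that do not define ``SYS_futex_waitv``, and ``struct timespec`` is assumed to be the 64-bit variant. Note that the C library ``syscall()`` wrapper returns -1 and sets ``errno`` instead of returning a negative error code, and that an interrupted wait is normally seen by userspace as ``EINTR`` (or is transparently restarted) rather than as ``ERESTARTSYS``.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

#ifndef SYS_futex_waitv
#define SYS_futex_waitv 449	/* assumed number when the headers lack it */
#endif

/* Local mirror of struct futex_waitv (assumption, see lead-in). */
struct waitv_entry {
	uint64_t val;
	uint64_t uaddr;
	uint32_t flags;
	uint32_t reserved;
};

static_assert(sizeof(struct waitv_entry) == 24, "unexpected struct layout");

enum waitv_outcome {
	WAITV_WOKEN,		/* return value is the index of a woken futex */
	WAITV_VALUE_CHANGED,	/* some *uaddr != val: re-check and retry */
	WAITV_TIMED_OUT,
	WAITV_INTERRUPTED,
	WAITV_ERROR,
};

/* Hypothetical thin wrapper; the syscall flags argument must be 0. */
static inline long waitv_sketch(struct waitv_entry *waiters, unsigned int nr,
				struct timespec *timeout, clockid_t clockid)
{
	return syscall(SYS_futex_waitv, waiters, nr, 0, timeout, clockid);
}

/* Map the wrapper's return value and errno to one of the outcomes. */
static inline enum waitv_outcome classify_waitv(long ret, int err)
{
	if (ret >= 0)
		return WAITV_WOKEN;

	switch (err) {
	case EAGAIN:
		return WAITV_VALUE_CHANGED;
	case ETIMEDOUT:
		return WAITV_TIMED_OUT;
	case EINTR:
		return WAITV_INTERRUPTED;
	default:
		return WAITV_ERROR;	/* e.g. EFAULT, EINVAL, ENOSYS */
	}
}
```

A caller would typically invoke ``waitv_sketch()`` in a loop, pass ``errno`` to ``classify_waitv()`` right after the call, and re-read the futex words before retrying on ``WAITV_VALUE_CHANGED`` or ``WAITV_INTERRUPTED``.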
62 | ||
63 | Timeout | |
64 | ------- | |
65 | ||
66 | The ``struct timespec *timeout`` argument is optional and points to an | |
67 | absolute timeout. The type of clock being used must be specified in the | |
68 | ``clockid`` argument. ``CLOCK_MONOTONIC`` and ``CLOCK_REALTIME`` are supported. | |
69 | This syscall accepts only 64-bit timespec structs. | |
70 | ||
71 | Types of futex | |
72 | -------------- | |
73 | ||
74 | A futex can be either private or shared. Private futexes are used by processes | |
75 | that share the same memory space, so the virtual address of the futex is the | |
76 | same for all of them. This allows for optimizations in the kernel. To use | |
77 | private futexes, it's necessary to specify ``FUTEX_PRIVATE_FLAG`` in the futex | |
78 | flag. Processes that don't share the same memory space, and can therefore | |
79 | have different virtual addresses for the same futex (using, for instance, | |
80 | file-backed shared memory), require different internal mechanisms to be | |
81 | properly enqueued. This is the default behavior, and it works with both private | |
82 | and shared futexes. | |
83 | ||
84 | Futexes can be of different sizes: 8, 16, 32 or 64 bits. Currently, the only | |
85 | supported size is 32 bits, and it needs to be specified using the | |
86 | ``FUTEX_32`` flag. |
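
Putting the type and size together, the per-entry ``flags`` field could be built as in the sketch below. ``futex_entry_flags()`` is a hypothetical helper name, and the fallback value for ``FUTEX_32`` is an assumption for kernel headers that predate ``futex_waitv()``.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <linux/futex.h>

#ifndef FUTEX_32
#define FUTEX_32 2	/* assumed value for headers that lack the definition */
#endif

/* The size and type bits are expected to be independent of each other. */
static_assert((FUTEX_32 & FUTEX_PRIVATE_FLAG) == 0, "flag bits overlap");

/* Hypothetical helper: flags for one 32-bit futex entry. */
static inline uint32_t futex_entry_flags(bool is_private)
{
	uint32_t flags = FUTEX_32;	/* size: the only one supported today */

	if (is_private)
		flags |= FUTEX_PRIVATE_FLAG;	/* same address space only */

	return flags;
}
```

Leaving out ``FUTEX_PRIVATE_FLAG`` selects the default shared behavior, which also works for futexes that are only used within a single address space, at the cost of the optimizations mentioned above.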