- Add support for POSIX_FADV_NORMAL to the posix_fadvise() shim by accepting
it and doing nothing.
- Add support for POSIX_FADV_SEQUENTIAL and POSIX_FADV_RANDOM by mapping
them to enabling and disabling readahead, respectively, via
fcntl(..., F_RDAHEAD, ...). Because macOS only allows readahead to be
controlled per descriptor, the offset and len arguments are ignored and
no per-range control is applied.
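
The mapping described above can be sketched as a standalone function. This is only an illustration, not the shim itself: sketch_advice, sketch_set_readahead() and sketch_fadvise() are hypothetical names, and the non-macOS branch (which only validates the descriptor) is an assumption added so the sketch also builds where F_RDAHEAD does not exist.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical advice values used only by this sketch; real code would use
 * the POSIX_FADV_* constants. */
enum sketch_advice { SKETCH_NORMAL, SKETCH_RANDOM, SKETCH_SEQUENTIAL };

/* Toggle readahead for the whole descriptor; returns 0 or an errno value. */
static int sketch_set_readahead(int fd, bool enabled)
{
#ifdef F_RDAHEAD
	/* macOS: readahead is a per-descriptor switch */
	if (fcntl(fd, F_RDAHEAD, enabled ? 1 : 0) == -1)
		return errno;
	return 0;
#else
	/* Assumption for other platforms: only check that fd is valid */
	(void)enabled;
	if (fcntl(fd, F_GETFD) == -1)
		return errno;
	return 0;
#endif
}

/* offset and len are accepted for API compatibility but ignored, because
 * the control is per descriptor rather than per range. */
static int sketch_fadvise(int fd, off_t offset, off_t len,
			  enum sketch_advice advice)
{
	(void)offset;
	(void)len;

	switch (advice) {
	case SKETCH_NORMAL:
		return 0;	/* nothing to do */
	case SKETCH_RANDOM:
		return sketch_set_readahead(fd, false);
	case SKETCH_SEQUENTIAL:
		return sketch_set_readahead(fd, true);
	}
	return EINVAL;
}
```

Like posix_fadvise(), the sketch returns an error number directly instead of returning -1 and setting errno, so a caller checks the return value against 0.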
The impact of tuning readahead is shown by the bandwidth achieved by the
following jobs, run against an SSD in an otherwise idle Intel Mac laptop
with 16 GB of RAM:
./fio --stonewall --size=128M --filename=fio.tmp --bs=4k --rw=read \
--name=sequential-readahead --fadvise=sequential \
--name=sequential-no-readahead --fadvise=random
[...]
sequential-readahead: (groupid=0, jobs=1): err= 0: pid=6250: Tue Sep 2 22:10:45 2025
read: IOPS=331k, BW=1293MiB/s (1356MB/s)(128MiB/99msec)
[...]
sequential-no-readahead: (groupid=1, jobs=1): err= 0: pid=6251: Tue Sep 2 22:10:45 2025
read: IOPS=25.9k, BW=101MiB/s (106MB/s)(128MiB/1263msec)
rm -f fio-huge.tmp
truncate -s 1T fio-huge.tmp
./fio --stonewall --filename=fio-huge.tmp --bs=32k --runtime=10s --rw=randread:3 \
--name=partial-random-no-readahead --fadvise=random \
--name=absorb-cache-invalidation --number_ios=1 --bs=4k \
--name=partial-random-readahead --fadvise=sequential
[...]
partial-random-no-readahead: (groupid=0, jobs=1): err= 0: pid=6259: Tue Sep 2 22:12:35 2025
read: IOPS=92.4k, BW=2888MiB/s (3029MB/s)(28.2GiB/10001msec)
[...]
partial-random-readahead: (groupid=2, jobs=1): err= 0: pid=6261: Tue Sep 2 22:12:35 2025
read: IOPS=61.8k, BW=1931MiB/s (2024MB/s)(18.9GiB/10001msec)
Signed-off-by: Sitsofe Wheeler <sitsofe@yahoo.com>
#include <errno.h>
+#include <fcntl.h>
+#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
return 0;
}
+static inline int set_readhead(int fd, bool enabled) {
+	/* fcntl() returns -1 and sets errno on failure; report success as 0 */
+	if (fcntl(fd, F_RDAHEAD, enabled ? 1 : 0) == -1)
+		return errno;
+
+	return 0;
+}
+
int posix_fadvise(int fd, off_t offset, off_t len, int advice)
{
int ret;
switch(advice) {
case POSIX_FADV_NORMAL:
+ ret = 0;
+ break;
case POSIX_FADV_RANDOM:
+ ret = set_readhead(fd, false);
+ break;
case POSIX_FADV_SEQUENTIAL:
- ret = 0;
+ ret = set_readhead(fd, true);
break;
case POSIX_FADV_DONTNEED:
ret = discard_pages(fd, offset, len);