mm,hugetlb: change mechanism to detect a COW on private mapping
Patch series "Misc rework on hugetlb faulting path", v4.
This patchset aims to give some love to the hugetlb faulting path, doing
so by removing obsolete comments that are no longer true, sorting out the
folio lock, and changing the mechanism we use to determine whether we are
COWing a private mapping already.
The most important patch of the series is #1, as it fixes a deadlock that
was described in [1], where two processes were holding the same lock for
the folio in the pagecache, and then deadlocked in the mutex. Note that
this can also happen for anonymous folios. This has been tested using the
reproducer below.
Looking up and locking the folio in the pagecache was done to check
whether that folio was the same folio we had mapped in our pagetables,
meaning that if it was different we knew that we already mapped that folio
privately, so any further CoW would be made on a private mapping, which
led us to the question: __Was the reservation for that address
consumed?__ That is all we care about, because if it was indeed consumed
and we are the owner and we cannot allocate more folios, we need to unmap
the folio from the other processes' pagetables and make it exclusive for us.
We figured we do not need to look up the folio at all, and it is just
enough to check whether the folio we have mapped is anonymous, which means
we mapped it privately, so the reservation was indeed consumed.
Patch#2 sorts out folio locking in the faulting path, reducing its scope
by only taking the lock when we are dealing with an anonymous folio, and
documents it. More details in the patch.
Patch#3-5 are cleanups.
Here is the reproducer:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>

#define PROTECTION (PROT_READ | PROT_WRITE)
#define LENGTH (2UL*1024*1024)
#define ADDR (void *)(0x0UL)
#define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB)

void __read(char *addr)
{
	int i = 0;

	printf("a[%d]: %c\n", i, addr[i]);
}

void fill(char *addr)
{
	addr[0] = 'd';
	printf("addr: %c\n", addr[0]);
}

int main(void)
{
	void *addr;
	pid_t pid, wpid;
	int status;

	addr = mmap(ADDR, LENGTH, PROTECTION, FLAGS, -1, 0);
	if (addr == MAP_FAILED) {
		perror("mmap");
		return -1;
	}

	printf("Parent faulting in RO\n");
	__read(addr);

	sleep(10);
	printf("Forking\n");

	pid = fork();
	switch (pid) {
	case -1:
		perror("fork");
		break;
	case 0:
		sleep(4);
		printf("Child: Faulting in\n");
		fill(addr);
		exit(0);
		break;
	default:
		printf("Parent: Faulting in\n");
		fill(addr);
		while ((wpid = wait(&status)) > 0)
			;
		if (munmap(addr, LENGTH))
			perror("munmap");
	}

	return 0;
}
You will also have to add a delay in hugetlb_wp, after releasing the mutex
and before unmapping, so the window is large enough to reproduce it
reliably.
: --- a/mm/hugetlb.c
: +++ b/mm/hugetlb.c
: @@ -38,6 +38,7 @@
: #include <linux/memory.h>
: #include <linux/mm_inline.h>
: #include <linux/padata.h>
: +#include <linux/delay.h>
:
: #include <asm/page.h>
: #include <asm/pgalloc.h>
: @@ -6261,6 +6262,8 @@ static vm_fault_t hugetlb_wp(struct vm_fault *vmf)
: hugetlb_vma_unlock_read(vma);
: mutex_unlock(&hugetlb_fault_mutex_table[hash]);
:
: + mdelay(8000);
: +
: unmap_ref_private(mm, vma, old_folio, vmf->address);
:
: mutex_lock(&hugetlb_fault_mutex_table[hash]);
This patch (of 5):
hugetlb_wp() checks whether the process is trying to COW on a private
mapping in order to know whether the reservation for that address was
already consumed. If it was consumed and we are the owner of the
mapping, the folio will have to be unmapped from the other processes.
Currently, that check is done by looking up the folio in the pagecache and
comparing it to the folio which is mapped in our pagetables. If it differs,
it means we already mapped it privately before, consuming a reservation on
the way. All we are interested in is whether the mapped folio is
anonymous, so we can simplify and check for that instead.
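As a rough illustration of the two mechanisms described above, here is a
small userspace model. It is only a sketch and not the kernel code: the
structs, field names and function names are hypothetical stand-ins for the
state hugetlb_wp() inspects, and a NULL pagecache pointer stands for "no
folio found in the pagecache".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio; not the kernel's struct folio. */
struct folio_model {
	bool anon;	/* true if the folio was mapped privately (anonymous) */
};

/* Hypothetical snapshot of what the write-protect fault path looks at. */
struct fault_model {
	bool resv_owner;			/* we own the mapping's reservation */
	const struct folio_model *mapped;	/* folio mapped in our pagetables */
	const struct folio_model *pagecache;	/* pagecache lookup result, or NULL */
};

/* Old mechanism: look up the pagecache folio and compare it to the mapped one. */
bool cow_from_owner_old(const struct fault_model *f)
{
	return f->resv_owner && f->mapped != f->pagecache;
}

/* New mechanism: only ask whether the mapped folio is anonymous. */
bool cow_from_owner_new(const struct fault_model *f)
{
	return f->resv_owner && f->mapped->anon;
}
```

In this model both checks agree whenever "the mapped folio differs from the
pagecache folio" and "the mapped folio is anonymous" describe the same
state, which is the equivalence the description above relies on. The new
form needs no pagecache lookup, and therefore no pagecache folio lock, to
make the decision.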
Link: https://lkml.kernel.org/r/20250630144212.156938-1-osalvador@suse.de
Link: https://lkml.kernel.org/r/20250627102904.107202-1-osalvador@suse.de
Link: https://lkml.kernel.org/r/20250627102904.107202-2-osalvador@suse.de
Link: https://lore.kernel.org/lkml/20250513093448.592150-1-gavinguo@igalia.com/
Link: https://lkml.kernel.org/r/20250630144212.156938-2-osalvador@suse.de
Fixes: 40549ba8f8e0 ("hugetlb: use new vma_lock for pmd sharing synchronization")
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Reported-by: Gavin Guo <gavinguo@igalia.com>
Closes: https://lore.kernel.org/lkml/20250513093448.592150-1-gavinguo@igalia.com/
Suggested-by: Peter Xu <peterx@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>