mm/huge_memory: conditionally call maybe_mkwrite() and drop pte_wrprotect() in __spli...
authorDavid Hildenbrand <david@redhat.com>
Tue, 11 Apr 2023 14:25:12 +0000 (16:25 +0200)
committerAndrew Morton <akpm@linux-foundation.org>
Tue, 18 Apr 2023 23:30:01 +0000 (16:30 -0700)
No need to call maybe_mkwrite() to then wrprotect if the source PMD was not
writable.

It's worth nothing that this now allows for PTEs to be writable even if
the source PMD was not writable: if vma->vm_page_prot includes write
permissions.

As documented in commit 931298e103c2 ("mm/userfaultfd: rely on
vma->vm_page_prot in uffd_wp_range()"), any mechanism that intends to
have pages wrprotected (COW, writenotify, mprotect, uffd-wp, softdirty,
...) has to properly adjust vma->vm_page_prot upfront, to not include
write permissions. If vma->vm_page_prot includes write permissions, the
PTE/PMD can be writable as default.

This now mimics the handling in mm/migrate.c:remove_migration_pte() and in
mm/huge_memory.c:remove_migration_pmd(), which has been in place for a
long time (except that 96a9c287e25d ("mm/migrate: fix wrongly apply write
bit after mkdirty on sparc64") temporarily changed it).

Link: https://lkml.kernel.org/r/20230411142512.438404-7-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/huge_memory.c

index 5700b1c727b69db0b0afd39bf35d0175159185c2..c23fa39dec9280d88decd60c4689853da7da0bc9 100644 (file)
@@ -2233,11 +2233,10 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
                                entry = pte_swp_mkuffd_wp(entry);
                } else {
                        entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot));
-                       entry = maybe_mkwrite(entry, vma);
+                       if (write)
+                               entry = maybe_mkwrite(entry, vma);
                        if (anon_exclusive)
                                SetPageAnonExclusive(page + i);
-                       if (!write)
-                               entry = pte_wrprotect(entry);
                        if (!young)
                                entry = pte_mkold(entry);
                        /* NOTE: this may set soft-dirty too on some archs */