mm: memory: check userfaultfd_wp() in vmf_orig_pte_uffd_wp()
authorKefeng Wang <wangkefeng.wang@huawei.com>
Mon, 22 Apr 2024 03:00:39 +0000 (11:00 +0800)
committerAndrew Morton <akpm@linux-foundation.org>
Mon, 6 May 2024 00:53:43 +0000 (17:53 -0700)
commit6ed31ba3921162ef5d4f2f99b91681e8fd24ff34
treed5965ed7c08ec0e2d280dd65a412673328f04f37
parent2d8b272cdcad17d9b875c206fbcb0422b791ab3a
mm: memory: check userfaultfd_wp() in vmf_orig_pte_uffd_wp()

Add userfaultfd_wp() check in vmf_orig_pte_uffd_wp() to avoid the
unnecessary FAULT_FLAG_ORIG_PTE_VALID check/pte_marker_entry_uffd_wp() in
most pagefault, note, the function vmf_orig_pte_uffd_wp() is not inlined
in the two kernel versions, the difference is shown below,

perf date,

  perf report -i perf.data.before | grep vmf
     0.17%     0.13%  lat_pagefault  [kernel.kallsyms]      [k] vmf_orig_pte_uffd_wp.part.0.isra.0
  perf report -i perf.data.after  | grep vmf

lat_pagefault -W 5 -N 5 /tmp/XXX
  latency              before        after        diff
  average(8 tests)     0.262675      0.2600375   -0.0026375

Although it's a small, but the uffd_wp is a new feature than previous
kernel, when the vma is not registered with UFFD_WP, let's avoid to
execute the new logical, also adding __always_inline attribute to
vmf_orig_pte_uffd_wp(), which make set_pte_range() only check VM_UFFD_WP
flags without the function call.  In addition, directly call the
vmf_orig_pte_uffd_wp() in do_anonymous_page() and set_pte_range() to save
an uffd_wp variable.

Link: https://lkml.kernel.org/r/20240422030039.3293568-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/memory.c