kernel/events/uprobes: pass VMA instead of MM to remove_breakpoint()
authorDavid Hildenbrand <david@redhat.com>
Fri, 21 Mar 2025 11:37:11 +0000 (12:37 +0100)
committerAndrew Morton <akpm@linux-foundation.org>
Mon, 12 May 2025 00:48:17 +0000 (17:48 -0700)
Patch series "kernel/events/uprobes: uprobe_write_opcode() rewrite", v3.

Currently, uprobe_write_opcode() implements COW-breaking manually, which
is really far from ideal.  Further, there is interest in supporting
uprobes on hugetlb pages [1], and leaving at least the COW-breaking to the
core will make this much easier.

Also, I think the current code doesn't really handle some things properly
(see patch #3) when replacing/zapping pages.

Let's rewrite it, to leave COW-breaking to the fault handler, and handle
registration/unregistration by temporarily unmapping the anonymous page,
modifying it, and mapping it again.  We still have to implement zapping of
anonymous pages ourselves, unfortunately.

We could look into not performing the temporary unmapping if we can
perform the write atomically, which would likely also make adding hugetlb
support a lot easier.  But, limited (e.g., only PMD/PUD) hugetlb support
could be added on top of this with some tweaking.

Note that we now won't have to allocate another anonymous folio when
unregistering (which will be beneficial for hugetlb as well), we can
simply modify the already-mapped one from the registration (if any).  When
registering a uprobe, we'll first trigger a ptrace-like write fault to
break COW, to then modify the already-mapped page.

Briefly sanity tested with perf probes and with the bpf uprobes selftest.

This patch (of 3):

Pass VMA instead of MM to remove_breakpoint() and remove the "MM" argument
from install_breakpoint(), because it can easily be derived from the VMA.

Link: https://lkml.kernel.org/r/20250321113713.204682-1-david@redhat.com
Link: https://lkml.kernel.org/r/20250321113713.204682-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Jiri Olsa <olsajiri@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Namhyung kim <namhyung@kernel.org>
Cc: Russel King <linux@armlinux.org.uk>
Cc: tongtiangen <tongtiangen@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
kernel/events/uprobes.c

index 8d783b5882b6a3ea3789d552fb6de8bdea905121..8dc23ca9f66ff0347f2c14e064ec0ecf9da14b8d 100644 (file)
@@ -1134,10 +1134,10 @@ static bool filter_chain(struct uprobe *uprobe, struct mm_struct *mm)
        return ret;
 }
 
-static int
-install_breakpoint(struct uprobe *uprobe, struct mm_struct *mm,
-                       struct vm_area_struct *vma, unsigned long vaddr)
+static int install_breakpoint(struct uprobe *uprobe, struct vm_area_struct *vma,
+               unsigned long vaddr)
 {
+       struct mm_struct *mm = vma->vm_mm;
        bool first_uprobe;
        int ret;
 
@@ -1162,9 +1162,11 @@ install_breakpoint(struct uprobe *uprobe, struct mm_struct *mm,
        return ret;
 }
 
-static int
-remove_breakpoint(struct uprobe *uprobe, struct mm_struct *mm, unsigned long vaddr)
+static int remove_breakpoint(struct uprobe *uprobe, struct vm_area_struct *vma,
+               unsigned long vaddr)
 {
+       struct mm_struct *mm = vma->vm_mm;
+
        set_bit(MMF_RECALC_UPROBES, &mm->flags);
        return set_orig_insn(&uprobe->arch, mm, vaddr);
 }
@@ -1296,10 +1298,10 @@ register_for_each_vma(struct uprobe *uprobe, struct uprobe_consumer *new)
                if (is_register) {
                        /* consult only the "caller", new consumer. */
                        if (consumer_filter(new, mm))
-                               err = install_breakpoint(uprobe, mm, vma, info->vaddr);
+                               err = install_breakpoint(uprobe, vma, info->vaddr);
                } else if (test_bit(MMF_HAS_UPROBES, &mm->flags)) {
                        if (!filter_chain(uprobe, mm))
-                               err |= remove_breakpoint(uprobe, mm, info->vaddr);
+                               err |= remove_breakpoint(uprobe, vma, info->vaddr);
                }
 
  unlock:
@@ -1472,7 +1474,7 @@ static int unapply_uprobe(struct uprobe *uprobe, struct mm_struct *mm)
                        continue;
 
                vaddr = offset_to_vaddr(vma, uprobe->offset);
-               err |= remove_breakpoint(uprobe, mm, vaddr);
+               err |= remove_breakpoint(uprobe, vma, vaddr);
        }
        mmap_read_unlock(mm);
 
@@ -1610,7 +1612,7 @@ int uprobe_mmap(struct vm_area_struct *vma)
                if (!fatal_signal_pending(current) &&
                    filter_chain(uprobe, vma->vm_mm)) {
                        unsigned long vaddr = offset_to_vaddr(vma, uprobe->offset);
-                       install_breakpoint(uprobe, vma->vm_mm, vma, vaddr);
+                       install_breakpoint(uprobe, vma, vaddr);
                }
                put_uprobe(uprobe);
        }