mm, userfaultfd, THP: avoid waiting when PMD under THP migration
authorHuang Ying <ying.huang@intel.com>
Thu, 1 Feb 2018 00:17:32 +0000 (16:17 -0800)
committerLinus Torvalds <torvalds@linux-foundation.org>
Thu, 1 Feb 2018 01:18:37 +0000 (17:18 -0800)
If THP migration is enabled, for a VMA handled by userfaultfd, consider
the following situation,

  do_page_fault()
    __do_huge_pmd_anonymous_page()
     handle_userfault()
       userfault_msg()
         /* a huge page is allocated and mapped at fault address */
         /* the huge page is under migration, leaves migration entry
            in page table */
       userfaultfd_must_wait()
         /* return true because !pmd_present() */
       /* may wait in loop until fatal signal */

That is, it may be possible for userfaultfd_must_wait() encounters a PMD
entry which is !pmd_none() && !pmd_present().  In the current
implementation, we will wait for such PMD entries, which may cause
unnecessary waiting, and potential soft lockup.

This is fixed via avoiding to wait when !pmd_none() && !pmd_present(),
only wait when pmd_none().

This may be not a problem in practice, because userfaultfd_must_wait()
is always called with mm->mmap_sem read-locked.  mremap() will
write-lock mm->mmap_sem.  And UFFDIO_COPY doesn't support to copy THP
mapping.  But the change introduced still makes the code more correct,
and makes the PMD and PTE code more consistent.

Link: http://lkml.kernel.org/r/20171207011752.3292-1-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.UK>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fs/userfaultfd.c

index 743eaa6468987eaea8d087cc286b04444bd8fcd7..a9d0ddc12aced408107cd0fe384f5cc51cfffd75 100644 (file)
@@ -294,10 +294,13 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
         * pmd_trans_unstable) of the pmd.
         */
        _pmd = READ_ONCE(*pmd);
-       if (!pmd_present(_pmd))
+       if (pmd_none(_pmd))
                goto out;
 
        ret = false;
+       if (!pmd_present(_pmd))
+               goto out;
+
        if (pmd_trans_huge(_pmd))
                goto out;