mm: fix double mmap_sem unlock on MMF_UNSTABLE enforced SIGBUS
author Michal Hocko <mhocko@suse.com>
Fri, 18 Aug 2017 22:16:12 +0000 (15:16 -0700)
committer Linus Torvalds <torvalds@linux-foundation.org>
Fri, 18 Aug 2017 22:32:01 +0000 (15:32 -0700)
Tetsuo Handa has noticed that the MMF_UNSTABLE SIGBUS path in
handle_mm_fault causes a lockdep splat:

  Out of memory: Kill process 1056 (a.out) score 603 or sacrifice child
  Killed process 1056 (a.out) total-vm:4268108kB, anon-rss:2246048kB, file-rss:0kB, shmem-rss:0kB
  a.out (1169) used greatest stack depth: 11664 bytes left
  DEBUG_LOCKS_WARN_ON(depth <= 0)
  ------------[ cut here ]------------
  WARNING: CPU: 6 PID: 1339 at kernel/locking/lockdep.c:3617 lock_release+0x172/0x1e0
  CPU: 6 PID: 1339 Comm: a.out Not tainted 4.13.0-rc3-next-20170803+ #142
  Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
  RIP: 0010:lock_release+0x172/0x1e0
  Call Trace:
     up_read+0x1a/0x40
     __do_page_fault+0x28e/0x4c0
     do_page_fault+0x30/0x80
     page_fault+0x28/0x30

The reason is that the page fault path might have dropped the mmap_sem
and returned with VM_FAULT_RETRY.  The MMF_UNSTABLE check, however,
rewrites the return value to VM_FAULT_SIGBUS, and we always expect
mmap_sem to be taken in that path.  Fix this by retaking mmap_sem when
VM_FAULT_RETRY is set in the MMF_UNSTABLE path.

We cannot simply add VM_FAULT_SIGBUS to the existing error code because
all arch specific page fault handlers and g-u-p would have to learn a
new error code combination.

Link: http://lkml.kernel.org/r/20170807113839.16695-2-mhocko@kernel.org
Fixes: 3f70dc38cec2 ("mm: make sure that kthreads will not refault oom reaped memory")
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Andrea Arcangeli <andrea@kernel.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Wenwei Tao <wenwei.tww@alibaba-inc.com>
Cc: <stable@vger.kernel.org> [4.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/memory.c

index e158f7ac67300b10b8827fe6825667506095f550..c717b5bcc80e2c3e19664d0638a391708e89fdaf 100644
@@ -3910,8 +3910,18 @@ int handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
         * further.
         */
        if (unlikely((current->flags & PF_KTHREAD) && !(ret & VM_FAULT_ERROR)
-                               && test_bit(MMF_UNSTABLE, &vma->vm_mm->flags)))
+                               && test_bit(MMF_UNSTABLE, &vma->vm_mm->flags))) {
+
+               /*
+                * We are going to enforce SIGBUS but the PF path might have
+                * dropped the mmap_sem already so take it again so that
+                * we do not break expectations of all arch specific PF paths
+                * and g-u-p
+                */
+               if (ret & VM_FAULT_RETRY)
+                       down_read(&vma->vm_mm->mmap_sem);
                ret = VM_FAULT_SIGBUS;
+       }
 
        return ret;
 }