mm: hwpoison: support recovery from ksm_might_need_to_copy()
author Kefeng Wang <wangkefeng.wang@huawei.com>
Fri, 9 Dec 2022 07:28:01 +0000 (15:28 +0800)
committer Andrew Morton <akpm@linux-foundation.org>
Thu, 9 Feb 2023 23:56:51 +0000 (15:56 -0800)
commit 6b970599e807ea95c653926d41b095a92fd381e2
tree 5ae0aeb6870eebe132590664658c4deb42027dc0
parent 55d77bae73426237b3c74c1757a894b056550dff

When the kernel copies a page in ksm_might_need_to_copy() and runs into
an uncorrectable memory error, it crashes because the poisoned page is
consumed by the kernel.  This is similar to the issue recently fixed by
Copy-on-write poison recovery.

When an error is detected during the page copy, return VM_FAULT_HWPOISON
from do_swap_page(), and install a hwpoison entry in unuse_pte() during
swapoff, which helps avoid a system crash.  Note that memory failure
handling of a KSM page will be skipped, but memory_failure_queue() is
still called to be consistent with the general memory failure process,
and KSM page recovery could be supported in the future.
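
The error flow described above can be illustrated with a small user-space
model.  This is only a sketch of the pattern, not the code in mm/ksm.c,
mm/memory.c or mm/swapfile.c: every model_* name, the page size and the
enum values are hypothetical, and the "poisoned" flag merely stands in for
an uncorrectable error seen by a machine-check-safe copy.  It assumes the
copy helper reports the failure as -EHWPOISON, which the fault path maps
to a hwpoison fault code and the swapoff path maps to a hwpoison entry.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* fallback for non-Linux hosts; illustrative only */
#endif

#define MODEL_PAGE_SIZE 64	/* hypothetical, far smaller than a real page */

/* Hypothetical stand-ins for the fault result and the installed PTE type. */
enum model_fault { MODEL_FAULT_NONE, MODEL_FAULT_HWPOISON };
enum model_pte { MODEL_PTE_PRESENT, MODEL_PTE_HWPOISON };

struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
	bool poisoned;	/* models an uncorrectable memory error */
};

/* Models a machine-check-safe copy: 0 on success, nonzero on poison. */
static int model_copy_mc(struct model_page *dst, const struct model_page *src)
{
	assert(dst && src);
	if (src->poisoned)
		return -1;	/* report the error instead of consuming it */
	memcpy(dst->data, src->data, MODEL_PAGE_SIZE);
	dst->poisoned = false;
	return 0;
}

/* Models the copy helper: 0 on success, -EHWPOISON if the copy failed. */
static int model_might_need_to_copy(struct model_page *dst,
				    const struct model_page *src)
{
	if (model_copy_mc(dst, src))
		return -EHWPOISON;
	return 0;
}

/* Models the swap-in fault path: a failed copy becomes a hwpoison fault. */
static enum model_fault model_swap_fault(struct model_page *dst,
					 const struct model_page *src)
{
	if (model_might_need_to_copy(dst, src) == -EHWPOISON)
		return MODEL_FAULT_HWPOISON;
	return MODEL_FAULT_NONE;
}

/* Models the swapoff path: a failed copy installs a hwpoison entry. */
static enum model_pte model_unuse(struct model_page *dst,
				  const struct model_page *src)
{
	if (model_might_need_to_copy(dst, src) == -EHWPOISON)
		return MODEL_PTE_HWPOISON;
	return MODEL_PTE_PRESENT;
}
```

In this model neither caller ever touches the poisoned data: the fault
path hands an error code back to its caller, and the swapoff path records
a marker entry so that a later access can be handled as a poison fault
rather than crashing the system.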

[wangkefeng.wang@huawei.com: enhance unuse_pte(), fix issue found by lkp]
Link: https://lkml.kernel.org/r/20221213120523.141588-1-wangkefeng.wang@huawei.com
[wangkefeng.wang@huawei.com: update changelog, alter ksm_might_need_to_copy(), restore unlikely() in unuse_pte()]
Link: https://lkml.kernel.org/r/20230201074433.96641-1-wangkefeng.wang@huawei.com
Link: https://lkml.kernel.org/r/20221209072801.193221-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/ksm.c
mm/memory.c
mm/swapfile.c