mm: page_alloc: close migratetype race between freeing and stealing
authorJohannes Weiner <hannes@cmpxchg.org>
Wed, 20 Mar 2024 18:02:12 +0000 (14:02 -0400)
committerAndrew Morton <akpm@linux-foundation.org>
Fri, 26 Apr 2024 03:56:03 +0000 (20:56 -0700)
commit55612e80e722ac554cc5e80df05555b4f8d40c37
tree81841e448fcf80973be27d6052acb4ff1f2c8f3d
parentc0cd6f557b9090525d288806cccbc73440ac235a
mm: page_alloc: close migratetype race between freeing and stealing

There are three freeing paths that read the page's migratetype
optimistically before grabbing the zone lock.  When this races with block
stealing, those pages go on the wrong freelist.

The paths in question are:
- when freeing >costly orders that aren't THP
- when freeing pages to the buddy upon pcp lock contention
- when freeing pages that are isolated
- when freeing pages initially during boot
- when freeing the remainder in alloc_pages_exact()
- when "accepting" unaccepted VM host memory before first use
- when freeing pages during unpoisoning

None of these are so hot that they would need this optimization at the
cost of hampering defrag efforts.  Especially when contrasted with the
fact that the most common buddy freeing path - free_pcppages_bulk - is
checking the migratetype under the zone->lock just fine.

In addition, isolated pages need to look up the migratetype under the lock
anyway, which adds branches to the locked section, and results in a double
lookup when the pages are in fact isolated.

Move the lookups into the lock.

Link: https://lkml.kernel.org/r/20240320180429.678181-8-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/page_alloc.c