From: Linus Torvalds
Date: Tue, 9 Jan 2024 19:18:47 +0000 (-0800)
Subject: Merge tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel...
X-Git-Tag: v6.8-rc1~180
X-Git-Url: https://git.kernel.dk/?a=commitdiff_plain;h=fb46e22a9e3863e08aef8815df9f17d0f4b9aede;p=linux-block.git

Merge tag 'mm-stable-2024-01-08-15-31' of git://git./linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
 "Many singleton patches against the MM code. The patch series which are
  included in this merge do the following:

 - Peng Zhang has done some maple tree maintenance work in the series

	'maple_tree: add mt_free_one() and mt_attr() helpers'
	'Some cleanups of maple tree'

 - In the series 'mm: use memmap_on_memory semantics for dax/kmem'
   Vishal Verma has altered the interworking between memory-hotplug and
   dax/kmem so that newly added 'device memory' can more easily have its
   memmap placed within that newly added memory.

 - Matthew Wilcox continues folio-related work (including a few fixes)
   in the patch series

	'Add folio_zero_tail() and folio_fill_tail()'
	'Make folio_start_writeback return void'
	'Fix fault handler's handling of poisoned tail pages'
	'Convert aops->error_remove_page to ->error_remove_folio'
	'Finish two folio conversions'
	'More swap folio conversions'

 - Kefeng Wang has also contributed folio-related work in the series
   'mm: cleanup and use more folio in page fault'

 - Jim Cromie has improved the kmemleak reporting output in the series
   'tweak kmemleak report format'.

 - In the series 'stackdepot: allow evicting stack traces' Andrey
   Konovalov permits clients (in this case KASAN) to cause eviction of
   no longer needed stack traces.

 - Charan Teja Kalla has fixed some accounting issues in the page
   allocator's atomic reserve calculations in the series 'mm:
   page_alloc: fixes for high atomic reserve calculations'.

 - Dmitry Rokosov has added to the samples/ directory some sample code
   for a userspace memcg event listener application. See the series
   'samples: introduce cgroup events listeners'.

 - Some maple tree maintenance work from Liam Howlett in the series
   'maple_tree: iterator state changes'.

 - Nhat Pham has improved zswap's approach to writeback in the series
   'workload-specific and memory pressure-driven zswap writeback'.

 - DAMON/DAMOS feature and maintenance work from SeongJae Park in the
   series

	'mm/damon: let users feed and tame/auto-tune DAMOS'
	'selftests/damon: add Python-written DAMON functionality tests'
	'mm/damon: misc updates for 6.8'

 - Yosry Ahmed has improved memcg's stats flushing in the series 'mm:
   memcg: subtree stats flushing and thresholds'.

 - In the series 'Multi-size THP for anonymous memory' Ryan Roberts has
   added a runtime opt-in feature to transparent hugepages which
   improves performance by allocating larger chunks of memory during
   anonymous page faults.

 - Matthew Wilcox has also contributed some cleanup and maintenance work
   against the buffer_head code in the series 'More buffer_head
   cleanups'.

 - Suren Baghdasaryan has done work on Andrea Arcangeli's series
   'userfaultfd move option'. UFFDIO_MOVE permits userspace heap
   compaction algorithms to move userspace's pages around rather than
   UFFDIO_COPY's alloc/copy/free.

 - Stefan Roesch has developed a 'KSM Advisor', in the series 'mm/ksm:
   Add ksm advisor'. This is a governor which tunes KSM's scanning
   aggressiveness in response to userspace's current needs.

 - Chengming Zhou has optimized zswap's temporary working memory use in
   the series 'mm/zswap: dstmem reuse optimizations and cleanups'.
 - Matthew Wilcox has performed some maintenance work on the writeback
   code, both in the core code and within filesystems. The series is
   'Clean up the writeback paths'.

 - Andrey Konovalov has optimized KASAN's handling of alloc and free
   stack traces for secondary-level allocators, in the series 'kasan:
   save mempool stack traces'.

 - Andrey also performed some KASAN maintenance work in the series
   'kasan: assorted clean-ups'.

 - David Hildenbrand has gone to town on the rmap code. Cleanups, more
   pte batching, folio conversions and more. See the series 'mm/rmap:
   interface overhaul'.

 - Kinsey Ho has contributed some maintenance work on the MGLRU code in
   the series 'mm/mglru: Kconfig cleanup'.

 - Matthew Wilcox has contributed lruvec page accounting code cleanups
   in the series 'Remove some lruvec page accounting functions'"

* tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (361 commits)
  mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER
  mm, treewide: introduce NR_PAGE_ORDERS
  selftests/mm: add separate UFFDIO_MOVE test for PMD splitting
  selftests/mm: skip test if application doesn't has root privileges
  selftests/mm: conform test to TAP format output
  selftests: mm: hugepage-mmap: conform to TAP format output
  selftests/mm: gup_test: conform test to TAP format output
  mm/selftests: hugepage-mremap: conform test to TAP format output
  mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCING
  mm: zsmalloc: return -ENOSPC rather than -EINVAL in zs_malloc while size is too large
  mm/memcontrol: remove __mod_lruvec_page_state()
  mm/khugepaged: use a folio more in collapse_file()
  slub: use a folio in __kmalloc_large_node
  slub: use folio APIs in free_large_kmalloc()
  slub: use alloc_pages_node() in alloc_slab_page()
  mm: remove inc/dec lruvec page state functions
  mm: ratelimit stat flush from workingset shrinker
  kasan: stop leaking stack trace handles
  mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE
  mm/mglru: add dummy pmd_dirty()
  ...
---

fb46e22a9e3863e08aef8815df9f17d0f4b9aede
diff --cc fs/buffer.c
index 5ffc44ab4854,5c29850e4781..d3bcf601d3e5
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@@ -1064,17 -1073,14 +1073,14 @@@ static bool grow_dev_folio(struct block
 	 * lock to be atomic wrt __find_get_block(), which does not
 	 * run under the folio lock.
 	 */
-	spin_lock(&inode->i_mapping->private_lock);
+	spin_lock(&inode->i_mapping->i_private_lock);
 	link_dev_buffers(folio, bh);
-	end_block = folio_init_buffers(folio, bdev,
-			(sector_t)index << sizebits, size);
+	end_block = folio_init_buffers(folio, bdev, size);
-	spin_unlock(&inode->i_mapping->private_lock);
+	spin_unlock(&inode->i_mapping->i_private_lock);
- done:
-	ret = (block < end_block) ? 1 : -ENXIO;
- failed:
+ unlock:
 	folio_unlock(folio);
 	folio_put(folio);
-	return ret;
+	return block < end_block;
 }

 /*
diff --cc include/linux/slab.h
index b2015d0e01ad,d63823e518c0..b5f5ee8308d0
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@@ -307,10 -308,19 +307,10 @@@ static inline unsigned int arch_slab_mi
  * (PAGE_SIZE*2). Larger requests are passed to the page allocator.
  */
 #define KMALLOC_SHIFT_HIGH (PAGE_SHIFT + 1)
- #define KMALLOC_SHIFT_MAX (MAX_ORDER + PAGE_SHIFT)
+ #define KMALLOC_SHIFT_MAX (MAX_PAGE_ORDER + PAGE_SHIFT)
 #ifndef KMALLOC_SHIFT_LOW
-#define KMALLOC_SHIFT_LOW 5
-#endif
-#endif
-
-#ifdef CONFIG_SLUB
-#define KMALLOC_SHIFT_HIGH (PAGE_SHIFT + 1)
-#define KMALLOC_SHIFT_MAX (MAX_PAGE_ORDER + PAGE_SHIFT)
-#ifndef KMALLOC_SHIFT_LOW
 #define KMALLOC_SHIFT_LOW 3
 #endif
-#endif

 /* Maximum allocatable size */
 #define KMALLOC_MAX_SIZE (1UL << KMALLOC_SHIFT_MAX)
diff --cc mm/kasan/quarantine.c
index 138c57b836f2,8afa77bc5d3b..3ba02efb952a
--- a/mm/kasan/quarantine.c
+++ b/mm/kasan/quarantine.c
@@@ -143,7 -143,10 +143,9 @@@ static void *qlink_to_object(struct qli
 static void qlink_free(struct qlist_node *qlink, struct kmem_cache *cache)
 {
 	void *object = qlink_to_object(qlink, cache);
-	struct kasan_free_meta *meta = kasan_get_free_meta(cache, object);
+	struct kasan_free_meta *free_meta = kasan_get_free_meta(cache, object);
-	unsigned long flags;
+
+	kasan_release_object_meta(cache, object);

 	/*
 	 * If init_on_free is enabled and KASAN's free metadata is stored in
@@@ -153,15 -156,15 +155,9 @@@
 	 */
 	if (slab_want_init_on_free(cache) &&
 	    cache->kasan_info.free_meta_offset == 0)
-		memzero_explicit(meta, sizeof(*meta));
-
-	/*
-	 * As the object now gets freed from the quarantine, assume that its
-	 * free track is no longer valid.
-	 */
-	*(u8 *)kasan_mem_to_shadow(object) = KASAN_SLAB_FREE;
+		memzero_explicit(free_meta, sizeof(*free_meta));

-	if (IS_ENABLED(CONFIG_SLAB))
-		local_irq_save(flags);
-
 	___cache_free(cache, object, _THIS_IP_);
-
-	if (IS_ENABLED(CONFIG_SLAB))
-		local_irq_restore(flags);
 }

 static void qlist_free_all(struct qlist_head *q, struct kmem_cache *cache)
diff --cc mm/mempool.c
index 4759be0ff9de,cb7b4b56cec1..dbbf0e9fb424
--- a/mm/mempool.c
+++ b/mm/mempool.c
@@@ -89,28 -97,29 +97,29 @@@ static void poison_element(mempool_t *p
 	} else if (pool->alloc == mempool_alloc_pages) {
 		/* Mempools backed by page allocator */
 		int order = (int)(long)pool->pool_data;
-		void *addr = kmap_atomic((struct page *)element);
+		void *addr = kmap_local_page((struct page *)element);

 		__poison_element(addr, 1UL << (PAGE_SHIFT + order));
-		kunmap_atomic(addr);
+		kunmap_local(addr);
 	}
 }

-#else /* CONFIG_DEBUG_SLAB || CONFIG_SLUB_DEBUG_ON */
+#else /* CONFIG_SLUB_DEBUG_ON */

 static inline void check_element(mempool_t *pool, void *element)
 {
 }

 static inline void poison_element(mempool_t *pool, void *element)
 {
 }
-#endif /* CONFIG_DEBUG_SLAB || CONFIG_SLUB_DEBUG_ON */
+#endif /* CONFIG_SLUB_DEBUG_ON */

- static __always_inline void kasan_poison_element(mempool_t *pool, void *element)
+ static __always_inline bool kasan_poison_element(mempool_t *pool, void *element)
 {
 	if (pool->alloc == mempool_alloc_slab || pool->alloc == mempool_kmalloc)
-		kasan_slab_free_mempool(element);
+		return kasan_mempool_poison_object(element);
 	else if (pool->alloc == mempool_alloc_pages)
-		kasan_poison_pages(element, (unsigned long)pool->pool_data,
-				false);
+		return kasan_mempool_poison_pages(element,
+				(unsigned long)pool->pool_data);
+	return true;
 }

 static void kasan_unpoison_element(mempool_t *pool, void *element)
diff --cc mm/slub.c
index fac07382d3a6,ba162e661e2e..2ef88bbf56a3
--- a/mm/slub.c
+++ b/mm/slub.c
@@@ -3876,148 -3503,37 +3883,148 @@@ void *kmem_cache_alloc_lru(struct kmem_
 	return ret;
 }
+EXPORT_SYMBOL(kmem_cache_alloc_lru);

-void *kmem_cache_alloc(struct kmem_cache *s, gfp_t gfpflags)
+/**
+ * kmem_cache_alloc_node - Allocate an object on the specified node
+ * @s: The cache to allocate from.
+ * @gfpflags: See kmalloc().
+ * @node: node number of the target node.
+ *
+ * Identical to kmem_cache_alloc but it will allocate memory on the given
+ * node, which can improve the performance for cpu bound structures.
+ *
+ * Fallback to other node is possible if __GFP_THISNODE is not set.
+ *
+ * Return: pointer to the new object or %NULL in case of error
+ */
+void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t gfpflags, int node)
 {
-	return __kmem_cache_alloc_lru(s, NULL, gfpflags);
+	void *ret = slab_alloc_node(s, NULL, gfpflags, node, _RET_IP_, s->object_size);
+
+	trace_kmem_cache_alloc(_RET_IP_, ret, s, gfpflags, node);
+
+	return ret;
 }
-EXPORT_SYMBOL(kmem_cache_alloc);
+EXPORT_SYMBOL(kmem_cache_alloc_node);

-void *kmem_cache_alloc_lru(struct kmem_cache *s, struct list_lru *lru,
-		gfp_t gfpflags)
+/*
+ * To avoid unnecessary overhead, we pass through large allocation requests
+ * directly to the page allocator. We use __GFP_COMP, because we will need to
+ * know the allocation order to free the pages properly in kfree.
+ */
+static void *__kmalloc_large_node(size_t size, gfp_t flags, int node)
 {
-	struct page *page;
-	return __kmem_cache_alloc_lru(s, lru, gfpflags);
++	struct folio *folio;
+	void *ptr = NULL;
+	unsigned int order = get_order(size);
+
+	if (unlikely(flags & GFP_SLAB_BUG_MASK))
+		flags = kmalloc_fix_flags(flags);
+
+	flags |= __GFP_COMP;
-	page = alloc_pages_node(node, flags, order);
-	if (page) {
-		ptr = page_address(page);
-		mod_lruvec_page_state(page, NR_SLAB_UNRECLAIMABLE_B,
++	folio = (struct folio *)alloc_pages_node(node, flags, order);
++	if (folio) {
++		ptr = folio_address(folio);
++		lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B,
+				PAGE_SIZE << order);
+	}
+
+	ptr = kasan_kmalloc_large(ptr, size, flags);
+	/* As ptr might get tagged, call kmemleak hook after KASAN. */
+	kmemleak_alloc(ptr, size, 1, flags);
+	kmsan_kmalloc_large(ptr, size, flags);
+
+	return ptr;
 }

-EXPORT_SYMBOL(kmem_cache_alloc_lru);

-void *__kmem_cache_alloc_node(struct kmem_cache *s, gfp_t gfpflags,
-			int node, size_t orig_size,
-			unsigned long caller)
+void *kmalloc_large(size_t size, gfp_t flags)
 {
-	return slab_alloc_node(s, NULL, gfpflags, node,
-			caller, orig_size);
+	void *ret = __kmalloc_large_node(size, flags, NUMA_NO_NODE);
+
+	trace_kmalloc(_RET_IP_, ret, size, PAGE_SIZE << get_order(size),
+		      flags, NUMA_NO_NODE);
+	return ret;
 }
+EXPORT_SYMBOL(kmalloc_large);

-void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t gfpflags, int node)
+void *kmalloc_large_node(size_t size, gfp_t flags, int node)
 {
-	void *ret = slab_alloc_node(s, NULL, gfpflags, node, _RET_IP_, s->object_size);
+	void *ret = __kmalloc_large_node(size, flags, node);

-	trace_kmem_cache_alloc(_RET_IP_, ret, s, gfpflags, node);
+	trace_kmalloc(_RET_IP_, ret, size, PAGE_SIZE << get_order(size),
+		      flags, node);
+	return ret;
+}
+EXPORT_SYMBOL(kmalloc_large_node);

+static __always_inline
+void *__do_kmalloc_node(size_t size, gfp_t flags, int node,
+			unsigned long caller)
+{
+	struct kmem_cache *s;
+	void *ret;
+
+	if (unlikely(size > KMALLOC_MAX_CACHE_SIZE)) {
+		ret = __kmalloc_large_node(size, flags, node);
+		trace_kmalloc(caller, ret, size,
+			      PAGE_SIZE << get_order(size), flags, node);
+		return ret;
+	}
+
+	if (unlikely(!size))
+		return ZERO_SIZE_PTR;
+
+	s = kmalloc_slab(size, flags, caller);
+
+	ret = slab_alloc_node(s, NULL, flags, node, caller, size);
+	ret = kasan_kmalloc(s, ret, size, flags);
+	trace_kmalloc(caller, ret, size, s->size, flags, node);
 	return ret;
 }
-EXPORT_SYMBOL(kmem_cache_alloc_node);
+
+void *__kmalloc_node(size_t size, gfp_t flags, int node)
+{
+	return __do_kmalloc_node(size, flags, node, _RET_IP_);
+}
+EXPORT_SYMBOL(__kmalloc_node);
+
+void *__kmalloc(size_t size, gfp_t flags)
+{
+	return __do_kmalloc_node(size, flags, NUMA_NO_NODE, _RET_IP_);
+}
+EXPORT_SYMBOL(__kmalloc);
+
+void *__kmalloc_node_track_caller(size_t size, gfp_t flags,
+				  int node, unsigned long caller)
+{
+	return __do_kmalloc_node(size, flags, node, caller);
+}
+EXPORT_SYMBOL(__kmalloc_node_track_caller);
+
+void *kmalloc_trace(struct kmem_cache *s, gfp_t gfpflags, size_t size)
+{
+	void *ret = slab_alloc_node(s, NULL, gfpflags, NUMA_NO_NODE,
+				    _RET_IP_, size);
+
+	trace_kmalloc(_RET_IP_, ret, size, s->size, gfpflags, NUMA_NO_NODE);
+
+	ret = kasan_kmalloc(s, ret, size, gfpflags);
+	return ret;
+}
+EXPORT_SYMBOL(kmalloc_trace);
+
+void *kmalloc_node_trace(struct kmem_cache *s, gfp_t gfpflags,
+			 int node, size_t size)
+{
+	void *ret = slab_alloc_node(s, NULL, gfpflags, node, _RET_IP_, size);
+
+	trace_kmalloc(_RET_IP_, ret, size, s->size, gfpflags, node);
+
+	ret = kasan_kmalloc(s, ret, size, gfpflags);
+	return ret;
+}
+EXPORT_SYMBOL(kmalloc_node_trace);

 static noinline void free_to_partial_list(
 	struct kmem_cache *s, struct slab *slab,
@@@ -4357,52 -3839,6 +4364,52 @@@ void kmem_cache_free(struct kmem_cache
 }
 EXPORT_SYMBOL(kmem_cache_free);

+static void free_large_kmalloc(struct folio *folio, void *object)
+{
+	unsigned int order = folio_order(folio);
+
+	if (WARN_ON_ONCE(order == 0))
+		pr_warn_once("object pointer: 0x%p\n", object);
+
+	kmemleak_free(object);
+	kasan_kfree_large(object);
+	kmsan_kfree_large(object);
+
-	mod_lruvec_page_state(folio_page(folio, 0), NR_SLAB_UNRECLAIMABLE_B,
++	lruvec_stat_mod_folio(folio, NR_SLAB_UNRECLAIMABLE_B,
+			-(PAGE_SIZE << order));
-	__free_pages(folio_page(folio, 0), order);
++	folio_put(folio);
+}
+
+/**
+ * kfree - free previously allocated memory
+ * @object: pointer returned by kmalloc() or kmem_cache_alloc()
+ *
+ * If @object is NULL, no operation is performed.
+ */
+void kfree(const void *object)
+{
+	struct folio *folio;
+	struct slab *slab;
+	struct kmem_cache *s;
+	void *x = (void *)object;
+
+	trace_kfree(_RET_IP_, object);
+
+	if (unlikely(ZERO_OR_NULL_PTR(object)))
+		return;
+
+	folio = virt_to_folio(object);
+	if (unlikely(!folio_test_slab(folio))) {
+		free_large_kmalloc(folio, (void *)object);
+		return;
+	}
+
+	slab = folio_slab(folio);
+	s = slab->slab_cache;
+	slab_free(s, slab, x, _RET_IP_);
+}
+EXPORT_SYMBOL(kfree);
+
 struct detached_freelist {
 	struct slab *slab;
 	void *tail;
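
To close, a minimal usage sketch of the kfree() path added above (not part of
the patch; the 64 KiB figure assumes 4 KiB pages, where the slab.h hunk puts
KMALLOC_MAX_CACHE_SIZE at 8 KiB):

	/*
	 * Sketch only: both pointers come back through the same kfree().
	 * The small object lives in a kmalloc slab bucket; the large one
	 * exceeds KMALLOC_MAX_CACHE_SIZE, so __kmalloc_large_node() handed
	 * it to the page allocator as a __GFP_COMP folio.
	 */
	void *small = kmalloc(64, GFP_KERNEL);
	void *large = kmalloc(64 * 1024, GFP_KERNEL);

	kfree(small);	/* folio_test_slab() true  -> slab_free() */
	kfree(large);	/* folio_test_slab() false -> free_large_kmalloc(),
			 * i.e. lruvec_stat_mod_folio() + folio_put() */

kfree(NULL) remains a no-op, per the kernel-doc comment above, so neither call
needs a NULL check even if an allocation fails.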