linux-block.git
3 weeks agof2fs: Add fs parameter specifications for mount options
Hongbo Li [Thu, 10 Jul 2025 12:14:09 +0000 (12:14 +0000)]
f2fs: Add fs parameter specifications for mount options

Use an array of `fs_parameter_spec` called f2fs_param_specs to
hold the mount option specifications for the new mount api.

Add constant_table structures for several options to facilitate
parsing.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
[sandeen: forward port, minor fixes and updates, more fsparam_enum]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: fix to avoid out-of-boundary access in devs.path
Chao Yu [Fri, 11 Jul 2025 07:14:50 +0000 (15:14 +0800)]
f2fs: fix to avoid out-of-boundary access in devs.path

- touch /mnt/f2fs/012345678901234567890123456789012345678901234567890123
- truncate -s $((1024*1024*1024)) \
  /mnt/f2fs/012345678901234567890123456789012345678901234567890123
- touch /mnt/f2fs/file
- truncate -s $((1024*1024*1024)) /mnt/f2fs/file
- mkfs.f2fs /mnt/f2fs/012345678901234567890123456789012345678901234567890123 \
  -c /mnt/f2fs/file
- mount /mnt/f2fs/012345678901234567890123456789012345678901234567890123 \
  /mnt/f2fs/loop

[16937.192225] F2FS-fs (loop0): Mount Device [ 0]: /mnt/f2fs/012345678901234567890123456789012345678901234567890123\xff\x01,      511,        0 -    3ffff
[16937.192268] F2FS-fs (loop0): Failed to find devices

If device path length equals to MAX_PATH_LEN, sbi->devs.path[] may
not end up w/ null character due to path array is fully filled, So
accidently, fields locate after path[] may be treated as part of
device path, result in parsing wrong device path.

struct f2fs_dev_info {
...
char path[MAX_PATH_LEN];
...
};

Let's add one byte space for sbi->devs.path[] to store null
character of device path string.

Fixes: 3c62be17d4f5 ("f2fs: support multiple devices")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Remove F2FS_P_SB()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:57 +0000 (18:03 +0100)]
f2fs: Remove F2FS_P_SB()

All callers have been converted to F2FS_F_SB() so delete this wrapper.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to __has_merged_page()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:56 +0000 (18:03 +0100)]
f2fs: Pass a folio to __has_merged_page()

All three callers have a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to f2fs_submit_merged_write_cond()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:55 +0000 (18:03 +0100)]
f2fs: Pass a folio to f2fs_submit_merged_write_cond()

Most callers pass NULL, and the one that passes a page already has a
folio.  Also convert __submit_merged_write_cond() to take a folio.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Remove use of page from f2fs_write_single_data_page()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:54 +0000 (18:03 +0100)]
f2fs: Remove use of page from f2fs_write_single_data_page()

Both remaining uses of page now have a folio equivalent.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Remove clear_page_private_all()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:53 +0000 (18:03 +0100)]
f2fs: Remove clear_page_private_all()

All callers can simply call folio_detach_private().  This was the
only way that clear_page_private_data() could be called, so remove
that too.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Use F2FS_F_SB() in f2fs_read_end_io()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:52 +0000 (18:03 +0100)]
f2fs: Use F2FS_F_SB() in f2fs_read_end_io()

Get the folio from the bio instead of the page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Use a folio in f2fs_encrypted_get_link()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:51 +0000 (18:03 +0100)]
f2fs: Use a folio in f2fs_encrypted_get_link()

Use a folio instead of a page when dealing with the page cache.  Removes
a hidden call to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to f2fs_cache_compressed_page()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:50 +0000 (18:03 +0100)]
f2fs: Pass a folio to f2fs_cache_compressed_page()

The only caller already has a folio so pass it in.
f2fs_cache_compressed_page() is not used outside compress.c so
make it static.  This requires a forward declaration (or would require
rearranging this file, but I've chosen not to do that for readability of
the diff).

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to F2FS_NODE()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:49 +0000 (18:03 +0100)]
f2fs: Pass a folio to F2FS_NODE()

All callers now have a folio so pass it in

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass the nat_blk to __update_nat_bits()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:48 +0000 (18:03 +0100)]
f2fs: Pass the nat_blk to __update_nat_bits()

The page argument is only used to look up the address of the nat_blk.
Since the caller already has it, pass it in instead.  Also mark it const
as the nat_blk isn't modified by this function.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Convert get_next_nat_page() to get_next_nat_folio()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:47 +0000 (18:03 +0100)]
f2fs: Convert get_next_nat_page() to get_next_nat_folio()

Return a folio from this function and convert its one caller.
Removes a call to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to f2fs_is_compressed_page()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:46 +0000 (18:03 +0100)]
f2fs: Pass a folio to f2fs_is_compressed_page()

All callers now have a folio so pass it in.  Also remove the test for
the private flag; it is redundant with checking folio->private for being
NULL.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Use a folio iterator in f2fs_verify_bio()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:45 +0000 (18:03 +0100)]
f2fs: Use a folio iterator in f2fs_verify_bio()

Change from bio_for_each_segment_all() to bio_for_each_folio_all()
to iterate over each folio instead of each page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to f2fs_end_read_compressed_page()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:44 +0000 (18:03 +0100)]
f2fs: Pass a folio to f2fs_end_read_compressed_page()

Both callers now have a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Use a folio iterator in f2fs_handle_step_decompress()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:43 +0000 (18:03 +0100)]
f2fs: Use a folio iterator in f2fs_handle_step_decompress()

Change from bio_for_each_segment_all() to bio_for_each_folio_all()
to iterate over each folio instead of each page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to WB_DATA_TYPE() and f2fs_is_cp_guaranteed()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:42 +0000 (18:03 +0100)]
f2fs: Pass a folio to WB_DATA_TYPE() and f2fs_is_cp_guaranteed()

All callers now have a folio so pass it in.  Removes a call to
compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Use a bio in f2fs_submit_page_write()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:41 +0000 (18:03 +0100)]
f2fs: Use a bio in f2fs_submit_page_write()

Convert bio_page to bio_folio and use it throughout.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Use a folio in f2fs_merge_page_bio()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:40 +0000 (18:03 +0100)]
f2fs: Use a folio in f2fs_merge_page_bio()

We have two folios to deal with here; one carries the metadata and the
other points to the data.  They may be the same, but if it's compressed,
the data_folio will differ from the metadata folio.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to f2fs_compress_write_end_io()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:39 +0000 (18:03 +0100)]
f2fs: Pass a folio to f2fs_compress_write_end_io()

The only caller has a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Convert get_page_private_data() to folio_get_f2fs_data()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:38 +0000 (18:03 +0100)]
f2fs: Convert get_page_private_data() to folio_get_f2fs_data()

The only caller already has a folio so convert this function to be folio
based.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Convert set_page_private_data() to folio_set_f2fs_data()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:37 +0000 (18:03 +0100)]
f2fs: Convert set_page_private_data() to folio_set_f2fs_data()

The only caller has a folio, so pass it in and operate on it.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Use a folio in f2fs_is_cp_guaranteed()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:36 +0000 (18:03 +0100)]
f2fs: Use a folio in f2fs_is_cp_guaranteed()

Convert the passed page to a folio and use it throughout.  Removes
a use of fscrypt_is_bounce_page(), which we're trying to remove.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Add folio counterparts to page_private_flags functions
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:35 +0000 (18:03 +0100)]
f2fs: Add folio counterparts to page_private_flags functions

Name these new functions folio_test_f2fs_*(), folio_set_f2fs_*() and
folio_clear_f2fs_*().  Convert all callers which currently have a folio
and cast back to a page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to IS_INODE()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:34 +0000 (18:03 +0100)]
f2fs: Pass a folio to IS_INODE()

All callers now have a folio so pass it in.  Also make it const to help
the compiler.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to ADDRS_PER_PAGE()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:33 +0000 (18:03 +0100)]
f2fs: Pass a folio to ADDRS_PER_PAGE()

All callers now have a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to get_dnode_base()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:32 +0000 (18:03 +0100)]
f2fs: Pass a folio to get_dnode_base()

The only caller already has a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to ofs_of_node()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:31 +0000 (18:03 +0100)]
f2fs: Pass a folio to ofs_of_node()

All callers now have a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to IS_DNODE()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:30 +0000 (18:03 +0100)]
f2fs: Pass a folio to IS_DNODE()

All callers now have a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to is_node()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:29 +0000 (18:03 +0100)]
f2fs: Pass a folio to is_node()

All three callers now have a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to is_cold_node()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:28 +0000 (18:03 +0100)]
f2fs: Pass a folio to is_cold_node()

All callers now have a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Use folio_unlock() in f2fs_write_compressed_pages()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:27 +0000 (18:03 +0100)]
f2fs: Use folio_unlock() in f2fs_write_compressed_pages()

Remove a call to compound_head() by replacing a call to unlock_page()
with a call to folio_unlock().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Add fio->folio
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:26 +0000 (18:03 +0100)]
f2fs: Add fio->folio

Put fio->page insto a union with fio->folio.  This lets us remove a
lot of folio->page and page->folio conversions.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to is_dent_dnode()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:25 +0000 (18:03 +0100)]
f2fs: Pass a folio to is_dent_dnode()

Both callers have a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to is_fsync_dnode()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:24 +0000 (18:03 +0100)]
f2fs: Pass a folio to is_fsync_dnode()

Both callers have a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to f2fs_recover_xattr_data()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:23 +0000 (18:03 +0100)]
f2fs: Pass a folio to f2fs_recover_xattr_data()

One caller passes NULL and the other caller already has a folio so
pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to cpver_of_node()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:22 +0000 (18:03 +0100)]
f2fs: Pass a folio to cpver_of_node()

All callers have a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to fill_node_footer()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:21 +0000 (18:03 +0100)]
f2fs: Pass a folio to fill_node_footer()

All callers have a folio so pass it in.  Also mark it as const to help
the compiler.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass folios to copy_node_footer()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:20 +0000 (18:03 +0100)]
f2fs: Pass folios to copy_node_footer()

The only caller has folios so pass them in.  Also mark them as const to help
the compiler.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to set_cold_node()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:19 +0000 (18:03 +0100)]
f2fs: Pass a folio to set_cold_node()

All callers have a folio so pass it in.  Also mark it as const to help
the compiler.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to get_nid()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:18 +0000 (18:03 +0100)]
f2fs: Pass a folio to get_nid()

All callers have a folio so pass it in.  Also mark it as const to help
the compiler.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to fill_node_footer_blkaddr()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:17 +0000 (18:03 +0100)]
f2fs: Pass a folio to fill_node_footer_blkaddr()

The only caller has a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to f2fs_inode_chksum()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:16 +0000 (18:03 +0100)]
f2fs: Pass a folio to f2fs_inode_chksum()

Both callers have a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to f2fs_enable_inode_chksum()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:15 +0000 (18:03 +0100)]
f2fs: Pass a folio to f2fs_enable_inode_chksum()

All callers have a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to f2fs_inode_chksum_set()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:14 +0000 (18:03 +0100)]
f2fs: Pass a folio to f2fs_inode_chksum_set()

All callers have a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to f2fs_allocate_data_block()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:13 +0000 (18:03 +0100)]
f2fs: Pass a folio to f2fs_allocate_data_block()

Most callers pass NULL, and the one which passes a page already has a
folio, so we can pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to set_mark()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:12 +0000 (18:03 +0100)]
f2fs: Pass a folio to set_mark()

All callers have a folio so pass it in.  Removes a call to
compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to set_fsync_mark()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:11 +0000 (18:03 +0100)]
f2fs: Pass a folio to set_fsync_mark()

All callers have a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to set_dentry_mark()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:10 +0000 (18:03 +0100)]
f2fs: Pass a folio to set_dentry_mark()

All callers have a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to is_recoverable_dnode()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:09 +0000 (18:03 +0100)]
f2fs: Pass a folio to is_recoverable_dnode()

All callers have a folio so pass it in.  Also make the argument const
as the function does not modify it.  Removes a call to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to nid_of_node()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:08 +0000 (18:03 +0100)]
f2fs: Pass a folio to nid_of_node()

All callers have a folio so pass it in.  Also make the argument const
as the function does not modify it.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to ino_of_node()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:07 +0000 (18:03 +0100)]
f2fs: Pass a folio to ino_of_node()

All callers have a folio so pass it in.  Also make the argument const
as the function does not modify it.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to F2FS_INODE()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:06 +0000 (18:03 +0100)]
f2fs: Pass a folio to F2FS_INODE()

All callers now have a folio, so pass it in.  Also make it const as
F2FS_INODE() does not modify the struct folio passed in (the data it
describes is mutable, but it does not change the contents of the struct).
This may improve code generation.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to inode_has_blocks()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:05 +0000 (18:03 +0100)]
f2fs: Pass a folio to inode_has_blocks()

The only caller has a folio, so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to f2fs_sanity_check_inline_data()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:04 +0000 (18:03 +0100)]
f2fs: Pass a folio to f2fs_sanity_check_inline_data()

The only caller has a folio, so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to sanity_check_inode()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:03 +0000 (18:03 +0100)]
f2fs: Pass a folio to sanity_check_inode()

The only caller has a folio, so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to sanity_check_extent_cache()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:02 +0000 (18:03 +0100)]
f2fs: Pass a folio to sanity_check_extent_cache()

The only caller has a folio, so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to f2fs_recover_inode_page()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:01 +0000 (18:03 +0100)]
f2fs: Pass a folio to f2fs_recover_inode_page()

The only caller has a folio, so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to recover_quota_data()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:03:00 +0000 (18:03 +0100)]
f2fs: Pass a folio to recover_quota_data()

The only caller has a folio, so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to recover_inode()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:02:59 +0000 (18:02 +0100)]
f2fs: Pass a folio to recover_inode()

The only caller has a folio, so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
3 weeks agof2fs: Pass a folio to recover_dentry()
Matthew Wilcox (Oracle) [Tue, 8 Jul 2025 17:02:58 +0000 (18:02 +0100)]
f2fs: Pass a folio to recover_dentry()

The only caller has a folio, so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
5 weeks agof2fs: introduce is_cur{seg,sec}()
Chao Yu [Mon, 7 Jul 2025 11:46:14 +0000 (19:46 +0800)]
f2fs: introduce is_cur{seg,sec}()

There are redundant codes in IS_CUR{SEG,SEC}() macros, let's introduce
inline is_cur{seg,sec}() functions, and use a loop in it for cleanup.

Meanwhile, it enhances expansibility, as it doesn't need to change
is_cur{seg,sec}() when we add a new log header.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
5 weeks agof2fs: fix to avoid panic in f2fs_evict_inode
Chao Yu [Tue, 8 Jul 2025 09:56:57 +0000 (17:56 +0800)]
f2fs: fix to avoid panic in f2fs_evict_inode

As syzbot [1] reported as below:

R10: 0000000000000100 R11: 0000000000000206 R12: 00007ffe17473450
R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520
 </TASK>
---[ end trace 0000000000000000 ]---
==================================================================
BUG: KASAN: use-after-free in __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
Read of size 8 at addr ffff88812d962278 by task syz-executor/564

CPU: 1 PID: 564 Comm: syz-executor Tainted: G        W          6.1.129-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Call Trace:
 <TASK>
 __dump_stack+0x21/0x24 lib/dump_stack.c:88
 dump_stack_lvl+0xee/0x158 lib/dump_stack.c:106
 print_address_description+0x71/0x210 mm/kasan/report.c:316
 print_report+0x4a/0x60 mm/kasan/report.c:427
 kasan_report+0x122/0x150 mm/kasan/report.c:531
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report_generic.c:351
 __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
 __list_del_entry include/linux/list.h:134 [inline]
 list_del_init include/linux/list.h:206 [inline]
 f2fs_inode_synced+0xf7/0x2e0 fs/f2fs/super.c:1531
 f2fs_update_inode+0x74/0x1c40 fs/f2fs/inode.c:585
 f2fs_update_inode_page+0x137/0x170 fs/f2fs/inode.c:703
 f2fs_write_inode+0x4ec/0x770 fs/f2fs/inode.c:731
 write_inode fs/fs-writeback.c:1460 [inline]
 __writeback_single_inode+0x4a0/0xab0 fs/fs-writeback.c:1677
 writeback_single_inode+0x221/0x8b0 fs/fs-writeback.c:1733
 sync_inode_metadata+0xb6/0x110 fs/fs-writeback.c:2789
 f2fs_sync_inode_meta+0x16d/0x2a0 fs/f2fs/checkpoint.c:1159
 block_operations fs/f2fs/checkpoint.c:1269 [inline]
 f2fs_write_checkpoint+0xca3/0x2100 fs/f2fs/checkpoint.c:1658
 kill_f2fs_super+0x231/0x390 fs/f2fs/super.c:4668
 deactivate_locked_super+0x98/0x100 fs/super.c:332
 deactivate_super+0xaf/0xe0 fs/super.c:363
 cleanup_mnt+0x45f/0x4e0 fs/namespace.c:1186
 __cleanup_mnt+0x19/0x20 fs/namespace.c:1193
 task_work_run+0x1c6/0x230 kernel/task_work.c:203
 exit_task_work include/linux/task_work.h:39 [inline]
 do_exit+0x9fb/0x2410 kernel/exit.c:871
 do_group_exit+0x210/0x2d0 kernel/exit.c:1021
 __do_sys_exit_group kernel/exit.c:1032 [inline]
 __se_sys_exit_group kernel/exit.c:1030 [inline]
 __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1030
 x64_sys_call+0x7b4/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:232
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2
RIP: 0033:0x7f28b1b8e169
Code: Unable to access opcode bytes at 0x7f28b1b8e13f.
RSP: 002b:00007ffe174710a8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007f28b1c10879 RCX: 00007f28b1b8e169
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
RBP: 0000000000000002 R08: 00007ffe1746ee47 R09: 00007ffe17472360
R10: 0000000000000009 R11: 0000000000000246 R12: 00007ffe17472360
R13: 00007f28b1c10854 R14: 000000000000dae5 R15: 00007ffe17474520
 </TASK>

Allocated by task 569:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_alloc_info+0x25/0x30 mm/kasan/generic.c:505
 __kasan_slab_alloc+0x72/0x80 mm/kasan/common.c:328
 kasan_slab_alloc include/linux/kasan.h:201 [inline]
 slab_post_alloc_hook+0x4f/0x2c0 mm/slab.h:737
 slab_alloc_node mm/slub.c:3398 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc_lru+0x104/0x220 mm/slub.c:3429
 alloc_inode_sb include/linux/fs.h:3245 [inline]
 f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x186/0x880 fs/inode.c:1373
 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483
 f2fs_lookup+0x366/0xab0 fs/f2fs/namei.c:487
 __lookup_slow+0x2a3/0x3d0 fs/namei.c:1690
 lookup_slow+0x57/0x70 fs/namei.c:1707
 walk_component+0x2e6/0x410 fs/namei.c:1998
 lookup_last fs/namei.c:2455 [inline]
 path_lookupat+0x180/0x490 fs/namei.c:2479
 filename_lookup+0x1f0/0x500 fs/namei.c:2508
 vfs_statx+0x10b/0x660 fs/stat.c:229
 vfs_fstatat fs/stat.c:267 [inline]
 vfs_lstat include/linux/fs.h:3424 [inline]
 __do_sys_newlstat fs/stat.c:423 [inline]
 __se_sys_newlstat+0xd5/0x350 fs/stat.c:417
 __x64_sys_newlstat+0x5b/0x70 fs/stat.c:417
 x64_sys_call+0x393/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:7
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

Freed by task 13:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_free_info+0x31/0x50 mm/kasan/generic.c:516
 ____kasan_slab_free+0x132/0x180 mm/kasan/common.c:236
 __kasan_slab_free+0x11/0x20 mm/kasan/common.c:244
 kasan_slab_free include/linux/kasan.h:177 [inline]
 slab_free_hook mm/slub.c:1724 [inline]
 slab_free_freelist_hook+0xc2/0x190 mm/slub.c:1750
 slab_free mm/slub.c:3661 [inline]
 kmem_cache_free+0x12d/0x2a0 mm/slub.c:3683
 f2fs_free_inode+0x24/0x30 fs/f2fs/super.c:1562
 i_callback+0x4c/0x70 fs/inode.c:250
 rcu_do_batch+0x503/0xb80 kernel/rcu/tree.c:2297
 rcu_core+0x5a2/0xe70 kernel/rcu/tree.c:2557
 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2574
 handle_softirqs+0x178/0x500 kernel/softirq.c:578
 run_ksoftirqd+0x28/0x30 kernel/softirq.c:945
 smpboot_thread_fn+0x45a/0x8c0 kernel/smpboot.c:164
 kthread+0x270/0x310 kernel/kthread.c:376
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

Last potentially related work creation:
 kasan_save_stack+0x3a/0x60 mm/kasan/common.c:45
 __kasan_record_aux_stack+0xb6/0xc0 mm/kasan/generic.c:486
 kasan_record_aux_stack_noalloc+0xb/0x10 mm/kasan/generic.c:496
 call_rcu+0xd4/0xf70 kernel/rcu/tree.c:2845
 destroy_inode fs/inode.c:316 [inline]
 evict+0x7da/0x870 fs/inode.c:720
 iput_final fs/inode.c:1834 [inline]
 iput+0x62b/0x830 fs/inode.c:1860
 do_unlinkat+0x356/0x540 fs/namei.c:4397
 __do_sys_unlink fs/namei.c:4438 [inline]
 __se_sys_unlink fs/namei.c:4436 [inline]
 __x64_sys_unlink+0x49/0x50 fs/namei.c:4436
 x64_sys_call+0x958/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:88
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x4c/0xa0 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

The buggy address belongs to the object at ffff88812d961f20
 which belongs to the cache f2fs_inode_cache of size 1200
The buggy address is located 856 bytes inside of
 1200-byte region [ffff88812d961f20ffff88812d9623d0)

The buggy address belongs to the physical page:
page:ffffea0004b65800 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12d960
head:ffffea0004b65800 order:2 compound_mapcount:0 compound_pincount:0
flags: 0x4000000000010200(slab|head|zone=1)
raw: 4000000000010200 0000000000000000 dead000000000122 ffff88810a94c500
raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Reclaimable, gfp_mask 0x1d2050(__GFP_IO|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_RECLAIMABLE), pid 569, tgid 568 (syz.2.16), ts 55943246141, free_ts 0
 set_page_owner include/linux/page_owner.h:31 [inline]
 post_alloc_hook+0x1d0/0x1f0 mm/page_alloc.c:2532
 prep_new_page mm/page_alloc.c:2539 [inline]
 get_page_from_freelist+0x2e63/0x2ef0 mm/page_alloc.c:4328
 __alloc_pages+0x235/0x4b0 mm/page_alloc.c:5605
 alloc_slab_page include/linux/gfp.h:-1 [inline]
 allocate_slab mm/slub.c:1939 [inline]
 new_slab+0xec/0x4b0 mm/slub.c:1992
 ___slab_alloc+0x6f6/0xb50 mm/slub.c:3180
 __slab_alloc+0x5e/0xa0 mm/slub.c:3279
 slab_alloc_node mm/slub.c:3364 [inline]
 slab_alloc mm/slub.c:3406 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3413 [inline]
 kmem_cache_alloc_lru+0x13f/0x220 mm/slub.c:3429
 alloc_inode_sb include/linux/fs.h:3245 [inline]
 f2fs_alloc_inode+0x2d/0x340 fs/f2fs/super.c:1419
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x186/0x880 fs/inode.c:1373
 f2fs_iget+0x55/0x4c60 fs/f2fs/inode.c:483
 f2fs_fill_super+0x3ad7/0x6bb0 fs/f2fs/super.c:4293
 mount_bdev+0x2ae/0x3e0 fs/super.c:1443
 f2fs_mount+0x34/0x40 fs/f2fs/super.c:4642
 legacy_get_tree+0xea/0x190 fs/fs_context.c:632
 vfs_get_tree+0x89/0x260 fs/super.c:1573
 do_new_mount+0x25a/0xa20 fs/namespace.c:3056
page_owner free stack trace missing

Memory state around the buggy address:
 ffff88812d962100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88812d962200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                                                ^
 ffff88812d962280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812d962300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

[1] https://syzkaller.appspot.com/x/report.txt?x=13448368580000

This bug can be reproduced w/ the reproducer [2], once we enable
CONFIG_F2FS_CHECK_FS config, the reproducer will trigger panic as below,
so the direct reason of this bug is the same as the one below patch [3]
fixed.

kernel BUG at fs/f2fs/inode.c:857!
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20
Call Trace:
 <TASK>
 evict+0x32a/0x7a0
 do_unlinkat+0x37b/0x5b0
 __x64_sys_unlink+0xad/0x100
 do_syscall_64+0x5a/0xb0
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0010:f2fs_evict_inode+0x1204/0x1a20

[2] https://syzkaller.appspot.com/x/repro.c?x=17495ccc580000
[3] https://lore.kernel.org/linux-f2fs-devel/20250702120321.1080759-1-chao@kernel.org

Tracepoints before panic:

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file1
f2fs_unlink_exit: dev = (7,0), ino = 7, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 7, pino = 3, i_mode = 0x81ed, i_size = 10, i_nlink = 0, i_blocks = 0, i_advise = 0x0
f2fs_truncate_node: dev = (7,0), ino = 7, nid = 8, block_address = 0x3c05

f2fs_unlink_enter: dev = (7,0), dir ino = 3, i_size = 4096, i_blocks = 8, name = file3
f2fs_unlink_exit: dev = (7,0), ino = 8, ret = 0
f2fs_evict_inode: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 9000, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate: dev = (7,0), ino = 8, pino = 3, i_mode = 0x81ed, i_size = 0, i_nlink = 0, i_blocks = 24, i_advise = 0x4
f2fs_truncate_blocks_enter: dev = (7,0), ino = 8, i_size = 0, i_blocks = 24, start file offset = 0
f2fs_truncate_blocks_exit: dev = (7,0), ino = 8, ret = -2

The root cause is: in the fuzzed image, dnode #8 belongs to inode #7,
after inode #7 eviction, dnode #8 was dropped.

However there is dirent that has ino #8, so, once we unlink file3, in
f2fs_evict_inode(), both f2fs_truncate() and f2fs_update_inode_page()
will fail due to we can not load node #8, result in we missed to call
f2fs_inode_synced() to clear inode dirty status.

Let's fix this by calling f2fs_inode_synced() in error path of
f2fs_evict_inode().

PS: As I verified, the reproducer [2] can trigger this bug in v6.1.129,
but it failed in v6.16-rc4, this is because the testcase will stop due to
other corruption has been detected by f2fs:

F2FS-fs (loop0): inconsistent node block, node_type:2, nid:8, node_footer[nid:8,ino:8,ofs:0,cpver:5013063228981249506,blkaddr:15366]
F2FS-fs (loop0): f2fs_lookup: inode (ino=9) has zero i_nlink

Fixes: 0f18b462b2e5 ("f2fs: flush inode metadata when checkpoint is doing")
Closes: https://syzkaller.appspot.com/x/report.txt?x=13448368580000
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
5 weeks agof2fs: fix to avoid UAF in f2fs_sync_inode_meta()
Chao Yu [Tue, 8 Jul 2025 09:53:39 +0000 (17:53 +0800)]
f2fs: fix to avoid UAF in f2fs_sync_inode_meta()

syzbot reported an UAF issue as below: [1] [2]

[1] https://syzkaller.appspot.com/text?tag=CrashReport&x=16594c60580000

==================================================================
BUG: KASAN: use-after-free in __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
Read of size 8 at addr ffff888100567dc8 by task kworker/u4:0/8

CPU: 1 PID: 8 Comm: kworker/u4:0 Tainted: G        W          6.1.129-syzkaller-00017-g642656a36791 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Workqueue: writeback wb_workfn (flush-7:0)
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x151/0x1b7 lib/dump_stack.c:106
 print_address_description mm/kasan/report.c:316 [inline]
 print_report+0x158/0x4e0 mm/kasan/report.c:427
 kasan_report+0x13c/0x170 mm/kasan/report.c:531
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report_generic.c:351
 __list_del_entry_valid+0xa6/0x130 lib/list_debug.c:62
 __list_del_entry include/linux/list.h:134 [inline]
 list_del_init include/linux/list.h:206 [inline]
 f2fs_inode_synced+0x100/0x2e0 fs/f2fs/super.c:1553
 f2fs_update_inode+0x72/0x1c40 fs/f2fs/inode.c:588
 f2fs_update_inode_page+0x135/0x170 fs/f2fs/inode.c:706
 f2fs_write_inode+0x416/0x790 fs/f2fs/inode.c:734
 write_inode fs/fs-writeback.c:1460 [inline]
 __writeback_single_inode+0x4cf/0xb80 fs/fs-writeback.c:1677
 writeback_sb_inodes+0xb32/0x1910 fs/fs-writeback.c:1903
 __writeback_inodes_wb+0x118/0x3f0 fs/fs-writeback.c:1974
 wb_writeback+0x3da/0xa00 fs/fs-writeback.c:2081
 wb_check_background_flush fs/fs-writeback.c:2151 [inline]
 wb_do_writeback fs/fs-writeback.c:2239 [inline]
 wb_workfn+0xbba/0x1030 fs/fs-writeback.c:2266
 process_one_work+0x73d/0xcb0 kernel/workqueue.c:2299
 worker_thread+0xa60/0x1260 kernel/workqueue.c:2446
 kthread+0x26d/0x300 kernel/kthread.c:386
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
 </TASK>

Allocated by task 298:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_alloc_info+0x1f/0x30 mm/kasan/generic.c:505
 __kasan_slab_alloc+0x6c/0x80 mm/kasan/common.c:333
 kasan_slab_alloc include/linux/kasan.h:202 [inline]
 slab_post_alloc_hook+0x53/0x2c0 mm/slab.h:768
 slab_alloc_node mm/slub.c:3421 [inline]
 slab_alloc mm/slub.c:3431 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3438 [inline]
 kmem_cache_alloc_lru+0x102/0x270 mm/slub.c:3454
 alloc_inode_sb include/linux/fs.h:3255 [inline]
 f2fs_alloc_inode+0x2d/0x350 fs/f2fs/super.c:1437
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x18c/0x7e0 fs/inode.c:1373
 f2fs_iget+0x55/0x4ca0 fs/f2fs/inode.c:486
 f2fs_lookup+0x3c1/0xb50 fs/f2fs/namei.c:484
 __lookup_slow+0x2b9/0x3e0 fs/namei.c:1689
 lookup_slow+0x5a/0x80 fs/namei.c:1706
 walk_component+0x2e7/0x410 fs/namei.c:1997
 lookup_last fs/namei.c:2454 [inline]
 path_lookupat+0x16d/0x450 fs/namei.c:2478
 filename_lookup+0x251/0x600 fs/namei.c:2507
 vfs_statx+0x107/0x4b0 fs/stat.c:229
 vfs_fstatat fs/stat.c:267 [inline]
 vfs_lstat include/linux/fs.h:3434 [inline]
 __do_sys_newlstat fs/stat.c:423 [inline]
 __se_sys_newlstat+0xda/0x7c0 fs/stat.c:417
 __x64_sys_newlstat+0x5b/0x70 fs/stat.c:417
 x64_sys_call+0x52/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:7
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x3b/0x80 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

Freed by task 0:
 kasan_save_stack mm/kasan/common.c:45 [inline]
 kasan_set_track+0x4b/0x70 mm/kasan/common.c:52
 kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:516
 ____kasan_slab_free+0x131/0x180 mm/kasan/common.c:241
 __kasan_slab_free+0x11/0x20 mm/kasan/common.c:249
 kasan_slab_free include/linux/kasan.h:178 [inline]
 slab_free_hook mm/slub.c:1745 [inline]
 slab_free_freelist_hook mm/slub.c:1771 [inline]
 slab_free mm/slub.c:3686 [inline]
 kmem_cache_free+0x291/0x560 mm/slub.c:3711
 f2fs_free_inode+0x24/0x30 fs/f2fs/super.c:1584
 i_callback+0x4b/0x70 fs/inode.c:250
 rcu_do_batch+0x552/0xbe0 kernel/rcu/tree.c:2297
 rcu_core+0x502/0xf40 kernel/rcu/tree.c:2557
 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2574
 handle_softirqs+0x1db/0x650 kernel/softirq.c:624
 __do_softirq kernel/softirq.c:662 [inline]
 invoke_softirq kernel/softirq.c:479 [inline]
 __irq_exit_rcu+0x52/0xf0 kernel/softirq.c:711
 irq_exit_rcu+0x9/0x10 kernel/softirq.c:723
 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1118 [inline]
 sysvec_apic_timer_interrupt+0xa9/0xc0 arch/x86/kernel/apic/apic.c:1118
 asm_sysvec_apic_timer_interrupt+0x1b/0x20 arch/x86/include/asm/idtentry.h:691

Last potentially related work creation:
 kasan_save_stack+0x3b/0x60 mm/kasan/common.c:45
 __kasan_record_aux_stack+0xb4/0xc0 mm/kasan/generic.c:486
 kasan_record_aux_stack_noalloc+0xb/0x10 mm/kasan/generic.c:496
 __call_rcu_common kernel/rcu/tree.c:2807 [inline]
 call_rcu+0xdc/0x10f0 kernel/rcu/tree.c:2926
 destroy_inode fs/inode.c:316 [inline]
 evict+0x87d/0x930 fs/inode.c:720
 iput_final fs/inode.c:1834 [inline]
 iput+0x616/0x690 fs/inode.c:1860
 do_unlinkat+0x4e1/0x920 fs/namei.c:4396
 __do_sys_unlink fs/namei.c:4437 [inline]
 __se_sys_unlink fs/namei.c:4435 [inline]
 __x64_sys_unlink+0x49/0x50 fs/namei.c:4435
 x64_sys_call+0x289/0x9a0 arch/x86/include/generated/asm/syscalls_64.h:88
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x3b/0x80 arch/x86/entry/common.c:81
 entry_SYSCALL_64_after_hwframe+0x68/0xd2

The buggy address belongs to the object at ffff888100567a10
 which belongs to the cache f2fs_inode_cache of size 1360
The buggy address is located 952 bytes inside of
 1360-byte region [ffff888100567a10ffff888100567f60)

The buggy address belongs to the physical page:
page:ffffea0004015800 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x100560
head:ffffea0004015800 order:3 compound_mapcount:0 compound_pincount:0
flags: 0x4000000000010200(slab|head|zone=1)
raw: 4000000000010200 0000000000000000 dead000000000122 ffff8881002c4d80
raw: 0000000000000000 0000000080160016 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Reclaimable, gfp_mask 0xd2050(__GFP_IO|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_RECLAIMABLE), pid 298, tgid 298 (syz-executor330), ts 26489303743, free_ts 0
 set_page_owner include/linux/page_owner.h:33 [inline]
 post_alloc_hook+0x213/0x220 mm/page_alloc.c:2637
 prep_new_page+0x1b/0x110 mm/page_alloc.c:2644
 get_page_from_freelist+0x3a98/0x3b10 mm/page_alloc.c:4539
 __alloc_pages+0x234/0x610 mm/page_alloc.c:5837
 alloc_slab_page+0x6c/0xf0 include/linux/gfp.h:-1
 allocate_slab mm/slub.c:1962 [inline]
 new_slab+0x90/0x3e0 mm/slub.c:2015
 ___slab_alloc+0x6f9/0xb80 mm/slub.c:3203
 __slab_alloc+0x5d/0xa0 mm/slub.c:3302
 slab_alloc_node mm/slub.c:3387 [inline]
 slab_alloc mm/slub.c:3431 [inline]
 __kmem_cache_alloc_lru mm/slub.c:3438 [inline]
 kmem_cache_alloc_lru+0x149/0x270 mm/slub.c:3454
 alloc_inode_sb include/linux/fs.h:3255 [inline]
 f2fs_alloc_inode+0x2d/0x350 fs/f2fs/super.c:1437
 alloc_inode fs/inode.c:261 [inline]
 iget_locked+0x18c/0x7e0 fs/inode.c:1373
 f2fs_iget+0x55/0x4ca0 fs/f2fs/inode.c:486
 f2fs_fill_super+0x5360/0x6dc0 fs/f2fs/super.c:4488
 mount_bdev+0x282/0x3b0 fs/super.c:1445
 f2fs_mount+0x34/0x40 fs/f2fs/super.c:4743
 legacy_get_tree+0xf1/0x190 fs/fs_context.c:632
page_owner free stack trace missing

Memory state around the buggy address:
 ffff888100567c80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888100567d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff888100567d80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                              ^
 ffff888100567e00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888100567e80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

[2] https://syzkaller.appspot.com/text?tag=CrashLog&x=13654c60580000

[   24.675720][   T28] audit: type=1400 audit(1745327318.732:72): avc:  denied  { write } for  pid=298 comm="syz-executor399" name="/" dev="loop0" ino=3 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=dir permissive=1
[   24.705426][  T296] ------------[ cut here ]------------
[   24.706608][   T28] audit: type=1400 audit(1745327318.732:73): avc:  denied  { remove_name } for  pid=298 comm="syz-executor399" name="file0" dev="loop0" ino=4 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=dir permissive=1
[   24.711550][  T296] WARNING: CPU: 0 PID: 296 at fs/f2fs/inode.c:847 f2fs_evict_inode+0x1262/0x1540
[   24.734141][   T28] audit: type=1400 audit(1745327318.732:74): avc:  denied  { rename } for  pid=298 comm="syz-executor399" name="file0" dev="loop0" ino=4 scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=dir permissive=1
[   24.742969][  T296] Modules linked in:
[   24.765201][   T28] audit: type=1400 audit(1745327318.732:75): avc:  denied  { add_name } for  pid=298 comm="syz-executor399" name="bus" scontext=root:sysadm_r:sysadm_t tcontext=system_u:object_r:unlabeled_t tclass=dir permissive=1
[   24.768847][  T296] CPU: 0 PID: 296 Comm: syz-executor399 Not tainted 6.1.129-syzkaller-00017-g642656a36791 #0
[   24.799506][  T296] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
[   24.809401][  T296] RIP: 0010:f2fs_evict_inode+0x1262/0x1540
[   24.815018][  T296] Code: 34 70 4a ff eb 0d e8 2d 70 4a ff 4d 89 e5 4c 8b 64 24 18 48 8b 5c 24 28 4c 89 e7 e8 78 38 03 00 e9 84 fc ff ff e8 0e 70 4a ff <0f> 0b 4c 89 f7 be 08 00 00 00 e8 7f 21 92 ff f0 41 80 0e 04 e9 61
[   24.834584][  T296] RSP: 0018:ffffc90000db7a40 EFLAGS: 00010293
[   24.840465][  T296] RAX: ffffffff822aca42 RBX: 0000000000000002 RCX: ffff888110948000
[   24.848291][  T296] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000
[   24.856064][  T296] RBP: ffffc90000db7bb0 R08: ffffffff822ac6a8 R09: ffffed10200b005d
[   24.864073][  T296] R10: 0000000000000000 R11: dffffc0000000001 R12: ffff888100580000
[   24.871812][  T296] R13: dffffc0000000000 R14: ffff88810fef4078 R15: 1ffff920001b6f5c

The root cause is w/ a fuzzed image, f2fs may missed to clear FI_DIRTY_INODE
flag for target inode, after f2fs_evict_inode(), the inode is still linked in
sbi->inode_list[DIRTY_META] global list, once it triggers checkpoint,
f2fs_sync_inode_meta() may access the released inode.

In f2fs_evict_inode(), let's always call f2fs_inode_synced() to clear
FI_DIRTY_INODE flag and drop inode from global dirty list to avoid this
UAF issue.

Fixes: 0f18b462b2e5 ("f2fs: flush inode metadata when checkpoint is doing")
Closes: https://syzkaller.appspot.com/bug?extid=849174b2efaf0d8be6ba
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
5 weeks agof2fs: doc: fix wrong quota mount option description
Chao Yu [Wed, 2 Jul 2025 06:49:25 +0000 (14:49 +0800)]
f2fs: doc: fix wrong quota mount option description

We should use "{usr,grp,prj}jquota=" to disable journaled quota,
rather than using off{usr,grp,prj}jquota.

Fixes: 4b2414d04e99 ("f2fs: support journalled quota")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
5 weeks agof2fs: use kfree() instead of kvfree() to free some memory
Jiazi Li [Thu, 3 Jul 2025 06:13:04 +0000 (14:13 +0800)]
f2fs: use kfree() instead of kvfree() to free some memory

options in f2fs_fill_super is alloc by kstrdup:
options = kstrdup((const char *)data, GFP_KERNEL)
sit_bitmap[_mir], nat_bitmap[_mir] are alloc by kmemdup:
sit_i->sit_bitmap = kmemdup(src_bitmap, sit_bitmap_size, GFP_KERNEL);
sit_i->sit_bitmap_mir = kmemdup(src_bitmap,
sit_bitmap_size, GFP_KERNEL);
nm_i->nat_bitmap = kmemdup(version_bitmap, nm_i->bitmap_size,
GFP_KERNEL);
nm_i->nat_bitmap_mir = kmemdup(version_bitmap, nm_i->bitmap_size,
GFP_KERNEL);
write_io is alloc by f2fs_kmalloc:
sbi->write_io[i] = f2fs_kmalloc(sbi,
array_size(n, sizeof(struct f2fs_bio_info))

Use kfree is more efficient.

Signed-off-by: Jiazi Li <jqqlijiazi@gmail.com>
Signed-off-by: peixuan.qiu <peixuan.qiu@transsion.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
6 weeks agof2fs: fix to use f2fs_is_valid_blkaddr_raw() in do_write_page()
Chao Yu [Tue, 1 Jul 2025 09:26:10 +0000 (17:26 +0800)]
f2fs: fix to use f2fs_is_valid_blkaddr_raw() in do_write_page()

As syzbot reported as below:

F2FS-fs (loop9): inject invalid blkaddr in f2fs_is_valid_blkaddr of do_write_page+0x277/0xb10 fs/f2fs/segment.c:3956
------------[ cut here ]------------
kernel BUG at fs/f2fs/segment.c:3957!
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 10538 Comm: syz-executor Not tainted 6.16.0-rc3-next-20250627-syzkaller #0 PREEMPT(full)
Call Trace:
 <TASK>
 f2fs_outplace_write_data+0x11a/0x220 fs/f2fs/segment.c:4017
 f2fs_do_write_data_page+0x12ea/0x1a40 fs/f2fs/data.c:2752
 f2fs_write_single_data_page+0xa68/0x1680 fs/f2fs/data.c:2851
 f2fs_write_cache_pages fs/f2fs/data.c:3133 [inline]
 __f2fs_write_data_pages fs/f2fs/data.c:3282 [inline]
 f2fs_write_data_pages+0x195b/0x3000 fs/f2fs/data.c:3309
 do_writepages+0x32b/0x550 mm/page-writeback.c:2636
 filemap_fdatawrite_wbc mm/filemap.c:386 [inline]
 __filemap_fdatawrite_range mm/filemap.c:419 [inline]
 __filemap_fdatawrite mm/filemap.c:425 [inline]
 filemap_fdatawrite+0x199/0x240 mm/filemap.c:430
 f2fs_sync_dirty_inodes+0x31f/0x830 fs/f2fs/checkpoint.c:1108
 block_operations fs/f2fs/checkpoint.c:1247 [inline]
 f2fs_write_checkpoint+0x95a/0x1df0 fs/f2fs/checkpoint.c:1638
 kill_f2fs_super+0x2c3/0x6c0 fs/f2fs/super.c:5081
 deactivate_locked_super+0xb9/0x130 fs/super.c:474
 cleanup_mnt+0x425/0x4c0 fs/namespace.c:1417
 task_work_run+0x1d4/0x260 kernel/task_work.c:227
 resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
 exit_to_user_mode_loop+0xec/0x110 kernel/entry/common.c:114
 exit_to_user_mode_prepare include/linux/entry-common.h:330 [inline]
 syscall_exit_to_user_mode_work include/linux/entry-common.h:414 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:449 [inline]
 do_syscall_64+0x2bd/0x3b0 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

If we inject block address fault, it may trigger kernel panic, we need
to use f2fs_is_valid_blkaddr_raw() instead of f2fs_is_valid_blkaddr()
in do_write_page() to avoid such issue.

Fixes: 70b6e8500431 ("f2fs: do sanity check on fio.new_blkaddr in do_write_page()")
Reported-by: syzbot+9201a61c060513d4be38@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/68639520.a70a0220.3b7e22.17e6.GAE@google.com
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
6 weeks agof2fs: avoid splitting bio when reading multiple pages
Jianan Huang [Mon, 30 Jun 2025 12:57:53 +0000 (20:57 +0800)]
f2fs: avoid splitting bio when reading multiple pages

When fewer pages are read, nr_pages may be smaller than nr_cpages. Due
to the nr_vecs limit, the compressed pages will be split into multiple
bios and then merged at the block level. In this case, nr_cpages should
be used to pre-allocate bvecs.
To handle this case, align max_nr_pages to cluster_size, which should be
enough for all compressed pages.

Signed-off-by: Jianan Huang <huangjianan@xiaomi.com>
Signed-off-by: Sheng Yong <shengyong1@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
6 weeks agof2fs: check the generic conditions first
Jaegeuk Kim [Mon, 30 Jun 2025 16:06:09 +0000 (16:06 +0000)]
f2fs: check the generic conditions first

Let's return errors caught by the generic checks. This fixes generic/494 where
it expects to see EBUSY by setattr_prepare instead of EINVAL by f2fs for active
swapfile.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
6 weeks agof2fs: don't allow unaligned truncation to smaller/equal size on pinned file
wangzijie [Tue, 24 Jun 2025 03:59:38 +0000 (11:59 +0800)]
f2fs: don't allow unaligned truncation to smaller/equal size on pinned file

To prevent scattered pin block generation, don't allow non-section aligned truncation
to smaller or equal size on pinned file. But for truncation to larger size, after
commit 3fdd89b452c2("f2fs: prevent writing without fallocate() for pinned files"),
we only support overwrite IO to pinned file, so we don't need to consider
attr->ia_size > i_size case.

Signed-off-by: wangzijie <wangzijie1@honor.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
6 weeks agof2fs: fix to check upper boundary for gc_no_zoned_gc_percent
Chao Yu [Fri, 27 Jun 2025 02:38:18 +0000 (10:38 +0800)]
f2fs: fix to check upper boundary for gc_no_zoned_gc_percent

This patch adds missing upper boundary check while setting
gc_no_zoned_gc_percent via sysfs.

Fixes: 9a481a1c16f4 ("f2fs: create gc_no_zoned_gc_percent and gc_boost_zoned_gc_percent")
Cc: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
6 weeks agof2fs: fix to check upper boundary for gc_valid_thresh_ratio
Chao Yu [Fri, 27 Jun 2025 02:38:17 +0000 (10:38 +0800)]
f2fs: fix to check upper boundary for gc_valid_thresh_ratio

This patch adds missing upper boundary check while setting
gc_valid_thresh_ratio via sysfs.

Fixes: e791d00bd06c ("f2fs: add valid block ratio not to do excessive GC for one time GC")
Cc: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
6 weeks agof2fs: account and print more stats during recovery
Chao Yu [Fri, 27 Jun 2025 02:59:43 +0000 (10:59 +0800)]
f2fs: account and print more stats during recovery

F2FS-fs (vdc): f2fs_recover_fsync_data: recovery fsync data, check_only: 0
F2FS-fs (vdc): do_recover_data: start to recover dnode
F2FS-fs (vdc): recover_inode: ino = 5, name = testfile.t2, inline = 21
F2FS-fs (vdc): recover_data: ino = 5, nid = 5 (i_size: recover), range (0, 864), recovered = 1, err = 0
F2FS-fs (vdc): do_recover_data: dnode: (recoverable: 256, fsynced: 256, total: 256), recovered: (inode: 256, dentry: 1, dnode: 256), err: 0

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
6 weeks agof2fs: enable tuning of boost_zoned_gc_percent via sysfs
yohan.joung [Wed, 25 Jun 2025 00:13:35 +0000 (09:13 +0900)]
f2fs: enable tuning of boost_zoned_gc_percent via sysfs

to allow users to dynamically tune
the boost_zoned_gc_percent parameter

Signed-off-by: yohan.joung <yohan.joung@sk.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
6 weeks agof2fs: fix to check upper boundary for value of gc_boost_zoned_gc_percent
yohan.joung [Wed, 25 Jun 2025 00:14:07 +0000 (09:14 +0900)]
f2fs: fix to check upper boundary for value of gc_boost_zoned_gc_percent

to check the upper boundary when setting gc_boost_zoned_gc_percent

Fixes: 9a481a1c16f4 ("f2fs: create gc_no_zoned_gc_percent and gc_boost_zoned_gc_percent")
Signed-off-by: yohan.joung <yohan.joung@sk.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
6 weeks agof2fs: fix KMSAN uninit-value in extent_info usage
Abinash Singh [Wed, 25 Jun 2025 11:05:37 +0000 (16:35 +0530)]
f2fs: fix KMSAN uninit-value in extent_info usage

KMSAN reported a use of uninitialized value in `__is_extent_mergeable()`
 and `__is_back_mergeable()` via the read extent tree path.

The root cause is that `get_read_extent_info()` only initializes three
fields (`fofs`, `blk`, `len`) of `struct extent_info`, leaving the
remaining fields uninitialized. This leads to undefined behavior
when those fields are accessed later, especially during
extent merging.

Fix it by zero-initializing the `extent_info` struct before population.

Reported-by: syzbot+b8c1d60e95df65e827d4@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b8c1d60e95df65e827d4
Fixes: 94afd6d6e525 ("f2fs: extent cache: support unaligned extent")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Abinash Singh <abinashsinghlalotra@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
7 weeks agof2fs: convert F2FS_I_SB to sbi in f2fs_setattr()
wangzijie [Tue, 24 Jun 2025 03:59:37 +0000 (11:59 +0800)]
f2fs: convert F2FS_I_SB to sbi in f2fs_setattr()

Introduce sbi in f2fs_setattr() and convert F2FS_I_SB to it. No logic
change, just cleanup and prepare to get CAP_BLKS_PER_SEC(sbi).

Signed-off-by: wangzijie <wangzijie1@honor.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
7 weeks agof2fs: Fix the typos in comments
Swarna Prabhu [Tue, 17 Jun 2025 17:40:47 +0000 (17:40 +0000)]
f2fs: Fix the typos in comments

This patch fixes minor typos in comments in f2fs.

Signed-off-by: Swarna Prabhu <s.prabhu@samsung.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
7 weeks agof2fs: compress: fix UAF of f2fs_inode_info in f2fs_free_dic
Zhiguo Niu [Fri, 13 Jun 2025 01:50:45 +0000 (09:50 +0800)]
f2fs: compress: fix UAF of f2fs_inode_info in f2fs_free_dic

The decompress_io_ctx may be released asynchronously after
I/O completion. If this file is deleted immediately after read,
and the kworker of processing post_read_wq has not been executed yet
due to high workloads, It is possible that the inode(f2fs_inode_info)
is evicted and freed before it is used f2fs_free_dic.

    The UAF case as below:
    Thread A                                      Thread B
    - f2fs_decompress_end_io
     - f2fs_put_dic
      - queue_work
        add free_dic work to post_read_wq
                                                   - do_unlink
                                                    - iput
                                                     - evict
                                                      - call_rcu
    This file is deleted after read.

    Thread C                                 kworker to process post_read_wq
    - rcu_do_batch
     - f2fs_free_inode
      - kmem_cache_free
     inode is freed by rcu
                                             - process_scheduled_works
                                              - f2fs_late_free_dic
                                               - f2fs_free_dic
                                                - f2fs_release_decomp_mem
                                      read (dic->inode)->i_compress_algorithm

This patch store compress_algorithm and sbi in dic to avoid inode UAF.

In addition, the previous solution is deprecated in [1] may cause system hang.
[1] https://lore.kernel.org/all/c36ab955-c8db-4a8b-a9d0-f07b5f426c3f@kernel.org

Cc: Daeho Jeong <daehojeong@google.com>
Fixes: bff139b49d9f ("f2fs: handle decompress only post processing in softirq")
Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Baocong Liu <baocong.liu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
7 weeks agof2fs: compress: change the first parameter of page_array_{alloc,free} to sbi
Zhiguo Niu [Fri, 13 Jun 2025 01:50:44 +0000 (09:50 +0800)]
f2fs: compress: change the first parameter of page_array_{alloc,free} to sbi

No logic changes, just cleanup and prepare for fixing the UAF issue
in f2fs_free_dic.

Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Baocong Liu <baocong.liu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
7 weeks agof2fs: introduce reserved_pin_section sysfs entry
Chao Yu [Fri, 13 Jun 2025 05:51:09 +0000 (13:51 +0800)]
f2fs: introduce reserved_pin_section sysfs entry

This patch introduces /sys/fs/f2fs/<dev>/reserved_pin_section for tuning
@needed parameter of has_not_enough_free_secs(), if we configure it w/
zero, it can avoid f2fs_gc() as much as possible while fallocating on
pinned file.

Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: wangzijie <wangzijie1@honor.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
7 weeks agof2fs: fix to avoid invalid wait context issue
Chao Yu [Wed, 11 Jun 2025 08:42:18 +0000 (16:42 +0800)]
f2fs: fix to avoid invalid wait context issue

=============================
[ BUG: Invalid wait context ]
6.13.0-rc1 #84 Tainted: G           O
-----------------------------
cat/56160 is trying to lock:
ffff888105c86648 (&cprc->stat_lock){+.+.}-{3:3}, at: update_general_status+0x32a/0x8c0 [f2fs]
other info that might help us debug this:
context-{5:5}
2 locks held by cat/56160:
 #0: ffff88810a002a98 (&p->lock){+.+.}-{4:4}, at: seq_read_iter+0x56/0x4c0
 #1: ffffffffa0462638 (f2fs_stat_lock){....}-{2:2}, at: stat_show+0x29/0x1020 [f2fs]
stack backtrace:
CPU: 0 UID: 0 PID: 56160 Comm: cat Tainted: G           O       6.13.0-rc1 #84
Tainted: [O]=OOT_MODULE
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Call Trace:
 <TASK>
 dump_stack_lvl+0x88/0xd0
 dump_stack+0x14/0x20
 __lock_acquire+0x8d4/0xbb0
 lock_acquire+0xd6/0x300
 _raw_spin_lock+0x38/0x50
 update_general_status+0x32a/0x8c0 [f2fs]
 stat_show+0x50/0x1020 [f2fs]
 seq_read_iter+0x116/0x4c0
 seq_read+0xfa/0x130
 full_proxy_read+0x66/0x90
 vfs_read+0xc4/0x350
 ksys_read+0x74/0xf0
 __x64_sys_read+0x1d/0x20
 x64_sys_call+0x17d9/0x1b80
 do_syscall_64+0x68/0x130
 entry_SYSCALL_64_after_hwframe+0x67/0x6f
RIP: 0033:0x7f2ca53147e2

- seq_read
 - stat_show
  - raw_spin_lock_irqsave(&f2fs_stat_lock, flags)
  : f2fs_stat_lock is raw_spinlock_t type variable
  - update_general_status
   - spin_lock(&sbi->cprc_info.stat_lock);
   : stat_lock is spinlock_t type variable

The root cause is the lock order is incorrect [1], we should not acquire
spinlock_t lock after raw_spinlock_t lock, as if CONFIG_PREEMPT_LOCK is
on, spinlock_t is implemented based on rtmutex, which can sleep after
holding the lock.

To fix this issue, let's use change f2fs_stat_lock lock type from
raw_spinlock_t to spinlock_t, it's safe due to:
- we don't need to use raw version of spinlock as the path is not
performance sensitive.
- we don't need to use irqsave version of spinlock as it won't be
used in irq context.

Quoted from [1]:

"Extend lockdep to validate lock wait-type context.

The current wait-types are:

LD_WAIT_FREE, /* wait free, rcu etc.. */
LD_WAIT_SPIN, /* spin loops, raw_spinlock_t etc.. */
LD_WAIT_CONFIG, /* CONFIG_PREEMPT_LOCK, spinlock_t etc.. */
LD_WAIT_SLEEP, /* sleeping locks, mutex_t etc.. */

Where lockdep validates that the current lock (the one being acquired)
fits in the current wait-context (as generated by the held stack).

This ensures that there is no attempt to acquire mutexes while holding
spinlocks, to acquire spinlocks while holding raw_spinlocks and so on. In
other words, its a more fancy might_sleep()."

[1] https://lore.kernel.org/all/20200321113242.427089655@linutronix.de

Fixes: 98237fcda4a2 ("f2fs: use spin_lock to avoid hang")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
7 weeks agof2fs: fix bio memleak when committing super block
Sheng Yong [Sat, 7 Jun 2025 06:41:16 +0000 (14:41 +0800)]
f2fs: fix bio memleak when committing super block

When committing new super block, bio is allocated but not freed, and
kmemleak complains:

  unreferenced object 0xffff88801d185600 (size 192):
    comm "kworker/3:2", pid 128, jiffies 4298624992
    hex dump (first 32 bytes):
      00 00 00 00 00 00 00 00 80 67 c3 00 81 88 ff ff  .........g......
      01 08 06 00 00 00 00 00 00 00 00 00 01 00 00 00  ................
    backtrace (crc 650ecdb1):
      kmem_cache_alloc_noprof+0x3a9/0x460
      mempool_alloc_noprof+0x12f/0x310
      bio_alloc_bioset+0x1e2/0x7e0
      __f2fs_commit_super+0xe0/0x370
      f2fs_commit_super+0x4ed/0x8c0
      f2fs_record_error_work+0xc7/0x190
      process_one_work+0x7db/0x1970
      worker_thread+0x518/0xea0
      kthread+0x359/0x690
      ret_from_fork+0x34/0x70
      ret_from_fork_asm+0x1a/0x30

The issue can be reproduced by:

  mount /dev/vda /mnt
  i=0
  while :; do
      echo '[h]abc' > /sys/fs/f2fs/vda/extension_list
      echo '[h]!abc' > /sys/fs/f2fs/vda/extension_list
      echo scan > /sys/kernel/debug/kmemleak
      dmesg | grep "new suspected memory leaks"
      [ $? -eq 0 ] && break
      i=$((i + 1))
      echo "$i"
  done
  umount /mnt

Fixes: 5bcde4557862 ("f2fs: get rid of buffer_head use")
Signed-off-by: Sheng Yong <shengyong1@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
7 weeks agof2fs: do sanity check on fio.new_blkaddr in do_write_page()
Chao Yu [Tue, 10 Jun 2025 03:13:15 +0000 (11:13 +0800)]
f2fs: do sanity check on fio.new_blkaddr in do_write_page()

F2FS-fs (dm-55): access invalid blkaddr:972878540
Call trace:
 dump_backtrace+0xec/0x128
 show_stack+0x18/0x28
 dump_stack_lvl+0x40/0x88
 dump_stack+0x18/0x24
 __f2fs_is_valid_blkaddr+0x360/0x3b4
 f2fs_is_valid_blkaddr+0x10/0x20
 f2fs_get_node_info+0x21c/0x60c
 __write_node_page+0x15c/0x734
 f2fs_sync_node_pages+0x4f8/0x700
 f2fs_write_checkpoint+0x4a8/0x99c
 __checkpoint_and_complete_reqs+0x7c/0x20c
 issue_checkpoint_thread+0x4c/0xd8
 kthread+0x11c/0x1b0
 ret_from_fork+0x10/0x20

If f2fs_allocate_data_block() fails, we may update nat.blkaddr w/
uninitialized fio.new_blkaddr.

- __write_node_folio
 - f2fs_do_write_node_page
  - do_write_page
   - f2fs_allocate_data_block
   : once it fails, it may not allocate new blkaddr
 - set_node_addr
 : update w/ uninitialized fio.new_blkaddr variable

I've checked all error paths in f2fs_allocate_data_block(), it should
be tagged w/ CP_ERROR_FLAG.

In addition, f2fs_allocate_data_block() succeeds, fio.new_blkaddr should
be valid.

Let's add f2fs_bug_on() to check above two conditions to detect any
potential bugs.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
7 weeks agof2fs: handle nat.blkaddr corruption in f2fs_get_node_info()
Chao Yu [Mon, 9 Jun 2025 07:27:12 +0000 (15:27 +0800)]
f2fs: handle nat.blkaddr corruption in f2fs_get_node_info()

F2FS-fs (dm-55): access invalid blkaddr:972878540
Call trace:
 dump_backtrace+0xec/0x128
 show_stack+0x18/0x28
 dump_stack_lvl+0x40/0x88
 dump_stack+0x18/0x24
 __f2fs_is_valid_blkaddr+0x360/0x3b4
 f2fs_is_valid_blkaddr+0x10/0x20
 f2fs_get_node_info+0x21c/0x60c
 __write_node_page+0x15c/0x734
 f2fs_sync_node_pages+0x4f8/0x700
 f2fs_write_checkpoint+0x4a8/0x99c
 __checkpoint_and_complete_reqs+0x7c/0x20c
 issue_checkpoint_thread+0x4c/0xd8
 kthread+0x11c/0x1b0
 ret_from_fork+0x10/0x20

If nat.blkaddr is corrupted, during checkpoint, f2fs_sync_node_pages()
will loop to flush node page w/ corrupted nat.blkaddr.

Although, it tags SBI_NEED_FSCK, checkpoint can not persist it due
to deadloop.

Let's call f2fs_handle_error(, ERROR_INCONSISTENT_NAT) to record such
error into superblock, it expects fsck can detect the error and repair
inconsistent nat.blkaddr after device reboot.

Note that, let's add sanity check in f2fs_get_node_info() to detect
in-memory nat.blkaddr inconsistency, but only if CONFIG_F2FS_CHECK_FS
is enabled.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
7 weeks agof2fs: turn off one_time when forcibly set to foreground GC
Daeho Jeong [Fri, 6 Jun 2025 18:49:04 +0000 (11:49 -0700)]
f2fs: turn off one_time when forcibly set to foreground GC

one_time mode is only for background GC. So, we need to set it back to
false when foreground GC is enforced.

Fixes: 9748c2ddea4a ("f2fs: do FG_GC when GC boosting is required for zoned devices")
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
7 weeks agof2fs: make sure zoned device GC to use FG_GC in shortage of free section
Daeho Jeong [Thu, 29 May 2025 22:25:32 +0000 (15:25 -0700)]
f2fs: make sure zoned device GC to use FG_GC in shortage of free section

We already use FG_GC when we have free sections under
gc_boost_zoned_gc_percent. So, let's make it consistent.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
7 weeks agoMerge tag 'for-6.16/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device...
Linus Torvalds [Mon, 23 Jun 2025 22:02:57 +0000 (15:02 -0700)]
Merge tag 'for-6.16/dm-fixes' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mikulas Patocka:

 - dm-crypt: fix a crash on 32-bit machines

 - dm-raid: replace "rdev" with correct loop variable name "r"

* tag 'for-6.16/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm-raid: fix variable in journal device check
  dm-crypt: Extend state buffer size in crypt_iv_lmk_one

7 weeks agoMerge tag 'f2fs-for-6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeu...
Linus Torvalds [Mon, 23 Jun 2025 21:55:40 +0000 (14:55 -0700)]
Merge tag 'f2fs-for-6.16-rc4' of git://git./linux/kernel/git/jaegeuk/f2fs

Pull f2fs fixes from Jaegeuk Kim:

 - fix double-unlock introduced by the recent folio conversion

 - fix stale page content beyond EOF complained by xfstests/generic/363

* tag 'f2fs-for-6.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
  f2fs: fix to zero post-eof page
  f2fs: Fix __write_node_folio() conversion

7 weeks agoMerge tag 'for-6.16-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Mon, 23 Jun 2025 18:16:38 +0000 (11:16 -0700)]
Merge tag 'for-6.16-rc3-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "Fixes:

   - fix invalid inode pointer dereferences during log replay

   - fix a race between renames and directory logging

   - fix shutting down delayed iput worker

   - fix device byte accounting when dropping chunk

   - in zoned mode, fix offset calculations for DUP profile when
     conventional and sequential zones are used together

  Regression fixes:

   - fix possible double unlock of extent buffer tree (xarray
     conversion)

   - in zoned mode, fix extent buffer refcount when writing out extents
     (xarray conversion)

  Error handling fixes and updates:

   - handle unexpected extent type when replaying log

   - check and warn if there are remaining delayed inodes when putting a
     root

   - fix assertion when building free space tree

   - handle csum tree error with mount option 'rescue=ibadroot'

  Other:

   - error message updates: add prefix to all scrub related messages,
     include other information in messages"

* tag 'for-6.16-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: zoned: fix alloc_offset calculation for partly conventional block groups
  btrfs: handle csum tree error with rescue=ibadroots correctly
  btrfs: fix race between async reclaim worker and close_ctree()
  btrfs: fix assertion when building free space tree
  btrfs: don't silently ignore unexpected extent type when replaying log
  btrfs: fix invalid inode pointer dereferences during log replay
  btrfs: fix double unlock of buffer_tree xarray when releasing subpage eb
  btrfs: update superblock's device bytes_used when dropping chunk
  btrfs: fix a race between renames and directory logging
  btrfs: scrub: add prefix for the error messages
  btrfs: warn if leaking delayed_nodes in btrfs_put_root()
  btrfs: fix delayed ref refcount leak in debug assertion
  btrfs: include root in error message when unlinking inode
  btrfs: don't drop a reference if btrfs_check_write_meta_pointer() fails

7 weeks agoMerge tag 'mm-hotfixes-stable-2025-06-22-18-52' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Mon, 23 Jun 2025 16:20:39 +0000 (09:20 -0700)]
Merge tag 'mm-hotfixes-stable-2025-06-22-18-52' of git://git./linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "20 hotfixes. 7 are cc:stable and the remainder address post-6.15
  issues or aren't considered necessary for -stable kernels. Only 4 are
  for MM.

   - The series `Revert "bcache: update min_heap_callbacks to use
     default builtin swap"' from Kuan-Wei Chiu backs out the author's
     recent min_heap changes due to a performance regression.

     A fix for this regression has been developed but we felt it best to
     go back to the known-good version to give the new code more bake
     time.

   - A lot of MAINTAINERS maintenance.

     I like to get these changes upstreamed promptly because they can't
     break things and more accurate/complete MAINTAINERS info hopefully
     improves the speed and accuracy of our responses to submitters and
     reporters"

* tag 'mm-hotfixes-stable-2025-06-22-18-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  MAINTAINERS: add additional mmap-related files to mmap section
  MAINTAINERS: add memfd, shmem quota files to shmem section
  MAINTAINERS: add stray rmap file to mm rmap section
  MAINTAINERS: add hugetlb_cgroup.c to hugetlb section
  MAINTAINERS: add further init files to mm init block
  MAINTAINERS: update maintainers for HugeTLB
  maple_tree: fix MA_STATE_PREALLOC flag in mas_preallocate()
  MAINTAINERS: add missing test files to mm gup section
  MAINTAINERS: add missing mm/workingset.c file to mm reclaim section
  selftests/mm: skip uprobe vma merge test if uprobes are not enabled
  bcache: remove unnecessary select MIN_HEAP
  Revert "bcache: remove heap-related macros and switch to generic min_heap"
  Revert "bcache: update min_heap_callbacks to use default builtin swap"
  selftests/mm: add configs to fix testcase failure
  kho: initialize tail pages for higher order folios properly
  MAINTAINERS: add linux-mm@ list to Kexec Handover
  mm: userfaultfd: fix race of userfaultfd_move and swap cache
  mm/gup: revert "mm: gup: fix infinite loop within __get_longterm_locked"
  selftests/mm: increase timeout from 180 to 900 seconds
  mm/shmem, swap: fix softlockup with mTHP swapin

7 weeks agodm-raid: fix variable in journal device check
Heinz Mauelshagen [Tue, 10 Jun 2025 18:53:30 +0000 (20:53 +0200)]
dm-raid: fix variable in journal device check

Replace "rdev" with correct loop variable name "r".

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 63c32ed4afc2 ("dm raid: add raid4/5/6 journaling support")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
8 weeks agodm-crypt: Extend state buffer size in crypt_iv_lmk_one
Herbert Xu [Mon, 23 Jun 2025 11:11:50 +0000 (19:11 +0800)]
dm-crypt: Extend state buffer size in crypt_iv_lmk_one

Add a macro CRYPTO_MD5_STATESIZE for the Crypto API export state
size of md5 and use that in dm-crypt instead of relying on the
size of struct md5_state (the latter is currently undergoing a
transition and may shrink).

This commit fixes a crash on 32-bit machines:
Oops: Oops: 0000 [#1] SMP
CPU: 1 UID: 0 PID: 12 Comm: kworker/u16:0 Not tainted 6.16.0-rc2+ #993 PREEMPT(full)
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Workqueue: kcryptd-254:0-1 kcryptd_crypt [dm_crypt]
EIP: __crypto_shash_export+0xf/0x90
Code: 4a c1 c7 40 20 a0 b4 4a c1 81 cf 0e 00 04 08 89 78 50 e9 2b ff ff ff 8d 74 26 00 55 89 e5 57 56 53 89 c3 89 d6 8b 00 8b 40 14 <8b> 50 fc f6 40 13 01 74 04 4a 2b 50 14 85 c9 74 10 89 f2 89 d8 ff
EAX: 303a3435 EBX: c3007c90 ECX: 00000000 EDX: c3007c38
ESI: c3007c38 EDI: c3007c90 EBP: c3007bfc ESP: c3007bf0
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010216
CR0: 80050033 CR2: 303a3431 CR3: 04fbe000 CR4: 00350e90
Call Trace:
 crypto_shash_export+0x65/0xc0
 crypt_iv_lmk_one+0x106/0x1a0 [dm_crypt]

Fixes: efd62c85525e ("crypto: md5-generic - Use API partial block handling")
Reported-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Milan Broz <gmazyland@gmail.com>
Closes: https://lore.kernel.org/linux-crypto/f1625ddc-e82e-4b77-80c2-dc8e45b54848@gmail.com/T/
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
8 weeks agoLinux 6.16-rc3
Linus Torvalds [Sun, 22 Jun 2025 20:30:08 +0000 (13:30 -0700)]
Linux 6.16-rc3

8 weeks agoMerge tag 'i2c-for-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sun, 22 Jun 2025 17:50:36 +0000 (10:50 -0700)]
Merge tag 'i2c-for-6.16-rc3' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:

 - subsystem: convert drivers to use recent callbacks of struct
   i2c_algorithm A typical after-rc1 cleanup, which I couldn't send in
   time for rc2

 - tegra: fix YAML conversion of device tree bindings

 - k1: re-add a check which got lost during upstreaming

* tag 'i2c-for-6.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: k1: check for transfer error
  i2c: use inclusive callbacks in struct i2c_algorithm
  dt-bindings: i2c: nvidia,tegra20-i2c: Specify the required properties

8 weeks agoMerge tag 'x86_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 22 Jun 2025 17:30:44 +0000 (10:30 -0700)]
Merge tag 'x86_urgent_for_v6.16_rc3' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Make sure the array tracking which kernel text positions need to be
   alternatives-patched doesn't get mishandled by out-of-order
   modifications, leading to it overflowing and causing page faults when
   patching

 - Avoid an infinite loop when early code does a ranged TLB invalidation
   before the broadcast TLB invalidation count of how many pages it can
   flush, has been read from CPUID

 - Fix a CONFIG_MODULES typo

 - Disable broadcast TLB invalidation when PTI is enabled to avoid an
   overflow of the bitmap tracking dynamic ASIDs which need to be
   flushed when the kernel switches between the user and kernel address
   space

 - Handle the case of a CPU going offline and thus reporting zeroes when
   reading top-level events in the resctrl code

* tag 'x86_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/alternatives: Fix int3 handling failure from broken text_poke array
  x86/mm: Fix early boot use of INVPLGB
  x86/its: Fix an ifdef typo in its_alloc()
  x86/mm: Disable INVLPGB when PTI is enabled
  x86,fs/resctrl: Remove inappropriate references to cacheinfo in the resctrl subsystem

8 weeks agoMerge tag 'irq_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 22 Jun 2025 17:17:51 +0000 (10:17 -0700)]
Merge tag 'irq_urgent_for_v6.16_rc3' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Borislav Petkov:

 - Fix missing prototypes warnings

 - Properly initialize work context when allocating it

 - Remove a method tracking when managed interrupts are suspended during
   hotplug, in favor of the code using a IRQ disable depth tracking now,
   and have interrupts get properly enabled again on restore

 - Make sure multiple CPUs getting hotplugged don't cause wrong tracking
   of the managed IRQ disable depth

* tag 'irq_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/ath79-misc: Fix missing prototypes warnings
  genirq/irq_sim: Initialize work context pointers properly
  genirq/cpuhotplug: Restore affinity even for suspended IRQ
  genirq/cpuhotplug: Rebalance managed interrupts across multi-CPU hotplug

8 weeks agoMerge tag 'perf_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 22 Jun 2025 17:11:45 +0000 (10:11 -0700)]
Merge tag 'perf_urgent_for_v6.16_rc3' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

 - Avoid a crash on a heterogeneous machine where not all cores support
   the same hw events features

 - Avoid a deadlock when throttling events

 - Document the perf event states more

 - Make sure a number of perf paths switching off or rescheduling events
   call perf_cgroup_event_disable()

 - Make sure perf does task sampling before its userspace mapping is
   torn down, and not after

* tag 'perf_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Fix crash in icl_update_topdown_event()
  perf: Fix the throttle error of some clock events
  perf: Add comment to enum perf_event_state
  perf/core: Fix WARN in perf_cgroup_switch()
  perf: Fix dangling cgroup pointer in cpuctx
  perf: Fix cgroup state vs ERROR
  perf: Fix sample vs do_exit()

8 weeks agoMerge tag 'locking_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 22 Jun 2025 17:09:23 +0000 (10:09 -0700)]
Merge tag 'locking_urgent_for_v6.16_rc3' of git://git./linux/kernel/git/tip/tip

Pull locking fixes from Borislav Petkov:

 - Make sure the switch to the global hash is requested always under a
   lock so that two threads requesting that simultaneously cannot get to
   inconsistent state

 - Reject negative NUMA nodes earlier in the futex NUMA interface
   handling code

 - Selftests fixes

* tag 'locking_urgent_for_v6.16_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  futex: Verify under the lock if hash can be replaced
  futex: Handle invalid node numbers supplied by user
  selftests/futex: Set the home_node in futex_numa_mpol
  selftests/futex: getopt() requires int as return value.