linux-2.6-block.git
6 months ago mm/shmem.c: Use new form of *@param in kernel-doc
Akira Yokosawa [Tue, 27 Feb 2024 05:06:48 +0000 (14:06 +0900)]
mm/shmem.c: Use new form of *@param in kernel-doc

Use the form of *@param which kernel-doc recognizes now.
This resolves the warnings from "make htmldocs" as reported in [1].

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: [1] https://lore.kernel.org/r/20240223153636.41358be5@canb.auug.org.au/
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
6 months ago kernel-doc: Add unary operator * to $type_param_ref
Akira Yokosawa [Tue, 27 Feb 2024 05:03:30 +0000 (14:03 +0900)]
kernel-doc: Add unary operator * to $type_param_ref

In kernel-doc comments, unary operator * collides with Sphinx/
docutils' markup for emphasis.

This resulted in additional warnings from "make htmldocs":

    WARNING: Inline emphasis start-string without end-string.

, as reported recently [1].

Those have been worked around either by escaping * (like \*param) or by
using inline-literal form of ``*param``, both of which are specific
to Sphinx/docutils.

Such workarounds go against kernel-doc's ideal and are better
avoided.

Instead, add "*" to the list of unary operators kernel-doc recognizes
and make the form of *@param available in kernel-doc comments.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: [1] https://lore.kernel.org/r/20240223153636.41358be5@canb.auug.org.au/
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Akira Yokosawa <akiyks@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Acked-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
6 months ago xfs: use kvfree() in xlog_cil_free_logvec()
Dave Chinner [Tue, 27 Feb 2024 03:01:26 +0000 (14:01 +1100)]
xfs: use kvfree() in xlog_cil_free_logvec()

The xfs_log_vec items are allocated by xlog_kvmalloc(), and so need
to be freed with kvfree. This was missed when converting from the
kmem_free() API.

Fixes: 49292576136f ("xfs: convert kmem_free() for kvmalloc users to kvfree()")
Reported-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
6 months ago xfs: xfs_btree_bload_prep_block() should use __GFP_NOFAIL
Dave Chinner [Tue, 27 Feb 2024 00:05:31 +0000 (11:05 +1100)]
xfs: xfs_btree_bload_prep_block() should use __GFP_NOFAIL

This was missed in the conversion from KM* flags.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: 10634530f7ba ("xfs: convert kmem_zalloc() to kzalloc()")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
7 months ago xfs: fix scrub stats file permissions
Darrick J. Wong [Sat, 24 Feb 2024 06:01:40 +0000 (22:01 -0800)]
xfs: fix scrub stats file permissions

When the kernel is in lockdown mode, debugfs will only show files that
are world-readable and cannot be written, mmaped, or used with ioctl.
That more or less describes the scrub stats file, except that the
permissions are wrong -- they should be 0444, not 0644.  You can't write
the stats file, so the 0200 makes no sense.

Meanwhile, the clear_stats file is only writable, but it got mode 0400
instead of 0200, which would make more sense.

Fix both files so that they make sense.
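
The mode arithmetic above can be checked with a small sketch. This is illustrative Python using only the standard `stat` module, not the kernel code; the helper name `describe` is made up for this example.

```python
import stat


def describe(mode):
    """Return (owner_can_read, owner_can_write, world_readable) for a mode."""
    return (
        bool(mode & stat.S_IRUSR),  # 0400: owner read
        bool(mode & stat.S_IWUSR),  # 0200: owner write
        bool(mode & stat.S_IROTH),  # 0004: others read
    )


# Stats file: the old mode 0644 carried an owner-write bit that made no sense.
old_stats, new_stats = 0o644, 0o444
# clear_stats file: it can only be written, so 0200 fits and 0400 does not.
old_clear, new_clear = 0o400, 0o200
```

The only difference between 0644 and 0444 is the 0200 owner-write bit, which is the bit the commit message calls out.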

Fixes: d7a74cad8f451 ("xfs: track usage statistics of online fsck")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
7 months ago xfs: fix log recovery erroring out on refcount recovery failure
Darrick J. Wong [Fri, 23 Feb 2024 05:48:17 +0000 (21:48 -0800)]
xfs: fix log recovery erroring out on refcount recovery failure

Per the comment in the error case of xfs_reflink_recover_cow, zero out
any error (after shutting down the log) so that we actually kill any new
intent items that might have gotten logged by later recovery steps.
Discovered by xfs/434, which few people actually seem to run.

Fixes: 2c1e31ed5c88 ("xfs: place intent recovery under NOFS allocation context")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
7 months ago Merge tag 'symlink-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux...
Chandan Babu R [Sat, 24 Feb 2024 05:09:07 +0000 (10:39 +0530)]
Merge tag 'symlink-cleanups-6.9_2024-02-23' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: clean up symbolic link code

This series cleans up a few bits of the symbolic link code as needed for
future projects.  Online repair requires the ability to commit fixed
fork-based filesystem metadata such as directories, xattrs, and symbolic
links atomically, so we need to rearrange the symlink code before we
land the atomic extent swapping.

Accomplish this by moving the remote symlink target block code and
declarations to xfs_symlink_remote.[ch].

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'symlink-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: move symlink target write function to libxfs
  xfs: move remote symlink target read function to libxfs
  xfs: move xfs_symlink_remote.c declarations to xfs_symlink_remote.h

7 months ago Merge tag 'expand-bmap-intent-usage_2024-02-23' of https://git.kernel.org/pub/scm...
Chandan Babu R [Sat, 24 Feb 2024 05:06:15 +0000 (10:36 +0530)]
Merge tag 'expand-bmap-intent-usage_2024-02-23' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: support attrfork and unwritten BUIs

In preparation for atomic extent swapping and the online repair
functionality that wants atomic extent swaps, enhance the BUI code so
that we can support deferred work on the extended attribute fork and on
unwritten extents.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'expand-bmap-intent-usage_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: xfs_bmap_finish_one should map unwritten extents properly
  xfs: support deferred bmap updates on the attr fork

7 months ago Merge tag 'realtime-bmap-intents-6.9_2024-02-23' of https://git.kernel.org/pub/scm...
Chandan Babu R [Sat, 24 Feb 2024 05:02:34 +0000 (10:32 +0530)]
Merge tag 'realtime-bmap-intents-6.9_2024-02-23' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: widen BUI formats to support realtime

Atomic extent swapping (and later, reverse mapping and reflink) on the
realtime device needs to be able to defer file mapping and extent
freeing work in much the same manner as is required on the data volume.
Make the BUI log items operate on rt extents in preparation for atomic
swapping and realtime rmap.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'realtime-bmap-intents-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: support recovering bmap intent items targetting realtime extents
  xfs: add a realtime flag to the bmap update log redo items
  xfs: fix xfs_bunmapi to allow unmapping of partial rt extents

7 months ago Merge tag 'bmap-intent-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm...
Chandan Babu R [Sat, 24 Feb 2024 04:59:06 +0000 (10:29 +0530)]
Merge tag 'bmap-intent-cleanups-6.9_2024-02-23' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: bmap log intent cleanups

The next major target of online repair is metadata that is persisted
in blocks mapped by a file fork.  In other words, we want to repair
directories, extended attributes, symbolic links, and the realtime free
space information.  For file-based metadata, we assume that the space
metadata is correct, which enables repair to construct new versions of
the metadata in a temporary file.  We then need to swap the file fork
mappings of the two files atomically.  With this patchset, we begin
constructing such a facility based on the existing bmap log items and a
new extent swap log item.

This series cleans up a few parts of the file block mapping log intent
code before we start adding support for realtime bmap intents.  Most of
it involves cleaning up tracepoints so that more of the data extraction
logic ends up in the tracepoint code and not the tracepoint call site,
which should reduce overhead further when tracepoints are disabled.
There is also a change to pass bmap intents all the way back to the bmap
code instead of unboxing the intent values and re-boxing them after the
_finish_one function completes.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'bmap-intent-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: add a xattr_entry helper
  xfs: move xfs_bmap_defer_add to xfs_bmap_item.c
  xfs: reuse xfs_bmap_update_cancel_item
  xfs: add a bi_entry helper
  xfs: remove xfs_trans_set_bmap_flags
  xfs: clean up bmap log intent item tracepoint callsites
  xfs: split tracepoint classes for deferred items

7 months ago Merge tag 'repair-refcount-scalability-6.9_2024-02-23' of https://git.kernel.org...
Chandan Babu R [Sat, 24 Feb 2024 04:55:31 +0000 (10:25 +0530)]
Merge tag 'repair-refcount-scalability-6.9_2024-02-23' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: reduce refcount repair memory usage

The refcountbt repair code has serious memory usage problems when the
block sharing factor of the filesystem is very high.  This can happen if
a deduplication tool has been run against the filesystem, or if the fs
stores reflinked VM images that have been aging for a long time.

Recall that the original reference counting algorithm walks the reverse
mapping records of the filesystem to generate reference counts.  For any
given block in the AG, the rmap bag structure contains all the rmap
records that cover that block; the refcount is the size of that bag.
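
A minimal sketch of that idea, in Python rather than kernel C. The record shape `(start, length)` and the function name are assumptions made for illustration; the real code maintains the bag incrementally while walking the rmap btree, whereas this version rebuilds it for every range to keep the logic obvious.

```python
def refcounts_from_rmaps(rmaps):
    """Turn (start, length) reverse mappings into (start, length, refcount) runs."""
    # Every mapping start or end is a point where the refcount may change.
    edges = sorted({s for s, n in rmaps} | {s + n for s, n in rmaps})
    runs = []
    for lo, hi in zip(edges, edges[1:]):
        # The "bag" holds every mapping that covers the blocks in [lo, hi).
        bag = [(s, n) for s, n in rmaps if s <= lo and hi <= s + n]
        if not bag:
            continue  # nothing maps this range
        count = len(bag)  # the refcount is the size of the bag
        if runs and runs[-1][2] == count and runs[-1][0] + runs[-1][1] == lo:
            # Extend the previous run when it is adjacent and has the same count.
            runs[-1] = (runs[-1][0], runs[-1][1] + (hi - lo), count)
        else:
            runs.append((lo, hi - lo, count))
    return runs
```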

For online repair, the bag doesn't need the owner, offset, or state flag
information, so it discards those.  This halves the record size, but the
bag structure still stores one excerpted record for each reverse
mapping.  If the sharing count is high, this will use a LOT of memory
storing redundant records.  In the extreme case, 100k mappings to the
same piece of space will consume 100k*16 bytes = 1.6M of memory.

For offline repair, the bag stores the owner values so that we know
which inodes need to be marked as being reflink inodes.  If a
deduplication tool has been run and there are many blocks within a file
pointing to the same physical space, this will still use a lot of memory
to store redundant records.

The solution to this problem is to deduplicate the bag records when
possible by adding a reference count to the bag record, and changing the
bag add function to detect an existing record to bump the refcount.  In
the above example, the 100k mappings will now use 24 bytes of memory.
These lookups can be done efficiently with a btree, so we create a new
refcount bag btree type (inside of online repair).  This is why we
refactored the btree code in the previous patchset.

The btree conversion also dramatically reduces the runtime of the
refcount generation algorithm, because the code to delete all bag
records that end at a given agblock now only has to delete one record
instead of (using the example above) 100k records.  As an added benefit,
record deletion now gives back the unused xfile space, which it did not
do previously.
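
The difference between the two bag designs can be sketched as follows. This is a Python illustration, not the kernel implementation: a `Counter` stands in for the refcount bag btree, and both class names are hypothetical. The 16-byte record size used in the arithmetic is the figure quoted above.

```python
from collections import Counter


class NaiveBag:
    """One stored entry per reverse mapping, duplicates included."""

    def __init__(self):
        self.records = []

    def add(self, start, length):
        self.records.append((start, length))

    def remove_ending_at(self, end):
        """Delete every entry ending at `end`; return how many were deleted."""
        before = len(self.records)
        self.records = [(s, n) for s, n in self.records if s + n != end]
        return before - len(self.records)

    def refcount(self):
        return len(self.records)

    def stored(self):
        return len(self.records)


class CountedBag:
    """One stored entry per distinct mapping, plus a reference count."""

    def __init__(self):
        self.records = Counter()  # (start, length) -> refcount

    def add(self, start, length):
        # An existing record just has its count bumped.
        self.records[(start, length)] += 1

    def remove_ending_at(self, end):
        doomed = [key for key in self.records if key[0] + key[1] == end]
        for key in doomed:
            del self.records[key]
        return len(doomed)

    def refcount(self):
        return sum(self.records.values())

    def stored(self):
        return len(self.records)
```

With 100k identical mappings, both bags report the same refcount, but the counted bag stores and later deletes a single record where the naive bag handles 100k.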

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'repair-refcount-scalability-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: port refcount repair to the new refcount bag structure
  xfs: create refcount bag structure for btree repairs
  xfs: define an in-memory btree for storing refcount bag info during repairs

7 months ago Merge tag 'repair-rmap-btree-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux...
Chandan Babu R [Sat, 24 Feb 2024 04:52:15 +0000 (10:22 +0530)]
Merge tag 'repair-rmap-btree-6.9_2024-02-23' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: online repair of rmap btrees

We have now constructed the four tools that we need to scan the
filesystem looking for reverse mappings: an inode scanner, hooks to
receive live updates from other writer threads, the ability to construct
btrees in memory, and a btree bulk loader.

This series glues those four together, enabling us to scan the
filesystem for mappings and keep it up to date while other writers run,
and then commit the new btree to disk atomically.

To reduce the size of each patch, the functionality is left disabled
until the end of the series and broken up into three patches: one to
create the mechanics of scanning the filesystem, a second to transition
to in-memory btrees, and a third to set up the live hooks.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'repair-rmap-btree-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: hook live rmap operations during a repair operation
  xfs: create a shadow rmap btree during rmap repair
  xfs: repair the rmapbt
  xfs: create agblock bitmap helper to count the number of set regions
  xfs: create a helper to decide if a file mapping targets the rt volume

7 months ago Merge tag 'in-memory-btrees-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux...
Chandan Babu R [Sat, 24 Feb 2024 04:48:39 +0000 (10:18 +0530)]
Merge tag 'in-memory-btrees-6.9_2024-02-23' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: support in-memory btrees

Online repair of the reverse-mapping btrees presents some unique
challenges.  To construct a new reverse mapping btree, we must scan the
entire filesystem, but we cannot afford to quiesce the entire filesystem
for the potentially lengthy scan.

For rmap btrees, therefore, we relax our requirements of totally atomic
repairs.  Instead, repairs will scan all inodes, construct a new reverse
mapping dataset, format a new btree, and commit it before anyone trips
over the corruption.  This is exactly the same strategy as was used in
the quotacheck and nlink scanners.

Unfortunately, the xfarray cannot perform key-based lookups and is
therefore unsuitable for supporting live updates.  Luckily, we already have a
data structure that maintains an indexed rmap recordset -- the existing
rmap btree code!  Hence we port the existing btree and buffer target
code to be able to create a btree using the xfile we developed earlier.
Live hooks keep the in-memory btree up to date for any resources that
have already been scanned.
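
The interplay between the scan and the live hooks can be sketched like this. It is a Python model under loud assumptions: a dict keyed by block number stands in for the in-memory rmap btree, and the class and method names are invented for the example. The point it tries to show is that hooks need key-based lookups, and that they only touch the shadow index for files the scan has already visited.

```python
class LiveScan:
    """Toy model of a filesystem scan that stays current via update hooks."""

    def __init__(self, files):
        self.files = files    # ino -> set of mapped blocks (the live filesystem)
        self.shadow = {}      # block -> owner ino (stands in for the in-memory btree)
        self.scanned = set()  # inodes the scan has already visited

    def scan_one(self, ino):
        """Record every mapping of one file in the shadow index."""
        for block in self.files[ino]:
            self.shadow[block] = ino
        self.scanned.add(ino)

    def hook_map(self, ino, block):
        """Live update: a writer mapped `block` into file `ino`."""
        self.files.setdefault(ino, set()).add(block)
        if ino in self.scanned:
            # Already scanned, so the scan will not see this change on its own.
            self.shadow[block] = ino

    def hook_unmap(self, ino, block):
        """Live update: a writer unmapped `block` from file `ino`."""
        self.files[ino].discard(block)
        if ino in self.scanned:
            # Key-based removal; a plain append-only array could not do this.
            self.shadow.pop(block, None)
```

Updates to files that have not been scanned yet are deliberately ignored by the hooks, because the scan picks them up when it reaches that file.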

This approach is not maximally memory efficient, but we can use the same
rmap code that we do everywhere else, which provides improved stability
without growing the code base even more.  Note that in-memory btree
blocks are always page sized.

This patchset modifies the kernel xfs buffer cache to be capable of
using an xfile (aka a shmem file) as a backing device.  It then augments
the btree code to support creating btree cursors with buffers that come
from a buftarg other than the data device (namely an xfile-backed
buftarg).  For the userspace xfs buffer cache, we instead use a memfd or
an O_TMPFILE file as a backing device.
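
As a rough userspace analogy, page-sized blocks can be kept in an unnamed temporary file and addressed by block number. This is a Python sketch, not the xfsprogs buffer cache; `TmpfileBlockStore` is a hypothetical name. It relies on `tempfile.TemporaryFile`, which the Python documentation says uses `O_TMPFILE` on Linux when that flag is available, and on `os.pread`/`os.pwrite`, which exist on Unix-like systems only.

```python
import mmap
import os
import tempfile

BLOCK_SIZE = mmap.PAGESIZE  # the text notes in-memory btree blocks are page sized


class TmpfileBlockStore:
    """Fixed-size blocks stored in an unnamed temporary file."""

    def __init__(self):
        self._file = tempfile.TemporaryFile()
        self._fd = self._file.fileno()

    def write_block(self, blockno, data):
        if len(data) > BLOCK_SIZE:
            raise ValueError("data larger than one block")
        # Pad to a full block so every block starts on a block boundary.
        os.pwrite(self._fd, data.ljust(BLOCK_SIZE, b"\0"), blockno * BLOCK_SIZE)

    def read_block(self, blockno):
        data = os.pread(self._fd, BLOCK_SIZE, blockno * BLOCK_SIZE)
        # Reads past end-of-file come back short; treat them as zeroed blocks.
        return data.ljust(BLOCK_SIZE, b"\0")

    def close(self):
        self._file.close()
```

Blocks that were never written read back as zeroes, whether they fall in a hole or beyond the end of the file.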

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'in-memory-btrees-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: launder in-memory btree buffers before transaction commit
  xfs: support in-memory btrees
  xfs: add a xfs_btree_ptrs_equal helper
  xfs: support in-memory buffer cache targets
  xfs: teach buftargs to maintain their own buffer hashtable

7 months ago Merge tag 'buftarg-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux...
Chandan Babu R [Sat, 24 Feb 2024 04:44:43 +0000 (10:14 +0530)]
Merge tag 'buftarg-cleanups-6.9_2024-02-23' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: buftarg cleanups

Clean up the buffer target code in preparation for adding the ability to
target tmpfs files.  That will enable the creation of in memory btrees.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'buftarg-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: move setting bt_logical_sectorsize out of xfs_setsize_buftarg
  xfs: remove xfs_setsize_buftarg_early
  xfs: remove the xfs_buftarg_t typedef

7 months ago Merge tag 'btree-readahead-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub...
Chandan Babu R [Sat, 24 Feb 2024 04:41:25 +0000 (10:11 +0530)]
Merge tag 'btree-readahead-cleanups-6.9_2024-02-23' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: btree readahead cleanups

Minor cleanups for the btree block readahead code.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'btree-readahead-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: split xfs_buf_rele for cached vs uncached buffers
  xfs: move and rename xfs_btree_read_bufl
  xfs: remove xfs_btree_reada_bufs
  xfs: remove xfs_btree_reada_bufl

7 months ago Merge tag 'btree-check-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm...
Chandan Babu R [Sat, 24 Feb 2024 04:38:27 +0000 (10:08 +0530)]
Merge tag 'btree-check-cleanups-6.9_2024-02-23' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: btree check cleanups

Minor cleanups for the btree block pointer checking code.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'btree-check-cleanups-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: factor out a __xfs_btree_check_lblock_hdr helper
  xfs: rename btree helpers that depends on the block number representation
  xfs: consolidate btree block verification
  xfs: tighten up validation of root block in inode forks
  xfs: remove the crc variable in __xfs_btree_check_lblock
  xfs: misc cleanups for __xfs_btree_check_sblock
  xfs: consolidate btree ptr checking
  xfs: open code xfs_btree_check_lptr in xfs_bmap_btree_to_extents
  xfs: simplify xfs_btree_check_lblock_siblings
  xfs: simplify xfs_btree_check_sblock_siblings

7 months ago Merge tag 'btree-remove-btnum-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux...
Chandan Babu R [Sat, 24 Feb 2024 04:34:39 +0000 (10:04 +0530)]
Merge tag 'btree-remove-btnum-6.9_2024-02-23' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: remove bc_btnum from btree cursors

From Christoph Hellwig,

This series continues the migration of btree geometry information out of
the cursor structure and into the ops structure.  This time around, we
replace the btree type enumeration (btnum) with an explicit name string
in the btree ops structure.  This enables easy creation of /any/ new
btree type without having to mess with libxfs.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'btree-remove-btnum-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: remove xfs_btnum_t
  xfs: pass a 'bool is_finobt' to xfs_inobt_insert
  xfs: split xfs_inobt_init_cursor
  xfs: split xfs_inobt_insert_sprec
  xfs: remove the which variable in xchk_iallocbt
  xfs: remove the btnum argument to xfs_inobt_count_blocks
  xfs: remove xfs_inobt_cur
  xfs: split xfs_allocbt_init_cursor
  xfs: refactor the btree cursor allocation logic in xchk_ag_btcur_init
  xfs: add a sick_mask to struct xfs_btree_ops
  xfs: add a name field to struct xfs_btree_ops
  xfs: split the agf_roots and agf_levels arrays
  xfs: remove xfs_bmbt_stage_cursor
  xfs: fold xfs_bmbt_init_common into xfs_bmbt_init_cursor
  xfs: make staging file forks explicit
  xfs: make full use of xfs_btree_stage_ifakeroot in xfs_bmbt_stage_cursor
  xfs: remove xfs_rmapbt_stage_cursor
  xfs: fold xfs_rmapbt_init_common into xfs_rmapbt_init_cursor
  xfs: remove xfs_refcountbt_stage_cursor
  xfs: fold xfs_refcountbt_init_common into xfs_refcountbt_init_cursor
  xfs: remove xfs_inobt_stage_cursor
  xfs: fold xfs_inobt_init_common into xfs_inobt_init_cursor
  xfs: remove xfs_allocbt_stage_cursor
  xfs: fold xfs_allocbt_init_common into xfs_allocbt_init_cursor
  xfs: don't override bc_ops for staging btrees
  xfs: add a xfs_btree_init_ptr_from_cur
  xfs: move comment about two 2 keys per pointer in the rmap btree

7 months ago Merge tag 'btree-geometry-in-ops-6.9_2024-02-23' of https://git.kernel.org/pub/scm...
Chandan Babu R [Sat, 24 Feb 2024 04:31:16 +0000 (10:01 +0530)]
Merge tag 'btree-geometry-in-ops-6.9_2024-02-23' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: move btree geometry to ops struct

This patchset prepares the generic btree code to allow for the creation
of new btree types outside of libxfs.  The end goal here is for online
fsck to be able to create its own in-memory btrees that will be used to
improve the performance of (and reduce the memory requirements of) the
refcount btree.

To enable this, I decided that the btree ops structure is the ideal
place to encode all of the geometry information about a btree. The btree
ops structure already contains the buffer ops (and hence the btree block
magic numbers) as well as the key and record sizes, so it doesn't seem
all that farfetched to encode the XFS_BTREE_ flags that determine the
geometry (ROOT_IN_INODE, LONG_PTRS, etc).

The rest of the patchset cleans up the btree functions that initialize
btree blocks and btree buffers.  The bulk of this work is to replace
btree geometry related function call arguments with a single pointer to
the ops structure, and then clean up everything else around that.  As a
side effect, we rename the functions.

Later, Christoph Hellwig and I merged together a bunch more cleanups
that he had wanted to do for a while.  All the btree geometry information is
now in the btree ops structure, we've created an explicit btree type
(ag, inode, mem) and moved the per-btree type information to a separate
union.
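
The design described above, with all geometry carried by one ops object, might be modelled roughly as follows. This is a Python sketch, not the layout of struct xfs_btree_ops: the field names, the helper functions, and the sizes used with them are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class BtreeType(Enum):
    """The explicit btree types named in the text."""
    AG = "ag"
    INODE = "inode"
    MEM = "mem"


@dataclass(frozen=True)
class BtreeOps:
    """Everything a generic btree routine needs to know about one btree kind."""
    name: str        # replaces a numeric btree type enumeration
    type: BtreeType
    key_len: int     # bytes per key
    rec_len: int     # bytes per leaf record
    ptr_len: int     # bytes per block pointer


def max_leaf_records(ops, block_size, header_len):
    """Leaf records that fit in one block, derived from the ops alone."""
    return (block_size - header_len) // ops.rec_len


def max_node_keyptrs(ops, block_size, header_len):
    """Key/pointer pairs that fit in one interior block."""
    return (block_size - header_len) // (ops.key_len + ops.ptr_len)
```

Generic helpers then take a single `ops` argument instead of a handful of geometry parameters, which is the call-signature cleanup the series describes.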

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'btree-geometry-in-ops-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: create predicate to determine if cursor is at inode root level
  xfs: split the per-btree union in struct xfs_btree_cur
  xfs: split out a btree type from the btree ops geometry flags
  xfs: store the btree pointer length in struct xfs_btree_ops
  xfs: factor out a btree block owner check
  xfs: factor out a xfs_btree_owner helper
  xfs: move the btree stats offset into struct btree_ops
  xfs: move lru refs to the btree ops structure
  xfs: set btree block buffer ops in _init_buf
  xfs: remove the unnecessary daddr paramter to _init_block
  xfs: btree convert xfs_btree_init_block to xfs_btree_init_buf calls
  xfs: rename btree block/buffer init functions
  xfs: initialize btree blocks using btree_ops structure
  xfs: extern some btree ops structures
  xfs: turn the allocbt cursor active field into a btree flag
  xfs: consolidate the xfs_alloc_lookup_* helpers
  xfs: remove bc_ino.flags
  xfs: encode the btree geometry flags in the btree ops structure
  xfs: fix imprecise logic in xchk_btree_check_block_owner
  xfs: drop XFS_BTREE_CRC_BLOCKS
  xfs: set the btree cursor bc_ops in xfs_btree_alloc_cursor
  xfs: consolidate btree block allocation tracepoints
  xfs: consolidate btree block freeing tracepoints

7 months ago Merge tag 'repair-fscounters-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux...
Chandan Babu R [Sat, 24 Feb 2024 04:28:28 +0000 (09:58 +0530)]
Merge tag 'repair-fscounters-6.9_2024-02-23' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: online repair for fs summary counters

A longstanding deficiency in the online fs summary counter scrubbing
code is that it hasn't any means to quiesce the incore percpu counters
while it's running.  There is no way to coordinate with other threads
that are reserving or freeing free space simultaneously, which leads to false
error reports.  Right now, if the discrepancy is large, we just sort of
shrug and bail out with an incomplete flag, but this is lame.

For repair activity, we actually /do/ need to stabilize the counters to
get an accurate reading and install it in the percpu counter.  To
improve the former and enable the latter, allow the fscounters online
fsck code to perform an exclusive mini-freeze on the filesystem.  The
exclusivity prevents userspace from thawing while we're running, and the
mini-freeze means that we don't wait for the log to quiesce, which will
make both speedier.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'repair-fscounters-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: repair summary counters

7 months ago Merge tag 'indirect-health-reporting-6.9_2024-02-23' of https://git.kernel.org/pub...
Chandan Babu R [Sat, 24 Feb 2024 04:25:02 +0000 (09:55 +0530)]
Merge tag 'indirect-health-reporting-6.9_2024-02-23' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: indirect health reporting

This series enables the XFS health reporting infrastructure to remember
indirect health concerns when resources are scarce.  For example, if a
scrub notices that there's something wrong with an inode's metadata but
memory reclaim needs to free the incore inode, we want to record in the
perag data the fact that there was some inode somewhere with an error.
The perag structures never go away.

The first two patches in this series set that up, and the third one
provides a means for xfs_scrub to tell the kernel that it can forget the
indirect problem report.
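
A toy model of the indirect report, written in Python with invented names (`HealthTracker`, `indirect_sick`, and so on) and no claim to match the kernel data structures:

```python
class PerAg:
    """Per-AG state; these structures never go away."""

    def __init__(self):
        self.indirect_sick = False


class HealthTracker:
    def __init__(self, ag_count):
        self.perags = [PerAg() for _ in range(ag_count)]
        self.incore_inodes = {}  # ino -> (agno, sick_mask)

    def mark_inode_sick(self, ino, agno, mask):
        """Scrub found a problem with this inode's metadata."""
        _, old_mask = self.incore_inodes.get(ino, (agno, 0))
        self.incore_inodes[ino] = (agno, old_mask | mask)

    def reclaim_inode(self, ino):
        """Memory reclaim frees the incore inode; remember sickness in the AG."""
        agno, mask = self.incore_inodes.pop(ino)
        if mask:
            self.perags[agno].indirect_sick = True

    def clean_bill_of_health(self, agno):
        """A later scrub found nothing wrong; forget the indirect report."""
        self.perags[agno].indirect_sick = False
```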

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'indirect-health-reporting-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: update health status if we get a clean bill of health
  xfs: remember sick inodes that get inactivated
  xfs: add secondary and indirect classes to the health tracking system

7 months ago Merge tag 'corruption-health-reports-6.9_2024-02-23' of https://git.kernel.org/pub...
Chandan Babu R [Sat, 24 Feb 2024 04:21:32 +0000 (09:51 +0530)]
Merge tag 'corruption-health-reports-6.9_2024-02-23' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: report corruption to the health trackers

Any time that the runtime code thinks it has found corrupt metadata, it
should tell the health tracking subsystem that the corresponding part of
the filesystem is sick.  These reports come primarily from two places --
code that is reading a buffer that fails validation, and higher level
pieces that observe a conflict involving multiple buffers.  This
patchset uses automated scanning to update all such callsites with a
mark_sick call.

Doing this enables the health system to record problems observed at
runtime, which (for now) can prompt the sysadmin to run xfs_scrub, and
(later) may enable more targeted fixing of the filesystem.

Note: Earlier reviewers of this patchset suggested that the verifier
functions themselves should be responsible for calling _mark_sick.  In a
higher level language this would be easily accomplished with lambda
functions and closures.  For the kernel, however, we'd have to create
the necessary closures by hand, pass them to the buf_read calls, and
then implement necessary state tracking to detach the xfs_buf from the
closure at the necessary time.  This is far too much work and complexity
and will not be pursued further.
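
For contrast, here is what the closure-based alternative looks like in a higher level language. This Python sketch only illustrates the idea the reviewers had in mind; `read_buf`, `make_mark_sick`, and the dict-based health map are hypothetical and do not correspond to kernel functions.

```python
def make_mark_sick(health, agno, what):
    """Build a closure that remembers which part of the filesystem to blame."""
    def mark_sick():
        health.setdefault(agno, set()).add(what)
    return mark_sick


def read_buf(raw, verifier, on_corrupt):
    """Run the verifier on a buffer; invoke the closure if validation fails."""
    if not verifier(raw):
        on_corrupt()
        return None
    return raw
```

The closure carries its context for free here, which is the state tracking that would have to be built by hand in C.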

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'corruption-health-reports-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: report XFS_IS_CORRUPT errors to the health system
  xfs: report realtime metadata corruption errors to the health system
  xfs: report quota block corruption errors to the health system
  xfs: report inode corruption errors to the health system
  xfs: report symlink block corruption errors to the health system
  xfs: report dir/attr block corruption errors to the health system
  xfs: report btree block corruption errors to the health system
  xfs: report block map corruption errors to the health tracking system
  xfs: report ag header corruption errors to the health tracking system
  xfs: report fs corruption errors to the health tracking system
  xfs: separate the marking of sick and checked metadata

7 months ago Merge tag 'scrub-nlinks-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kerne...
Chandan Babu R [Sat, 24 Feb 2024 04:17:39 +0000 (09:47 +0530)]
Merge tag 'scrub-nlinks-6.9_2024-02-23' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: online repair of file link counts

Now that we've created the infrastructure to perform live scans of every
file in the filesystem and the necessary hook infrastructure to observe
live updates, use it to scan directories to compute the correct link
counts for files in the filesystem, and reset those link counts.

This patchset creates a tailored readdir implementation for scrub
because the regular version has to cycle ILOCKs to copy information to
userspace.  We can't cycle the ILOCK during the nlink scan and we don't
need all the other VFS support code (maintaining a readdir cursor and
translating XFS structures to VFS structures and back) so it was easier
to duplicate the code.
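
The core of the link count check can be sketched as a tally over directory entries. This is a simplified Python model: the `(parent, name, child)` tuple shape and the function names are assumptions, and the "." and ".." entries are skipped, so directory link counts are not modelled.

```python
from collections import Counter


def observed_link_counts(dirents):
    """Count how many directory entries point at each inode."""
    counts = Counter()
    for parent, name, child in dirents:
        if name in (".", ".."):
            continue  # simplification: ignore self and parent links
        counts[child] += 1
    return counts


def nlink_fixes(dirents, recorded):
    """Return {ino: correct_count} for inodes whose recorded nlink is wrong."""
    observed = observed_link_counts(dirents)
    return {
        ino: observed[ino]
        for ino, nlink in recorded.items()
        if observed[ino] != nlink
    }
```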

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'scrub-nlinks-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: teach repair to fix file nlinks
  xfs: track directory entry updates during live nlinks fsck
  xfs: teach scrub to check file nlinks
  xfs: report health of inode link counts

7 months ago Merge tag 'repair-quotacheck-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux...
Chandan Babu R [Sat, 24 Feb 2024 04:14:28 +0000 (09:44 +0530)]
Merge tag 'repair-quotacheck-6.9_2024-02-23' of https://git./linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: online repair of quota counters

This series uses the inode scanner and live update hook functionality
introduced in the last patchset to implement quotacheck on a live
filesystem.  The quotacheck scrubber builds an incore copy of the
dquot resource usage counters and compares it to the live dquots to
report discrepancies.

If the user chooses to repair the quota counters, the repair function
visits each incore dquot to update the counts from the live information.
The live update hooks are key to keeping the incore copy up to date.
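
The check and repair steps can be sketched as follows. This is an illustrative Python model with made-up names and a much reduced notion of a dquot (an inode count and a block count per id); it leaves out the live update hooks that keep the shadow copy current during the scan.

```python
from collections import defaultdict

ZERO = {"icount": 0, "bcount": 0}


def shadow_usage(inodes):
    """Build shadow counters from an inode scan of (owner_id, nblocks) pairs."""
    usage = defaultdict(lambda: dict(ZERO))
    for owner_id, nblocks in inodes:
        usage[owner_id]["icount"] += 1
        usage[owner_id]["bcount"] += nblocks
    return dict(usage)


def quota_discrepancies(shadow, live):
    """Return the shadow counters for every id whose live dquot disagrees."""
    ids = sorted(set(shadow) | set(live))
    return {
        i: shadow.get(i, ZERO)
        for i in ids
        if shadow.get(i, ZERO) != live.get(i, ZERO)
    }


def repair(shadow, live):
    """Overwrite each disagreeing live dquot with the scanned values."""
    for i, counts in quota_discrepancies(shadow, live).items():
        live[i] = dict(counts)
```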

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'repair-quotacheck-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: repair dquots based on live quotacheck results
  xfs: repair cannot update the summary counters when logging quota flags
  xfs: track quota updates during live quotacheck
  xfs: implement live quotacheck inode scan
  xfs: create a sparse load xfarray function
  xfs: create a helper to count per-device inode block usage
  xfs: create a xchk_trans_alloc_empty helper for scrub
  xfs: report the health of quota counts

7 months ago Merge tag 'repair-inode-mode-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux...
Chandan Babu R [Sat, 24 Feb 2024 04:10:39 +0000 (09:40 +0530)]
Merge tag 'repair-inode-mode-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.9-mergeC

xfs: repair inode mode by scanning dirs

One missing piece of functionality in the inode record repair code is
figuring out what to do with a file whose mode is so corrupt that we
cannot tell the type of the file.  Originally this was done by
guessing the mode from the ondisk inode contents, but Christoph didn't
like that because it read from data fork block 0, which could be user
controlled data.

Therefore, I've replaced all that with a directory scanner that looks
for any dirents that point to the file with the garbage mode.  If so,
the ftype in the dirent will tell us exactly what mode to set on the
file.  Since users cannot directly write to the ftype field of a dirent,
this should be safe.
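
The idea of recovering a file type from dirents can be sketched as
below.  This is a hedged Python illustration only: the FT_* constants
are assumed to mirror the XFS_DIR3_FT_* values, mode_from_dirents is a
hypothetical name, and refusing to guess when dirents disagree is an
assumption of this sketch rather than a statement about what the
repair code does.

```python
import stat

# Assumed to match the on-disk dirent ftype values (XFS_DIR3_FT_*).
FT_REG_FILE, FT_DIR, FT_CHRDEV, FT_BLKDEV, FT_FIFO, FT_SOCK, FT_SYMLINK = range(1, 8)

FTYPE_TO_IFMT = {
    FT_REG_FILE: stat.S_IFREG,
    FT_DIR: stat.S_IFDIR,
    FT_CHRDEV: stat.S_IFCHR,
    FT_BLKDEV: stat.S_IFBLK,
    FT_FIFO: stat.S_IFIFO,
    FT_SOCK: stat.S_IFSOCK,
    FT_SYMLINK: stat.S_IFLNK,
}


def mode_from_dirents(target_ino, dirents, perm_bits=0):
    """Derive a mode for target_ino from (ino, ftype) directory entries.

    Returns None when no dirent points at the inode, or when the
    dirents that do point at it disagree about the file type.
    """
    found = {
        FTYPE_TO_IFMT[ftype]
        for ino, ftype in dirents
        if ino == target_ino and ftype in FTYPE_TO_IFMT
    }
    if len(found) != 1:
        return None
    return found.pop() | perm_bits
```

A caller would pass whatever permission bits it managed to salvage;
only the file type portion of the mode comes from the dirent.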

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
* tag 'repair-inode-mode-6.9_2024-02-23' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux:
  xfs: repair file modes by scanning for a dirent pointing to us
  xfs: create a macro for decoding ftypes in tracepoints
  xfs: create a predicate to determine if two xfs_names are the same
  xfs: create a static name for the dot entry too
  xfs: iscan batching should handle unallocated inodes too
  xfs: cache a bunch of inodes for repair scans
  xfs: stagger the starting AG of scrub iscans to reduce contention
  xfs: allow scrub to hook metadata updates in other writers
  xfs: implement live inode scan for scrub
  xfs: speed up xfs_iwalk_adjust_start a little bit

7 months ago xfs: move symlink target write function to libxfs
Darrick J. Wong [Thu, 22 Feb 2024 20:48:20 +0000 (12:48 -0800)]
xfs: move symlink target write function to libxfs

Move xfs_symlink_write_target to xfs_symlink_remote.c so that kernel and
mkfs can share the same function.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: move remote symlink target read function to libxfs
Darrick J. Wong [Thu, 22 Feb 2024 20:45:17 +0000 (12:45 -0800)]
xfs: move remote symlink target read function to libxfs

Move xfs_readlink_bmap_ilocked to xfs_symlink_remote.c so that the
swapext code can use it to convert a remote format symlink back to
shortform format after a metadata repair.  While we're at it, fix a
broken printf prefix.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: move xfs_symlink_remote.c declarations to xfs_symlink_remote.h
Darrick J. Wong [Thu, 22 Feb 2024 20:45:01 +0000 (12:45 -0800)]
xfs: move xfs_symlink_remote.c declarations to xfs_symlink_remote.h

Move declarations for libxfs symlink functions into a separate header
file like we do for most everything else.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: xfs_bmap_finish_one should map unwritten extents properly
Darrick J. Wong [Thu, 22 Feb 2024 20:45:00 +0000 (12:45 -0800)]
xfs: xfs_bmap_finish_one should map unwritten extents properly

The deferred bmap work state and the log item can transmit unwritten
state, so the XFS_BMAP_MAP handler must map in extents with that
unwritten state.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: support deferred bmap updates on the attr fork
Darrick J. Wong [Thu, 22 Feb 2024 20:44:32 +0000 (12:44 -0800)]
xfs: support deferred bmap updates on the attr fork

The deferred bmap update log item has always supported the attr fork, so
plumb this in so that higher layers can access this.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: support recovering bmap intent items targetting realtime extents
Darrick J. Wong [Thu, 22 Feb 2024 20:44:24 +0000 (12:44 -0800)]
xfs: support recovering bmap intent items targetting realtime extents

Now that we have reflink on the realtime device, bmap intent items have
to support remapping extents on the realtime volume.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: add a realtime flag to the bmap update log redo items
Darrick J. Wong [Thu, 22 Feb 2024 20:44:23 +0000 (12:44 -0800)]
xfs: add a realtime flag to the bmap update log redo items

Extend the bmap update (BUI) log items with a new realtime flag that
indicates that the updates apply against a realtime file's data fork.
We'll wire up the actual code later.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: add a xattr_entry helper
Darrick J. Wong [Thu, 22 Feb 2024 20:44:22 +0000 (12:44 -0800)]
xfs: add a xattr_entry helper

Add a helper to translate from the item list head to the attr_intent
item structure and use it to shorten assignments and avoid the need for
extra local variables.

Inspired-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: fix xfs_bunmapi to allow unmapping of partial rt extents
Darrick J. Wong [Thu, 22 Feb 2024 20:44:22 +0000 (12:44 -0800)]
xfs: fix xfs_bunmapi to allow unmapping of partial rt extents

When XFS_BMAPI_REMAP is passed to bunmapi, that means that we want to
remove part of a block mapping without touching the allocator.  For
realtime files with rtextsize > 1, that also means that we should skip
all the code that changes a partial remove request into an unwritten
extent conversion.  IOWs, bunmapi in this mode should handle removing
the mapping from the rt file and nothing else.

Note that XFS_BMAPI_REMAP callers are required to decrement the
reference count and/or free the space manually.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: move xfs_bmap_defer_add to xfs_bmap_item.c
Darrick J. Wong [Thu, 22 Feb 2024 20:44:21 +0000 (12:44 -0800)]
xfs: move xfs_bmap_defer_add to xfs_bmap_item.c

Move the code that adds the incore xfs_bmap_item deferred work data to a
transaction to live with the BUI log item code.  This means that the file
mapping code no longer has to know about the inner workings of the BUI
log items.

As a consequence, we can hide the _get_group helper.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: reuse xfs_bmap_update_cancel_item
Darrick J. Wong [Thu, 22 Feb 2024 20:44:20 +0000 (12:44 -0800)]
xfs: reuse xfs_bmap_update_cancel_item

Reuse xfs_bmap_update_cancel_item to put the AG/RTG and free the item in
a few places that currently open code the logic.

Inspired-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: add a bi_entry helper
Darrick J. Wong [Thu, 22 Feb 2024 20:44:19 +0000 (12:44 -0800)]
xfs: add a bi_entry helper

Add a helper to translate from the item list head to the bmap_intent
structure and use it to shorten assignments and avoid the need for extra
local variables.

Inspired-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: remove xfs_trans_set_bmap_flags
Darrick J. Wong [Thu, 22 Feb 2024 20:44:19 +0000 (12:44 -0800)]
xfs: remove xfs_trans_set_bmap_flags

Remove this single-use helper.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: clean up bmap log intent item tracepoint callsites
Darrick J. Wong [Thu, 22 Feb 2024 20:43:53 +0000 (12:43 -0800)]
xfs: clean up bmap log intent item tracepoint callsites

Pass the incore bmap structure to the tracepoints instead of open-coding
the argument passing.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: split tracepoint classes for deferred items
Darrick J. Wong [Thu, 22 Feb 2024 20:43:43 +0000 (12:43 -0800)]
xfs: split tracepoint classes for deferred items

We're about to start adding support for deferred log intent items for
realtime extents, so split these four types into separate classes so
that we can customize them as the transition happens.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: port refcount repair to the new refcount bag structure
Darrick J. Wong [Thu, 22 Feb 2024 20:43:42 +0000 (12:43 -0800)]
xfs: port refcount repair to the new refcount bag structure

Port the refcount record generating code to use the new refcount bag
data structure.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: create refcount bag structure for btree repairs
Darrick J. Wong [Thu, 22 Feb 2024 20:43:41 +0000 (12:43 -0800)]
xfs: create refcount bag structure for btree repairs

Create a bag structure for refcount information that uses the refcount
bag btree defined in the previous patch.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: hook live rmap operations during a repair operation
Darrick J. Wong [Thu, 22 Feb 2024 20:43:40 +0000 (12:43 -0800)]
xfs: hook live rmap operations during a repair operation

Hook the regular rmap code when an rmapbt repair operation is running so
that we can unlock the AGF buffer to scan the filesystem and keep the
in-memory btree up to date during the scan.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: define an in-memory btree for storing refcount bag info during repairs
Darrick J. Wong [Thu, 22 Feb 2024 20:43:40 +0000 (12:43 -0800)]
xfs: define an in-memory btree for storing refcount bag info during repairs

Create a new in-memory btree type so that we can store refcount bag info
in a much more memory-efficient and performant format.  Recall that the
refcount recordset regenerator computes the new recordset from browsing
the rmap records.  Let's say that the rmap records are:

{agbno: 10, length: 40, ...}
{agbno: 11, length: 3, ...}
{agbno: 12, length: 20, ...}
{agbno: 15, length: 1, ...}

It is convenient to have a data structure that could quickly tell us the
refcount for an arbitrary agbno without wasting memory.  An array or a
list could do that pretty easily.  Lists suck because of the pointer
overhead.  xfarrays are a lot more compact, but we want to minimize
sparse holes in the xfarray to constrain memory usage.  Maintaining any
kind of record order isn't needed for correctness, so I created the
"rcbag", which is shorthand for an unordered list of (excerpted) reverse
mappings.

So we add the first rmap to the rcbag, and it looks like:

0: {agbno: 10, length: 40}

The refcount for agbno 10 is 1.  Then we move on to block 11, so we add
the second rmap:

0: {agbno: 10, length: 40}
1: {agbno: 11, length: 3}

The refcount for agbno 11 is 2.  We move on to block 12, so we add the
third:

0: {agbno: 10, length: 40}
1: {agbno: 11, length: 3}
2: {agbno: 12, length: 20}

The refcount for agbno 12 and 13 is 3.  We move on to block 14, and
remove the second rmap:

0: {agbno: 10, length: 40}
1: NULL
2: {agbno: 12, length: 20}

The refcount for agbno 14 is 2.  We move on to block 15, and add the
last rmap.  But we don't care where it is and we don't want to expand
the array so we put it in slot 1:

0: {agbno: 10, length: 40}
1: {agbno: 15, length: 1}
2: {agbno: 12, length: 20}

The refcount for block 15 is 3.  Notice how order doesn't matter in this
list?  That's why repair uses an unordered list, or "bag".  The data
structure is not a set because it does not guarantee uniqueness.

That said, adding and removing specific items is now an O(n) operation
because we have no idea where that item might be in the list.  Overall,
the runtime is O(n^2) which is bad.
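
The slot-reuse walk above can be reproduced with a short sketch.  This
is a Python illustration of the unordered-bag idea only; RcBag and
walk are hypothetical names, and the linear slot searches are exactly
the O(n) cost that the paragraph above complains about.

```python
class RcBag:
    """Unordered bag of (agbno, length) records stored in reusable slots."""

    def __init__(self):
        self.slots = []

    def add(self, agbno, length):
        """Store a record in the first free slot, growing only if needed."""
        rec = (agbno, length)
        for i, slot in enumerate(self.slots):
            if slot is None:
                self.slots[i] = rec
                return
        self.slots.append(rec)

    def remove_ended(self, agbno):
        """Empty every slot whose mapping ends at or before agbno."""
        for i, slot in enumerate(self.slots):
            if slot is not None and slot[0] + slot[1] <= agbno:
                self.slots[i] = None

    def refcount(self):
        """The refcount at the current block is the number of live records."""
        return sum(slot is not None for slot in self.slots)


def walk(rmaps, blocks):
    """Visit blocks in ascending order and report the refcount at each."""
    bag = RcBag()
    pending = sorted(rmaps)
    counts = {}
    for block in blocks:
        bag.remove_ended(block)
        while pending and pending[0][0] <= block:
            agbno, length = pending.pop(0)
            if agbno + length > block:  # skip mappings that already ended
                bag.add(agbno, length)
        counts[block] = bag.refcount()
    return counts, bag.slots
```

Feeding it the four rmap records from the example and visiting blocks
10, 11, 12, 14 and 15 yields refcounts 1, 2, 3, 2 and 3, with the last
record landing in the slot vacated by the second one.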

I realized that I could easily refactor the btree code and reimplement
the refcount bag with an xfbtree.  Adding and removing is now O(log2 n),
so the runtime is at worst O(n log2 n), which is much faster.  In the
end, the rcbag becomes a sorted list, but that's merely a detail of the
implementation.  The repair code doesn't care.

(Note: That horrible xfs_db bmap_inflate command can be used to exercise
this sort of rcbag insanity by cranking up refcounts quickly.)

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: create a shadow rmap btree during rmap repair
Darrick J. Wong [Thu, 22 Feb 2024 20:43:39 +0000 (12:43 -0800)]
xfs: create a shadow rmap btree during rmap repair

Create an in-memory btree of rmap records instead of an array.  This
enables us to do live record collection instead of freezing the fs.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: repair the rmapbt
Darrick J. Wong [Thu, 22 Feb 2024 20:43:38 +0000 (12:43 -0800)]
xfs: repair the rmapbt

Rebuild the reverse mapping btree from all primary metadata.  This first
patch establishes the bare mechanics of finding records and putting
together a new ondisk tree; more complex pieces are needed to make it
work properly.

Link: Documentation/filesystems/xfs-online-fsck-design.rst
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: create agblock bitmap helper to count the number of set regions
Darrick J. Wong [Thu, 22 Feb 2024 20:43:37 +0000 (12:43 -0800)]
xfs: create agblock bitmap helper to count the number of set regions

In the next patch, the rmap btree repair code will need to estimate the
size of the new ondisk rmapbt.  The size is a function of the number of
records that will be written to disk, and the size of the recordset is
the number of observations made while scanning the filesystem plus the
number of OWN_AG records that will be injected into the rmap btree.

OWN_AG rmap records track the free space btrees, the AGFL, and the new
rmap btree itself.  The repair tool uses a bitmap to record the space
used for all four structures, which is why we need a function to count
the number of set regions.
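
Counting set regions, and using that count in the record estimate, can
be sketched as follows.  This is a Python illustration that models the
bitmap as a plain integer; it is not how the kernel bitmap is stored,
and bitmap_from_extents, count_set_regions and estimate_records are
hypothetical names.

```python
def bitmap_from_extents(extents):
    """Set bits [start, start + length) for each (start, length) extent."""
    bitmap = 0
    for start, length in extents:
        bitmap |= ((1 << length) - 1) << start
    return bitmap


def count_set_regions(bitmap):
    """Count maximal runs of consecutive set bits.

    A run starts at every set bit whose lower neighbour is clear, so
    masking out bits that have a set lower neighbour leaves exactly one
    bit per run.
    """
    run_starts = bitmap & ~(bitmap << 1)
    return bin(run_starts).count("1")


def estimate_records(nr_observed, own_ag_bitmap):
    """Records to write: scan observations plus one record per OWN_AG region."""
    return nr_observed + count_set_regions(own_ag_bitmap)
```

Adjacent extents merge into a single region, which is why the estimate
counts regions rather than the number of extents that were marked.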

A reviewer requested that this be pulled into a separate patch with its
own justification, so here it is.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: launder in-memory btree buffers before transaction commit
Darrick J. Wong [Thu, 22 Feb 2024 20:43:36 +0000 (12:43 -0800)]
xfs: launder in-memory btree buffers before transaction commit

As we've noted in various places, all current users of in-memory btrees
are online fsck.  Online fsck only stages a btree long enough to rebuild
an ondisk data structure, which means that the in-memory btree is
ephemeral.  Furthermore, if we encounter /any/ errors while updating an
in-memory btree, all we do is tear down all the staged data and return
an errno to userspace.  In-memory btrees need not be transactional, so
their buffers should not be committed to the ondisk log, nor should they
be checkpointed by the AIL.  That's just as well since the ephemeral
nature of the btree means that the buftarg and the buffers may disappear
quickly anyway.

Therefore, we need a way to launder the btree buffers that get attached
to the transaction by the generic btree code.  Because the buffers are
directly mapped to backing file pages, there's no need to bwrite them
back to the tmpfs file.  All we need to do is clean enough of the buffer
log item state so that the bli can be detached from the buffer, remove
the bli from the transaction's log item list, and reset the transaction
dirty state as if the laundered items had never been there.

For simplicity, create xfbtree transaction commit and cancel helpers
that launder the in-memory btree buffers for callers.  Once laundered,
call the write verifier on non-stale buffers to avoid integrity issues,
or punch a hole in the backing file for stale buffers.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: create a helper to decide if a file mapping targets the rt volume
Darrick J. Wong [Thu, 22 Feb 2024 20:43:36 +0000 (12:43 -0800)]
xfs: create a helper to decide if a file mapping targets the rt volume

Create a helper so that we can stop open-coding this decision
everywhere.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: support in-memory btrees
Darrick J. Wong [Thu, 22 Feb 2024 20:43:35 +0000 (12:43 -0800)]
xfs: support in-memory btrees

Adapt the generic btree cursor code to be able to create a btree whose
buffers come from a (presumably in-memory) buftarg with a header block
that's specific to in-memory btrees.  We'll connect this to other parts
of online scrub in the next patches.

Note that in-memory btrees always have a block size matching the system
memory page size for efficiency reasons.  There are also a few things we
need to do to finalize a btree update; that's covered in the next patch.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: add a xfs_btree_ptrs_equal helper
Christoph Hellwig [Thu, 22 Feb 2024 20:43:34 +0000 (12:43 -0800)]
xfs: add a xfs_btree_ptrs_equal helper

This only has a single caller and thus might be a bit questionable,
but I think it really improves the readability of
xfs_btree_visit_block.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: support in-memory buffer cache targets
Darrick J. Wong [Thu, 22 Feb 2024 20:43:21 +0000 (12:43 -0800)]
xfs: support in-memory buffer cache targets

Allow the buffer cache to target in-memory files by making it possible
to have a buftarg that maps pages from private shmem files.  As the
previous patch alludes, the in-memory buftarg contains its own cache,
points to a shmem file, and does not point to a block_device.

The next few patches will make it possible to construct an xfs_btree in
pageable memory by using this buftarg.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: teach buftargs to maintain their own buffer hashtable
Darrick J. Wong [Thu, 22 Feb 2024 20:42:58 +0000 (12:42 -0800)]
xfs: teach buftargs to maintain their own buffer hashtable

Currently, cached buffers are indexed by per-AG hashtables.  This works
great for the data device, but won't work for in-memory btrees.  To
handle that use case, buftargs will need to be able to index buffers
independently of other data structures.

We accomplish this by hoisting the rhashtable and its lock into a
separate xfs_buf_cache structure, making the buftarg point to the
_buf_cache structure, and reworking various functions to use it.  This
will enable the in-memory buftarg to come up with its own _buf_cache.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
7 months ago xfs: move setting bt_logical_sectorsize out of xfs_setsize_buftarg
Christoph Hellwig [Thu, 22 Feb 2024 20:42:45 +0000 (12:42 -0800)]
xfs: move setting bt_logical_sectorsize out of xfs_setsize_buftarg

bt_logical_sectorsize and the associated mask are set based on the
constant logical block size in the block_device structure and thus
doesn't need to be updated in xfs_setsize_buftarg.  Move it into
xfs_alloc_buftarg so that it is only done once per buftarg.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: remove xfs_setsize_buftarg_early
Christoph Hellwig [Thu, 22 Feb 2024 20:42:45 +0000 (12:42 -0800)]
xfs: remove xfs_setsize_buftarg_early

Open code the logic in the only caller, and improve the comment
explaining what is being done here.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: remove the xfs_buftarg_t typedef
Christoph Hellwig [Thu, 22 Feb 2024 20:42:44 +0000 (12:42 -0800)]
xfs: remove the xfs_buftarg_t typedef

Switch the few remaining holdouts to the struct version.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: split xfs_buf_rele for cached vs uncached buffers
Christoph Hellwig [Thu, 22 Feb 2024 20:41:02 +0000 (12:41 -0800)]
xfs: split xfs_buf_rele for cached vs uncached buffers

xfs_buf_rele is a bit confusing because it mixes up handling of normal
cached and the special uncached buffers without much explanation.
Split the handling into two different helpers, and use a clearly named
helper that checks the hash key to distinguish the two cases instead
of checking the pag pointer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: move and rename xfs_btree_read_bufl
Christoph Hellwig [Thu, 22 Feb 2024 20:41:01 +0000 (12:41 -0800)]
xfs: move and rename xfs_btree_read_bufl

Despite its name, xfs_btree_read_bufl doesn't contain any btree-related
functionality and isn't used by the btree code.  Move it to xfs_bmap.c,
hard code the refval and ops arguments and rename it to
xfs_bmap_read_buf.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: remove xfs_btree_reada_bufs
Christoph Hellwig [Thu, 22 Feb 2024 20:41:01 +0000 (12:41 -0800)]
xfs: remove xfs_btree_reada_bufs

xfs_btree_reada_bufs just wraps xfs_btree_readahead and an agblock
to daddr conversion.  Just open code its three callsites in the
two callers (one of which isn't even btree related).

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: remove xfs_btree_reada_bufl
Christoph Hellwig [Thu, 22 Feb 2024 20:41:00 +0000 (12:41 -0800)]
xfs: remove xfs_btree_reada_bufl

xfs_btree_reada_bufl just wraps xfs_btree_readahead and an fsblock
to daddr conversion.  Just open code its two callsites in the only
caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: factor out a __xfs_btree_check_lblock_hdr helper
Christoph Hellwig [Thu, 22 Feb 2024 20:40:59 +0000 (12:40 -0800)]
xfs: factor out a __xfs_btree_check_lblock_hdr helper

This will allow sharing code with the in-memory block checking helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: rename btree helpers that depends on the block number representation
Christoph Hellwig [Thu, 22 Feb 2024 20:40:58 +0000 (12:40 -0800)]
xfs: rename btree helpers that depends on the block number representation

All these helpers hardcode fsblocks or agblocks and not just the pointer
size.  Rename them so that the names are still fitting when we add the
long format in-memory blocks and adjust the checks when calling them to
check the btree types and not just pointer length.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: consolidate btree block verification
Christoph Hellwig [Thu, 22 Feb 2024 20:40:57 +0000 (12:40 -0800)]
xfs: consolidate btree block verification

Add a __xfs_btree_check_block helper that can be called by the scrub code
to validate a btree block of any form, and move the duplicate error
handling code from xfs_btree_check_sblock and xfs_btree_check_lblock into
xfs_btree_check_block and thus remove these two helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: tighten up validation of root block in inode forks
Christoph Hellwig [Thu, 22 Feb 2024 20:40:57 +0000 (12:40 -0800)]
xfs: tighten up validation of root block in inode forks

Check that root blocks that sit in the inode fork and thus have a NULL
bp don't have siblings.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: remove the crc variable in __xfs_btree_check_lblock
Christoph Hellwig [Thu, 22 Feb 2024 20:40:56 +0000 (12:40 -0800)]
xfs: remove the crc variable in __xfs_btree_check_lblock

crc is only used once, just use the xfs_has_crc check directly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: misc cleanups for __xfs_btree_check_sblock
Christoph Hellwig [Thu, 22 Feb 2024 20:40:55 +0000 (12:40 -0800)]
xfs: misc cleanups for __xfs_btree_check_sblock

Remove the local crc variable that is only used once and remove the bp
NULL checking as it can't ever be NULL for short form blocks.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: consolidate btree ptr checking
Christoph Hellwig [Thu, 22 Feb 2024 20:40:54 +0000 (12:40 -0800)]
xfs: consolidate btree ptr checking

Merge xfs_btree_check_sptr and xfs_btree_check_lptr into a single
__xfs_btree_check_ptr that can be shared between xfs_btree_check_ptr
and the scrub code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: open code xfs_btree_check_lptr in xfs_bmap_btree_to_extents
Christoph Hellwig [Thu, 22 Feb 2024 20:40:53 +0000 (12:40 -0800)]
xfs: open code xfs_btree_check_lptr in xfs_bmap_btree_to_extents

xfs_bmap_btree_to_extents always passes a level of 1 to
xfs_btree_check_lptr, thus making the level check redundant.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: simplify xfs_btree_check_lblock_siblings
Christoph Hellwig [Thu, 22 Feb 2024 20:40:53 +0000 (12:40 -0800)]
xfs: simplify xfs_btree_check_lblock_siblings

Stop using xfs_btree_check_lptr in xfs_btree_check_lblock_siblings,
as it only duplicates the xfs_verify_fsbno call in the other leg of
if / else besides adding a tautological level check.

With this the cur and level arguments can be removed as they are
now unused.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: simplify xfs_btree_check_sblock_siblings
Christoph Hellwig [Thu, 22 Feb 2024 20:40:52 +0000 (12:40 -0800)]
xfs: simplify xfs_btree_check_sblock_siblings

Stop using xfs_btree_check_sptr in xfs_btree_check_sblock_siblings,
as it only duplicates the xfs_verify_agbno call in the other leg of
if / else besides adding a tautological level check.

With this the cur and level arguments can be removed as they are
now unused.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: remove xfs_btnum_t
Christoph Hellwig [Thu, 22 Feb 2024 20:40:51 +0000 (12:40 -0800)]
xfs: remove xfs_btnum_t

The last checks for bc_btnum can be replaced with helpers that check
the btree ops.  This allows adding new btrees to XFS without having
to update a global enum.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: complete the ops predicates]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: pass a 'bool is_finobt' to xfs_inobt_insert
Christoph Hellwig [Thu, 22 Feb 2024 20:40:50 +0000 (12:40 -0800)]
xfs: pass a 'bool is_finobt' to xfs_inobt_insert

This is one of the last users of xfs_btnum_t and can only designate
either the inobt or finobt.  Replace it with a simple bool.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: split xfs_inobt_init_cursor
Christoph Hellwig [Thu, 22 Feb 2024 20:40:49 +0000 (12:40 -0800)]
xfs: split xfs_inobt_init_cursor

Split xfs_inobt_init_cursor into separate routines for the inobt and
finobt to prepare for the removal of the xfs_btnum global enumeration
of btree types.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: split xfs_inobt_insert_sprec
Christoph Hellwig [Thu, 22 Feb 2024 20:40:48 +0000 (12:40 -0800)]
xfs: split xfs_inobt_insert_sprec

Split the finobt version that never merges and uses a different cursor
out of xfs_inobt_insert_sprec to prepare for removing xfs_btnum_t.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: remove the which variable in xchk_iallocbt
Christoph Hellwig [Thu, 22 Feb 2024 20:40:48 +0000 (12:40 -0800)]
xfs: remove the which variable in xchk_iallocbt

The which variable that holds a btree number is passed to two functions
that ignore it and used in a single check that can check the sm_type
as well.  Remove it to unclutter the code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: remove the btnum argument to xfs_inobt_count_blocks
Christoph Hellwig [Thu, 22 Feb 2024 20:40:47 +0000 (12:40 -0800)]
xfs: remove the btnum argument to xfs_inobt_count_blocks

xfs_inobt_count_blocks is only used for the finobt.  Hardcode the btnum
argument and rename the function to match that.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: remove xfs_inobt_cur
Christoph Hellwig [Thu, 22 Feb 2024 20:40:46 +0000 (12:40 -0800)]
xfs: remove xfs_inobt_cur

This helper provides no real advantage over just open coding the two
calls in it in the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: split xfs_allocbt_init_cursor
Christoph Hellwig [Thu, 22 Feb 2024 20:40:12 +0000 (12:40 -0800)]
xfs: split xfs_allocbt_init_cursor

Split xfs_allocbt_init_cursor into separate routines for the by-bno
and by-cnt btrees to prepare for the removal of the xfs_btnum global
enumeration of btree types.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: refactor the btree cursor allocation logic in xchk_ag_btcur_init
Christoph Hellwig [Thu, 22 Feb 2024 20:39:48 +0000 (12:39 -0800)]
xfs: refactor the btree cursor allocation logic in xchk_ag_btcur_init

Change xchk_ag_btcur_init to allocate all cursors first and only then
check if we should delete them again because the btree is too damaged.

This allows reusing the sick_mask in struct xfs_btree_ops and simplifies
the code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: add a sick_mask to struct xfs_btree_ops
Christoph Hellwig [Thu, 22 Feb 2024 20:39:47 +0000 (12:39 -0800)]
xfs: add a sick_mask to struct xfs_btree_ops

Clean up xfs_btree_mark_sick by adding a sick_mask to the btree-ops
for all AG-root btrees.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: add a name field to struct xfs_btree_ops
Christoph Hellwig [Thu, 22 Feb 2024 20:39:47 +0000 (12:39 -0800)]
xfs: add a name field to struct xfs_btree_ops

The btnum in struct xfs_btree_ops is often used for printing a symbolic
name for the btree.  Add a name field to the ops structure and use that
directly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
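
A minimal sketch of the idea in the commit above, not the actual kernel code: the structure, the instances, and format_btree_msg() are hypothetical stand-ins. It shows a message being built from a name stored in the ops structure, with no lookup keyed on a btree number.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for a btree ops structure. */
struct btree_ops {
	const char	*name;		/* symbolic name used in messages */
	size_t		rec_len;	/* illustrative extra field */
};

/* Illustrative instances; the names and sizes are assumptions. */
const struct btree_ops bnobt_ops = { .name = "bno", .rec_len = 8 };
const struct btree_ops cntbt_ops = { .name = "cnt", .rec_len = 8 };

/* Build a message straight from ops->name. */
int format_btree_msg(char *buf, size_t len, const struct btree_ops *ops,
		     int level)
{
	return snprintf(buf, len, "%sbt level %d", ops->name, level);
}
```

Any caller that already holds the ops pointer can print the name without a separate number-to-string table.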
7 months ago xfs: split the agf_roots and agf_levels arrays
Christoph Hellwig [Thu, 22 Feb 2024 20:39:46 +0000 (12:39 -0800)]
xfs: split the agf_roots and agf_levels arrays

Using arrays of largely unrelated fields that use the btree number
as index is not very robust.  Split the arrays into three separate
fields instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
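
The change described above might look roughly like the following before/after sketch. All type and field names here are hypothetical and chosen for illustration; they are not taken from the kernel headers.

```c
#include <stdint.h>

/* Before: unrelated fields packed into arrays indexed by a btree number. */
enum btnum { BTNUM_BNO, BTNUM_CNT, BTNUM_RMAP, BTNUM_MAX };

struct agf_old {
	uint32_t	roots[BTNUM_MAX];
	uint32_t	levels[BTNUM_MAX];
};

/* After: one named field per btree, so no index can select the wrong slot. */
struct agf_new {
	uint32_t	bno_root;
	uint32_t	cnt_root;
	uint32_t	rmap_root;
	uint32_t	bno_level;
	uint32_t	cnt_level;
	uint32_t	rmap_level;
};

/* Copy the array-based layout into the split layout. */
struct agf_new agf_split(const struct agf_old *old)
{
	struct agf_new new = {
		.bno_root	= old->roots[BTNUM_BNO],
		.cnt_root	= old->roots[BTNUM_CNT],
		.rmap_root	= old->roots[BTNUM_RMAP],
		.bno_level	= old->levels[BTNUM_BNO],
		.cnt_level	= old->levels[BTNUM_CNT],
		.rmap_level	= old->levels[BTNUM_RMAP],
	};

	return new;
}
```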
7 months ago xfs: remove xfs_bmbt_stage_cursor
Christoph Hellwig [Thu, 22 Feb 2024 20:39:45 +0000 (12:39 -0800)]
xfs: remove xfs_bmbt_stage_cursor

Just open code the two calls in the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: fold xfs_bmbt_init_common into xfs_bmbt_init_cursor
Christoph Hellwig [Thu, 22 Feb 2024 20:39:44 +0000 (12:39 -0800)]
xfs: fold xfs_bmbt_init_common into xfs_bmbt_init_cursor

Make the levels initialization in xfs_bmbt_init_cursor conditional
and merge the two helpers.

This requires the fakeroot case to now pass a -1 whichfork directly
into xfs_bmbt_init_cursor, and some special casing for that, but
at least this scheme to deal with the fake btree root is handled and
documented in one place now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: tidy up a multiline ternary]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: make staging file forks explicit
Darrick J. Wong [Thu, 22 Feb 2024 20:39:43 +0000 (12:39 -0800)]
xfs: make staging file forks explicit

Don't open-code "-1" for whichfork when we're creating a staging btree
for a repair; let's define an actual symbol to make grepping and
understanding easier.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
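
As a rough illustration of the commit above, a named symbol can replace the bare "-1". The macro names and the helper below are hypothetical, not the kernel's definitions.

```c
/* Hypothetical fork selectors, for illustration only. */
#define DATA_FORK	0
#define ATTR_FORK	1
#define STAGING_FORK	(-1)	/* staging btree for a repair; no real fork */

/* Callers test against the symbol, which is easy to grep for. */
int is_staging_fork(int whichfork)
{
	return whichfork == STAGING_FORK;
}
```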
7 months ago xfs: make full use of xfs_btree_stage_ifakeroot in xfs_bmbt_stage_cursor
Christoph Hellwig [Thu, 22 Feb 2024 20:39:43 +0000 (12:39 -0800)]
xfs: make full use of xfs_btree_stage_ifakeroot in xfs_bmbt_stage_cursor

Remove the duplicate cur->bc_nlevels assignment in xfs_bmbt_stage_cursor,
and move the cur->bc_ino.forksize assignment into
xfs_btree_stage_ifakeroot as it is part of setting up the fake btree
root.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: remove xfs_rmapbt_stage_cursor
Christoph Hellwig [Thu, 22 Feb 2024 20:39:42 +0000 (12:39 -0800)]
xfs: remove xfs_rmapbt_stage_cursor

xfs_rmapbt_stage_cursor is currently unused, but future callers can
trivially open code the two calls.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: fold xfs_rmapbt_init_common into xfs_rmapbt_init_cursor
Christoph Hellwig [Thu, 22 Feb 2024 20:39:41 +0000 (12:39 -0800)]
xfs: fold xfs_rmapbt_init_common into xfs_rmapbt_init_cursor

Make the levels initialization in xfs_rmapbt_init_cursor conditional
and merge the two helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: remove xfs_refcountbt_stage_cursor
Christoph Hellwig [Thu, 22 Feb 2024 20:39:40 +0000 (12:39 -0800)]
xfs: remove xfs_refcountbt_stage_cursor

Just open code the two calls in the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: fold xfs_refcountbt_init_common into xfs_refcountbt_init_cursor
Christoph Hellwig [Thu, 22 Feb 2024 20:39:39 +0000 (12:39 -0800)]
xfs: fold xfs_refcountbt_init_common into xfs_refcountbt_init_cursor

Make the levels initialization in xfs_refcountbt_init_cursor conditional
and merge the two helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: remove xfs_inobt_stage_cursor
Christoph Hellwig [Thu, 22 Feb 2024 20:39:39 +0000 (12:39 -0800)]
xfs: remove xfs_inobt_stage_cursor

Just open code the two calls in the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: fold xfs_inobt_init_common into xfs_inobt_init_cursor
Christoph Hellwig [Thu, 22 Feb 2024 20:39:38 +0000 (12:39 -0800)]
xfs: fold xfs_inobt_init_common into xfs_inobt_init_cursor

Make the levels initialization in xfs_inobt_init_cursor conditional
and merge the two helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: remove xfs_allocbt_stage_cursor
Christoph Hellwig [Thu, 22 Feb 2024 20:39:37 +0000 (12:39 -0800)]
xfs: remove xfs_allocbt_stage_cursor

Just open code the two calls in the callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: fold xfs_allocbt_init_common into xfs_allocbt_init_cursor
Christoph Hellwig [Thu, 22 Feb 2024 20:39:36 +0000 (12:39 -0800)]
xfs: fold xfs_allocbt_init_common into xfs_allocbt_init_cursor

Make the levels initialization in xfs_allocbt_init_cursor conditional
and merge the two helpers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: don't override bc_ops for staging btrees
Christoph Hellwig [Thu, 22 Feb 2024 20:37:35 +0000 (12:37 -0800)]
xfs: don't override bc_ops for staging btrees

Add a few conditionals for staging btrees to the core btree code instead
of overloading the bc_ops vector.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: add a xfs_btree_init_ptr_from_cur
Christoph Hellwig [Thu, 22 Feb 2024 20:37:26 +0000 (12:37 -0800)]
xfs: add a xfs_btree_init_ptr_from_cur

Inode-rooted btrees don't need to initialize the root pointer in the
->init_ptr_from_cur method as the root is found by the
xfs_btree_get_iroot method later.  Make ->init_ptr_from_cur optional
for inode-rooted btrees by providing a helper that does the right
thing for the given btree type and also documents the semantics.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: move comment about two 2 keys per pointer in the rmap btree
Christoph Hellwig [Thu, 22 Feb 2024 20:37:25 +0000 (12:37 -0800)]
xfs: move comment about two 2 keys per pointer in the rmap btree

Move it to the relevant initialization of the ops structure instead
of a place that has nothing to do with the key size.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: create predicate to determine if cursor is at inode root level
Darrick J. Wong [Thu, 22 Feb 2024 20:37:24 +0000 (12:37 -0800)]
xfs: create predicate to determine if cursor is at inode root level

Create a predicate to decide if the given cursor and level point to the
root block in the inode immediate area instead of a disk block, and get
rid of the open-coded logic everywhere.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
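
A predicate of the kind described above could be sketched as follows. The types and the function name are hypothetical. The sketch assumes that leaves sit at level 0, so the root is at level nlevels - 1, and that only inode-rooted btrees keep their root in the inode.

```c
#include <stdbool.h>

/* Hypothetical, simplified cursor; not the kernel's structure. */
enum btree_type { BTREE_TYPE_AG, BTREE_TYPE_INODE };

struct btree_cur {
	enum btree_type	type;
	int		nlevels;	/* tree height; root is at nlevels - 1 */
};

/*
 * True if (cur, level) refers to the root block held in the inode
 * rather than to a block on disk.
 */
bool btree_at_iroot(const struct btree_cur *cur, int level)
{
	return cur->type == BTREE_TYPE_INODE && level == cur->nlevels - 1;
}
```

Callers can then write a single call where they previously repeated both comparisons.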
7 months ago xfs: split the per-btree union in struct xfs_btree_cur
Christoph Hellwig [Thu, 22 Feb 2024 20:37:03 +0000 (12:37 -0800)]
xfs: split the per-btree union in struct xfs_btree_cur

Split up the union that encodes btree-specific fields in struct
xfs_btree_cur.  Most fields in there are specific to the btree type
encoded in xfs_btree_ops.type, and we can use the obviously named union
for that.  But one field is specific to the bmapbt and two are shared by
the refcount and rtrefcountbt.  Move those to a separate union to make
the usage clear and not need a separate struct for the refcount-related
fields.

This will also make unnecessary some very awkward btree cursor
refc/rtrefc switching logic in the rtrefcount patchset.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
7 months ago xfs: split out a btree type from the btree ops geometry flags
Christoph Hellwig [Thu, 22 Feb 2024 20:36:17 +0000 (12:36 -0800)]
xfs: split out a btree type from the btree ops geometry flags

Two of the btree cursor flags are always used together and encode
the fundamental btree type.  There currently are two such types:

 1) an on-disk AG-rooted btree with 32-bit pointers
 2) an on-disk inode-rooted btree with 64-bit pointers

and we're about to add:

 3) an in-memory btree with 64-bit pointers

Introduce a new enum and a new type field in struct xfs_btree_geom
to encode this type directly instead of using flags and change most
code to switch on this enum.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: make the pointer lengths explicit]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
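
The three types listed in the commit above can be encoded as an enum that code switches on, instead of testing a pair of flags. This is only a sketch: the enumerator names and the helper are hypothetical.

```c
/* Hypothetical enum mirroring the three btree types listed above. */
enum btree_type {
	BTREE_TYPE_AG,		/* on-disk, AG-rooted, 32-bit pointers */
	BTREE_TYPE_INODE,	/* on-disk, inode-rooted, 64-bit pointers */
	BTREE_TYPE_MEM,		/* in-memory, 64-bit pointers */
};

/* Switch on the type directly rather than decoding flag combinations. */
const char *btree_type_desc(enum btree_type type)
{
	switch (type) {
	case BTREE_TYPE_AG:
		return "AG-rooted";
	case BTREE_TYPE_INODE:
		return "inode-rooted";
	case BTREE_TYPE_MEM:
		return "in-memory";
	}
	return "unknown";
}
```

With an enum, a compiler can warn when a switch misses a case, which a pair of independent flag bits does not allow.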
7 months ago xfs: store the btree pointer length in struct xfs_btree_ops
Darrick J. Wong [Thu, 22 Feb 2024 20:35:36 +0000 (12:35 -0800)]
xfs: store the btree pointer length in struct xfs_btree_ops

Make the pointer length an explicit field in the btree operations
structure so that the next patch (which introduces an explicit btree
type enum) doesn't have to play a bunch of awkward games with inferring
the pointer length from the enumeration.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
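
One way to picture the commit above is an ops structure that carries the pointer length itself, so that offset arithmetic reads it directly and never derives it from flags or a type. The structure, macros, and helper below are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops structure with an explicit pointer length. */
struct btree_ops {
	size_t	ptr_len;	/* bytes per block pointer */
};

#define SHORT_PTR_LEN	sizeof(uint32_t)	/* 32-bit pointers */
#define LONG_PTR_LEN	sizeof(uint64_t)	/* 64-bit pointers */

/* Byte offset of the index-th pointer in an array starting at base. */
size_t btree_ptr_offset(const struct btree_ops *ops, size_t base,
			unsigned int index)
{
	return base + (size_t)index * ops->ptr_len;
}
```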