Kent Overstreet [Sat, 24 May 2025 00:11:43 +0000 (20:11 -0400)]
bcachefs: Fix btree_iter_next_node() for new locking asserts
We can't unlock a should_be_locked path unless we're in a transaction
restart.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 23 May 2025 18:03:06 +0000 (14:03 -0400)]
bcachefs: Ensure we don't use a blacklisted journal seq
Different versions differ on the size of the blacklist range; it is
theoretically possible that we could end up with blacklisted journal
sequence numbers newer than the newest seq we find in the journal, and
pick a new start seq that's blacklisted.
Explicitly check for this in bch2_fs_journal_start().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 23 May 2025 18:19:25 +0000 (14:19 -0400)]
bcachefs: Small check_fix_ptr fixes
We don't want to change the bucket gen, on gen mismatch: it's possible
to have multiple btree nodes with different gens in the same bucket that
we want to keep, if we have to recover from btree node scan.
It's also not necessary to set g->gen_valid; add a comment to that
effect.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 23 May 2025 22:31:53 +0000 (18:31 -0400)]
bcachefs: Fix opts.recovery_pass_last
This was lost in the giant recovery pass rework - but it's used heavily
by bcachefs subcommand utilities.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 23 May 2025 22:30:10 +0000 (18:30 -0400)]
bcachefs: Fix allocate -> self healing path
When we go to allocate and find taht a bucket in the freespace btree is
actually allocated, we're supposed to return nonzero to tell the
allocator to skip it.
This fixes an emergency read only due to a bucket/ptr gen mismatch - we
also don't return the correct bucket gen when this happens.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 23 May 2025 17:13:44 +0000 (13:13 -0400)]
bcachefs: Fix endianness in casefold check/repair
Fixes:
010c89468134 ("bcachefs: Check for casefolded dirents in non casefolded dirs")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 10 Apr 2024 03:53:57 +0000 (23:53 -0400)]
bcachefs: Path must be locked if trans->locked && should_be_locked
If path->should_be_locked is true, that means user code (of the btree
API) has seen, in this transaction, something guarded by the node this
path has locked, and we have to keep it locked until the end of the
transaction.
Assert that we're not violating this; should_be_locked should also be
cleared only in _very_ special situations.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 22 May 2025 19:52:15 +0000 (15:52 -0400)]
bcachefs: Simplify bch2_path_put()
Simplify the "do we need to keep this locked?" checks.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 22 May 2025 19:33:14 +0000 (15:33 -0400)]
bcachefs: Plumb btree_trans for more locking asserts
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 22 May 2025 20:04:15 +0000 (16:04 -0400)]
bcachefs: Clear trans->locked before unlock
We're adding new should_be_locked assertions: it's going to be illegal
to unlock a should_be_locked path when trans->locked is true.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 22 May 2025 20:03:08 +0000 (16:03 -0400)]
bcachefs: Clear should_be_locked before unlock in key_cache_drop()
We're adding new should_be_locked assertions, also add a comment
explaining why clearing should_be_locked is safe here.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 22 May 2025 22:12:54 +0000 (18:12 -0400)]
bcachefs: bch2_path_get() reuses paths if upgrade_fails & !should_be_locked
Small additional optimization over the previous patch, bringing us
closer to the original behaviour, except when we need to clone to avoid
a transaction restart.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 22 May 2025 22:00:45 +0000 (18:00 -0400)]
bcachefs: Give out new path if upgrade fails
Avoid transaction restarts due to failure to upgrade - we can traverse a
new iterator without a transaction restart.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 22 May 2025 20:54:31 +0000 (16:54 -0400)]
bcachefs: Fix btree_path_get_locks when not doing trans restart
btree_path_get_locks, on failure, shouldn't unlock if we're not issuing
a transaction restart: we might drop locks we're not supposed to (if
path->should_be_locked is set).
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 22 May 2025 22:03:32 +0000 (18:03 -0400)]
bcachefs: btree_node_locked_type_nowrite()
Small helper to improve locking assertions.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 22 May 2025 19:40:24 +0000 (15:40 -0400)]
bcachefs: Kill bch2_path_put_nokeep()
bch2_path_put_nokeep() was intended for paths we wouldn't need to
preserve for a transaction restart - it always frees them right away
when the ref hits 0.
But since paths are shared, freeing unconditionally is a bug, the path
might have been used elsewhere and have should_be_locked set, i.e. we
need to keep it locked until the end of the transaction.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 7 May 2025 01:54:35 +0000 (21:54 -0400)]
bcachefs: bch2_journal_write_checksum()
We need to delay checksumming the journal write; we don't know the
blocksize until after we allocate the write.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 22 May 2025 16:50:22 +0000 (12:50 -0400)]
bcachefs: Reduce stack usage in data_update_index_update()
Separate tracepoint message generation and other slowpath code into
non-inline functions, and use bch2_trans_log_str() instead of using a
printbuf for our journal message.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 22 May 2025 16:49:56 +0000 (12:49 -0400)]
bcachefs: bch2_trans_log_str()
The data update path doesn't need a printbuf for its log message - this
will help reduce stack usage.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 22 May 2025 16:34:40 +0000 (12:34 -0400)]
bcachefs: Kill bkey_buf usage in data_update_index_update()
Reduce stack usage - bkey_buf has a 96 byte buffer on the stack, but the
btree_trans bump allocator works just fine here.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 21 May 2025 19:54:56 +0000 (15:54 -0400)]
bcachefs: Drop empty accounting updates
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 21 May 2025 07:19:18 +0000 (03:19 -0400)]
bcachefs: Improve trace_trans_restart_upgrade
- Convert to a 'fs_str' tracepoint that just emits as a string: this
lets us build up the tracepoint with a printbuf, using our pretty
printers, and they're much easier to manage
- Include locks_held, before and after
- Include the btree node pointer we failed on (error pointer, null, or
real node)
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 21 May 2025 02:59:58 +0000 (22:59 -0400)]
bcachefs: fix bch2_inum_snapshot_to_path()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 21 May 2025 00:15:39 +0000 (20:15 -0400)]
bcachefs: fix duplicate printk
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 19 May 2025 14:31:44 +0000 (10:31 -0400)]
bcachefs: BCH_INODE_has_case_insensitive
Add a flag for tracking whether a directory has case-insensitive
descendents - so that overlayfs can disallow mounting, even though the
filesystem supports case insensitivity.
This is a new on disk format version, with a (cheap) upgrade to ensure
the flag is correctly set on existing inodes.
Create, rename and fssetxattr are all plumbed to ensure the new flag is
set, and we've got new fsck code that hooks into check_inode(0.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 19 May 2025 14:10:19 +0000 (10:10 -0400)]
bcachefs: bch2_inode_find_by_inum_snapshot()
Move a fsck.c helper into inode.c, eliminate some duplicate and organize
the inode lookup helpers.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 19 May 2025 13:48:50 +0000 (09:48 -0400)]
bcachefs: bch2_inum_snapshot_to_path()
Add a better helper for printing out paths of inodes when we don't know
the subvolume, for fsck.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 19 May 2025 13:17:39 +0000 (09:17 -0400)]
bcachefs: bch2_rename_trans() only runs rename-to-dir code if needed
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 19 May 2025 13:12:49 +0000 (09:12 -0400)]
bcachefs: subvol_inum_eq()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 19 May 2025 13:15:31 +0000 (09:15 -0400)]
bcachefs: Don't set bi_casefold on non directories
bi_casefold only makes sense for directories, and since it's one of the
variable length fields setting it unnecessarily wastes space.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Alan Huang [Mon, 19 May 2025 11:51:04 +0000 (19:51 +0800)]
bcachefs: Remove duplicate call to bch2_trans_begin()
There is one in for_each_btree_key_max().
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 16 May 2025 20:45:44 +0000 (16:45 -0400)]
bcachefs: Call bch2_bkey_set_needs_rebalance() earlier in write path
There's no reason to be running this inside our transaction; it forces
us to copy the key we're updating to a temporary, which we'd like to
skip.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 16 May 2025 21:07:06 +0000 (17:07 -0400)]
bcachefs: Simplify bch2_extent_atomic_end()
It used to be that we had a fixed maximum number of btree paths to work
with - 64.
That's no longer the case, so bch2_extent_atomic_end() doesn't have to
be as strict.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 17 May 2025 00:43:18 +0000 (20:43 -0400)]
bcachefs: Coalesce accounting in trans commit
Accounting has gotten quite heavy, and there's lots of redundancy in
accounting updates within a transaction, as we often add/delete multiple
extents that touch the same accountign counters.
This will reduce the amount of data that we journal, and reduce pressure
downstream on the btree write buffer.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 17 May 2025 00:26:01 +0000 (20:26 -0400)]
bcachefs: Split out accounting in transaction commit
There can be a lot of rendundancy in accounting updates within a single
btree transaction.
Split out accounting updates so that they can be deduped, in the next
commit.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 17 May 2025 00:23:58 +0000 (20:23 -0400)]
bcachefs: btree_trans_subbuf
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 18 May 2025 02:32:51 +0000 (22:32 -0400)]
bcachefs: Make accounting mismatch errors more readable
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 17 May 2025 23:54:39 +0000 (19:54 -0400)]
bcachefs: async objs now support bch_write_ops
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 17 May 2025 23:53:50 +0000 (19:53 -0400)]
bcachefs: fix bch2_debugfs_flush_buf() when tabstops are in use
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 17 May 2025 19:58:23 +0000 (15:58 -0400)]
bcachefs: fsck: Include loops in error messages
This fixes the subvol loop checking and directory loop checking to print
the loop.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 9 May 2025 21:01:05 +0000 (17:01 -0400)]
bcachefs: bch2_check_bucket_backpointer_mismatch()
Detect buckets with missing backpointers, and run repair on demand.
__bch2_move_data_phys() now calls
bch2_check_bucket_backpointer_mismatch() as it walks buckets, which
checks for missing backpointers by comparing backpointers against bucket
sector counts.
When missing backpointers are detected, we kick off
bch2_check_extents_to_backpointers() asynchronously - right away if
we're trying to evacuate, or with a threshold if we're just running
copygc.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 17 May 2025 19:05:26 +0000 (15:05 -0400)]
bcachefs: Improve bucket_bitmap code
Add some more helpers, and mismatches is now a superset of the empty
bitmap - simplifies most checks.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 03:28:01 +0000 (23:28 -0400)]
bcachefs: Run recovery passes asynchronously
When we request a recovery pass to be run online, i.e. not during
recovery, if it's an online pass it'll now be run in the background,
instead of waiting for the next mount.
To avoid situations where recovery passes are running continuously, this
also includes ratelimiting: if the RUN_RECOVERY_PASS_ratelimit flag is
passed, the pass may be deferred until later - depending on the runtime
and last run stats in the recovery_passes superblock section.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 14 May 2025 19:54:20 +0000 (15:54 -0400)]
bcachefs: bch2_run_explicit_recovery_pass() cleanup
Consolidate the run_explicit_recovery_pass() interfaces by adding a
flags parameter; this will also let us add a RUN_RECOVERY_PASS_ratelimit
flag.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 22:23:41 +0000 (18:23 -0400)]
bcachefs: bch2_recovery_pass_status_to_text()
Show recovery pass status in sysfs - important now that we're running
them automatically in the background.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 13 May 2025 21:36:55 +0000 (17:36 -0400)]
bcachefs: Reduce usage of recovery.curr_pass
We want recovery.curr_pass to be private to the recovery passes code,
for better showing recovery pass status; also, it may rewind and is
generally not the correct member to use.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 21:45:45 +0000 (17:45 -0400)]
bcachefs: __bch2_run_recovery_passes()
Consolidate bch2_run_recovery_passes() and
bch2_run_online_recovery_passes(), prep work for automatically
scheduling and running recovery passes in the background.
- Now takes a mask of which passes to run, automatic background repair
will pass in sb.recovery_passes_required.
- Skips passes that are failing: a pass that failed may be reattempted
after another pass succeeds (some passes depend on repair done by
other passes for successful completion).
- bch2_recovery_passes_match() helper to skip alloc passes on a
filesystem without alloc info.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 22:21:49 +0000 (18:21 -0400)]
bcachefs: struct bch_fs_recovery
bch_fs has gotten obnoxiously big, let's start organizing thins a bit
better.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 17 May 2025 03:12:09 +0000 (23:12 -0400)]
bcachefs: kill copy in bch2_disk_accounting_mod()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 16 May 2025 21:29:53 +0000 (17:29 -0400)]
bcachefs: Optimize bch2_trans_start_alloc_update()
Avoid doing more updates if we already have one.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 15 May 2025 11:45:52 +0000 (07:45 -0400)]
bcachefs: btree key cache asserts
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 16 May 2025 21:18:27 +0000 (17:18 -0400)]
bcachefs: journal path now uses discard_opt_enabled()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 16 May 2025 21:21:00 +0000 (17:21 -0400)]
bcachefs: relock_fail tracepoint now includes btree
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 15 May 2025 14:08:06 +0000 (10:08 -0400)]
bcachefs: do_rebalance_scan() now only updates bch_extent_rebalance
This ensures that our pending rebalance work accounting is accurate
quickly.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 15 May 2025 13:15:24 +0000 (09:15 -0400)]
bcachefs: better error message for subvol_fs_path_parent_wrong
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 15 May 2025 12:41:26 +0000 (08:41 -0400)]
bcachefs: Improve bch2_repair_inode_hash_info()
Improve this so it can be used by fsck.c check_inode(); it provides a
much better error message than the check_inode() version.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 15 May 2025 12:31:02 +0000 (08:31 -0400)]
bcachefs: bch2_inode_find_snapshot_root()
Factor out a small common helper.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Alan Huang [Mon, 12 May 2025 18:44:26 +0000 (02:44 +0800)]
bcachefs: Early return to avoid unnecessary lock
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Alan Huang [Thu, 15 May 2025 14:29:50 +0000 (22:29 +0800)]
bcachefs: Kill BTREE_TRIGGER_bucket_invalidate
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 14 May 2025 21:58:00 +0000 (17:58 -0400)]
bcachefs: Fix opt hooks in sysfs for non sb option
We weren't checking if the option changed for non-superblock options -
this led to rebalance not waking up when enabling the
"rebalance_enabled" option.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 14 May 2025 14:44:21 +0000 (10:44 -0400)]
bcachefs: fix can_write_extent()
Failing to check the return value of bch2_dev_rcu(): we could
(technically) race with device removal.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 13 May 2025 17:49:51 +0000 (13:49 -0400)]
bcachefs: Add tracepoint, counter for io_move_created_rebalance
Internal moves shouldn't add new rebalance_work, but it's been reported
that this seems to be happening. Add a tracepoint and counter so we can
see what's going on.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 8 May 2025 21:19:10 +0000 (17:19 -0400)]
bcachefs: move_buckets in rhashtable when allocated
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 8 May 2025 21:17:17 +0000 (17:17 -0400)]
bcachefs: Move pending buckets queue to buckets_in_flight
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 8 May 2025 21:01:49 +0000 (17:01 -0400)]
bcachefs: kill move_bucket_in_flight
Small cleanup/simplification, and prep work for the next patch, which
will add checking if buckets don't get evacuated because they're missing
backpointers.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 13 May 2025 14:53:23 +0000 (10:53 -0400)]
bcachefs: bch2_fs_emergency_read_only2()
More error message cleanup: instead of multiple printk()s per error, we
want to be building up a single error message in a printbuf, so that it
can be printed with indenting that shows grouping and avoid errors
getting interspersed or lost in the log.
This gets rid of most calls to bch2_fs_emergency_read_only(). We still
have calls to
- bch2_fatal_error()
- bch2_fs_fatal_error()
- bch2_fs_fatal_err_on()
that need work.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 12 May 2025 19:14:19 +0000 (15:14 -0400)]
bcachefs: Extra write buffer asserts
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 12 May 2025 18:54:07 +0000 (14:54 -0400)]
bcachefs: add missing locking in bch2_write_point_to_text()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 21:19:05 +0000 (17:19 -0400)]
bcachefs: Don't rewind recovery if not in recovery
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 21:16:11 +0000 (17:16 -0400)]
bcachefs: Rename fsck_running, recovery_running flags
Slightly more readable.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 19:53:10 +0000 (15:53 -0400)]
bcachefs: debug_check_bkey_unpack
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 19:49:38 +0000 (15:49 -0400)]
bcachefs: debug_check_bset_lookups
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 19:25:56 +0000 (15:25 -0400)]
bcachefs: debug_check_iterators no longer requires BCACHEFS_DEBUG
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 19:12:13 +0000 (15:12 -0400)]
bcachefs: debug_check_btree_locking modparam
Don't put btree locking asserts behind CONFIG_BCACHEFS_DEBUG, put them
behind a module parameter.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 18:14:06 +0000 (14:14 -0400)]
bcachefs: Debug params are now static_keys
We'd like users to be able to debug without building custom kernels, so
this will help us get rid of CONFIG_BCACHEFS_DEBUG, at least for most
things.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 17:24:25 +0000 (13:24 -0400)]
bcachefs: Slim down inlined part of bch2_btree_path_upgrade()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 03:22:23 +0000 (23:22 -0400)]
bcachefs: online_fsck_mutex -> run_recovery_passes_lock
Prep work for automatically running recovery passes asynchronously.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 9 May 2025 22:24:20 +0000 (18:24 -0400)]
bcachefs: bch_sb_field_recovery_passes
New superblock section for statistics on recovery passes - last time
ran (successfully), last runtime.
This will be used by self healing code to determine when to kick off
potentially expensive recovery passes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 9 May 2025 22:12:59 +0000 (18:12 -0400)]
bcachefs: recovery_passes_types.h -> recovery_passes_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 03:15:40 +0000 (23:15 -0400)]
bcachefs: print label correctly in sb_member_to_text()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 9 May 2025 20:25:21 +0000 (16:25 -0400)]
bcachefs: "buckets with backpointer mismatches" now allocated on demand
More self healing work: we're going to be calling
check_bucket_backpointer_mismatch() at runtime, outside of fsck.
Then when we need to we'll kick off the full
check_extents_to_backpointers recovery pass.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 9 May 2025 19:27:36 +0000 (15:27 -0400)]
bcachefs: delete dead items in bch_dev
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 9 May 2025 19:16:14 +0000 (15:16 -0400)]
bcachefs: kill dead code in move_data_phys()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 9 May 2025 03:21:28 +0000 (23:21 -0400)]
bcachefs: buckets_in_flight on stack
copygc runs with a full stack available, there's no reason to
dynamically allocate this.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 8 May 2025 18:24:12 +0000 (14:24 -0400)]
bcachefs: bch2_copygc_dev_wait_amount()
Factor out the per-device calculations, for better introspection.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 7 May 2025 20:34:35 +0000 (16:34 -0400)]
bcachefs: Add missing include
fix debug build in userspace
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 7 May 2025 18:26:18 +0000 (14:26 -0400)]
bcachefs: Knob for manual snapshot deletion
Add 'opts.snapshot_deletion_enabled', enabled by default.
This may be turned off so that the new sysfs knob,
'internal/trigger_delete_dead_snapshots', may be used instead - this
will allow snapshot deletion to be profiled more easily.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 4 May 2025 01:26:04 +0000 (21:26 -0400)]
bcachefs: bcachefs_metadata_version_fast_device_removal
Fast device removal, that uses backpointers to find pointers to the
device being removed instead of a full metadata scan.
This requires BCH_SB_MEMBER_DELETED_UUID, which is an incompatible
change - hence the version number bump. We don't fully trust
backpointers, so we don't want to reuse device indexes until after a
fsck has verified that there aren't any pointers to removed devices.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 4 May 2025 01:36:23 +0000 (21:36 -0400)]
bcachefs: bch2_dev_data_drop_by_backpointers()
Currently, device removal has to scan all metadata for pointers to the
device being removed.
Add a new method, with the same interface as bch2_dev_data_drop(), that
scans by backpointers instead - this will drastically speed up device
removal.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 4 May 2025 01:55:26 +0000 (21:55 -0400)]
bcachefs: BCH_SB_MEMBER_DELETED_UUID
Add a sentinal value for devices that have been removed, but don't want
to reuse their index until a fsck has completed.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 6 May 2025 04:51:39 +0000 (00:51 -0400)]
bcachefs: bch2_dev_remove_stripes() respects degraded flags
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 6 May 2025 00:35:36 +0000 (20:35 -0400)]
bcachefs: opts.rebalance_on_ac_only
Add an option for setting rebalance to only run when connected to mains
power.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 6 May 2025 01:19:17 +0000 (21:19 -0400)]
bcachefs: __bch2_fs_free() cleanup
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 4 May 2025 19:02:53 +0000 (15:02 -0400)]
bcachefs: Improve bch2_extent_ptr_set_cached()
Preferentially keep existing cached pointers instead of adding new ones.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 6 May 2025 01:15:34 +0000 (21:15 -0400)]
bcachefs: improve check_inode_hash_info_matches_root() error message
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 4 May 2025 20:24:43 +0000 (16:24 -0400)]
bcachefs: inline bch2_ob_ptr()
This was an oversight, we want bch2_alloc_sectors_append_ptrs_inlined()
fully inlined.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 4 May 2025 18:45:54 +0000 (14:45 -0400)]
bcachefs: bch2_dev_in_target() no longer takes rcu_read_lock()
Minor optimization, the caller generally has it already.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 3 May 2025 23:14:54 +0000 (19:14 -0400)]
bcachefs: bch2_journal_write() refactoring
Make the locking easier to follow; also take io_refs earlier, in
__journal_write_alloc().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 2 May 2025 17:23:22 +0000 (13:23 -0400)]
bcachefs: delete_dead_snapshot_keys_v2()
Since extents, dirents and xattrs require an inode with the
corresponding snapshot ID to exists, we can avoid a lot of scanning by
only scanning those trees for keys to process if the correspending inode
exists.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 2 May 2025 16:37:36 +0000 (12:37 -0400)]
bcachefs: bcachefs_metadata_version_snapshot_deletion_v2
We're going to be speeding up snapshot deletion, by only having it
process the extents/dirents/xattrs btrees if an inode of a given
snapshot ID was present.
This raises the possibility of 'bkey_in_missing_snapshot' errors popping
up, if we ever accidentally don't do the corresponding inode update, or
if the new algorithm has bugs.
So instead of deleting snapshot IDs, add a new deleted flag, so that
'key in missing snapshot' errors can more definitively tell what
happened and automatically repair.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>