Kent Overstreet [Sat, 10 May 2025 19:25:56 +0000 (15:25 -0400)]
bcachefs: debug_check_iterators no longer requires BCACHEFS_DEBUG
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 19:12:13 +0000 (15:12 -0400)]
bcachefs: debug_check_btree_locking modparam
Don't put btree locking asserts behind CONFIG_BCACHEFS_DEBUG, put them
behind a module parameter.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 18:14:06 +0000 (14:14 -0400)]
bcachefs: Debug params are now static_keys
We'd like users to be able to debug without building custom kernels, so
this will help us get rid of CONFIG_BCACHEFS_DEBUG, at least for most
things.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 17:24:25 +0000 (13:24 -0400)]
bcachefs: Slim down inlined part of bch2_btree_path_upgrade()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 03:22:23 +0000 (23:22 -0400)]
bcachefs: online_fsck_mutex -> run_recovery_passes_lock
Prep work for automatically running recovery passes asynchronously.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 9 May 2025 22:24:20 +0000 (18:24 -0400)]
bcachefs: bch_sb_field_recovery_passes
New superblock section for statistics on recovery passes - last time
ran (successfully), last runtime.
This will be used by self healing code to determine when to kick off
potentially expensive recovery passes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 9 May 2025 22:12:59 +0000 (18:12 -0400)]
bcachefs: recovery_passes_types.h -> recovery_passes_format.h
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 10 May 2025 03:15:40 +0000 (23:15 -0400)]
bcachefs: print label correctly in sb_member_to_text()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 9 May 2025 20:25:21 +0000 (16:25 -0400)]
bcachefs: "buckets with backpointer mismatches" now allocated on demand
More self healing work: we're going to be calling
check_bucket_backpointer_mismatch() at runtime, outside of fsck.
Then when we need to we'll kick off the full
check_extents_to_backpointers recovery pass.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 9 May 2025 19:27:36 +0000 (15:27 -0400)]
bcachefs: delete dead items in bch_dev
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 9 May 2025 19:16:14 +0000 (15:16 -0400)]
bcachefs: kill dead code in move_data_phys()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 9 May 2025 03:21:28 +0000 (23:21 -0400)]
bcachefs: buckets_in_flight on stack
copygc runs with a full stack available, there's no reason to
dynamically allocate this.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 8 May 2025 18:24:12 +0000 (14:24 -0400)]
bcachefs: bch2_copygc_dev_wait_amount()
Factor out the per-device calculations, for better introspection.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 7 May 2025 20:34:35 +0000 (16:34 -0400)]
bcachefs: Add missing include
fix debug build in userspace
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 7 May 2025 18:26:18 +0000 (14:26 -0400)]
bcachefs: Knob for manual snapshot deletion
Add 'opts.snapshot_deletion_enabled', enabled by default.
This may be turned off so that the new sysfs knob,
'internal/trigger_delete_dead_snapshots', may be used instead - this
will allow snapshot deletion to be profiled more easily.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 4 May 2025 01:26:04 +0000 (21:26 -0400)]
bcachefs: bcachefs_metadata_version_fast_device_removal
Fast device removal, that uses backpointers to find pointers to the
device being removed instead of a full metadata scan.
This requires BCH_SB_MEMBER_DELETED_UUID, which is an incompatible
change - hence the version number bump. We don't fully trust
backpointers, so we don't want to reuse device indexes until after a
fsck has verified that there aren't any pointers to removed devices.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 4 May 2025 01:36:23 +0000 (21:36 -0400)]
bcachefs: bch2_dev_data_drop_by_backpointers()
Currently, device removal has to scan all metadata for pointers to the
device being removed.
Add a new method, with the same interface as bch2_dev_data_drop(), that
scans by backpointers instead - this will drastically speed up device
removal.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 4 May 2025 01:55:26 +0000 (21:55 -0400)]
bcachefs: BCH_SB_MEMBER_DELETED_UUID
Add a sentinel value for devices that have been removed, but whose
index we don't want to reuse until a fsck has completed.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
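A minimal userspace sketch of the sentinel idea in the commit above, not the kernel code: a removed device's slot is marked with a "deleted" UUID so the allocator skips it, and the slot only becomes reusable once fsck releases it. The `struct uuid` type, both byte patterns and all function names are assumptions made for illustration; the commit does not spell out the real sentinel value.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical 16-byte UUID type; not the kernel's __uuid_t. */
struct uuid { unsigned char b[16]; };

/* Assumed marker patterns, chosen only for this sketch. */
static const struct uuid UUID_EMPTY   = { { 0 } };
static const struct uuid UUID_DELETED = { {
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
	0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff } };

static bool uuid_eq(const struct uuid *a, const struct uuid *b)
{
	return !memcmp(a->b, b->b, sizeof(a->b));
}

/* Return the first slot a new device may use, or -1 if there is none. */
static int find_free_slot(const struct uuid *members, int nr)
{
	assert(nr >= 0);
	for (int i = 0; i < nr; i++)
		if (uuid_eq(&members[i], &UUID_EMPTY))
			return i;	/* deleted slots never match here */
	return -1;
}

/* Once fsck has verified nothing points at removed devices, free the slots. */
static void fsck_release_deleted(struct uuid *members, int nr)
{
	for (int i = 0; i < nr; i++)
		if (uuid_eq(&members[i], &UUID_DELETED))
			members[i] = UUID_EMPTY;
}
```

With a member table whose only non-live slot carries the deleted marker, `find_free_slot()` returns -1 until `fsck_release_deleted()` has run.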
Kent Overstreet [Tue, 6 May 2025 04:51:39 +0000 (00:51 -0400)]
bcachefs: bch2_dev_remove_stripes() respects degraded flags
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 6 May 2025 00:35:36 +0000 (20:35 -0400)]
bcachefs: opts.rebalance_on_ac_only
Add an option for setting rebalance to only run when connected to mains
power.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 6 May 2025 01:19:17 +0000 (21:19 -0400)]
bcachefs: __bch2_fs_free() cleanup
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 4 May 2025 19:02:53 +0000 (15:02 -0400)]
bcachefs: Improve bch2_extent_ptr_set_cached()
Preferentially keep existing cached pointers instead of adding new ones.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 6 May 2025 01:15:34 +0000 (21:15 -0400)]
bcachefs: improve check_inode_hash_info_matches_root() error message
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 4 May 2025 20:24:43 +0000 (16:24 -0400)]
bcachefs: inline bch2_ob_ptr()
This was an oversight, we want bch2_alloc_sectors_append_ptrs_inlined()
fully inlined.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 4 May 2025 18:45:54 +0000 (14:45 -0400)]
bcachefs: bch2_dev_in_target() no longer takes rcu_read_lock()
Minor optimization, the caller generally has it already.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 3 May 2025 23:14:54 +0000 (19:14 -0400)]
bcachefs: bch2_journal_write() refactoring
Make the locking easier to follow; also take io_refs earlier, in
__journal_write_alloc().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 2 May 2025 17:23:22 +0000 (13:23 -0400)]
bcachefs: delete_dead_snapshot_keys_v2()
Since extents, dirents and xattrs require an inode with the
corresponding snapshot ID to exist, we can avoid a lot of scanning by
only scanning those trees for keys to process if the corresponding inode
exists.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
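The scanning shortcut described in the commit above can be sketched roughly as follows. This is a toy model with flat arrays instead of btrees; `struct key`, `delete_dead_keys()` and the `examined` counter are hypothetical names, and a real btree would seek to the inode's key range rather than filter linearly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical key: just an inode number, a snapshot ID and a tombstone. */
struct key {
	unsigned long	inum;
	unsigned	snapshot;
	bool		deleted;
};

static bool snapshot_is_dead(const unsigned *dead, size_t nr_dead, unsigned id)
{
	for (size_t i = 0; i < nr_dead; i++)
		if (dead[i] == id)
			return true;
	return false;
}

/*
 * Walk the inodes; only when an inode lives in a dead snapshot do we look at
 * the extents belonging to that inode number. Returns how many extent keys
 * were examined, to show how much of the extents "btree" was touched.
 */
static size_t delete_dead_keys(const struct key *inodes, size_t nr_inodes,
			       struct key *extents, size_t nr_extents,
			       const unsigned *dead, size_t nr_dead)
{
	size_t examined = 0;

	assert(inodes && extents && dead);

	for (size_t i = 0; i < nr_inodes; i++) {
		if (!snapshot_is_dead(dead, nr_dead, inodes[i].snapshot))
			continue;	/* no inode here: nothing to scan */

		for (size_t j = 0; j < nr_extents; j++) {
			if (extents[j].inum != inodes[i].inum)
				continue;	/* a btree would seek past these */
			examined++;
			if (extents[j].snapshot == inodes[i].snapshot)
				extents[j].deleted = true;
		}
	}
	return examined;
}
```

The shortcut is only safe because of the invariant the commit relies on: a key in a given snapshot always has an inode in that same snapshot.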
Kent Overstreet [Fri, 2 May 2025 16:37:36 +0000 (12:37 -0400)]
bcachefs: bcachefs_metadata_version_snapshot_deletion_v2
We're going to be speeding up snapshot deletion, by only having it
process the extents/dirents/xattrs btrees if an inode of a given
snapshot ID was present.
This raises the possibility of 'bkey_in_missing_snapshot' errors popping
up, if we ever accidentally don't do the corresponding inode update, or
if the new algorithm has bugs.
So instead of deleting snapshot IDs, add a new deleted flag, so that
'key in missing snapshot' errors can more definitively tell what
happened and automatically repair.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
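To illustrate why keeping a deleted flag helps, here is a small sketch of how a checker could tell "key in a deleted snapshot" apart from "snapshot went missing". The enum values, struct layout and `classify_key()` are assumptions for this example only and do not reflect the on-disk format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory view of a snapshot table entry. */
enum snapshot_state {
	SNAPSHOT_LIVE,
	SNAPSHOT_WILL_DELETE,	/* queued for deletion, keys may still exist */
	SNAPSHOT_DELETED,	/* deletion finished, entry kept as a marker */
};

struct snapshot_entry {
	unsigned		id;
	enum snapshot_state	state;
};

enum key_verdict {
	KEY_OK,
	KEY_IN_DELETED_SNAPSHOT,	/* leftover key, can be dropped */
	KEY_IN_MISSING_SNAPSHOT,	/* the snapshot entry itself is gone */
};

static enum key_verdict classify_key(const struct snapshot_entry *table,
				     size_t nr, unsigned key_snapshot)
{
	assert(table || !nr);

	for (size_t i = 0; i < nr; i++)
		if (table[i].id == key_snapshot)
			return table[i].state == SNAPSHOT_DELETED
				? KEY_IN_DELETED_SNAPSHOT
				: KEY_OK;

	/* No entry at all: without the flag, every leftover key ends up here. */
	return KEY_IN_MISSING_SNAPSHOT;
}
```

If snapshot IDs were removed outright, both error cases would fall through to the final return and could not be told apart.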
Kent Overstreet [Fri, 2 May 2025 16:33:17 +0000 (12:33 -0400)]
bcachefs: BCH_SNAPSHOT_DELETED -> BCH_SNAPSHOT_WILL_DELETE
We're going to be speeding up snapshot deletion, by only having it
process the extents/dirents/xattrs btrees if an inode of a given
snapshot ID was present.
This raises the possibility of 'bkey_in_missing_snapshot' errors popping
up, if we ever accidentally don't do the corresponding inode update, or
if the new algorithm has bugs.
So we'll want to be able to differentiate more definitively between
'snapshot went missing' (and perhaps needs to be reconstructed), and
'key in snapshot that was deleted'.
So instead of deleting snapshot IDs, we'll be adding a new deleted flag
and leaving them permanently.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 2 May 2025 18:43:45 +0000 (14:43 -0400)]
bcachefs: Skip unrelated snapshot trees in snapshot deletion
Don't scan keys in inodes for which the snapshot tree doesn't match any
we're deleting from.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 1 May 2025 19:14:04 +0000 (15:14 -0400)]
bcachefs: BCH_FSCK_ERR_snapshot_key_missing_inode_snapshot
We're going to be doing some snapshot deletion performance improvements,
and those will strictly require that if an extent/dirent/xattr is
present, an inode is present in that snapshot ID.
We already check for this, but we don't repair it on disk: this patch
adds that repair and turns it into a real fsck_err().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 3 May 2025 20:48:00 +0000 (16:48 -0400)]
bcachefs: get_inodes_all_snapshots() now includes whiteouts
The next patch is going to change lookup_inode_for_snapshot() to
rigorously require that extent/dirent/xattr keys have a corresponding
inode key present - whiteouts included, so this simplifies the checks
lookup_inode_for_snapshot() will have to do.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 3 May 2025 22:17:26 +0000 (18:17 -0400)]
bcachefs: bch2_inode_unpack() cleanup
bi_snapshot is now handled like other fields
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 3 May 2025 22:16:49 +0000 (18:16 -0400)]
bcachefs: Improve bch2_request_incompat_feature() message
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Alan Huang [Sat, 3 May 2025 20:03:42 +0000 (04:03 +0800)]
bcachefs: Fix inconsistent req->ec
There is req->ec = erasure_code above.
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 1 May 2025 19:18:40 +0000 (15:18 -0400)]
bcachefs: kill inode_walker_entry.snapshot
redundant
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 2 May 2025 16:23:59 +0000 (12:23 -0400)]
bcachefs: Add comments for inode snapshot requirements
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 1 May 2025 18:47:39 +0000 (14:47 -0400)]
bcachefs: snapshot delete progress indicator
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 16 Apr 2025 13:28:10 +0000 (09:28 -0400)]
bcachefs: Don't emit bch_sb_field_members_v1 if not required
In 'bcachefs_metadata_extent_flags', we stopped requiring members_v1
to be present - only that either v1 or v2 is present.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Alan Huang [Thu, 1 May 2025 20:01:32 +0000 (04:01 +0800)]
bcachefs: Rename x_name to x_name_and_value
The flexible array contains name and value, the x_name is misleading.
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 1 May 2025 02:26:00 +0000 (22:26 -0400)]
bcachefs: Improve bch2_disk_groups_to_text()
Print out the actual name of each path/label, instead of just the
integer indexes.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 1 May 2025 03:10:44 +0000 (23:10 -0400)]
docs: bcachefs: add casefolding reference
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 1 May 2025 02:05:49 +0000 (22:05 -0400)]
bcachefs: Fix setting ca->name in device add
Device add doesn't get the device index and attach to the filesystem
until after attaching the block device, and setting the device name from
the block device name - this needs some minor tweaks.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 29 Apr 2025 18:41:37 +0000 (14:41 -0400)]
bcachefs: sysfs trigger_recalc_capacity
For bug diagnosis
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Gustavo A. R. Silva [Wed, 30 Apr 2025 19:22:01 +0000 (13:22 -0600)]
bcachefs: Avoid -Wflex-array-member-not-at-end warnings
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.
Refactor a couple of structs that contain flexible arrays in the
middle by replacing them with unions.
So, with these changes, fix the following warnings:
fs/bcachefs/disk_accounting.c:429:51: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
fs/bcachefs/ec_types.h:8:41: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
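As a sketch of the kind of refactoring the commit above describes: a struct ending in a flexible array that needs fixed backing storage can be overlaid with a byte array in a union, instead of being followed by padding inside another struct. `struct entry`, `union entry_padded` and `MAX_DEVS` are hypothetical names for illustration, not the structs touched by the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEVS 4

/* Hypothetical variable-length entry ending in a flexible array member. */
struct entry {
	unsigned char	nr_devs;
	unsigned char	devs[];
};

/*
 * The pattern that triggers -Wflex-array-member-not-at-end looks like:
 *
 *	struct entry_padded {
 *		struct entry	e;
 *		unsigned char	pad[MAX_DEVS];
 *	};
 *
 * because e.devs[] is then followed by another member. Overlaying the
 * entry with a byte array of the full size avoids that layout.
 */
union entry_padded {
	unsigned char	bytes[offsetof(struct entry, devs) + MAX_DEVS];
	struct entry	e;
};

static void entry_padded_set(union entry_padded *p,
			     const unsigned char *devs, unsigned char nr)
{
	assert(nr <= MAX_DEVS);

	p->e.nr_devs = nr;
	for (unsigned char i = 0; i < nr; i++)
		p->e.devs[i] = devs[i];	/* backed by p->bytes */
}
```

The union is guaranteed to be at least as large as the header plus `MAX_DEVS` elements, which is what the trailing padding was providing before.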
Kent Overstreet [Tue, 29 Apr 2025 02:00:01 +0000 (22:00 -0400)]
bcachefs: bch2_dev_add() can run on a non-started fs
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 28 Apr 2025 18:50:07 +0000 (14:50 -0400)]
bcachefs: bch2_fs_open() now takes a darray
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 28 Apr 2025 15:45:56 +0000 (11:45 -0400)]
bcachefs: bch2_trans_update_ip()
Allow btree_insert_entry.ip_allocated to be passed in, so we get better
info on where alloc updates are coming from.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 26 Apr 2025 16:39:17 +0000 (12:39 -0400)]
bcachefs: Run most explicit recovery passes persistent
If we detect an error that requires running a recovery pass, and we're
not in recovery, we won't be able to fix it until the next mount - make
sure we're noting in the superblock that it needs to run.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
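A rough sketch of the behaviour described in the commit above, under stated assumptions: required passes are tracked as a bitmask, and the request is always recorded in a field standing in for the superblock, so it survives until the next mount if it cannot run now. `struct fs_state`, its fields and both functions are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical filesystem state; field names are made up for this sketch. */
struct fs_state {
	bool		in_recovery;
	uint64_t	passes_to_run_now;	/* in-memory, current recovery */
	uint64_t	sb_passes_required;	/* stands in for the superblock */
};

/*
 * Request a recovery pass. Returns true if it can run during the current
 * recovery, false if it has to wait for the next mount.
 */
static bool request_recovery_pass(struct fs_state *c, unsigned pass)
{
	assert(pass < 64);

	/* Note it persistently first, so an unmount can't lose the request. */
	c->sb_passes_required |= UINT64_C(1) << pass;

	if (!c->in_recovery)
		return false;

	c->passes_to_run_now |= UINT64_C(1) << pass;
	return true;
}

/* Called once a pass has completed successfully. */
static void recovery_pass_done(struct fs_state *c, unsigned pass)
{
	assert(pass < 64);

	c->passes_to_run_now	&= ~(UINT64_C(1) << pass);
	c->sb_passes_required	&= ~(UINT64_C(1) << pass);
}
```

A request made outside recovery returns false but leaves its bit set in `sb_passes_required`, which is the state the next mount would act on.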
Kent Overstreet [Sat, 26 Apr 2025 16:38:53 +0000 (12:38 -0400)]
bcachefs: provide unlocked version of run_explicit_recovery_pass_persistent
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 24 Apr 2025 21:55:20 +0000 (17:55 -0400)]
bcachefs: bch2_dirent_to_text() shows casefolded dirents
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 22 Apr 2025 13:14:19 +0000 (09:14 -0400)]
bcachefs: Single err message for btree node reads
Like we just did with the data read path, emit a single error message
per btree node read, nicely formatted, with all the actions we took
grouped together.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 23 Apr 2025 00:38:50 +0000 (20:38 -0400)]
bcachefs: bch2_mark_btree_validate_failure()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 22 Apr 2025 13:02:15 +0000 (09:02 -0400)]
bcachefs: bch2_fsck_err_opt()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 24 Apr 2025 13:27:10 +0000 (09:27 -0400)]
bcachefs: Plumb printbuf through bch2_btree_lost_data()
Part of the ongoing project to improve error messages by building them
up in printbufs and emitting them all at once, so that we can easily see
what events are related in the log.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 24 Apr 2025 13:28:56 +0000 (09:28 -0400)]
bcachefs: kill bch2_run_explicit_recovery_pass_persistent()
No longer has users, so we can kill it and rename
bch2_run_explicit_recovery_pass_persistent_locked().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 24 Apr 2025 13:13:28 +0000 (09:13 -0400)]
bcachefs: Remove redundant calls to btree_lost_data()
The btree node read path calls this before returning the read error.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 24 Apr 2025 13:09:56 +0000 (09:09 -0400)]
bcachefs: bch2_btree_lost_data() now handles snapshots tree
We have a consolidated place for "this btree lost data, run this
repair", so use it.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 22 Apr 2025 13:16:49 +0000 (09:16 -0400)]
bcachefs: Kill redundant error message in topology repair
The btree node read path already logs btree node read errors, this isn't
needed.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 22 Apr 2025 09:49:20 +0000 (05:49 -0400)]
bcachefs: Emit a single log message on data read error
Instead of emitting a message immediately when we get an error in the
read path, and then another at the end if we successfully retry - emit
one single log message before returning from bch2_rbio_retry().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
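The "build the report, then log once" approach from the commit above might look roughly like this in userspace. `struct msgbuf` is a hypothetical fixed-size stand-in for a printbuf, and `struct io_failure` and `read_error_to_text()` are illustrative names, not the bcachefs helpers.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical fixed-size message buffer, loosely modelled on a printbuf. */
struct msgbuf {
	char	buf[256];
	size_t	pos;
};

static void msg_printf(struct msgbuf *m, const char *fmt, ...)
{
	va_list args;
	int n;

	assert(m->pos < sizeof(m->buf));

	va_start(args, fmt);
	n = vsnprintf(m->buf + m->pos, sizeof(m->buf) - m->pos, fmt, args);
	va_end(args);

	if (n < 0)
		return;
	m->pos += (size_t) n;
	if (m->pos >= sizeof(m->buf))		/* output was truncated */
		m->pos = sizeof(m->buf) - 1;
}

/* One failed attempt: which device, and what went wrong (hypothetical). */
struct io_failure {
	unsigned	dev;
	const char	*err;
};

/*
 * Collect everything that happened during the read into one buffer; the
 * caller then logs it with a single call. Returns the number of lines.
 */
static unsigned read_error_to_text(struct msgbuf *out,
				   const struct io_failure *f, size_t nr,
				   int retry_ok)
{
	unsigned lines = 0;

	msg_printf(out, "data read error\n");
	lines++;

	for (size_t i = 0; i < nr; i++) {
		msg_printf(out, "  dev %u: %s\n", f[i].dev, f[i].err);
		lines++;
	}

	msg_printf(out, "  %s\n",
		   retry_ok ? "retry succeeded" : "retries exhausted");
	return lines + 1;
}
```

Because the caller prints `out->buf` in one go, the per-device failures and the final outcome stay together in the log instead of being interleaved with unrelated messages.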
Kent Overstreet [Tue, 22 Apr 2025 09:45:48 +0000 (05:45 -0400)]
bcachefs: bch2_io_failures_to_text()
Pretty printer for bch_io_failures, to be used for better read error
messages.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 22 Apr 2025 10:03:33 +0000 (06:03 -0400)]
bcachefs: print_string_as_lines: avoid printing empty line
If the final line in the message to be printed is blank, don't print
it.
This happens with indented printbufs - after a newline we emit spaces up
to the indent level.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
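A small userspace sketch of the behaviour described in the commit above: split a message on newlines and print each line, but drop a final line that consists only of whitespace (as left behind by an indented buffer after its last newline). The function names are hypothetical and this is not the kernel helper itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static bool line_is_blank(const char *p, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (p[i] != ' ' && p[i] != '\t')
			return false;
	return true;
}

/* Print s one line at a time; returns the number of lines printed. */
static unsigned print_lines(FILE *out, const char *s)
{
	unsigned printed = 0;

	assert(out && s);

	while (*s) {
		const char *nl = strchr(s, '\n');
		size_t len = nl ? (size_t) (nl - s) : strlen(s);
		/* Last line: nothing follows it except maybe one newline. */
		bool last = !nl || !nl[1];

		if (!(last && line_is_blank(s, len))) {
			fprintf(out, "%.*s\n", (int) len, s);
			printed++;
		}

		if (!nl)
			break;
		s = nl + 1;
	}
	return printed;
}
```

Blank lines in the middle of a message are still printed; only a trailing blank line is suppressed.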
Kent Overstreet [Mon, 21 Apr 2025 17:02:51 +0000 (13:02 -0400)]
bcachefs: Make various async objs visible in debugfs
Add async objs list for
- promote_op
- bch_read_bio
- btree_read_bio
- btree_write_bio
This gets us introspection on in-flight async ops, and because under the
hood it uses fast_lists (percpu slot buffer on top of a radix tree),
it'll be fast enough to enable in production.
This will be very helpful for debugging "something got stuck" issues,
which have been cropping up from time to time (in the CI, especially
with folio writeback).
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 21 Apr 2025 16:01:50 +0000 (12:01 -0400)]
bcachefs: Async object debugging
Debugging infrastructure for async objs: this lets us easily create
fast_lists for various object types so they'll be visible in debugfs.
Add new object types to the BCH_ASYNC_OBJS_TYPES() enum, and drop a
pretty-printer wrapper in async_objs.c.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 28 Sep 2024 20:22:38 +0000 (16:22 -0400)]
bcachefs: fast_list
A fast "list" data structure, which is actually a radix tree, with an
IDA for slot allocation and a percpu buffer on top of that.
Items cannot be added or moved to the head or tail, only added at some
(arbitrary) position and removed. The advantage is that adding, removing
and iteration are generally lockless, only hitting the lock in ida when
the percpu buffer is full or empty.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
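The shape of the data structure described above can be approximated in a single-threaded userspace sketch. Here a plain array stands in for the radix tree, a stack of freed indexes stands in for the IDA, and one small cache stands in for a CPU's buffer; none of this is the kernel implementation, and the `slow_path` counter exists only to show when the shared allocator (and hence its lock) would be reached.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS	64
#define CACHE_MAX	4

struct fast_list {
	void	*slots[NR_SLOTS];	/* stands in for the radix tree */
	int	free_ids[NR_SLOTS];	/* stands in for the IDA: freed ids */
	int	nr_free;
	int	next_unused;		/* ids never handed out yet */
	int	cache[CACHE_MAX];	/* stands in for one CPU's buffer */
	int	nr_cached;
	int	slow_path;		/* times the shared allocator was hit */
};

/* Add at an arbitrary position; returns the slot index, or -1 if full. */
static int fast_list_add(struct fast_list *l, void *item)
{
	int idx;

	assert(item);

	if (l->nr_cached) {
		idx = l->cache[--l->nr_cached];		/* fast path */
	} else {
		l->slow_path++;				/* would take the lock */
		if (l->nr_free)
			idx = l->free_ids[--l->nr_free];
		else if (l->next_unused < NR_SLOTS)
			idx = l->next_unused++;
		else
			return -1;
	}

	l->slots[idx] = item;
	return idx;
}

static void fast_list_remove(struct fast_list *l, int idx)
{
	assert(idx >= 0 && idx < NR_SLOTS && l->slots[idx]);

	l->slots[idx] = NULL;
	if (l->nr_cached < CACHE_MAX) {
		l->cache[l->nr_cached++] = idx;		/* fast path */
	} else {
		l->slow_path++;				/* would take the lock */
		l->free_ids[l->nr_free++] = idx;
	}
}

/* Iteration just walks the slots and skips empty ones. */
static size_t fast_list_count(const struct fast_list *l)
{
	size_t nr = 0;

	for (int i = 0; i < NR_SLOTS; i++)
		nr += l->slots[i] != NULL;
	return nr;
}
```

An add that follows a remove reuses the cached index without touching the shared allocator, which is the property that keeps the common case cheap.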
Kent Overstreet [Sun, 2 Feb 2025 16:23:07 +0000 (11:23 -0500)]
bcachefs: bch2_read_bio_to_text
Pretty printer for struct bch_read_bio.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Mon, 21 Apr 2025 16:04:10 +0000 (12:04 -0400)]
bcachefs: bch2_bio_to_text()
Pretty printer for struct bio, to be used for async object debugging.
This is pretty minimal, we'll add more to it as we discover what we
need.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 19 Apr 2025 01:54:12 +0000 (21:54 -0400)]
bcachefs: bch_dev.io_ref -> enumerated_ref
Convert device IO refs to enumerated_refs, for easier debugging of
refcount issues.
Simple conversion: enumerate all users and convert to the new helpers.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 18 Apr 2025 18:56:09 +0000 (14:56 -0400)]
bcachefs: bch_fs.writes -> enumerated_refs
Drop the single-purpose write ref code in bcachefs.h, and convert to
enumerated refs.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 18 Apr 2025 18:56:09 +0000 (14:56 -0400)]
bcachefs: enumerated_ref.c
Factor out the debug code for rw filesystem refs into a small library.
In release mode an enumerated ref is a normal percpu refcount, but in
debug mode all enumerated users of the ref get their own atomic_long_t
ref - making it much easier to chase down refcount usage bugs for when a
refcount has many users.
For debugging, we have enumerated_ref_to_text(), which prints the
current value of each different user.
Additionally, in debug mode enumerated_ref_stop() has a 10 second
timeout, after which it will dump outstanding refcounts.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
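A userspace sketch of the debug-mode idea described above, using C11 atomics: every enumerated user of the ref gets its own counter, so a leaked reference can be attributed to a specific user. The user names in the enum and all function names are assumptions for illustration; the release-mode percpu refcount and the stop timeout are not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical users of the ref; the real list lives in bcachefs. */
enum ref_user { REF_journal, REF_copygc, REF_rebalance, REF_NR };

static const char * const ref_names[REF_NR] = {
	"journal", "copygc", "rebalance",
};

/* Debug-mode representation: one atomic counter per enumerated user. */
struct enumerated_ref {
	atomic_long per_user[REF_NR];
};

static void ref_init(struct enumerated_ref *r)
{
	for (int i = 0; i < REF_NR; i++)
		atomic_init(&r->per_user[i], 0);
}

static void ref_get(struct enumerated_ref *r, enum ref_user u)
{
	atomic_fetch_add(&r->per_user[u], 1);
}

static void ref_put(struct enumerated_ref *r, enum ref_user u)
{
	long old = atomic_fetch_sub(&r->per_user[u], 1);

	assert(old > 0);	/* a put without a matching get is a bug */
}

/* Print each user's count into buf; returns the total outstanding refs. */
static long ref_to_text(struct enumerated_ref *r, char *buf, size_t len)
{
	size_t pos = 0;
	long outstanding = 0;

	for (int i = 0; i < REF_NR; i++) {
		long v = atomic_load(&r->per_user[i]);

		outstanding += v;
		if (pos < len) {
			int n = snprintf(buf + pos, len - pos, "%s: %ld\n",
					 ref_names[i], v);
			if (n > 0)
				pos += (size_t) n;
		}
	}
	return outstanding;
}
```

With a single shared counter, a nonzero value at shutdown says only that something leaked; the per-user breakdown says which caller to look at.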
Kent Overstreet [Sat, 19 Apr 2025 23:13:40 +0000 (19:13 -0400)]
bcachefs: for_each_rw_member_rcu()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 20 Apr 2025 15:27:18 +0000 (11:27 -0400)]
bcachefs: __bch2_fs_read_write() no longer depends on io_ref
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 19 Apr 2025 02:11:15 +0000 (22:11 -0400)]
bcachefs: for_each_online_member_rcu()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 20 Apr 2025 15:23:53 +0000 (11:23 -0400)]
bcachefs: recalc_capacity() no longer depends on io_ref
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 19 Apr 2025 04:57:55 +0000 (00:57 -0400)]
bcachefs: bch2_target_to_text() no longer depends on io_ref
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Fri, 14 Mar 2025 13:46:25 +0000 (09:46 -0400)]
bcachefs: bch2_check_rebalance_work()
Add a pass for checking the rebalance_work btree.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Alan Huang [Fri, 18 Apr 2025 07:52:10 +0000 (15:52 +0800)]
bcachefs: Kill dead code
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 17 Apr 2025 16:42:13 +0000 (12:42 -0400)]
bcachefs: Fix struct with flex member ABI warning
This pops up when building in userspace.
The structs aren't actually variable length, but no way to tell the
compiler that...
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sat, 20 Apr 2024 21:40:47 +0000 (17:40 -0400)]
docs: bcachefs: idle work scheduling design doc
People have been asking to see the plan for this, so -
bcachefs has various background tasks that need to be scheduled to
balance efficiency, predictability of performance, etc.
The design and philosophy hasn't changed too much since bcache, which
was primarily designed for server usage, with sustained load in mind.
These days we're seeing more desktop usage - where we really want to let
the system idle effectively, to reduce total power usage - while also
still balancing previous concerns, we still want to let work accumulate
to a degree.
This lays out all the requirements and starts to sketch out the
algorithm I have in mind.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 16 Apr 2025 01:35:28 +0000 (21:35 -0400)]
bcachefs: bch2_move_data_btree() can now walk roots
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 13 Apr 2025 20:31:34 +0000 (16:31 -0400)]
bcachefs: bch2_move_data_btree() can move btree nodes
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 3 Apr 2025 23:51:05 +0000 (19:51 -0400)]
bcachefs: plumb btree_id through move_pred_fd
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 3 Apr 2025 23:42:02 +0000 (19:42 -0400)]
bcachefs: Plumb target parameter through btree_node_rewrite_pos()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 3 Apr 2025 23:33:54 +0000 (19:33 -0400)]
bcachefs: export bch2_move_data_phys()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 15 Apr 2025 18:09:34 +0000 (14:09 -0400)]
bcachefs: BCH_MEMBER_RESIZE_ON_MOUNT
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 15 Apr 2025 19:15:36 +0000 (15:15 -0400)]
bcachefs: BCH_FEATURE_small_image
We can't go RW if it's an image file that hasn't been resized.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Thu, 3 Apr 2025 18:19:23 +0000 (14:19 -0400)]
bcachefs: BCH_FEATURE_no_alloc_info
If a filesystem is going to only be used read-only, and will be a
deployable image, we can strip out alloc info for a substantial
reduction in metadata size - around half, due to backpointers.
Alloc info will be regenerated on first read-write mount.
Remounting RW is disallowed for now, since we don't yet have
check_allocations running in RW mode.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 16 Apr 2025 13:23:15 +0000 (09:23 -0400)]
bcachefs: Print features on startup with -o verbose
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 16 Apr 2025 10:48:31 +0000 (06:48 -0400)]
bcachefs: Shrink superblock downgrade table
Don't generate entries for versions that won't be able to mount.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 16 Apr 2025 03:35:48 +0000 (23:35 -0400)]
bcachefs: sb_validate() no longer requires members_v1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 15 Apr 2025 23:21:52 +0000 (19:21 -0400)]
bcachefs: Add a recovery pass for making sure root inode is readable
If the root inode/subvolume is unreadable we can repair automatically -
but only if we're still in recovery, so that we can rewind to the
appropriate recovery pass.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 15 Apr 2025 23:15:43 +0000 (19:15 -0400)]
bcachefs: Flag for repair on missing subvolume
Instead of going emergency read only with a bch2_fs_inconsistent() call,
log the error and flag the appropriate recovery pass.
If we're still in recovery it'll be repaired immediately, otherwise
it'll be repaired on the next mount.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 15 Apr 2025 21:31:47 +0000 (17:31 -0400)]
bcachefs: print_str_as_lines() -> print_str()
bch2_print_string_as_lines() is a low level helper that allows messages
longer than 1k to be printed without truncation.
But we should always be printing with the helpers that take a filesystem
object, if we're in fsck they direct output to the userspace process
controlling fsck instead of the dmesg log.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 15 Apr 2025 18:08:42 +0000 (14:08 -0400)]
bcachefs: bch2_dev_missing_bkey()
Part of the ongoing project to kill off bch2_(fs|trans)_inconsistent
calls - they generally need to be replaced with either
- a fsck_err() call that can repair the error, or
- logging an error of the appropriate type in the superblock, and
flagging the appropriate recovery pass to repair the error
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 15 Apr 2025 17:55:16 +0000 (13:55 -0400)]
bcachefs: Simplify bch2_count_fsck_err()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 15 Apr 2025 17:45:39 +0000 (13:45 -0400)]
bcachefs: bch2_run_explicit_recovery_pass_printbuf()
We prefer helpers that emit log messages to printbufs rather than
printing them directly; that way, we can ensure that different log
messages from the same event are grouped together and formatted
appropriately in the dmesg log.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 15 Apr 2025 14:20:46 +0000 (10:20 -0400)]
bcachefs: Incompatible features may now be enabled at runtime
version_upgrade is now a runtime option.
In the future we'll want to add compatible upgrades at runtime, and call
the full check_version_upgrade() when the option changes, but we don't
have compatible optional upgrades just yet.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Tue, 15 Apr 2025 13:54:01 +0000 (09:54 -0400)]
bcachefs: Clean up option pre/post hooks, small fixes
The helpers are now:
- bch2_opt_hook_pre_set()
- bch2_opts_hooks_pre_set()
- bch2_opt_hook_post_set()
Fix a bug where the filesystem discard option would incorrectly be
changed when setting the device option, and don't trigger rebalance
scans unnecessarily (when options aren't changing).
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Sun, 13 Apr 2025 12:20:47 +0000 (08:20 -0400)]
bcachefs: Use drop_locks_do() in bch2_inode_hash_find()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Kent Overstreet [Wed, 2 Apr 2025 19:12:49 +0000 (15:12 -0400)]
bcachefs: Single device mode
Single device filesystems are now identified by the block device name,
not the UUID - and single device filesystems with the same UUID can be
mounted simultaneously, without any special options.
This allocates a new bit in the superblock, BCH_SB_MULTI_DEVICE, which
indicates whether a filesystem has ever been multi device.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>